aboutsummaryrefslogtreecommitdiff
path: root/hw/core/machine.c
AgeCommit message (Collapse)Author
2022-11-07hmat acpi: Don't require initiator value in -numaBrice Goglin
The "Memory Proximity Domain Attributes" structure of the ACPI HMAT has a "Processor Proximity Domain Valid" flag that is currently always set because Qemu -numa requires an initiator=X value when hmat=on. Unsetting this flag allows to create more complex memory topologies by having multiple best initiators for a single memory target. This patch allows -numa without initiator=X when hmat=on by keeping the default value MAX_NODES in numa_state->nodes[i].initiator. All places reading numa_state->nodes[i].initiator already check whether it's different from MAX_NODES before using it. Tested with qemu-system-x86_64 -accel kvm \ -machine pc,hmat=on \ -drive if=pflash,format=raw,file=./OVMF.fd \ -drive media=disk,format=qcow2,file=efi.qcow2 \ -smp 4 \ -m 3G \ -object memory-backend-ram,size=1G,id=ram0 \ -object memory-backend-ram,size=1G,id=ram1 \ -object memory-backend-ram,size=1G,id=ram2 \ -numa node,nodeid=0,memdev=ram0,cpus=0-1 \ -numa node,nodeid=1,memdev=ram1,cpus=2-3 \ -numa node,nodeid=2,memdev=ram2 \ -numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-latency,latency=10 \ -numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=10485760 \ -numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-latency,latency=20 \ -numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-bandwidth,bandwidth=5242880 \ -numa hmat-lb,initiator=0,target=2,hierarchy=memory,data-type=access-latency,latency=30 \ -numa hmat-lb,initiator=0,target=2,hierarchy=memory,data-type=access-bandwidth,bandwidth=1048576 \ -numa hmat-lb,initiator=1,target=0,hierarchy=memory,data-type=access-latency,latency=20 \ -numa hmat-lb,initiator=1,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=5242880 \ -numa hmat-lb,initiator=1,target=1,hierarchy=memory,data-type=access-latency,latency=10 \ -numa hmat-lb,initiator=1,target=1,hierarchy=memory,data-type=access-bandwidth,bandwidth=10485760 \ -numa hmat-lb,initiator=1,target=2,hierarchy=memory,data-type=access-latency,latency=30 \ -numa hmat-lb,initiator=1,target=2,hierarchy=memory,data-type=access-bandwidth,bandwidth=1048576 which reports NUMA node2 at same distance from both node0 and node1 as seen in lstopo: Machine (2966MB total) + Package P#0 NUMANode P#2 (979MB) Group0 NUMANode P#0 (980MB) Core P#0 + PU P#0 Core P#1 + PU P#1 Group0 NUMANode P#1 (1007MB) Core P#2 + PU P#2 Core P#3 + PU P#3 Before this patch, we had to add ",initiator=X" to "-numa node,nodeid=2,memdev=ram2". The lstopo output difference between initiator=1 and no initiator is: @@ -1,10 +1,10 @@ Machine (2966MB total) + Package P#0 + NUMANode P#2 (979MB) Group0 NUMANode P#0 (980MB) Core P#0 + PU P#0 Core P#1 + PU P#1 Group0 NUMANode P#1 (1007MB) - NUMANode P#2 (979MB) Core P#2 + PU P#2 Core P#3 + PU P#3 Corresponding changes in the HMAT MPDA structure: @@ -49,10 +49,10 @@ [078h 0120 2] Structure Type : 0000 [Memory Proximity Domain Attributes] [07Ah 0122 2] Reserved : 0000 [07Ch 0124 4] Length : 00000028 -[080h 0128 2] Flags (decoded below) : 0001 - Processor Proximity Domain Valid : 1 +[080h 0128 2] Flags (decoded below) : 0000 + Processor Proximity Domain Valid : 0 [082h 0130 2] Reserved1 : 0000 -[084h 0132 4] Attached Initiator Proximity Domain : 00000001 +[084h 0132 4] Attached Initiator Proximity Domain : 00000080 [088h 0136 4] Memory Proximity Domain : 00000002 [08Ch 0140 4] Reserved2 : 00000000 [090h 0144 8] Reserved3 : 0000000000000000 Final HMAT SLLB structures: [0A0h 0160 2] Structure Type : 0001 [System Locality Latency and Bandwidth Information] [0A2h 0162 2] Reserved : 0000 [0A4h 0164 4] Length : 00000040 [0A8h 0168 1] Flags (decoded below) : 00 Memory Hierarchy : 0 [0A9h 0169 1] Data Type : 00 [0AAh 0170 2] Reserved1 : 0000 [0ACh 0172 4] Initiator Proximity Domains # : 00000002 [0B0h 0176 4] Target Proximity Domains # : 00000003 [0B4h 0180 4] Reserved2 : 00000000 [0B8h 0184 8] Entry Base Unit : 0000000000002710 [0C0h 0192 4] Initiator Proximity Domain List : 00000000 [0C4h 0196 4] Initiator Proximity Domain List : 00000001 [0C8h 0200 4] Target Proximity Domain List : 00000000 [0CCh 0204 4] Target Proximity Domain List : 00000001 [0D0h 0208 4] Target Proximity Domain List : 00000002 [0D4h 0212 2] Entry : 0001 [0D6h 0214 2] Entry : 0002 [0D8h 0216 2] Entry : 0003 [0DAh 0218 2] Entry : 0002 [0DCh 0220 2] Entry : 0001 [0DEh 0222 2] Entry : 0003 [0E0h 0224 2] Structure Type : 0001 [System Locality Latency and Bandwidth Information] [0E2h 0226 2] Reserved : 0000 [0E4h 0228 4] Length : 00000040 [0E8h 0232 1] Flags (decoded below) : 00 Memory Hierarchy : 0 [0E9h 0233 1] Data Type : 03 [0EAh 0234 2] Reserved1 : 0000 [0ECh 0236 4] Initiator Proximity Domains # : 00000002 [0F0h 0240 4] Target Proximity Domains # : 00000003 [0F4h 0244 4] Reserved2 : 00000000 [0F8h 0248 8] Entry Base Unit : 0000000000000001 [100h 0256 4] Initiator Proximity Domain List : 00000000 [104h 0260 4] Initiator Proximity Domain List : 00000001 [108h 0264 4] Target Proximity Domain List : 00000000 [10Ch 0268 4] Target Proximity Domain List : 00000001 [110h 0272 4] Target Proximity Domain List : 00000002 [114h 0276 2] Entry : 000A [116h 0278 2] Entry : 0005 [118h 0280 2] Entry : 0001 [11Ah 0282 2] Entry : 0005 [11Ch 0284 2] Entry : 000A [11Eh 0286 2] Entry : 0001 Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr> Signed-off-by: Hesham Almatary <hesham.almatary@huawei.com> Reviewed-by: Jingqi Liu <jingqi.liu@intel.com> Message-Id: <20221027100037.251-2-hesham.almatary@huawei.com> Tested-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-11-07virtio: core: vq reset feature negotation supportKangjie Xu
A a new command line parameter "queue_reset" is added. Meanwhile, the vq reset feature is disabled for pre-7.2 machines. Signed-off-by: Kangjie Xu <kangjie.xu@linux.alibaba.com> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20221017092558.111082-5-xuanzhuo@linux.alibaba.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-08-25hw: Add compat machines for 7.2Cornelia Huck
Add 7.2 machine types for arm/i440fx/m68k/q35/s390x/spapr. Signed-off-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220727121755.395894-1-cohuck@redhat.com> [thuth: fixed conflict with pcmc->legacy_no_rng_seed] Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-06-09hw/cxl: Move the CXLState from MachineState to machine type specific state.Jonathan Cameron
This removes the last of the CXL code from the MachineState where it is visible to all Machines to only those that support CXL (currently i386/pc) As i386/pc always support CXL now, stop allocating the state independently. Note the pxb register hookup code runs even if cxl=off in order to detect pxb_cxl host bridges and fail to start if any are present as they won't have the control registers available. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Message-Id: <20220608145440.26106-8-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-06-09hw/cxl: Make the CXL fixed memory window setup a machine parameter.Jonathan Cameron
Paolo Bonzini requested this change to simplify the ongoing effort to allow machine setup entirely via RPC. Includes shortening the command line form cxl-fixed-memory-window to cxl-fmw as the command lines are extremely long even with this change. The json change is needed to ensure that there is a CXLFixedMemoryWindowOptionsList even though the actual element in the json is never used. Similar to existing SgxEpcProperties. Update qemu-options.hx to reflect that this is now a -machine parameter. The bulk of -M / -machine parameters are documented under machine, so use that in preference to M. Update cxl-test and bios-tables-test to reflect new parameters. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Message-Id: <20220608145440.26106-2-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-06-03hw/nvme: do not auto-generate eui64Klaus Jensen
We cannot provide auto-generated unique or persistent namespace identifiers (EUI64, NGUID, UUID) easily. Since 6.1, namespaces have been assigned a generated EUI64 of the form "52:54:00:<namespace counter>". This is will be unique within a QEMU instance, but not globally. Revert that this is assigned automatically and immediately deprecate the compatibility parameter. Users can opt-in to this with the `eui64-default=on` device parameter or set it explicitly with `eui64=UINT64`. Cc: libvir-list@redhat.com Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2022-05-19hw/intc/arm_gicv3: Use correct number of priority bits for the CPUPeter Maydell
Make the GICv3 set its number of bits of physical priority from the implementation-specific value provided in the CPU state struct, in the same way we already do for virtual priority bits. Because this would be a migration compatibility break, we provide a property force-8-bit-prio which is enabled for 7.0 and earlier versioned board models to retain the legacy "always use 8 bits" behaviour. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20220512151457.3899052-6-peter.maydell@linaro.org Message-id: 20220506162129.2896966-5-peter.maydell@linaro.org
2022-05-13cxl: Machine level control on whether CXL support is enabledJonathan Cameron
There are going to be some potential overheads to CXL enablement, for example the host bridge region reserved in memory maps. Add a machine level control so that CXL is disabled by default. Signed-off-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220429144110.25167-14-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-12machine: move more memory validation to Machine objectPaolo Bonzini
This allows setting memory properties without going through vl.c, and have them validated just the same. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220414165300.555321-6-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-12machine: make memory-backend a link propertyPaolo Bonzini
Handle HostMemoryBackend creation and setting of ms->ram entirely in machine_run_board_init. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220414165300.555321-5-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-12machine: add mem compound propertyPaolo Bonzini
Make -m syntactic sugar for a compound property "-machine mem.{size,max-size,slots}". The new property does not have the magic conversion to megabytes of unsuffixed arguments, and also does not understand that "0" means the default size (you have to leave it out to get the default). This means that we need to convert the QemuOpts by hand to a QDict. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220414165300.555321-4-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-12machine: add boot compound propertyPaolo Bonzini
Make -boot syntactic sugar for a compound property "-machine boot.{order,menu,...}". machine_boot_parse is replaced by the setter for the property. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220414165300.555321-3-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-12machine: use QAPI struct for boot configurationPaolo Bonzini
As part of converting -boot to a property with a QAPI type, define the struct and use it throughout QEMU to access boot configuration. machine_boot_parse takes care of doing the QemuOpts->QAPI conversion by hand, for now. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220414165300.555321-2-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-09qapi/machine.json: Add cluster-idGavin Shan
This adds cluster-id in CPU instance properties, which will be used by arm/virt machine. Besides, the cluster-id is also verified or dumped in various spots: * hw/core/machine.c::machine_set_cpu_numa_node() to associate CPU with its NUMA node. * hw/core/machine.c::machine_numa_finish_cpu_init() to record CPU slots with no NUMA mapping set. * hw/core/machine-hmp-cmds.c::hmp_hotpluggable_cpus() to dump cluster-id. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Yanan Wang <wangyanan55@huawei.com> Acked-by: Igor Mammedov <imammedo@redhat.com> Message-id: 20220503140304.855514-2-gshan@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-04-20hw: Add compat machines for 7.1Cornelia Huck
Add 7.1 machine types for arm/i440fx/m68k/q35/s390x/spapr. Signed-off-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20220316145521.1224083-1-cohuck@redhat.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Acked-by: Laurent Vivier <laurent@vivier.eu> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Yanan Wang <wangyanan55@huawei.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-04-06acpi: fix acpi_index migrationDr. David Alan Gilbert
vmstate_acpi_pcihp_use_acpi_index() was expecting AcpiPciHpState as state but it actually received PIIX4PMState, because VMSTATE_PCI_HOTPLUG is a macro and not another struct. So it ended up accessing random pointer, which resulted in 'false' return value and acpi_index field wasn't ever sent. However in 7.0 that pointer de-references to value > 0, and destination QEMU starts to expect the field which isn't sent in migratioon stream from older QEMU (6.2 and older). As result migration fails with: qemu-system-x86_64: Missing section footer for 0000:00:01.3/piix4_pm qemu-system-x86_64: load of migration failed: Invalid argument In addition with QEMU-6.2, destination due to not expected state, also never expects the acpi_index field in migration stream. Q35 is not affected as it always sends/expects the field as long as acpi based PCI hotplug is enabled. Fix issue by introducing compat knob to never send/expect acpi_index in migration stream for 6.2 and older PC machine types and always send it for 7.0 and newer PC machine types. Diagnosed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Fixes: b32bd76 ("pci: introduce acpi-index property for PCI device") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/932 Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-01-18machine: Use host_memory_backend_is_mapped() in machine_consume_memdev()David Hildenbrand
memory_region_is_mapped() is the wrong check, we actually want to check whether the backend is already marked mapped. For example, memory regions mapped via an alias, such as NVDIMMs, currently don't make memory_region_is_mapped() return "true". As the machine is initialized before any memory devices (and thereby before NVDIMMs are initialized), this isn't a fix but merely a cleanup. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211102164317.45658-2-david@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-01-05hw: Add compat machines for 7.0Cornelia Huck
Add 7.0 machine types for arm/i440fx/q35/s390x/spapr. Signed-off-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20211217143948.289995-1-cohuck@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-12-31hw/core/machine: Introduce CPU cluster topology supportYanan Wang
The new Cluster-Aware Scheduling support has landed in Linux 5.16, which has been proved to benefit the scheduling performance (e.g. load balance and wake_affine strategy) on both x86_64 and AArch64. So now in Linux 5.16 we have four-level arch-neutral CPU topology definition like below and a new scheduler level for clusters. struct cpu_topology { int thread_id; int core_id; int cluster_id; int package_id; int llc_id; cpumask_t thread_sibling; cpumask_t core_sibling; cpumask_t cluster_sibling; cpumask_t llc_sibling; } A cluster generally means a group of CPU cores which share L2 cache or other mid-level resources, and it is the shared resources that is used to improve scheduler's behavior. From the point of view of the size range, it's between CPU die and CPU core. For example, on some ARM64 Kunpeng servers, we have 6 clusters in each NUMA node, and 4 CPU cores in each cluster. The 4 CPU cores share a separate L2 cache and a L3 cache tag, which brings cache affinity advantage. In virtualization, on the Hosts which have pClusters (physical clusters), if we can design a vCPU topology with cluster level for guest kernel and have a dedicated vCPU pinning. A Cluster-Aware Guest kernel can also make use of the cache affinity of CPU clusters to gain similar scheduling performance. This patch adds infrastructure for CPU cluster level topology configuration and parsing, so that the user can specify cluster parameter if their machines support it. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Message-Id: <20211228092221.21068-3-wangyanan55@huawei.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> [PMD: Added '(since 7.0)' to @clusters in qapi/machine.json] Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2021-12-31hw/core: Rename smp_parse() -> machine_parse_smp_config()Philippe Mathieu-Daudé
All methods related to MachineState are prefixed with "machine_". smp_parse() does not need to be an exception. Rename it and const'ify the SMPConfiguration argument, since it doesn't need to be modified. Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Yanan Wang <wangyanan55@huawei.com> Tested-by: Yanan Wang <wangyanan55@huawei.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211216132015.815493-9-philmd@redhat.com>
2021-11-19hw/nvme: change nvme-ns 'shared' defaultKlaus Jensen
Change namespaces to be shared namespaces by default (parameter shared=on). Keep shared=off for older machine types. Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-11-05Fix virtio-net-pci* "vectors" compatEduardo Habkost
hw_compat_5_2 has an issue: it affects only "virtio-net-pci" but not "virtio-net-pci-transitional" and "virtio-net-pci-non-transitional". The solution is to use the "virtio-net-pci-base" type in compat_props. Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1999141 Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Jean-Louis Dupond <jean-louis@dupond.be> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Jean-Louis Dupond <jean-louis@dupond.be> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-11-03Merge remote-tracking branch ↵Richard Henderson
'remotes/vivier/tags/trivial-branch-for-6.2-pull-request' into staging Trivial patches branch pull request 20211101 v2 # gpg: Signature made Tue 02 Nov 2021 07:21:44 PM EDT # gpg: using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C # gpg: issuer "laurent@vivier.eu" # gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full] # gpg: aka "Laurent Vivier <laurent@vivier.eu>" [full] # gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full] * remotes/vivier/tags/trivial-branch-for-6.2-pull-request: hw/input/lasips2: Fix typos in function names MAINTAINERS: Split HPPA TCG vs HPPA machines/hardware hw/core/machine: Add the missing delimiter in cpu_slot_to_string() monitor: Trim some trailing space from human-readable output Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-11-01machine: remove the done notifier for dynamic sysbus device type checkDamien Hedde
Now that we check sysbus device types during device creation, we can remove the check in the machine init done notifier. This was the only thing done by this notifier, so we remove the whole sysbus_notifier structure of the MachineState. Note: This notifier was checking all /peripheral and /peripheral-anon sysbus devices. Now we only check those added by -device cli option or device_add qmp command when handling the command/option. So if there are some devices added in one of these containers manually (eg in machine C code), these will not be checked anymore. This use case does not seem to appear apart from hw/xen/xen-legacy-backend.c (it uses qdev_set_id() and in this case, not for a sysbus device, so it's ok). Signed-off-by: Damien Hedde <damien.hedde@greensocs.com> Acked-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20211029142258.484907-4-damien.hedde@greensocs.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2021-11-01machine: add device_type_is_dynamic_sysbus functionDamien Hedde
Right now the allowance check for adding a sysbus device using -device cli option (or device_add qmp command) is done well after the device has been created. It is done during the machine init done notifier: machine_init_notify() in hw/core/machine.c This new function will allow us to do the check at the right time and issue an error if it fails. Also make device_is_dynamic_sysbus() use the new function. Signed-off-by: Damien Hedde <damien.hedde@greensocs.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20211029142258.484907-2-damien.hedde@greensocs.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2021-11-01hw/core/machine: Split out the smp parsing codeYanan Wang
We are going to introduce an unit test for the parser smp_parse() in hw/core/machine.c, but now machine.c is only built in softmmu. In order to solve the build dependency on the smp parsing code and avoid building unrelated stuff for the unit tests, move the tested code from machine.c into a separate file, i.e., machine-smp.c and build it in common field. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211026034659.22040-2-wangyanan55@huawei.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2021-10-31hw/core/machine: Add the missing delimiter in cpu_slot_to_string()Yanan Wang
The expected output string from cpu_slot_to_string() ought to be like "socket-id: *, die-id: *, core-id: *, thread-id: *", so add the missing ", " before "die-id". This affects the readability of the error message. Fixes: 176d2cda0d ("i386/cpu: Consolidate die-id validity in smp context") Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Message-Id: <20211008075040.18028-1-wangyanan55@huawei.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-10-05vhost-vsock: handle common features in vhost-vsock-commonStefano Garzarella
virtio-vsock features, like VIRTIO_VSOCK_F_SEQPACKET, can be handled by vhost-vsock-common parent class. In this way, we can reuse the same code for all virtio-vsock backends (i.e. vhost-vsock, vhost-user-vsock). Let's move `seqpacket` property to vhost-vsock-common class, add vhost_vsock_common_get_features() used by children, and disable `seqpacket` for vhost-user-vsock device for machine types < 6.2. The behavior of vhost-vsock device doesn't change; vhost-user-vsock device now supports `seqpacket` property. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20210921161642.206461-3-sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-10-05vhost-vsock: fix migration issue when seqpacket is supportedStefano Garzarella
Commit 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support") enabled the SEQPACKET feature bit. This commit is released with QEMU 6.1, so if we try to migrate a VM where the host kernel supports SEQPACKET but machine type version is less than 6.1, we get the following errors: Features 0x130000002 unsupported. Allowed features: 0x179000000 Failed to load virtio-vhost_vsock:virtio error while loading state for instance 0x0 of device '0000:00:05.0/virtio-vhost_vsock' load of migration failed: Operation not permitted Let's disable the feature bit for machine types < 6.1. We add a new OnOffAuto property for this, called `seqpacket`. When it is `auto` (default), QEMU behaves as before, trying to enable the feature, when it is `on` QEMU will fail if the backend (vhost-vsock kernel module) doesn't support it. Fixes: 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support") Cc: qemu-stable@nongnu.org Reported-by: Jiang Wang <jiang.wang@bytedance.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20210921161642.206461-2-sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-10-01machine: Put all sanity-check in the generic SMP parserYanan Wang
Put both sanity-check of the input SMP configuration and sanity-check of the output SMP configuration uniformly in the generic parser. Then machine_set_smp() will become cleaner, also all the invalid scenarios can be tested only by calling the parser. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210929025816.21076-16-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-01machine: Use g_autoptr in machine_set_smpPaolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-01machine: Move smp_prefer_sockets to struct SMPCompatPropsYanan Wang
Now we have a common structure SMPCompatProps used to store information about SMP compatibility stuff, so we can also move smp_prefer_sockets there for cleaner code. No functional change intended. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210929025816.21076-15-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-01machine: Remove smp_parse callback from MachineClassYanan Wang
Now we have a generic smp parser for all arches, and there will not be any other arch specific ones, so let's remove the callback from MachineClass and call the parser directly. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210929025816.21076-14-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-01machine: Make smp_parse generic enough for all archesYanan Wang
Currently the only difference between smp_parse and pc_smp_parse is the support of dies parameter and the related error reporting. With some arch compat variables like "bool dies_supported", we can make smp_parse generic enough for all arches and the PC specific one can be removed. Making smp_parse() generic enough can reduce code duplication and ease the code maintenance, and also allows extending the topology with more arch specific members (e.g., clusters) in the future. Suggested-by: Andrew Jones <drjones@redhat.com> Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210929025816.21076-13-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-01machine: Tweak the order of topology members in struct CpuTopologyYanan Wang
Now that all the possible topology parameters are integrated in struct CpuTopology, tweak the order of topology members to be "cpus/sockets/ dies/cores/threads/maxcpus" for readability and consistency. We also tweak the comment by adding explanation of dies parameter. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210929025816.21076-12-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-01machine: Use ms instead of global current_machine in sanity-checkYanan Wang
In the sanity-check of smp_cpus and max_cpus against mc in function machine_set_smp(), we are now using ms->smp.max_cpus for the check but using current_machine->smp.max_cpus in the error message. Tweak this by uniformly using the local ms. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210929025816.21076-11-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-01machine: Prefer cores over sockets in smp parsing since 6.2Yanan Wang
In the real SMP hardware topology world, it's much more likely that we have high cores-per-socket counts and few sockets totally. While the current preference of sockets over cores in smp parsing results in a virtual cpu topology with low cores-per-sockets counts and a large number of sockets, which is just contrary to the real world. Given that it is better to make the virtual cpu topology be more reflective of the real world and also for the sake of compatibility, we start to prefer cores over sockets over threads in smp parsing since machine type 6.2 for different arches. In this patch, a boolean "smp_prefer_sockets" is added, and we only enable the old preference on older machines and enable the new one since type 6.2 for all arches by using the machine compat mechanism. Suggested-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210929025816.21076-10-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-01machine: Improve the error reporting of smp parsingYanan Wang
We have two requirements for a valid SMP configuration: the product of "sockets * cores * threads" must represent all the possible cpus, i.e., max_cpus, and then must include the initially present cpus, i.e., smp_cpus. So we only need to ensure 1) "sockets * cores * threads == maxcpus" at first and then ensure 2) "maxcpus >= cpus". With a reasonable order of the sanity check, we can simplify the error reporting code. When reporting an error message we also report the exact value of each topology member to make users easily see what's going on. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210929025816.21076-7-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-01machine: Set the value of cpus to match maxcpus if it's omittedYanan Wang
Currently we directly calculate the omitted cpus based on the given incomplete collection of parameters. This makes some cmdlines like: -smp maxcpus=16 -smp sockets=2,maxcpus=16 -smp sockets=2,dies=2,maxcpus=16 -smp sockets=2,cores=4,maxcpus=16 not work. We should probably set the value of cpus to match maxcpus if it's omitted, which will make above configs start to work. So the calculation logic of cpus/maxcpus after this patch will be: When both maxcpus and cpus are omitted, maxcpus will be calculated from the given parameters and cpus will be set equal to maxcpus. When only one of maxcpus and cpus is given then the omitted one will be set to its counterpart's value. Both maxcpus and cpus may be specified, but maxcpus must be equal to or greater than cpus. Note: change in this patch won't affect any existing working cmdlines but allows more incomplete configs to be valid. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210929025816.21076-6-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-01machine: Uniformly use maxcpus to calculate the omitted parametersYanan Wang
We are currently using maxcpus to calculate the omitted sockets but using cpus to calculate the omitted cores/threads. This makes cmdlines like: -smp cpus=8,maxcpus=16 -smp cpus=8,cores=4,maxcpus=16 -smp cpus=8,threads=2,maxcpus=16 work fine but the ones like: -smp cpus=8,sockets=2,maxcpus=16 -smp cpus=8,sockets=2,cores=4,maxcpus=16 -smp cpus=8,sockets=2,threads=2,maxcpus=16 break the sanity check. Since we require for a valid config that the product of "sockets * cores * threads" should equal to the maxcpus, we should uniformly use maxcpus to calculate their omitted values. Also the if-branch of "cpus == 0 || sockets == 0" was split into two branches of "cpus == 0" and "sockets == 0" so that we can clearly read that we are parsing the configuration with a preference on cpus over sockets over cores over threads. Note: change in this patch won't affect any existing working cmdlines but improves consistency and allows more incomplete configs to be valid. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210929025816.21076-5-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-01machine: Minor refactor/fix for the smp parsersYanan Wang
To pave the way for the functional improvement in later patches, make some refactor/cleanup for the smp parsers, including using local maxcpus instead of ms->smp.max_cpus in the calculation, defaulting dies to 0 initially like other members, cleanup the sanity check for dies. We actually also fix a hidden defect by avoiding directly using the provided *zero value* in the calculation, which could cause a segment fault (e.g. using dies=0 in the calculation). Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210929025816.21076-4-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-01machine: Deprecate "parameter=0" SMP configurationsYanan Wang
In the SMP configuration, we should either provide a topology parameter with a reasonable value (greater than zero) or just omit it and QEMU will compute the missing value. The users shouldn't provide a configuration with any parameter of it specified as zero (e.g. -smp 8,sockets=0) which could possibly cause unexpected results in the -smp parsing. So we deprecate this kind of configurations since 6.2 by adding the explicit sanity check. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210929025816.21076-3-wangyanan55@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-01hw: Add compat machines for 6.2Yanan Wang
Add 6.2 machine types for arm/i440fx/q35/s390x/spapr. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-08-13hw/core: fix error checking in smp_parseDaniel P. Berrangé
machine_set_smp() mistakenly checks 'errp' not '*errp', and so thinks there is an error every single time it runs. This causes it to jump to the end of the method, skipping the max CPUs checks. The caller meanwhile sees no error and so carries on execution. The result of all this is: $ qemu-system-x86_64 -smp -1 qemu-system-x86_64: GLib: ../glib/gmem.c:142: failed to allocate 481036337048 bytes instead of $ qemu-system-x86_64 -smp -1 qemu-system-x86_64: Invalid SMP CPUs -1. The max CPUs supported by machine 'pc-i440fx-6.1' is 255 This is a regression from commit fe68090e8fbd6e831aaf3fc3bb0459c5cccf14cf Author: Paolo Bonzini <pbonzini@redhat.com> Date: Thu May 13 09:03:48 2021 -0400 machine: add smp compound property Closes: https://gitlab.com/qemu-project/qemu/-/issues/524 Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210812175353.4128471-1-berrange@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-13hw/core: Add missing return on errorPhilippe Mathieu-Daudé
If dies is not supported by this machine's CPU topology, don't keep processing options and return directly. Fixes: 0aebebb561c ("machine: reject -smp dies!=1 for non-PC machines") Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210813112608.1452541-2-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02hw/net: e1000e: Correct the initial value of VET registerChristina Wang
The initial value of VLAN Ether Type (VET) register is 0x8100, as per the manual and real hardware. While Linux e1000e driver always writes VET register to 0x8100, it is not always the case for everyone. Drivers relying on the reset value of VET won't be able to transmit and receive VLAN frames in QEMU. Unlike e1000 in QEMU, e1000e uses a field 'vet' in "struct E1000Core" to cache the value of VET register, but the cache only gets updated when VET register is written. To always get a consistent VET value no matter VET is written or remains its reset value, drop the 'vet' field and use 'core->mac[VET]' directly. Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com> Signed-off-by: Christina Wang <christina.wang@windriver.com> Signed-off-by: Bin Meng <bin.meng@windriver.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-08-02hw/net: e1000: Correct the initial value of VET registerChristina Wang
The initial value of VLAN Ether Type (VET) register is 0x8100, as per the manual and real hardware. While Linux e1000 driver always writes VET register to 0x8100, it is not always the case for everyone. Drivers relying on the reset value of VET won't be able to transmit and receive VLAN frames in QEMU. Reported-by: Markus Carlstedt <markus.carlstedt@windriver.com> Signed-off-by: Christina Wang <christina.wang@windriver.com> Signed-off-by: Bin Meng <bin.meng@windriver.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-07-16hw/pci/pcie: Do not set HPC flag if acpihp is usedJulia Suvorova
Instead of changing the hot-plug type in _OSC register, do not set the 'Hot-Plug Capable' flag. This way guest will choose ACPI hot-plug if it is preferred and leave the option to use SHPC with pcie-pci-bridge. The ability to control hot-plug for each downstream port is retained, while 'hotplug=off' on the port means all hot-plug types are disabled. Signed-off-by: Julia Suvorova <jusual@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20210713004205.775386-4-jusual@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-07-13numa: Report expected initiatorMichal Privoznik
When setting up NUMA with HMAT enabled there's a check performed in machine_set_cpu_numa_node() that reports an error when a NUMA node has a CPU but the node's initiator is not itself. The error message reported contains only the expected value and not the actual value (which is different because an error is being reported). Report both values in the error message. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Message-Id: <ebdf871551ea995bafa7a858899a26aa9bc153d3.1625662776.git.mprivozn@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-07-06machine: add smp compound propertyPaolo Bonzini
Make -smp syntactic sugar for a compound property "-machine smp.{cores,threads,cpu,...}". machine_smp_parse is replaced by the setter for the property. numa-test will now cover the new syntax, while other tests still use -smp. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>