aboutsummaryrefslogtreecommitdiff
path: root/hw/core/numa.c
AgeCommit message (Collapse)Author
2020-02-19remove no longer used memory_region_allocate_system_memory()Igor Mammedov
all boards were switched to using memdev backend for main RAM, so we can drop no longer used memory_region_allocate_system_memory() Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200219160953.13771-73-imammedo@redhat.com>
2020-02-19initialize MachineState::ram in NUMA caseIgor Mammedov
In case of NUMA there are 2 cases to consider: 1. '-numa node,memdev', the only one that will be available for 5.0 and newer machine types. In this case reuse current behavior, with only difference memdevs are put into MachineState::ram container + a temporary glue to keep memory_region_allocate_system_memory() working until all boards converted. 2. fake NUMA ("-numa node mem" and default RAM splitting) the later has been deprecated and will be removed but the former is going to stay available for compat reasons for 5.0 and older machine types it takes allocate_system_memory_nonnuma() path, like non-NUMA case and falls under conversion to memdev. So extend non-NUMA MachineState::ram initialization introduced in previous patch to take care of fake NUMA case. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20200219160953.13771-6-imammedo@redhat.com>
2020-02-19machine: introduce convenience MachineState::ramIgor Mammedov
the new field will be used by boards to get access to main RAM memory region and will help to save boiler plate in boards which often introduce a field or variable just for this purpose. Memory region will be equivalent to what currently used memory_region_allocate_system_memory() is returning apart from that it will come from hostmem backend. Followup patches will incrementally switch boards to using RAM from MachineState::ram. Patch takes care of non-NUMA case and follow up patch will initialize MachineState::ram for NUMA case. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20200219160953.13771-5-imammedo@redhat.com>
2020-02-19numa: remove deprecated -mem-path fallback to anonymous RAMIgor Mammedov
it has been deprecated since 4.0 by commit cb79224b7 (deprecate -mem-path fallback to anonymous RAM) Deprecation period ran out and it's time to remove it so it won't get in a way of switching to using hostmem backend for RAM. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20200219160953.13771-2-imammedo@redhat.com>
2020-01-07Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell
virtio, pci, pc: fixes, features Bugfixes all over the place. HMAT support. New flags for vhost-user-blk utility. Auto-tuning of seg max for virtio storage. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Mon 06 Jan 2020 17:05:05 GMT # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: (32 commits) intel_iommu: add present bit check for pasid table entries intel_iommu: a fix to vtd_find_as_from_bus_num() virtio-net: delete also control queue when TX/RX deleted virtio: reset region cache when on queue deletion virtio-mmio: update queue size on guest write tests: add virtio-scsi and virtio-blk seg_max_adjust test virtio: make seg_max virtqueue size dependent hw: fix using 4.2 compat in 5.0 machine types for i440fx/q35 vhost-user-scsi: reset the device if supported vhost-user: add VHOST_USER_RESET_DEVICE to reset devices hw/pci/pci_host: Let pci_data_[read/write] use unsigned 'size' argument hw/pci/pci_host: Remove redundant PCI_DPRINTF() virtio-mmio: Clear v2 transport state on soft reset ACPI: add expected files for HMAT tests (acpihmat) tests/bios-tables-test: add test cases for ACPI HMAT tests/numa: Add case for QMP build HMAT hmat acpi: Build Memory Side Cache Information Structure(s) hmat acpi: Build System Locality Latency and Bandwidth Information Structure(s) hmat acpi: Build Memory Proximity Domain Attributes Structure(s) numa: Extend CLI to provide memory side cache information ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-01-05numa: Extend CLI to provide memory side cache informationLiu Jingqi
Add -numa hmat-cache option to provide Memory Side Cache Information. These memory attributes help to build Memory Side Cache Information Structure(s) in ACPI Heterogeneous Memory Attribute Table (HMAT). Before using hmat-cache option, enable HMAT with -machine hmat=on. Acked-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Liu Jingqi <jingqi.liu@intel.com> Signed-off-by: Tao Xu <tao3.xu@intel.com> Message-Id: <20191213011929.2520-4-tao3.xu@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2020-01-05numa: Extend CLI to provide memory latency and bandwidth informationLiu Jingqi
Add -numa hmat-lb option to provide System Locality Latency and Bandwidth Information. These memory attributes help to build System Locality Latency and Bandwidth Information Structure(s) in ACPI Heterogeneous Memory Attribute Table (HMAT). Before using hmat-lb option, enable HMAT with -machine hmat=on. Acked-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Liu Jingqi <jingqi.liu@intel.com> Signed-off-by: Tao Xu <tao3.xu@intel.com> Message-Id: <20191213011929.2520-3-tao3.xu@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2020-01-05numa: Extend CLI to provide initiator information for numa nodesTao Xu
In ACPI 6.3 chapter 5.2.27 Heterogeneous Memory Attribute Table (HMAT), The initiator represents processor which access to memory. And in 5.2.27.3 Memory Proximity Domain Attributes Structure, the attached initiator is defined as where the memory controller responsible for a memory proximity domain. With attached initiator information, the topology of heterogeneous memory can be described. Add new machine property 'hmat' to enable all HMAT specific options. Extend CLI of "-numa node" option to indicate the initiator numa node-id. In the linux kernel, the codes in drivers/acpi/hmat/hmat.c parse and report the platform's HMAT tables. Before using initiator option, enable HMAT with -machine hmat=on. Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Jingqi Liu <jingqi.liu@intel.com> Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Tao Xu <tao3.xu@intel.com> Message-Id: <20191213011929.2520-2-tao3.xu@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-12-19numa: remove not needed checkIgor Mammedov
Currently parse_numa_node() is always called from already numa enabled context. Drop unnecessary check if numa is supported. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1576154936-178362-2-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-11-12numa: Add missing \n to error messageGreg Kurz
If memory allocation fails when using -mem-path, QEMU is supposed to print out a message to indicate that fallback to anonymous RAM is deprecated. This is done with error_printf() which does output buffering. As a consequence, the message is only printed at the next flush, eg. when quiting QEMU, and it also lacks a trailing newline: qemu-system-ppc64: unable to map backing store for guest RAM: Cannot allocate memory qemu-system-ppc64: warning: falling back to regular RAM allocation QEMU 4.1.50 monitor - type 'help' for more information (qemu) q This is deprecated. Make sure that -mem-path specified path has sufficient resources to allocate -m specified RAM amountgreg@boss02:~/Work/qemu/qemu-spapr$ Add the missing \n to fix both issues. Fixes: cb79224b7e4b "deprecate -mem-path fallback to anonymous RAM" Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <157304440026.351774.14607704217028190097.stgit@bahia.lan> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2019-10-15numa: Introduce MachineClass::auto_enable_numa for implicit NUMA nodeTao Xu
Add MachineClass::auto_enable_numa field. When it is true, a NUMA node is expected to be created implicitly. Acked-by: David Gibson <david@gibson.dropbear.id.au> Suggested-by: Igor Mammedov <imammedo@redhat.com> Suggested-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Tao Xu <tao3.xu@intel.com> Message-Id: <20190905083238.1799-1-tao3.xu@intel.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-09-03numa: move numa global variable numa_info into MachineStateTao Xu
Move existing numa global numa_info (renamed as "nodes") into NumaState. Reviewed-by: Igor Mammedov <imammedo@redhat.com> Suggested-by: Igor Mammedov <imammedo@redhat.com> Suggested-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Tao Xu <tao3.xu@intel.com> Message-Id: <20190809065731.9097-5-tao3.xu@intel.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-09-03numa: move numa global variable have_numa_distance into MachineStateTao Xu
Move existing numa global have_numa_distance into NumaState. Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Liu Jingqi <jingqi.liu@intel.com> Suggested-by: Igor Mammedov <imammedo@redhat.com> Suggested-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Tao Xu <tao3.xu@intel.com> Message-Id: <20190809065731.9097-4-tao3.xu@intel.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-09-03numa: move numa global variable nb_numa_nodes into MachineStateTao Xu
Add struct NumaState in MachineState and move existing numa global nb_numa_nodes(renamed as "num_nodes") into NumaState. And add variable numa_support into MachineClass to decide which submachines support NUMA. Reviewed-by: Igor Mammedov <imammedo@redhat.com> Suggested-by: Igor Mammedov <imammedo@redhat.com> Suggested-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Tao Xu <tao3.xu@intel.com> Message-Id: <20190809065731.9097-3-tao3.xu@intel.com> [ehabkost: include hw/boards.h again to fix build failures] Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-08-21hw/core: Move cpu.c, cpu.h from qom/ to hw/core/Markus Armbruster
Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190709152053.16670-2-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> [Rebased onto merge commit 95a9457fd44; missed instances of qom/cpu.h in comments replaced]
2019-08-16Include sysemu/sysemu.h a lot lessMarkus Armbruster
In my "build everything" tree, changing sysemu/sysemu.h triggers a recompile of some 5400 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). hw/qdev-core.h includes sysemu/sysemu.h since recent commit e965ffa70a "qdev: add qdev_add_vm_change_state_handler()". This is a bad idea: hw/qdev-core.h is widely included. Move the declaration of qdev_add_vm_change_state_handler() to sysemu/sysemu.h, and drop the problematic include from hw/qdev-core.h. Touching sysemu/sysemu.h now recompiles some 1800 objects. qemu/uuid.h also drops from 5400 to 1800. A few more headers show smaller improvement: qemu/notify.h drops from 5600 to 5200, qemu/timer.h from 5600 to 4500, and qapi/qapi-types-run-state.h from 5500 to 5000. Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20190812052359.30071-28-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2019-08-16numa: Move remaining NUMA declarations from sysemu.h to numa.hMarkus Armbruster
Commit e35704ba9c "numa: Move NUMA declarations from sysemu.h to numa.h" left a few NUMA-related macros behind. Move them now. Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20190812052359.30071-26-armbru@redhat.com>
2019-08-16Include hw/boards.h a bit lessMarkus Armbruster
hw/boards.h pulls in almost 60 headers. The less we include it into headers, the better. As a first step, drop superfluous inclusions, and downgrade some more to what's actually needed. Gets rid of just one inclusion into a header. Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190812052359.30071-23-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
2019-08-16Include migration/vmstate.h lessMarkus Armbruster
In my "build everything" tree, changing migration/vmstate.h triggers a recompile of some 2700 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). hw/hw.h supposedly includes it for convenience. Several other headers include it just to get VMStateDescription. The previous commit made that unnecessary. Include migration/vmstate.h only where it's still needed. Touching it now recompiles only some 1600 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190812052359.30071-16-armbru@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-07-05numa: allow memory-less nodes when using memdev as backendIgor Mammedov
QEMU fails to start if memory-less node is present when memdev is used qemu-system-x86_64 -object memory-backend-ram,id=ram0,size=128M \ -numa node -numa node,memdev=ram0 with error: "memdev option must be specified for either all or no nodes" which works as expected if legacy 'mem' is used. Fix check to make memory-less nodes valid when memdev option is used but still disallow mix of mem and memdev options. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20190702140745.27767-2-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05numa: Make deprecation warnings conditional on !qtest_enabled()Eduardo Habkost
This will help us avoid spurious warnings during "make check". Note that this will silence the warnings generated by tests/numa-test, but not the ones generated by tests/bios-tables-test. We still need to change tests/bios-tables-test to use "-numa ...,memdev=" to silence these warnings. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20190702215726.23661-1-ehabkost@redhat.com>
2019-07-05deprecate -mem-path fallback to anonymous RAMIgor Mammedov
Fallback might affect guest or worse whole host performance or functionality if backing file were used to share guest RAM with another process. Patch deprecates fallback so that we could remove it in future and ensure that QEMU will provide expected behavior and fail if it can't use user provided backing file. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190626074228.11558-1-imammedo@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05numa: deprecate implict memory distribution between nodesIgor Mammedov
Implicit RAM distribution between nodes has exactly the same issues as: "numa: deprecate 'mem' parameter of '-numa node' option" only with QEMU being the user that's 'adding' 'mem' parameter. Deprecate it, to get it out of the way so that we could consolidate guest RAM allocation using memory backends making it consistent and possibly later on transition to using memory devices instead of adhoc memory mapping for the initial RAM. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1559205199-233510-4-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05numa: deprecate 'mem' parameter of '-numa node' optionIgor Mammedov
The parameter allows to configure fake NUMA topology where guest VM simulates NUMA topology but not actually getting performance benefits from it. The same or better results could be achieved using 'memdev' parameter. Beside of unpredictable performance, '-numa node.mem' option has other issues when it's used with combination of -mem-path + + -mem-prealloc + memdev backends (pc-dimm), breaking binding of memdev backends since mem-path/mem-prealloc are global and affect the most of RAM allocations. It's possible to make memdevs and global -mem-path/mem-prealloc to play nicely together but that will just complicate already complicated code and add unobious ways it could break on 2 different memmory allocation pathes and their combinations. Instead of it, consolidate all guest RAM allocation over memdev which still allows to create fake NUMA configurations if desired and leaves one simplifyed code path to consider when it comes to guest RAM allocation. To achieve desired simplification deprecate 'mem' parameter as its ad-hoc partitioning of initial RAM MemoryRegion can't be translated to memdev based backend transparently to users and in compatible manner (migration wise). Later down the road that will allow to consolidate means of how guest RAM is allocated and would permit us to clean up quite a bit memory allocations and numa code, leaving only 'memdev' implementation in place. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1559205199-233510-3-git-send-email-imammedo@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05general: Replace global smp variables with smp machine propertiesLike Xu
Basically, the context could get the MachineState reference via call chains or unrecommended qdev_get_machine() in !CONFIG_USER_ONLY mode. A local variable of the same name would be introduced in the declaration phase out of less effort OR replace it on the spot if it's only used once in the context. No semantic changes. Signed-off-by: Like Xu <like.xu@linux.intel.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190518205428.90532-4-like.xu@linux.intel.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell
virtio, pc, pci: features, fixes, cleanups virtio-pmem support. libvhost user mq support. A bunch of fixes all over the place. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Thu 04 Jul 2019 22:00:49 BST # gpg: using RSA key 281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: (22 commits) docs: avoid vhost-user-net specifics in multiqueue section libvhost-user: implement VHOST_USER_PROTOCOL_F_MQ libvhost-user: support many virtqueues libvhost-user: add vmsg_set_reply_u64() helper pc: Move compat_apic_id_mode variable to PCMachineClass virtio: Don't change "started" flag on virtio_vmstate_change() virtio: Make sure we get correct state of device on handle_aio_output() virtio: Set "start_on_kick" on virtio_set_features() virtio: Set "start_on_kick" for legacy devices virtio: add "use-started" property virtio-pci: fix missing device properties pc: Support for virtio-pmem-pci numa: Handle virtio-pmem in NUMA stats hmp: Handle virtio-pmem when printing memory device infos virtio-pci: Proxy for virtio-pmem virtio-pmem: sync linux headers virtio-pci: Allow to specify additional interfaces for the base type virtio-pmem: add virtio device pcie: minor cleanups for slot control/status pcie: work around for racy guest init ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-07-02hw/core: Collect QMP command handlers in hw/core/Markus Armbruster
The handlers for qapi/machine.json's QMP commands are spread over cpus.c, hw/core/numa.c, monitor/misc.c, monitor/qmp-cmds.c, and vl.c. Move them all to new hw/core/machine-qmp-cmds.c, where they are covered by MAINTAINERS section "Machine core", just like qapi/machine.json. Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190619201050.19040-11-armbru@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2019-07-02hw/core: Move numa.c to hw/core/Markus Armbruster
Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190619201050.19040-10-armbru@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>