aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-03-12pcie: Support PCIe Gen5/Gen6 link speedsLukas Stockner
This patch extends the PCIe link speed option so that slots can be configured as supporting 32GT/s (Gen5) or 64GT/s (Gen5) speeds. This is as simple as setting the appropriate bit in LnkCap2 and the appropriate value in LnkCap and LnkCtl2. Signed-off-by: Lukas Stockner <lstockner@genesiscloud.com> Message-Id: <20240215012326.3272366-1-lstockner@genesiscloud.com> Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12libvhost-user: Mark mmap'ed region memory as MADV_DONTDUMPDavid Hildenbrand
We already use MADV_NORESERVE to deal with sparse memory regions. Let's also set madvise(MADV_DONTDUMP), otherwise a crash of the process can result in us allocating all memory in the mmap'ed region for dumping purposes. This change implies that the mmap'ed rings won't be included in a coredump. If ever required for debugging purposes, we could mark only the mapped rings MADV_DODUMP. Ignore errors during madvise() for now. Reviewed-by: Raphael Norwitz <raphael@enfabrica.net> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20240214151701.29906-15-david@redhat.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12libvhost-user: Dynamically remap rings after (temporarily?) removing memory ↵David Hildenbrand
regions Currently, we try to remap all rings whenever we add a single new memory region. That doesn't quite make sense, because we already map rings when setting the ring address, and panic if that goes wrong. Likely, that handling was simply copied from set_mem_table code, where we actually have to remap all rings. Remapping all rings might require us to walk quite a lot of memory regions to perform the address translations. Ideally, we'd simply remove that remapping. However, let's be a bit careful. There might be some weird corner cases where we might temporarily remove a single memory region (e.g., resize it), that would have worked for now. Further, a ring might be located on hotplugged memory, and as the VM reboots, we might unplug that memory, to hotplug memory before resetting the ring addresses. So let's unmap affected rings as we remove a memory region, and try dynamically mapping the ring again when required. Acked-by: Raphael Norwitz <raphael@enfabrica.net> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20240214151701.29906-14-david@redhat.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12libvhost-user: Factor out vq usability checkDavid Hildenbrand
Let's factor it out to prepare for further changes. Reviewed-by: Raphael Norwitz <raphael@enfabrica.net> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20240214151701.29906-13-david@redhat.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12libvhost-user: Use most of mmap_offset as fd_offsetDavid Hildenbrand
In the past, QEMU would create memory regions that could partially cover hugetlb pages, making mmap() fail if we would use the mmap_offset as an fd_offset. For that reason, we never used the mmap_offset as an offset into the fd and instead always mapped the fd from the very start. However, that can easily result in us mmap'ing a lot of unnecessary parts of an fd, possibly repeatedly. QEMU nowadays does not create memory regions that partially cover huge pages -- it never really worked with postcopy. QEMU handles merging of regions that partially cover huge pages (due to holes in boot memory) since 2018 in c1ece84e7c93 ("vhost: Huge page align and merge"). Let's be a bit careful and not unconditionally convert the mmap_offset into an fd_offset. Instead, let's simply detect the hugetlb size and pass as much as we can as fd_offset, making sure that we call mmap() with a properly aligned offset. With QEMU and a virtio-mem device that is fully plugged (50GiB using 50 memslots) the qemu-storage daemon process consumes in the VA space 1281GiB before this change and 58GiB after this change. ================ Vhost user message ================ Request: VHOST_USER_ADD_MEM_REG (37) Flags: 0x9 Size: 40 Fds: 59 Adding region 4 guest_phys_addr: 0x0000000200000000 memory_size: 0x0000000040000000 userspace_addr: 0x00007fb73bffe000 old mmap_offset: 0x0000000080000000 fd_offset: 0x0000000080000000 new mmap_offset: 0x0000000000000000 mmap_addr: 0x00007f02f1bdc000 Successfully added new region ================ Vhost user message ================ Request: VHOST_USER_ADD_MEM_REG (37) Flags: 0x9 Size: 40 Fds: 59 Adding region 5 guest_phys_addr: 0x0000000240000000 memory_size: 0x0000000040000000 userspace_addr: 0x00007fb77bffe000 old mmap_offset: 0x00000000c0000000 fd_offset: 0x00000000c0000000 new mmap_offset: 0x0000000000000000 mmap_addr: 0x00007f0284000000 Successfully added new region Reviewed-by: Raphael Norwitz <raphael@enfabrica.net> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20240214151701.29906-12-david@redhat.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12libvhost-user: Speedup gpa_to_mem_region() and vu_gpa_to_va()David Hildenbrand
Let's speed up GPA to memory region / virtual address lookup. Store the memory regions ordered by guest physical addresses, and use binary search for address translation, as well as when adding/removing memory regions. Most importantly, this will speed up GPA->VA address translation when we have many memslots. Reviewed-by: Raphael Norwitz <raphael@enfabrica.net> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20240214151701.29906-11-david@redhat.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12libvhost-user: Factor out search for memory region by GPA and simplifyDavid Hildenbrand
Memory regions cannot overlap, and if we ever hit that case something would be really flawed. For example, when vhost code in QEMU decides to increase the size of memory regions to cover full huge pages, it makes sure to never create overlaps, and if there would be overlaps, it would bail out. QEMU commits 48d7c9757749 ("vhost: Merge sections added to temporary list"), c1ece84e7c93 ("vhost: Huge page align and merge") and e7b94a84b6cb ("vhost: Allow adjoining regions") added and clarified that handling and how overlaps are impossible. Consequently, each GPA can belong to at most one memory region, and everything else doesn't make sense. Let's factor out our search to prepare for further changes. Reviewed-by: Raphael Norwitz <raphael@enfabrica.net> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20240214151701.29906-10-david@redhat.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12libvhost-user: Don't search for duplicates when removing memory regionsDavid Hildenbrand
We cannot have duplicate memory regions, something would be deeply flawed elsewhere. Let's just stop the search once we found an entry. We'll add more sanity checks when adding memory regions later. Reviewed-by: Raphael Norwitz <raphael@enfabrica.net> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20240214151701.29906-9-david@redhat.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12libvhost-user: Don't zero out memory for memory regionsDavid Hildenbrand
dev->nregions always covers only valid entries. Stop zeroing out other array elements that are unused. Reviewed-by: Raphael Norwitz <raphael@enfabrica.net> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20240214151701.29906-8-david@redhat.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12libvhost-user: No need to check for NULL when unmappingDavid Hildenbrand
We never add a memory region if mmap() failed. Therefore, no need to check for NULL. Reviewed-by: Raphael Norwitz <raphael@enfabrica.net> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20240214151701.29906-7-david@redhat.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12libvhost-user: Factor out adding a memory regionDavid Hildenbrand
Let's factor it out, reducing quite some code duplication and perparing for further changes. If we fail to mmap a region and panic, we now simply don't add that (broken) region. Note that we now increment dev->nregions as we are successfully adding memory regions, and don't increment dev->nregions if anything went wrong. Reviewed-by: Raphael Norwitz <raphael@enfabrica.net> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20240214151701.29906-6-david@redhat.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12libvhost-user: Merge vu_set_mem_table_exec_postcopy() into ↵David Hildenbrand
vu_set_mem_table_exec() Let's reduce some code duplication and prepare for further changes. Reviewed-by: Raphael Norwitz <raphael@enfabrica.net> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20240214151701.29906-5-david@redhat.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12libvhost-user: Factor out removing all mem regionsDavid Hildenbrand
Let's factor it out. Note that the check for MAP_FAILED was wrong as we never set mmap_addr if mmap() failed. We'll remove the NULL check separately. Reviewed-by: Raphael Norwitz <raphael@enfabrica.net> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20240214151701.29906-4-david@redhat.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12libvhost-user: Bump up VHOST_USER_MAX_RAM_SLOTS to 509David Hildenbrand
Let's support up to 509 mem slots, just like vhost in the kernel usually does and the rust vhost-user implementation recently [1] started doing. This is required to properly support memory hotplug, either using multiple DIMMs (ACPI supports up to 256) or using virtio-mem. The 509 used to be the KVM limit, it supported 512, but 3 were used for internal purposes. Currently, KVM supports more than 512, but it usually doesn't make use of more than ~260 (i.e., 256 DIMMs + boot memory), except when other memory devices like PCI devices with BARs are used. So, 509 seems to work well for vhost in the kernel. Details can be found in the QEMU change that made virtio-mem consume up to 256 mem slots across all virtio-mem devices. [2] 509 mem slots implies 509 VMAs/mappings in the worst case (even though, in practice with virtio-mem we won't be seeing more than ~260 in most setups). With max_map_count under Linux defaulting to 64k, 509 mem slots still correspond to less than 1% of the maximum number of mappings. There are plenty left for the application to consume. [1] https://github.com/rust-vmm/vhost/pull/224 [2] https://lore.kernel.org/all/20230926185738.277351-1-david@redhat.com/ Reviewed-by: Raphael Norwitz <raphael@enfabrica.net> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20240214151701.29906-3-david@redhat.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12libvhost-user: Dynamically allocate memory for memory slotsDavid Hildenbrand
Let's prepare for increasing VHOST_USER_MAX_RAM_SLOTS by dynamically allocating dev->regions. We don't have any ABI guarantees (not dynamically linked), so we can simply change the layout of VuDev. Let's zero out the memory, just as we used to do. Reviewed-by: Raphael Norwitz <raphael@enfabrica.net> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20240214151701.29906-2-david@redhat.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12vdpa: fix network breakage after cancelling migrationSi-Wei Liu
Fix an issue where cancellation of ongoing migration ends up with no network connectivity. When canceling migration, SVQ will be switched back to the passthrough mode, but the right call fd is not programed to the device and the svq's own call fd is still used. At the point of this transitioning period, the shadow_vqs_enabled hadn't been set back to false yet, causing the installation of call fd inadvertently bypassed. Message-Id: <1707910082-10243-13-git-send-email-si-wei.liu@oracle.com> Fixes: a8ac88585da1 ("vhost: Add Shadow VirtQueue call forwarding capabilities") Cc: Eugenio Pérez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12vdpa: indicate transitional state for SVQ switchingSi-Wei Liu
svq_switching indicates the transitional state whether or not SVQ mode switching is in progress, and towards which direction. Add the neccessary state around where the switching would take place. Message-Id: <1707910082-10243-12-git-send-email-si-wei.liu@oracle.com> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12vdpa: define SVQ transitioning state for mode switchingSi-Wei Liu
Will be used in following patches. DISABLING(-1) means SVQ is being switched off to passthrough mode. ENABLING(1) means passthrough VQs are being switched to SVQ. DONE(0) means SVQ switching is completed. Message-Id: <1707910082-10243-11-git-send-email-si-wei.liu@oracle.com> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12vdpa: add trace event for vhost_vdpa_net_load_mqSi-Wei Liu
For better debuggability and observability. Message-Id: <1707910082-10243-10-git-send-email-si-wei.liu@oracle.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12vdpa: add trace events for vhost_vdpa_net_load_cmdSi-Wei Liu
For better debuggability and observability. Message-Id: <1707910082-10243-9-git-send-email-si-wei.liu@oracle.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12vdpa: add vhost_vdpa_set_dev_vring_base trace for svq modeSi-Wei Liu
For better debuggability and observability. Message-Id: <1707910082-10243-8-git-send-email-si-wei.liu@oracle.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12vdpa: add vhost_vdpa_get_vring_base trace for svq modeSi-Wei Liu
For better debuggability and observability. Message-Id: <1707910082-10243-7-git-send-email-si-wei.liu@oracle.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12vdpa: add vhost_vdpa_set_address_space_id traceSi-Wei Liu
For better debuggability and observability. Message-Id: <1707910082-10243-6-git-send-email-si-wei.liu@oracle.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12vdpa: factor out vhost_vdpa_net_get_nc_vdpaSi-Wei Liu
Introduce new API. No functional change on existing API. Message-Id: <1707910082-10243-5-git-send-email-si-wei.liu@oracle.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12vdpa: factor out vhost_vdpa_last_devSi-Wei Liu
Generalize duplicated condition check for the last vq of vdpa device to a common function. Message-Id: <1707910082-10243-4-git-send-email-si-wei.liu@oracle.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-12vdpa: add back vhost_vdpa_net_first_nc_vdpaSi-Wei Liu
Previous commits had it removed. Now adding it back because this function will be needed by future patches. Message-Id: <1707910082-10243-2-git-send-email-si-wei.liu@oracle.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-09Merge tag 'hw-misc-20240309' of https://github.com/philmd/qemu into stagingPeter Maydell
Misc HW patch queue - hmp: Shorter 'info qtree' output (Zoltan) - qdev: Add a granule_mode property (Eric) - Some ERRP_GUARD() fixes (Zhao) - Doc & style fixes in docs/interop/firmware.json (Thomas) - hw/xen: Housekeeping (Phil) - hw/ppc/mac99: Change timebase frequency 25 -> 100 MHz (Mark) - hw/intc/apic: Memory leak fix (Paolo) - hw/intc/grlib_irqmp: Ensure ncpus value is in range (Clément) - hw/m68k/mcf5208: Add support for reset (Angelo) - hw/i386/pc: Housekeeping (Phil) - hw/core/smp: Remove/deprecate parameter=0,1 adapting test-smp-parse (Zhao) # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmXstpMACgkQ4+MsLN6t # wN6XBw//dNItFhf1YX+Au4cNoQVDgHE9RtzEIGOnwcL1CgrA9rAQQfLRE5KWM6sN # 1qiPh+T6SPxtiQ2rw4AIpsI7TXjO72b/RDWpUUSwnfH39eC77pijkxIK+i9mYI9r # p0sPjuP6OfolUFYeSbYX+DmNZh1ONPf27JATJQEf0st8dyswn7lTQvJEaQ97kwxv # UKA0JD5l9LZV8Zr92cgCzlrfLcbVblJGux9GYIL09yN78yqBuvTm77GBC/rvC+5Q # fQC5PQswJZ0+v32AXIfSysMp2R6veo4By7VH9Lp51E/u9jpc4ZbcDzxzaJWE6zOR # fZ01nFzou1qtUfZi+MxNiDR96LP6YoT9xFdGYfNS6AowZn8kymCs3eo7M9uvb+rN # A2Sgis9rXcjsR4e+w1YPBXwpalJnLwB0QYhEOStR8wo1ceg7GBG6zHUJV89OGzsA # KS8X0aV1Ulkdm/2H6goEhzrcC6FWLg8pBJpfKK8JFWxXNrj661xM0AAFVL9we356 # +ymthS2x/RTABSI+1Lfsoo6/SyXoimFXJJWA82q9Yzoaoq2gGMWnfwqxfix6JrrA # PuMnNP5WNvh04iWcNz380P0psLVteHWcVfTRN3JvcJ9iJ2bpjcU1mQMJtvSF9wBn # Y8kiJTUmZCu3br2e5EfxmypM/h8y29VD/1mxPk8Dtcq3gjx9AU4= # =juZH # -----END PGP SIGNATURE----- # gpg: Signature made Sat 09 Mar 2024 19:20:51 GMT # gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE # gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full] # Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE * tag 'hw-misc-20240309' of https://github.com/philmd/qemu: (43 commits) hw/m68k/mcf5208: add support for reset tests/unit/test-smp-parse: Test "parameter=0" SMP configurations tests/unit/test-smp-parse: Test smp_props.has_clusters tests/unit/test-smp-parse: Test the full 7-levels topology hierarchy tests/unit/test-smp-parse: Test "drawers" and "books" combination case tests/unit/test-smp-parse: Test "drawers" parameter in -smp tests/unit/test-smp-parse: Test "books" parameter in -smp tests/unit/test-smp-parse: Make test cases aware of the book/drawer tests/unit/test-smp-parse: Bump max_cpus to 4096 tests/unit/test-smp-parse: Use CPU number macros in invalid topology case tests/unit/test-smp-parse: Drop the unsupported "dies=1" case hw/core/machine-smp: Calculate total CPUs once in machine_parse_smp_config() hw/core/machine-smp: Deprecate unsupported "parameter=1" SMP configurations hw/core/machine-smp: Remove deprecated "parameter=0" SMP configurations docs/interop/firmware.json: Fix doc for FirmwareFlashMode docs/interop/firmware.json: Align examples hw/intc/grlib_irqmp: abort realize when ncpus value is out of range mac_newworld: change timebase frequency from 100MHz to 25MHz for mac99 machine hmp: Add option to info qtree to omit details qdev: Add a granule_mode property ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-03-09Merge tag 'pull-trivial-patches' of https://gitlab.com/mjt0k/qemu into stagingPeter Maydell
trivial patches for 2024-03-09 # -----BEGIN PGP SIGNATURE----- # # iQFDBAABCAAtFiEEe3O61ovnosKJMUsicBtPaxppPlkFAmXshtIPHG1qdEB0bHMu # bXNrLnJ1AAoJEHAbT2saaT5ZMFUIAKTd1rYzRs6x4wXitaWbYIIs2d6UB/HLbzz2 # BVHZwoYqsW3TuNFJp4njHhexZ76nFlT8xMuOKB5tAm4KOmqOdxS/mfThuSGsWGP7 # CAk35ENOMQbii/jp6tqawop+H0rVMSJjBrkU4vLRAtQ7g1ISnX6tJi3wiyS+FtHq # 9eIfgJgM77tvq6RLPZTUrUBevMWQfjMcvXmMnYqL4Z1dnibIb5/R3RKAnEc4CUoS # hMw94wBcq+ZOQNPnY7d+WioKq7JcSWX7UW5NuHo+C+G83nq1/5vE8Oe2kNwzFyDL # 9sIqL8bz6v8iiqcVMIBykSAZhYH9QEuVRJso18UE5w0B8k4CQcM= # =dIAF # -----END PGP SIGNATURE----- # gpg: Signature made Sat 09 Mar 2024 15:57:06 GMT # gpg: using RSA key 7B73BAD68BE7A2C289314B22701B4F6B1A693E59 # gpg: issuer "mjt@tls.msk.ru" # gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>" [full] # gpg: aka "Michael Tokarev <mjt@corpit.ru>" [full] # gpg: aka "Michael Tokarev <mjt@debian.org>" [full] # Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D 4324 457C E0A0 8044 65C5 # Subkey fingerprint: 7B73 BAD6 8BE7 A2C2 8931 4B22 701B 4F6B 1A69 3E59 * tag 'pull-trivial-patches' of https://gitlab.com/mjt0k/qemu: docs/acpi/bits: add some clarity and details while also improving formating hw/mem/cxl_type3: Fix problem with g_steal_pointer() hw/pci-bridge/cxl_upstream: Fix problem with g_steal_pointer() hw/cxl/cxl-cdat: Fix type of buf in ct3_load_cdat() qerror: QERR_DEVICE_IN_USE is no longer used, drop blockdev: Fix block_resize error reporting for op blockers char: Slightly better error reporting when chardev is in use make-release: switch to .xz format by default hw/scsi/lsi53c895a: Fix typo in comment hw/vfio/pci.c: Make some structure static replay: Improve error messages about configuration conflicts Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-03-09hw/m68k/mcf5208: add support for resetAngelo Dureghello
Add reset support for mcf5208. Signed-off-by: Angelo Dureghello <angelo@kernel-space.org> Reviewed-by: Thomas Huth <huth@tuxfamily.org> Message-ID: <20240309093459.984565-1-angelo@kernel-space.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-03-09tests/unit/test-smp-parse: Test "parameter=0" SMP configurationsZhao Liu
The support for "parameter=0" SMP configurations is removed, and QEMU returns error for those cases. So add the related test cases to ensure parameters can't accept 0. Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-ID: <20240308160148.3130837-14-zhao1.liu@linux.intel.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-03-09tests/unit/test-smp-parse: Test smp_props.has_clustersZhao Liu
The smp_props.has_clusters in MachineClass is not a user configured field, and it indicates if user specifies "clusters" in -smp. After -smp parsing, other module could aware if the cluster level is configured by user. This is used when the machine has only 1 cluster since there's only 1 cluster by default. Add the check to cover this field. Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Xiaoling Song <xiaoling.song@intel.com> Acked-by: Thomas Huth <thuth@redhat.com> Message-ID: <20240308160148.3130837-13-zhao1.liu@linux.intel.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-03-09tests/unit/test-smp-parse: Test the full 7-levels topology hierarchyZhao Liu
Currently, -smp supports up to 7-levels topology hierarchy: -drawers/books/sockets/dies/clusters/cores/threads. Though no machine supports all these 7 levels yet, these 7 levels have the strict containment relationship and together form the generic CPU topology representation of QEMU. Also, note that the maxcpus is calculated by multiplying all 7 levels: maxcpus = drawers * books * sockets * dies * clusters * cores * threads. To cover this code path, it is necessary to test the full topology case (with all 7 levels). This also helps to avoid introducing new issues by further expanding the CPU topology in the future. Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Xiaoling Song <xiaoling.song@intel.com> Acked-by: Thomas Huth <thuth@redhat.com> Message-ID: <20240308160148.3130837-12-zhao1.liu@linux.intel.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-03-09tests/unit/test-smp-parse: Test "drawers" and "books" combination caseZhao Liu
Since s390 machine supports both "drawers" and "books" in -smp, add the "drawers" and "books" combination test case to match the actual topology usage scenario. Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Xiaoling Song <xiaoling.song@intel.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-ID: <20240308160148.3130837-11-zhao1.liu@linux.intel.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-03-09tests/unit/test-smp-parse: Test "drawers" parameter in -smpZhao Liu
Although drawer was introduced to -smp along with book by s390 machine, as a general topology level in QEMU that may be reused by other arches in the future, it is desirable to cover this parameter's parsing in a separate case. Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Xiaoling Song <xiaoling.song@intel.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-ID: <20240308160148.3130837-10-zhao1.liu@linux.intel.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-03-09tests/unit/test-smp-parse: Test "books" parameter in -smpZhao Liu
Although book was introduced to -smp along with drawer by s390 machine, as a general topology level in QEMU that may be reused by other arches in the future, it is desirable to cover this parameter's parsing in a separate case. Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Xiaoling Song <xiaoling.song@intel.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-ID: <20240308160148.3130837-9-zhao1.liu@linux.intel.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-03-09tests/unit/test-smp-parse: Make test cases aware of the book/drawerZhao Liu
Currently, -smp supports 2 more new levels: book and drawer. It is necessary to consider the effects of book and drawer in the test cases to ensure that the calculations are correct. This is also the preparation to add new book and drawer test cases. Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Xiaoling Song <xiaoling.song@intel.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-ID: <20240308160148.3130837-8-zhao1.liu@linux.intel.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-03-09tests/unit/test-smp-parse: Bump max_cpus to 4096Zhao Liu
The q35 machine is trying to support up to 4096 vCPUs [1], so it's necessary to bump max_cpus in test-smp-parse to 4096 to cover the topological needs of future machines. [1]: https://lore.kernel.org/qemu-devel/20240228143351.3967-1-anisinha@redhat.com/ Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Xiaoling Song <xiaoling.song@intel.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-ID: <20240308160148.3130837-7-zhao1.liu@linux.intel.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-03-09tests/unit/test-smp-parse: Use CPU number macros in invalid topology caseZhao Liu
Use MAX_CPUS/MIN_CPUS macros in invalid topology case. This gives us the flexibility to change the maximum and minimum CPU limits. Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Xiaoling Song <xiaoling.song@intel.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-ID: <20240308160148.3130837-6-zhao1.liu@linux.intel.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-03-09tests/unit/test-smp-parse: Drop the unsupported "dies=1" caseZhao Liu
Unsupported "parameter=1" SMP configurations is marked as deprecated, so drop the related test case. Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-ID: <20240308160148.3130837-5-zhao1.liu@linux.intel.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-03-09hw/core/machine-smp: Calculate total CPUs once in machine_parse_smp_config()Zhao Liu
In machine_parse_smp_config(), the number of total CPUs is calculated by: drawers * books * sockets * dies * clusters * cores * threads To avoid missing the future new topology level, use a local variable to cache the calculation result so that total CPUs are only calculated once. Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20240308160148.3130837-4-zhao1.liu@linux.intel.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-03-09hw/core/machine-smp: Deprecate unsupported "parameter=1" SMP configurationsZhao Liu
Currently, it was allowed for users to specify the unsupported topology parameter as "1". For example, x86 PC machine doesn't support drawer/book/cluster topology levels, but user could specify "-smp drawers=1,books=1,clusters=1". This is meaningless and confusing, so that the support for this kind of configurations is marked deprecated since 9.0. And report warning message for such case like: qemu-system-x86_64: warning: Deprecated CPU topology (considered invalid): Unsupported clusters parameter mustn't be specified as 1 qemu-system-x86_64: warning: Deprecated CPU topology (considered invalid): Unsupported books parameter mustn't be specified as 1 qemu-system-x86_64: warning: Deprecated CPU topology (considered invalid): Unsupported drawers parameter mustn't be specified as 1 Users have to ensure that all the topology members described with -smp are supported by the target machine. Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-ID: <20240308160148.3130837-3-zhao1.liu@linux.intel.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-03-09hw/core/machine-smp: Remove deprecated "parameter=0" SMP configurationsZhao Liu
The "parameter=0" SMP configurations have been marked as deprecated since v6.2. For these cases, -smp currently returns the warning and adjusts the zeroed parameters to 1 by default. Remove the above compatibility logic in v9.0, and return error directly if any -smp parameter is set as 0. Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Prasad Pandit <pjp@fedoraproject.org> Message-ID: <20240308160148.3130837-2-zhao1.liu@linux.intel.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-03-09docs/interop/firmware.json: Fix doc for FirmwareFlashModeThomas Weißschuh
The doc title did not match the actual definition. Fixes: 2720ceda05 ("docs: expand firmware descriptor to allow flash without NVRAM") Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20240307-qapi-firmware-json-v2-2-3b29eabb9b9a@linutronix.de> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-03-09docs/interop/firmware.json: Align examplesThomas Weißschuh
Adjust indentation for commit d23055b8db8 (qapi: Require descriptions and tagged sections to be indented). Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20240307-qapi-firmware-json-v2-1-3b29eabb9b9a@linutronix.de> [PMD: Reword description using Markus suggestion] Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-03-09hw/intc/grlib_irqmp: abort realize when ncpus value is out of rangeClément Chigot
Even if the error is set, the build is not aborted when the ncpus value is wrong, the return is missing. Signed-off-by: Clément Chigot <chigot@adacore.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Fixes: 6bf1478543 ("hw/intc/grlib_irqmp: add ncpus property") Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20240308152719.591232-1-chigot@adacore.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-03-09mac_newworld: change timebase frequency from 100MHz to 25MHz for mac99 machineMark Cave-Ayland
MacOS X uses multiple techniques for calibrating timers depending upon the detected hardware. One of these calibration routines compares the change in the timebase against the KeyLargo timer and uses this to recalculate the clock frequency, timebase frequency and bus frequency if the calibration exceeds certain limits. This recalibration occurs despite the correct values being passed via the device tree, and is likely due to buggy firmware on some hardware. The timebase frequency of 100MHz was set way back in 2005 by commit fa296b0fb4 ("PIC fix - changed back TB frequency to 100 MHz") and with this value on a mac99,via=pmu machine the OSX 10.2 timer calibration incorrectly calculates the bus frequency as 400MHz instead of 100MHz. The most noticeable side-effect is the UI appears sluggish and not very responsive for normal use. Change the timebase frequency from 100MHz to 25MHz which matches that of a real G4 AGP machine (the closest match to QEMU's mac99 machine) and allows OSX 10.2 to correctly detect all of the clock frequency, timebase frequency and bus frequency. Tested on various MacOS images from OS 9.2 through to OSX 10.4, along with Linux and NetBSD and I was unable to find any regressions from this change. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20240304073548.2098806-1-mark.cave-ayland@ilande.co.uk> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-03-09hmp: Add option to info qtree to omit detailsBALATON Zoltan
The output of info qtree monitor command is very long. Add an option to print a brief overview omitting all the details. Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu> Reviewed-by: Dr. David Alan Gilbert <dave@treblig.org> Message-ID: <20240307183812.0105D4E6004@zero.eik.bme.hu> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-03-09qdev: Add a granule_mode propertyEric Auger
Introduce a new enum type property allowing to set an IOMMU granule. Values are 4k, 8k, 16k, 64k and host. This latter indicates the vIOMMU granule will match the host page size. A subsequent patch will add such a property to the virtio-iommu device. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20240227165730.14099-2-eric.auger@redhat.com>
2024-03-09hw/intc/apic: fix memory leakPaolo Bonzini
deliver_bitmask is allocated on the heap in apic_deliver(), but there are many paths in the function that return before the corresponding g_free() is reached. Fix this by switching to g_autofree and, while at it, also switch to g_new. Do the same in apic_deliver_irq() as well for consistency. Fixes: b5ee0468e9d ("apic: add support for x2APIC mode", 2024-02-14) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Bui Quang Minh <minhquangbui99@gmail.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-ID: <20240304224133.267640-1-pbonzini@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-03-09hw/i386/pc: Have pc_init_isa() pass a NULL pci_type argumentPhilippe Mathieu-Daudé
The "isapc" machine only provides an ISA bus, not a PCI one, and doesn't instanciate any i440FX south bridge. Its machine class defines PCMachineClass::pci_enabled = false, and pc_init1() only uses the pci_type argument when pci_enabled is true. Since for this machine the argument is not used, passing NULL makes more sense. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Bernhard Beschow <shentey@gmail.com> Message-Id: <20240301185936.95175-5-philmd@linaro.org>