aboutsummaryrefslogtreecommitdiff
path: root/hw/virtio
AgeCommit message (Collapse)Author
2024-07-22virtio-iommu: Add trace point on virtio_iommu_detach_endpoint_from_domainEric Auger
Add a trace point on virtio_iommu_detach_endpoint_from_domain(). Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20240716094619.1713905-7-eric.auger@redhat.com> Tested-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22virtio-iommu: Remove the end point on detachEric Auger
We currently miss the removal of the endpoint in case of detach. Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20240716094619.1713905-5-eric.auger@redhat.com> Tested-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22virtio-iommu: Free [host_]resv_ranges on unset_iommu_devicesEric Auger
We are currently missing the deallocation of the [host_]resv_regions in case of hot unplug. Also to make things more simple let's rule out the case where multiple HostIOMMUDevices would be aliased and attached to the same IOMMUDevice. This allows to remove the handling of conflicting Host reserved regions. Anyway this is not properly supported at guest kernel level. On hotunplug the reserved regions are reset to the ones set by virtio-iommu property. Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20240716094619.1713905-4-eric.auger@redhat.com> Tested-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22virtio-iommu: Remove probe_doneEric Auger
Now we have switched to PCIIOMMUOps to convey host IOMMU information, the host reserved regions are transmitted when the PCIe topology is built. This happens way before the virtio-iommu driver calls the probe request. So let's remove the probe_done flag that allowed to check the probe was not done before the IOMMU MR got enabled. Besides this probe_done flag had a flaw wrt migration since it was not saved/restored. The only case at risk is if 2 devices were plugged to a PCIe to PCI bridge and thus aliased. First of all we discovered in the past this case was not properly supported for neither SMMU nor virtio-iommu on guest kernel side: see [RFC] virtio-iommu: Take into account possible aliasing in virtio_iommu_mr() https://lore.kernel.org/all/20230116124709.793084-1-eric.auger@redhat.com/ If this were supported by the guest kernel, it is unclear what the call sequence would be from a virtio-iommu driver point of view. Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20240716094619.1713905-3-eric.auger@redhat.com> Tested-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22Revert "virtio-iommu: Clear IOMMUDevice when VFIO device is unplugged"Eric Auger
This reverts commit 1b889d6e39c32d709f1114699a014b381bcf1cb1. There are different problems with that tentative fix: - Some resources are left dangling (resv_regions, host_resv_ranges) and memory subregions are left attached to the root MR although freed as embedded in the sdev IOMMUDevice. Finally the sdev->as is not destroyed and associated listeners are left. - Even when fixing the above we observe a memory corruption associated with the deallocation of the IOMMUDevice. This can be observed when a VFIO device is hotplugged, hot-unplugged and a system reset is issued. At this stage we have not been able to identify the root cause (IOMMU MR or as structs beeing overwritten and used later on?). - Another issue is HostIOMMUDevice are indexed by non aliased BDF whereas the IOMMUDevice is indexed by aliased BDF - yes the current naming is really misleading -. Given the state of the code I don't think the virtio-iommu device works in non singleton group case though. So let's revert the patch for now. This means the IOMMU MR/as survive the hotunplug. This is what is done in the intel_iommu for instance. It does not sound very logical to keep those but currently there is no symetric function to pci_device_iommu_address_space(). probe_done issue will be handled in a subsequent patch. Also resv_regions and host_resv_regions will be deallocated separately. Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20240716094619.1713905-2-eric.auger@redhat.com> Tested-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22virtio-net: Implement SR-IOV VFAkihiko Odaki
A virtio-net device can be added as a SR-IOV VF to another virtio-pci device that will be the PF. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20240715-sriov-v5-7-3f5539093ffc@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22virtio-pci: Implement SR-IOV PFAkihiko Odaki
Allow user to attach SR-IOV VF to a virtio-pci PF. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20240715-sriov-v5-6-3f5539093ffc@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-21vhost,vhost-user: Add VIRTIO_F_IN_ORDER to vhost feature bitsJonah Palmer
Add support for the VIRTIO_F_IN_ORDER feature across a variety of vhost devices. The inclusion of VIRTIO_F_IN_ORDER in the feature bits arrays for these devices ensures that the backend is capable of offering and providing support for this feature, and that it can be disabled if the backend does not support it. Acked-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <20240710125522.4168043-6-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-21virtio: virtqueue_ordered_flush - VIRTIO_F_IN_ORDER supportJonah Palmer
Add VIRTIO_F_IN_ORDER feature support for the virtqueue_flush operation. The goal of the virtqueue_ordered_flush operation when the VIRTIO_F_IN_ORDER feature has been negotiated is to write elements to the used/descriptor ring in-order and then update used_idx. The function iterates through the VirtQueueElement used_elems array in-order starting at vq->used_idx. If the element is valid (filled), the element is written to the used/descriptor ring. This process continues until we find an invalid (not filled) element. For packed VQs, the first entry (at vq->used_idx) is written to the descriptor ring last so the guest doesn't see any invalid descriptors. If any elements were written, the used_idx is updated. Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <20240710125522.4168043-5-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com>
2024-07-21virtio: virtqueue_ordered_fill - VIRTIO_F_IN_ORDER supportJonah Palmer
Add VIRTIO_F_IN_ORDER feature support for the virtqueue_fill operation. The goal of the virtqueue_ordered_fill operation when the VIRTIO_F_IN_ORDER feature has been negotiated is to search for this now-used element, set its length, and mark the element as filled in the VirtQueue's used_elems array. By marking the element as filled, it will indicate that this element has been processed and is ready to be flushed, so long as the element is in-order. Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <20240710125522.4168043-4-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-21virtio: virtqueue_pop - VIRTIO_F_IN_ORDER supportJonah Palmer
Add VIRTIO_F_IN_ORDER feature support in virtqueue_split_pop and virtqueue_packed_pop. VirtQueueElements popped from the available/descritpor ring are added to the VirtQueue's used_elems array in-order and in the same fashion as they would be added the used and descriptor rings, respectively. This will allow us to keep track of the current order, what elements have been written, as well as an element's essential data after being processed. Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <20240710125522.4168043-3-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-21hw/virtio/virtio-crypto: Fix op_code assignment in ↵Zheyu Ma
virtio_crypto_create_asym_session Currently, if the function fails during the key_len check, the op_code does not have a proper value, causing virtio_crypto_free_create_session_req not to free the memory correctly, leading to a memory leak. By setting the op_code before performing any checks, we ensure that virtio_crypto_free_create_session_req has the correct context to perform cleanup operations properly, thus preventing memory leaks. ASAN log: ==3055068==ERROR: LeakSanitizer: detected memory leaks Direct leak of 512 byte(s) in 1 object(s) allocated from: #0 0x5586a75e6ddd in malloc llvm/compiler-rt/lib/asan/asan_malloc_linux.cpp:129:3 #1 0x7fb6b63b6738 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x5e738) #2 0x5586a864bbde in virtio_crypto_handle_ctrl hw/virtio/virtio-crypto.c:407:19 #3 0x5586a94fc84c in virtio_queue_notify_vq hw/virtio/virtio.c:2277:9 #4 0x5586a94fc0a2 in virtio_queue_host_notifier_read hw/virtio/virtio.c:3641:9 Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Message-Id: <20240702211835.3064505-1-zheyuma97@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-10virtio-mem: improve error message when unplug of device fails due to plugged ↵David Hildenbrand
memory The error message is actually expressive, considering QEMU only. But when called from Libvirt, talking about "size" can be confusing, because in Libvirt "size" translates to the memory backend size in QEMU (maximum size) and "current" translates to the QEMU "size" property. Let's simply avoid talking about the "size" property and spell out that some device memory is still plugged. Message-ID: <20240416141426.588544-1-david@redhat.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Cc: Liang Cong <lcong@redhat.com> Cc: Mario Casquero <mcasquer@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2024-07-09virtio-iommu: Revert transient enablement of IOMMU MR in bypass modeEric Auger
In 94df5b2180d6 ("virtio-iommu: Fix 64kB host page size VFIO device assignment"), in case of bypass mode, we transiently enabled the IOMMU MR to allow the set_page_size_mask() to be called and pass information about the page size mask constraint of cold plugged VFIO devices. Now we do not use the IOMMU MR callback anymore, we can just get rid of this hack. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-09memory: remove IOMMU MR iommu_set_page_size_mask() callbackEric Auger
Everything is now in place to use the Host IOMMU Device callbacks to retrieve the page size mask usable with a given assigned device. This new method brings the advantage to pass the info much earlier to the virtual IOMMU and before the IOMMU MR gets enabled. So let's remove the call to memory_region_iommu_set_page_size_mask in vfio common.c and remove the single implementation of the IOMMU MR callback in the virtio-iommu.c Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-09virtio-iommu : Retrieve page size mask on virtio_iommu_set_iommu_device()Eric Auger
Retrieve the Host IOMMU Device page size mask when this latter is set. This allows to get the information much sooner than when relying on IOMMU MR set_page_size_mask() call, whcih happens when the IOMMU MR gets enabled. We introduce check_page_size_mask() helper whose code is inherited from current virtio_iommu_set_page_size_mask() implementation. This callback will be removed in a subsequent patch. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-09HostIOMMUDevice : remove Error handle from get_iova_ranges callbackEric Auger
The error handle argument is not used anywhere. let's remove it. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-09virtio-iommu: Fix error handling in virtio_iommu_set_host_iova_ranges()Eric Auger
In case no IOMMUPciBus/IOMMUDevice are found we need to properly set the error handle and return. Fixes : Coverity CID 1549006 Signed-off-by: Eric Auger <eric.auger@redhat.com> Fixes: cf2647a76e ("virtio-iommu: Compute host reserved regions") Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-03virtio-iommu: Clear IOMMUDevice when VFIO device is unpluggedCédric Le Goater
When a VFIO device is hoplugged in a VM using virtio-iommu, IOMMUPciBus and IOMMUDevice cache entries are created in the .get_address_space() handler of the machine IOMMU device. However, these entries are never destroyed, not even when the VFIO device is detached from the machine. This can lead to an assert if the device is reattached again. When reattached, the .get_address_space() handler reuses an IOMMUDevice entry allocated when the VFIO device was first attached. virtio_iommu_set_host_iova_ranges() is called later on from the .set_iommu_device() handler an fails with an assert on 'probe_done' because the device appears to have been already probed when this is not the case. The IOMMUDevice entry is allocated in pci_device_iommu_address_space() called from under vfio_realize(), the VFIO PCI realize handler. Since pci_device_unset_iommu_device() is called from vfio_exitfn(), a sub function of the PCIDevice unrealize() handler, it seems that the .unset_iommu_device() handler is the best place to release resources allocated at realize time. Clear the IOMMUDevice cache entry there to fix hotplug. Fixes: 817ef10da23c ("virtio-iommu: Implement set|unset]_iommu_device() callbacks") Signed-off-by: Cédric Le Goater <clg@redhat.com> Message-Id: <20240701101453.203985-1-clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-03virtio: remove virtio_tswap16s() call in vring_packed_event_read()Stefano Garzarella
Commit d152cdd6f6 ("virtio: use virtio accessor to access packed event") switched using of address_space_read_cached() to virito_lduw_phys_cached() to access packed descriptor event. When we used address_space_read_cached(), we needed to call virtio_tswap16s() to handle the endianess of the field, but virito_lduw_phys_cached() already handles it internally, so we no longer need to call virtio_tswap16s() (as the commit had done for `off_wrap`, but forgot for `flags`). Fixes: d152cdd6f6 ("virtio: use virtio accessor to access packed event") Cc: jasowang@redhat.com Cc: qemu-stable@nongnu.org Reported-by: Xoykie <xoykie@gmail.com> Link: https://lore.kernel.org/qemu-devel/CAFU8RB_pjr77zMLsM0Unf9xPNxfr_--Tjr49F_eX32ZBc5o2zQ@mail.gmail.com Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20240701075208.19634-1-sgarzare@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01vhost-user: Skip unnecessary duplicated VHOST_USER_SET_LOG_BASE requestsBillXiang
The VHOST_USER_SET_LOG_BASE requests should be categorized into non-vring specific messages, and should be sent only once. If send more than once, dpdk will munmap old log_addr which may has been used and cause segmentation fault. Signed-off-by: BillXiang <xiangwencheng@dayudpu.com> Message-Id: <20240613065150.3100-1-xiangwencheng@dayudpu.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01virtio-iommu: add error check before assertManos Pitsidianakis
A fuzzer case discovered by Zheyu Ma causes an assert failure. Add a check before the assert, and respond with an error before moving on to the next queue element. To reproduce the failure: cat << EOF | \ qemu-system-x86_64 \ -display none -machine accel=qtest -m 512M -machine q35 -nodefaults \ -device virtio-iommu -qtest stdio outl 0xcf8 0x80000804 outw 0xcfc 0x06 outl 0xcf8 0x80000820 outl 0xcfc 0xe0004000 write 0x10000e 0x1 0x01 write 0xe0004020 0x4 0x00001000 write 0xe0004028 0x4 0x00101000 write 0xe000401c 0x1 0x01 write 0x106000 0x1 0x05 write 0x100001 0x1 0x60 write 0x100002 0x1 0x10 write 0x100009 0x1 0x04 write 0x10000c 0x1 0x01 write 0x100018 0x1 0x04 write 0x10001c 0x1 0x02 write 0x101003 0x1 0x01 write 0xe0007001 0x1 0x00 EOF Reported-by: Zheyu Ma <zheyuma97@gmail.com> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2359 Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> Message-Id: <20240613-fuzz-2359-fix-v2-manos.pitsidianakis@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01hw/virtio: Free vqs after vhost_dev_cleanup()Akihiko Odaki
This fixes LeakSanitizer warnings. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20240627-san-v2-7-750bb0946dbd@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01virtio-pci: implement No_Soft_Reset bitJiqian Chen
In current code, when guest does S3, virtio-gpu are reset due to the bit No_Soft_Reset is not set. After resetting, the display resources of virtio-gpu are destroyed, then the display can't come back and only show blank after resuming. Implement No_Soft_Reset bit of PCI_PM_CTRL register, then guest can check this bit, if this bit is set, the devices resetting will not be done, and then the display can work after resuming. No_Soft_Reset bit is implemented for all virtio devices, and was tested only on virtio-gpu device. Set it false by default for safety. Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com> Message-Id: <20240606102205.114671-3-Jiqian.Chen@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01virtio-pci: Fix the failure process in kvm_virtio_pci_vector_use_one()Cindy Lu
In function kvm_virtio_pci_vector_use_one(), the function will only use the irqfd/vector for itself. Therefore, in the undo label, the failing process is incorrect. To fix this, we can just remove this label. Fixes: f9a09ca3ea ("vhost: add support for configure interrupt") Cc: qemu-stable@nongnu.org Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20240528084840.194538-1-lulu@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01vhost-user: fix lost reconnect againLi Feng
When the vhost-user is reconnecting to the backend, and if the vhost-user fails at the get_features in vhost_dev_init(), then the reconnect will fail and it will not be retriggered forever. The reason is: When the vhost-user fail at get_features, the vhost_dev_cleanup will be called immediately. vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'. The reconnect path is: vhost_user_blk_event vhost_user_async_close(.. vhost_user_blk_disconnect ..) qemu_chr_fe_set_handlers <----- clear the notifier callback schedule vhost_user_async_close_bh The vhost->vdev is null, so the vhost_user_blk_disconnect will not be called, then the event fd callback will not be reinstalled. We need to ensure that even if vhost_dev_init initialization fails, the event handler still needs to be reinstalled when s->connected is false. All vhost-user devices have this issue, including vhost-user-blk/scsi. Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling") Signed-off-by: Li Feng <fengli@smartx.com> Message-Id: <20240516025753.130171-3-fengli@smartx.com> Reviewed-by: Raphael Norwitz <raphael@enfabrica.net> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01Revert "vhost-user: fix lost reconnect"Li Feng
This reverts commit f02a4b8e6431598612466f76aac64ab492849abf. Since the current patch cannot completely fix the lost reconnect problem, there is a scenario that is not considered: - When the virtio-blk driver is removed from the guest os, s->connected has no chance to be set to false, resulting in subsequent reconnection not being executed. The next patch will completely fix this issue with a better approach. Signed-off-by: Li Feng <fengli@smartx.com> Message-Id: <20240516025753.130171-2-fengli@smartx.com> Reviewed-by: Raphael Norwitz <raphael@enfabrica.net> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01virtio-pci: only reset pm state during resettingJiqian Chen
Fix bug imported by 27ce0f3afc9dd ("fix Power Management Control Register for PCI Express virtio devices" After this change, observe that QEMU may erroneously clear the power status of the device, or may erroneously clear non writable registers, such as NO_SOFT_RESET, etc. Only state of PM_CTRL is writable. Only when flag VIRTIO_PCI_FLAG_INIT_PM is set, need to reset state. Fixes: 27ce0f3afc9dd ("fix Power Management Control Register for PCI Express virtio devices" Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com> Message-Id: <20240515073526.17297-2-Jiqian.Chen@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01hw/virtio: Fix obtain the buffer id from the last descriptorWafer
The virtio-1.3 specification <https://docs.oasis-open.org/virtio/virtio/v1.3/virtio-v1.3.html> writes: 2.8.6 Next Flag: Descriptor Chaining Buffer ID is included in the last descriptor in the list. If the feature (_F_INDIRECT_DESC) has been negotiated, install only one descriptor in the virtqueue. Therefor the buffer id should be obtained from the first descriptor. In descriptor chaining scenarios, the buffer id should be obtained from the last descriptor. Fixes: 86044b24e8 ("virtio: basic packed virtqueue support") Signed-off-by: Wafer <wafer@jaguarmicro.com> Reviewed-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20240510072753.26158-2-wafer@jaguarmicro.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01vhost-vsock: add VIRTIO_F_RING_PACKED to feature_bitsHalil Pasic
Not having VIRTIO_F_RING_PACKED in feature_bits[] is a problem when the vhost-vsock device does not offer the feature bit VIRTIO_F_RING_PACKED but the in QEMU device is configured to try to use the packed layout (the virtio property "packed" is on). As of today, the Linux kernel vhost-vsock device does not support the packed queue layout (as vhost does not support packed), and does not offer VIRTIO_F_RING_PACKED. Thus when for example a vhost-vsock-ccw is used with packed=on, VIRTIO_F_RING_PACKED ends up being negotiated, despite the fact that the device does not actually support it, and one gets to keep the pieces. Fixes: 74b3e46630 ("virtio: add property to enable packed virtqueue") Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com> Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Message-Id: <20240429113334.2454197-1-pasic@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01vhost/vhost-user: Add VIRTIO_F_NOTIFICATION_DATA to vhost feature bitsJonah Palmer
Add support for the VIRTIO_F_NOTIFICATION_DATA feature across a variety of vhost devices. The inclusion of VIRTIO_F_NOTIFICATION_DATA in the feature bits arrays for these devices ensures that the backend is capable of offering and providing support for this feature, and that it can be disabled if the backend does not support it. Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <20240315165557.26942-6-jonah.palmer@oracle.com> Acked-by: Srujana Challa <schalla@marvell.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01virtio-mmio: Handle extra notification dataJonah Palmer
Add support to virtio-mmio devices for handling the extra data sent from the driver to the device when the VIRTIO_F_NOTIFICATION_DATA transport feature has been negotiated. The extra data that's passed to the virtio-mmio device when this feature is enabled varies depending on the device's virtqueue layout. The data passed to the virtio-mmio device is in the same format as the data passed to virtio-pci devices. Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <20240315165557.26942-4-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01virtio: Prevent creation of device using notification-data with ioeventfdJonah Palmer
Prevent the realization of a virtio device that attempts to use the VIRTIO_F_NOTIFICATION_DATA transport feature without disabling ioeventfd. Due to ioeventfd not being able to carry the extra data associated with this feature, having both enabled is a functional mismatch and therefore Qemu should not continue the device's realization process. Although the device does not yet know if the feature will be successfully negotiated, many devices using this feature wont actually work without this extra data and would fail FEATURES_OK anyway. If ioeventfd is able to work with the extra notification data in the future, this compatibility check can be removed. Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <20240315165557.26942-3-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01virtio/virtio-pci: Handle extra notification dataJonah Palmer
Add support to virtio-pci devices for handling the extra data sent from the driver to the device when the VIRTIO_F_NOTIFICATION_DATA transport feature has been negotiated. The extra data that's passed to the virtio-pci device when this feature is enabled varies depending on the device's virtqueue layout. In a split virtqueue layout, this data includes: - upper 16 bits: shadow_avail_idx - lower 16 bits: virtqueue index In a packed virtqueue layout, this data includes: - upper 16 bits: 1-bit wrap counter & 15-bit shadow_avail_idx - lower 16 bits: virtqueue index Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <20240315165557.26942-2-jonah.palmer@oracle.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-01vhost: Perform memory section dirty scans once per iterationSi-Wei Liu
On setups with one or more virtio-net devices with vhost on, dirty tracking iteration increases cost the bigger the number amount of queues are set up e.g. on idle guests migration the following is observed with virtio-net with vhost=on: 48 queues -> 78.11% [.] vhost_dev_sync_region.isra.13 8 queues -> 40.50% [.] vhost_dev_sync_region.isra.13 1 queue -> 6.89% [.] vhost_dev_sync_region.isra.13 2 devices, 1 queue -> 18.60% [.] vhost_dev_sync_region.isra.14 With high memory rates the symptom is lack of convergence as soon as it has a vhost device with a sufficiently high number of queues, the sufficient number of vhost devices. On every migration iteration (every 100msecs) it will redundantly query the *shared log* the number of queues configured with vhost that exist in the guest. For the virtqueue data, this is necessary, but not for the memory sections which are the same. So essentially we end up scanning the dirty log too often. To fix that, select a vhost device responsible for scanning the log with regards to memory sections dirty tracking. It is selected when we enable the logger (during migration) and cleared when we disable the logger. If the vhost logger device goes away for some reason, the logger will be re-selected from the rest of vhost devices. After making mem-section logger a singleton instance, constant cost of 7%-9% (like the 1 queue report) will be seen, no matter how many queues or how many vhost devices are configured: 48 queues -> 8.71% [.] vhost_dev_sync_region.isra.13 2 devices, 8 queues -> 7.97% [.] vhost_dev_sync_region.isra.14 Co-developed-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Message-Id: <1710448055-11709-2-git-send-email-si-wei.liu@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2024-07-01vhost: dirty log should be per backend typeSi-Wei Liu
There could be a mix of both vhost-user and vhost-kernel clients in the same QEMU process, where separate vhost loggers for the specific vhost type have to be used. Make the vhost logger per backend type, and have them properly reference counted. Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Message-Id: <1710448055-11709-1-git-send-email-si-wei.liu@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-06-24virtio-iommu: Remove the implementation of iommu_set_iova_rangeEric Auger
Now that we use PCIIOMMUOps to convey information about usable IOVA ranges we do not to implement the iommu_set_iova_ranges IOMMU MR callback. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
2024-06-24virtio-iommu: Compute host reserved regionsEric Auger
Compute the host reserved regions in virtio_iommu_set_iommu_device(). The usable IOVA regions are retrieved from the HostIOMMUDevice. The virtio_iommu_set_host_iova_ranges() helper turns usable regions into complementary reserved regions while testing the inclusion into existing ones. virtio_iommu_set_host_iova_ranges() reuse the implementation of virtio_iommu_set_iova_ranges() which will be removed in subsequent patches. rebuild_resv_regions() is just moved. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
2024-06-24virtio-iommu: Implement set|unset]_iommu_device() callbacksEric Auger
Implement PCIIOMMUOPs [set|unset]_iommu_device() callbacks. In set(), the HostIOMMUDevice handle is stored in a hash table indexed by PCI BDF. The object will allow to retrieve information related to the physical IOMMU. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
2024-06-19hw/mem/memory-device: Remove legacy_align from memory_device_pre_plug()Philippe Mathieu-Daudé
'legacy_align' is always NULL, remove it, simplifying memory_device_pre_plug(). Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20240617071118.60464-16-philmd@linaro.org>
2024-06-05util/hexdump: Add unit_len and block_len to qemu_hexdump_lineRichard Henderson
Generalize the current 1 byte unit and 4 byte blocking within the output. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20240412073346.458116-5-richard.henderson@linaro.org>
2024-06-05util/hexdump: Use a GString for qemu_hexdump_lineRichard Henderson
Allocate a new, or append to an existing GString instead of using a fixed sized buffer. Require the caller to determine the length of the line -- do not bound len here. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20240412073346.458116-4-richard.henderson@linaro.org>
2024-06-04util/hexdump: Remove ascii parameter from qemu_hexdump_lineRichard Henderson
Split out asciidump_line as a separate function, local to hexdump.c, for use by qemu_hexdump. Use "%-*s" to generate the alignment between the hex and the ascii, rather than explicit spaces. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20240412073346.458116-3-richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-06-04util/hexdump: Remove b parameter from qemu_hexdump_lineRichard Henderson
Require that the caller output the offset and increment bufptr. Use QEMU_HEXDUMP_LINE_BYTES in vhost_vdpa_dump_config instead of raw integer. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20240412073346.458116-2-richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-05-16memory: Add Error** argument to memory_get_xlat_addr()Cédric Le Goater
Let the callers do the reporting. This will be useful in vfio_iommu_map_dirty_notify(). Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: David Hildenbrand <david@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2024-04-26exec: Declare target_words_bigendian() in 'exec/tswap.h'Philippe Mathieu-Daudé
We usually check target endianess before swapping values, so target_words_bigendian() declaration makes sense in "exec/tswap.h" with the target swapping helpers. Remove "hw/core/cpu.h" when it was only included to get the target_words_bigendian() declaration. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Anton Johansson <anjo@rev.ng> Message-Id: <20231212123401.37493-16-philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2024-04-25hw, target: Add ResetType argument to hold and exit phase methodsPeter Maydell
We pass a ResetType argument to the Resettable class enter phase method, but we don't pass it to hold and exit, even though the callsites have it readily available. This means that if a device cared about the ResetType it would need to record it in the enter phase method to use later on. Pass the type to all three of the phase methods to avoid having to do that. Commit created with for dir in hw target include; do \ spatch --macro-file scripts/cocci-macro-file.h \ --sp-file scripts/coccinelle/reset-type.cocci \ --keep-comments --smpl-spacing --in-place \ --include-headers --dir $dir; done and no manual edits. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Luc Michel <luc.michel@amd.com> Message-id: 20240412160809.1260625-5-peter.maydell@linaro.org
2024-04-23Merge tag 'migration-20240423-pull-request' of ↵Richard Henderson
https://gitlab.com/peterx/qemu into staging Migration pull for 9.1 - Het's new test cases for "channels" - Het's fix for a typo for vsock parsing - Cedric's VFIO error report series - Cedric's one more patch for dirty-bitmap error reports - Zhijian's rdma deprecation patch - Yuan's zeropage optimization to fix double faults on anon mem - Zhijian's COLO fix on a crash # -----BEGIN PGP SIGNATURE----- # # iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZig4HxIccGV0ZXJ4QHJl # ZGhhdC5jb20ACgkQO1/MzfOr1wbQiwD/V5nSJzSuAG4Ra1Fjo+LRG2TT6qk8eNCi # fIytehSw6cYA/0wqarxOF0tr7ikeyhtG3w4xFf44kk6KcPkoVSl1tqoL # =pJmQ # -----END PGP SIGNATURE----- # gpg: Signature made Tue 23 Apr 2024 03:37:19 PM PDT # gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706 # gpg: issuer "peterx@redhat.com" # gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [unknown] # gpg: aka "Peter Xu <peterx@redhat.com>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706 * tag 'migration-20240423-pull-request' of https://gitlab.com/peterx/qemu: (26 commits) migration/colo: Fix bdrv_graph_rdlock_main_loop: Assertion `!qemu_in_coroutine()' failed. migration/multifd: solve zero page causing multiple page faults migration: Add Error** argument to add_bitmaps_to_list() migration: Modify ram_init_bitmaps() to report dirty tracking errors migration: Add Error** argument to xbzrle_init() migration: Add Error** argument to ram_state_init() memory: Add Error** argument to the global_dirty_log routines migration: Introduce ram_bitmaps_destroy() memory: Add Error** argument to .log_global_start() handler migration: Add Error** argument to .load_setup() handler migration: Add Error** argument to .save_setup() handler migration: Add Error** argument to qemu_savevm_state_setup() migration: Add Error** argument to vmstate_save() migration: Always report an error in ram_save_setup() migration: Always report an error in block_save_setup() vfio: Always report an error in vfio_save_setup() s390/stattrib: Add Error** argument to set_migrationmode() handler tests/qtest/migration: Fix typo for vsock in SocketAddress_to_str tests/qtest/migration: Add negative tests to validate migration QAPIs tests/qtest/migration: Add multifd_tcp_plain test using list of channels instead of uri ... Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-04-23memory: Add Error** argument to .log_global_start() handlerCédric Le Goater
Modify all .log_global_start() handlers to take an Error** parameter and return a bool. Adapt memory_global_dirty_log_start() to interrupt on the first error the loop on handlers. In such case, a rollback is performed to stop dirty logging on all listeners where it was previously enabled. Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Anthony Perard <anthony.perard@citrix.com> Cc: Paul Durrant <paul@xen.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240320064911.545001-10-clg@redhat.com [peterx: modify & enrich the comment for listener_add_address_space() ] Signed-off-by: Peter Xu <peterx@redhat.com>
2024-04-18hw/virtio: move stubs out of stubs/Paolo Bonzini
Since the virtio memory device stubs are needed exactly when the Kconfig symbol is not enabled, they can be placed in hw/virtio/ and conditionalized on CONFIG_VIRTIO_MD. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-ID: <20240408155330.522792-12-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>