aboutsummaryrefslogtreecommitdiff
path: root/hw/virtio
AgeCommit message (Collapse)Author
2022-05-12vhost-backend: do not depend on CONFIG_VHOST_VSOCKPaolo Bonzini
The vsock callbacks .vhost_vsock_set_guest_cid and .vhost_vsock_set_running are the only ones to be conditional on #ifdef CONFIG_VHOST_VSOCK. This is different from any other device-dependent callbacks like .vhost_scsi_set_endpoint, and it also broke when CONFIG_VHOST_VSOCK was changed to a per-target symbol. It would be possible to also use the CONFIG_DEVICES include, but really there is no reason for most virtio files to be per-target so just remove the #ifdef to fix the issue. Reported-by: Dov Murik <dovmurik@linux.ibm.com> Fixes: 9972ae314f ("build: move vhost-vsock configuration to Kconfig") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-09virtio-scsi: don't waste CPU polling the event virtqueueStefan Hajnoczi
The virtio-scsi event virtqueue is not emptied by its handler function. This is typical for rx virtqueues where the device uses buffers when some event occurs (e.g. a packet is received, an error condition happens, etc). Polling non-empty virtqueues wastes CPU cycles. We are not waiting for new buffers to become available, we are waiting for an event to occur, so it's a misuse of CPU resources to poll for buffers. Introduce the new virtio_queue_aio_attach_host_notifier_no_poll() API, which is identical to virtio_queue_aio_attach_host_notifier() except that it does not poll the virtqueue. Before this patch the following command-line consumed 100% CPU in the IOThread polling and calling virtio_scsi_handle_event(): $ qemu-system-x86_64 -M accel=kvm -m 1G -cpu host \ --object iothread,id=iothread0 \ --device virtio-scsi-pci,iothread=iothread0 \ --blockdev file,filename=test.img,aio=native,cache.direct=on,node-name=drive0 \ --device scsi-hd,drive=drive0 After this patch CPU is no longer wasted. Reported-by: Nir Soffer <nsoffer@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Nir Soffer <nsoffer@redhat.com> Message-id: 20220427143541.119567-3-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-05-07meson: use have_vhost_* variables to pick sourcesPaolo Bonzini
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-07build: move vhost-user-fs configuration to KconfigPaolo Bonzini
vhost-user-fs is a device and it should be possible to enable/disable it with --without-default-devices, not --without-default-features. Compute its default value in Kconfig to obtain the more intuitive behavior. In this case the configure options were undocumented, too. Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-07build: move vhost-vsock configuration to KconfigPaolo Bonzini
vhost-vsock and vhost-user-vsock are two devices of their own; it should be possible to enable/disable them with --without-default-devices, not --without-default-features. Compute their default value in Kconfig to obtain the more intuitive behavior. Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-07meson, virtio: place all virtio-pci devices under virtio_pci_ssPaolo Bonzini
Since a sourceset already exists for this, avoid unnecessary repeat of CONFIG_VIRTIO_PCI. Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-05vhost-user: Use correct macro name TARGET_PPC64Murilo Opsfelder Araujo
The correct name of the macro is TARGET_PPC64. Fixes: 27598393a232 ("Lift max memory slots limit imposed by vhost-user") Reported-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> Cc: Raphael Norwitz <raphael.norwitz@nutanix.com> Cc: Peter Turschmid <peter.turschm@nutanix.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Message-Id: <20220503180108.34506-1-muriloo@linux.ibm.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-04vhost-user: Don't pass file descriptor for VHOST_USER_REM_MEM_REGKevin Wolf
The spec clarifies now that QEMU should not send a file descriptor in a request to remove a memory region. Change it accordingly. For libvhost-user, this is a bug fix that makes it compatible with rust-vmm's implementation that doesn't send a file descriptor. Keep accepting, but ignoring a file descriptor for compatibility with older QEMU versions. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20220407133657.155281-4-kwolf@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-05-03util: rename qemu_*block() socket functionsMarc-André Lureau
The qemu_*block() functions are meant to be be used with sockets (the win32 implementation expects SOCKET) Over time, those functions where used with Win32 SOCKET or file-descriptors interchangeably. But for portability, they must only be used with socket-like file-descriptors. FDs can use g_unix_set_fd_nonblocking() instead. Rename the functions with "socket" in the name to prevent bad usages. This is effectively reverting commit f9e8cacc5557e43 ("oslib-posix: rename socket_set_nonblock() to qemu_set_nonblock()"). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-05-03hw: replace qemu_set_nonblock()Marc-André Lureau
Those calls are non-socket fd, or are POSIX-specific. Use the dedicated GLib API. (qemu_set_nonblock() is for socket-like) (this is a preliminary patch before renaming qemu_set_nonblock()) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2022-04-26vdpa: Add missing tracing to batch mapping functionsEugenio Pérez
These functions were not traced properly. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220405063628.853745-1-eperezma@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-04-20Don't include sysemu/tcg.h if it is not necessaryThomas Huth
This header only defines the tcg_allowed variable and the tcg_enabled() function - which are not required in many files that include this header. Drop the #include statement there. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20220315144107.1012530-1-thuth@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-04-19Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into stagingRichard Henderson
* Add cpu0-id to query-sev-capabilities * whpx support for breakpoints and stepping * initial support for Hyper-V Synthetic Debugging * use monotonic clock for QemuCond and QemuSemaphore * Remove qemu-common.h include from most units and lots of other clenaups * do not include headers for all virtio devices in virtio-ccw.h # -----BEGIN PGP SIGNATURE----- # # iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmJXCQAUHHBib256aW5p # QHJlZGhhdC5jb20ACgkQv/vSX3jHroNT6wf+NHDJUEdDiwaVGVTGXgHuiaycsymi # FpNPiw/+XxSGN5xF3fkUGgqaDrcwIYwVfnXlghKSz8kp1cP3cjxa5CzNMLGTp5je # N6BxFbD7yC6dhagGm3mj32jlsptv3M38OHqKc3t+RaUAotP5RF2VdCyfUBLG6vU0 # aMzvMfMtB5aG0D8Fr5EV63t1JMTceFU0YxsG73UCFs2Yx4Z0cGBbNxMbHweRhd1q # tPeVDS46MFPM3/2cGGHpeeqxkoCTU7A9j1VuNQI3k+Kg+6W5YVxiK/UP7bw77E/a # yAHsmIVTNro8ajMBch73weuHtGtdfFLvCKc6QX6aVjzK4dF1voQ01E7gPQ== # =rMle # -----END PGP SIGNATURE----- # gpg: Signature made Wed 13 Apr 2022 10:31:44 AM PDT # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [undefined] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [undefined] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (53 commits) target/i386: Remove unused XMMReg, YMMReg types and CPUState fields target/i386: do not access beyond the low 128 bits of SSE registers virtio-ccw: do not include headers for all virtio devices virtio-ccw: move device type declarations to .c files virtio-ccw: move vhost_ccw_scsi to a separate file s390x: follow qdev tree to detect SCSI device on a CCW bus hw: hyperv: Initial commit for Synthetic Debugging device hyperv: Add support to process syndbg commands hyperv: Add definitions for syndbg hyperv: SControl is optional to enable SynIc thread-posix: optimize qemu_sem_timedwait with zero timeout thread-posix: implement Semaphore with QemuCond and QemuMutex thread-posix: use monotonic clock for QemuCond and QemuSemaphore thread-posix: remove the posix semaphore support whpx: Added support for breakpoints and stepping build-sys: simplify AF_VSOCK check build-sys: drop ntddscsi.h check Remove qemu-common.h include from most units qga: remove explicit environ argument from exec/spawn Move fcntl_setfl() to oslib-posix ... Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-04-08virtio-iommu: use-after-free fixWentao Liang
A potential Use-after-free was reported in virtio_iommu_handle_command when using virtio-iommu: > I find a potential Use-after-free in QEMU 6.2.0, which is in > virtio_iommu_handle_command() (./hw/virtio/virtio-iommu.c). > > > Specifically, in the loop body, the variable 'buf' allocated at line 639 can be > freed by g_free() at line 659. However, if the execution path enters the loop > body again and the if branch takes true at line 616, the control will directly > jump to 'out' at line 651. At this time, 'buf' is a freed pointer, which is not > assigned with an allocated memory but used at line 653. As a result, a UAF bug > is triggered. > > > > 599 for (;;) { > ... > 615 sz = iov_to_buf(iov, iov_cnt, 0, &head, sizeof(head)); > 616 if (unlikely(sz != sizeof(head))) { > 617 tail.status = VIRTIO_IOMMU_S_DEVERR; > 618 goto out; > 619 } > ... > 639 buf = g_malloc0(output_size); > ... > 651 out: > 652 sz = iov_from_buf(elem->in_sg, elem->in_num, 0, > 653 buf ? buf : &tail, output_size); > ... > 659 g_free(buf); > > We can fix it by set ‘buf‘ to NULL after freeing it: > > > 651 out: > 652 sz = iov_from_buf(elem->in_sg, elem->in_num, 0, > 653 buf ? buf : &tail, output_size); > ... > 659 g_free(buf); > +++ buf = NULL; > 660 } Fix as suggested by the reporter. Signed-off-by: Wentao Liang <Wentao_Liang_g@163.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-id: 20220407095047.50371-1-mst@redhat.com Message-ID: <20220406040445-mutt-send-email-mst@kernel.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-04-06Remove qemu-common.h include from most unitsMarc-André Lureau
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-33-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-06include: move C/util-related declarations to cutils.hMarc-André Lureau
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-22-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-06Replace qemu_real_host_page variables with inlined functionsMarc-André Lureau
Replace the global variables with inlined helper functions. getpagesize() is very likely annotated with a "const" function attribute (at least with glibc), and thus optimization should apply even better. This avoids the need for a constructor initialization too. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-12-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-06Replace config-time define HOST_WORDS_BIGENDIANMarc-André Lureau
Replace a config-time define with a compile time condition define (compatible with clang and gcc) that must be declared prior to its usage. This avoids having a global configure time define, but also prevents from bad usage, if the config header wasn't included before. This can help to make some code independent from qemu too. gcc supports __BYTE_ORDER__ from about 4.6 and clang from 3.2. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> [ For the s390x parts I'm involved in ] Acked-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220323155743.1585078-7-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-06Replace qemu_gettimeofday() with g_get_real_time()Marc-André Lureau
GLib g_get_real_time() is an alternative to gettimeofday() which allows to simplify our code. For semihosting, a few bits are lost on POSIX host, but this shouldn't be a big concern. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Message-Id: <20220307070401.171986-5-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-31vhost-vdpa: fix typo in a commentStefano Garzarella
Replace vpda with vdpa. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20220328152022.73245-1-sgarzare@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-03-29virtio: fix --enable-vhost-user build on non-LinuxPaolo Bonzini
The vhost-shadow-virtqueue.c build requires include files from linux-headers/, so it cannot be built on non-Linux systems. Fortunately it is only needed by vhost-vdpa, so move it there. Acked-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-22Replace GCC_FMT_ATTR with G_GNUC_PRINTFMarc-André Lureau
One less qemu-specific macro. It also helps to make some headers/units only depend on glib, and thus moved in standalone projects eventually. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
2022-03-21Use g_new() & friends where that makes obvious senseMarkus Armbruster
g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer, for two reasons. One, it catches multiplication overflowing size_t. Two, it returns T * rather than void *, which lets the compiler catch more type errors. This commit only touches allocations with size arguments of the form sizeof(T). Patch created mechanically with: $ spatch --in-place --sp-file scripts/coccinelle/use-g_new-etc.cocci \ --macro-file scripts/cocci-macro-file.h FILES... Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20220315144156.1595462-4-armbru@redhat.com> Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
2022-03-18virtio/virtio-balloon: Prefer Object* over void* parameterBernhard Beschow
*opaque is an alias to *obj. Using the ladder makes the code consistent with with other devices, e.g. accel/kvm/kvm-all and accel/tcg/tcg-all. It also makes the cast more typesafe. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220301222301.103821-2-shentey@gmail.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-03-15Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into stagingPeter Maydell
* whpx fixes in preparation for GDB support (Ivan) * VSS header fixes (Marc-André) * 5-level EPT support (Vitaly) * AMX support (Jing Liu & Yang Zhong) * Bundle changes to MSI routes (Longpeng) * More precise emulation of #SS (Gareth) * Disable ASAN testing # gpg: Signature made Tue 15 Mar 2022 10:51:00 GMT # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (22 commits) gitlab-ci: do not run tests with address sanitizer KVM: SVM: always set MSR_AMD64_TSC_RATIO to default value i386: Add Icelake-Server-v6 CPU model with 5-level EPT support x86: Support XFD and AMX xsave data migration x86: add support for KVM_CAP_XSAVE2 and AMX state migration x86: Add AMX CPUIDs enumeration x86: Add XFD faulting bit for state components x86: Grant AMX permission for guest x86: Add AMX XTILECFG and XTILEDATA components x86: Fix the 64-byte boundary enumeration for extended state linux-headers: include missing changes from 5.17 target/i386: Throw a #SS when loading a non-canonical IST target/i386: only include bits in pg_mode if they are not ignored kvm/msi: do explicit commit when adding msi routes kvm-irqchip: introduce new API to support route change update meson-buildoptions.sh qga/vss: update informative message about MinGW qga/vss-win32: check old VSS SDK headers meson: fix generic location of vss headers vmxcap: Add 5-level EPT bit ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-03-15kvm/msi: do explicit commit when adding msi routesLongpeng(Mike)
We invoke the kvm_irqchip_commit_routes() for each addition to MSI route table, which is not efficient if we are adding lots of routes in some cases. This patch lets callers invoke the kvm_irqchip_commit_routes(), so the callers can decide how to optimize. [1] https://lists.gnu.org/archive/html/qemu-devel/2021-11/msg00967.html Signed-off-by: Longpeng <longpeng2@huawei.com> Message-Id: <20220222141116.2091-3-longpeng2@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-15vdpa: Expose VHOST_F_LOG_ALL on SVQEugenio Pérez
SVQ is able to log the dirty bits by itself, so let's use it to not block migration. Also, ignore set and clear of VHOST_F_LOG_ALL on set_features if SVQ is enabled. Even if the device supports it, the reports would be nonsense because SVQ memory is in the qemu region. The log region is still allocated. Future changes might skip that, but this series is already long enough. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-03-15vdpa: Never set log_base addr if SVQ is enabledEugenio Pérez
Setting the log address would make the device start reporting invalid dirty memory because the SVQ vrings are located in qemu's memory. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-03-15vdpa: Adapt vhost_vdpa_get_vring_base to SVQEugenio Pérez
This is needed to achieve migration, so the destination can restore its index. Setting base as last used idx, so destination will see as available all the entries that the device did not use, including the in-flight processing ones. This is ok for networking, but other kinds of devices might have problems with these retransmissions. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-03-15vdpa: Add custom IOTLB translations to SVQEugenio Pérez
Use translations added in VhostIOVATree in SVQ. Only introduce usage here, not allocation and deallocation. As with previous patches, we use the dead code paths of shadow_vqs_enabled to avoid commiting too many changes at once. These are impossible to take at the moment. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-03-15vhost: Add VhostIOVATreeEugenio Pérez
This tree is able to look for a translated address from an IOVA address. At first glance it is similar to util/iova-tree. However, SVQ working on devices with limited IOVA space need more capabilities, like allocating IOVA chunks or performing reverse translations (qemu addresses to iova). The allocation capability, as "assign a free IOVA address to this chunk of memory in qemu's address space" allows shadow virtqueue to create a new address space that is not restricted by guest's addressable one, so we can allocate shadow vqs vrings outside of it. It duplicates the tree so it can search efficiently in both directions, and it will signal overlap if iova or the translated address is present in any tree. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-03-15vhost: Shadow virtqueue buffers forwardingEugenio Pérez
Initial version of shadow virtqueue that actually forward buffers. There is no iommu support at the moment, and that will be addressed in future patches of this series. Since all vhost-vdpa devices use forced IOMMU, this means that SVQ is not usable at this point of the series on any device. For simplicity it only supports modern devices, that expects vring in little endian, with split ring and no event idx or indirect descriptors. Support for them will not be added in this series. It reuses the VirtQueue code for the device part. The driver part is based on Linux's virtio_ring driver, but with stripped functionality and optimizations so it's easier to review. However, forwarding buffers have some particular pieces: One of the most unexpected ones is that a guest's buffer can expand through more than one descriptor in SVQ. While this is handled gracefully by qemu's emulated virtio devices, it may cause unexpected SVQ queue full. This patch also solves it by checking for this condition at both guest's kicks and device's calls. The code may be more elegant in the future if SVQ code runs in its own iocontext. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-03-15vdpa: adapt vhost_ops callbacks to svqEugenio Pérez
First half of the buffers forwarding part, preparing vhost-vdpa callbacks to SVQ to offer it. QEMU cannot enable it at this moment, so this is effectively dead code at the moment, but it helps to reduce patch size. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-03-15virtio: Add vhost_svq_get_vring_addrEugenio Pérez
It reports the shadow virtqueue address from qemu virtual address space. Since this will be different from the guest's vaddr, but the device can access it, SVQ takes special care about its alignment & lack of garbage data. It assumes that IOMMU will work in host_page_size ranges for that. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-03-15vhost: Add vhost_svq_valid_features to shadow vqEugenio Pérez
This allows SVQ to negotiate features with the guest and the device. For the device, SVQ is a driver. While this function bypasses all non-transport features, it needs to disable the features that SVQ does not support when forwarding buffers. This includes packed vq layout, indirect descriptors or event idx. Future changes can add support to offer more features to the guest, since the use of VirtQueue gives this for free. This is left out at the moment for simplicity. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-03-15vhost: Add Shadow VirtQueue call forwarding capabilitiesEugenio Pérez
This will make qemu aware of the device used buffers, allowing it to write the guest memory with its contents if needed. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-03-15vhost: Add Shadow VirtQueue kick forwarding capabilitiesEugenio Pérez
At this mode no buffer forwarding will be performed in SVQ mode: Qemu will just forward the guest's kicks to the device. Host memory notifiers regions are left out for simplicity, and they will not be addressed in this series. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-03-15vhost: Add VhostShadowVirtqueueEugenio Pérez
Vhost shadow virtqueue (SVQ) is an intermediate jump for virtqueue notifications and buffers, allowing qemu to track them. While qemu is forwarding the buffers and virtqueue changes, it is able to commit the memory it's being dirtied, the same way regular qemu's VirtIO devices do. This commit only exposes basic SVQ allocation and free. Next patches of the series add functionality like notifications and buffers forwarding. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-03-06vhost: use wfd on functions setting vring call fdSergio Lopez
When ioeventfd is emulated using qemu_pipe(), only EventNotifier's wfd can be used for writing. Use the recently introduced event_notifier_get_wfd() function to obtain the fd that our peer must use to signal the vring. Signed-off-by: Sergio Lopez <slp@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220304100854.14829-3-slp@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-06vhost-vsock: detach the virqueue element in case of errorStefano Garzarella
In vhost_vsock_common_send_transport_reset(), if an element popped from the virtqueue is invalid, we should call virtqueue_detach_element() to detach it from the virtqueue before freeing its memory. Fixes: fc0b9b0e1c ("vhost-vsock: add virtio sockets device") Fixes: CVE-2022-26354 Cc: qemu-stable@nongnu.org Reported-by: VictorV <vv474172261@gmail.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20220228095058.27899-1-sgarzare@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-06virtio-iommu: Support bypass domainJean-Philippe Brucker
The driver can create a bypass domain by passing the VIRTIO_IOMMU_ATTACH_F_BYPASS flag on the ATTACH request. Bypass domains perform slightly better than domains with identity mappings since they skip translation. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Message-Id: <20220214124356.872985-4-jean-philippe@linaro.org> Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-06virtio-iommu: Default to bypass during bootJean-Philippe Brucker
Currently the virtio-iommu device must be programmed before it allows DMA from any PCI device. This can make the VM entirely unusable when a virtio-iommu driver isn't present, for example in a bootloader that loads the OS from storage. Similarly to the other vIOMMU implementations, default to DMA bypassing the IOMMU during boot. Add a "boot-bypass" property, defaulting to true, that lets users change this behavior. Replace the VIRTIO_IOMMU_F_BYPASS feature, which didn't support bypass before feature negotiation, with VIRTIO_IOMMU_F_BYPASS_CONFIG. We add the bypass field to the migration stream without introducing subsections, based on the assumption that this virtio-iommu device isn't being used in production enough to require cross-version migration at the moment (all previous version required workarounds since they didn't support ACPI and boot-bypass). Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Message-Id: <20220214124356.872985-3-jean-philippe@linaro.org> Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-06vhost-vdpa: make notifiers _init()/_uninit() symmetricLaurent Vivier
vhost_vdpa_host_notifiers_init() initializes queue notifiers for queues "dev->vq_index" to queue "dev->vq_index + dev->nvqs", whereas vhost_vdpa_host_notifiers_uninit() uninitializes the same notifiers for queue "0" to queue "dev->nvqs". This asymmetry seems buggy, fix that by using dev->vq_index as the base for both. Fixes: d0416d487bd5 ("vhost-vdpa: map virtqueue notification area if possible") Cc: jasowang@redhat.com Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20220211161309.1385839-1-lvivier@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-06hw/virtio: vdpa: Fix leak of host-notifier memory-regionLaurent Vivier
If call virtio_queue_set_host_notifier_mr fails, should free host-notifier memory-region. This problem can trigger a coredump with some vDPA drivers (mlx5, but not with the vdpasim), if we unplug the virtio-net card from the guest after a stop/start. The same fix has been done for vhost-user: 1f89d3b91e3e ("hw/virtio: Fix leak of host-notifier memory-region") Fixes: d0416d487bd5 ("vhost-vdpa: map virtqueue notification area if possible") Cc: jasowang@redhat.com Resolves: https://bugzilla.redhat.com/2027208 Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20220211170259.1388734-1-lvivier@redhat.com> Cc: qemu-stable@nongnu.org Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-04hw/vhost-user-i2c: Add support for VIRTIO_I2C_F_ZERO_LENGTH_REQUESTViresh Kumar
VIRTIO_I2C_F_ZERO_LENGTH_REQUEST is a mandatory feature, that must be implemented by everyone. Add its support. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Message-Id: <fc47ab63b1cd414319c9201e8d6c7705b5ec3bd9.1644490993.git.viresh.kumar@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-04virtio: fix the condition for iommu_platform not supportedHalil Pasic
The commit 04ceb61a40 ("virtio: Fail if iommu_platform is requested, but unsupported") claims to fail the device hotplug when iommu_platform is requested, but not supported by the (vhost) device. On the first glance the condition for detecting that situation looks perfect, but because a certain peculiarity of virtio_platform it ain't. In fact the aforementioned commit introduces a regression. It breaks virtio-fs support for Secure Execution, and most likely also for AMD SEV or any other confidential guest scenario that relies encrypted guest memory. The same also applies to any other vhost device that does not support _F_ACCESS_PLATFORM. The peculiarity is that iommu_platform and _F_ACCESS_PLATFORM collates "device can not access all of the guest RAM" and "iova != gpa, thus device needs to translate iova". Confidential guest technologies currently rely on the device/hypervisor offering _F_ACCESS_PLATFORM, so that, after the feature has been negotiated, the guest grants access to the portions of memory the device needs to see. So in for confidential guests, generally, _F_ACCESS_PLATFORM is about the restricted access to memory, but not about the addresses used being something else than guest physical addresses. This is the very reason for which commit f7ef7e6e3b ("vhost: correctly turn on VIRTIO_F_IOMMU_PLATFORM") fences _F_ACCESS_PLATFORM from the vhost device that does not need it, because on the vhost interface it only means "I/O address translation is needed". This patch takes inspiration from f7ef7e6e3b ("vhost: correctly turn on VIRTIO_F_IOMMU_PLATFORM"), and uses the same condition for detecting the situation when _F_ACCESS_PLATFORM is requested, but no I/O translation by the device, and thus no device capability is needed. In this situation claiming that the device does not support iommu_plattform=on is counter-productive. So let us stop doing that! Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Reported-by: Jakob Naucke <Jakob.Naucke@ibm.com> Fixes: 04ceb61a40 ("virtio: Fail if iommu_platform is requested, but unsupported") Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: qemu-stable@nongnu.org Message-Id: <20220207112857.607829-1-pasic@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-03-04vhost-user: fix VirtQ notifier cleanupXueming Li
When vhost-user device cleanup, remove notifier MR and munmaps notifier address in the event-handling thread, VM CPU thread writing the notifier in concurrent fails with an error of accessing invalid address. It happens because MR is still being referenced and accessed in another thread while the underlying notifier mmap address is being freed and becomes invalid. This patch calls RCU and munmap notifiers in the callback after the memory flatview update finish. Fixes: 44866521bd6e ("vhost-user: support registering external host notifiers") Cc: qemu-stable@nongnu.org Signed-off-by: Xueming Li <xuemingl@nvidia.com> Message-Id: <20220207071929.527149-3-xuemingl@nvidia.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-04vhost-user: remove VirtQ notifier restoreXueming Li
Notifier set when vhost-user backend asks qemu to mmap an FD and offset. When vhost-user backend restart or getting killed, VQ notifier FD and mmap addresses become invalid. After backend restart, MR contains the invalid address will be restored and fail on notifier access. On the other hand, qemu should munmap the notifier, release underlying hardware resources to enable backend restart and allocate hardware notifier resources correctly. Qemu shouldn't reference and use resources of disconnected backend. This patch removes VQ notifier restore, uses the default vhost-user notifier to avoid invalid address access. After backend restart, the backend should ask qemu to install a hardware notifier if needed. Fixes: 44866521bd6e ("vhost-user: support registering external host notifiers") Cc: qemu-stable@nongnu.org Signed-off-by: Xueming Li <xuemingl@nvidia.com> Message-Id: <20220207071929.527149-2-xuemingl@nvidia.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-02-21include: Move qemu_madvise() and related #defines to new qemu/madvise.hPeter Maydell
The function qemu_madvise() and the QEMU_MADV_* constants associated with it are used in only 10 files. Move them out of osdep.h to a new qemu/madvise.h header that is included where it is needed. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20220208200856.3558249-2-peter.maydell@linaro.org
2022-02-21Mark remaining global TypeInfo instances as constBernhard Beschow
More than 1k of TypeInfo instances are already marked as const. Mark the remaining ones, too. This commit was created with: git grep -z -l 'static TypeInfo' -- '*.c' | \ xargs -0 sed -i 's/static TypeInfo/static const TypeInfo/' Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Corey Minyard <cminyard@mvista.com> Message-id: 20220117145805.173070-2-shentey@gmail.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>