aboutsummaryrefslogtreecommitdiff
path: root/hw/virtio/vhost-user.c
AgeCommit message (Collapse)Author
2023-11-07vhost-user: Interface for migration state transferHanna Czenczek
Add the interface for transferring the back-end's state during migration as defined previously in vhost-user.rst. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20231016134243.68248-6-hreitz@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-25kvm: require KVM_CAP_IOEVENTFD and KVM_CAP_IOEVENTFD_ANY_LENGTHPaolo Bonzini
KVM_CAP_IOEVENTFD_ANY_LENGTH was added in Linux 4.4, released in 2016. Assume that it is present. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-22vhost-user: fix lost reconnectLi Feng
When the vhost-user is reconnecting to the backend, and if the vhost-user fails at the get_features in vhost_dev_init(), then the reconnect will fail and it will not be retriggered forever. The reason is: When the vhost-user fails at get_features, the vhost_dev_cleanup will be called immediately. vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'. The reconnect path is: vhost_user_blk_event vhost_user_async_close(.. vhost_user_blk_disconnect ..) qemu_chr_fe_set_handlers <----- clear the notifier callback schedule vhost_user_async_close_bh The vhost->vdev is null, so the vhost_user_blk_disconnect will not be called, then the event fd callback will not be reinstalled. All vhost-user devices have this issue, including vhost-user-blk/scsi. With this patch, if the vdev->vdev is null, the fd callback will still be reinstalled. Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling") Signed-off-by: Li Feng <fengli@smartx.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Message-Id: <20231009044735.941655-6-fengli@smartx.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22vhost-user: do not send RESET_OWNER on device resetStefan Hajnoczi
The VHOST_USER_RESET_OWNER message is deprecated in the spec: This is no longer used. Used to be sent to request disabling all rings, but some back-ends interpreted it to also discard connection state (this interpretation would lead to bugs). It is recommended that back-ends either ignore this message, or use it to disable all rings. The only caller of vhost_user_reset_device() is vhost_user_scsi_reset(). It checks that F_RESET_DEVICE was negotiated before calling it: static void vhost_user_scsi_reset(VirtIODevice *vdev) { VHostSCSICommon *vsc = VHOST_SCSI_COMMON(vdev); struct vhost_dev *dev = &vsc->dev; /* * Historically, reset was not implemented so only reset devices * that are expecting it. */ if (!virtio_has_feature(dev->protocol_features, VHOST_USER_PROTOCOL_F_RESET_DEVICE)) { return; } if (dev->vhost_ops->vhost_reset_device) { dev->vhost_ops->vhost_reset_device(dev); } } Therefore VHOST_USER_RESET_OWNER is actually never sent by vhost_user_reset_device(). Remove the dead code. This effectively moves the vhost-user protocol specific code from vhost-user-scsi.c into vhost-user.c where it belongs. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20231004014532.1228637-2-stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
2023-10-22vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronouslyLaszlo Ersek
(1) The virtio-1.2 specification <http://docs.oasis-open.org/virtio/virtio/v1.2/virtio-v1.2.html> writes: > 3 General Initialization And Device Operation > 3.1 Device Initialization > 3.1.1 Driver Requirements: Device Initialization > > [...] > > 7. Perform device-specific setup, including discovery of virtqueues for > the device, optional per-bus setup, reading and possibly writing the > device’s virtio configuration space, and population of virtqueues. > > 8. Set the DRIVER_OK status bit. At this point the device is “live”. and > 4 Virtio Transport Options > 4.1 Virtio Over PCI Bus > 4.1.4 Virtio Structure PCI Capabilities > 4.1.4.3 Common configuration structure layout > 4.1.4.3.2 Driver Requirements: Common configuration structure layout > > [...] > > The driver MUST configure the other virtqueue fields before enabling the > virtqueue with queue_enable. > > [...] (The same statements are present in virtio-1.0 identically, at <http://docs.oasis-open.org/virtio/virtio/v1.0/virtio-v1.0.html>.) These together mean that the following sub-sequence of steps is valid for a virtio-1.0 guest driver: (1.1) set "queue_enable" for the needed queues as the final part of device initialization step (7), (1.2) set DRIVER_OK in step (8), (1.3) immediately start sending virtio requests to the device. (2) When vhost-user is enabled, and the VHOST_USER_F_PROTOCOL_FEATURES special virtio feature is negotiated, then virtio rings start in disabled state, according to <https://qemu-project.gitlab.io/qemu/interop/vhost-user.html#ring-states>. In this case, explicit VHOST_USER_SET_VRING_ENABLE messages are needed for enabling vrings. Therefore setting "queue_enable" from the guest (1.1) -- which is technically "buffered" on the QEMU side until the guest sets DRIVER_OK (1.2) -- is a *control plane* operation, which -- after (1.2) -- travels from the guest through QEMU to the vhost-user backend, using a unix domain socket. Whereas sending a virtio request (1.3) is a *data plane* operation, which evades QEMU -- it travels from guest to the vhost-user backend via eventfd. This means that operations ((1.1) + (1.2)) and (1.3) travel through different channels, and their relative order can be reversed, as perceived by the vhost-user backend. That's exactly what happens when OVMF's virtiofs driver (VirtioFsDxe) runs against the Rust-language virtiofsd version 1.7.2. (Which uses version 0.10.1 of the vhost-user-backend crate, and version 0.8.1 of the vhost crate.) Namely, when VirtioFsDxe binds a virtiofs device, it goes through the device initialization steps (i.e., control plane operations), and immediately sends a FUSE_INIT request too (i.e., performs a data plane operation). In the Rust-language virtiofsd, this creates a race between two components that run *concurrently*, i.e., in different threads or processes: - Control plane, handling vhost-user protocol messages: The "VhostUserSlaveReqHandlerMut::set_vring_enable" method [crates/vhost-user-backend/src/handler.rs] handles VHOST_USER_SET_VRING_ENABLE messages, and updates each vring's "enabled" flag according to the message processed. - Data plane, handling virtio / FUSE requests: The "VringEpollHandler::handle_event" method [crates/vhost-user-backend/src/event_loop.rs] handles the incoming virtio / FUSE request, consuming the virtio kick at the same time. If the vring's "enabled" flag is set, the virtio / FUSE request is processed genuinely. If the vring's "enabled" flag is clear, then the virtio / FUSE request is discarded. Note that OVMF enables the queue *first*, and sends FUSE_INIT *second*. However, if the data plane processor in virtiofsd wins the race, then it sees the FUSE_INIT *before* the control plane processor took notice of VHOST_USER_SET_VRING_ENABLE and green-lit the queue for the data plane processor. Therefore the latter drops FUSE_INIT on the floor, and goes back to waiting for further virtio / FUSE requests with epoll_wait. Meanwhile OVMF is stuck waiting for the FUSET_INIT response -- a deadlock. The deadlock is not deterministic. OVMF hangs infrequently during first boot. However, OVMF hangs almost certainly during reboots from the UEFI shell. The race can be "reliably masked" by inserting a very small delay -- a single debug message -- at the top of "VringEpollHandler::handle_event", i.e., just before the data plane processor checks the "enabled" field of the vring. That delay suffices for the control plane processor to act upon VHOST_USER_SET_VRING_ENABLE. We can deterministically prevent the race in QEMU, by blocking OVMF inside step (1.2) -- i.e., in the write to the device status register that "unleashes" queue enablement -- until VHOST_USER_SET_VRING_ENABLE actually *completes*. That way OVMF's VCPU cannot advance to the FUSE_INIT submission before virtiofsd's control plane processor takes notice of the queue being enabled. Wait for VHOST_USER_SET_VRING_ENABLE completion by: - setting the NEED_REPLY flag on VHOST_USER_SET_VRING_ENABLE, and waiting for the reply, if the VHOST_USER_PROTOCOL_F_REPLY_ACK vhost-user feature has been negotiated, or - performing a separate VHOST_USER_GET_FEATURES *exchange*, which requires a backend response regardless of VHOST_USER_PROTOCOL_F_REPLY_ACK. Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost) Cc: Eugenio Perez Martin <eperezma@redhat.com> Cc: German Maglione <gmaglione@redhat.com> Cc: Liu Jiang <gerry@linux.alibaba.com> Cc: Sergio Lopez Pascual <slp@redhat.com> Cc: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Tested-by: Albert Esteve <aesteve@redhat.com> [lersek@redhat.com: work Eugenio's explanation into the commit message, about QEMU containing step (1.1) until step (1.2)] Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20231002203221.17241-8-lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22vhost-user: allow "vhost_set_vring" to wait for a replyLaszlo Ersek
The "vhost_set_vring" function already centralizes the common parts of "vhost_user_set_vring_num", "vhost_user_set_vring_base" and "vhost_user_set_vring_enable". We'll want to allow some of those callers to wait for a reply. Therefore, rebase "vhost_set_vring" from just "vhost_user_write" to "vhost_user_write_sync", exposing the "wait_for_reply" parameter. This is purely refactoring -- there is no observable change. That's because: - all three callers pass in "false" for "wait_for_reply", which disables all logic in "vhost_user_write_sync" except the call to "vhost_user_write"; - the fds=NULL and fd_num=0 arguments of the original "vhost_user_write" call inside "vhost_set_vring" are hard-coded within "vhost_user_write_sync". Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost) Cc: Eugenio Perez Martin <eperezma@redhat.com> Cc: German Maglione <gmaglione@redhat.com> Cc: Liu Jiang <gerry@linux.alibaba.com> Cc: Sergio Lopez Pascual <slp@redhat.com> Cc: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Tested-by: Albert Esteve <aesteve@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20231002203221.17241-7-lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22vhost-user: hoist "write_sync", "get_features", "get_u64"Laszlo Ersek
In order to avoid a forward-declaration for "vhost_user_write_sync" in a subsequent patch, hoist "vhost_user_write_sync" -> "vhost_user_get_features" -> "vhost_user_get_u64" just above "vhost_set_vring". This is purely code movement -- no observable change. Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost) Cc: Eugenio Perez Martin <eperezma@redhat.com> Cc: German Maglione <gmaglione@redhat.com> Cc: Liu Jiang <gerry@linux.alibaba.com> Cc: Sergio Lopez Pascual <slp@redhat.com> Cc: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Tested-by: Albert Esteve <aesteve@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20231002203221.17241-6-lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22vhost-user: flatten "enforce_reply" into "vhost_user_write_sync"Laszlo Ersek
At this point, only "vhost_user_write_sync" calls "enforce_reply"; embed the latter into the former. This is purely refactoring -- no observable change. Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost) Cc: Eugenio Perez Martin <eperezma@redhat.com> Cc: German Maglione <gmaglione@redhat.com> Cc: Liu Jiang <gerry@linux.alibaba.com> Cc: Sergio Lopez Pascual <slp@redhat.com> Cc: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Tested-by: Albert Esteve <aesteve@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20231002203221.17241-5-lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22vhost-user: factor out "vhost_user_write_sync"Laszlo Ersek
The tails of the "vhost_user_set_vring_addr" and "vhost_user_set_u64" functions are now byte-for-byte identical. Factor the common tail out to a new function called "vhost_user_write_sync". This is purely refactoring -- no observable change. Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost) Cc: Eugenio Perez Martin <eperezma@redhat.com> Cc: German Maglione <gmaglione@redhat.com> Cc: Liu Jiang <gerry@linux.alibaba.com> Cc: Sergio Lopez Pascual <slp@redhat.com> Cc: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Tested-by: Albert Esteve <aesteve@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20231002203221.17241-4-lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22vhost-user: tighten "reply_supported" scope in "set_vring_addr"Laszlo Ersek
In the vhost_user_set_vring_addr() function, we calculate "reply_supported" unconditionally, even though we'll only need it if "wait_for_reply" is also true. Restrict the scope of "reply_supported" to the minimum. This is purely refactoring -- no observable change. Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost) Cc: Eugenio Perez Martin <eperezma@redhat.com> Cc: German Maglione <gmaglione@redhat.com> Cc: Liu Jiang <gerry@linux.alibaba.com> Cc: Sergio Lopez Pascual <slp@redhat.com> Cc: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Tested-by: Albert Esteve <aesteve@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20231002203221.17241-3-lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-22vhost-user: strip superfluous whitespaceLaszlo Ersek
Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost) Cc: Eugenio Perez Martin <eperezma@redhat.com> Cc: German Maglione <gmaglione@redhat.com> Cc: Liu Jiang <gerry@linux.alibaba.com> Cc: Sergio Lopez Pascual <slp@redhat.com> Cc: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Albert Esteve <aesteve@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20231002203221.17241-2-lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-12vhost: Remove vhost_backend_can_merge() callbackDavid Hildenbrand
Checking whether the memory regions are equal is sufficient: if they are equal, then most certainly the contained fd is equal. The whole vhost-user memslot handling is suboptimal and overly complicated. We shouldn't have to lookup a RAM memory regions we got notified about in vhost_user_get_mr_data() using a host pointer. But that requires a bigger rework -- especially an alternative vhost_set_mem_table() backend call that simply consumes MemoryRegionSections. For now, let's just drop vhost_backend_can_merge(). Message-ID: <20230926185738.277351-3-david@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12vhost: Rework memslot filtering and fix "used_memslot" trackingDavid Hildenbrand
Having multiple vhost devices, some filtering out fd-less memslots and some not, can mess up the "used_memslot" accounting. Consequently our "free memslot" checks become unreliable and we might run out of free memslots at runtime later. An example sequence which can trigger a potential issue that involves different vhost backends (vhost-kernel and vhost-user) and hotplugged memory devices can be found at [1]. Let's make the filtering mechanism less generic and distinguish between backends that support private memslots (without a fd) and ones that only support shared memslots (with a fd). Track the used_memslots for both cases separately and use the corresponding value when required. Note: Most probably we should filter out MAP_PRIVATE fd-based RAM regions (for example, via memory-backend-memfd,...,shared=off or as default with memory-backend-file) as well. When not using MAP_SHARED, it might not work as expected. Add a TODO for now. [1] https://lkml.kernel.org/r/fad9136f-08d3-3fd9-71a1-502069c000cf@redhat.com Message-ID: <20230926185738.277351-2-david@redhat.com> Fixes: 988a27754bbb ("vhost: allow backends to filter memory sections") Cc: Tiwei Bie <tiwei.bie@intel.com> Acked-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-04vhost-user: add shared_object msgAlbert Esteve
Add three new vhost-user protocol `VHOST_USER_BACKEND_SHARED_OBJECT_* messages`. These new messages are sent from vhost-user back-ends to interact with the virtio-dmabuf table in order to add or remove themselves as virtio exporters, or lookup for virtio dma-buf shared objects. The action taken in the front-end depends on the type stored in the virtio shared object hash table. When the table holds a pointer to a vhost backend for a given UUID, the front-end sends a VHOST_USER_GET_SHARED_OBJECT to the backend holding the shared object. The messages can only be sent after successfully negotiating a new VHOST_USER_PROTOCOL_F_SHARED_OBJECT vhost-user protocol feature bit. Finally, refactor code to send response message so that all common parts both for the common REPLY_ACK case, and other data responses, can call it and avoid code repetition. Signed-off-by: Albert Esteve <aesteve@redhat.com> Message-Id: <20231002065706.94707-4-aesteve@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-10-04vhost-user: move VhostUserProtocolFeature definition to header fileJonah Palmer
Move the definition of VhostUserProtocolFeature to include/hw/virtio/vhost-user.h. Remove previous definitions in hw/scsi/vhost-user-scsi.c, hw/virtio/vhost-user.c, and hw/virtio/virtio-qmp.c. Previously there were 3 separate definitions of this over 3 different files. Now only 1 definition of this will be present for these 3 files. Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Reviewed-by: Emmanouil Pitsidianakis <manos.pitsidianakis@linaro.org> Message-Id: <20230926224107.2951144-4-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-07-10vhost-user: Make RESET_DEVICE a per device messageTom Lonergan
A device reset is issued per device, not per VQ. The legacy device reset message, VHOST_USER_RESET_OWNER, is already a per device message. Therefore, this change adds the proper message, VHOST_USER_RESET_DEVICE, to per device messages. Signed-off-by: Tom Lonergan <tom.lonergan@nutanix.com> Message-Id: <20230628163927.108171-3-tom.lonergan@nutanix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
2023-07-10vhost-user: Change one_time to per_device requestTom Lonergan
Some devices, like virtio-scsi, consist of one vhost_dev, while others, like virtio-net, contain multiple vhost_devs. The QEMU vhost-user code has a concept of one-time messages which is misleading. One-time messages are sent once per operation on the device, not once for the lifetime of the device. Therefore, as discussed in [1], vhost_user_one_time_request should be renamed to vhost_user_per_device_request and the relevant comments updated to match the real functionality. [1] https://lore.kernel.org/qemu-devel/20230127083027-mutt-send-email-mst@kernel.org/ Signed-off-by: Tom Lonergan <tom.lonergan@nutanix.com> Message-Id: <20230628163927.108171-2-tom.lonergan@nutanix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
2023-06-26vhost-user: fully use new backend/frontend namingManos Pitsidianakis
Slave/master nomenclature was replaced with backend/frontend in commit 1fc19b65279a ("vhost-user: Adopt new backend naming") This patch replaces all remaining uses of master and slave in the codebase. Signed-off-by: Emmanouil Pitsidianakis <manos.pitsidianakis@linaro.org> Message-Id: <20230613080849.2115347-1-manos.pitsidianakis@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2023-06-23cryptodev-vhost-user: add asymmetric crypto supportGowrishankar Muthukrishnan
Add asymmetric crypto support in vhost_user backend. Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com> Message-Id: <20230516083139.2349744-1-gmuthukrishn@marvell.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-05-23hostmem-file: add offset optionAlexander Graf
Add an option for hostmem-file to start the memory object at an offset into the target file. This is useful if multiple memory objects reside inside the same target file, such as a device node. In particular, it's useful to map guest memory directly into /dev/mem for experimentation. To make this work consistently, also fix up all places in QEMU that expect fd offsets to be 0. Signed-off-by: Alexander Graf <graf@amazon.com> Message-Id: <20230403221421.60877-1-graf@amazon.com> Acked-by: Markus Armbruster <armbru@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2023-05-19vhost-user: Remove acpi-specific memslot limitDavid Hildenbrand
Let's just support 512 memslots on x86-64 and aarch64 as well. The maximum number of ACPI slots (256) is no longer completely expressive ever since we supported virtio-based memory devices. Further, we're completely ignoring other memslots used outside of memory device context, such as memslots used for boot memory. Note that the vhost memslot limit in the kernel is usually configured to be 509. With this change, we prepare vhost-user on the QEMU side to be closer to that limit, to eventually support ~512 memslots in most vhost implementations and have less "surprises" when cold/hotplugging vhost devices while also consuming more memslots than we're currently used to by memory devices (e.g., once virtio-mem starts using multiple memslots). Note that most vhost-user implementations only support a small number of memslots so far, which we can hopefully improve in the near future. We'll leave the PPC special-case as is for now. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20230503184144.808478-1-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-05-19vhost-user: send SET_STATUS 0 after GET_VRING_BASEStefan Hajnoczi
Setting the VIRTIO Device Status Field to 0 resets the device. The device's state is lost, including the vring configuration. vhost-user.c currently sends SET_STATUS 0 before GET_VRING_BASE. This risks confusion about the lifetime of the vhost-user state (e.g. vring last_avail_idx) across VIRTIO device reset. Eugenio Pérez <eperezma@redhat.com> adjusted the order for vhost-vdpa.c in commit c3716f260bff ("vdpa: move vhost reset after get vring base") and in that commit description suggested doing the same for vhost-user in the future. Go ahead and adjust vhost-user.c now. I ran various online code searches to identify vhost-user backends implementing SET_STATUS. It seems only DPDK implements SET_STATUS and Yajun Wu <yajunw@nvidia.com> has confirmed that it is safe to make this change. Fixes: commit 923b8921d210763359e96246a58658ac0db6c645 ("vhost-user: Support vhost_dev_start") Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Cindy Lu <lulu@redhat.com> Cc: Yajun Wu <yajunw@nvidia.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230501230409.274178-1-stefanha@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Yajun Wu <yajunw@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07hw/virtio/vhost-user: avoid using unitialized errpAlbert Esteve
During protocol negotiation, when we the QEMU stub does not support a backend with F_CONFIG, it throws a warning and supresses the VHOST_USER_PROTOCOL_F_CONFIG bit. However, the warning uses warn_reportf_err macro and passes an unitialized errp pointer. However, the macro tries to edit the 'msg' member of the unitialized Error and segfaults. Instead, just use warn_report, which prints a warning message directly to the output. Fixes: 5653493 ("hw/virtio/vhost-user: don't suppress F_CONFIG when supported") Signed-off-by: Albert Esteve <aesteve@redhat.com> Message-Id: <20230302121719.9390-1-aesteve@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-02vhost-user: Adopt new backend namingMaxime Coquelin
The Vhost-user specification changed feature and request naming from _SLAVE_ to _BACKEND_. This patch adopts the new naming convention. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Message-Id: <20230208203259.381326-4-maxime.coquelin@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-01-28Revert "vhost-user: Introduce nested event loop in vhost_user_read()"Greg Kurz
This reverts commit a7f523c7d114d445c5d83aecdba3efc038e5a692. The nested event loop is broken by design. It's only user was removed. Drop the code as well so that nobody ever tries to use it again. I had to fix a couple of trivial conflicts around return values because of 025faa872bcf ("vhost-user: stick to -errno error return convention"). Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20230119172424.478268-3-groug@kaod.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2023-01-28Revert "vhost-user: Monitor slave channel in vhost_user_read()"Greg Kurz
This reverts commit db8a3772e300c1a656331a92da0785d81667dc81. Motivation : this is breaking vhost-user with DPDK as reported in [0]. Received unexpected msg type. Expected 22 received 40 Fail to update device iotlb Received unexpected msg type. Expected 40 received 22 Received unexpected msg type. Expected 22 received 11 Fail to update device iotlb Received unexpected msg type. Expected 11 received 22 vhost VQ 1 ring restore failed: -71: Protocol error (71) Received unexpected msg type. Expected 22 received 11 Fail to update device iotlb Received unexpected msg type. Expected 11 received 22 vhost VQ 0 ring restore failed: -71: Protocol error (71) unable to start vhost net: 71: falling back on userspace virtio The failing sequence that leads to the first error is : - QEMU sends a VHOST_USER_GET_STATUS (40) request to DPDK on the master socket - QEMU starts a nested event loop in order to wait for the VHOST_USER_GET_STATUS response and to be able to process messages from the slave channel - DPDK sends a couple of legitimate IOTLB miss messages on the slave channel - QEMU processes each IOTLB request and sends VHOST_USER_IOTLB_MSG (22) updates on the master socket - QEMU assumes to receive a response for the latest VHOST_USER_IOTLB_MSG but it gets the response for the VHOST_USER_GET_STATUS instead The subsequent errors have the same root cause : the nested event loop breaks the order by design. It lures QEMU to expect responses to the latest message sent on the master socket to arrive first. Since this was only needed for DAX enablement which is still not merged upstream, just drop the code for now. A working solution will have to be merged later on. Likely protect the master socket with a mutex and service the slave channel with a separate thread, as discussed with Maxime in the mail thread below. [0] https://lore.kernel.org/qemu-devel/43145ede-89dc-280e-b953-6a2b436de395@redhat.com/ Reported-by: Yanghang Liu <yanghliu@redhat.com> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2155173 Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20230119172424.478268-2-groug@kaod.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2023-01-28vhost-user: Skip unnecessary duplicated VHOST_USER_ADD/REM_MEM_REG requestsMinghao Yuan
The VHOST_USER_ADD/REM_MEM_REG requests should be categorized into non-vring specific messages, and should be sent only once. Signed-off-by: Minghao Yuan <yuanmh12@chinatelecom.cn> Message-Id: <20230123122119.194347-1-yuanmh12@chinatelecom.cn> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-01-27vhost-user: Correct a reference of TARGET_AARCH64Akihiko Odaki
Presumably TARGET_ARM_64 should be a mistake of TARGET_AARCH64. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20230109063130.81296-1-akihiko.odaki@daynix.com> Fixes: 27598393a2 ("Lift max memory slots limit imposed by vhost-user") Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-21vhost-user: send set log base message only onceYajun Wu
Vhost message VHOST_USER_SET_LOG_BASE is device wide. So only send it once with the first queue pair. Signed-off-by: Yajun Wu <yajunw@nvidia.com> Acked-by: Parav Pandit <parav@nvidia.com> Message-Id: <20221122051447.248462-1-yajunw@nvidia.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-14cleanup: Tweak and re-run return_directly.cocciMarkus Armbruster
Tweak the semantic patch to drop redundant parenthesis around the return expression. Coccinelle drops a comment in hw/rdma/vmw/pvrdma_cmd.c; restored manually. Coccinelle messes up vmdk_co_create(), not sure why. Change dropped, will be done manually in the next commit. Line breaks in target/avr/cpu.h and hw/rdma/vmw/pvrdma_cmd.c tidied up manually. Whitespace in tools/virtiofsd/fuse_lowlevel.c tidied up manually. checkpatch.pl complains "return of an errno should typically be -ve" two times for hw/9pfs/9p-synth.c. Preexisting, the patch merely makes it visible to checkpatch.pl. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221122134917.1217307-2-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2022-12-01hw/virtio: generalise CHR_EVENT_CLOSED handlingAlex Bennée
..and use for both virtio-user-blk and virtio-user-gpio. This avoids the circular close by deferring shutdown due to disconnection until a later point. virtio-user-blk already had this mechanism in place so generalise it as a vhost-user helper function and use for both blk and gpio devices. While we are at it we also fix up vhost-user-gpio to re-establish the event handler after close down so we can reconnect later. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Message-Id: <20221130112439.2527228-5-alex.bennee@linaro.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-11-07vhost-user: Support vhost_dev_startYajun Wu
The motivation of adding vhost-user vhost_dev_start support is to improve backend configuration speed and reduce live migration VM downtime. Today VQ configuration is issued one by one. For virtio net with multi-queue support, backend needs to update RSS (Receive side scaling) on every rx queue enable. Updating RSS is time-consuming (typical time like 7ms). Implement already defined vhost status and message in the vhost specification [1]. (a) VHOST_USER_PROTOCOL_F_STATUS (b) VHOST_USER_SET_STATUS (c) VHOST_USER_GET_STATUS Send message VHOST_USER_SET_STATUS with VIRTIO_CONFIG_S_DRIVER_OK for device start and reset(0) for device stop. On reception of the DRIVER_OK message, backend can apply the needed setting only once (instead of incremental) and also utilize parallelism on enabling queues. This improves QEMU's live migration downtime with vhost user backend implementation by great margin, specially for the large number of VQs of 64 from 800 msec to 250 msec. [1] https://qemu-project.gitlab.io/qemu/interop/vhost-user.html Signed-off-by: Yajun Wu <yajunw@nvidia.com> Acked-by: Parav Pandit <parav@nvidia.com> Message-Id: <20221017064452.1226514-3-yajunw@nvidia.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-11-07vhost-user: Fix out of order vring host notification handlingYajun Wu
vhost backend sends host notification for every VQ. If backend creates VQs in parallel, the VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG may arrive to QEMU in different order than incremental queue index order. For example VQ 1's message arrive earlier than VQ 0's: After alloc VhostUserHostNotifier for VQ 1. GPtrArray becomes [ nil, VQ1 pointer ] After alloc VhostUserHostNotifier for VQ 0. GPtrArray becomes [ VQ0 pointer, nil, VQ1 pointer ] This is wrong. fetch_notifier will return NULL for VQ 1 in vhost_user_get_vring_base, causes host notifier miss removal(leak). The fix is to remove current element from GPtrArray, make the right position for element to insert. Fixes: 503e355465 ("virtio/vhost-user: dynamically assign VhostUserHostNotifiers") Signed-off-by: Yajun Wu <yajunw@nvidia.com> Acked-by: Parav Pandit <parav@nvidia.com> Message-Id: <20221018023651.1359420-1-yajunw@nvidia.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-10-07hw/virtio: fix some coding style issuesAlex Bennée
Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220802095010.3330793-8-alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-10-07hw/virtio: incorporate backend features in featuresAlex Bennée
There are some extra bits used over a vhost-user connection which are hidden from the device itself. We need to set them here to ensure we enable things like the protocol extensions. Currently net/vhost-user.c has it's own inscrutable way of persisting this data but it really should live in the core vhost_user code. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220726192150.2435175-7-alex.bennee@linaro.org> Message-Id: <20220802095010.3330793-2-alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-09-29vhost-user: Call qemu_socketpair() instead of socketpair()Guoyi Tu
As the close-on-exec flags is not set on the file descriptors returned by socketpair() at default, the fds will survive across exec' function. In the case that exec' function get invoked, such as the live-update feature which is been developing, it will cause fd leaks. To address this problem, we should call qemu_socketpair() to create an pair of connected sockets with the close-on-exec flag set. Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn> Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <7002b12a5fb0a30cd878e14e07da61c36da72913.1661240709.git.tugy@chinatelecom.cn>
2022-08-17hw/virtio: fix vhost_user_read tracepointAlex Bennée
As reads happen in the callback we were never seeing them. We only really care about the header so move the tracepoint to when the header is complete. Fixes: 6ca6d8ee9d (hw/virtio: add vhost_user_[read|write] trace points) Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220728135503.1060062-5-alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-06-27vhost: add method vhost_set_vring_errKonstantin Khlebnikov
Kernel and user vhost may report virtqueue errors via eventfd. This is only reliable way to get notification about protocol error. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Message-Id: <20220623161325.18813-2-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru>
2022-06-16virtio/vhost-user: Fix wrong vhost notifier GPtrArray sizeYajun Wu
In fetch_or_create_notifier, idx begins with 0. So the GPtrArray size should be idx + 1 and g_ptr_array_set_size should be called with idx + 1. This wrong GPtrArray size causes fetch_or_create_notifier return an invalid address. Passing this invalid pointer to vhost_user_host_notifier_remove causes assert fail: qemu/include/qemu/int128.h:27: int128_get64: Assertion `r == a' failed. shutting down, reason=crashed Backends like dpdk-vdpa which sends out vhost notifier requests almost always hit qemu crash. Fixes: 503e355465 ("virtio/vhost-user: dynamically assign VhostUserHostNotifiers") Signed-off-by: Yajun Wu <yajunw@nvidia.com> Acked-by: Parav Pandit <parav@nvidia.com> Change-Id: I87e0f7591ca9a59d210879b260704a2d9e9d6bcd Message-Id: <20220526034851.683258-1-yajunw@nvidia.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Eddie Dong <eddie.dong@intel.com>
2022-06-09hw/virtio/vhost-user: don't use uninitialized variableChangpeng Liu
Variable `vdev` in `struct vhost_dev` will not be ready until start the device, so let's not use it for the error output here. Fixes: 5653493 ("hw/virtio/vhost-user: don't suppress F_CONFIG when supported") Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Message-Id: <20220525125540.50979-1-changpeng.liu@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Fixes: 5653493 ("hw/virtio/vhost-user: don't suppress F_CONFIG when supported") Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Fixes: 5653493 ("hw/virtio/vhost-user: don't suppress F_CONFIG when supported") Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
2022-05-16virtio/vhost-user: dynamically assign VhostUserHostNotifiersAlex Bennée
At a couple of hundred bytes per notifier allocating one for every potential queue is very wasteful as most devices only have a few queues. Instead of having this handled statically dynamically assign them and track in a GPtrArray. [AJB: it's hard to trigger the vhost notifiers code, I assume as it requires a KVM guest with appropriate backend] Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220321153037.3622127-14-alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16hw/virtio/vhost-user: don't suppress F_CONFIG when supportedAlex Bennée
Previously we would silently suppress VHOST_USER_PROTOCOL_F_CONFIG during the protocol negotiation if the QEMU stub hadn't implemented the vhost_dev_config_notifier. However this isn't the only way we can handle config messages, the existing vdc->get/set_config can do this as well. Lightly re-factor the code to check for both potential methods and instead of silently squashing the feature error out. It is unlikely that a vhost-user backend expecting to handle CONFIG messages will behave correctly if they never get sent. Fixes: 1c3e5a2617 ("vhost-user: back SET/GET_CONFIG requests with a protocol feature") Cc: Maxime Coquelin <maxime.coquelin@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220321153037.3622127-13-alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-16hw/virtio: add vhost_user_[read|write] trace pointsAlex Bennée
These are useful when trying to debug the initial vhost-user negotiation, especially when it hard to get logging from the low level library on the other side. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220321153037.3622127-4-alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-05-05vhost-user: Use correct macro name TARGET_PPC64Murilo Opsfelder Araujo
The correct name of the macro is TARGET_PPC64. Fixes: 27598393a232 ("Lift max memory slots limit imposed by vhost-user") Reported-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> Cc: Raphael Norwitz <raphael.norwitz@nutanix.com> Cc: Peter Turschmid <peter.turschm@nutanix.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Message-Id: <20220503180108.34506-1-muriloo@linux.ibm.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-04vhost-user: Don't pass file descriptor for VHOST_USER_REM_MEM_REGKevin Wolf
The spec clarifies now that QEMU should not send a file descriptor in a request to remove a memory region. Change it accordingly. For libvhost-user, this is a bug fix that makes it compatible with rust-vmm's implementation that doesn't send a file descriptor. Keep accepting, but ignoring a file descriptor for compatibility with older QEMU versions. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20220407133657.155281-4-kwolf@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-05-03util: rename qemu_*block() socket functionsMarc-André Lureau
The qemu_*block() functions are meant to be be used with sockets (the win32 implementation expects SOCKET) Over time, those functions where used with Win32 SOCKET or file-descriptors interchangeably. But for portability, they must only be used with socket-like file-descriptors. FDs can use g_unix_set_fd_nonblocking() instead. Rename the functions with "socket" in the name to prevent bad usages. This is effectively reverting commit f9e8cacc5557e43 ("oslib-posix: rename socket_set_nonblock() to qemu_set_nonblock()"). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-04-06Replace qemu_real_host_page variables with inlined functionsMarc-André Lureau
Replace the global variables with inlined helper functions. getpagesize() is very likely annotated with a "const" function attribute (at least with glibc), and thus optimization should apply even better. This avoids the need for a constructor initialization too. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-12-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-04vhost-user: fix VirtQ notifier cleanupXueming Li
When vhost-user device cleanup, remove notifier MR and munmaps notifier address in the event-handling thread, VM CPU thread writing the notifier in concurrent fails with an error of accessing invalid address. It happens because MR is still being referenced and accessed in another thread while the underlying notifier mmap address is being freed and becomes invalid. This patch calls RCU and munmap notifiers in the callback after the memory flatview update finish. Fixes: 44866521bd6e ("vhost-user: support registering external host notifiers") Cc: qemu-stable@nongnu.org Signed-off-by: Xueming Li <xuemingl@nvidia.com> Message-Id: <20220207071929.527149-3-xuemingl@nvidia.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-04vhost-user: remove VirtQ notifier restoreXueming Li
Notifier set when vhost-user backend asks qemu to mmap an FD and offset. When vhost-user backend restart or getting killed, VQ notifier FD and mmap addresses become invalid. After backend restart, MR contains the invalid address will be restored and fail on notifier access. On the other hand, qemu should munmap the notifier, release underlying hardware resources to enable backend restart and allocate hardware notifier resources correctly. Qemu shouldn't reference and use resources of disconnected backend. This patch removes VQ notifier restore, uses the default vhost-user notifier to avoid invalid address access. After backend restart, the backend should ask qemu to install a hardware notifier if needed. Fixes: 44866521bd6e ("vhost-user: support registering external host notifiers") Cc: qemu-stable@nongnu.org Signed-off-by: Xueming Li <xuemingl@nvidia.com> Message-Id: <20220207071929.527149-2-xuemingl@nvidia.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07vhost-user: stick to -errno error return conventionRoman Kagan
VhostOps methods in user_ops are not very consistent in their error returns: some return negated errno while others just -1. Make sure all of them consistently return negated errno. This also helps error propagation from the functions being called inside. Besides, this synchronizes the error return convention with the other two vhost backends, kernel and vdpa, and will therefore allow for consistent error propagation in the generic vhost code (in a followup patch). Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru> Message-Id: <20211111153354.18807-9-rvkagan@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>