slackcoder/qemu - QEMU is a generic and open source machine & userspace emulator and virtualizer

Age	Commit message (Collapse)	Author
2019-11-06	virtio: notify virtqueue via host notifier when available	Stefan Hajnoczi
	Host notifiers are used in several cases: 1. Traditional ioeventfd where virtqueue notifications are handled in the main loop thread. 2. IOThreads (aio_handle_output) where virtqueue notifications are handled in an IOThread AioContext. 3. vhost where virtqueue notifications are handled by kernel vhost or a vhost-user device backend. Most virtqueue notifications from the guest use the ioeventfd mechanism, but there are corner cases where QEMU code calls virtio_queue_notify(). This currently honors the host notifier for the IOThreads aio_handle_output case, but not for the vhost case. The result is that vhost does not receive virtqueue notifications from QEMU when virtio_queue_notify() is called. This patch extends virtio_queue_notify() to set the host notifier whenever it is enabled instead of calling the vq->(aio_)handle_output() function directly. We track the host notifier state for each virtqueue separately since some devices may use it only for certain virtqueues. This fixes the vhost case although it does add a trip through the eventfd for the traditional ioeventfd case. I don't think it's worth adding a fast path for the traditional ioeventfd case because calling virtio_queue_notify() is rare when ioeventfd is enabled. Reported-by: Felipe Franciosi <felipe@nutanix.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20191105140946.165584-1-stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-10-29	virtio: Use auto rcu_read macros	Dr. David Alan Gilbert
	Use RCU_READ_LOCK_GUARD and WITH_RCU_READ_LOCK_GUARD to replace the manual rcu_read_(un)lock calls. I think the only change is virtio_load which was missing unlocks in error paths; those end up being fatal errors so it's not that important anyway. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20191028161109.60205-1-dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-10-29	Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into ↵	Peter Maydell
	staging # gpg: Signature made Tue 29 Oct 2019 02:33:36 GMT # gpg: using RSA key EF04965B398D6211 # gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [marginal] # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211 * remotes/jasowang/tags/net-pull-request: COLO-compare: Fix incorrect `if` logic virtio-net: prevent offloads reset on migration virtio: new post_load hook net: add tulip (dec21143) driver Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-29	virtio: new post_load hook	Michael S. Tsirkin
	Post load hook in virtio vmsd is called early while device is processed, and when VirtIODevice core isn't fully initialized. Most device specific code isn't ready to deal with a device in such state, and behaves weirdly. Add a new post_load hook in a device class instead. Devices should use this unless they specifically want to verify the migration stream as it's processed, e.g. for bounds checking. Cc: qemu-stable@nongnu.org Suggested-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Cc: Mikhail Sennikovsky <mikhail.sennikovskii@cloud.ionos.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2019-10-25	virtio: drop unused virtio_device_stop_ioeventfd() function	Stefan Hajnoczi
	virtio_device_stop_ioeventfd() has not been used since commit 310837de6c1e0badfd736b1b316b1698c53120a7 ("virtio: introduce grab/release_ioeventfd to fix vhost") in 2016. Nowadays ioeventfd is stopped implicitly by the virtio transport when lifecycle events such as the VM pausing or device unplug occur. Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20191021150343.30742-1-stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-10-25	virtio: event suppression support for packed ring	Jason Wang
	This patch implements event suppression through device/driver area. Please refer virtio specification for more information. Signed-off-by: Wei Xu <wexu@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20191025083527.30803-7-eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-10-25	virtio: basic packed virtqueue support	Jason Wang
	This patch implements basic support for the packed virtqueue. Compare the split virtqueue which has three rings, packed virtqueue only have one which is supposed to have better cache utilization and more hardware friendly. Please refer virtio specification for more information. Signed-off-by: Wei Xu <wexu@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20191025083527.30803-6-eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-10-25	virtio: device/driver area size calculation refactor for split ring	Wei Xu
	There is slight size difference between split/packed rings. This is the refactor of split ring as well as a helper to expanding device and driver area size calculation for packed ring. Signed-off-by: Wei Xu <wexu@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Jens Freimann <jfreimann@redhat.com> Message-Id: <20191025083527.30803-3-eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-10-25	virtio: basic structure for packed ring	Wei Xu
	Define packed ring structure according to Qemu nomenclature, field data(wrap counter, etc) are also included. Signed-off-by: Wei Xu <wexu@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Jens Freimann <jfreimann@redhat.com> Message-Id: <20191025083527.30803-2-eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-08-16	sysemu: Split sysemu/runstate.h off sysemu/sysemu.h	Markus Armbruster
	sysemu/sysemu.h is a rather unfocused dumping ground for stuff related to the system-emulator. Evidence: * It's included widely: in my "build everything" tree, changing sysemu/sysemu.h still triggers a recompile of some 1100 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h, down from 5400 due to the previous two commits). * It pulls in more than a dozen additional headers. Split stuff related to run state management into its own header sysemu/runstate.h. Touching sysemu/sysemu.h now recompiles some 850 objects. qemu/uuid.h also drops from 1100 to 850, and qapi/qapi-types-run-state.h from 4400 to 4200. Touching new sysemu/runstate.h recompiles some 500 objects. Since I'm touching MAINTAINERS to add sysemu/runstate.h anyway, also add qemu/main-loop.h. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-30-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> [Unbreak OS-X build]
2019-08-16	sysemu: Move the VMChangeStateEntry typedef to qemu/typedefs.h	Markus Armbruster
	In my "build everything" tree, changing sysemu/sysemu.h triggers a recompile of some 1800 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h, down from 5400 due to the previous commit). Several headers include sysemu/sysemu.h just to get typedef VMChangeStateEntry. Move it from sysemu/sysemu.h to qemu/typedefs.h. Spell its structure tag the same while there. Drop the now superfluous includes of sysemu/sysemu.h from headers. Touching sysemu/sysemu.h now recompiles some 1100 objects. qemu/uuid.h also drops from 1800 to 1100, and qapi/qapi-types-run-state.h from 5000 to 4400. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-29-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16	Include hw/qdev-properties.h less	Markus Armbruster
	In my "build everything" tree, changing hw/qdev-properties.h triggers a recompile of some 2700 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). Many places including hw/qdev-properties.h (directly or via hw/qdev.h) actually need only hw/qdev-core.h. Include hw/qdev-core.h there instead. hw/qdev.h is actually pointless: all it does is include hw/qdev-core.h and hw/qdev-properties.h, which in turn includes hw/qdev-core.h. Replace the remaining uses of hw/qdev.h by hw/qdev-properties.h. While there, delete a few superfluous inclusions of hw/qdev-core.h. Touching hw/qdev-properties.h now recompiles some 1200 objects. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Daniel P. Berrangé" <berrange@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20190812052359.30071-22-armbru@redhat.com>
2019-08-16	Include qemu/main-loop.h less	Markus Armbruster
	In my "build everything" tree, changing qemu/main-loop.h triggers a recompile of some 5600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). It includes block/aio.h, which in turn includes qemu/event_notifier.h, qemu/notify.h, qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h, qemu/thread.h, qemu/timer.h, and a few more. Include qemu/main-loop.h only where it's needed. Touching it now recompiles only some 1700 objects. For block/aio.h and qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the others, they shrink only slightly. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-21-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16	Include migration/qemu-file-types.h a lot less	Markus Armbruster
	In my "build everything" tree, changing migration/qemu-file-types.h triggers a recompile of some 2600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). The culprit is again hw/hw.h, which supposedly includes it for convenience. Include migration/qemu-file-types.h only where it's needed. Touching it now recompiles less than 200 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-10-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-07-08	virtio-scsi: restart DMA after iothread	Stefan Hajnoczi
	When the 'cont' command resumes guest execution the vm change state handlers are invoked. Unfortunately there is no explicit ordering between classic qemu_add_vm_change_state_handler() callbacks. When two layers of code both use vm change state handlers, we don't control which handler runs first. virtio-scsi with iothreads hits a deadlock when a failed SCSI command is restarted and completes before the iothread is re-initialized. This patch uses the new qdev_add_vm_change_state_handler() API to guarantee that virtio-scsi's virtio change state handler executes before the SCSI bus children. This way DMA is restarted after the iothread has re-initialized. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-07-04	virtio: Don't change "started" flag on virtio_vmstate_change()	Xie Yongji
	We will call virtio_set_status() on virtio_vmstate_change(). The "started" flag should not be changed in this case. Otherwise, we may get an incorrect value when we set "started" flag but not set DRIVER_OK in source VM. Signed-off-by: Xie Yongji <xieyongji@baidu.com> Message-Id: <20190626023130.31315-6-xieyongji@baidu.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-07-04	virtio: Make sure we get correct state of device on handle_aio_output()	Xie Yongji
	We should set the flags: "start_on_kick" and "started" after we call the kick functions (handle_aio_output() and handle_output()). Signed-off-by: Xie Yongji <xieyongji@baidu.com> Message-Id: <20190626023130.31315-5-xieyongji@baidu.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-07-04	virtio: Set "start_on_kick" on virtio_set_features()	Xie Yongji
	The guest feature is not set correctly on virtio_reset() and virtio_init(). So we should not use it to set "start_on_kick" at that point. This patch set "start_on_kick" on virtio_set_features() instead. Fixes: badaf79cfdbd3 ("virtio: Introduce started flag to VirtioDevice") Signed-off-by: Xie Yongji <xieyongji@baidu.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <20190626023130.31315-4-xieyongji@baidu.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-07-04	virtio: Set "start_on_kick" for legacy devices	Xie Yongji
	Besides virtio 1.0 transitional devices, we should also set "start_on_kick" flag for legacy devices (virtio 0.9). Signed-off-by: Xie Yongji <xieyongji@baidu.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <20190626023130.31315-3-xieyongji@baidu.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-07-04	virtio: add "use-started" property	Xie Yongji
	In order to avoid migration issues, we introduce a "use-started" property to the base virtio device to indicate whether use "started" flag or not. This property will be true by default and set to false when machine type <= 4.0. Suggested-by: Greg Kurz <groug@kaod.org> Signed-off-by: Xie Yongji <xieyongji@baidu.com> Message-Id: <20190626023130.31315-2-xieyongji@baidu.com> Reviewed-by: Greg Kurz <groug@kaod.org> Tested-by: Greg Kurz <groug@kaod.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-06-12	Include qemu/module.h where needed, drop it from qemu-common.h	Markus Armbruster
	Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-4-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for hw/usb/dev-hub.c hw/misc/exynos4210_rng.c hw/misc/bcm2835_rng.c hw/misc/aspeed_scu.c hw/display/virtio-vga.c hw/arm/stm32f205_soc.c; ui/cocoa.m fixed up]
2019-05-24	hw/virtio: Use object_initialize_child for correct reference counting	Philippe Mathieu-Daudé
	As explained in commit aff39be0ed97: Both functions, object_initialize() and object_property_add_child() increase the reference counter of the new object, so one of the references has to be dropped afterwards to get the reference counting right. Otherwise the child object will not be properly cleaned up when the parent gets destroyed. Thus let's use now object_initialize_child() instead to get the reference counting here right. This patch was generated using the following Coccinelle script: @use_object_initialize_child@ expression parent_obj; expression child_ptr; expression child_name; expression child_type; expression child_size; expression errp; @@ ( - object_initialize(child_ptr, child_size, child_type); + object_initialize_child(parent_obj, child_name, child_ptr, child_size, + child_type, &error_abort, NULL); ... when != parent_obj - object_property_add_child(parent_obj, child_name, OBJECT(child_ptr), NULL); ... ?- object_unref(OBJECT(child_ptr)); \| - object_initialize(child_ptr, child_size, child_type); + object_initialize_child(parent_obj, child_name, child_ptr, child_size, + child_type, errp, NULL); ... when != parent_obj - object_property_add_child(parent_obj, child_name, OBJECT(child_ptr), errp); ... ?- object_unref(OBJECT(child_ptr)); ) While the object_initialize() function doesn't take an 'Error *errp' argument, the object_initialize_child() does. Since this code is used when a machine is created (and is not yet running), we deliberately choose to use the &error_abort argument instead of ignoring errors if an object creation failed. Suggested-by: Eduardo Habkost <ehabkost@redhat.com> Inspired-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190507163416.24647-4-philmd@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-05-20	virtio: Use started flag in virtio_vmstate_change()	Xie Yongji
	Currently, we use DRIVER_OK status bit to check whether guest driver has started the device in virtio_vmstate_change(). But it's not the case for virtio 1.0 transitional devices. If migration completes between kicking virtqueue and setting VIRTIO_CONFIG_S_DRIVER_OK, guest may be hung. So here we use started flag to check guest state instead. Signed-off-by: Xie Yongji <xieyongji@baidu.com> Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Message-Id: <20190320112646.3712-3-xieyongji@baidu.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-05-20	virtio: Introduce started flag to VirtioDevice	Xie Yongji
	The virtio 1.0 transitional devices support driver uses the device before setting the DRIVER_OK status bit. So we introduce a started flag to indicate whether driver has started the device or not. Signed-off-by: Xie Yongji <xieyongji@baidu.com> Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Message-Id: <20190320112646.3712-2-xieyongji@baidu.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-05-17	vhost_net: don't set backend for the uninitialized virtqueue	Jason Wang
	We used to set backend unconditionally, this won't work for some guests (e.g windows driver) who may not initialize all virtqueues. For kernel backend, this will fail since it may try to validate the rings during setting backend. Fixing this by simply skipping the backend set when we find desc is not ready. Reviewed-by: Michael S. Tsirkin<mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2019-02-22	virtio-net: make VirtIOFeature usable for other virtio devices	Stefano Garzarella
	In order to use VirtIOFeature also in other virtio devices, we move its declaration and the endof() macro (renamed in virtio_endof()) in virtio.h. We add virtio_feature_get_config_size() function to iterate the array of VirtIOFeature and to return the config size depending on the features enabled. (as virtio_net_set_config_size() did) Suggested-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20190221103314.58500-5-sgarzare@redhat.com Message-Id: <20190221103314.58500-5-sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-02-01	virtio: add checks for the size of the indirect table	Dima Stepanov
	The virtqueue_pop() and virtqueue_get_avail_bytes() routines can use the INDIRECT table to get the data. It is possible to create a packet which will lead to the assert message like: include/exec/memory.h:1995: void address_space_read_cached(MemoryRegionCache , hwaddr, void , int): Assertion `addr < cache->len && len <= cache->len - addr' failed. Aborted To do it the first descriptor should have a link to the INDIRECT table and set the size of it to 0. It doesn't look good that the guest should be able to trigger the assert in qemu. Add additional check for the size of the INDIRECT table, which should not be 0. Signed-off-by: Dima Stepanov <dimastep@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-12-11	virtio: pass argument by value for virtqueue_map_iovec()	Dongli Zhang
	Pass num_sg by value instead of by pointer, as num_sg is never modified in virtqueue_map_iovec(). Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1541139396-4727-1-git-send-email-dongli.zhang@oracle.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-11-27	vmstate: constify VMStateField	Marc-André Lureau
	Because they are supposed to remain const. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20181114132931.22624-1-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19	clean up callback when del virtqueue	liujunjie
	Before, we did not clear callback like handle_output when delete the virtqueue which may result be segmentfault. The scene is as follows: 1. Start a vm with multiqueue vhost-net, 2. then we write VIRTIO_PCI_GUEST_FEATURES in PCI configuration to triger multiqueue disable in this vm which will delete the virtqueue. In this step, the tx_bh is deleted but the callback virtio_net_handle_tx_bh still exist. 3. Finally, we write VIRTIO_PCI_QUEUE_NOTIFY in PCI configuration to notify the deleted virtqueue. In this way, virtio_net_handle_tx_bh will be called and qemu will be crashed. Although the way described above is uncommon, we had better reinforce it. CC: qemu-stable@nongnu.org Signed-off-by: liujunjie <liujunjie23@huawei.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-17	cpu: Provide a proper prototype for target_words_bigendian() in a header	Thomas Huth
	We've got three places already that provide a prototype for this function in a .c file - that's ugly. Let's provide a proper prototype in a header instead, with a proper description why this function should not be used in most cases. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
2018-10-02	virtio: Return true from virtio_queue_empty if broken	Fam Zheng
	Both virtio-blk and virtio-scsi use virtio_queue_empty() as the loop condition in VQ handlers (virtio_blk_handle_vq, virtio_scsi_handle_cmd_vq). When a device is marked broken in virtqueue_pop, for example if a vIOMMU address translation failed, we want to break out of the loop. This fixes a hanging problem when booting a CentOS 3.10.0-862.el7.x86_64 kernel with ATS enabled: $ qemu-system-x86_64 \ ... \ -device intel-iommu,intremap=on,caching-mode=on,eim=on,device-iotlb=on \ -device virtio-scsi-pci,iommu_platform=on,ats=on,id=scsi0,bus=pci.4,addr=0x0 The dead loop happens immediately when the kernel boots and initializes the device, where virtio_scsi_data_plane_handle_cmd will not return: > ... > #13 0x00005586602b7793 in virtio_scsi_handle_cmd_vq > #14 0x00005586602b8d66 in virtio_scsi_data_plane_handle_cmd > #15 0x00005586602ddab7 in virtio_queue_notify_aio_vq > #16 0x00005586602dfc9f in virtio_queue_host_notifier_aio_poll > #17 0x00005586607885da in run_poll_handlers_once > #18 0x000055866078880e in try_poll_mode > #19 0x00005586607888eb in aio_poll > #20 0x0000558660784561 in aio_wait_bh_oneshot > #21 0x00005586602b9582 in virtio_scsi_dataplane_stop > #22 0x00005586605a7110 in virtio_bus_stop_ioeventfd > #23 0x00005586605a9426 in virtio_pci_stop_ioeventfd > #24 0x00005586605ab808 in virtio_pci_common_write > #25 0x0000558660242396 in memory_region_write_accessor > #26 0x00005586602425ab in access_with_adjusted_size > #27 0x0000558660245281 in memory_region_dispatch_write > #28 0x00005586601e008e in flatview_write_continue > #29 0x00005586601e01d8 in flatview_write > #30 0x00005586601e04de in address_space_write > #31 0x00005586601e052f in address_space_rw > #32 0x00005586602607f2 in kvm_cpu_exec > #33 0x0000558660227148 in qemu_kvm_cpu_thread_fn > #34 0x000055866078bde7 in qemu_thread_start > #35 0x00007f5784906594 in start_thread > #36 0x00007f5784639e6f in clone With this patch, virtio_queue_empty will now return 1 as soon as the vdev is marked as broken, after a "virtio: zero sized buffers are not allowed" error. To be consistent, update virtio_queue_empty_rcu as well. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20180910145616.8598-2-famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-09-07	virtio: update MemoryRegionCaches when guest negotiates features	Paolo Bonzini
	Because the cache is sized to include the rings and the event indices, negotiating the VIRTIO_RING_F_EVENT_IDX feature will result in the size of the cache changing. And because MemoryRegionCache accesses are range-checked, if we skip this we end up with an assertion failure. This happens with OpenBSD 6.3. Reported-by: Fam Zheng <famz@redhat.com> Fixes: 97cd965c070152bc626c7507df9fb356bbe1cd81 Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Tested-by: Fam Zheng <famz@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-06-01	virtio: free MemoryRegionCache when initialization fails	Paolo Bonzini

2018-05-23	virtio: support setting memory region based host notifier	Tiwei Bie
	This patch introduces the support for setting memory region based host notifiers for virtio device. This is helpful when using a hardware accelerator for a virtio device, because hardware heavily depends on the notification, this will allow the guest driver in the VM to notify the hardware directly. Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-08	virtio: improve virtio devices initialization time	Gal Hammer
	The loading time of a VM is quite significant when its virtio devices use a large amount of virt-queues (e.g. a virtio-serial device with max_ports=511). Most of the time is spend in the creation of all the required event notifiers (ioeventfd and memory regions). This patch pack all the changes to the memory regions in a single memory transaction. Reported-by: Sitong Liu Reported-by: Xiaoling Gao Signed-off-by: Gal Hammer <ghammer@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Tested-by: Greg Kurz <groug@kaod.org>
2018-02-08	virtio: remove event notifier cleanup call on de-assign	Gal Hammer
	The virtio_bus_set_host_notifier function no longer calls event_notifier_cleanup when a event notifier is removed. The commit updates the code to match the new behavior and calls virtio_bus_cleanup_host_notifier after the notifier was de-assign and no longer in use. This change is a preparation to allow executing the virtio_bus_set_host_notifier function in a memory region transaction. Signed-off-by: Gal Hammer <ghammer@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Tested-by: Greg Kurz <groug@kaod.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-24	Revert "virtio: improve virtio devices initialization time"	Michael S. Tsirkin
	This reverts commit 6f0bb230722931d17fb284eee8efd40b9d653822. This reverts commit f87d72f5c5bff0837d409a56bd34f439a90119ca as that is reported to break cleanup and migration. Cc: Gal Hammer <ghammer@redhat.com> Cc: Sitong Liu <siliu@redhat.com> Cc: Xiaoling Gao <xiagao@redhat.com> Suggested-by: Greg Kurz <groug@kaod.org> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Reported-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com> Reported-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
2018-01-18	virtio: improve virtio devices initialization time	Gal Hammer
	The loading time of a VM is quite significant when its virtio devices use a large amount of virt-queues (e.g. a virtio-serial device with max_ports=511). Most of the time is spend in the creation of all the required event notifiers (ioeventfd and memory regions). This patch pack all the changes to the memory regions in a single memory transaction. Reported-by: Sitong Liu <siliu@redhat.com> Reported-by: Xiaoling Gao <xiagao@redhat.com> Signed-off-by: Gal Hammer <ghammer@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-12-19	virtio_error: don't invoke status callbacks	Michael S. Tsirkin
	Backends don't need to know what frontend requested a reset, and notifying then from virtio_error is messy because virtio_error itself might be invoked from backend. Let's just set the status directly. Cc: qemu-stable@nongnu.org Reported-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-12-01	virtio: check VirtQueue Vring object is set	Prasad J Pandit
	A guest could attempt to use an uninitialised VirtQueue object or unset Vring.align leading to a arithmetic exception. Add check to avoid it. Reported-by: Zhangboxian <zhangboxian@huawei.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2017-12-01	virtio: Add queue interface to restore avail index from vring used index	Maxime Coquelin
	In case of backend crash, it is not possible to restore internal avail index from the backend value as vhost_get_vring_base callback fails. This patch provides a new interface to restore internal avail index from the vring used index, as done by some vhost-user backend on reconnection. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-11-16	fix: unrealize virtio device if we fail to hotplug it	linzhecheng
	If we fail to hotplug virtio-blk device and then suspend or shutdown VM, qemu is likely to crash. Re-production steps: 1. Run VM named vm001 2. Create a virtio-blk.xml which contains wrong configurations: <disk device="lun" rawio="yes" type="block"> <driver cache="none" io="native" name="qemu" type="raw" /> <source dev="/dev/mapper/11-dm" /> <target bus="virtio" dev="vdx" /> </disk> 3. Run command : virsh attach-device vm001 virtio-blk.xml error: Failed to attach device from blk-scsi.xml error: internal error: unable to execute QEMU command 'device_add': Please set scsi=off for virtio-blk devices in order to use virtio 1.0 it means hotplug virtio-blk device failed. 4. Suspend or shutdown VM will leads to qemu crash Problem happens in virtio_vmstate_change which is called by vm_state_notify: vdev’s parent_bus is NULL, so qdev_get_parent_bus(DEVICE(vdev)) will crash. virtio_vmstate_change is added to the list vm_change_state_head at virtio_blk_device_realize(virtio_init), but after hotplug virtio-blk failed, virtio_vmstate_change will not be removed from vm_change_state_head. Adding unrealize function of virtio-blk device can solve this problem. Signed-off-by: linzhecheng <linzhecheng@huawei.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-10-15	virtio: fix descriptor counting in virtqueue_pop	Wolfgang Bumiller
	While changing the s/g list allocation, commit 3b3b0628 also changed the descriptor counting to count iovec entries as split by cpu_physical_memory_map(). Previously only the actual descriptor entries were counted and the split into the iovec happened afterwards in virtqueue_map(). Count the entries again instead to avoid erroneous "Looped descriptor" errors. Reported-by: Hans Middelhoek <h.middelhoek@ospito.nl> Link: https://forum.proxmox.com/threads/vm-crash-with-memory-hotplug.35904/ Fixes: 3b3b0628217e ("virtio: slim down allocation of VirtQueueElements") Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-09-27	migration: Route more error paths	Dr. David Alan Gilbert
	vmstate_save_state is called in lots of places. Route error returns from the easier cases back up; there are lots of more complex cases where their own error paths need fixing. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20170925112917.21340-7-dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Commit message fix up as Peter's review
2017-09-19	osdep.h: Prohibit disabling assert() in supported builds	Eric Blake
	We already have several files that knowingly require assert() to work, sometimes because refactoring the code for proper error handling has not been tackled yet; there are probably other files that have a similar situation but with no comments documenting the same. In fact, we have places in migration that handle untrusted input with assertions, where disabling the assertions risks a worse security hole than the current behavior of losing the guest to SIGABRT when migration fails because of the assertion. Promote our current per-file safety-valve to instead be project-wide, and expand it to also cover glib's g_assert(). Note that we do NOT want to encourage 'assert(side-effects);' (that is a bad practice that prevents copy-and-paste of code to other projects that CAN disable assertions; plus it costs unnecessary reviewer mental cycles to remember whether a project special-cases the crippling of asserts); and we would LIKE to fix migration to not rely on asserts (but that takes a big code audit). But in the meantime, we DO want to send a message that anyone that disables assertions has to tweak code in order to compile, making it obvious that they are taking on additional risk that we are not going to support. At the same time, leave comments mentioning NDEBUG in files that we know still need to be scrubbed, so there is at least something to grep for. It would be possible to come up with some other mechanism for doing runtime checking by default, but which does not abort the program on failure, while leaving side effects in place (unlike how crippling assert() avoids even the side effects), perhaps under the name q_verify(); but it was not deemed worth the effort (developers should not have to learn a replacement when the standard C macro works just fine, and it would be a lot of churn for little gain). The patch specifically uses #error rather than #warn so that a user is forced to tweak the header to acknowledge the issue, even when not using a -Werror compilation. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-DaudÃ© <f4bug@amsat.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20170911211320.25385-1-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-02	virtio: add virtqueue_alloc_element tracepoint	Paolo Bonzini
	This tracepoint can help diagnosing failures due to memory fragmentation in the guest. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-18	migration: migration.h was not needed	Juan Quintela
	This files don't use any function from migration.h, so drop it. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
2017-05-18	virtio: allow broken device to notify guest	Greg Kurz
	According to section 2.1.2 of the virtio-1 specification: "The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state that a reset is needed. If DRIVER_OK is set, after it sets DEVICE_NEEDS_RESET, the device MUST send a device configuration change notification to the driver." Commit "f5ed36635d8f virtio: stop virtqueue processing if device is broken" introduced a virtio_error() call that just does that: - internally mark the device as broken - set the DEVICE_NEEDS_RESET bit in the status - send a configuration change notification Unfortunately, virtio_notify_vector(), called by virtio_notify_config(), returns right away when the device is marked as broken and the notification isn't sent in this case. The spec doesn't say whether a broken device can send notifications in other situations or not. But since the driver isn't supposed to do anything but to reset the device, it makes sense to keep the check in virtio_notify_config(). Marking the device as broken AFTER the configuration change notification was sent is enough to fix the issue. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-22	virtio: always use handle_aio_output if registered	Paolo Bonzini
	Commit ad07cd6 ("virtio-scsi: always use dataplane path if ioeventfd is active", 2016-10-30) and 9ffe337 ("virtio-blk: always use dataplane path if ioeventfd is active", 2016-10-30) broke the virtio 1.0 indirect access registers. The indirect access registers bypass the ioeventfd, so that virtio-blk and virtio-scsi now repeatedly try to initialize dataplane instead of triggering the guest->host EventNotifier. Detect the situation by checking vq->handle_aio_output; if it is not NULL, trigger the EventNotifier, which is how the device expects to get notifications and in fact the only thread-safe manner to deliver them. Fixes: ad07cd6 Fixes: 9ffe337 Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>