aboutsummaryrefslogtreecommitdiff
path: root/hw/virtio
AgeCommit message (Collapse)Author
2019-07-08virtio-scsi: restart DMA after iothreadStefan Hajnoczi
When the 'cont' command resumes guest execution the vm change state handlers are invoked. Unfortunately there is no explicit ordering between classic qemu_add_vm_change_state_handler() callbacks. When two layers of code both use vm change state handlers, we don't control which handler runs first. virtio-scsi with iothreads hits a deadlock when a failed SCSI command is restarted and completes before the iothread is re-initialized. This patch uses the new qdev_add_vm_change_state_handler() API to guarantee that virtio-scsi's virtio change state handler executes before the SCSI bus children. This way DMA is restarted after the iothread has re-initialized. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-07-04virtio: Don't change "started" flag on virtio_vmstate_change()Xie Yongji
We will call virtio_set_status() on virtio_vmstate_change(). The "started" flag should not be changed in this case. Otherwise, we may get an incorrect value when we set "started" flag but not set DRIVER_OK in source VM. Signed-off-by: Xie Yongji <xieyongji@baidu.com> Message-Id: <20190626023130.31315-6-xieyongji@baidu.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-07-04virtio: Make sure we get correct state of device on handle_aio_output()Xie Yongji
We should set the flags: "start_on_kick" and "started" after we call the kick functions (handle_aio_output() and handle_output()). Signed-off-by: Xie Yongji <xieyongji@baidu.com> Message-Id: <20190626023130.31315-5-xieyongji@baidu.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-07-04virtio: Set "start_on_kick" on virtio_set_features()Xie Yongji
The guest feature is not set correctly on virtio_reset() and virtio_init(). So we should not use it to set "start_on_kick" at that point. This patch set "start_on_kick" on virtio_set_features() instead. Fixes: badaf79cfdbd3 ("virtio: Introduce started flag to VirtioDevice") Signed-off-by: Xie Yongji <xieyongji@baidu.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <20190626023130.31315-4-xieyongji@baidu.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-07-04virtio: Set "start_on_kick" for legacy devicesXie Yongji
Besides virtio 1.0 transitional devices, we should also set "start_on_kick" flag for legacy devices (virtio 0.9). Signed-off-by: Xie Yongji <xieyongji@baidu.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <20190626023130.31315-3-xieyongji@baidu.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-07-04virtio: add "use-started" propertyXie Yongji
In order to avoid migration issues, we introduce a "use-started" property to the base virtio device to indicate whether use "started" flag or not. This property will be true by default and set to false when machine type <= 4.0. Suggested-by: Greg Kurz <groug@kaod.org> Signed-off-by: Xie Yongji <xieyongji@baidu.com> Message-Id: <20190626023130.31315-2-xieyongji@baidu.com> Reviewed-by: Greg Kurz <groug@kaod.org> Tested-by: Greg Kurz <groug@kaod.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-07-04virtio-pci: fix missing device propertiesMarc-André Lureau
Since commit a4ee4c8baa37154 ("virtio: Helper for registering virtio device types"), virtio-gpu-pci, virtio-vga, and virtio-crypto-pci lost some properties: "ioeventfd" and "vectors". This may cause various issues, such as failing migration or invalid properties. Since those VirtioPCI devices do not have a base name, their class are initialized with virtio_pci_generic_base_class_init(). However, if the VirtioPCIDeviceTypeInfo provided a class_init which sets dc->props, the properties were overwritten by virtio_pci_generic_class_init(). Instead, introduce an intermediary base-type to register the generic properties. Fixes: a4ee4c8baa37154f42b4dc6a13fee79268d15238 Cc: qemu-stable@nongnu.org Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20190625232333.30752-1-marcandre.lureau@redhat.com>
2019-07-04virtio-pci: Proxy for virtio-pmemPankaj Gupta
We need a proxy device for virtio-pmem, and this device has to be the actual memory device so we can cleanly hotplug it. Forward memory device class functions either to the actual device or use properties of the virtio-pmem device to implement these in the proxy. virtio-pmem will only be compiled for selected, supported architectures (that can deal with virtio/pci devices being memory devices). An architecture that is prepared for that can simply enable CONFIG_VIRTIO_PMEM to make it work. As not all architectures support memory devices (and CONFIG_VIRTIO_PMEM will be enabled per supported architecture), we have to move the PCI proxy to a separate file. Signed-off-by: Pankaj Gupta <pagupta@redhat.com> [ split up patches, memory-device changes, move pci proxy] Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20190619094907.10131-5-pagupta@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-07-02virtio-pci: Allow to specify additional interfaces for the base typeDavid Hildenbrand
Let's allow to specify additional interfaces for the base type (e.g. later TYPE_MEMORY_DEVICE), something that was possible before the rework of virtio PCI device instantiation. Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20190619094907.10131-3-pagupta@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-07-02virtio-pmem: add virtio devicePankaj Gupta
This is the implementation of virtio-pmem device. Support will require machine changes for the architectures that will support it, so it will not yet be compiled. It can be unlocked with VIRTIO_PMEM_SUPPORTED per machine and disabled globally via VIRTIO_PMEM. We cannot use the "addr" property as that is already used e.g. for virtio-pci/pci devices. And we will have e.g. virtio-pmem-pci as a proxy. So we have to choose a different one (unfortunately). "memaddr" it is. That name should ideally be used by all other virtio-* based memory devices in the future. -device virtio-pmem-pci,id=p0,bus=bux0,addr=0x01,memaddr=0x1000000... Acked-by: Markus Armbruster <armbru@redhat.com> [ QAPI bits ] Signed-off-by: Pankaj Gupta <pagupta@redhat.com> [ MemoryDevice/MemoryRegion changes, cleanups, addr property "memaddr", split up patches, unplug handler ] Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20190619094907.10131-2-pagupta@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2019-06-16vhost: fix vhost_log size overflow during migrationLi Hangjing
When a guest which doesn't support multiqueue is migrated with a multi queues vhost-user-blk deivce, a crash will occur like: 0 qemu_memfd_alloc (name=<value optimized out>, size=562949953421312, seals=<value optimized out>, fd=0x7f87171fe8b4, errp=0x7f87171fe8a8) at util/memfd.c:153 1 0x00007f883559d7cf in vhost_log_alloc (size=70368744177664, share=true) at hw/virtio/vhost.c:186 2 0x00007f88355a0758 in vhost_log_get (listener=0x7f8838bd7940, enable=1) at qemu-2-12/hw/virtio/vhost.c:211 3 vhost_dev_log_resize (listener=0x7f8838bd7940, enable=1) at hw/virtio/vhost.c:263 4 vhost_migration_log (listener=0x7f8838bd7940, enable=1) at hw/virtio/vhost.c:787 5 0x00007f88355463d6 in memory_global_dirty_log_start () at memory.c:2503 6 0x00007f8835550577 in ram_init_bitmaps (f=0x7f88384ce600, opaque=0x7f8836024098) at migration/ram.c:2173 7 ram_init_all (f=0x7f88384ce600, opaque=0x7f8836024098) at migration/ram.c:2192 8 ram_save_setup (f=0x7f88384ce600, opaque=0x7f8836024098) at migration/ram.c:2219 9 0x00007f88357a419d in qemu_savevm_state_setup (f=0x7f88384ce600) at migration/savevm.c:1002 10 0x00007f883579fc3e in migration_thread (opaque=0x7f8837530400) at migration/migration.c:2382 11 0x00007f8832447893 in start_thread () from /lib64/libpthread.so.0 12 0x00007f8832178bfd in clone () from /lib64/libc.so.6 This is because vhost_get_log_size() returns a overflowed vhost-log size. In this function, it uses the uninitialized variable vqs->used_phys and vqs->used_size to get the vhost-log size. Signed-off-by: Li Hangjing <lihangjing@baidu.com> Reviewed-by: Xie Yongji <xieyongji@baidu.com> Reviewed-by: Chai Wen <chaiwen@baidu.com> Message-Id: <20190603061524.24076-1-lihangjing@baidu.com> Cc: qemu-stable@nongnu.org Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-06-12Include qemu/module.h where needed, drop it from qemu-common.hMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-4-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for hw/usb/dev-hub.c hw/misc/exynos4210_rng.c hw/misc/bcm2835_rng.c hw/misc/aspeed_scu.c hw/display/virtio-vga.c hw/arm/stm32f205_soc.c; ui/cocoa.m fixed up]
2019-06-06Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell
virtio, pci, pc: cleanups, features stricter rules for acpi tables: we now fail on any difference that isn't whitelisted. vhost-scsi migration. some cleanups all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Wed 05 Jun 2019 20:55:04 BST # gpg: using RSA key 281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: bios-tables-test: ignore identical binaries tests: acpi: add simple arm/virt testcase tests: add expected ACPI tables for arm/virt board bios-tables-test: list all tables that differ vhost-scsi: Allow user to enable migration vhost-scsi: Add VMState descriptor vhost-scsi: The vhost backend should be stopped when the VM is not running bios-tables-test: add diff allowed list vhost: fix memory leak in vhost_user_scsi_realize vhost: fix incorrect print type vhost: remove the dead code docs: smbios: remove family=x from type2 entry description pci: Fold pci_get_bus_devfn() into its sole caller pci: Make is_bridge a bool pcie: Simplify pci_adjust_config_limit() acpi: pci: use build_append_foo() API to construct MCFG hw/acpi: Consolidate build_mcfg to pci.c Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-05-30Merge remote-tracking branch 'remotes/kraxel/tags/vga-20190529-pull-request' ↵Peter Maydell
into staging vga: add vhost-user-gpu. # gpg: Signature made Wed 29 May 2019 05:40:02 BST # gpg: using RSA key 4CB6D8EED3E87138 # gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" [full] # gpg: aka "Gerd Hoffmann <gerd@kraxel.org>" [full] # gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>" [full] # Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138 * remotes/kraxel/tags/vga-20190529-pull-request: hw/display: add vhost-user-vga & gpu-pci virtio-gpu: split virtio-gpu-pci & virtio-vga virtio-gpu: split virtio-gpu, introduce virtio-gpu-base spice-app: fix running when !CONFIG_OPENGL contrib: add vhost-user-gpu util: compile drm.o on posix virtio-gpu: add a pixman helper header virtio-gpu: add bswap helpers header vhost-user: add vhost_user_gpu_set_socket() virtio-gpu: add sanity check Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-05-29vhost: fix incorrect print typeJie Wang
fix incorrect print type in vhost_virtqueue_stop Signed-off-by: Jie Wang <wangjie88@huawei.com> Message-Id: <1556605773-42019-1-git-send-email-wangjie88@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-05-29vhost: remove the dead codeJie Wang
remove the dead code Signed-off-by: Jie Wang <wangjie88@huawei.com> Message-Id: <1556604614-32081-1-git-send-email-wangjie88@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-05-29vhost-user: add vhost_user_gpu_set_socket()Marc-André Lureau
Add a new vhost-user message to give a unix socket to a vhost-user backend for GPU display updates. Back when I started that work, I added a new GPU channel because the vhost-user protocol wasn't bidirectional. Since then, there is a vhost-user-slave channel for the slave to send requests to the master. We could extend it with GPU messages. However, the GPU protocol is quite orthogonal to vhost-user, thus I chose to have a new dedicated channel. See vhost-user-gpu.rst for the protocol details. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-id: 20190524130946.31736-2-marcandre.lureau@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2019-05-24hw/virtio: Use object_initialize_child for correct reference countingPhilippe Mathieu-Daudé
As explained in commit aff39be0ed97: Both functions, object_initialize() and object_property_add_child() increase the reference counter of the new object, so one of the references has to be dropped afterwards to get the reference counting right. Otherwise the child object will not be properly cleaned up when the parent gets destroyed. Thus let's use now object_initialize_child() instead to get the reference counting here right. This patch was generated using the following Coccinelle script: @use_object_initialize_child@ expression parent_obj; expression child_ptr; expression child_name; expression child_type; expression child_size; expression errp; @@ ( - object_initialize(child_ptr, child_size, child_type); + object_initialize_child(parent_obj, child_name, child_ptr, child_size, + child_type, &error_abort, NULL); ... when != parent_obj - object_property_add_child(parent_obj, child_name, OBJECT(child_ptr), NULL); ... ?- object_unref(OBJECT(child_ptr)); | - object_initialize(child_ptr, child_size, child_type); + object_initialize_child(parent_obj, child_name, child_ptr, child_size, + child_type, errp, NULL); ... when != parent_obj - object_property_add_child(parent_obj, child_name, OBJECT(child_ptr), errp); ... ?- object_unref(OBJECT(child_ptr)); ) While the object_initialize() function doesn't take an 'Error *errp' argument, the object_initialize_child() does. Since this code is used when a machine is created (and is not yet running), we deliberately choose to use the &error_abort argument instead of ignoring errors if an object creation failed. Suggested-by: Eduardo Habkost <ehabkost@redhat.com> Inspired-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190507163416.24647-4-philmd@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-05-22hw/virtio/virtio-mmio: Convert DPRINTF to trace and logBoxuan Li
Use traces for debug message and qemu_log_mask for errors. Signed-off-by: Boxuan Li <liboxuan@connect.hku.hk> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Message-Id: <20190503154424.73933-1-liboxuan@connect.hku.hk> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2019-05-20pci: Simplify pci_bus_is_root()David Gibson
pci_bus_is_root() currently relies on a method in the PCIBusClass. But it's always known if a PCI bus is a root bus when we create it, so using a dynamic method is overkill. This replaces it with an IS_ROOT bit in a new flags field, which is set on root buses and otherwise clear. As a bonus this removes the special is_root logic from pci_expander_bridge, since it already creates its bus as a root bus. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <20190424041959.4087-3-david@gibson.dropbear.id.au>
2019-05-20virtio: Use started flag in virtio_vmstate_change()Xie Yongji
Currently, we use DRIVER_OK status bit to check whether guest driver has started the device in virtio_vmstate_change(). But it's not the case for virtio 1.0 transitional devices. If migration completes between kicking virtqueue and setting VIRTIO_CONFIG_S_DRIVER_OK, guest may be hung. So here we use started flag to check guest state instead. Signed-off-by: Xie Yongji <xieyongji@baidu.com> Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Message-Id: <20190320112646.3712-3-xieyongji@baidu.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-05-20virtio: Introduce started flag to VirtioDeviceXie Yongji
The virtio 1.0 transitional devices support driver uses the device before setting the DRIVER_OK status bit. So we introduce a started flag to indicate whether driver has started the device or not. Signed-off-by: Xie Yongji <xieyongji@baidu.com> Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Message-Id: <20190320112646.3712-2-xieyongji@baidu.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-05-20hw: report invalid disable-legacy|modern usage for virtio-1-only devsDaniel P. Berrangé
A number of virtio devices (gpu, crypto, mouse, keyboard, tablet) only support the virtio-1 (aka modern) mode. Currently if the user launches QEMU, setting those devices to enable legacy mode, QEMU will silently create them in modern mode, ignoring the user's (mistaken) request. This patch introduces proper data validation so that an attempt to configure a virtio-1-only devices in legacy mode gets reported as an error to the user. Checking this required introduction of a new field to explicitly track what operating model is to be used for a device, separately from the disable_modern and disable_legacy fields that record the user's requested configuration. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20190215103239.28640-2-berrange@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-05-17vhost_net: don't set backend for the uninitialized virtqueueJason Wang
We used to set backend unconditionally, this won't work for some guests (e.g windows driver) who may not initialize all virtqueues. For kernel backend, this will fail since it may try to validate the rings during setting backend. Fixing this by simply skipping the backend set when we find desc is not ready. Reviewed-by: Michael S. Tsirkin<mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2019-05-13virtio-input-host-pci: cleanup typesGerd Hoffmann
virtio input is virtio-1.0 only, so we don't need the -transitional and -non-transitional variants. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-id: 20190510105137.17481-1-kraxel@redhat.com
2019-05-10Add vhost-user-input-pciMarc-André Lureau
Add a new virtio-input device, which connects to a vhost-user backend. Instead of reading configuration directly from an input device / evdev (like virtio-input-host), it reads it over vhost-user protocol with {SET,GET}_CONFIG messages. The vhost-user-backend handles the queues & events setup. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-id: 20190503130034.24916-5-marcandre.lureau@redhat.com [ kraxel: drop -{non-,}transitional variants ] [ kraxel: fix "make check" on !linux ] Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2019-03-22trace-events: Shorten file names in commentsMarkus Armbruster
We spell out sub/dir/ in sub/dir/trace-events' comments pointing to source files. That's because when trace-events got split up, the comments were moved verbatim. Delete the sub/dir/ part from these comments. Gets rid of several misspellings. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190314180929.27722-3-armbru@redhat.com Message-Id: <20190314180929.27722-3-armbru@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-03-12vhost-user: Support transferring inflight buffer between qemu and backendXie Yongji
This patch introduces two new messages VHOST_USER_GET_INFLIGHT_FD and VHOST_USER_SET_INFLIGHT_FD to support transferring a shared buffer between qemu and backend. Firstly, qemu uses VHOST_USER_GET_INFLIGHT_FD to get the shared buffer from backend. Then qemu should send it back through VHOST_USER_SET_INFLIGHT_FD each time we start vhost-user. This shared buffer is used to track inflight I/O by backend. Qemu should retrieve a new one when vm reset. Signed-off-by: Xie Yongji <xieyongji@baidu.com> Signed-off-by: Chai Wen <chaiwen@baidu.com> Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Message-Id: <20190228085355.9614-2-xieyongji@baidu.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12vhost-user: split vhost_user_read()Marc-André Lureau
Split vhost_user_read(), so only header can be read with vhost_user_read_header(). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20190308140454.32437-8-marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12vhost-user: wrap some read/write with retry handlingMarc-André Lureau
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20190308140454.32437-6-marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12vhost-user: simplify vhost_user_init/vhost_user_cleanupMarc-André Lureau
Take a VhostUserState* that can be pre-allocated, and initialize it with the associated chardev. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com> Message-Id: <20190308140454.32437-4-marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12virtio-balloon: Restore MADV_WILLNEED hint on balloon deflateDavid Gibson
Prior to f6deb6d9 "virtio-balloon: Remove unnecessary MADV_WILLNEED on deflate", the balloon device issued an madvise() MADV_WILLNEED on pages removed from the balloon. That would hint to the host kernel that the pages were likely to be needed by the guest in the near future. It's unclear if this is actually valuable or not, and so f6deb6d9 removed this, essentially ignoring balloon deflate requests. However, concerns have been raised that this might cause a performance regression by causing extra latency for the guest in certain configurations. So, until we can get actual benchmark data to see if that's the case, this restores the old behaviour, issuing a MADV_WILLNEED when a page is removed from the balloon. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190306030601.21986-4-david@gibson.dropbear.id.au> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12virtio-balloon: Fix possible guest memory corruption with inflates & deflatesDavid Gibson
This fixes a balloon bug with a nasty consequence - potentially corrupting guest memory - but which is extremely unlikely to be triggered in practice. The balloon always works in 4kiB units, but the host could have a larger page size on certain platforms. Since ed48c59 "virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size" we've handled this by accumulating requests to balloon 4kiB subpages until they formed a full host page. Since f6deb6d "virtio-balloon: Remove unnecessary MADV_WILLNEED on deflate" we essentially ignore deflate requests. Suppose we have a host with 8kiB pages, and one host page has subpages A & B. If we get this sequence of events - inflate A deflate A inflate B - the current logic will discard the whole host page. That's incorrect because the guest has deflated subpage A, and could have written important data to it. This patch fixes the problem by adjusting our state information about partially ballooned host pages when deflate requests are received. Fixes: ed48c59 "virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size" Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190306030601.21986-3-david@gibson.dropbear.id.au> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: David Hildenbrand <david@redhat.com>
2019-03-12virtio-balloon: Don't mismatch g_malloc()/free (CID 1399146)David Gibson
ed48c59875b6 "virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size" introduced a new temporary data structure which tracks 4kiB chunks which have been inserted into the balloon by the guest but don't yet form a full host page which we can discard. Unfortunately, I had a thinko and allocated that structure with g_malloc0() but freed it with a plain free() rather than g_free(). This corrects the problem. Fixes: ed48c59875b6 Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190306030601.21986-2-david@gibson.dropbear.id.au> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com>
2019-03-12virtio-balloon: fix a use-after-free caseWei Wang
The elem could theorically contain both outbuf and inbufs. We move the free operation to the end of this function to avoid using elem->in_sg while elem has been freed. Fixes: c13c4153f7 ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT") Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Wei Wang <wei.w.wang@intel.com> CC: Michael S. Tsirkin <mst@redhat.com> CC: Dr. David Alan Gilbert <dgilbert@redhat.com> CC: Juan Quintela <quintela@redhat.com> CC: Peter Xu <peterx@redhat.com> Message-Id: <1552383280-4122-1-git-send-email-wei.w.wang@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-11virtio: add class_size to VirtioPCIDeviceTypeInfoGerd Hoffmann
Needed when VirtioPCIClass subclasses have their own class struct with some extra fields. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-id: 20190307080244.9011-2-kraxel@redhat.com
2019-03-07s390x: express dependencies with KconfigThomas Huth
Instead of hard-coding all config switches in the config file default-configs/s390x-softmmu.mak, let's use the new Kconfig files to express the necessary dependencies: The S390_CCW_VIRTIO config switch for the "s390-ccw-virtio" machine now selects all non-optional devices. And since we already have the VIRTIO_PCI and VIRTIO_MMIO config switches for the other two virtio transports, this patch also introduces a new config switch VIRTIO_CCW for the third, s390x-specific virtio transport, so that all three virtio transports are now handled in the same way. Signed-off-by: Thomas Huth <thuth@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-07virtio: express virtio dependencies with KconfigYang Zhong
Signed-off-by: Yang Zhong <yang.zhong@intel.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20190123065618.3520-42-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-07build: convert pci.mak to KconfigPaolo Bonzini
Instead of including the same list of devices for each target, set CONFIG_PCI to true, and make the devices default to present whenever PCI is available. However, s390x does not want all the PCI devices, so there is a separate symbol to enable them. Done mostly with the following script: while read i; do i=${i%=y}; i=${i#CONFIG_} sed -i -e'/^config '$i'$/!b' -en \ -e'a\' -e' default y if PCI_DEVICES\' -e' depends on PCI' \ `grep -lw $i hw/*/Kconfig` done < default-configs/pci.mak followed by replacing a few "depends on" clauses with "select" whenever the symbol is not really related to PCI. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Cc: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20190123065618.3520-31-yang.zhong@intel.com> Acked-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-07kconfig: introduce kconfig filesPaolo Bonzini
The Kconfig files were generated mostly with this script: for i in `grep -ho CONFIG_[A-Z0-9_]* default-configs/* | sort -u`; do set fnord `git grep -lw $i -- 'hw/*/Makefile.objs' ` shift if test $# = 1; then cat >> $(dirname $1)/Kconfig << EOF config ${i#CONFIG_} bool EOF git add $(dirname $1)/Kconfig else echo $i $* fi done sed -i '$d' hw/*/Kconfig for i in hw/*; do if test -d $i && ! test -f $i/Kconfig; then touch $i/Kconfig git add $i/Kconfig fi done Whenever a symbol is referenced from multiple subdirectories, the script prints the list of directories that reference the symbol. These symbols have to be added manually to the Kconfig files. Kconfig.host and hw/Kconfig were created manually. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20190123065618.3520-27-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-06virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINTWei Wang
The new feature enables the virtio-balloon device to receive hints of guest free pages from the free page vq. A notifier is registered to the migration precopy notifier chain. The notifier calls free_page_start after the migration thread syncs the dirty bitmap, so that the free page optimization starts to clear bits of free pages from the bitmap. It calls the free_page_stop before the migration thread syncs the bitmap, which is the end of the current round of ram save. The free_page_stop is also called to stop the optimization in the case when there is an error occurred in the process of ram saving. Note: balloon will report pages which were free at the time of this call. As the reporting happens asynchronously, dirty bit logging must be enabled before this free_page_start call is made. Guest reporting must be disabled before the migration dirty bitmap is synchronized. Signed-off-by: Wei Wang <wei.w.wang@intel.com> CC: Michael S. Tsirkin <mst@redhat.com> CC: Dr. David Alan Gilbert <dgilbert@redhat.com> CC: Juan Quintela <quintela@redhat.com> CC: Peter Xu <peterx@redhat.com> Message-Id: <1544516693-5395-8-git-send-email-wei.w.wang@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Dropped kernel header update, fixed up CMD_ID_* name change
2019-03-04Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell
pci, pc, virtio: fixes, cleanups, tests Lots of work on tests: BiosTablesTest UEFI app, vhost-user testing for non-Linux hosts. Misc cleanups and fixes all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Fri 22 Feb 2019 15:51:40 GMT # gpg: using RSA key 281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: (26 commits) pci: Sanity test minimum downstream LNKSTA hw/smbios: fix offset of type 3 sku field pci: Move NVIDIA vendor id to the rest of ids virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size virtio-balloon: Use ram_block_discard_range() instead of raw madvise() virtio-balloon: Rework ballon_page() interface virtio-balloon: Corrections to address verification virtio-balloon: Remove unnecessary MADV_WILLNEED on deflate i386/kvm: ignore masked irqs when update msi routes contrib/vhost-user-blk: fix the compilation issue Revert "contrib/vhost-user-blk: fix the compilation issue" pc-dimm: use same mechanism for [get|set]_addr tests/data: introduce "uefi-boot-images" with the "bios-tables-test" ISOs tests/uefi-test-tools: add build scripts tests: introduce "uefi-test-tools" with the BiosTablesTest UEFI app roms: build the EfiRom utility from the roms/edk2 submodule roms: add the edk2 project as a git submodule vhost-user-test: create a temporary directory per TestServer vhost-user-test: small changes to init_hugepagefs vhost-user-test: create a main loop per TestServer ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-02-22virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page sizeDavid Gibson
The virtio-balloon always works in units of 4kiB (BALLOON_PAGE_SIZE), but we can only actually discard memory in units of the host page size. Now, we handle this very badly: we silently ignore balloon requests that aren't host page aligned, and for requests that are host page aligned we discard the entire host page. The latter can corrupt guest memory if its page size is smaller than the host's. The obvious choice would be to disable the balloon if the host page size is not 4kiB. However, that would break the special case where host and guest have the same page size, but that's larger than 4kiB. That case currently works by accident[1] - and is used in practice on many production POWER systems where 64kiB has long been the Linux default page size on both host and guest. To make the balloon safe, without breaking that useful special case, we need to accumulate 4kiB balloon requests until we have a whole contiguous host page to discard. We could in principle do that across all guest memory, but it would require a large bitmap to track. This patch represents a compromise: we track ballooned subpages for a single contiguous host page at a time. This means that if the guest discards all 4kiB chunks of a host page in succession, we will discard it. This is the expected behaviour in the (host page) == (guest page) != 4kiB case we want to support. If the guest scatters 4kiB requests across different host pages, we don't discard anything, and issue a warning. Not ideal, but at least we don't corrupt guest memory as the previous version could. Warning reporting is kind of a compromise here. Determining whether we're in a problematic state at realize() time is tricky, because we'd have to look at the host pagesizes of all memory backends, but we can't really know if some of those backends could be for special purpose memory that's not subject to ballooning. Reporting only when the guest tries to balloon a partial page also isn't great because if the guest page size happens to line up it won't indicate that we're in a non ideal situation. It could also cause alarming repeated warnings whenever a migration is attempted. So, what we do is warn the first time the guest attempts balloon a partial host page, whether or not it will end up ballooning the rest of the page immediately afterwards. [1] Because when the guest attempts to balloon a page, it will submit requests for each 4kiB subpage. Most will be ignored, but the one which happens to be host page aligned will discard the whole lot. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190214043916.22128-6-david@gibson.dropbear.id.au> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-02-22virtio-net: make VirtIOFeature usable for other virtio devicesStefano Garzarella
In order to use VirtIOFeature also in other virtio devices, we move its declaration and the endof() macro (renamed in virtio_endof()) in virtio.h. We add virtio_feature_get_config_size() function to iterate the array of VirtIOFeature and to return the config size depending on the features enabled. (as virtio_net_set_config_size() did) Suggested-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20190221103314.58500-5-sgarzare@redhat.com Message-Id: <20190221103314.58500-5-sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-02-21virtio-balloon: Use ram_block_discard_range() instead of raw madvise()David Gibson
Currently, virtio-balloon uses madvise() with MADV_DONTNEED to actually discard RAM pages inserted into the balloon. This is basically a Linux only interface (MADV_DONTNEED exists on some other platforms, but doesn't always have the same semantics). It also doesn't work on hugepages and has some other limitations. It turns out that postcopy also needs to discard chunks of memory, and uses a better interface for it: ram_block_discard_range(). It doesn't cover every case, but it covers more than going direct to madvise() and this gives us a single place to update for more possibilities in future. There are some subtleties here to maintain the current balloon behaviour: * For now, we just ignore requests to balloon in a hugepage backed region. That matches current behaviour, because MADV_DONTNEED on a hugepage would simply fail, and we ignore the error. * If host page size is > BALLOON_PAGE_SIZE we can frequently call this on non-host-page-aligned addresses. These would also fail in madvise(), which we then ignored. ram_block_discard_range() error_report()s calls on unaligned addresses, so we explicitly check that case to avoid spamming the logs. * We now call ram_block_discard_range() with the *host* page size, whereas we previously called madvise() with BALLOON_PAGE_SIZE. Surprisingly, this also matches existing behaviour. Although the kernel fails madvise on unaligned addresses, it will round unaligned sizes *up* to the host page size. Yes, this means that if BALLOON_PAGE_SIZE < guest page size we can incorrectly discard more memory than the guest asked us to. I'm planning to address that soon. Errors other than the ones discussed above, will now be reported by ram_block_discard_range(), rather than silently ignored, which means we have a much better chance of seeing when something is going wrong. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20190214043916.22128-5-david@gibson.dropbear.id.au> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-02-21virtio-balloon: Rework ballon_page() interfaceDavid Gibson
This replaces the balloon_page() internal interface with ballon_inflate_page(), with a slightly different interface. The new interface will make future alterations simpler. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20190214043916.22128-4-david@gibson.dropbear.id.au> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-02-21virtio-balloon: Corrections to address verificationDavid Gibson
The virtio-balloon device's verification of the address given to it by the guest has a number of faults: * The addresses here are guest physical addresses, which should be 'hwaddr' rather than 'ram_addr_t' (the distinction is admittedly pretty subtle and confusing) * We don't check for section.mr being NULL, which is the main way that memory_region_find() reports basic failures. We really need to check that before looking at any other section fields, because memory_region_find() doesn't initialize them on the failure path * We're passing a length of '1' to memory_region_find(), but really the guest is requesting that we put the entire page into the balloon, so it makes more sense to call it with BALLOON_PAGE_SIZE Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20190214043916.22128-3-david@gibson.dropbear.id.au> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-02-21virtio-balloon: Remove unnecessary MADV_WILLNEED on deflateDavid Gibson
When the balloon is inflated, we discard memory place in it using madvise() with MADV_DONTNEED. And when we deflate it we use MADV_WILLNEED, which sounds like it makes sense but is actually unnecessary. The misleadingly named MADV_DONTNEED just discards the memory in question, it doesn't set any persistent state on it in-kernel; all that's necessary to bring the memory back is to touch it. MADV_WILLNEED in contrast specifically says that the memory will be used soon and faults it in. This patch simplify's the balloon operation by dropping the madvise() on deflate. This might have an impact on performance - it will move a delay at deflate time until that memory is actually touched, which might be more latency sensitive. However: * Memory that's being given back to the guest by deflating the balloon *might* be used soon, but it equally could just sit around in the guest's pools until needed (or even be faulted out again if the host is under memory pressure). * Usually, the timescale over which you'll be adjusting the balloon is long enough that a few extra faults after deflation aren't going to make a difference. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20190214043916.22128-2-david@gibson.dropbear.id.au> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-02-21vhost-net: revamp configure logicPaolo Bonzini
Detect all invalid configurations (e.g. mingw32 with vhost-user, non-Linux with vhost-kernel). As a collateral benefit, all vhost-kernel backends can be now disabled if one wants to reduce the attack surface. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <1543851204-41186-6-git-send-email-pbonzini@redhat.com> Message-Id: <1550165756-21617-7-git-send-email-pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-02-21vhost: restrict Linux dependency to kernel vhostPaolo Bonzini
vhost-user does not depend on Linux; it can run on any POSIX system. Restrict vhost-kernel to Linux in hw/virtio/vhost-backend.c, everything else can be compiled on all POSIX systems. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <1543851204-41186-4-git-send-email-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1550165756-21617-4-git-send-email-pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>