path: root/hw/block
Age | Commit message | Author
2019-02-22 | virtio-blk: add DISCARD and WRITE_ZEROES features | Stefano Garzarella
This patch adds support for the DISCARD and WRITE_ZEROES commands, which were introduced in the virtio-blk protocol to improve performance when using an SSD backend. We support only one segment per request since multiple segments are not widely used and there are no userspace APIs that allow applications to submit multiple segments in a single call. Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20190221103314.58500-7-sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
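The single-segment restriction can be pictured with a minimal sketch. It assumes the 16-byte segment layout described in the virtio-blk specification (64-bit starting sector, 32-bit sector count, 32-bit flags); the struct and function names are hypothetical and are not taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of one discard/write-zeroes segment
 * (fields are little-endian on the wire). */
struct dwz_segment {
    uint64_t sector;      /* first sector to discard or zero */
    uint32_t num_sectors; /* number of sectors covered */
    uint32_t flags;       /* bit 0: unmap (meaningful for write zeroes) */
};

/* Accept a request payload only if it carries exactly one segment. */
static bool dwz_payload_is_single_segment(size_t payload_len)
{
    return payload_len == sizeof(struct dwz_segment);
}
```

A payload of any other length (for example two segments back to back) would be rejected under this assumption.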
2019-02-22 | virtio-blk: set config size depending on the features enabled | Stefano Garzarella
Starting from the DISCARD and WRITE_ZEROES features, we use an array of VirtIOFeature (as virtio-net does) to properly set the config size depending on the features enabled. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20190221103314.58500-6-sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
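One way to derive a config size from a feature table is sketched below. This is an illustration under assumptions, not the code from the patch: the struct name, the feature bits and the offsets (36 and 48) are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define FEAT_BASE    (1ULL << 0)
#define FEAT_DISCARD (1ULL << 1)

struct feature_size {
    uint64_t flags; /* feature bit(s) that need the fields up to 'end' */
    size_t end;     /* config size required when one of those bits is set */
};

/* Table terminated by an entry whose flags are 0. Offsets are made up. */
static const struct feature_size example_sizes[] = {
    { FEAT_BASE, 36 },
    { FEAT_DISCARD, 48 },
    { 0, 0 },
};

/* Return the largest 'end' among the entries enabled in host_features. */
static size_t config_size_for(const struct feature_size *table,
                              uint64_t host_features)
{
    size_t size = 0;
    for (size_t i = 0; table[i].flags != 0; i++) {
        if ((host_features & table[i].flags) && table[i].end > size) {
            size = table[i].end;
        }
    }
    return size;
}
```

With this shape, a device that does not offer the newer features exposes the shorter config space, which is what keeps older machine types compatible.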
2019-02-22 | virtio-blk: add "discard" and "write-zeroes" properties | Stefano Garzarella
In order to avoid migration issues, we enable the DISCARD and WRITE_ZEROES features only for machine type >= 4.0. As discussed with Michael S. Tsirkin and Stefan Hajnoczi on the list [1], the DISCARD operation should not have security implications (e.g. page cache attacks), so we can enable it by default. [1] https://lists.gnu.org/archive/html/qemu-devel/2019-02/msg00504.html Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20190221103314.58500-4-sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-02-22 | virtio-blk: add host_features field in VirtIOBlock | Stefano Garzarella
Since the number of configurable features for virtio-blk is growing, this patch adds a host_features field to struct VirtIOBlock (as in virtio-net). In this way, we can avoid adding new fields for new properties and we can directly set VIRTIO_BLK_F* flags in host_features. We update the "config-wce" and "scsi" property definitions to use the new host_features field without changing the behaviour. Suggested-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20190221103314.58500-3-sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-02-22 | virtio-blk: add acct_failed param to virtio_blk_handle_rw_error() | Stefano Garzarella
We add the acct_failed param so that virtio_blk_handle_rw_error() can also be used when calling block_acct_failed() is not required (e.g. when a discard operation fails). Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20190221103314.58500-2-sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-02-13 | virtio-blk: set correct config size for the host driver | Changpeng Liu
Commit caa1ee43 "vhost-user-blk: add discard/write zeroes features support" added fields to struct virtio_blk_config. This changes the size of the config space and breaks migration from QEMU 3.1 and older: qemu-system-ppc64: get_pci_config_device: Bad config data: i=0x10 read: 41 device: 1 cmask: ff wmask: 80 w1cmask:0 qemu-system-ppc64: Failed to load PCIDevice:config qemu-system-ppc64: Failed to load virtio-blk:virtio qemu-system-ppc64: error while loading state for instance 0x0 of device 'pci@800000020000000:01.0/virtio-blk' qemu-system-ppc64: load of migration failed: Invalid argument Since virtio-blk doesn't support the "discard" and "write zeroes" features, it shouldn't expose the associated fields in the config space at all. Just include all fields up to num_queues to match QEMU 3.1 and older. Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <1550022537-27565-1-git-send-email-changpeng.liu@intel.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-02-12 | virtio-blk: cleanup using VirtIOBlock *s and VirtIODevice *vdev | Stefano Garzarella
In several places we are still using req->dev or VIRTIO_DEVICE(req->dev) when we have already defined the s and vdev pointers: VirtIOBlock *s = req->dev; VirtIODevice *vdev = VIRTIO_DEVICE(s); Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Message-id: 20190208142347.214815-1-sgarzare@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-02-05 | Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging | Peter Maydell
pci, pc, virtio: fixes, cleanups, features vhost user blk discard/write zeroes features misc cleanups and fixes all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Tue 05 Feb 2019 16:00:20 GMT # gpg: using RSA key 281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: contrib/libvhost-user: cleanup casts r2d: fix build on mingw mmap-alloc: fix hugetlbfs misaligned length in ppc64 mmap-alloc: unfold qemu_ram_mmap() i386, acpi: cleanup build_facs by removing second unused argument fw_cfg: fix the life cycle and the name of "qemu_extra_params_fw" acpi: Make TPM 2.0 with TIS available as MSFT0101 hw/virtio: Use CONFIG_VIRTIO_PCI switch instead of CONFIG_PCI vhost-user-blk: add discard/write zeroes features support contrib/vhost-user-blk: fix the compilation issue pci/msi: export msi_is_masked() intel_iommu: reset intr_enabled when system reset intel_iommu: fix operator in vtd_switch_address_space hw: virtio-pci: drop DO_UPCAST include: update Linux headers to 4.21-rc1/5.0-rc1 scripts/update-linux-headers.sh: adjust for Linux 4.21-rc1 (or 5.0-rc1) contrib/libvhost-user: switch to uint64_t virtio: add checks for the size of the indirect table Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-02-05 | vhost-user-blk: add discard/write zeroes features support | Changpeng Liu
Linux commit 1f23816b8 "virtio_blk: add discard and write zeroes support" added the support in the guest kernel; this patch enables the same features in the vhost-user-blk driver. It also enables the DISCARD and WRITE ZEROES commands in the test example utility. Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-02-04 | xen-block: handle resize callback | Paul Durrant
Some frontend drivers will handle dynamic resizing of PV disks, so set up the BlockDevOps resize_cb() method during xen_block_realize() to allow this to be done. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-31 | Merge remote-tracking branch 'remotes/xanclic/tags/pull-block-2019-01-31' into staging | Peter Maydell
Block patches: - New debugging QMP command to explore block graphs - Converted DPRINTF()s to trace events - Fixed qemu-io's use of getopt() for systems with optreset - Minor NVMe emulation fixes - An iotest fix # gpg: Signature made Thu 31 Jan 2019 00:51:46 GMT # gpg: using RSA key F407DB0061D5CF40 # gpg: Good signature from "Max Reitz <mreitz@redhat.com>" [full] # Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40 * remotes/xanclic/tags/pull-block-2019-01-31: iotests: Allow 147 to be run concurrently iotests: Bind qemu-nbd to localhost in 147 iotests.py: Add qemu_nbd_pipe() nvme: use pci_dev directly in nvme_realize nvme: ensure the num_queues is not zero nvme: use TYPE_NVME instead of constant string qemu-io: Add generic function for reinitializing optind. block/sheepdog: Convert from DPRINTF() macro to trace events block/file-posix: Convert from DPRINTF() macro to trace events block/curl: Convert from DPRINTF() macro to trace events block/ssh: Convert from DPRINTF() macro to trace events scripts: add render_block_graph function for QEMUMachine qapi: add x-debug-query-block-graph Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-01-31 | nvme: use pci_dev directly in nvme_realize | Li Qiang
There is no need to make another reference. Signed-off-by: Li Qiang <liq3ea@163.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190120055558.32984-4-liq3ea@163.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-01-31 | nvme: ensure the num_queues is not zero | Li Qiang
When it is zero, it causes a segfault. Using the following command: "-drive file=//home/test/test1.img,if=none,id=id0 -device nvme,drive=id0,serial=test,num_queues=0" causes the following backtrace:
Thread 4 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe9735700 (LWP 30952)]
0x0000555555a7a77c in nvme_start_ctrl (n=0x5555577473f0) at hw/block/nvme.c:825
825 if (unlikely(n->cq[0])) {
(gdb) bt
0 0x0000555555a7a77c in nvme_start_ctrl (n=0x5555577473f0) at hw/block/nvme.c:825
1 0x0000555555a7af7f in nvme_write_bar (n=0x5555577473f0, offset=20, data=4587521, size=4) at hw/block/nvme.c:969
2 0x0000555555a7b81a in nvme_mmio_write (opaque=0x5555577473f0, addr=20, data=4587521, size=4) at hw/block/nvme.c:1163
3 0x0000555555869236 in memory_region_write_accessor (mr=0x555557747cd0, addr=20, value=0x7fffe97320f8, size=4, shift=0, mask=4294967295, attrs=...) at /home/test/qemu1/qemu/memory.c:502
4 0x0000555555869446 in access_with_adjusted_size (addr=20, value=0x7fffe97320f8, size=4, access_size_min=2, access_size_max=8, access_fn=0x55555586914d <memory_region_write_accessor>, mr=0x555557747cd0, attrs=...) at /home/test/qemu1/qemu/memory.c:568
5 0x000055555586c479 in memory_region_dispatch_write (mr=0x555557747cd0, addr=20, data=4587521, size=4, attrs=...) at /home/test/qemu1/qemu/memory.c:1499
6 0x00005555558030af in flatview_write_continue (fv=0x7fffe0061130, addr=4273930260, attrs=..., buf=0x7ffff7ff0028 "\001", len=4, addr1=20, l=4, mr=0x555557747cd0) at /home/test/qemu1/qemu/exec.c:3234
7 0x00005555558031f9 in flatview_write (fv=0x7fffe0061130, addr=4273930260, attrs=..., buf=0x7ffff7ff0028 "\001", len=4) at /home/test/qemu1/qemu/exec.c:3273
8 0x00005555558034ff in address_space_write (as=0x555556758480 <address_space_memory>, addr=4273930260, attrs=..., buf=0x7ffff7ff0028 "\001", len=4) at /home/test/qemu1/qemu/exec.c:3363
9 0x0000555555803550 in address_space_rw (as=0x555556758480 <address_space_memory>, addr=4273930260, attrs=..., buf=0x7ffff7ff0028 "\001", len=4, is_write=true) at /home/test/qemu1/qemu/exec.c:3374
10 0x00005555558884a1 in kvm_cpu_exec (cpu=0x555556920e40) at /home/test/qemu1/qemu/accel/kvm/kvm-all.c:2031
11 0x000055555584cd9d in qemu_kvm_cpu_thread_fn (arg=0x555556920e40) at /home/test/qemu1/qemu/cpus.c:1281
12 0x0000555555dbaf6d in qemu_thread_start (args=0x5555569438a0) at util/qemu-thread-posix.c:502
13 0x00007ffff5dc86db in start_thread (arg=0x7fffe9735700) at pthread_create.c:463
14 0x00007ffff5af188f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Signed-off-by: Li Qiang <liq3ea@163.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190120055558.32984-3-liq3ea@163.com Signed-off-by: Max Reitz <mreitz@redhat.com>
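The kind of guard this implies can be sketched as follows. It is a hypothetical illustration only (the names are not from hw/block/nvme.c): a zero queue count is rejected before any queue table is allocated, so later code never dereferences element 0 of an empty table.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical queue type, for illustration only. */
struct queue {
    int id;
};

/* Allocate a table of queue pointers. Returns NULL when num_queues is 0
 * (or on allocation failure), so the caller can fail device setup early
 * instead of crashing later on table[0]. */
static struct queue **alloc_queue_table(uint32_t num_queues)
{
    if (num_queues == 0) {
        return NULL;
    }
    return calloc(num_queues, sizeof(struct queue *));
}
```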
2019-01-31 | nvme: use TYPE_NVME instead of constant string | Li Qiang
Signed-off-by: Li Qiang <liq3ea@163.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190120055558.32984-2-liq3ea@163.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-01-30 | virtio-blk: remove duplicate definition of VirtIOBlock *s pointer | Stefano Garzarella
VirtIOBlock *s is already defined and initialized with req->dev at the top of virtio_blk_handle_request(), so we can remove it from the code block of the VIRTIO_BLK_T_GET_ID case. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20190130095231.42081-1-sgarzare@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2019-01-30 | hw/block: clean up stale xen_disk trace entries | Paul Durrant
These should have been removed when xen_disk.c was removed, but I missed them. Fixes: 19f87870baa570bcd7e80e7657e030bf427f16be xen: remove the legacy 'xen_disk' backend Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190122145132.12571-1-paul.durrant@citrix.com> [lv: s/stake/stale/ and add "Fixes" tag] Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2019-01-21 | hw/block/xen: use proper format string for printing sectors | Alex Bennée
The %lu format string is different depending on the host architecture which causes builds like the debian-armhf-cross build to fail. Use the correct PRi64 format string. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190116121350.23863-1-alex.bennee@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
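A minimal sketch of the portable approach, using the fixed-width format macros from `<inttypes.h>`. `PRId64` is chosen here as an assumption; the exact macro used by the patch is not shown in this log, and the function name is hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 64-bit sector count without assuming that 'long' is 64 bits.
 * "%lu" only matches a 64-bit value on hosts where long is 64 bits wide;
 * PRId64 expands to the right conversion on every host. */
static int format_sectors(char *buf, size_t len, int64_t sectors)
{
    return snprintf(buf, len, "%" PRId64 " sectors", sectors);
}
```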
2019-01-14 | qemu: avoid memory leak while remove disk | Jian Wang
The vhost_dev_cleanup function memsets vhost_dev to zero. This sets dev.vqs to NULL, so the later g_free() call on vqs frees nothing, which results in a memory leak. However, vqs cannot be released directly inside vhost_dev_cleanup, because vhost_net also calls this function and vhost_net's vqs is assigned from an array. To solve this problem, we first save the vqs pointer and release its memory after vhost_dev_cleanup has been called. Signed-off-by: Jian Wang <wangjian161@huawei.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
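The save-then-free pattern described here can be sketched in isolation. The types and functions below are hypothetical stand-ins, not the vhost code itself, and plain `free()` is used in place of `g_free()`.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical device with a heap-allocated queue array. */
struct dev {
    int *vqs;
    int nvqs;
};

/* Stand-in for a shared cleanup routine that zeroes the whole struct,
 * which also wipes the only copy of the vqs pointer. */
static void dev_cleanup(struct dev *d)
{
    memset(d, 0, sizeof(*d));
}

/* Save the pointer before cleanup, then free the saved copy. Freeing
 * d->vqs after dev_cleanup() would be free(NULL) and leak the array. */
static void dev_teardown(struct dev *d)
{
    int *vqs = d->vqs;
    dev_cleanup(d);
    free(vqs);
}
```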
2019-01-14 | xen-block: avoid repeated memory allocation | Tim Smith
The xen-block dataplane currently allocates memory to hold the data for each request as that request is used, and frees it afterwards. Because it requires page-aligned blocks, this interacts poorly with non-page-aligned allocations and balloons the heap. Instead, allocate the maximum possible buffer size required for the protocol, which is BLKIF_MAX_SEGMENTS_PER_REQUEST (currently 11) pages when the request structure is created, and keep that buffer until it is destroyed. Since the requests are re-used via a free list, this should actually improve memory usage. Signed-off-by: Tim Smith <tim.smith@citrix.com> Re-based and commit comment adjusted. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
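The allocate-once idea can be sketched as below. This is an illustration under assumptions: a 4096-byte page size, C11 `aligned_alloc()` rather than QEMU's own allocators, and hypothetical type and function names. Only the value 11 comes from the commit message.

```c
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE_BYTES 4096         /* assumed page size */
#define MAX_SEGMENTS_PER_REQUEST 11  /* value quoted in the commit message */

/* Hypothetical request object that owns its data buffer for its lifetime. */
struct request {
    void *buf;
};

/* Allocate the page-aligned, maximum-size buffer once, at creation. */
static struct request *request_create(void)
{
    struct request *r = malloc(sizeof(*r));
    if (r == NULL) {
        return NULL;
    }
    /* The size is a multiple of the alignment, as aligned_alloc() requires. */
    r->buf = aligned_alloc(PAGE_SIZE_BYTES,
                           (size_t)MAX_SEGMENTS_PER_REQUEST * PAGE_SIZE_BYTES);
    if (r->buf == NULL) {
        free(r);
        return NULL;
    }
    return r;
}

/* Release the buffer only when the request itself is destroyed. */
static void request_destroy(struct request *r)
{
    if (r != NULL) {
        free(r->buf);
        free(r);
    }
}
```

Because requests are recycled through a free list, the buffer is reused across I/Os instead of being allocated and freed per request.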
2019-01-14 | xen-block: improve response latency | Tim Smith
If the I/O ring is full, the guest cannot send any more requests until some responses are sent. Only sending all available responses just before checking for new work does not leave much time for the guest to supply new work, so this will cause stalls if the ring gets full. Also, not completing reads as soon as possible adds latency to the guest. To alleviate that, complete IO requests as soon as they come back. xen_block_send_response() already returns a value indicating whether a notify should be sent, which is all the batching we need. Signed-off-by: Tim Smith <tim.smith@citrix.com> Re-based and commit comment adjusted. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14 | xen-block: improve batching behaviour | Tim Smith
When I/O consists of many small requests, performance is improved by batching them together in a single io_submit() call. When there are relatively few requests, the extra overhead is not worth it. This introduces a check to start batching I/O requests via blk_io_plug()/blk_io_unplug() in an amount proportional to the number which were already in flight at the time we started reading the ring. Signed-off-by: Tim Smith <tim.smith@citrix.com> Re-based and commit comment adjusted. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14 | xen: remove the legacy 'xen_disk' backend | Paul Durrant
This backend has now been replaced by the 'xen-qdisk' XenDevice. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14 | xen: automatically create XenBlockDevice-s | Paul Durrant
This patch adds create and destroy functions for XenBlockDevice-s so that they can be created automatically when the Xen toolstack instantiates a new PV backend via xenstore. When the XenBlockDevice is created this way it is also necessary to create a 'drive' which matches the configuration that the Xen toolstack has written into xenstore. This is done by formulating the parameters necessary for each 'blockdev' layer of the drive and then using qmp_blockdev_add() to create the layers. Also, for compatibility with the legacy 'xen_disk' implementation, an iothread is automatically created for the new XenBlockDevice. This, like the driver layers, will be destroyed after the XenBlockDevice is unrealized. The legacy backend scan for 'qdisk' is removed by this patch, which makes the 'xen_disk' code redundant. The code will be removed by a subsequent patch. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14 | xen: add implementations of xen-block connect and disconnect functions... | Paul Durrant
...and wire in the dataplane. This patch adds the remaining code to make the xen-block XenDevice functional. The parameters that a block frontend expects to find are populated in the backend xenstore area, and the 'ring-ref' and 'event-channel' values specified in the frontend xenstore area are mapped/bound and used to set up the dataplane. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14 | xen: purge 'blk' and 'ioreq' from function names in dataplane/xen-block.c | Paul Durrant
This is a purely cosmetic patch that purges remaining use of 'blk' and 'ioreq' in local function names, and then makes sure all functions are prefixed with 'xen_block_'. No functional change. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14 | xen: remove 'ioreq' struct/varable/field names from dataplane/xen-block.c | Paul Durrant
This is a purely cosmetic patch that purges the name 'ioreq' from struct, variable and field names. (This name has been problematic for a long time as 'ioreq' is the name used for generic I/O requests coming from Xen). The patch replaces 'struct ioreq' with a new 'XenBlockRequest' type and 'ioreq' field/variable names with 'request', and then does necessary fix-up to adhere to coding style. Function names are not modified by this patch. They will be dealt with in a subsequent patch. No functional change. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14 | xen: remove 'XenBlkDev' and 'blkdev' names from dataplane/xen-block | Paul Durrant
This is a purely cosmetic patch that substitutes the old 'struct XenBlkDev' name with 'XenBlockDataPlane' and 'blkdev' field/variable names with 'dataplane', and then does necessary fix-up to adhere to coding style. No functional change. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14 | xen: add header and build dataplane/xen-block.c | Paul Durrant
This patch adds the transformations necessary to get dataplane/xen-block.c to build against the new XenBus/XenDevice framework. MAINTAINERS is also updated due to the introduction of dataplane/xen-block.h. NOTE: Existing data structure names are retained for the moment. These will be modified by subsequent patches. A typedef for XenBlockDataPlane has been added to the header (based on the old struct XenBlkDev name for the moment) so that the old names don't need to leak out of the dataplane code. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14 | xen: remove unnecessary code from dataplane/xen-block.c | Paul Durrant
Not all of the code duplicated from xen_disk.c is required as the basis for the new dataplane implementation so this patch removes extraneous code, along with the legacy #includes and calls to the legacy xen_pv_printf() function. Error messages are changed to be reported using error_report(). NOTE: The code is still not yet built. Further transformations will be required to make it correctly interface to the new XenBus/XenDevice framework. They will be delivered in a subsequent patch. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14 | xen: duplicate xen_disk.c as basis of dataplane/xen-block.c | Paul Durrant
The new xen-block XenDevice implementation requires the same core dataplane as the legacy xen_disk implementation it will eventually replace. This patch therefore copies the legacy xen_disk.c source module into a new dataplane/xen-block.c source module as the basis for the new dataplane and adjusts the MAINTAINERS file accordingly. NOTE: The duplicated code is not yet built. It is simply put into place by this patch (just fixing style violations) such that the modifications that will need to be made to the code are not conflated with code movement, thus making review harder. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14 | xen: add xenstore watcher infrastructure | Paul Durrant
A Xen PV frontend communicates its state to the PV backend by writing to the 'state' key in the frontend area in xenstore. It is therefore necessary for a XenDevice implementation to be notified whenever the value of this key changes. This patch adds code to do this as follows:
- an 'fd handler' is registered on the libxenstore handle which will be triggered whenever a 'watch' event occurs
- primitives are added to xen-bus-helper to add or remove watch events
- a list of Notifier objects is added to XenBus to provide a mechanism to call the appropriate 'watch handler' when its associated event occurs
The xen-block implementation is extended with a 'frontend_changed' method, which calls as-yet stub 'connect' and 'disconnect' functions when the relevant frontend state transitions occur. A subsequent patch will supply a full implementation for these functions. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14 | xen: create xenstore areas for XenDevice-s | Paul Durrant
This patch adds a new source module, xen-bus-helper.c, which builds on basic libxenstore primitives to provide functions to create (setting permissions appropriately) and destroy xenstore areas, and functions to 'printf' and 'scanf' nodes therein. The main xen-bus code then uses these primitives [1] to initialize and destroy the frontend and backend areas for a XenDevice during realize and unrealize respectively. The 'xen-block' implementation is extended with a 'get_name' method that returns the VBD number. This number is required to 'name' the xenstore areas. NOTE: An exit handler is also added to make sure the xenstore areas are cleaned up if QEMU terminates without devices being unrealized. [1] The 'scanf' functions are actually not yet needed, but they will be needed by code delivered in subsequent patches. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14 | xen: introduce 'xen-block', 'xen-disk' and 'xen-cdrom' | Paul Durrant
This patch adds new XenDevice-s: 'xen-disk' and 'xen-cdrom', both derived from a common 'xen-block' parent type. These will eventually replace the 'xen_disk' (note the underscore rather than hyphen) legacy PV backend but it is illustrative to build up the implementation incrementally, along with the XenBus/XenDevice framework. Subsequent patches will therefore add to these devices' implementation as new features are added to the framework. After this patch has been applied it is possible to instantiate new 'xen-disk' or 'xen-cdrom' devices with a single 'vdev' parameter, which accepts values adhering to the Xen VBD naming scheme [1]. For example, a command-line instantiation of a xen-disk can be done with an argument similar to the following: -device xen-disk,vdev=hda The implementation of the vdev parameter formulates the appropriate VBD number for use in the PV protocol. [1] https://xenbits.xen.org/docs/unstable/man/xen-vbd-interface.7.html Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
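How a name such as hda or xvda maps to a VBD number can be sketched from the encoding in the Xen VBD interface document cited as [1]. This is an illustration only, not the parsing code from the patch; the function names are hypothetical, the disk index is 0 for the first disk of a kind, and partition 0 means the whole disk.

```c
#include <stdint.h>

/* xvd devices: compact encoding for disks and partitions up to 15,
 * extended encoding (bit 28 set) otherwise. */
static uint32_t vbd_number_xvd(uint32_t disk, uint32_t partition)
{
    if (disk < 16 && partition < 16) {
        return (202u << 8) | (disk << 4) | partition;
    }
    return (1u << 28) | (disk << 8) | partition;
}

/* hd devices: hda/hdb (disk 0..1) use major 3, hdc/hdd (disk 2..3) use
 * major 22. Callers are expected to pass disk 0..3 only. */
static uint32_t vbd_number_hd(uint32_t disk, uint32_t partition)
{
    if (disk < 2) {
        return (3u << 8) | (disk << 6) | partition;
    }
    return (22u << 8) | ((disk - 2) << 6) | partition;
}
```

Under this encoding, `vdev=hda` corresponds to 768 and `vdev=xvda` to 51712.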
2019-01-14 | xen: re-name XenDevice to XenLegacyDevice... | Paul Durrant
...and xen_backend.h to xen-legacy-backend.h Rather than attempting to convert the existing backend infrastructure to be QOM compliant (which would be hard to do in an incremental fashion), subsequent patches will introduce a completely new framework for Xen PV backends. Hence it is necessary to re-name parts of existing code to avoid name clashes. The re-named 'legacy' infrastructure will be removed once all backends have been ported to the new framework. This patch is purely cosmetic. No functional change. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-11 | qemu/queue.h: leave head structs anonymous unless necessary | Paolo Bonzini
Most list head structs need not be given a name. In most cases the name is given just in case one is going to use QTAILQ_LAST, QTAILQ_PREV or reverse iteration, but this does not apply to lists of other kinds, and even for QTAILQ in practice this is only rarely needed. In addition, we will soon reimplement those macros completely so that they do not need a name for the head struct. So clean up everything, not giving a name except in the rare case where it is necessary. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-16 | Merge remote-tracking branch 'remotes/pmaydell/tags/pull-misc-20181214' into staging | Peter Maydell
miscellaneous patches: * checkpatch.pl: Enforce multiline comment syntax * Rename cpu_physical_memory_write_rom() to address_space_write_rom() * disas, monitor, elf_ops: Use address_space_read() to read memory * Remove load_image() in favour of load_image_size() * Fix some minor memory leaks in arm boards/devices * virt: fix broken indentation # gpg: Signature made Fri 14 Dec 2018 14:41:20 GMT # gpg: using RSA key 3C2525ED14360CDE # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" # gpg: aka "Peter Maydell <pmaydell@gmail.com>" # gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" # Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE * remotes/pmaydell/tags/pull-misc-20181214: (22 commits) virt: Fix broken indentation target/arm: Create timers in realize, not init tests/test-arm-mptimer: Don't leak string memory hw/sd/sdhci: Don't leak memory region in sdhci_sysbus_realize() hw/arm/mps2-tz.c: Free mscname string in make_dma() target/arm: Free name string in ARMCPRegInfo hashtable entries include/hw/loader.h: Document load_image_size() hw/core/loader.c: Remove load_image() device_tree.c: Don't use load_image() hw/block/tc58128.c: Don't use load_image() hw/i386/multiboot.c: Don't use load_image() hw/i386/pc.c: Don't use load_image() hw/pci/pci.c: Don't use load_image() hw/smbios/smbios.c: Don't use load_image() hw/ppc/ppc405_boards: Don't use load_image() hw/ppc/mac_newworld, mac_oldworld: Don't use load_image() elf_ops.h: Use address_space_write() to write memory monitor: Use address_space_read() to read memory disas.c: Use address_space_read() to read memory Rename cpu_physical_memory_write_rom() to address_space_write_rom() ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-12-14 | hw/block/tc58128.c: Don't use load_image() | Peter Maydell
The load_image() function is deprecated, as it does not let the caller specify how large the buffer to read the file into is. Instead use load_image_size(). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20181130151712.2312-8-peter.maydell@linaro.org
2018-12-13 | block/noenand: Convert sysbus init function to realize function | Mao Zhongyi
Use DeviceClass rather than SysBusDeviceClass in onenand_class_init(). Cc: kwolf@redhat.com Cc: mreitz@redhat.com Cc: qemu-block@nongnu.org Signed-off-by: Mao Zhongyi <maozhongyi@cmss.chinamobile.com> Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20181130093852.20739-3-maozhongyi@cmss.chinamobile.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-12-12 | virtio-blk: fix comment for virtio_blk_rw_complete as nalloc is initially -1 | Dongli Zhang
The initial value of nalloc is -1, not 1. Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Message-id: 1541479952-32355-1-git-send-email-dongli.zhang@oracle.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-12-12 | virtio-blk: rename iov to out_iov in virtio_blk_handle_request() | Dongli Zhang
In virtio_blk_handle_request(), in_iov is used for input header while iov is used for output header. Rename iov to out_iov to pair output header's name with in_iov to avoid confusing people when reading source code. Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Message-id: 1541520556-8334-1-git-send-email-dongli.zhang@oracle.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-11-27 | nvme: Fix spurious interrupts | Keith Busch
The code had asserted an interrupt every time it was requested to check for new completion queue entries. This can result in spurious interrupts seen by the guest OS. Fix this by asserting an interrupt only if there are un-acknowledged completion queue entries available. Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Keith Busch <keith.busch@intel.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
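The condition "un-acknowledged completion queue entries available" can be sketched as a head/tail comparison. The struct and function below are hypothetical and only illustrate the idea; they are not the code in hw/block/nvme.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical completion queue state: the device advances 'tail' when it
 * posts an entry, and the guest advances 'head' when it acknowledges one. */
struct cq_state {
    uint32_t head;
    uint32_t tail;
};

/* An interrupt is only worth asserting while the guest still has entries
 * it has not acknowledged, i.e. while head has not caught up with tail. */
static bool cq_has_unacked_entries(const struct cq_state *cq)
{
    return cq->head != cq->tail;
}
```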
2018-11-22nvme: fix bug with PCI IRQ pins on teardownLogan Gunthorpe
When the submission and completion queues are being torn down, the IRQ will be asserted for the completion queue when the submission queue is deleted. Then when the completion queue is deleted it stays asserted. Thus, on systems that do not use MSI, no further interrupts can be triggered on the host. Linux sees this as a long delay when unbinding the nvme device. Eventually the interrupt timeout occurs and it continues. To fix this we ensure we deassert the IRQ for a CQ when it is deleted. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-22nvme: fix CMB endianness confusionPaolo Bonzini
The CMB is marked as DEVICE_LITTLE_ENDIAN, so the data must be read/written as if it was little-endian output (in the case of big endian, we get two swaps, one in the memory core and one in nvme.c). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Tested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
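Treating a buffer as little-endian regardless of host byte order can be sketched with byte-wise helpers. `store_le()` and `load_le()` are hypothetical names for this example and are not the helpers used in nvme.c; both assume `size` is at most 8.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Store the low `size` bytes of `val` at `p`, least significant byte
 * first. Working byte by byte gives the same layout on little-endian
 * and big-endian hosts. Assumes size <= 8.
 */
void store_le(uint8_t *p, uint64_t val, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        p[i] = (uint8_t)(val >> (8 * i));
    }
}

/* Load `size` bytes from `p` as a little-endian value. Assumes size <= 8. */
uint64_t load_le(const uint8_t *p, size_t size)
{
    uint64_t val = 0;
    for (size_t i = 0; i < size; i++) {
        val |= (uint64_t)p[i] << (8 * i);
    }
    return val;
}
```

Because the helpers never reinterpret the buffer through a wider host-endian type, there is exactly one defined byte order for the stored data, which avoids the double swap mentioned in the commit message.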
2018-11-22Revert "nvme: fix oob access issue(CVE-2018-16847)"Kevin Wolf
This reverts commit 5e3c0220d7e4f0361c4d36c697a8842f2b583402. We have a better fix committed for this now. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-22nvme: fix out-of-bounds access to the CMBPaolo Bonzini
Because the CMB BAR has a min_access_size of 2, if you read the last byte it will try to memcpy *2* bytes from n->cmbuf, causing an off-by-one error. This is CVE-2018-16847. Another way to fix this might be to register the CMB as a RAM memory region, which would also be more efficient. However, that might be a change for big-endian machines; I didn't think this through and I don't know how real hardware works. Add a basic testcase for the CMB in case somebody does this change later on. Cc: Keith Busch <keith.busch@intel.com> Cc: qemu-block@nongnu.org Reported-by: Li Qiang <liq3ea@gmail.com> Reviewed-by: Li Qiang <liq3ea@gmail.com> Tested-by: Li Qiang <liq3ea@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-22nvme: call blk_drain in NVMe reset code to avoid lockupsIgor Druzhinin
When blk_flush is called in the NVMe reset path, the S/C queues are already freed, which means that re-entering the AIO handling loop with some IO requests unfinished will lock up or crash, as their SG structures may be reused. Call blk_drain before freeing the queues to avoid this nasty scenario. Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-11-19hw/block/onenand: use qemu_log_mask() for reportingPeter Maydell
Update the onenand device to use qemu_log_mask() for reporting guest errors and unimplemented features, rather than plain fprintf() and hw_error(). (We leave the hw_error() in onenand_reset(), as that is triggered by a failure to read the underlying block device for the bootRAM, not by guest action.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20181115143535.5885-3-peter.maydell@linaro.org
2018-11-19hw/block/onenand: Fix off-by-one error allowing out-of-bounds readPeter Maydell
An off-by-one error in a switch case in onenand_read() allowed a misbehaving guest to read off the end of a block of memory. NB: the onenand device is used only by the "n800" and "n810" machines, which are usable only with TCG, not KVM, so this is not a security issue. Reported-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20181115143535.5885-2-peter.maydell@linaro.org Suggested-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-11-19fdc: fix segfault in fdctrl_stop_transfer() when DMA is disabledMark Cave-Ayland
Commit c8a35f1cf0f "fdc: use IsaDma interface instead of global DMA_* functions" accidentally introduced a segfault in fdctrl_stop_transfer() for non-DMA transfers. If fdctrl->dma_chann has not been configured then the fdctrl->dma interface reference isn't initialised during isabus_fdc_realize(). Unfortunately fdctrl_stop_transfer() unconditionally references the DMA interface when finishing the transfer causing a NULL pointer dereference. Fix the issue by adding a check in fdctrl_stop_transfer() so that the DMA interface reference and release method is only invoked if fdctrl->dma_chann has been set. (This issue was discovered by Martin testing a recent change in the NetBSD installer under qemu-system-sparc) Cc: qemu-stable@nongnu.org Reported-by: Martin Husemann <martin@duskware.de> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Hervé Poussineau <hpoussin@reactos.org> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
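The shape of this guard can be sketched with a simplified model. The types and functions below are hypothetical stand-ins, not the real FDCtrl or IsaDma definitions, and the use of -1 to mean "no DMA channel configured" is an assumption of the sketch.

```c
#include <stddef.h>

/* Hypothetical stand-in for a DMA interface object. */
struct dma_model {
    int release_calls; /* counts how often the channel was released */
};

/* Hypothetical stand-in for the floppy controller state. */
struct fdc_model {
    int dma_chann;         /* assumed: -1 means no DMA channel configured */
    struct dma_model *dma; /* left NULL when dma_chann is -1 */
};

void dma_model_release(struct dma_model *dma)
{
    dma->release_calls++;
}

/*
 * Finish a transfer. The DMA interface is touched only when a channel
 * was configured, so the non-DMA path never dereferences the NULL
 * `dma` pointer.
 */
void fdc_model_stop_transfer(struct fdc_model *fdc)
{
    if (fdc->dma_chann != -1) {
        dma_model_release(fdc->dma);
    }
}
```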
2018-11-19nvme: fix oob access issue(CVE-2018-16847)Li Qiang
Currently, the nvme_cmb_ops mr doesn't check the addr and size. This can lead to an out-of-bounds (oob) access issue, which is triggerable from the guest. Add a check to avoid this issue. Fixes CVE-2018-16847. Reported-by: Li Qiang <liq3ea@gmail.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Li Qiang <liq3ea@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
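An address-and-size check of the kind described above might look like the following. This is a generic sketch, not the content of the patch: `buf_read_checked()` and its parameters are hypothetical names chosen for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy `size` bytes starting at `offset` out of a
 * buffer of `buf_len` bytes, but only if the whole access is in bounds.
 * The comparison is written as `size > buf_len - offset` so that
 * `offset + size` is never computed and cannot overflow.
 */
bool buf_read_checked(const uint8_t *buf, size_t buf_len,
                      size_t offset, void *out, size_t size)
{
    if (offset > buf_len || size > buf_len - offset) {
        return false; /* access would run past the end of the buffer */
    }
    memcpy(out, buf + offset, size);
    return true;
}
```

With an 8-byte buffer, a 2-byte read at offset 7 is rejected while a 1-byte read at the same offset succeeds, which is the boundary case behind the CVE.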