aboutsummaryrefslogtreecommitdiff
path: root/hw/block/dataplane
AgeCommit message (Collapse)Author
2020-08-21meson: convert hw/blockMarc-André Lureau
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-08-21trace: switch position of headers to what Meson requiresPaolo Bonzini
Meson doesn't enjoy the same flexibility we have with Make in choosing the include path. In particular the tracing headers are using $(build_root)/$(<D). In order to keep the include directives unchanged, the simplest solution is to generate headers with patterns like "trace/trace-audio.h" and place forwarding headers in the source tree such that for example "audio/trace.h" includes "trace/trace-audio.h". This patch is too ugly to be applied to the Makefiles now. It's only a way to separate the changes to the tracing header files from the Meson rewrite of the tracing logic. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10xen: Use ERRP_GUARD()Vladimir Sementsov-Ogievskiy
If we want to check error after errp-function call, we need to introduce local_err and then propagate it to errp. Instead, use the ERRP_GUARD() macro, benefits are: 1. No need of explicit error_propagate call 2. No need of explicit local_err variable: use errp directly 3. ERRP_GUARD() leaves errp as is if it's not NULL or &error_fatal, this means that we don't break error_abort (we'll abort on error_set, not on error_propagate) If we want to add some info to errp (by error_prepend() or error_append_hint()), we must use the ERRP_GUARD() macro. Otherwise, this info will not be added when errp == &error_fatal (the program will exit prior to the error_append_hint() or error_prepend() call). No such cases are being fixed here. This commit is generated by command sed -n '/^X86 Xen CPUs$/,/^$/{s/^F: //p}' MAINTAINERS | \ xargs git ls-files | grep '\.[hc]$' | \ xargs spatch \ --sp-file scripts/coccinelle/errp-guard.cocci \ --macro-file scripts/cocci-macro-file.h \ --in-place --no-show-diff --max-width 80 Reported-by: Kevin Wolf <kwolf@redhat.com> Reported-by: Greg Kurz <groug@kaod.org> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> [Commit message tweaked] Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20200707165037.1026246-9-armbru@redhat.com> [ERRP_AUTO_PROPAGATE() renamed to ERRP_GUARD(), and auto-propagated-errp.cocci to errp-guard.cocci. Commit message tweaked again.]
2020-06-17virtio-blk: On restart, process queued requests in the proper contextSergio Lopez
On restart, we were scheduling a BH to process queued requests, which would run before starting up the data plane, leading to those requests being assigned and started on coroutines on the main context. This could cause requests to be wrongly processed in parallel from different threads (the main thread and the iothread managing the data plane), potentially leading to multiple issues. For example, stopping and resuming a VM multiple times while the guest is generating I/O on a virtio_blk device can trigger a crash with a stack tracing looking like this one: <------> Thread 2 (Thread 0x7ff736765700 (LWP 1062503)): #0 0x00005567a13b99d6 in iov_memset (iov=0x6563617073206f4e, iov_cnt=1717922848, offset=516096, fillc=0, bytes=7018105756081554803) at util/iov.c:69 #1 0x00005567a13bab73 in qemu_iovec_memset (qiov=0x7ff73ec99748, offset=516096, fillc=0, bytes=7018105756081554803) at util/iov.c:530 #2 0x00005567a12f411c in qemu_laio_process_completion (laiocb=0x7ff6512ee6c0) at block/linux-aio.c:86 #3 0x00005567a12f42ff in qemu_laio_process_completions (s=0x7ff7182e8420) at block/linux-aio.c:217 #4 0x00005567a12f480d in ioq_submit (s=0x7ff7182e8420) at block/linux-aio.c:323 #5 0x00005567a12f43d9 in qemu_laio_process_completions_and_submit (s=0x7ff7182e8420) at block/linux-aio.c:236 #6 0x00005567a12f44c2 in qemu_laio_poll_cb (opaque=0x7ff7182e8430) at block/linux-aio.c:267 #7 0x00005567a13aed83 in run_poll_handlers_once (ctx=0x5567a2b58c70, timeout=0x7ff7367645f8) at util/aio-posix.c:520 #8 0x00005567a13aee9f in run_poll_handlers (ctx=0x5567a2b58c70, max_ns=16000, timeout=0x7ff7367645f8) at util/aio-posix.c:562 #9 0x00005567a13aefde in try_poll_mode (ctx=0x5567a2b58c70, timeout=0x7ff7367645f8) at util/aio-posix.c:597 #10 0x00005567a13af115 in aio_poll (ctx=0x5567a2b58c70, blocking=true) at util/aio-posix.c:639 #11 0x00005567a109acca in iothread_run (opaque=0x5567a2b29760) at iothread.c:75 #12 0x00005567a13b2790 in qemu_thread_start (args=0x5567a2b694c0) at util/qemu-thread-posix.c:519 #13 0x00007ff73eedf2de in start_thread () at /lib64/libpthread.so.0 #14 0x00007ff73ec10e83 in clone () at /lib64/libc.so.6 Thread 1 (Thread 0x7ff743986f00 (LWP 1062500)): #0 0x00005567a13b99d6 in iov_memset (iov=0x6563617073206f4e, iov_cnt=1717922848, offset=516096, fillc=0, bytes=7018105756081554803) at util/iov.c:69 #1 0x00005567a13bab73 in qemu_iovec_memset (qiov=0x7ff73ec99748, offset=516096, fillc=0, bytes=7018105756081554803) at util/iov.c:530 #2 0x00005567a12f411c in qemu_laio_process_completion (laiocb=0x7ff6512ee6c0) at block/linux-aio.c:86 #3 0x00005567a12f42ff in qemu_laio_process_completions (s=0x7ff7182e8420) at block/linux-aio.c:217 #4 0x00005567a12f480d in ioq_submit (s=0x7ff7182e8420) at block/linux-aio.c:323 #5 0x00005567a12f4a2f in laio_do_submit (fd=19, laiocb=0x7ff5f4ff9ae0, offset=472363008, type=2) at block/linux-aio.c:375 #6 0x00005567a12f4af2 in laio_co_submit (bs=0x5567a2b8c460, s=0x7ff7182e8420, fd=19, offset=472363008, qiov=0x7ff5f4ff9ca0, type=2) at block/linux-aio.c:394 #7 0x00005567a12f1803 in raw_co_prw (bs=0x5567a2b8c460, offset=472363008, bytes=20480, qiov=0x7ff5f4ff9ca0, type=2) at block/file-posix.c:1892 #8 0x00005567a12f1941 in raw_co_pwritev (bs=0x5567a2b8c460, offset=472363008, bytes=20480, qiov=0x7ff5f4ff9ca0, flags=0) at block/file-posix.c:1925 #9 0x00005567a12fe3e1 in bdrv_driver_pwritev (bs=0x5567a2b8c460, offset=472363008, bytes=20480, qiov=0x7ff5f4ff9ca0, qiov_offset=0, flags=0) at block/io.c:1183 #10 0x00005567a1300340 in bdrv_aligned_pwritev (child=0x5567a2b5b070, req=0x7ff5f4ff9db0, offset=472363008, bytes=20480, align=512, qiov=0x7ff72c0425b8, qiov_offset=0, flags=0) at block/io.c:1980 #11 0x00005567a1300b29 in bdrv_co_pwritev_part (child=0x5567a2b5b070, offset=472363008, bytes=20480, qiov=0x7ff72c0425b8, qiov_offset=0, flags=0) at block/io.c:2137 #12 0x00005567a12baba1 in qcow2_co_pwritev_task (bs=0x5567a2b92740, file_cluster_offset=472317952, offset=487305216, bytes=20480, qiov=0x7ff72c0425b8, qiov_offset=0, l2meta=0x0) at block/qcow2.c:2444 #13 0x00005567a12bacdb in qcow2_co_pwritev_task_entry (task=0x5567a2b48540) at block/qcow2.c:2475 #14 0x00005567a13167d8 in aio_task_co (opaque=0x5567a2b48540) at block/aio_task.c:45 #15 0x00005567a13cf00c in coroutine_trampoline (i0=738245600, i1=32759) at util/coroutine-ucontext.c:115 #16 0x00007ff73eb622e0 in __start_context () at /lib64/libc.so.6 #17 0x00007ff6626f1350 in () #18 0x0000000000000000 in () <------> This is also known to cause crashes with this message (assertion failed): aio_co_schedule: Co-routine was already scheduled in 'aio_co_schedule' RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1812765 Signed-off-by: Sergio Lopez <slp@redhat.com> Message-Id: <20200603093240.40489-3-slp@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-04-07xen-block: Fix double qlist remove and request leakAnthony PERARD
Commit a31ca6801c02 ("qemu/queue.h: clear linked list pointers on remove") revealed that a request was removed twice from a list, once in xen_block_finish_request() and a second time in xen_block_release_request() when both function are called from xen_block_complete_aio(). But also, the `requests_inflight' counter is decreased twice, and thus became negative. This is a bug that was introduced in bfd0d6366043 ("xen-block: improve response latency"), where a `finished' list was removed. That commit also introduced a leak of request in xen_block_do_aio(). That function calls xen_block_finish_request() but the request is never released after that. To fix both issue, we do two changes: - we squash finish_request() and release_request() together as we want to remove a request from 'inflight' list to add it to 'freelist'. - before releasing a request, we need to let the other end know the result, thus we should call xen_block_send_response() before releasing a request. The first change fixes the double QLIST_REMOVE() as we remove the extra call. The second change makes the leak go away because if we want to call finish_request(), we need to call a function that does all of finish, send response, and release. Fixes: bfd0d6366043 ("xen-block: improve response latency") Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <20200406140217.1441858-1-anthony.perard@citrix.com> Reviewed-by: Paul Durrant <paul@xen.org> [mreitz: Amended commit message as per Paul's suggestions] Signed-off-by: Max Reitz <mreitz@redhat.com>
2020-02-27xen-bus/block: explicitly assign event channels to an AioContextPaul Durrant
It is not safe to close an event channel from the QEMU main thread when that channel's poller is running in IOThread context. This patch adds a new xen_device_set_event_channel_context() function to explicitly assign the channel AioContext, and modifies xen_device_bind_event_channel() to initially assign the channel's poller to the QEMU main thread context. The code in xen-block's dataplane is then modified to assign the channel to IOThread context during xen_block_dataplane_start() and de-assign it during in xen_block_dataplane_stop(), such that the channel is always assigned back to main thread context before it is closed. aio_set_fd_handler() already deals with all the necessary synchronization when moving an fd between AioContext-s so no extra code is needed to manage this. Reported-by: Julien Grall <jgrall@amazon.com> Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <20191216143451.19024-1-pdurrant@amazon.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-12-19virtio-blk: fix out-of-bounds access to bitmap in notify_guest_bhLi Hangjing
When the number of a virtio-blk device's virtqueues is larger than BITS_PER_LONG, the out-of-bounds access to bitmap[ ] will occur. Fixes: e21737ab15 ("virtio-blk: multiqueue batch notify") Cc: qemu-stable@nongnu.org Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Li Hangjing <lihangjing@baidu.com> Reviewed-by: Xie Yongji <xieyongji@baidu.com> Reviewed-by: Chai Wen <chaiwen@baidu.com> Message-id: 20191216023050.48620-1-lihangjing@baidu.com Message-Id: <20191216023050.48620-1-lihangjing@baidu.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-09-03virtio-blk: Cancel the pending BH when the dataplane is resetPhilippe Mathieu-Daudé
When 'system_reset' is called, the main loop clear the memory region cache before the BH has a chance to execute. Later when the deferred function is called, some assumptions that were made when scheduling them are no longer true when they actually execute. This is what happens using a virtio-blk device (fresh RHEL7.8 install): $ (sleep 12.3; echo system_reset; sleep 12.3; echo system_reset; sleep 1; echo q) \ | qemu-system-x86_64 -m 4G -smp 8 -boot menu=on \ -device virtio-blk-pci,id=image1,drive=drive_image1 \ -drive file=/var/lib/libvirt/images/rhel78.qcow2,if=none,id=drive_image1,format=qcow2,cache=none \ -device virtio-net-pci,netdev=net0,id=nic0,mac=52:54:00:c4:e7:84 \ -netdev tap,id=net0,script=/bin/true,downscript=/bin/true,vhost=on \ -monitor stdio -serial null -nographic (qemu) system_reset (qemu) system_reset (qemu) qemu-system-x86_64: hw/virtio/virtio.c:225: vring_get_region_caches: Assertion `caches != NULL' failed. Aborted (gdb) bt Thread 1 (Thread 0x7f109c17b680 (LWP 10939)): #0 0x00005604083296d1 in vring_get_region_caches (vq=0x56040a24bdd0) at hw/virtio/virtio.c:227 #1 0x000056040832972b in vring_avail_flags (vq=0x56040a24bdd0) at hw/virtio/virtio.c:235 #2 0x000056040832d13d in virtio_should_notify (vdev=0x56040a240630, vq=0x56040a24bdd0) at hw/virtio/virtio.c:1648 #3 0x000056040832d1f8 in virtio_notify_irqfd (vdev=0x56040a240630, vq=0x56040a24bdd0) at hw/virtio/virtio.c:1662 #4 0x00005604082d213d in notify_guest_bh (opaque=0x56040a243ec0) at hw/block/dataplane/virtio-blk.c:75 #5 0x000056040883dc35 in aio_bh_call (bh=0x56040a243f10) at util/async.c:90 #6 0x000056040883dccd in aio_bh_poll (ctx=0x560409161980) at util/async.c:118 #7 0x0000560408842af7 in aio_dispatch (ctx=0x560409161980) at util/aio-posix.c:460 #8 0x000056040883e068 in aio_ctx_dispatch (source=0x560409161980, callback=0x0, user_data=0x0) at util/async.c:261 #9 0x00007f10a8fca06d in g_main_context_dispatch () at /lib64/libglib-2.0.so.0 #10 0x0000560408841445 in glib_pollfds_poll () at util/main-loop.c:215 #11 0x00005604088414bf in os_host_main_loop_wait (timeout=0) at util/main-loop.c:238 #12 0x00005604088415c4 in main_loop_wait (nonblocking=0) at util/main-loop.c:514 #13 0x0000560408416b1e in main_loop () at vl.c:1923 #14 0x000056040841e0e8 in main (argc=20, argv=0x7ffc2c3f9c58, envp=0x7ffc2c3f9d00) at vl.c:4578 Fix this by cancelling the BH when the virtio dataplane is stopped. [This is version of the patch was modified as discussed with Philippe on the mailing list thread. --Stefan] Reported-by: Yihuang Yu <yihyu@redhat.com> Suggested-by: Stefan Hajnoczi <stefanha@redhat.com> Fixes: https://bugs.launchpad.net/qemu/+bug/1839428 Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190816171503.24761-1-philmd@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-08-16Include qemu/main-loop.h lessMarkus Armbruster
In my "build everything" tree, changing qemu/main-loop.h triggers a recompile of some 5600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). It includes block/aio.h, which in turn includes qemu/event_notifier.h, qemu/notify.h, qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h, qemu/thread.h, qemu/timer.h, and a few more. Include qemu/main-loop.h only where it's needed. Touching it now recompiles only some 1700 objects. For block/aio.h and qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the others, they shrink only slightly. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-21-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16Include hw/hw.h exactly where neededMarkus Armbruster
In my "build everything" tree, changing hw/hw.h triggers a recompile of some 2600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). The previous commits have left only the declaration of hw_error() in hw/hw.h. This permits dropping most of its inclusions. Touching it now recompiles less than 200 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190812052359.30071-19-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-06-24xen-bus / xen-block: add support for event channel pollingPaul Durrant
This patch introduces a poll callback for event channel fd-s and uses this to invoke the channel callback function. To properly support polling, it is necessary for the event channel callback function to return a boolean saying whether it has done any useful work or not. Thus xen_block_dataplane_event() is modified to directly invoke xen_block_handle_requests() and the latter only returns true if it actually processes any requests. This also means that the call to qemu_bh_schedule() is moved into xen_block_complete_aio(), which is more intuitive since the only reason for doing a deferred poll of the shared ring should be because there were previously insufficient resources to fully complete a previous poll. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <20190408151617.13025-4-paul.durrant@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-06-24xen-bus: allow AioContext to be specified for each event channelPaul Durrant
This patch adds an AioContext parameter to xen_device_bind_event_channel() and then uses aio_set_fd_handler() to set the callback rather than qemu_set_fd_handler(). Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <20190408151617.13025-3-paul.durrant@citrix.com> [Call aio_set_fd_handler() with is_external=true] Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-06-24xen-block: support feature-large-sector-sizePaul Durrant
A recent Xen commit [1] clarified the semantics of sector based quantities used in the blkif protocol such that it is now safe to create a xen-block device with a logical_block_size != 512, as long as the device only connects to a frontend advertizing 'feature-large-block-size'. This patch modifies xen-block accordingly. It also uses a stack variable for the BlockBackend in xen_block_realize() to avoid repeated dereferencing of the BlockConf pointer, and changes the parameters of xen_block_dataplane_create() so that the BlockBackend pointer and sector size are passed expicitly rather than implicitly via the BlockConf. These modifications have been tested against a recent Windows PV XENVBD driver [2] using a xen-disk device with a 4kB logical block size. [1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=67e1c050e36b2c9900cca83618e56189effbad98 [2] https://winpvdrvbuild.xenproject.org:8080/job/XENVBD-master/126 Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <20190409164038.25484-1-paul.durrant@citrix.com> [Edited error message] Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-06-04block: Add Error to blk_set_aio_context()Kevin Wolf
Add an Error parameter to blk_set_aio_context() and use bdrv_child_try_set_aio_context() internally to check whether all involved nodes can actually support the AioContext switch. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-04-04xen-block: scale sector based quantities correctlyPaul Durrant
The Xen blkif protocol requires that sector based quantities should be interpreted strictly as multiples of 512 bytes. Specifically: "first_sect and last_sect in blkif_request_segment, as well as sector_number in blkif_request, are always expressed in 512-byte units." Commit fcab2b464e06 "xen: add header and build dataplane/xen-block.c" incorrectly modified behaviour to use the block device logical_block_size property as the scale, instead of correctly shifting values by the hardcoded BDRV_SECTOR_BITS (and hence scaling them to 512 byte units). This patch undoes that change and restores compliance with the spec. Furthermore, this patch also restores the original xen_disk behaviour of advertizing a hardcoded 'sector-size' value of 512 in xenstore and scaling 'sectors' accordingly. The realize() method is also modified to fail if logical_block_size is set to anything other than 512. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <20190401121719.27208-1-paul.durrant@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-03-22trace-events: Shorten file names in commentsMarkus Armbruster
We spell out sub/dir/ in sub/dir/trace-events' comments pointing to source files. That's because when trace-events got split up, the comments were moved verbatim. Delete the sub/dir/ part from these comments. Gets rid of several misspellings. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190314180929.27722-3-armbru@redhat.com Message-Id: <20190314180929.27722-3-armbru@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-03-07block: fix recursion in hw/block/dataplanePaolo Bonzini
There are Xen files in hw/block/dataplane that should be compiled even if virtio-blk is disabled. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-28dataplane/xen-block: remove dead codePaul Durrant
The if() statement is clearly bogus (dead code which should have been cleaned up when grant mapping was removed). Spotted by Coverity: CID 1398635 While in the neighbourhood, add a missing 'fall through' annotation. Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <20190215162533.19475-2-paul.durrant@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-02-04xen-block: handle resize callbackPaul Durrant
Some frontend drivers will handle dynamic resizing of PV disks, so set up the BlockDevOps resize_cb() method during xen_block_realize() to allow this to be done. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14xen-block: avoid repeated memory allocationTim Smith
The xen-block dataplane currently allocates memory to hold the data for each request as that request is used, and frees it afterwards. Because it requires page-aligned blocks, this interacts poorly with non-page- aligned allocations and balloons the heap. Instead, allocate the maximum possible buffer size required for the protocol, which is BLKIF_MAX_SEGMENTS_PER_REQUEST (currently 11) pages when the request structure is created, and keep that buffer until it is destroyed. Since the requests are re-used via a free list, this should actually improve memory usage. Signed-off-by: Tim Smith <tim.smith@citrix.com> Re-based and commit comment adjusted. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14xen-block: improve response latencyTim Smith
If the I/O ring is full, the guest cannot send any more requests until some responses are sent. Only sending all available responses just before checking for new work does not leave much time for the guest to supply new work, so this will cause stalls if the ring gets full. Also, not completing reads as soon as possible adds latency to the guest. To alleviate that, complete IO requests as soon as they come back. xen_block_send_response() already returns a value indicating whether a notify should be sent, which is all the batching we need. Signed-off-by: Tim Smith <tim.smith@citrix.com> Re-based and commit comment adjusted. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14xen-block: improve batching behaviourTim Smith
When I/O consists of many small requests, performance is improved by batching them together in a single io_submit() call. When there are relatively few requests, the extra overhead is not worth it. This introduces a check to start batching I/O requests via blk_io_plug()/ blk_io_unplug() in an amount proportional to the number which were already in flight at the time we started reading the ring. Signed-off-by: Tim Smith <tim.smith@citrix.com> Re-based and commit comment adjusted. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14xen: purge 'blk' and 'ioreq' from function names in dataplane/xen-block.cPaul Durrant
This is a purely cosmetic patch that purges remaining use of 'blk' and 'ioreq' in local function names, and then makes sure all functions are prefixed with 'xen_block_'. No functional change. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14xen: remove 'ioreq' struct/varable/field names from dataplane/xen-block.cPaul Durrant
This is a purely cosmetic patch that purges the name 'ioreq' from struct, variable and field names. (This name has been problematic for a long time as 'ioreq' is the name used for generic I/O requests coming from Xen). The patch replaces 'struct ioreq' with a new 'XenBlockRequest' type and 'ioreq' field/variable names with 'request', and then does necessary fix-up to adhere to coding style. Function names are not modified by this patch. They will be dealt with in a subsequent patch. No functional change. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14xen: remove 'XenBlkDev' and 'blkdev' names from dataplane/xen-blockPaul Durrant
This is a purely cosmetic patch that substitutes the old 'struct XenBlkDev' name with 'XenBlockDataPlane' and 'blkdev' field/variable names with 'dataplane', and then does necessary fix-up to adhere to coding style. No functional change. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14xen: add header and build dataplane/xen-block.cPaul Durrant
This patch adds the transformations necessary to get dataplane/xen-block.c to build against the new XenBus/XenDevice framework. MAINTAINERS is also updated due to the introduction of dataplane/xen-block.h. NOTE: Existing data structure names are retained for the moment. These will be modified by subsequent patches. A typedef for XenBlockDataPlane has been added to the header (based on the old struct XenBlkDev name for the moment) so that the old names don't need to leak out of the dataplane code. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14xen: remove unnecessary code from dataplane/xen-block.cPaul Durrant
Not all of the code duplicated from xen_disk.c is required as the basis for the new dataplane implementation so this patch removes extraneous code, along with the legacy #includes and calls to the legacy xen_pv_printf() function. Error messages are changed to be reported using error_report(). NOTE: The code is still not yet built. Further transformations will be required to make it correctly interface to the new XenBus/XenDevice framework. They will be delivered in a subsequent patch. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14xen: duplicate xen_disk.c as basis of dataplane/xen-block.cPaul Durrant
The new xen-block XenDevice implementation requires the same core dataplane as the legacy xen_disk implementation it will eventually replace. This patch therefore copies the legacy xen_disk.c source module into a new dataplane/xen-block.c source module as the basis for the new dataplane and adjusts the MAINTAINERS file accordingly. NOTE: The duplicated code is not yet built. It is simply put into place by this patch (just fixing style violations) such that the modifications that will need to be made to the code are not conflated with code movement, thus making review harder. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2018-06-28Replace '-enable-kvm' with '-accel kvm' in docs and help textsThomas Huth
The preferred way to select the KVM accelerator is to use "-accel kvm" these days, so let's be consistent in our documentation and help texts. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <1528866321-23886-3-git-send-email-thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-01hw: Do not include "sysemu/block-backend.h" if it is not necessaryPhilippe Mathieu-Daudé
Remove those unneeded includes to speed up the compilation process a little bit. (Continue 7eceff5b5a1fa cleanup) Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20180528232719.4721-13-f4bug@amsat.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-08virtio-blk: fix race between .ioeventfd_stop() and vq handlerStefan Hajnoczi
If the main loop thread invokes .ioeventfd_stop() just as the vq handler function begins in the IOThread then the handler may lose the race for the AioContext lock. By the time the vq handler is able to acquire the AioContext lock the ioeventfd has already been removed and the handler isn't supposed to run anymore! Use the new aio_wait_bh_oneshot() function to perform ioeventfd removal from within the IOThread. This way no races with the vq handler are possible. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 20180307144205.20619-3-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-08virtio-blk: dataplane: Don't batch notifications if EVENT_IDX is presentSergio Lopez
Commit 5b2ffbe4d99843fd8305c573a100047a8c962327 ("virtio-blk: dataplane: notify guest as a batch") deferred guest notification to a BH in order batch notifications, with purpose of avoiding flooding the guest with interruptions. This optimization came with a cost. The average latency perceived in the guest is increased by a few microseconds, but also when multiple IO operations finish at the same time, the guest won't be notified until all completions from each operation has been run. On the contrary, virtio-scsi issues the notification at the end of each completion. On the other hand, nowadays we have the EVENT_IDX feature that allows a better coordination between QEMU and the Guest OS to avoid sending unnecessary interruptions. With this change, virtio-blk/dataplane only batches notifications if the EVENT_IDX feature is not present. Some numbers obtained with fio (ioengine=sync, iodepth=1, direct=1): - Test specs: * fio-3.4 (ioengine=sync, iodepth=1, direct=1) * qemu master * virtio-blk with a dedicated iothread (default poll-max-ns) * backend: null_blk nr_devices=1 irqmode=2 completion_nsec=280000 * 8 vCPUs pinned to isolated physical cores * Emulator and iothread also pinned to separate isolated cores * variance between runs < 1% - Not patched * numjobs=1: lat_avg=327.32 irqs=29998 * numjobs=4: lat_avg=337.89 irqs=29073 * numjobs=8: lat_avg=342.98 irqs=28643 - Patched: * numjobs=1: lat_avg=323.92 irqs=30262 * numjobs=4: lat_avg=332.65 irqs=29520 * numjobs=8: lat_avg=335.54 irqs=29323 Signed-off-by: Sergio Lopez <slp@redhat.com> Message-id: 20180307114459.26636-1-slp@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-02-08virtio: remove event notifier cleanup call on de-assignGal Hammer
The virtio_bus_set_host_notifier function no longer calls event_notifier_cleanup when a event notifier is removed. The commit updates the code to match the new behavior and calls virtio_bus_cleanup_host_notifier after the notifier was de-assign and no longer in use. This change is a preparation to allow executing the virtio_bus_set_host_notifier function in a memory region transaction. Signed-off-by: Gal Hammer <ghammer@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Tested-by: Greg Kurz <groug@kaod.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-12-19hw/block: Fix the return typeMao Zhongyi
When the function no success value to transmit, it usually make the function return void. It has turned out not to be a success, because it means that the extra local_err variable and error_propagate() will be needed. It leads to cumbersome code, therefore, transmit success/ failure in the return value is worth. So fix the return type of blkconf_apply_backend_options(), blkconf_geometry() and virtio_blk_data_plane_create() to avoid it. Cc: John Snow <jsnow@redhat.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Max Reitz <mreitz@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: ac0edc1fc70c4457e5cec94405eb7d1f89f9c2c1.1511317952.git.maozy.fnst@cn.fujitsu.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-07-31docs: fix broken paths to docs/devel/tracing.txtPhilippe Mathieu-Daudé
With the move of some docs/ to docs/devel/ on ac06724a71, no references were updated. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-02-17virtio: Report real progress in VQ aio poll handlerFam Zheng
In virtio_queue_host_notifier_aio_poll, not all "!virtio_queue_empty()" cases are making true progress. Currently the offending one is virtio-scsi event queue, whose handler does nothing if no event is pending. As a result aio_poll() will spin on the "non-empty" VQ and take 100% host CPU. Fix this by reporting actual progress from virtio queue aio handlers. Reported-by: Ed Swierk <eswierk@skyportsystems.com> Signed-off-by: Fam Zheng <famz@redhat.com> Tested-by: Ed Swierk <eswierk@skyportsystems.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-31trace: clean up trace-events filesStefan Hajnoczi
There are a number of unused trace events that scripts/cleanup-trace-events.pl finds. The "hw/vfio/pci-quirks.c" filename was typoed and "qapi/qapi-visit-core.c" was missing the qapi/ directory prefix. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20170126171613.1399-3-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-01-31trace: move hw/block/dataplane events to correct subdirDaniel P. Berrange
The trace-events for a given source file should generally always live in the same directory as the source file. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170125161417.31949-3-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-11-18virtio: set ISR on dataplane notificationsPaolo Bonzini
Dataplane has been omitting forever the step of setting ISR when an interrupt is raised. This caused little breakage, because the specification actually says that ISR may not be updated in MSI mode. Some versions of the Windows drivers however didn't clear MSI mode correctly, and proceeded using polling mode (using ISR, not the used ring index!) for crashdump and hibernation. If it were just crashdump and hibernation it would not be a big deal, but recent releases of Windows do not really shut down, but rather log out and hibernate to make the next startup faster. Hence, this manifested as a more serious hang during shutdown with e.g. Windows 8.1 and virtio-win 1.8.0 RPMs. Newer versions fixed this, while older versions do not use MSI at all. The failure has always been there for virtio dataplane, but it became visible after commits 9ffe337 ("virtio-blk: always use dataplane path if ioeventfd is active", 2016-10-30) and ad07cd6 ("virtio-scsi: always use dataplane path if ioeventfd is active", 2016-10-30) made virtio-blk and virtio-scsi always use the dataplane code under KVM. The good news therefore is that it was not a bug in the patches---they were doing exactly what they were meant for, i.e. shake out remaining dataplane bugs. The fix is not hard, so it's worth arranging for the broken drivers. The virtio_should_notify+event_notifier_set pair that is common to virtio-blk and virtio-scsi dataplane is replaced with a new public function virtio_notify_irqfd that also sets ISR. The irqfd emulation code now need not set ISR anymore, so virtio_irq is removed. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Farhan Ali <alifm@linux.vnet.ibm.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30virtio-blk: always use dataplane path if ioeventfd is activePaolo Bonzini
Override start_ioeventfd and stop_ioeventfd to start/stop the whole dataplane logic. This has some positive side effects: - no need anymore for virtio_add_queue_aio (i.e. a revert of commit 0ff841f6d138904d514efa1d885bcaf54583852d) - no need anymore to switch from generic ioeventfd handlers to dataplane It detects some errors better: $ qemu-system-x86_64 -object iothread,id=io \ -drive id=null,file=null-aio://,if=none,format=raw \ -device virtio-blk-pci,ioeventfd=off,iothread=io,drive=null qemu-system-x86_64: -device virtio-blk-pci,ioeventfd=off,iothread=io,drive=null: ioeventfd is required for iothread while previously it would have started just fine. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30virtio: move ioeventfd_started flag to VirtioBusStatePaolo Bonzini
This simplifies the code and removes the ioeventfd_started and ioeventfd_set_started callback. The only difference is in how virtio-ccw handles an error---it doesn't disable ioeventfd forever anymore. It was the only backend to do so, and if desired this behavior should be implemented in virtio-bus.c. Instead of ioeventfd_started, the ioeventfd_assign callback now determines whether the virtio bus supports host notifiers. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-18virtio-blk: dataplane cleanupCao jin
No need duplicate the judgment, there is one in function entry. Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Max Reitz <mreitz@redhat.com> Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1468814749-14510-1-git-send-email-caoj.fnst@cn.fujitsu.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-28virtio-blk: dataplane multiqueue supportStefan Hajnoczi
Monitor ioeventfds for all virtqueues in the device's AioContext. This is not true multiqueue because requests from all virtqueues are processed in a single IOThread. In the future it will be possible to use multiple IOThreads when the QEMU block layer supports multiqueue. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1466511196-12612-7-git-send-email-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-28virtio-blk: tell dataplane which vq to notifyStefan Hajnoczi
Let the virtio_blk_data_plane_notify() caller decide which virtqueue to notify. This will allow the function to be used with multiqueue. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1466511196-12612-4-git-send-email-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-28virtio-blk: multiqueue batch notifyStefan Hajnoczi
The batch notification BH needs to know which virtqueues to notify when multiqueue is enabled. Use a bitmap to track the virtqueues with pending notifications. At this point there is only one virtqueue so hard-code virtqueue index 0. A later patch will switch to real virtqueue indices. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1466511196-12612-3-git-send-email-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-24virtio-bus: remove old set_host_notifier callbackCornelia Huck
All users have been converted to the new ioevent callbacks. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24virtio-bus: have callers tolerate new host notifier apiCornelia Huck
Have vhost and dataplane use the new api for transports that have been converted. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-07virtio-blk: Remove op blocker for dataplaneFam Zheng
Block layer is prepared to unspecialize dataplane, an evidence is this almost complete list of unblocked operations. It has all types except two (actually three if DATAPLANE itself counts but blockdev.c makes sure attaching twice is not possible): MIRROR_TARGET and BACKUP_TARGET. blockdev-mirror refuses to start if target is attached, so the first is not a problem. By removing BACKUP_TARGET, blockdev-backup will become permissive to write to a virtio-blk dataplane disk, but that is not worse than non-dataplane given the latter is already possible. In either case, blockdev.c always checks the target and source are on the same AioContext, or bring them together if possible. Signed-off-by: Fam Zheng <famz@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-id: 1463969978-24970-4-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-04-07virtio: merge virtio_queue_aio_set_host_notifier_handler with ↵Paolo Bonzini
virtio_queue_set_aio Eliminating the reentrancy is actually a nice thing that we can do with the API that Michael proposed, so let's make it first class. This also hides the complex assign/set_handler conventions from callers of virtio_queue_aio_set_host_notifier_handler, which in fact was always called with assign=true. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-04-07virtio-blk: use aio handler for data planeMichael S. Tsirkin
In addition to handling IO in vcpu thread and in io thread, dataplane introduces yet another mode: handling it by AioContext. This reuses the same handler as previous modes, which triggers races as these were not designed to be reentrant. Use a separate handler just for aio, and disable regular handlers when dataplane is active. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>