aboutsummaryrefslogtreecommitdiff
path: root/block
AgeCommit message (Collapse)Author
2016-05-19block: Introduce BdrvChild.opaqueKevin Wolf
BlockBackends use it to get a back pointer from BdrvChild to BlockBackend in any BdrvChildRole callbacks. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19block: Move I/O throttling configuration functions to BlockBackendKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19block: Move actual I/O throttling to BlockBackendKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19block: Move throttling fields from BDS to BBKevin Wolf
This patch changes where the throttling state is stored (used to be the BlockDriverState, now it is the BlockBackend), but it doesn't actually make it a BB level feature yet. For example, throttling is still disabled when the BDS is detached from the BB. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19block: Convert throttle_group_get_name() to BlockBackendKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19block: throttle-groups: Use BlockBackend pointers internallyKevin Wolf
As a first step towards moving I/O throttling to the BlockBackend level, this patch changes all pointers in struct ThrottleGroup from referencing a BlockDriverState to referencing a BlockBackend. This change is valid because we made sure that throttling can only be enabled on BDSes which have a BB attached. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19block: Introduce BlockBackendPublicKevin Wolf
Some features, like I/O throttling, are implemented outside block-backend.c, but still want to keep information in BlockBackend, e.g. list entries that allow keeping a list of BlockBackends. In order to avoid exposing the whole struct layout in the public header file, this patch introduces an embedded public struct where such information can be added and a pair of functions to convert between BlockBackend and BlockBackendPublic. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19block: Make sure throttled BDSes always have a BBKevin Wolf
It was already true in principle that a throttled BDS always has a BB attached, except that the order of operations while attaching or detaching a BDS to/from a BB wasn't careful enough. This commit breaks graph manipulations while I/O throttling is enabled. It would have been possible to keep things working with some temporary hacks, but quite cumbersome, so it's not worth the hassle. We'll fix things again in a minute. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-12Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into stagingPeter Maydell
Block layer patches # gpg: Signature made Thu 12 May 2016 14:37:05 BST using RSA key ID C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" * remotes/kevin/tags/for-upstream: (69 commits) qemu-iotests: iotests: fail hard if not run via "check" block: enable testing of LUKS driver with block I/O tests block: add support for encryption secrets in block I/O tests block: add support for --image-opts in block I/O tests qemu-io: Add 'write -z -u' to test MAY_UNMAP flag qemu-io: Add 'write -f' to test FUA flag qemu-io: Allow unaligned access by default qemu-io: Use bool for command line flags qemu-io: Make 'open' subcommand more like command line qemu-io: Add missing option documentation qmp: add monitor command to add/remove a child quorum: implement bdrv_add_child() and bdrv_del_child() Add new block driver interface to add/delete a BDS's child qemu-img: check block status of backing file when converting. iotests: fix the redirection order in 083 block: Inactivate all children block: Drop superfluous invalidating bs->file from drivers block: Invalidate all children nbd: Simplify client FUA handling block: Honor BDRV_REQ_FUA during write_zeroes ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12quorum: implement bdrv_add_child() and bdrv_del_child()Wen Congyang
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com> Message-id: 1462865799-19402-3-git-send-email-xiecl.fnst@cn.fujitsu.com Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12block: Drop superfluous invalidating bs->file from driversFam Zheng
Now they are invalidated by the block layer, so it's not necessary to do this in block drivers' implementations of .bdrv_invalidate_cache. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12nbd: Simplify client FUA handlingEric Blake
Now that the block layer honors per-bds FUA support, we don't have to duplicate the fallback flush at the NBD layer. The static function nbd_co_writev_flags() is no longer needed, and the driver can just directly use nbd_client_co_writev(). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Honor BDRV_REQ_FUA during write_zeroesEric Blake
The block layer has a couple of cases where it can lose Force Unit Access semantics when writing a large block of zeroes, such that the request returns before the zeroes have been guaranteed to land on underlying media. SCSI does not support FUA during WRITESAME(10/16); FUA is only supported if it falls back to WRITE(10/16). But where the underlying device is new enough to not need a fallback, it means that any upper layer request with FUA semantics was silently ignoring BDRV_REQ_FUA. Conversely, NBD has situations where it can support FUA but not ZERO_WRITE; when that happens, the generic block layer fallback to bdrv_driver_pwritev() (or the older bdrv_co_writev() in qemu 2.6) was losing the FUA flag. The problem of losing flags unrelated to ZERO_WRITE has been latent in bdrv_co_do_write_zeroes() since commit aa7bfbff, but back then, it did not matter because there was no FUA flag. It became observable when commit 93f5e6d8 paved the way for flags that can impact correctness, when we should have been using bdrv_co_writev_flags() with modified flags. Compare to commit 9eeb6dd, which got flag manipulation right in bdrv_co_do_zero_pwritev(). Symptoms: I tested with qemu-io with default writethrough cache (which is supposed to use FUA semantics on every write), and targetted an NBD client connected to a server that intentionally did not advertise NBD_FLAG_SEND_FUA. When doing 'write 0 512', the NBD client sent two operations (NBD_CMD_WRITE then NBD_CMD_FLUSH) to get the fallback FUA semantics; but when doing 'write -z 0 512', the NBD client sent only NBD_CMD_WRITE. The fix is do to a cleanup bdrv_co_flush() at the end of the operation if any step in the middle relied on a BDS that does not natively support FUA for that step (note that we don't need to flush after every operation, if the operation is broken into chunks based on bounce-buffer sizing). Each BDS gains a new flag .supported_zero_flags, which parallels the use of .supported_write_flags but only when accessing a zero write operation (the flags MUST be different, because of SCSI having different semantics based on WRITE vs. WRITESAME; and also because BDRV_REQ_MAY_UNMAP only makes sense on zero writes). Also fix some documentation to describe -ENOTSUP semantics, particularly since iscsi depends on those semantics. Down the road, we may want to add a driver where its .bdrv_co_pwritev() honors all three of BDRV_REQ_FUA, BDRV_REQ_ZERO_WRITE, and BDRV_REQ_MAY_UNMAP, and advertise this via bs->supported_write_flags for blocks opened by that driver; such a driver should NOT supply .bdrv_co_write_zeroes nor .supported_zero_flags. But none of the drivers touched in this patch want to do that (the act of writing zeroes is different enough from normal writes to deserve a second callback). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Make supported_write_flags a per-bds propertyEric Blake
Pre-patch, .supported_write_flags lives at the driver level, which means we are blindly declaring that all block devices using a given driver will either equally support FUA, or that we need a fallback at the block layer. But there are drivers where FUA support is a per-block decision: the NBD block driver is dependent on the remote server advertising NBD_FLAG_SEND_FUA (and has fallback code to duplicate the flush that the block layer would do if NBD had not set .supported_write_flags); and the iscsi block driver is dependent on the mode sense bits advertised by the underlying device (and is currently silently ignoring FUA requests if the underlying device does not support FUA). The fix is to make supported flags as a per-BDS option, set during .bdrv_open(). This patch moves the variable and fixes NBD and iscsi to set it only conditionally; later patches will then further simplify the NBD driver to quit duplicating work done at the block layer, as well as tackle the fact that SCSI does not support FUA semantics on WRITESAME(10/16) but only on WRITE(10/16). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12qcow2: improve qcow2_co_write_zeroes()Denis V. Lunev
There is a possibility that qcow2_co_write_zeroes() will be called with the partial block. This could be synthetically triggered with qemu-io -c "write -z 32k 4k" and can happen in the real life in qemu-nbd. The latter happens under the following conditions: (1) qemu-nbd is started with --detect-zeroes=on and is connected to the kernel NBD client (2) third party program opens kernel NBD device with O_DIRECT (3) third party program performs write operation with memory buffer not aligned to the page In this case qcow2_co_write_zeroes() is unable to perform the operation and mark entire cluster as zeroed and returns ENOTSUP. Thus the caller switches to non-optimized version and writes real zeroes to the disk. The patch creates a shortcut. If the block is read as zeroes, f.e. if it is unallocated, the request is extended to cover full block. User-visible situation with this block is not changed. Before the patch the block is filled in the image with real zeroes. After that patch the block is marked as zeroed in metadata. Thus any subsequent changes in backing store chain are not affected. Kevin, thank you for a cool suggestion. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Kill unused sector-based blk_* functionsEric Blake
Now that there are no remaining clients, we can drop the sector-based blk_read(), blk_write(), blk_aio_readv(), and blk_aio_writev(). Sadly, there are still remaining sector-based interfaces, such as blk_*discard(), or blk_write_compressed(); those will have to wait for another day. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Introduce byte-based aio read/writeEric Blake
blk_aio_readv() and blk_aio_writev() are annoying in that they can't access sub-sector granularity, and cannot pass flags. Also, they require the caller to pass redundant information about the size of the I/O (qiov->size in bytes must match nb_sectors in sectors). Add new blk_aio_preadv() and blk_aio_pwritev() functions to fix the flaws. The next few patches will upgrade callers, then finally delete the old interfaces. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Switch blk_*write_zeroes() to byte interfaceEric Blake
Sector-based blk_write() should die; convert the one-off variant blk_write_zeroes() to use an offset/count interface instead. Likewise for blk_co_write_zeroes() and blk_aio_write_zeroes(). Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Switch blk_read_unthrottled() to byte interfaceEric Blake
Sector-based blk_read() should die; convert the one-off variant blk_read_unthrottled(). Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Allow BDRV_REQ_FUA through blk_pwrite()Eric Blake
We have several block drivers that understand BDRV_REQ_FUA, and emulate it in the block layer for the rest by a full flush. But without a way to actually request BDRV_REQ_FUA during a pass-through blk_pwrite(), FUA-aware block drivers like NBD are forced to repeat the emulation logic of a full flush regardless of whether the backend they are writing to could do it more efficiently. This patch just wires up a flags argument; followup patches will actually make use of it in the NBD driver and in qemu-io. Signed-off-by: Eric Blake <eblake@redhat.com> Acked-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12Allow users to specify the vmdk virtual hardware version.Janne Karhunen
Vmdk images have metadata to indicate the vmware virtual hardware version image was created/tested to run with. Allow users to specify that version via new 'hwversion' option. [ kwolf: Adjust qemu-iotests common.filter ] Signed-off-by: Janne Karhunen <Janne.Karhunen@gmail.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: always compile-check debug printsZhou Jie
Files with conditional debug statements should ensure that the printf is always compiled. This prevents bitrot of the format string of the debug statement. And switch debug output to stderr. Signed-off-by: Zhou Jie <zhoujie2011@cn.fujitsu.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Remove BlockDriver.bdrv_read/writeKevin Wolf
There are no block drivers left that implement the old .bdrv_read/write interface, so it can be removed now. This gets us rid of the corresponding emulation functions, too. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12vvfat: Implement .bdrv_co_preadv/pwritev interfacesKevin Wolf
This doesn't really convert any of the actual vvfat logic to use vectored I/O (and it's doubtful whether that would make sense), but instead just adapts the wrappers to the modern interface. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12vpc: Implement .bdrv_co_pwritev() interfaceKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12vpc: Implement .bdrv_co_preadv() interfaceKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12vmdk: Implement .bdrv_co_pwritev() interfaceKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12vmdk: Implement .bdrv_co_preadv() interfaceKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12vmdk: Add vmdk_find_offset_in_cluster()Kevin Wolf
This is a byte granularity version of vmdk_find_index_in_cluster(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12vdi: Implement .bdrv_co_pwritev() interfaceKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12vdi: Implement .bdrv_co_preadv() interfaceKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12dmg: Implement .bdrv_co_preadv() interfaceKevin Wolf
This implements .bdrv_co_preadv() for the cloop block driver. While updating the error paths, change -1 to a valid -errno code. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12cloop: Implement .bdrv_co_preadv() interfaceKevin Wolf
This implements .bdrv_co_preadv() for the cloop block driver. While updating the error paths, change -1 to a valid -errno code. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12bochs: Implement .bdrv_co_preadv() interfaceKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12block: Introduce .bdrv_co_preadv/pwritev BlockDriver functionKevin Wolf
Many parts of the block layer are already byte granularity. The block driver interface, however, was still missing an interface that allows making use of this. This patch introduces a new BlockDriver interface, which is based on coroutines, vectored, has flags and uses a byte granularity. This is now the preferred interface for new drivers. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12block: Rename bdrv_co_do_preadv/writev to bdrv_co_preadv/writevKevin Wolf
It used to be an internal helper function just for implementing bdrv_co_do_readv/writev(), but now that it's a public interface, it deserves a name without "do" in it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12block: Support AIO drivers in bdrv_driver_preadv/pwritev()Kevin Wolf
Instead of registering emulation functions as .bdrv_co_writev, just directly check whether the function is there or not, and use the AIO interface if it isn't. This makes the read/write functions more consistent with how things are done in other places (flush, discard, etc.) Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12block: Introduce bdrv_driver_pwritev()Kevin Wolf
This is a function that simply calls into the block driver for doing a write, providing the byte granularity interface we want to eventually have everywhere, and using whatever interface that driver supports. This one is a bit more interesting than the version for reads: It adds support for .bdrv_co_writev_flags() everywhere, so that drivers implementing this function can drop .bdrv_co_writev() now. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12block: Introduce bdrv_driver_preadv()Kevin Wolf
This is a function that simply calls into the block driver for doing a read, providing the byte granularity interface we want to eventually have everywhere, and using whatever interface that driver supports. For now, this is just a wrapper for calling bs->drv->bdrv_co_readv(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12linux-aio: make it more type safePaolo Bonzini
Replace void* with an opaque LinuxAioState type. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: plug whole tree at once, introduce bdrv_io_unplugged_begin/endPaolo Bonzini
Extract the handling of io_plug "depth" from linux-aio.c and let the main bdrv_drain loop do nothing but wait on I/O. Like the two newly introduced functions, bdrv_io_plug and bdrv_io_unplug now operate on all children. The visit order is now symmetrical between plug and unplug, making it possible for formats to implement plug/unplug. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: introduce bdrv_no_throttling_begin/endPaolo Bonzini
Extract the handling of throttling from bdrv_flush_io_queue. These new functions will soon become BdrvChildRole callbacks, as they can be generalized to "beginning of drain" and "end of drain". Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: extract bdrv_drain_poll/bdrv_co_yield_to_drain from ↵Paolo Bonzini
bdrv_drain/bdrv_co_drain Do not call bdrv_drain_recurse twice in bdrv_co_drain. A small tweak to the logic in Fam's patch, which is harmless since no one implements bdrv_drain anyway. But better get it right. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: move restarting of throttled reqs to block/throttle-groups.cPaolo Bonzini
We want to remove throttled_reqs from block/io.c. This is the easy part---hide the handling of throttled_reqs during disable/enable of throttling within throttle-groups.c. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: make bdrv_start_throttled_reqs return voidPaolo Bonzini
The return value is unused and I am not sure why it would be useful. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Don't disable I/O throttling on sync requestsKevin Wolf
We had to disable I/O throttling with synchronous requests because we didn't use to run timers in nested event loops when the code was introduced. This isn't true any more, and throttling works just fine even when using the synchronous API. The removed code is in fact dead code since commit a8823a3b ('block: Use blk_co_pwritev() for blk_write()') because I/O throttling can only be set on the top layer, but BlockBackend always uses the coroutine interface now instead of using the sync API emulation in block.c. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <1458660792-3035-2-git-send-email-kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12qapi: Split visit_end_struct() into piecesEric Blake
As mentioned in previous patches, we want to call visit_end_struct() functions unconditionally, so that visitors can release resources tied up since the matching visit_start_struct() without also having to worry about error priority if more than one error occurs. Even though error_propagate() can be safely used to ignore a second error during cleanup caused by a first error, it is simpler if the cleanup cannot set an error. So, split out the error checking portion (basically, input visitors checking for unvisited keys) into a new function visit_check_struct(), which can be safely skipped if any earlier errors are encountered, and leave the cleanup portion (which never fails, but must be called unconditionally if visit_start_struct() succeeded) in visit_end_struct(). Generated code in qapi-visit.c has diffs resembling: |@@ -59,10 +59,12 @@ void visit_type_ACPIOSTInfo(Visitor *v, | goto out_obj; | } | visit_type_ACPIOSTInfo_members(v, obj, &err); |- error_propagate(errp, err); |- err = NULL; |+ if (err) { |+ goto out_obj; |+ } |+ visit_check_struct(v, &err); | out_obj: |- visit_end_struct(v, &err); |+ visit_end_struct(v); | out: and in qapi-event.c: @@ -47,7 +47,10 @@ void qapi_event_send_acpi_device_ost(ACP | goto out; | } | visit_type_q_obj_ACPI_DEVICE_OST_arg_members(v, &param, &err); |- visit_end_struct(v, err ? NULL : &err); |+ if (!err) { |+ visit_check_struct(v, &err); |+ } |+ visit_end_struct(v); | if (err) { | goto out; Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1461879932-9020-20-git-send-email-eblake@redhat.com> [Conflict with a doc fixup resolved] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-04-29vvfat: Fix default volume labelKevin Wolf
Commit d5941dd documented that it leaves the default volume name as it was ("QEMU VVFAT"), but it doesn't actually implement this. You get an empty name (eleven space characters) instead. This fixes the implementation to apply the advertised default. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-04-29vvfat: Fix volume name assertionKevin Wolf
Commit d5941dd made the volume name configurable, but it didn't consider that the rw code compares the volume name string to assert that the first directory entry is the volume name. This made vvfat crash in rw mode. This fixes the assertion to compare with the configured volume name instead of a literal string. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-04-22mirror: Workaround for unexpected iohandler events during completionFam Zheng
Commit 5a7e7a0ba moved mirror_exit to a BH handler but didn't add any protection against new requests that could sneak in just before the BH is dispatched. For example (assuming a code base at that commit): main_loop_wait # 1 os_host_main_loop_wait g_main_context_dispatch aio_ctx_dispatch aio_dispatch ... mirror_run bdrv_drain (a) block_job_defer_to_main_loop qemu_iohandler_poll virtio_queue_host_notifier_read ... virtio_submit_multiwrite (b) blk_aio_multiwrite main_loop_wait # 2 <snip> aio_dispatch aio_bh_poll (c) mirror_exit At (a) we know the BDS has no pending request. However, the same main_loop_wait call is going to dispatch iohandlers (EventNotifier events), which may lead to a new I/O from guest. So the invariant is already broken at (c). Data loss. Commit f3926945c8 made iohandler to use aio API. The order of virtio_queue_host_notifier_read and block_job_defer_to_main_loop within a main_loop_wait becomes unpredictable, and even worse, if the host notifier event arrives at the next main_loop_wait call, the unpredictable order between mirror_exit and virtio_queue_host_notifier_read is also a trouble. As shown below, this commit made the bug easier to trigger: - Bug case 1: main_loop_wait # 1 os_host_main_loop_wait g_main_context_dispatch aio_ctx_dispatch (qemu_aio_context) ... mirror_run bdrv_drain (a) block_job_defer_to_main_loop aio_ctx_dispatch (iohandler_ctx) virtio_queue_host_notifier_read ... virtio_submit_multiwrite (b) blk_aio_multiwrite main_loop_wait # 2 ... aio_dispatch aio_bh_poll (c) mirror_exit - Bug case 2: main_loop_wait # 1 os_host_main_loop_wait g_main_context_dispatch aio_ctx_dispatch (qemu_aio_context) ... mirror_run bdrv_drain (a) block_job_defer_to_main_loop main_loop_wait # 2 ... aio_ctx_dispatch (iohandler_ctx) virtio_queue_host_notifier_read ... virtio_submit_multiwrite (b) blk_aio_multiwrite aio_dispatch aio_bh_poll (c) mirror_exit In both cases, (b) breaks the invariant wanted by (a) and (c). Until then, the request loss has been silent. Later, 3f09bfbc7be added asserts at (c) to check the invariant (in bdrv_replace_in_backing_chain), and Max reported an assertion failure first visible there, by doing active committing while the guest is running bonnie++. 2.5 added bdrv_drained_begin at (a) to protect the dataplane case from similar problems, but we never realize the main loop bug until now. As a bandage, this patch disables iohandler's external events temporarily together with bs->ctx. Launchpad Bug: 1570134 Cc: qemu-stable@nongnu.org Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>