aboutsummaryrefslogtreecommitdiff
path: root/block/block-backend.c
AgeCommit message (Collapse)Author
2016-05-25block: Make blk_co_preadv/pwritev() publicKevin Wolf
Also add trace points now that the function can be directly called. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>
2016-05-25block: Default to enabled write cache in blk_new()Kevin Wolf
The existing users of the function are: 1. blk_new_open(), which already enabled the write cache 2. Some test cases that don't care about the setting 3. blockdev_init() for empty drives, where the cache mode is overridden with the value from the options when a medium is inserted Therefore, this patch doesn't change the current behaviour. It will be convenient, however, for additional users of blk_new() (like block jobs) if the most sensible WCE setting is the default. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>
2016-05-25block: Rename blk_write_zeroes()Eric Blake
Commit 983a1600 changed the semantics of blk_write_zeroes() to be byte-based rather than sector-based, but did not change the name, which is an open invitation for other code to misuse the function. Renaming to pwrite_zeroes() makes it more in line with other byte-based interfaces, and will help make it easier to track which remaining write_zeroes interfaces still need conversion. Reported-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25block: Fix reconfiguring graph with drained nodesKevin Wolf
When changing the BlockDriverState that a BdrvChild points to while the node is currently drained, we must call the .drained_end() parent callback. Conversely, when this means attaching a new node that is already drained, we need to call .drained_begin(). bdrv_root_attach_child() takes now an opaque parameter, which is needed because the callbacks must also be called if we're attaching a new child to the BlockBackend when the root node is already drained, and they need a way to identify the BlockBackend. Previously, child->opaque was set too late and the callbacks would still see it as NULL. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-25block: Drop errp parameter from blk_new()Max Reitz
blk_new() cannot fail so its Error ** parameter has become superfluous. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25block: Make bdrv_open() return a BDSMax Reitz
There are no callers to bdrv_open() or bdrv_open_inherit() left that pass a pointer to a non-NULL BDS pointer as the first argument of these functions, so we can finally drop that parameter and just make them return the new BDS. Generally, the following pattern is applied: bs = NULL; ret = bdrv_open(&bs, ..., &local_err); if (ret < 0) { error_propagate(errp, local_err); ... } by bs = bdrv_open(..., errp); if (!bs) { ret = -EINVAL; ... } Of course, there are only a few instances where the pattern is really pure. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25block: Drop blk_new_with_bs()Max Reitz
Its only caller is blk_new_open(), so we can just inline it there. The bdrv_new_root() call is dropped in the process because we can just let bdrv_open() create the BDS. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25block: Fix bdrv_next() memory leakKevin Wolf
The bdrv_next() users all leaked the BdrvNextIterator after completing the iteration. Simply changing bdrv_next() to free the iterator before returning NULL at the end of list doesn't work because some callers exit the loop before looking at all BDSes. This patch moves the BdrvNextIterator from the heap to the stack of the caller and switches to a bdrv_first()/bdrv_next() interface for initialising the iterator. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-19block: Remove BlockDriverState.blkKevin Wolf
This patch removes the remaining users of bs->blk, which will allow us to have multiple BBs on top of a single BDS. In the meantime, all checks that are currently in place to prevent the user from creating such setups can be switched to bdrv_has_blk() instead of accessing BDS.blk. Future patches can allow them and e.g. enable users to mirror to a block device that already has a BlockBackend on it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-19block: Avoid bs->blk in bdrv_next()Kevin Wolf
We need to introduce a separate BdrvNextIterator struct that can keep more state than just the current BDS in order to avoid using the bs->blk pointer. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-19block: Add bdrv_has_blk()Kevin Wolf
In many cases we just want to know whether a BDS has at least one BB attached, without needing to know the exact BB that is attached. In contrast to bs->blk, this is still a valid question when more than one BB can be attached, so just answer it by checking the parents list. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-19block: Remove bdrv_aio_multiwrite()Kevin Wolf
Since virtio-blk implements request merging itself these days, the only remaining users are test cases for the function. That doesn't make the function exactly useful any more. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2016-05-19block: User BdrvChild callback for device nameKevin Wolf
In order to get rid of bs->blk for bdrv_get_device_name() and bdrv_get_device_or_node_name(), ask all parents for their name and simply pick the first one. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-19block: Use BdrvChild callbacks for change_media/resizeKevin Wolf
We want to get rid of BlockDriverState.blk in order to allow multiple BlockBackends per BDS. Converting the device callbacks in block.c (which assume a single BlockBackend) to per-child callbacks gets us rid of the first few instances. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-19block: Decouple throttling from BlockDriverStateKevin Wolf
This moves the throttling related part of the BDS life cycle management to BlockBackend. The throttling group reference is now kept even when no medium is inserted. With this commit, throttling isn't disabled and then re-enabled any more during graph reconfiguration. This fixes the temporary breakage of I/O throttling when used with live snapshots or block jobs that manipulate the graph. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19block: Drain throttling queue with BdrvChild callbackKevin Wolf
This removes the last part of I/O throttling from block/io.c and moves it to the BlockBackend. Instead of having knowledge about throttling inside io.c, we can call a BdrvChild callback .drained_begin/end, which happens to drain the throttled requests for BlockBackend parents. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19block: Introduce BdrvChild.opaqueKevin Wolf
BlockBackends use it to get a back pointer from BdrvChild to BlockBackend in any BdrvChildRole callbacks. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19block: Move I/O throttling configuration functions to BlockBackendKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19block: Move actual I/O throttling to BlockBackendKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19block: Move throttling fields from BDS to BBKevin Wolf
This patch changes where the throttling state is stored (used to be the BlockDriverState, now it is the BlockBackend), but it doesn't actually make it a BB level feature yet. For example, throttling is still disabled when the BDS is detached from the BB. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19block: Convert throttle_group_get_name() to BlockBackendKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19block: Introduce BlockBackendPublicKevin Wolf
Some features, like I/O throttling, are implemented outside block-backend.c, but still want to keep information in BlockBackend, e.g. list entries that allow keeping a list of BlockBackends. In order to avoid exposing the whole struct layout in the public header file, this patch introduces an embedded public struct where such information can be added and a pair of functions to convert between BlockBackend and BlockBackendPublic. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19block: Make sure throttled BDSes always have a BBKevin Wolf
It was already true in principle that a throttled BDS always has a BB attached, except that the order of operations while attaching or detaching a BDS to/from a BB wasn't careful enough. This commit breaks graph manipulations while I/O throttling is enabled. It would have been possible to keep things working with some temporary hacks, but quite cumbersome, so it's not worth the hassle. We'll fix things again in a minute. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-12block: Kill unused sector-based blk_* functionsEric Blake
Now that there are no remaining clients, we can drop the sector-based blk_read(), blk_write(), blk_aio_readv(), and blk_aio_writev(). Sadly, there are still remaining sector-based interfaces, such as blk_*discard(), or blk_write_compressed(); those will have to wait for another day. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Introduce byte-based aio read/writeEric Blake
blk_aio_readv() and blk_aio_writev() are annoying in that they can't access sub-sector granularity, and cannot pass flags. Also, they require the caller to pass redundant information about the size of the I/O (qiov->size in bytes must match nb_sectors in sectors). Add new blk_aio_preadv() and blk_aio_pwritev() functions to fix the flaws. The next few patches will upgrade callers, then finally delete the old interfaces. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Switch blk_*write_zeroes() to byte interfaceEric Blake
Sector-based blk_write() should die; convert the one-off variant blk_write_zeroes() to use an offset/count interface instead. Likewise for blk_co_write_zeroes() and blk_aio_write_zeroes(). Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Switch blk_read_unthrottled() to byte interfaceEric Blake
Sector-based blk_read() should die; convert the one-off variant blk_read_unthrottled(). Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Allow BDRV_REQ_FUA through blk_pwrite()Eric Blake
We have several block drivers that understand BDRV_REQ_FUA, and emulate it in the block layer for the rest by a full flush. But without a way to actually request BDRV_REQ_FUA during a pass-through blk_pwrite(), FUA-aware block drivers like NBD are forced to repeat the emulation logic of a full flush regardless of whether the backend they are writing to could do it more efficiently. This patch just wires up a flags argument; followup patches will actually make use of it in the NBD driver and in qemu-io. Signed-off-by: Eric Blake <eblake@redhat.com> Acked-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12block: Rename bdrv_co_do_preadv/writev to bdrv_co_preadv/writevKevin Wolf
It used to be an internal helper function just for implementing bdrv_co_do_readv/writev(), but now that it's a public interface, it deserves a name without "do" in it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12block: introduce bdrv_no_throttling_begin/endPaolo Bonzini
Extract the handling of throttling from bdrv_flush_io_queue. These new functions will soon become BdrvChildRole callbacks, as they can be generalized to "beginning of drain" and "end of drain". Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-04-15block: Don't ignore flags in blk_{,co,aio}_write_zeroes()Kevin Wolf
Commit 57d6a428 neglected to pass the given flags to blk_aio_prwv(), which broke discard by WRITE SAME for scsi-disk (the UNMAP bit would be ignored). Commit fc1453cd introduced the same bug for blk_write_zeroes(). This is used for 'qemu-img convert' without has_zero_init (e.g. on a block device) and for preallocation=falloc in parallels. Commit 8896e088 is the version for blk_co_write_zeroes(). This function is only used in qemu-io. Reported-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2016-04-15block: Fix blk_aio_write_zeroes()Kevin Wolf
Commit 57d6a428 broke blk_aio_write_zeroes() because in some write functions in the call path don't have an explicit length argument but reuse qiov->size instead. Which is great, except that write_zeroes doesn't have a qiov, which this commit interprets as 0 bytes. Consequently, blk_aio_write_zeroes() didn't effectively do anything. This patch introduces an explicit acb->bytes in BlkAioEmAIOCB and uses that instead of acb->rwco.size. The synchronous version of the function is okay because it does pass a qiov (with the right size and a NULL pointer as its base). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-30block: Remove BDRV_O_CACHE_WBKevin Wolf
The previous patches have successively made blk->enable_write_cache the true source for the information whether a writethrough mode must be implemented. The corresponding BDRV_O_CACHE_WB is only useless baggage we're carrying around, so now's the time to remove it. At the same time, we remove the 'cache.writeback' option parsing on the BDS level as the only effect was setting the BDRV_O_CACHE_WB flag. This change requires test cases that explicitly enabled the option to drop it. Other than that and the change of the error message when writethrough is enabled on the BDS level (from "Can't set writethrough mode" to "doesn't support the option"), there should be no change in behaviour. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-30block: Move enable_write_cache to BB levelKevin Wolf
Whether a write cache is used or not is a decision that concerns the user (e.g. the guest device) rather than the backend. It was already logically part of the BB level as bdrv_move_feature_fields() always kept it on top of the BDS tree; with this patch, the core of it (the actual flag and the additional flushes) is also implemented there. Direct callers of bdrv_open() must pass BDRV_O_CACHE_WB now if bs doesn't have a BlockBackend attached. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-30block: Always set writeback mode in blk_new_open()Kevin Wolf
All callers of blk_new_open() either don't rely on the WCE bit set after blk_new_open() because they explicitly set it anyway, or they pass BDRV_O_CACHE_WB unconditionally. This patch changes blk_new_open() so that it always enables writeback mode and asserts that BDRV_O_CACHE_WB is clear. For those callers that used to pass BDRV_O_CACHE_WB unconditionally, the flag is removed now. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-30block: Remove blk_set_bs()Kevin Wolf
The function is unused since commit f21d96d0 ('block: Use BdrvChild in BlockBackend'). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2016-03-22util: move declarations out of qemu-common.hVeronia Bahaa
Move declarations out of qemu-common.h for functions declared in utils/ files: e.g. include/qemu/path.h for utils/path.c. Move inline functions out of qemu-common.h and into new files (e.g. include/qemu/bcd.h) Signed-off-by: Veronia Bahaa <veroniabahaa@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-17block: Use blk_co_pwritev() in blk_co_write_zeroes()Kevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17block: Use blk_aio_prwv() for aio_read/write/write_zeroesKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17block: Use blk_prw() in blk_pread()/blk_pwrite()Kevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17block: Use blk_co_pwritev() in blk_write_zeroes()Kevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17block: Pull up blk_read_unthrottled() implementationKevin Wolf
Use blk_read(), so that it goes through blk_co_preadv() like all read requests from the BB to the BDS. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17block: Use blk_co_pwritev() for blk_write()Kevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17block: Use blk_co_preadv() for blk_read()Kevin Wolf
This patch introduces blk_co_preadv() as a central function on the BlockBackend level that is supposed to handle all read requests from the BB to its root BDS eventually. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17block: Use BdrvChild in BlockBackendKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17block: Add blk_next_root_bs()Max Reitz
This function iterates over all BDSs attached to a BB. We are going to need it when rewriting bdrv_next() so it no longer uses bdrv_states. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17block: Move some bdrv_*_all() functions to BBMax Reitz
Move bdrv_commit_all() and bdrv_flush_all() to the BlockBackend level. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17blockdev: Remove blk_hide_on_behalf_of_hmp_drive_del()Max Reitz
We can basically inline it in hmp_drive_del(); monitor_remove_blk() is called already, so we just need to call bdrv_make_anon(), too. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17blockdev: Split monitor reference from BB creationMax Reitz
Before this patch, blk_new() automatically assigned a name to the new BlockBackend and considered it referenced by the monitor. This patch removes the implicit monitor_add_blk() call from blk_new() (and consequently the monitor_remove_blk() call from blk_delete(), too) and thus blk_new() (and related functions) no longer take a BB name argument. In fact, there is only a single point where blk_new()/blk_new_open() is called and the new BB is monitor-owned, and that is in blockdev_init(). Besides thus relieving us from having to invent names for all of the BBs we use in qemu-img, this fixes a bug where qemu cannot create a new image if there already is a monitor-owned BB named "image". If a BB and its BDS tree are created in a single operation, as of this patch the BDS tree will be created before the BB is given a name (whereas it was the other way around before). This results in minor change to the output of iotest 087, whose reference output is amended accordingly. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17blockdev: Separate BB name managementMax Reitz
Introduce separate functions (monitor_add_blk() and monitor_remove_blk()) which set or unset a BB name. Since the name is equivalent to the monitor's reference to a BB, adding a name the same as declaring the BB to be monitor-owned and removing it revokes this status, hence the function names. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>