aboutsummaryrefslogtreecommitdiff
path: root/include/block
AgeCommit message (Collapse)Author
2015-06-15Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into stagingPeter Maydell
Block layer core and image format patches # gpg: Signature made Fri Jun 12 16:08:53 2015 BST using RSA key ID C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" * remotes/kevin/tags/for-upstream: (25 commits) block: Fix reopen flag inheritance block: Add BlockDriverState.inherits_from block: Add list of children to BlockDriverState queue.h: Add QLIST_FIX_HEAD_PTR() block: Drain requests before swapping nodes in bdrv_swap() block: Move flag inheritance to bdrv_open_inherit() block: Use QemuOpts in bdrv_open_common() block: Use macro for cache option names vmdk: Use bdrv_open_image() quorum: Use bdrv_open_image() check-qdict: Test cases for new functions qdict: Add qdict_{set,copy}_default() qdict: Add qdict_array_entries() iotests: Add tests for overriding BDRV_O_PROTOCOL block: driver should override flags in bdrv_open() block: Change bitmap truncate conditional to assertion block: record new size in bdrv_dirty_bitmap_truncate raw-posix: Fix .bdrv_co_get_block_status() for unaligned image size vmdk: Use vmdk_find_index_in_cluster everywhere vmdk: Fix index_in_cluster calculation in vmdk_co_get_block_status ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-06-12Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into ↵Peter Maydell
staging # gpg: Signature made Fri Jun 12 15:57:47 2015 BST using RSA key ID 81AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" * remotes/stefanha/tags/block-pull-request: qemu-iotests: expand test 093 to support group throttling throttle: Update throttle infrastructure copyright throttle: add the name of the ThrottleGroup to BlockDeviceInfo throttle: acquire the ThrottleGroup lock in bdrv_swap() throttle: Add throttle group support throttle: Add throttle group infrastructure tests throttle: Add throttle group infrastructure throttle: Extract timers from ThrottleState into a separate structure raw-posix: Fix .bdrv_co_get_block_status() for unaligned image size Revert "iothread: release iothread around aio_poll" Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-06-12block: Add BlockDriverState.inherits_fromKevin Wolf
Currently, the block layer assumes that any block node can have only one parent, and if it has a parent, that it inherits some options/flags from this parent. This is not true any more: With references used in block device creation, a single node can be used by multiple parents, or it can be created separately and not inherit flags from any parent. To handle reopens correctly, a node must know from which parent it inherited options. This patch adds the information to BlockDriverState. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-06-12block: Add list of children to BlockDriverStateKevin Wolf
This allows iterating over all children of a given BDS, not only including bs->file and bs->backing_hd, but also driver-specific ones like VMDK extents or Quorum children. For bdrv_swap(), the list of children of the swapped BDS stays at that BDS (because that's where the pointers stay as well). The list head moves and pointers to it must be fixed up therefore. The list of children in the parent of the swapped BDS is not affected by the swap. The contents of the BDS objects is swapped, so the existing pointer in the parent automatically points to the newly swapped in BDS. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-06-12block: Move flag inheritance to bdrv_open_inherit()Kevin Wolf
Instead of letting every caller of bdrv_open() determine the right flags for its child node manually and pass them to the function, pass the parent node and the role of the newly opened child (like backing file, protocol layer, etc.). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-06-12block: Use macro for cache option namesKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>
2015-06-12throttle: acquire the ThrottleGroup lock in bdrv_swap()Alberto Garcia
bdrv_swap() touches the fields of a BlockDriverState that are protected by the ThrottleGroup lock. Although those fields end up in their original place, they are temporarily swapped in the process, so there's a chance that an operation on a member of the same group happening on a different thread can try to use them. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: d92dc40d7c4f1fc5cda5cbbf4ffb7a4670b79d17.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12throttle: Add throttle group supportAlberto Garcia
The throttle group support use a cooperative round robin scheduling algorithm. The principles of the algorithm are simple: - Each BDS of the group is used as a token in a circular way. - The active BDS computes if a wait must be done and arms the right timer. - If a wait must be done the token timer will be armed so the token will become the next active BDS. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: f0082a86f3ac01c46170f7eafe2101a92e8fde39.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12throttle: Add throttle group infrastructureAlberto Garcia
Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 2fdb4de17210b733a13eb472c33cd08b45f8fd21.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12throttle: Extract timers from ThrottleState into a separate structureBenoît Canet
Group throttling will share ThrottleState between multiple bs. As a consequence the ThrottleState will be accessed by multiple aio context. Timers are tied to their aio context so they must go out of the ThrottleState structure. This commit paves the way for each bs of a common ThrottleState to have its own timer. Signed-off-by: Benoit Canet <benoit.canet@nodalink.com> Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 6cf9ea96d8b32ae2f8769cead38f68a6a0c8c909.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12main-loop: Drop qemu_set_fd_handler2Fam Zheng
All users are converted to qemu_set_fd_handler now, drop qemu_set_fd_handler2 and IOHandlerRecord.fd_read_poll. Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 1433400324-7358-9-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-29qapi: add dirty bitmap statusJohn Snow
Bitmaps can be in a handful of different states with potentially more to come as we tool around with migration and persistence patches. Management applications may need to know why certain bitmaps are unavailable for various commands, e.g. busy in another operation, busy being migrated, etc. Right now, all we offer is BlockDirtyInfo's boolean member 'frozen'. Instead of adding more booleans, replace it by an enumeration member 'status' with values 'active' and 'frozen'. Then add new value 'disabled'. Incompatible change. Fine because the changed part hasn't been released so far. Suggested-by: Eric Blake <eblake@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> [Commit message tweaked] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2015-05-22block: minimal bounce buffer alignmentDenis V. Lunev
The patch introduces new concept: minimal memory alignment for bounce buffers. Original so called "optimal" value is actually minimal required value for aligment. It should be used for validation that the IOVec is properly aligned and bounce buffer is not required. Though, from the performance point of view, it would be better if bounce buffer or IOVec allocated by QEMU will be aligned stricter. The patch does not change any alignment value yet. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1431441056-26198-2-git-send-email-den@openvz.org CC: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-04-28block: extract bdrv_setup_io_funcs()Stefan Hajnoczi
Move the code to install coroutine and aio emulation function pointers in a BlockDriver to its own function. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block: add bdrv_set_dirty()/bdrv_reset_dirty() to block_int.hStefan Hajnoczi
The dirty bitmap functions are called from the block I/O processing code. Make them visible to block_int.h users so they can be used outside block.c. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block: Ensure consistent bitmap function prototypesJohn Snow
We often don't need the BlockDriverState for functions that operate on bitmaps. Remove it. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-15-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28qmp: add block-dirty-bitmap-clearJohn Snow
Add bdrv_clear_dirty_bitmap and a matching QMP command, qmp_block_dirty_bitmap_clear that enables a user to reset the bitmap attached to a drive. This allows us to reset a bitmap in the event of a full drive backup. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1429314609-29776-12-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28qmp: Add support of "dirty-bitmap" sync mode for drive-backupJohn Snow
For "dirty-bitmap" sync mode, the block job will iterate through the given dirty bitmap to decide if a sector needs backup (backup all the dirty clusters and skip clean ones), just as allocation conditions of "top" sync mode. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-11-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block: Add bitmap successorsJohn Snow
A bitmap successor is an anonymous BdrvDirtyBitmap that is intended to be created just prior to a sensitive operation (e.g. Incremental Backup) that can either succeed or fail, but during the course of which we still want a bitmap tracking writes. On creating a successor, we "freeze" the parent bitmap which prevents its deletion, enabling, anonymization, or creating a bitmap with the same name. On success, the parent bitmap can "abdicate" responsibility to the successor, which will inherit its name. The successor will have been tracking writes during the course of the backup operation. The parent will be safely deleted. On failure, we can "reclaim" the successor from the parent, unifying them such that the resulting bitmap describes all writes occurring since the last successful backup, for instance. Reclamation will thaw the parent, but not explicitly re-enable it. BdrvDirtyBitmap operations that target a single bitmap are protected by assertions that the bitmap is not frozen and/or disabled. BdrvDirtyBitmap operations that target a group of bitmaps, such as bdrv_{set,reset}_dirty will ignore frozen/disabled drives with a conditional instead. Internal functions that enable/disable dirty bitmaps have assertions added to them to prevent modifying frozen bitmaps. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1429314609-29776-10-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block: Add bitmap disabled statusJohn Snow
Add a status indicating the enabled/disabled state of the bitmap. A bitmap is by default enabled, but you can lock the bitmap into a read-only state by setting disabled = true. A previous version of this patch added a QMP interface for changing the state of the bitmap, but it has since been removed for now until a use case emerges where this state must be revealed to the user. The disabled state WILL be used internally for bitmap migration and bitmap persistence. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-9-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block: Introduce bdrv_dirty_bitmap_granularity()John Snow
This returns the granularity (in bytes) of dirty bitmap, which matches the QMP interface and the existing query interface. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-6-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28qmp: Add block-dirty-bitmap-add and block-dirty-bitmap-removeJohn Snow
The new command pair is added to manage a user created dirty bitmap. The dirty bitmap's name is mandatory and must be unique for the same device, but different devices can have bitmaps with the same names. The granularity is an optional field. If it is not specified, we will choose a default granularity based on the cluster size if available, clamped to between 4K and 64K to mirror how the 'mirror' code was already choosing granularity. If we do not have cluster size info available, we choose 64K. This code has been factored out into a helper shared with block/mirror. This patch also introduces the 'block_dirty_bitmap_lookup' helper, which takes a device name and a dirty bitmap name and validates the lookup, returning NULL and setting errp if there is a problem with either field. This helper will be re-used in future patches in this series. The types added to block-core.json will be re-used in future patches in this series, see: 'qapi: Add transaction support to block-dirty-bitmap-{add, enable, disable}' Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1429314609-29776-5-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28qmp: Ensure consistent granularity typeJohn Snow
We treat this field with a variety of different types everywhere in the code. Now it's just uint32_t. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-4-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28qapi: Add optional field "name" to block dirty bitmapFam Zheng
This field will be set for user created dirty bitmap. Also pass in an error pointer to bdrv_create_dirty_bitmap, so when a name is already taken on this BDS, it can report an error message. This is not global check, two BDSes can have dirty bitmap with a common name. Implemented bdrv_find_dirty_bitmap to find a dirty bitmap by name, will be used later when other QMP commands want to reference dirty bitmap by name. Add bdrv_dirty_bitmap_make_anon. This unsets the name of dirty bitmap. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1429314609-29776-3-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28qmp: fill in the image field in BlockDeviceInfoAlberto Garcia
The image field in BlockDeviceInfo is supposed to contain an ImageInfo object. However that is being filled in by bdrv_query_info(), not by bdrv_block_device_info(), which is where BlockDeviceInfo is actually created. Anyone calling bdrv_block_device_info() directly will get a null image field. As a consequence of this, the HMP command 'info block -n -v' crashes QEMU. This patch moves the code that fills in that field from bdrv_query_info() to bdrv_block_device_info(). Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 1429271563-3765-1-git-send-email-berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28block: add bdrv_get_device_or_node_name()Alberto Garcia
This function gets the device name associated with a BlockDriverState, or its node name if the device name is empty. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 4fa30aa8d61d9052ce266fd5429a59a14e941255.1428485266.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28blockjob: Allow nested pauseFam Zheng
This patch changes block_job_pause to increase the pause counter and block_job_resume to decrease it. The counter will allow calling block_job_pause/block_job_resume unconditionally on a job when we need to suspend the IO temporarily. From now on, each block_job_resume must be paired with a block_job_pause to keep the counter balanced. The user pause from QMP or HMP will only trigger block_job_pause once until it's resumed, this is achieved by adding a user_paused flag in BlockJob. One occurrence of block_job_resume in mirror_complete is replaced with block_job_enter which does what is necessary. In block_job_cancel, the cancel flag is good enough to instruct coroutines to quit loop, so use block_job_enter to replace the unpaired block_job_resume. Upon block job IO error, user is notified about the entering to the pause state, so this pause belongs to user pause, set the flag accordingly and expect a matching QMP resume. [Extended doc comments as suggested by Paolo Bonzini <pbonzini@redhat.com>. --Stefan] Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 1428069921-2957-2-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28AioContext: acquire/release AioContext during aio_pollPaolo Bonzini
This is the first step in pushing down acquire/release, and will let rfifolock drop the contention callback feature. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1424449612-18215-3-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28aio-posix: move pollfds to thread-local storagePaolo Bonzini
By using thread-local storage, aio_poll can stop using global data during g_poll_ns. This will make it possible to drop callbacks from rfifolock. [Moved npfd = 0 assignment to end of walking_handlers region as suggested by Paolo. This resolves the assert(npfd == 0) assertion failure in pollfds_cleanup(). --Stefan] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1424449612-18215-2-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-03-18nbd: Set block size to BDRV_SECTOR_SIZEMax Reitz
Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <1424887718-10800-13-git-send-email-mreitz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-03-18nbd: Fix potential signed overflow issuesMax Reitz
Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <1424887718-10800-11-git-send-email-mreitz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-03-18nbd: Handle blk_getlength() failureMax Reitz
Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <1424887718-10800-9-git-send-email-mreitz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-03-16block: Drop bdrv_findFam Zheng
All callers are converted, so drop it. Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 1425296209-1476-5-git-send-email-famz@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2015-03-10block: add bdrv functions for geometry and blocksizeEkaterina Tumanova
Add driver functions for geometry and blocksize detection Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1424087278-49393-2-git-send-email-tumanova@linux.vnet.ibm.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-03-10qcow2: Allow creation with refcount order != 4Max Reitz
Add a creation option to qcow2 for setting the refcount order of images to be created, and respect that option's value. This breaks some test outputs, fix them. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-03-09sheepdog: selectable object size supportTeruaki Ishizaki
Previously, qemu block driver of sheepdog used hard-coded VDI object size. This patch enables users to handle VDI object size. When you start qemu, you don't need to specify additional command option. But when you create the VDI which doesn't have default object size with qemu-img command, you specify object_size option. If you want to create a VDI of 8MB object size, you need to specify following command option. # qemu-img create -o object_size=8M sheepdog:test1 100M In addition, when you don't specify qemu-img command option, a default value of sheepdog cluster is used for creating VDI. # qemu-img create sheepdog:test2 100M Signed-off-by: Teruaki Ishizaki <ishizaki.teruaki@lab.ntt.co.jp> Acked-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-03-09coroutine: Clean up qemu_coroutine_enter()Kevin Wolf
qemu_coroutine_enter() is now the only user of coroutine_swap(). Both functions are short, so inline it. Also, using COROUTINE_YIELD is now even more confusing because this code is never called during qemu_coroutine_yield() any more. In fact, this value is never read back, so we can just introduce a new COROUTINE_ENTER which documents the purpose of the task switch better. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2015-02-27block: Forbid bdrv_set_aio_context outside BQLFam Zheng
Even if the caller has both the old and the new AioContext's, there can be a deadlock, due to the leading bdrv_drain_all. Suppose there are four io threads (A, B, A0, B0) with A and B owning a BDS for each (bs_a, bs_b); Now A wants to move bs_a to iothread A0, and B wants to move bs_b to B0, at the same time: iothread A iothread B -------------------------------------------------------------------------- aio_context_acquire(A0) /* OK */ aio_context_acquire(B0) /* OK */ bdrv_set_aio_context(bs_a, A0) bdrv_set_aio_context(bs_b, B0) -> bdrv_drain_all() -> bdrv_drain_all() -> acquire A /* OK */ -> acquire A /* blocked */ -> acquire B /* blocked */ -> acquire B ... ... Deadlock happens because A is waiting for B, and B is waiting for A. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1423969591-23646-2-git-send-email-famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-02-16block: Remove "growable" from BDSMax Reitz
Now that request clamping is done in the BlockBackend, the "growable" field can be removed from the BlockDriverState. All BDSs are now treated as being "growable" (that is, they are allowed to grow; they are not necessarily actually able to). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1423162705-32065-16-git-send-email-mreitz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-02-16block: Add Error parameter to bdrv_find_protocol()Max Reitz
The argument given to bdrv_find_protocol() is just a file name, which makes it difficult for the caller to reconstruct what protocol bdrv_find_protocol() was hoping to find. This patch adds an Error parameter to that function to solve this issue. Suggested-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1423162705-32065-4-git-send-email-mreitz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-02-16nbd: Drop BDS backpointerMax Reitz
Before this patch, the "opaque" pointer in an NBD BDS points to a BDRVNBDState, which contains an NbdClientSession object, which in turn contains a pointer to the BDS. This pointer may become invalid due to bdrv_swap(), so drop it, and instead pass the BDS directly to the nbd-client.c functions which then retrieve the NbdClientSession object from there. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1423256778-3340-2-git-send-email-mreitz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-02-06block: New bdrv_add_key(), convert monitor to use itMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1422524221-8566-4-git-send-email-armbru@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2015-02-06block: introduce BDRV_REQUEST_MAX_SECTORSPeter Lieven
we check and adjust request sizes at several places with sometimes inconsistent checks or default values: INT_MAX INT_MAX >> BDRV_SECTOR_BITS UINT_MAX >> BDRV_SECTOR_BITS SIZE_MAX >> BDRV_SECTOR_BITS This patches introdocues a macro for the maximal allowed sectors per request and uses it at several places. Signed-off-by: Peter Lieven <pl@kamp.de> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-02-06nbd: Improve error messagesMax Reitz
This patch makes use of the Error object for nbd_receive_negotiate() so that errors during negotiation look nicer. Furthermore, this patch adds an additional error message if the received magic was wrong, but would be correct for the other protocol version, respectively: So if an export name was specified, but the NBD server magic corresponds to an old handshake, this condition is explicitly signaled to the user, and vice versa. As these messages are now part of the "Could not open image" error message, additional filtering has to be employed in iotest 083, which this patch does as well. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-02-06block: add event when disk usage exceeds thresholdFrancesco Romani
Managing applications, like oVirt (http://www.ovirt.org), make extensive use of thin-provisioned disk images. To let the guest run smoothly and be not unnecessarily paused, oVirt sets a disk usage threshold (so called 'high water mark') based on the occupation of the device, and automatically extends the image once the threshold is reached or exceeded. In order to detect the crossing of the threshold, oVirt has no choice but aggressively polling the QEMU monitor using the query-blockstats command. This lead to unnecessary system load, and is made even worse under scale: deployments with hundreds of VMs are no longer rare. To fix this, this patch adds: * A new monitor command `block-set-write-threshold', to set a mark for a given block device. * A new event `BLOCK_WRITE_THRESHOLD', to report if a block device usage exceeds the threshold. * A new `write_threshold' field into the `BlockDeviceInfo' structure, to report the configured threshold. This will allow the managing application to use smarter and more efficient monitoring, greatly reducing the need of polling. [Updated qemu-iotests 067 output to add the new 'write_threshold' property. --Stefan] [Changed g_assert_false() to !g_assert() to fix the build on older glib versions. --Kevin] Signed-off-by: Francesco Romani <fromani@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1421068273-692-1-git-send-email-fromani@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-02-06block: add accounting for merged requestsPeter Lieven
Signed-off-by: Peter Lieven <pl@kamp.de> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-01-23block: update string sizes for filename,backing_file,exact_filenameJeff Cody
The string field entries 'filename', 'backing_file', and 'exact_filename' in the BlockDriverState struct are defined as 1024 bytes. However, many places that use these values accept a maximum of PATH_MAX bytes, so we have a mixture of 1024 byte and PATH_MAX byte allocations. This patch makes the BlockDriverStruct field string sizes match usage. This patch also does a few fixes related to the size that needs to happen now: * the block qapi driver is updated to use PATH_MAX bytes * the qcow and qcow2 drivers have an additional safety check * the block vvfat driver is updated to use PATH_MAX bytes for the size of backing_file, for systems where PATH_MAX is < 1024 bytes. * qemu-img uses PATH_MAX rather than 1024. These instances were not changed to be dynamically allocated, however, as the extra temporary 3K in stack usage for qemu-img does not seem worrisome. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-01-14Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell
Mostly bugfixes and cleanups from qemu-devel. Yet another small patch from the record/replay series, and a few SCSI and i386 patches as well. # gpg: Signature made Wed 14 Jan 2015 09:39:14 GMT using RSA key ID 78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: cpus: consistently use QEMU_CLOCK_VIRTUAL_RT for icount_warp_rt timer qemu-timer: rename timer_init to timer_init_tl scsi: fix cancellation when I/O was completed but DMA was not. rules.mak: Fix module build hw/scsi/lsi53c895a: add support for additional diag / debug registers qemu-common.h: optimise muldiv64 if int128 is available target-i386: do not memcpy in and out of xmm_regs target-i386: fix movntsd on big-endian hosts vl.c: fix regression when reading memory size from config file vl: Don't silently change topology when all -smp options were set vl: fix max_cpus check vl: Avoid unnecessary 'if' nesting 9pfs: changed to use event_notifier instead of qemu_pipe vl.c: fix regression when reading machine type from config file char: restore stdio echo on resume from suspend. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-01-14qemu-timer: rename timer_init to timer_init_tlPaolo Bonzini
timer_init is not called that often. Free the name for an equivalent of timer_new. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-01-13block: Split BLOCK_OP_TYPE_COMMIT to BLOCK_OP_TYPE_COMMIT_{SOURCE, TARGET}Fam Zheng
Like BLOCK_OP_TYPE_BACKUP_SOURCE and BLOCK_OP_TYPE_BACKUP_TARGET, block-commit involves two asymmetric devices. This change is not user-visible (yet), because commit only works with device names. But once we enable backing reference in blockdev-add, or specifying node-name in block-commit command, we don't want the user to start two commit jobs on the same backing chain, which will corrupt things because of the final bdrv_swap. Before we have per category blockers, splitting this type is still better. [Resolved virtio-blk dataplane conflict by replacing BLOCK_OP_TYPE_COMMIT with both BLOCK_OP_TYPE_COMMIT_{SOURCE, TARGET}. They are safe since the block job runs in the same AioContext as the dataplane IOThread. --Stefan] Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>