aboutsummaryrefslogtreecommitdiff
path: root/block/mirror.c
AgeCommit message (Collapse)Author
2017-03-24trace: Fix backwards mirror_yield parametersEric Blake
block/trace-events lists the parameters for mirror_yield consistently with other mirror events (cnt just after s, like in mirror_before_sleep; in_flight last, like in mirror_yield_in_flight). But the callers were passing parameters in the wrong order, leading to poor trace messages, including type truncation when there are more than 4G dirty sectors involved. Broken since its introduction in commit bd48bde. While touching this, ensure that all callers use the same type (uint64_t) for cnt, as a later patch will enable the compiler to do stricter type-checking. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-17block: Always call bdrv_child_check_perm firstFam Zheng
bdrv_child_set_perm alone is not very usable because the caller must call bdrv_child_check_perm first. This is already encapsulated conveniently in bdrv_child_try_set_perm, so remove the other prototypes from the header and fix the one wrong caller, block/mirror.c. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-13mirror: Implement .bdrv_refresh_filenameKevin Wolf
We want query-block to return the right filename, even if a mirror job put a bdrv_mirror_top on top of the actual image format driver. Let bdrv_mirror_top.bdrv_refresh_filename get the filename from its backing file. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-07block: Fix error handling in bdrv_replace_in_backing_chain()Kevin Wolf
When adding an Error parameter, bdrv_replace_in_backing_chain() would become nothing more than a wrapper around change_parent_backing_link(). So make the latter public, renamed as bdrv_replace_node(), and remove bdrv_replace_in_backing_chain(). Most of the callers just remove a node from the graph that they just inserted, so they can use &error_abort, but completion of a mirror job with 'replaces' set can actually fail. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-07mirror: Fix error path for dirty bitmap creationKevin Wolf
mirror_top_bs must be removed from the graph again when creating the dirty bitmap fails. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-07mirror: Fix permissions for removing mirror_top_bsKevin Wolf
mirror_top_bs takes write permissions on its backing file, which can make it impossible to attach that backing file node to another parent. However, this is exactly what needs to be done in order to remove mirror_top_bs from the backing chain. So give up the write permission first. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-07mirror: Fix permission problem with 'replaces'Kevin Wolf
The 'replaces' option of drive-mirror can be used to mirror a Quorum node to a new image and then let the target image replace one of the Quorum children. In order for this graph modification to succeed, the mirror job needs to lift its restrictions on the target node first before actually replacing the child. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2017-02-28block: Add Error parameter to bdrv_append()Kevin Wolf
Aborting on error in bdrv_append() isn't correct. This patch fixes it and lets the callers handle failures. Test case 085 needs a reference output update. This is caused by the reversed order of bdrv_set_backing_hd() and change_parent_backing_link() in bdrv_append(): When the backing file of the new node is set, the parent nodes are still pointing to the old top, so the backing blocker is now initialised with the node name rather than the BlockBackend name. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-28block: Add Error parameter to bdrv_set_backing_hd()Kevin Wolf
Not all callers of bdrv_set_backing_hd() know for sure that attaching the backing file will be allowed by the permission system. Return the error from the function rather than aborting. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-28commit: Add filter-node-name to block-commitKevin Wolf
Management tools need to be able to know about every node in the graph and need a way to address them. Changing the graph structure was okay because libvirt doesn't really manage the node level yet, but future libvirt versions need to deal with both new and old version of qemu. This new option to blockdev-commit allows the client to set a node-name for the automatically inserted filter driver, and at the same time serves as a witness for a future libvirt that this version of qemu does automatically insert a filter driver. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-28mirror: Add filter-node-name to blockdev-mirrorKevin Wolf
Management tools need to be able to know about every node in the graph and need a way to address them. Changing the graph structure was okay because libvirt doesn't really manage the node level yet, but future libvirt versions need to deal with both new and old version of qemu. This new option to blockdev-mirror allows the client to set a node-name for the automatically inserted filter driver, and at the same time serves as a witness for a future libvirt that this version of qemu does automatically insert a filter driver. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28mirror: Use real permissions in mirror/active commit block jobKevin Wolf
The mirror block job is mainly used for two different scenarios: Mirroring to an otherwise unused, independent target node, or for active commit where the target node is part of the backing chain of the source. Similarly to the commit block job patch, we need to insert a new filter node to keep the permissions correct during active commit. Note that one change this implies is that job->blk points to mirror_top_bs as its root now, and mirror_top_bs (rather than the actual source node) contains the bs->job pointer. This requires qemu-img commit to get the job by name now rather than just taking bs->job. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Fam Zheng <famz@redhat.com> Acked-by: Max Reitz <mreitz@redhat.com>
2017-02-28blockjob: Add permissions to block_job_add_bdrv()Kevin Wolf
Block jobs don't actually do I/O through the the reference they create with block_job_add_bdrv(), but they might want to use the permisssion system to express what the block job does to intermediate nodes. This adds permissions to block_job_add_bdrv() to provide the means to request permissions. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28blockjob: Add permissions to block_job_create()Kevin Wolf
This functions creates a BlockBackend internally, so the block jobs need to tell it what they want to do with the BB. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28block: Add error parameter to blk_insert_bs()Kevin Wolf
Now that blk_insert_bs() requests the BlockBackend permissions for the node it attaches to, it can fail. Instead of aborting, pass the errors to the callers. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28block: Add permissions to blk_new()Kevin Wolf
We want every user to be specific about the permissions it needs, so we'll pass the initial permissions as parameters to blk_new(). A user only needs to call blk_set_perm() if it wants to change the permissions after the fact. The permissions are stored in the BlockBackend and applied whenever a BlockDriverState should be attached in blk_insert_bs(). This does not include actually choosing the right set of permissions everywhere yet. Instead, the usual FIXME comment is added to each place and will be addressed in individual patches. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Fam Zheng <famz@redhat.com>
2017-02-27block/mirror: fix broken sparseness detectionJohn Snow
int64_t is in all likelihood the actual scalar type we want. Yep, really. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1219541 Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-24mirror: Resize active commit base in mirror_run()Kevin Wolf
This is more consistent with the commit block job, and it moves the code to a place where we already have the necessary BlockBackends to resize the base image when bdrv_truncate() is changed to require a BdrvChild. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-21mirror: do not increase offset during initial zero_or_discard phaseAnton Nefedov
If explicit zeroing out before mirroring is required for the target image, it moves the block job offset counter to EOF, then offset and len counters count the image size twice. There is no harm but stats are confusing, specifically the progress of the operation is always reported as 99% by management tools. The patch skips offset increase for the first "technical" pass over the image. This should not cause any further harm. Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1486045515-8009-1-git-send-email-den@openvz.org CC: Jeff Cody <jcody@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Eric Blake <eblake@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-21block: explicitly acquire aiocontext in aio callbacks that need itPaolo Bonzini
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170213135235.12274-16-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-11-14mirror: do not flush every time the disks are syncedPaolo Bonzini
This puts a huge strain on the disks when there are many concurrent migrations. With this patch we only flush twice: just before issuing the event, and just before pivoting to the destination. If management will complete the job close to the BLOCK_JOB_READY event, the cost of the second flush should be small anyway. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 20161109162008.27287-2-pbonzini@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-11-14blockjob: add block_job_startJohn Snow
Instead of automatically starting jobs at creation time via backup_start et al, we'd like to return a job object pointer that can be started manually at later point in time. For now, add the block_job_start mechanism and start the jobs automatically as we have been doing, with conversions job-by-job coming in later patches. Of note: cancellation of unstarted jobs will perform all the normal cleanup as if the job had started, particularly abort and clean. The only difference is that we will not emit any events, because the job never actually started. Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 1478587839-9834-5-git-send-email-jsnow@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-11-14blockjob: add .start fieldJohn Snow
Add an explicit start field to specify the entrypoint. We already have ownership of the coroutine itself AND managing the lifetime of the coroutine, let's take control of creation of the coroutine, too. This will allow us to delay creation of the actual coroutine until we know we'll actually start a BlockJob in block_job_start. This avoids the sticky question of how to "un-create" a Coroutine that hasn't been started yet. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1478587839-9834-4-git-send-email-jsnow@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-11-01blockjobs: split interface into public/private, Part 1John Snow
To make it a little more obvious which functions are intended to be public interface and which are intended to be for use only by jobs themselves, split the interface into "public" and "private" files. Convert blockjobs (e.g. block/backup) to using the private interface. Leave blockdev and others on the public interface. There are remaining uses of private state by qemu-img, and several cases in blockdev.c and block/io.c where we grab job->blk for the purposes of acquiring an AIOContext. These will be corrected in future patches. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-id: 1477584421-1399-7-git-send-email-jsnow@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-11-01blockjob: centralize QMP event emissionsJohn Snow
There's no reason to leave this to blockdev; we can do it in blockjobs directly and get rid of an extra callback for most users. All non-internal events, even those created outside of QMP, will consistently emit events. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-id: 1477584421-1399-5-git-send-email-jsnow@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-11-01Replication/Blockjobs: Create replication jobs as internalJohn Snow
Bubble up the internal interface to commit and backup jobs, then switch replication tasks over to using this methodology. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-id: 1477584421-1399-4-git-send-email-jsnow@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-11-01blockjobs: Allow creating internal jobsJohn Snow
Add the ability to create jobs without an ID. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-id: 1477584421-1399-3-git-send-email-jsnow@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-11-01block: Turn on "unmap" in active commitFam Zheng
We already specified BDRV_O_UNMAP when opening images in 'qemu-img commit', but didn't turn on the "unmap" in the active commit job. This patch fixes that so that zeroed clusters in top image can be discarded which is desired in the virt-sparsify use case, where a temporary overlay is created and fstrim'ed before commiting back, to free space in the original image. This also enables it for block-commit. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1474974892-5031-1-git-send-email-famz@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-10-31block: Block all intermediate nodes in commit_active_start()Alberto Garcia
When block-commit is launched without the top parameter, it uses internally a mirror block job. In that case all intermediate nodes between the active and base nodes must be blocked as well. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31block: Use block_job_add_bdrv() in mirror_start_job()Alberto Garcia
Use block_job_add_bdrv() instead of blocking all operations in mirror_start_job() and unblocking them in mirror_exit(). Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-28mirror: use bdrv_drained_begin/bdrv_drained_endPaolo Bonzini
Ensure that there are no changes between the last check to bdrv_get_dirty_count and the switch to the target. There is already a bdrv_drained_end call, we only need to ensure that bdrv_drained_begin is not called twice. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-Id: <1477565348-5458-4-git-send-email-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>
2016-10-28blockjob: introduce .drain callback for jobsPaolo Bonzini
This is required to decouple block jobs from running in an AioContext. With multiqueue block devices, a BlockDriverState does not really belong to a single AioContext. The solution is to first wait until all I/O operations are complete; then loop in the main thread for the block job to complete entirely. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-Id: <1477565348-5458-3-git-send-email-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>
2016-10-24block: Hide HBitmap in block dirty bitmap interfaceFam Zheng
HBitmap is an implementation detail of block dirty bitmap that should be hidden from users. Introduce a BdrvDirtyBitmapIter to encapsulate the underlying HBitmapIter. A small difference in the interface is, before, an HBitmapIter is initialized in place, now the new BdrvDirtyBitmapIter must be dynamically allocated because the structure definition is in block/dirty-bitmap.c. Two current users are converted too. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 1476395910-8697-2-git-send-email-jsnow@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-09-13mirror: auto complete active commitWen Congyang
Auto complete mirror job in background to prevent from blocking synchronously Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com> Signed-off-by: Wang WeiWei <wangww.fnst@cn.fujitsu.com> Message-id: 1469602913-20979-7-git-send-email-xiecl.fnst@cn.fujitsu.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-08-08mirror: finish earlier on errorVladimir Sementsov-Ogievskiy
Stop to produce new async copy requests from mirror_iteration if critical error (error action = BLOCK_ERROR_ACTION_REPORT) detected. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-07-26mirror: double performance of the bulk stage if the disc is fullVladimir Sementsov-Ogievskiy
Mirror can do up to 16 in-flight requests, but actually on full copy (the whole source disk is non-zero) in-flight is always 1. This happens as the request is not limited in size: the data occupies maximum available capacity of s->buf. The patch limits the size of the request to some artificial constant (1 Mb here), which is not that big or small. This effectively enables back parallelism in mirror code as it was designed. The result is important: the time to migrate 10 Gb disk is reduced from ~350 sec to 170 sec. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-id: 1468516741-82174-1-git-send-email-vsementsov@virtuozzo.com CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Fam Zheng <famz@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Jeff Cody <jcody@redhat.com> CC: Eric Blake <eblake@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-07-21Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell
pc, pci, virtio: new features, cleanups, fixes - interrupt remapping for intel iommus - a bunch of virtio cleanups - fixes all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Thu 21 Jul 2016 18:49:30 BST # gpg: using RSA key 0x281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: (57 commits) intel_iommu: avoid unnamed fields virtio: Update migration docs virtio-gpu: Wrap in vmstate virtio-gpu: Use migrate_add_blocker for virgl migration blocking virtio-input: Wrap in vmstate 9pfs: Wrap in vmstate virtio-serial: Wrap in vmstate virtio-net: Wrap in vmstate virtio-balloon: Wrap in vmstate virtio-rng: Wrap in vmstate virtio-blk: Wrap in vmstate virtio-scsi: Wrap in vmstate virtio: Migration helper function and macro virtio-serial: Remove old migration version support virtio-net: Remove old migration version support virtio-scsi: Replace HandleOutput typedef Revert "mirror: Workaround for unexpected iohandler events during completion" virtio-scsi: Call virtio_add_queue_aio virtio-blk: Call virtio_add_queue_aio virtio: Introduce virtio_add_queue_aio ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-21Revert "mirror: Workaround for unexpected iohandler events during completion"Fam Zheng
This reverts commit ab27c3b5e7408693dde0b565f050aa55c4a1bcef. The virtio storage device host notifiers now work with bdrv_drained_begin/end, so we don't need this hack any more. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2016-07-21Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into ↵Peter Maydell
staging Pull request v2: * Resolved merge conflict with block/iscsi.c [Peter] # gpg: Signature made Wed 20 Jul 2016 17:20:52 BST # gpg: using RSA key 0x9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * remotes/stefanha/tags/block-pull-request: (25 commits) raw_bsd: Convert to byte-based interface nbd: Convert to byte-based interface block: Kill .bdrv_co_discard() sheepdog: Switch .bdrv_co_discard() to byte-based raw_bsd: Switch .bdrv_co_discard() to byte-based qcow2: Switch .bdrv_co_discard() to byte-based nbd: Switch .bdrv_co_discard() to byte-based iscsi: Switch .bdrv_co_discard() to byte-based gluster: Switch .bdrv_co_discard() to byte-based blkreplay: Switch .bdrv_co_discard() to byte-based block: Add .bdrv_co_pdiscard() driver callback block: Convert .bdrv_aio_discard() to byte-based rbd: Switch rbd_start_aio() to byte-based raw-posix: Switch paio_submit() to byte-based block: Convert BB interface to byte-based discards block: Convert bdrv_aio_discard() to byte-based block: Switch BlockRequest to byte-based block: Convert bdrv_discard() to byte-based block: Convert bdrv_co_discard() to byte-based iscsi: Rely on block layer to break up large requests ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Conflicts: block/gluster.c
2016-07-20block: Convert BB interface to byte-based discardsEric Blake
Change sector-based blk_discard(), blk_co_discard(), and blk_aio_discard() to instead be byte-based blk_pdiscard(), blk_co_pdiscard(), and blk_aio_pdiscard(). NBD gets a lot simpler now that ignoring the unaligned portion of a byte-based discard request is handled under the hood by the block layer. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1468624988-423-6-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-07-19mirror: fix request throttling in drive-mirrorDenis V. Lunev
There are 2 deficiencies here: - mirror_iteration could start several requests inside. Thus we could simply have more in_flight requests than MAX_IN_FLIGHT. - keeping this in mind throttling in mirror_run which is checking s->in_flight == MAX_IN_FLIGHT is wrong. The patch adds the check and throttling into mirror_iteration and fixes the check in mirror_run() to be sure. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 1466598927-5990-1-git-send-email-den@openvz.org CC: Jeff Cody <jcody@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com> (cherry picked from commit e648dc95c28fbca12e67be26a1fc4b9a0676c3fe)
2016-07-19mirror: improve performance of mirroring of empty diskDenis V. Lunev
We should not take into account zero blocks for delay calculations. They are not read and thus IO throttling is not required. In the other case VM migration with 16 Tb QCOW2 disk with 4 Gb of data takes days. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1468503209-19498-9-git-send-email-den@openvz.org CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Jeff Cody <jcody@redhat.com> CC: Eric Blake <eblake@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-07-19mirror: efficiently zero out targetDenis V. Lunev
With a bdrv_co_write_zeroes method on a target BDS and when this method is working as indicated by the bdrv_can_write_zeroes_with_unmap(), zeroes will not be placed into the wire. Thus the target could be very efficiently zeroed out. This should be done with the largest chunk possible. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy<vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1468503209-19498-8-git-send-email-den@openvz.org CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Jeff Cody <jcody@redhat.com> CC: Eric Blake <eblake@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-07-19mirror: optimize dirty bitmap filling in mirror_run a bitDenis V. Lunev
There is no need to scan allocation tables if we have mark_all_dirty flag set. Just mark it all dirty. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy<vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1468503209-19498-7-git-send-email-den@openvz.org CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Jeff Cody <jcody@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-07-19mirror: create mirror_dirty_init helper for mirror_runDenis V. Lunev
The code inside the helper will be extended in the next patch. mirror_run itself is overbloated at the moment. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy<vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1468503209-19498-5-git-send-email-den@openvz.org CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Jeff Cody <jcody@redhat.com> CC: Eric Blake <eblake@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-07-19mirror: create mirror_throttle helperDenis V. Lunev
The patch also places last_pause_ns from stack in mirror_run into MirrorBlockJob structure. This helper will be useful in next patches. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1468503209-19498-4-git-send-email-den@openvz.org CC: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> CC: Eric Blake <eblake@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Fam Zheng <famz@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Jeff Cody <jcody@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-07-19mirror: make sectors_in_flight int64_tDenis V. Lunev
We keep here the sum of int fields. Thus this could easily overflow, especially when we will start sending big requests in next patches. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1468503209-19498-3-git-send-email-den@openvz.org CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Jeff Cody <jcody@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-07-13Improve block job rate limiting for small bandwidth valuesSascha Silbe
ratelimit_calculate_delay() previously reset the accounting every time slice, no matter how much data had been processed before. This had (at least) two consequences: 1. The minimum speed is rather large, e.g. 5 MiB/s for commit and stream. Not sure if there are real-world use cases where this would be a problem. Mirroring and backup over a slow link (e.g. DSL) would come to mind, though. 2. Tests for block job operations (e.g. cancel) were rather racy All block jobs currently use a time slice of 100ms. That's a reasonable value to get smooth output during regular operation. However this also meant that the state of block jobs changed every 100ms, no matter how low the configured limit was. On busy hosts, qemu often transferred additional chunks until the test case had a chance to cancel the job. Fix the block job rate limit code to delay for more than one time slice to address the above issues. To make it easier to handle oversized chunks we switch the semantics from returning a delay _before_ the current request to a delay _after_ the current request. If necessary, this delay consists of multiple time slice units. Since the mirror job sends multiple chunks in one go even if the rate limit was exceeded in between, we need to keep track of the start of the current time slice so we can correctly re-compute the delay for the updated amount of data. The minimum bandwidth now is 1 data unit per time slice. The block jobs are currently passing the amount of data transferred in sectors and using 100ms time slices, so this translates to 5120 bytes/second. With chunk sizes usually being O(512KiB), tests have plenty of time (O(100s)) to operate on block jobs. The chance of a race condition now is fairly remote, except possibly on insanely loaded systems. Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com> Message-id: 1467127721-9564-2-git-send-email-silbe@linux.vnet.ibm.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-07-13coroutine: move entry argument to qemu_coroutine_createPaolo Bonzini
In practice the entry argument is always known at creation time, and it is confusing that sometimes qemu_coroutine_enter is used with a non-NULL argument to re-enter a coroutine (this happens in block/sheepdog.c and tests/test-coroutine.c). So pass the opaque value at creation time, for consistency with e.g. aio_bh_new. Mostly done with the following semantic patch: @ entry1 @ expression entry, arg, co; @@ - co = qemu_coroutine_create(entry); + co = qemu_coroutine_create(entry, arg); ... - qemu_coroutine_enter(co, arg); + qemu_coroutine_enter(co); @ entry2 @ expression entry, arg; identifier co; @@ - Coroutine *co = qemu_coroutine_create(entry); + Coroutine *co = qemu_coroutine_create(entry, arg); ... - qemu_coroutine_enter(co, arg); + qemu_coroutine_enter(co); @ entry3 @ expression entry, arg; @@ - qemu_coroutine_enter(qemu_coroutine_create(entry), arg); + qemu_coroutine_enter(qemu_coroutine_create(entry, arg)); @ reentry @ expression co; @@ - qemu_coroutine_enter(co, NULL); + qemu_coroutine_enter(co); except for the aforementioned few places where the semantic patch stumbled (as expected) and for test_co_queue, which would otherwise produce an uninitialized variable warning. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-07-13commit: Add 'job-id' parameter to 'block-commit'Alberto Garcia
This patch adds a new optional 'job-id' parameter to 'block-commit', allowing the user to specify the ID of the block job to be created. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>