aboutsummaryrefslogtreecommitdiff
path: root/block/mirror.c
AgeCommit message (Collapse)Author
2014-04-29mirror: Check for bdrv_get_info resultFam Zheng
bdrv_get_info could fail. Add check before using the returned value. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-29mirror: Fix resource leak when bdrv_getlength failsFam Zheng
The direct return will skip releasing of all the resouces at immediate_exit, don't miss that. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-28mirror: Use DIV_ROUND_UPFam Zheng
Although bdrv_getlength() was just called above this, and checked for error, it is better to just use the value we already get, and use DIV_ROUND_UP. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-25Use error_is_set() only when necessary (again)Markus Armbruster
error_is_set(&var) is the same as var != NULL, but it takes whole-program analysis to figure that out. Unnecessarily hard for optimizers, static checkers, and human readers. Commit 84d18f0 dumbed it down to obvious, but a few more have crept in since, and documentation was overlooked. Dumb these down, too. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-22block: Handle error of bdrv_getlength in bdrv_create_dirty_bitmapFam Zheng
bdrv_getlength could fail, check the return value before using it. Return NULL and set errno if it fails. Callers are updated to handle the error case. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-03-25mirror: fix early wake from sleep due to aioStefan Hajnoczi
The mirror blockjob coroutine rate-limits itself by sleeping. The coroutine also performs I/O asynchronously so it's important that the aio callback doesn't wake the coroutine early as that breaks rate-limiting. Reported-by: Joaquim Barrera <jbarrera@ac.upc.edu> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-25mirror: fix throttling delay calculationPaolo Bonzini
The throttling delay calculation was using an inaccurate sector count to calculate the time to sleep. This broke rate-limiting for the block mirror job. Move the delay calculation into mirror_iteration() where we know how many sectors were transferred. This lets us calculate an accurate delay time. Reported-by: Joaquim Barrera <jbarrera@ac.upc.edu> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-06block: mirror - remove code cruft that has no functionJeff Cody
Originally, this built up the error message with the backing filename, so that errp was set as follows: error_set(errp, QERR_OPEN_FILE_FAILED, backing_filename); However, we now propagate the local_error from the bdrv_open_backing_file() call instead, making these 2 lines useless code. Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-14block: mirror - use local_err to avoid NULL errpJeff Cody
When starting a block job, commit_active_start() relies on whether *errp is set by mirror_start_job. This allows it to determine if the mirror job start failed, so that it can clean up any changes to open flags from the bdrv_reopen(). If errp is NULL, then it will not be able to determine if mirror_start_job failed or not. To avoid this, use a local Error variable, and then propagate the error (if any) to errp. Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-02-14block: Don't throw away errno via error_setgJeff Cody
There are a handful of places in the block layer where a failure path has a valid -errno value, yet error_setg() is used. Those instances should instead use error_setg_errno(), to preserve as much error information as possible. This patch replaces those instances with error_setg_errno(), so that errno is passed up the stack in the error message. Reported-By: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-01-24block: resize backing image during active layer commit, if neededJeff Cody
If the top image to commit is the active layer, and also larger than the base image, then an I/O error will likely be returned during block-commit. For instance, if we have a base image with a virtual size 10G, and a active layer image of size 20G, then committing the snapshot via 'block-commit' will likely fail. This will automatically attempt to resize the base image, if the active layer image to be committed is larger. Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24drive mirror:fix memory leakZhang Min
In the function mirror_iteration() -> qemu_iovec_init(), it allocates memory for op->qiov.iov, when the write request calls back, but in the function mirror_iteration_done(), it only frees the op, not free the op->qiov.iov, so this causes memory leak. It should use qemu_iovec_destroy() to free op->qiov. Signed-off-by: Zhang Min <rudy.zhangmin@huawei.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-12-20commit: Support commit active layerFam Zheng
If active is top, it will be mirrored to base, (with block/mirror.c code), then the image is switched when user completes the block job. QMP documentation is updated. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-20block: Add commit_active_start()Fam Zheng
commit_active_start is implemented in block/mirror.c, It will create a job with "commit" type and designated base in block-commit command. This will be used for committing active layer of device. Sync mode is removed from MirrorBlockJob because there's no proper type for commit. The used information is is_none_mode. The common part of mirror_start and commit_active_start is moved to mirror_start_job(). Fix the comment wording for commit_start. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-20mirror: Move base to MirrorBlockJobFam Zheng
This allows setting the base before entering mirror_run, commit will make use of it. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-20mirror: Don't close targetFam Zheng
Let reference count manage target and don't call bdrv_close here. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-29block: per caller dirty bitmapFam Zheng
Previously a BlockDriverState has only one dirty bitmap, so only one caller (e.g. a block job) can keep track of writing. This changes the dirty bitmap to a list and creates a BdrvDirtyBitmap for each caller, the lifecycle is managed with these new functions: bdrv_create_dirty_bitmap bdrv_release_dirty_bitmap Where BdrvDirtyBitmap is a linked list wrapper structure of HBitmap. In place of bdrv_set_dirty_tracking, a BdrvDirtyBitmap pointer argument is added to these functions, since each caller has its own dirty bitmap: bdrv_get_dirty bdrv_dirty_iter_init bdrv_get_dirty_count bdrv_set_dirty and bdrv_reset_dirty prototypes are unchanged but will internally walk the list of all dirty bitmaps and set them one by one. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11qapi: make use of new BlockJobTypeFam Zheng
Switch the string to enum type BlockJobType in BlockJobDriver. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11blockjob: rename BlockJobType to BlockJobDriverFam Zheng
We will use BlockJobType as the enum type name of block jobs in QAPI, rename current BlockJobType to BlockJobDriver, which will eventually become a set of operations, similar to block drivers. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12block: Error parameter for open functionsMax Reitz
Add an Error ** parameter to bdrv_open, bdrv_file_open and associated functions to allow more specific error messages. Signed-off-by: Max Reitz <mreitz@redhat.com>
2013-09-06block: remove bdrv_is_allocated_above/bdrv_co_is_allocated_above distinctionPaolo Bonzini
Now that bdrv_is_allocated detects coroutine context, the two can use the same code. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06block: make bdrv_delete() staticFam Zheng
Manage BlockDriverState lifecycle with refcnt, so bdrv_delete() is no longer public and should be called by bdrv_unref() if refcnt is decreased to 0. This is an identical change because effectively, there's no multiple reference of BDS now: no caller of bdrv_ref() yet, only bdrv_new() sets bs->refcnt to 1, so all bdrv_unref() now actually delete the BDS. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22aio / timers: Switch entire codebase to the new timer APIAlex Bligh
This is an autogenerated patch using scripts/switch-timer-api. Switch the entire code base to using the new timer API. Note this patch may introduce some line length issues. Signed-off-by: Alex Bligh <alex@alex.org.uk> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22aio / timers: convert block_job_sleep_ns and co_sleep_ns to new APIAlex Bligh
Convert block_job_sleep_ns and co_sleep_ns to use the new timer API. Signed-off-by: Alex Bligh <alex@alex.org.uk> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-06-28block: Make BlockJobTypes constKevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-17block: mirror_complete(): use error_setg_file_open()Luiz Capitulino
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com>
2013-04-22block: Add driver-specific options for backing filesKevin Wolf
Options starting in "backing." are passed to the backing file now. If you don't need to specify the filename for the backing file, you can add it on the command line instead of in the image file: $ qemu-nbd -t /tmp/test.img $ qemu-img create -f qcow2 empty.qcow2 1G $ qemu-system-x86_64 -drive file=empty.qcow2,backing.file.driver=nbd,\ backing.file.host=localhost Note that this doesn't override the backing filename from the image. If the image has one, this will fail because NBD doesn't want the options and a filename at the same time. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2013-01-25mirror: do nothing on zero-sized diskPaolo Bonzini
On a zero-sized disk we need to break out of the job successfully before bdrv_dirty_iter_init is called, otherwise you will get an assertion failure with the next patch. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25mirror: support arbitrarily-sized iterationsPaolo Bonzini
Yet another optimization is to extend the mirroring iteration to include more adjacent dirty blocks. This limits the number of I/O operations and makes mirroring efficient even with a small granularity. Most of the infrastructure is already in place; we only need to put a loop around the computation of the origin and sector count of the iteration. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25mirror: support more than one in-flight AIO operationPaolo Bonzini
With AIO support in place, we can start copying more than one chunk in parallel. This patch introduces the required infrastructure for this: the buffer is split into multiple granularity-sized chunks, and there is a free list to access them. Because of copy-on-write, a single operation may already require multiple chunks to be available on the free list. In addition, two different iterations on the HBitmap may want to copy the same cluster. We avoid this by keeping a bitmap of in-flight I/O operations, and blocking until the previous iteration completes. This should be a pretty rare occurrence, though; as long as there is no overlap the next iteration can start before the previous one finishes. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25mirror: add buf-size argument to drive-mirrorPaolo Bonzini
This makes sense when the next commit starts using the extra buffer space to perform many I/O operations asynchronously. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25mirror: switch mirror_iteration to AIOPaolo Bonzini
There is really no change in the behavior of the job here, since there is still a maximum of one in-flight I/O operation between the source and the target. However, this patch already introduces the AIO callbacks (which are unmodified in the next patch) and some of the logic to count in-flight operations and only complete the job when there is none. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25mirror: allow customizing the granularityPaolo Bonzini
The desired granularity may be very different depending on the kind of operation (e.g. continuous replication vs. collapse-to-raw) and whether the VM is expected to perform lots of I/O while mirroring is in progress. Allow the user to customize it, while providing a sane default so that in general there will be no extra allocated space in the target compared to the source. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25block: allow customizing the granularity of the dirty bitmapPaolo Bonzini
Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25block: return count of dirty sectors, not chunksPaolo Bonzini
Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25mirror: perform COW if the cluster size is bigger than the granularityPaolo Bonzini
When mirroring runs, the backing files for the target may not yet be ready. However, this means that a copy-on-write operation on the target would fill the missing sectors with zeros. Copy-on-write only happens if the granularity of the dirty bitmap is smaller than the cluster size (and only for clusters that are allocated in the source after the job has started copying). So far, the granularity was fixed to 1MB; to avoid the problem we detected the situation and required the backing files to be available in that case only. However, we want to lower the granularity for efficiency, so we need a better solution. The solution is to always copy a whole cluster the first time it is touched. The code keeps a bitmap of clusters that have already been allocated by the mirroring job, and only does "manual" copy-on-write if the chunk being copied is zero in the bitmap. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25block: implement dirty bitmap using HBitmapPaolo Bonzini
This actually uses the dirty bitmap in the block layer, and converts mirroring to use an HBitmapIter. Reviewed-by: Laszlo Ersek <lersek@redhat.com> (except block/mirror.c parts) Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-15block: Fix how mirror_run() frees its bufferMarkus Armbruster
It allocates with qemu_blockalign(), therefore it must free with qemu_vfree(), not g_free(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2012-12-19block: move include files to include/block/Paolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-11aio: Get rid of qemu_aio_flush()Kevin Wolf
There are no remaining users, and new users should probably be using bdrv_drain_all() in the first place. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24mirror: add support for on-source-error/on-target-errorPaolo Bonzini
Error management is important for mirroring; otherwise, an error on the target (even something as "innocent" as ENOSPC) requires to start again with a full copy. Similar to on_read_error/on_write_error, two separate knobs are provided for on_source_error (reads) and on_target_error (writes). The default is 'report' for both. The 'ignore' policy will leave the sector dirty, so that it will be retried later. Thus, it will not cause corruption. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24mirror: implement completionPaolo Bonzini
Switching to the target of the migration is done mostly asynchronously, and reported to management via the BLOCK_JOB_COMPLETED event; the only synchronous phase is opening the backing files. bdrv_open_backing_file can always be done, even for migration of the full image (aka sync: 'full'). In this case, qmp_drive_mirror will create the target disk with no backing file at all, and bdrv_open_backing_file will be a no-op. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-24mirror: introduce mirror jobPaolo Bonzini
This patch adds the implementation of a new job that mirrors a disk to a new image while letting the guest continue using the old image. The target is treated as a "black box" and data is copied from the source to the target in the background. This can be used for several purposes, including storage migration, continuous replication, and observation of the guest I/O in an external program. It is also a first step in replacing the inefficient block migration code that is part of QEMU. The job is possibly never-ending, but it is logically structured into two phases: 1) copy all data as fast as possible until the target first gets in sync with the source; 2) keep target in sync and ensure that reopening to the target gets a correct (full) copy of the source data. The second phase is indicated by the progress in "info block-jobs" reporting the current offset to be equal to the length of the file. When the job is cancelled in the second phase, QEMU will run the job until the source is clean and quiescent, then it will report successful completion of the job. In other words, the BLOCK_JOB_CANCELLED event means that the target may _not_ be consistent with a past state of the source; the BLOCK_JOB_COMPLETED event means that the target is consistent with a past state of the source. (Note that it could already happen that management lost the race against QEMU and got a completion event instead of cancellation). It is not yet possible to complete the job and switch over to the target disk. The next patches will fix this and add many refinements to the basic idea introduced here. These include improved error management, some tunable knobs and performance optimizations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>