2013-12-06aio: make aio_poll(ctx, true) block with no fdsStefan Hajnoczi
This patch drops a special case where aio_poll(ctx, true) returns false instead of blocking if no file descriptors are waiting on I/O. Now it is possible to block in aio_poll() to wait for aio_notify(). This change eliminates busy waiting. bdrv_drain_all() used to rely on busy waiting to complete throttled I/O requests but this is no longer required so we can simplify aio_poll(). Note that aio_poll() still returns false when aio_notify() was used. In other words, stopping a blocking aio_poll() wait is not considered making progress. Adjust test-aio /aio/bh/callback-delete/one which assumed aio_poll(ctx, true) would immediately return false instead of blocking. Reviewed-by: Alex Bligh <alex@alex.org.uk> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-06block: clean up bdrv_drain_all() throttling commentsStefan Hajnoczi
Since cc0681c45430a1f1a4c2d06e9499b7775afc9a18 ("block: Enable the new throttling code in the block layer.") bdrv_drain_all() no longer spins. The code used to look as follows:

    do {
        busy = qemu_aio_wait();

        /* FIXME: We do not have timer support here, so this is effectively
         * a busy wait.
         */
        QTAILQ_FOREACH(bs, &bdrv_states, list) {
            while (qemu_co_enter_next(&bs->throttled_reqs)) {
                busy = true;
            }
        }
    } while (busy);

Note that throttle requests are kicked but I/O throttling limits are still in effect. The loop spins until the vm_clock time allows the request to make progress and complete.

The new throttling code introduced bdrv_start_throttled_reqs(). This function not only kicks throttled requests but also temporarily disables throttling so requests can run.

The outdated FIXME comment can be removed. Also drop the busy = true assignment since we overwrite it immediately afterwards.

Reviewed-by: Alex Bligh <alex@alex.org.uk> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-06qcow2: use start_of_cluster() and offset_into_cluster() everywhereHu Tao
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
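The commit carries no description, so here is a minimal sketch of what helpers with these names typically compute, assuming the cluster size is a power of two. The standalone signatures below (taking cluster_size directly) are hypothetical and chosen for illustration; they are not the qcow2 driver's actual prototypes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical standalone variants: cluster_size must be a power of two. */
static inline uint64_t start_of_cluster(uint64_t cluster_size, uint64_t offset)
{
    assert(cluster_size != 0 && (cluster_size & (cluster_size - 1)) == 0);
    /* Clear the low bits to get the offset of the containing cluster. */
    return offset & ~(cluster_size - 1);
}

static inline uint64_t offset_into_cluster(uint64_t cluster_size, uint64_t offset)
{
    assert(cluster_size != 0 && (cluster_size & (cluster_size - 1)) == 0);
    /* Keep only the low bits: the byte position inside the cluster. */
    return offset & (cluster_size - 1);
}
```

For any offset, the two results add back up to the original offset, which is what makes named helpers a safe replacement for open-coded masks.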
2013-12-06qemu-img: decrease progress update interval on convertPeter Lieven
When doing very large jobs, updating the progress only every 2% is too infrequent. Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-06qemu-img: round down request length to an aligned sectorPeter Lieven
this patch shortens requests to end at an aligned sector so that the next request starts aligned. [Squashed Peter's fix for bdrv_get_info() failure discussed on the mailing list. --Stefan] Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
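The rounding described above can be sketched as follows. This is an illustration under stated assumptions, not the qemu-img code: the function name and the choice to leave a request untouched when it does not reach an alignment boundary are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: given a request of n sectors starting at sector_num,
 * return a possibly shorter length so that the request ends on a multiple
 * of align sectors. Requests that do not reach a boundary are left as-is.
 */
static int64_t shorten_to_alignment(int64_t sector_num, int64_t n, int64_t align)
{
    assert(sector_num >= 0 && n >= 0);
    if (align <= 0) {
        return n;                       /* no alignment information */
    }
    int64_t end = sector_num + n;
    int64_t aligned_end = end - end % align;
    if (aligned_end > sector_num) {
        return aligned_end - sector_num; /* end on the boundary */
    }
    return n;                            /* too short to reach a boundary */
}
```

With an alignment of 8 sectors, a request for 20 sectors starting at sector 5 is cut to 19 sectors, so the following request begins at sector 24.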
2013-12-05qemu-img: dynamically adjust iobuffer size during convertPeter Lieven
Since the convert process is basically a sync operation, it might be beneficial in some cases to change the hardcoded I/O buffer size to a greater value. This patch increases the I/O buffer size if the output driver advertises an optimal transfer length or discard alignment that is greater than the default buffer size of 2M. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
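A minimal sketch of the selection rule described above, assuming both driver hints are expressed in bytes and that 0 means "not advertised". The function and macro names are hypothetical; the real code works on fields of the block layer's limits structure.

```c
#include <assert.h>
#include <stdint.h>

#define DEFAULT_BUF_BYTES (2 * 1024 * 1024) /* the 2M default from the text */

/* Hypothetical helper: pick the largest of the default and the two hints. */
static int64_t pick_buffer_size(int64_t opt_transfer_bytes,
                                int64_t discard_align_bytes)
{
    int64_t size = DEFAULT_BUF_BYTES;

    assert(opt_transfer_bytes >= 0 && discard_align_bytes >= 0);
    if (opt_transfer_bytes > size) {
        size = opt_transfer_bytes;
    }
    if (discard_align_bytes > size) {
        size = discard_align_bytes;
    }
    return size;
}
```

The buffer only ever grows: hints smaller than the default are ignored.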
2013-12-05block/iscsi: set bs->bl.opt_transfer_lengthPeter Lieven
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-05block: add opt_transfer_length to BlockLimitsPeter Lieven
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-05block/iscsi: set bdi->cluster_sizePeter Lieven
this patch aims to set bdi->cluster_size to the internal page size of the iscsi target so that enabled callers can align requests properly. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-05qemu-img: fix usage instruction for qemu-img convertPeter Lieven
Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-05qemu-img: add support for skipping zeroes in input during convertPeter Lieven
We currently do not check whether a sector is allocated during convert. This means that for an unallocated sector we allocate a bounce buffer of zeroes, find out it is zero later and, in the best case, do not write it. In the worst case this can lead to reading blocks from a raw device (like iSCSI) although we could easily know via get_block_status that they are zero and simply skip them. This patch also fixes the progress output not being at 100% after a successful conversion. Signed-off-by: Peter Lieven <pl@kamp.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-04qemu-nbd: add doc for option -fWenchao Xia
Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-04qemu-iotests: add test for snapshot in qemu-img convertWenchao Xia
Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-04qemu-img: add -l for snapshot in convertWenchao Xia
Now qemu-img convert has similar options to qemu-nbd for internal snapshots. Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-04qemu-iotests: add 058 internal snapshot export with qemu-nbd caseWenchao Xia
This case can't run when IMGPROTO=nbd, since it needs to create some internal snapshots, which would fail for EOF write requests even when TEST_IMG is exported with "-f raw" in common.rc, so set _supported_proto to file. _require_command() is changed to report which utility is missing, instead of printing a blank. Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-04qemu-nbd: support internal snapshot exportWenchao Xia
Now it is possible to directly export an internal snapshot, which can be used to probe the snapshot's contents without qemu-img convert. Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-04snapshot: distinguish id and name in load_tmpWenchao Xia
This function will be used more widely later, so improve it now. Its only caller today is qemu-img, which is not affected: the newly introduced bdrv_snapshot_load_tmp_by_id_or_name() calls bdrv_snapshot_load_tmp() twice to keep the old search logic. bdrv_snapshot_load_tmp_by_id_or_name() returns int to let the caller know the errno, which will be used later. Also fix a typo in the comments of bdrv_snapshot_delete(). Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-04qemu-iotests: Split qcow2 only cases in 048Fam Zheng
Format "raw" doesn't always work on certain file systems (e.g. tmpfs). Use qcow2 to make the allocation status explicit and split into a new case. [Resolved merge conflict due to "qemu-io> " prompt filter, added 074 to group file, and fixed up s/048/074/ copy-paste mistake. --Stefan] Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-04qemu-iotests: Clean up spaces in usage outputFam Zheng
Whitespace changes to align columns. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-04qemu-iotests: Change default cache mode to "writeback"Fam Zheng
So that the tests can run faster. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-04qemu-iotests: Add _default_cache_mode and _supported_cache_modesFam Zheng
This replaces _unsupported_qemu_io_options, checks that the current cache mode is supported, and allows a test to provide a default if the user didn't specify one. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-04qemu-iotests: Honour cache mode in iotests.pyFam Zheng
This will allow overriding cache mode from the "-c mode" option. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-04qemu-iotests: Add "-c <cache-mode>" optionFam Zheng
The option sets the cache mode used in the tests. "-nocache" is changed to an alias for "-c none", and internally passes "-t none" to qemu-io. Python scripts will make use of this option in the next commit. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-04qcow2: Zero-initialise first cluster for new imagesKevin Wolf
Strictly speaking, this is only required for has_zero_init() == false, but it's easy enough to just do a cluster-aligned write that is padded with zeros after the header. This fixes a bug where, after 'qemu-img create', random leftover data was parsed as if it were header extensions. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-04block: Close backing file early in bdrv_img_createMax Reitz
Leaving the backing file open although it is not needed anymore can cause problems if it is opened through a block driver which allows exclusive access only and if the create function of the block driver used for the top image (the one being created) tries to close and reopen the image file (which will include opening the backing file a second time). In particular, this will happen with a backing file opened through qemu-nbd and using qcow2 as the top image file format (which reopens the image to flush it to disk). In addition, the BlockDriverState in bdrv_img_create() is used for the backing file only; it should therefore be made local to the respective block. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03scsi-disk: correctly implement WRITE SAMEPaolo Bonzini
Fetch the data to be written from the input buffer. If it is all zeroes, we can use the write_zeroes call (possibly with the new MAY_UNMAP flag). Otherwise, do as many write cycles as needed, writing 512k at a time. Strictly speaking, this is still incorrect because a zero cluster should only be written if the MAY_UNMAP flag is set. But this is a bug in qcow2 and the other formats, not in the SCSI code. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
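The decision described above can be sketched in two small helpers: one that tests whether the fetched data is all zeroes, and one that counts the write cycles needed otherwise. Both names and the byte-based interface are hypothetical; this is an illustration under those assumptions, not the scsi-disk implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WRITE_SAME_CHUNK (512 * 1024) /* "512k at a time" from the text */

/* Hypothetical helper: true if every byte of the input block is zero. */
static bool is_all_zero(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0) {
            return false;
        }
    }
    return true;
}

/* Hypothetical helper: write cycles needed for a non-zero pattern. */
static uint64_t write_cycles(uint64_t total_bytes)
{
    /* Round up so that a trailing partial chunk gets its own cycle. */
    return (total_bytes + WRITE_SAME_CHUNK - 1) / WRITE_SAME_CHUNK;
}
```

A caller would take the write_zeroes path when is_all_zero() holds and otherwise loop write_cycles() times over a buffer filled with the repeated pattern.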
2013-12-03scsi-disk: reject ANCHOR=1 for UNMAP and WRITE SAME commandsPaolo Bonzini
Since we report ANC_SUP==0 in VPD page B2h, we need to return an error (ILLEGAL REQUEST/INVALID FIELD IN CDB) for all WRITE SAME requests with ANCHOR==1. Inspired by a similar patch to the LIO in-kernel target. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03scsi-disk: catch write protection errors in UNMAPPaolo Bonzini
This is the same that is already done for WRITE SAME. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03qemu-iotests: 033 is fastPaolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03raw-posix: add support for write_zeroes on XFS and block devicesPaolo Bonzini
The code is similar to the implementation of discard and write_zeroes with UNMAP. However, failure must be propagated up to block.c.

The stale page cache problem can be reproduced as follows:

    # modprobe scsi-debug lbpws=1 lbprz=1
    # ./qemu-io /dev/sdXX
    qemu-io> write -P 0xcc 0 2M
    qemu-io> write -z 0 1M
    qemu-io> read -P 0x00 0 512
    Pattern verification failed at offset 0, 512 bytes
    qemu-io> read -v 0 512
    00000000: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
    ...

    # ./qemu-io --cache=none /dev/sdXX
    qemu-io> write -P 0xcc 0 2M
    qemu-io> write -z 0 1M
    qemu-io> read -P 0x00 0 512
    qemu-io> read -v 0 512
    00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    ...

And similarly with discard instead of "write -z".

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03raw-posix: implement write_zeroes with MAY_UNMAP for block devicesPaolo Bonzini
See the next commit for the description of the Linux kernel problem that is worked around in raw_open_common. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03raw-posix: implement write_zeroes with MAY_UNMAP for filesPaolo Bonzini
Writing zeroes to a file can be done by punching a hole if MAY_UNMAP is set. Note that in this case ENOTSUP is not ignored, but makes the block layer fall back to the generic implementation. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03block/iscsi: check WRITE SAME support differently depending on MAY_UNMAPPaolo Bonzini
The current check is right for MAY_UNMAP=1. For MAY_UNMAP=0, just try and fall back to regular writes as soon as a WRITE SAME command fails. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Peter Lieven <pl@kamp.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03block/iscsi: updated copyrightPeter Lieven
added myself to reflect recent work on the iscsi block driver. Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03block/iscsi: remove .bdrv_has_zero_initPeter Lieven
since commit 3ac21627 the default value changed to 0. Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03block drivers: expose requirement for write same alignment from formatsPaolo Bonzini
This will let misaligned but large requests use zero clusters. This is important because the cluster size is not guest visible. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Peter Lieven <pl@kamp.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03block drivers: add discard/write_zeroes properties to bdrv_get_info implementationPaolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Peter Lieven <pl@kamp.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03vpc, vhdx: add get_infoPaolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Peter Lieven <pl@kamp.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03block: make bdrv_co_do_write_zeroes stricter in producing aligned requestsPaolo Bonzini
Right now, bdrv_co_do_write_zeroes will only try to align the beginning of the request. However, it is simpler for many formats to expect the block layer to separate both the head *and* the tail. This makes sure that the format's bdrv_co_write_zeroes function will be called with aligned sector_num and nb_sectors for the bulk of the request. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Peter Lieven <pl@kamp.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
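The head/bulk/tail separation described above can be sketched as a pure calculation on sector counts. The struct and function names are hypothetical and the sketch leaves out everything else the block layer does (request size limits, fallbacks), so treat it as an illustration of the arithmetic only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: all three fields are sector counts. */
typedef struct {
    int64_t head; /* unaligned sectors before the first boundary */
    int64_t bulk; /* aligned middle part, a multiple of align */
    int64_t tail; /* unaligned sectors after the last boundary */
} ZeroSplit;

static ZeroSplit split_request(int64_t sector_num, int64_t nb_sectors,
                               int64_t align)
{
    ZeroSplit s = {0, 0, 0};

    assert(sector_num >= 0 && nb_sectors >= 0);
    if (align <= 1) {
        s.bulk = nb_sectors;            /* nothing to align */
        return s;
    }
    int64_t misalign = sector_num % align;
    if (misalign != 0) {
        s.head = align - misalign;      /* distance to the next boundary */
        if (s.head > nb_sectors) {
            s.head = nb_sectors;        /* request ends before the boundary */
        }
    }
    int64_t rest = nb_sectors - s.head;
    s.tail = rest % align;
    s.bulk = rest - s.tail;
    return s;
}
```

For a request of 20 sectors at sector 3 with an alignment of 8, the split is 5 head sectors, 8 aligned bulk sectors starting at sector 8, and 7 tail sectors.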
2013-12-03block: handle ENOTSUP from discard in generic codePaolo Bonzini
Similar to write_zeroes, let the generic code receive a ENOTSUP for discard operations. Since bdrv_discard has advisory semantics, we can just swallow the error. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Peter Lieven <pl@kamp.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
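A minimal sketch of the "swallow ENOTSUP" behaviour described above, assuming drivers report errors as negative errno values. The callback type, the wrapper name and the two stand-in drivers are hypothetical and exist only to make the example self-contained.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical driver callback: returns 0 or a negative errno value. */
typedef int (*discard_fn)(int64_t sector_num, int nb_sectors);

/* Generic wrapper: discard is advisory, so "not supported" is not an error. */
static int generic_discard(discard_fn driver_discard,
                           int64_t sector_num, int nb_sectors)
{
    assert(driver_discard != NULL);
    int ret = driver_discard(sector_num, nb_sectors);
    if (ret == -ENOTSUP) {
        return 0;                       /* swallow: nothing was required */
    }
    return ret;                         /* real errors still propagate */
}

/* Stand-in drivers used only for illustration. */
static int discard_unsupported(int64_t sector_num, int nb_sectors)
{
    (void)sector_num;
    (void)nb_sectors;
    return -ENOTSUP;
}

static int discard_io_error(int64_t sector_num, int nb_sectors)
{
    (void)sector_num;
    (void)nb_sectors;
    return -EIO;
}
```

Only ENOTSUP is treated as success; an I/O error from the driver is still returned to the caller.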
2013-12-03block: add bdrv_aio_write_zeroesPaolo Bonzini
This will be used by the SCSI layer. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Peter Lieven <pl@kamp.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03block: add flags argument to bdrv_co_write_zeroes tracepointPaolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Peter Lieven <pl@kamp.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03block: add flags to BlockRequestPaolo Bonzini
This lets bdrv_co_do_rw receive flags, so that it can be used for zero writes. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Peter Lieven <pl@kamp.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03block: generalize BlockLimits handling to cover bdrv_aio_discard tooPaolo Bonzini
bdrv_co_discard is only covering drivers which have a .bdrv_co_discard() implementation, but not those with .bdrv_aio_discard(). Not very nice, and easy to avoid. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Peter Lieven <pl@kamp.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03vmdk: Fix creating big description fileFam Zheng
The buffer for the description file was 4096 bytes, which only covers a few hundred extents. This changes the buffer to be dynamically allocated with g_strdup_printf in order to support bigger cases. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-02coroutine: remove unused CoQueue AioContextMarc-André Lureau
The AioContext ctx field is apparently unused in qemu codebase since 02ffb504485. Signed-off-by: Marc-André Lureau <marcandre.lureau@gmail.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-02coroutine: remove qemu_co_queue_wait_insert_headMarc-André Lureau
qemu_co_queue_wait_insert_head() is unused in qemu code base now. Signed-off-by: Marc-André Lureau <marcandre.lureau@gmail.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-29qemu-iotests: Add sample image and test for VMDK version 3Fam Zheng
Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-11-29vmdk: Allow read only open of VMDK version 3Fam Zheng
Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-11-29qemu-iotests: Filter out 'qemu-io> ' promptFam Zheng
This removes "qemu-io> " prompt from qemu-io output in _filter_qemu_io, and updates all the output files with the following command:

    cd tests/qemu-iotests && sed -i "s/qemu-io> //g" *.out

Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>