aboutsummaryrefslogtreecommitdiff
path: root/block.h
AgeCommit message (Collapse)Author
2012-09-24block: Framework for reopening files safelyJeff Cody
This is based on Supriya Kannery's bdrv_reopen() patch series. This provides a transactional method to reopen multiple images files safely. Image files are queue for reopen via bdrv_reopen_queue(), and the reopen occurs when bdrv_reopen_multiple() is called. Changes are staged in bdrv_reopen_prepare() and in the equivalent driver level functions. If any of the staged images fails a prepare, then all of the images left untouched, and the staged changes for each image abandoned. Block drivers are passed a reopen state structure, that contains: * BDS to reopen * flags for the reopen * opaque pointer for any driver-specific data that needs to be persistent from _prepare to _commit/_abort * reopen queue pointer, if the driver needs to queue additional BDS for a reopen Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-24block: correctly set the keep_read_only flagJeff Cody
I believe the bs->keep_read_only flag is supposed to reflect the initial open state of the device. If the device is initially opened R/O, then commit operations, or reopen operations changing to R/W, are prohibited. Currently, the keep_read_only flag is only accurate for the active layer, and its backing file. Subsequent images end up always having the keep_read_only flag set. For instance, what happens now: [ base ] kro = 1, ro = 1 | v [ snap-1 ] kro = 1, ro = 1 | v [ snap-2 ] kro = 0, ro = 1 | v [ active ] kro = 0, ro = 0 What we want: [ base ] kro = 0, ro = 1 | v [ snap-1 ] kro = 0, ro = 1 | v [ snap-2 ] kro = 0, ro = 1 | v [ active ] kro = 0, ro = 0 Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-10block: add BLOCK_O_CHECK for qemu-img checkStefan Hajnoczi
Image formats with a dirty bit, like qed and qcow2, repair dirty image files upon open with BDRV_O_RDWR. Performing automatic repair when qemu-img check runs is not ideal because the bdrv_open() call repairs the image before the actual bdrv_check() call from qemu-img.c. Fix this "double repair" since it leads to confusing output from qemu-img check. Tell the block driver that this image is being opened just for bdrv_check(). This skips automatic repair and qemu-img.c can invoke it manually with bdrv_check(). Update the golden output for qemu-iotests 039 to reflect the new qemu-img check output. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-03block: create bdrv_get_backing_file_depth()Benoît Canet
Create bdrv_get_backing_file_depth() in order to be able to show in QMP and HMP how many ancestors backing an image a block device have. Signed-off-by: Benoit Canet <benoit@irqsave.net> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2012-07-17hw/block-common: Move BlockConf & friends from block.hMarkus Armbruster
This stuff doesn't belong to block layer, and was put there only because a better home didn't exist then. Now it does. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-07-17block: Geometry and translation hints are now useless, purge themMarkus Armbruster
There are two producers of these hints: drive_init() on behalf of -drive, and hd_geometry_guess(). The only consumer of the hint is hd_geometry_guess(). The callers of hd_geometry_guess() call it only when drive_init() didn't set the hints. Therefore, drive_init()'s hints are never used. Thus, hd_geometry_guess() only ever sees hints it produced itself in a prior call. Only the first call computes something, subsequent calls just repeat the first call's results. However, hd_geometry_guess() is never called more than once: the device models don't, and the block device is destroyed on unplug. Thus, dropping the repeat feature doesn't break anything now. If a block device wasn't destroyed on unplug and could be reused with a new device, then repeating old results would be wrong. Thus, dropping the repeat feature prevents future breakage. This renders the hints unused. Purge them from the block layer. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-07-17qdev: Introduce block geometry propertiesMarkus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-07-17hd-geometry: Move disk geometry guessing back from block.cMarkus Armbruster
Commit f3d54fc4 factored it out of hw/ide.c for reuse. Sensible, except it was put into block.c. Device-specific functionality should be kept in device code, not the block layer. Move it to hw/hd-geometry.c, and make stylistic changes required to keep checkpatch.pl happy. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-07-17fdc: Move floppy geometry guessing back from block.cMarkus Armbruster
Commit 5bbdbb46 moved it to block.c because "other geometry guessing functions already reside in block.c". Device-specific functionality should be kept in device code, not the block layer. Move it back. Disk geometry guessing is still in block.c. To be moved out in a later patch series. Bonus: the floppy type used in pc_cmos_init() now obviously matches the one in the FDrive. Before, we relied on bdrv_get_floppy_geometry_hint() picking the same type both in fd_revalidate() and in pc_cmos_init(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-07-09block: Factor bdrv_read_unthrottled() out of guess_disk_lchs()Markus Armbruster
To prepare move of guess_disk_lchs() into hw/, where it poking BlockDriverState member io_limits_enabled directly would be unclean. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-07-09block: introduce bdrv_swap, implement bdrv_append on top of itPaolo Bonzini
The new function can be made a bit nicer than bdrv_append. It swaps the whole contents, and then swaps back (using the usual t=a;a=b;b=t idiom) the fields that need to stay on top. Thus, it does not need explicit bdrv_detach_dev, bdrv_iostatus_disable, etc. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-07-09blkdebug: remove sync i/o eventsPaolo Bonzini
These are unused, except (by mistake more or less) in QED. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15block: Replace bdrv_get_format() by bdrv_get_format_name()Markus Armbruster
So callers don't need to know anything about maximum name length. Returning a pointer is safe, because the name string lives as long as the block driver it names, and block drivers don't die. Requested by Peter Maydell. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15block: add bdrv_set_enable_write_cachePaolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15block: New bdrv_get_flags()Markus Armbruster
Signed-off-by: Markus Armbruster <armbru@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15qemu-img check: Print fixed clusters and recheckKevin Wolf
When any inconsistencies have been fixed, print the statistics and run another check to make sure everything is correct now. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15qemu-img check -r for repairing imagesKevin Wolf
The QED block driver already provides the functionality to not only detect inconsistencies in images, but also fix them. However, this functionality cannot be manually invoked with qemu-img, but the check happens only automatically during bdrv_open(). This adds a -r switch to qemu-img check that allows manual invocation of an image repair. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15stream: move is_allocated_above to block.cPaolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-10qemu-img: make "info" backing file output correct and easier to usePaolo Bonzini
qemu-img info should use the same logic as qemu when printing the backing file path, or debugging becomes quite tricky. We can also simplify the output in case the backing file has an absolute path or a protocol. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-05block: add a function to clear incoming live migration flagsBenoît Canet
This function will clear all BDRV_O_INCOMING flags. Signed-off-by: Benoit Canet <benoit.canet@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-05block: Add new BDRV_O_INCOMING flag to notice incoming live migrationBenoît Canet
From original patch with Patchwork-id: 31110 by Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> "Add a flag to indicate that incoming migration is pending and care needs to be taken for data consistency. Block drivers should not modify the image file before incoming migration is complete since the migration source host is still using the image file." The rationale for not using bdrv->read_only is the following. "Unfortunately this is not possible because too many other places in QEMU test bdrv_is_read_only() and use it for their own evil purposes. For example, ide_init_drive() will error out because read-only harddisks are not supported. We're mixing guest and host side read-only concepts so this simpler alternative does not work." Signed-off-by: Benoit Canet <benoit.canet@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-05qemu-img: add dirty flag statusDong Xu Wang
Some block drivers can verify their image files are clean or not. So we can show it while using "qemu-img info". Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-05qemu-img: add image fragmentation statisticsDong Xu Wang
Discussion can be found at: http://patchwork.ozlabs.org/patch/128730/ This patch add image fragmentation statistics while using qemu-img check. Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-05block: enforce constraints on block size propertiesStefan Hajnoczi
Nicolae Mogoreanu <mogo@google.com> noticed that I/O requests can lead to QEMU crashes when the logical_block_size property is smaller than 512 bytes. Using the new "blocksize" property we can properly enforce constraints on the block size such that QEMU's block layer is able to operate correctly. Reported-by: Nicolae Mogoreanu <mogo@google.com> Reported-by: Michael Halcrow <mhalcrow@google.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-05aio: move BlockDriverAIOCB to qemu-aio.hPaolo Bonzini
And remove several block_int.h inclusions that should not be there. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-03-12block: handle -EBUSY in bdrv_commit_all()Stefan Hajnoczi
Monitor operations that manipulate image files must not execute while a background job (like image streaming) is in progress. This prevents corruptions from happening when two pieces of code are manipulating the image file without knowledge of each other. The monitor "commit" command raises QERR_DEVICE_IN_USE when bdrv_commit() returns -EBUSY but "commit all" has no error handling. This is easy to fix, although note that we do not deliver a detailed error about which device was busy in the "commit all" case. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-02-29qapi: Introduce blockdev-group-snapshot-sync commandJeff Cody
This is a QAPI/QMP only command to take a snapshot of a group of devices. This is similar to the blockdev-snapshot-sync command, except blockdev-group-snapshot-sync accepts a list devices, filenames, and formats. It is attempted to keep the snapshot of the group atomic; if the creation or open of any of the new snapshots fails, then all of the new snapshots are abandoned, and the name of the snapshot image that failed is returned. The failure case should not interrupt any operations. Rather than use bdrv_close() along with a subsequent bdrv_open() to perform the pivot, the original image is never closed and the new image is placed 'in front' of the original image via manipulation of the BlockDriverState fields. Thus, once the new snapshot image has been successfully created, there are no more failure points before pivoting to the new snapshot. This allows the group of disks to remain consistent with each other, even across snapshot failures. Signed-off-by: Jeff Cody <jcody@redhat.com> Acked-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-02-29block: add a transfer rate for floppy typesHervé Poussineau
Floppies must be read at a specific transfer rate, depending of its own format. Update floppy description table to include required transfer rate. Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-02-22block: bdrv_eject(): Make eject_flag a real boolLuiz Capitulino
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com>
2012-02-22block: Rename bdrv_mon_event() & BlockMonEventActionLuiz Capitulino
They are QMP events, not monitor events. Rename them accordingly. Also, move bdrv_emit_qmp_error_event() up in the file. A new event will be added soon and it's good to have them next each other. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com>
2012-02-09block: add .bdrv_co_write_zeroes() interfaceStefan Hajnoczi
The ability to zero regions of an image file is a useful primitive for higher-level features such as image streaming or zero write detection. Image formats may support an optimized metadata representation instead of writing zeroes into the image file. This allows zero writes to be potentially faster than regular write operations and also preserve sparseness of the image file. The .bdrv_co_write_zeroes() interface should be implemented by block drivers that wish to provide efficient zeroing. Note that this operation is different from the discard operation, which may leave the contents of the region indeterminate. That means discarded blocks are not guaranteed to contain zeroes and may contain junk data instead. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-01-26block: add bdrv_find_backing_imageMarcelo Tosatti
Add bdrv_find_backing_image: given a BlockDriverState pointer, and an id, traverse the backing image chain to locate the id. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-01-26block: make copy-on-read a per-request flagStefan Hajnoczi
Previously copy-on-read could only be enabled for all requests to a block device. This means requests coming from the guest as well as QEMU's internal requests would perform copy-on-read when enabled. For image streaming we want to support finer-grained behavior than just populating the image file from its backing image. Image streaming supports partial streaming where a common backing image is preserved. In this case guest requests should not perform copy-on-read because they would indiscriminately copy data which should be left in a backing image from the backing chain. Introduce a per-request flag for copy-on-read so that a block device can process both regular and copy-on-read requests. Overlapping reads and writes still need to be serialized for correctness when copy-on-read is happening, so add an in-flight reference count to track this. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-15qcow2: Allow >4 GB VM stateKevin Wolf
This is a compatible extension to the snapshot header format that allows saving a 64 bit VM state size. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-14Fix spelling in comments, documentation and messagesStefan Weil
accidently->accidentally annother->another choosen->chosen consideres->considers decriptor->descriptor developement->development paramter->parameter preceed->precede preceeding->preceding priviledge->privilege propogation->propagation substraction->subtraction throught->through upto->up to usefull->useful Fix also grammar in posix-aio-compat.c Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-12-05block: convert qemu_aio_flush() calls to bdrv_drain_all()Stefan Hajnoczi
Many places in QEMU call qemu_aio_flush() to complete all pending asynchronous I/O. Most of these places actually want to drain all block requests but there is no block layer API to do so. This patch introduces the bdrv_drain_all() API to wait for requests across all BlockDriverStates to complete. As a bonus we perform checks after qemu_aio_wait() to ensure that requests really have finished. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05block: add interface to toggle copy-on-readStefan Hajnoczi
The bdrv_enable_copy_on_read()/bdrv_disable_copy_on_read() functions can be used to programmatically enable or disable copy-on-read for a block device. Later patches add the actual copy-on-read logic. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05block: add bdrv_co_is_allocated() interfaceStefan Hajnoczi
This patch introduces the public bdrv_co_is_allocated() interface which can be used to query image allocation status while the VM is running. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05block: add I/O throttling algorithmZhi Yong Wu
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05block: add the blockio limits command line supportZhi Yong Wu
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-11-21block: allow migration to work with image files (v3)Anthony Liguori
Image files have two types of data: immutable data that describes things like image size, backing files, etc. and mutable data that includes offset and reference count tables. Today, image formats aggressively cache mutable data to improve performance. In some cases, this happens before a guest even starts. When dealing with live migration, since a file is open on two machines, the caching of meta data can lead to data corruption. This patch addresses this by introducing a mechanism to invalidate any cached mutable data a block driver may have which is then used by the live migration code. NB, this still requires coherent shared storage. Addressing migration without coherent shared storage (i.e. NFS) requires additional work. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-11block: add eject request callbackPaolo Bonzini
Recent versions of udev always keep the tray locked so that the kernel can observe "eject request" events (aka tray button presses) even on discs that aren't mounted. Add support for these events in the ATAPI and SCSI cd drive device models. To let management cope with the behavior of udev, an event should also be added for "tray opened/closed". This way, after issuing an "eject" command, management can poll until the guests actually reacts to the command. They can then issue the "change" command after the tray has been opened, or try with "eject -f" after a (configurable?) timeout. However, with this patch and the corresponding support in the device models, at least it is possible to do a manual two-step eject+change sequence. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-27qapi: Convert query-blockLuiz Capitulino
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2011-10-27block: Rename the BlockIOStatus enum valuesLuiz Capitulino
The biggest change is to rename its prefix from BDRV_IOS to BLOCK_DEVICE_IO_STATUS. Next commit will convert the query-block command to the QAPI and that's how the enumeration is going to be generated. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2011-10-27block: iostatus: Drop BDRV_IOS_INVALLuiz Capitulino
A future commit will convert bdrv_info() to the QAPI and it won't provide IOS_INVAL. Luckily all we have to do is to add a new 'iostatus_enabled' member to BlockDriverState and use it instead. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2011-10-21block: add bdrv_co_discard and bdrv_aio_discard supportPaolo Bonzini
This similarly adds support for coroutine and asynchronous discard. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-21block: unify flush implementationsPaolo Bonzini
Add coroutine support for flush and apply the same emulation that we already do for read/write. bdrv_aio_flush is simplified to always go through a coroutine. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-11block: Keep track of devices' I/O statusLuiz Capitulino
This commit adds support to the BlockDriverState type to keep track of devices' I/O status. There are three possible status: BDRV_IOS_OK (no error), BDRV_IOS_ENOSPC (no space error) and BDRV_IOS_FAILED (any other error). The distinction between no space and other errors is important because a management application may want to watch for no space in order to extend the space assigned to the VM and put it to run again. Qemu devices supporting the I/O status feature have to enable it explicitly by calling bdrv_iostatus_enable() _and_ have to be configured to stop the VM on errors (ie. werror=stop|enospc or rerror=stop). In case of multiple errors being triggered in sequence only the first one is stored. The I/O status is always reset to BDRV_IOS_OK when the 'cont' command is issued. Next commits will add support to some devices and extend the query-block/info block commands to return the I/O status information. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12block: New change_media_cb() parameter loadMarkus Armbruster
To let device models distinguish between eject and load. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12block: New bdrv_set_buffer_alignment()Markus Armbruster
Device models should be able to set it without an unclean include of block_int.h. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>