aboutsummaryrefslogtreecommitdiff
path: root/block/qed.h
AgeCommit message (Collapse)Author
2023-02-23block: Mark public read/write functions GRAPH_RDLOCKKevin Wolf
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_co_pread*/pwrite*() need to hold a reader lock for the graph. For some places, we know that they will hold the lock, but we don't have the GRAPH_RDLOCK annotations yet. In this case, add assume_graph_lock() with a FIXME comment. These places will be removed once everything is properly annotated. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-12-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-23block: Mark bdrv_co_flush() and callers GRAPH_RDLOCKEmanuele Giuseppe Esposito
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_co_flush() need to hold a reader lock for the graph. For some places, we know that they will hold the lock, but we don't have the GRAPH_RDLOCK annotations yet. In this case, add assume_graph_lock() with a FIXME comment. These places will be removed once everything is properly annotated. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-8-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-07-06qed: Simplify backing readsEric Blake
The other four drivers that support backing files (qcow, qcow2, parallels, vmdk) all rely on the block layer to populate zeroes when reading beyond EOF of a short backing file. We can simplify the qed code by doing likewise. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200528094405.145708-11-vsementsov@virtuozzo.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-04-30block/qed: add missed coroutine_fn markersVladimir Sementsov-Ogievskiy
qed_read_table and qed_write_table use coroutine-only interfaces but are not marked coroutine_fn. Happily, they are called only from coroutine context, so we only need to add missed markers. Reported-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-17qed: protect table cache with CoMutexPaolo Bonzini
This makes the driver thread-safe. The CoMutex is dropped temporarily while accessing the data clusters or the backing file. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-10-pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-26qed: Add coroutine_fn to I/O path functionsKevin Wolf
Now that we stay in coroutine context for the whole request when doing reads or writes, we can add coroutine_fn annotations to many functions that can do I/O or yield directly. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26qed: Simplify request handlingKevin Wolf
Now that we process a request in the same coroutine from beginning to end and don't drop out of it any more, we can look like a proper coroutine-based driver and simply call qed_aio_next_io() and get a return value from it instead of spawning an additional coroutine that reenters the parent when it's done. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26qed: Use CoQueue for serialising allocationsKevin Wolf
Now that we're running in coroutine context, the ad-hoc serialisation code (which drops a request that has to wait out of coroutine context) can be replaced by a CoQueue. This means that when we resume a serialised request, it is running in coroutine context again and its I/O isn't blocking any more. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26qed: Add return value to qed_aio_read/write_data()Kevin Wolf
Don't recurse into qed_aio_next_io() and qed_aio_complete() here, but just return an error code and let the caller handle it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26qed: Remove callback from qed_write_table()Kevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26qed: Remove GenericCBKevin Wolf
The GenericCB infrastructure isn't used any more. Remove it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26qed: Remove callback from qed_find_cluster()Kevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26qed: Remove callback from qed_read_l2_table()Kevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-21block: explicitly acquire aiocontext in timers that need itPaolo Bonzini
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170213135235.12274-13-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-10-07block: use aio_bh_schedule_oneshotPaolo Bonzini
This simplifies bottom half handlers by removing calls to qemu_bh_delete and thus removing the need to stash the bottom half pointer in the opaque datum. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-22util: move declarations out of qemu-common.hVeronia Bahaa
Move declarations out of qemu-common.h for functions declared in utils/ files: e.g. include/qemu/path.h for utils/path.c. Move inline functions out of qemu-common.h and into new files (e.g. include/qemu/bcd.h) Signed-off-by: Veronia Bahaa <veroniabahaa@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-02-06qed: Really remove unused field QEDAIOCB.finishedFam Zheng
The commit 533ffb17a that removed qed_aiocb_info.cancel said to remove this but didn't do it. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20block: Rename BlockDriverCompletionFunc to BlockCompletionFuncMarkus Armbruster
I'll use it with block backends shortly, and the name is going to fit badly there. It's a block layer thing anyway, not just a block driver thing. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20block: Rename BlockDriverAIOCB* to BlockAIOCB*Markus Armbruster
I'll use BlockDriverAIOCB with block backends shortly, and the name is going to fit badly there. It's a block layer thing anyway, not just a block driver thing. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14qed: Make qiov match request size until backing file EOFKevin Wolf
If a QED image has a shorter backing file and a read request to unallocated clusters goes across EOF of the backing file, the backing file sees a shortened request and the rest is filled with zeros. However, the original too long qiov was used with the shortened request. This patch makes the qiov size match the request size, avoiding a potential buffer overflow in raw-posix. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-16qed.c: replace QEMUOptionParameter with QemuOptsChunyan Liu
One extra change is to define QED_DEFAULT_CLUSTER_SIZE = 65536 instead of 64 * 1024; because: according to existing create_options, "cluster size" has default value = QED_DEFAULT_CLUSTER_SIZE, after switching to create_opts, this has to be stringized and set to .def_value_str. That is, .def_value_str = stringify(QED_DEFAULT_CLUSTER_SIZE), so the QED_DEFAULT_CLUSTER_SIZE could not be a expression. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com> Signed-off-by: Chunyan Liu <cyliu@suse.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-25block: qed - use QEMU_PACKED for on-disk structuresJeff Cody
QEDHeader is read, and written, directly from on-disk images via bdrv_pread()/write(). To avoid any unintentional padding, these structs should be packed. Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-19block: move include files to include/block/Paolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-08-10qed: mark image clean after repair succeedsStefan Hajnoczi
The dirty bit is cleared after image repair succeeds in qed_open(). Move this into qed_check() so that all callers benefit from this behavior when fix=true. This is necessary so qemu-img check can call .bdrv_check() and mark the image clean. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-05qed: remove incoming live migration blockerBenoƮt Canet
Signed-off-by: Benoit Canet <benoit.canet@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-02-09qed: add .bdrv_co_write_zeroes() supportStefan Hajnoczi
Zero writes are a dedicated interface for writing regions of zeroes into the image file. If clusters are not yet allocated it is possible to use an efficient metadata representation which keeps the image file compact and does not store individual zero bytes. Implementing this for the QED image format is fairly straightforward. The only issue is that when a zero write touches an existing cluster we have to allocate a bounce buffer and perform a regular write. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-02-09qed: replace is_write with flags fieldStefan Hajnoczi
Per-request attributes like read/write are currently implemented as bool fields in the QEDAIOCB struct. This becomes unwiedly as the number of attributes grows. For example, the qed_aio_setup() function would have to take multiple bool arguments and at call sites it would be hard to distinguish the meaning of each bool. Instead use a flags field with bitmask constants. This will be used when zero write support is added. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-11-21qed: add migration blocker (v2)Anthony Liguori
Now when you try to migrate with qed, you get: (qemu) migrate tcp:localhost:1025 Block format 'qed' used by device 'ide0-hd0' does not support feature 'live migration' (qemu) Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-05-18qed: Periodically flush and clear need check bitStefan Hajnoczi
One strategy to limit the startup delay of consistency check when opening image files is to ensure that the file is marked dirty for as little time as possible. QED currently marks the image dirty when the first allocating write request is issued and clears the dirty bit again when the image is cleanly closed. In practice that means the image is marked dirty for most of a guest's lifetime and prone to being in a dirty state upon crash or power failure. It is safe to clear the dirty bit after all allocating write requests have completed and a flush has been performed. This patch adds a timer after the last allocating write request completes. When the timer fires it will flush and then clear the dirty bit. The timer is set to 5 seconds and is cancelled upon arrival of a new allocating write request. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-04-27qed: Fix consistency check on 32-bit hostsStefan Hajnoczi
The qed_bytes_to_clusters() function is normally used with size_t lengths. Consistency check used it with file size length and therefore failed on 32-bit hosts when the image file is 4 GB or more. Make qed_bytes_to_clusters() explicitly 64-bit and update consistency check to keep 64-bit cluster counts. Reported-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-04-13qed: Add support for zero clustersAnthony Liguori
Zero clusters are similar to unallocated clusters except instead of reading their value from a backing file when one is available, the cluster is always read as zero. This implements read support only. At this stage, QED will never write a zero cluster. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-12-17qed: Consistency check supportStefan Hajnoczi
This patch adds support for the qemu-img check command. It also introduces a dirty bit in the qed header to mark modified images as needing a check. This bit is cleared when the image file is closed cleanly. If an image file is opened and it has the dirty bit set, a consistency check will run and try to fix corrupted table offsets. These corruptions may occur if there is power loss while an allocating write is performed. Once the image is fixed it opens as normal again. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-12-17qed: Read/write supportStefan Hajnoczi
This patch implements the read/write state machine. Operations are fully asynchronous and multiple operations may be active at any time. Allocating writes lock tables to ensure metadata updates do not interfere with each other. If two allocating writes need to update the same L2 table they will run sequentially. If two allocating writes need to update different L2 tables they will run in parallel. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-12-17qed: Table, L2 cache, and cluster functionsStefan Hajnoczi
This patch adds code to look up data cluster offsets in the image via the L1/L2 tables. The L2 tables are writethrough cached in memory for performance (each read/write requires a lookup so it is essential to cache the tables). With cluster lookup code in place it is possible to implement bdrv_is_allocated() to query the number of contiguous allocated/unallocated clusters. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-12-17qed: Add QEMU Enhanced Disk image formatStefan Hajnoczi
This patch introduces the qed on-disk layout and implements image creation. Later patches add read/write and other functionality. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>