aboutsummaryrefslogtreecommitdiff
path: root/block
AgeCommit message (Collapse)Author
2024-08-28block/blkio: use FUA flag on write zeroes only if supportedStefano Garzarella
libblkio supports BLKIO_REQ_FUA with write zeros requests only since version 1.4.0, so let's inform the block layer that the blkio driver supports it only in this case. Otherwise we can have runtime errors as reported in https://issues.redhat.com/browse/RHEL-32878 Fixes: fd66dbd424 ("blkio: add libblkio block driver") Cc: qemu-stable@nongnu.org Buglink: https://issues.redhat.com/browse/RHEL-32878 Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20240808080545.40744-1-sgarzare@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> (cherry picked from commit 547c4e50929ec6c091d9c16a7b280e829b12b463) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-08-28nbd/server: CVE-2024-7409: Cap default max-connections to 100Eric Blake
Allowing an unlimited number of clients to any web service is a recipe for a rudimentary denial of service attack: the client merely needs to open lots of sockets without closing them, until qemu no longer has any more fds available to allocate. For qemu-nbd, we default to allowing only 1 connection unless more are explicitly asked for (-e or --shared); this was historically picked as a nice default (without an explicit -t, a non-persistent qemu-nbd goes away after a client disconnects, without needing any additional follow-up commands), and we are not going to change that interface now (besides, someday we want to point people towards qemu-storage-daemon instead of qemu-nbd). But for qemu proper, and the newer qemu-storage-daemon, the QMP nbd-server-start command has historically had a default of unlimited number of connections, in part because unlike qemu-nbd it is inherently persistent until nbd-server-stop. Allowing multiple client sockets is particularly useful for clients that can take advantage of MULTI_CONN (creating parallel sockets to increase throughput), although known clients that do so (such as libnbd's nbdcopy) typically use only 8 or 16 connections (the benefits of scaling diminish once more sockets are competing for kernel attention). Picking a number large enough for typical use cases, but not unlimited, makes it slightly harder for a malicious client to perform a denial of service merely by opening lots of connections withot progressing through the handshake. This change does not eliminate CVE-2024-7409 on its own, but reduces the chance for fd exhaustion or unlimited memory usage as an attack surface. On the other hand, by itself, it makes it more obvious that with a finite limit, we have the problem of an unauthenticated client holding 100 fds opened as a way to block out a legitimate client from being able to connect; thus, later patches will further add timeouts to reject clients that are not making progress. This is an INTENTIONAL change in behavior, and will break any client of nbd-server-start that was not passing an explicit max-connections parameter, yet expects more than 100 simultaneous connections. We are not aware of any such client (as stated above, most clients aware of MULTI_CONN get by just fine on 8 or 16 connections, and probably cope with later connections failing by relying on the earlier connections; libvirt has not yet been passing max-connections, but generally creates NBD servers with the intent for a single client for the sake of live storage migration; meanwhile, the KubeSAN project anticipates a large cluster sharing multiple clients [up to 8 per node, and up to 100 nodes in a cluster], but it currently uses qemu-nbd with an explicit --shared=0 rather than qemu-storage-daemon with nbd-server-start). We considered using a deprecation period (declare that omitting max-parameters is deprecated, and make it mandatory in 3 releases - then we don't need to pick an arbitrary default); that has zero risk of breaking any apps that accidentally depended on more than 100 connections, and where such breakage might not be noticed under unit testing but only under the larger loads of production usage. But it does not close the denial-of-service hole until far into the future, and requires all apps to change to add the parameter even if 100 was good enough. It also has a drawback that any app (like libvirt) that is accidentally relying on an unlimited default should seriously consider their own CVE now, at which point they are going to change to pass explicit max-connections sooner than waiting for 3 qemu releases. Finally, if our changed default breaks an app, that app can always pass in an explicit max-parameters with a larger value. It is also intentional that the HMP interface to nbd-server-start is not changed to expose max-connections (any client needing to fine-tune things should be using QMP). Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20240807174943.771624-12-eblake@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> [ericb: Expand commit message to summarize Dan's argument for why we break corner-case back-compat behavior without a deprecation period] Signed-off-by: Eric Blake <eblake@redhat.com> (cherry picked from commit c8a76dbd90c2f48df89b75bef74917f90a59b623) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-08-28vvfat: Fix reading files with non-continuous clustersAmjad Alsharafi
When reading with `read_cluster` we get the `mapping` with `find_mapping_for_cluster` and then we call `open_file` for this mapping. The issue appear when its the same file, but a second cluster that is not immediately after it, imagine clusters `500 -> 503`, this will give us 2 mappings one has the range `500..501` and another `503..504`, both point to the same file, but different offsets. When we don't open the file since the path is the same, we won't assign `s->current_mapping` and thus accessing way out of bound of the file. From our example above, after `open_file` (that didn't open anything) we will get the offset into the file with `s->cluster_size*(cluster_num-s->current_mapping->begin)`, which will give us `0x2000 * (504-500)`, which is out of bound for this mapping and will produce some issues. Signed-off-by: Amjad Alsharafi <amjadsharafi10@gmail.com> Message-ID: <1f3ea115779abab62ba32c788073cdc99f9ad5dd.1721470238.git.amjadsharafi10@gmail.com> [kwolf: Simplified the patch based on Amjad's analysis and input] Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit 5eed3db336506b529b927ba221fe0d836e5b8819) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-08-28vvfat: Fix wrong checks for cluster mappings invariantAmjad Alsharafi
How this `abort` was intended to check for was: - if the `mapping->first_mapping_index` is not the same as `first_mapping_index`, which **should** happen only in one case, when we are handling the first mapping, in that case `mapping->first_mapping_index == -1`, in all other cases, the other mappings after the first should have the condition `true`. - From above, we know that this is the first mapping, so if the offset is not `0`, then abort, since this is an invalid state. The issue was that `first_mapping_index` is not set if we are checking from the middle, the variable `first_mapping_index` is only set if we passed through the check `cluster_was_modified` with the first mapping, and in the same function call we checked the other mappings. One approach is to go into the loop even if `cluster_was_modified` is not true so that we will be able to set `first_mapping_index` for the first mapping, but since `first_mapping_index` is only used here, another approach is to just check manually for the `mapping->first_mapping_index != -1` since we know that this is the value for the only entry where `offset == 0` (i.e. first mapping). Signed-off-by: Amjad Alsharafi <amjadsharafi10@gmail.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <b0fbca3ee208c565885838f6a7deeaeb23f4f9c2.1721470238.git.amjadsharafi10@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit f60a6f7e17bf2a2a0f0a08265ac9b077fce42858) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-08-28vvfat: Fix usage of `info.file.offset`Amjad Alsharafi
The field is marked as "the offset in the file (in clusters)", but it was being used like this `cluster_size*(nums)+mapping->info.file.offset`, which is incorrect. Signed-off-by: Amjad Alsharafi <amjadsharafi10@gmail.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <72f19a7903886dda1aa78bcae0e17702ee939262.1721470238.git.amjadsharafi10@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit 21b25a0e466a5bba0a45600bb8100ab564202ed1) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-08-28vvfat: Fix bug in writing to middle of fileAmjad Alsharafi
Before this commit, the behavior when calling `commit_one_file` for example with `offset=0x2000` (second cluster), what will happen is that we won't fetch the next cluster from the fat, and instead use the first cluster for the read operation. This is due to off-by-one error here, where `i=0x2000 !< offset=0x2000`, thus not fetching the next cluster. Signed-off-by: Amjad Alsharafi <amjadsharafi10@gmail.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <b97c1e1f1bc2f776061ae914f95d799d124fcd73.1721470238.git.amjadsharafi10@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit b881cf00c99e03bc8a3648581f97736ff275b18b) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-07-03qcow2: Don't open data_file with BDRV_O_NO_IOKevin Wolf
One use case for 'qemu-img info' is verifying that untrusted images don't reference an unwanted external file, be it as a backing file or an external data file. To make sure that calling 'qemu-img info' can't already have undesired side effects with a malicious image, just don't open the data file at all with BDRV_O_NO_IO. If nothing ever tries to do I/O, we don't need to have it open. This changes the output of iotests case 061, which used 'qemu-img info' to show that opening an image with an invalid data file fails. After this patch, it succeeds. Replace this part of the test with a qemu-io call, but keep the final 'qemu-img info' to show that the invalid data file is correctly displayed in the output. Fixes: CVE-2024-4467 Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> (cherry picked from commit bd385a5298d7062668e804d73944d52aec9549f1) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-03-27block-backend: fix edge case in bdrv_next_cleanup() where BDS associated to ↵Fiona Ebner
BB changes Same rationale as for commit "block-backend: fix edge case in bdrv_next() where BDS associated to BB changes". The block graph might change between the bdrv_next() call and the bdrv_next_cleanup() call, so it could be that the associated BDS is not the same that was referenced previously anymore. Instead, rely on bdrv_next() to set it->bs to the BDS it referenced and unreference that one in any case. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Message-ID: <20240322095009.346989-4-f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit bac09b093ebbb79e6a7444c7b979c32ca5540132) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-03-27block-backend: fix edge case in bdrv_next() where BDS associated to BB changesFiona Ebner
The old_bs variable in bdrv_next() is currently determined by looking at the old block backend. However, if the block graph changes before the next bdrv_next() call, it might be that the associated BDS is not the same that was referenced previously. In that case, the wrong BDS is unreferenced, leading to an assertion failure later: > bdrv_unref: Assertion `bs->refcnt > 0' failed. In particular, this can happen in the context of bdrv_flush_all(), when polling for bdrv_co_flush() in the generated co-wrapper leads to a graph change (for example with a stream block job [0]). A racy reproducer: > #!/bin/bash > rm -f /tmp/backing.qcow2 > rm -f /tmp/top.qcow2 > ./qemu-img create /tmp/backing.qcow2 -f qcow2 64M > ./qemu-io -c "write -P42 0x0 0x1" /tmp/backing.qcow2 > ./qemu-img create /tmp/top.qcow2 -f qcow2 64M -b /tmp/backing.qcow2 -F qcow2 > ./qemu-system-x86_64 --qmp stdio \ > --blockdev qcow2,node-name=node0,file.driver=file,file.filename=/tmp/top.qcow2 \ > <<EOF > {"execute": "qmp_capabilities"} > {"execute": "block-stream", "arguments": { "job-id": "stream0", "device": "node0" } } > {"execute": "quit"} > EOF [0]: > #0 bdrv_replace_child_tran (child=..., new_bs=..., tran=...) > #1 bdrv_replace_node_noperm (from=..., to=..., auto_skip=..., tran=..., errp=...) > #2 bdrv_replace_node_common (from=..., to=..., auto_skip=..., detach_subchain=..., errp=...) > #3 bdrv_drop_filter (bs=..., errp=...) > #4 bdrv_cor_filter_drop (cor_filter_bs=...) > #5 stream_prepare (job=...) > #6 job_prepare_locked (job=...) > #7 job_txn_apply_locked (fn=..., job=...) > #8 job_do_finalize_locked (job=...) > #9 job_exit (opaque=...) > #10 aio_bh_poll (ctx=...) > #11 aio_poll (ctx=..., blocking=...) > #12 bdrv_poll_co (s=...) > #13 bdrv_flush (bs=...) > #14 bdrv_flush_all () > #15 do_vm_stop (state=..., send_stop=...) > #16 vm_shutdown () Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Message-ID: <20240322095009.346989-3-f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit f6d38c9f6dae6fce99dcaf6ca16a1fe5b5e19c4c) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-03-27block/io: accept NULL qiov in bdrv_pad_requestStefan Reiter
Some operations, e.g. block-stream, perform reads while discarding the results (only copy-on-read matters). In this case, they will pass NULL as the target QEMUIOVector, which will however trip bdrv_pad_request, since it wants to extend its passed vector. In particular, this is the case for the blk_co_preadv() call in stream_populate(). If there is no qiov, no operation can be done with it, but the bytes and offset still need to be updated, so the subsequent aligned read will actually be aligned and not run into an assertion failure. In particular, this can happen when the request alignment of the top node is larger than the allocated part of the bottom node, in which case padding becomes necessary. For example: > ./qemu-img create /tmp/backing.qcow2 -f qcow2 64M -o cluster_size=32768 > ./qemu-io -c "write -P42 0x0 0x1" /tmp/backing.qcow2 > ./qemu-img create /tmp/top.qcow2 -f qcow2 64M -b /tmp/backing.qcow2 -F qcow2 > ./qemu-system-x86_64 --qmp stdio \ > --blockdev qcow2,node-name=node0,file.driver=file,file.filename=/tmp/top.qcow2 \ > <<EOF > {"execute": "qmp_capabilities"} > {"execute": "blockdev-add", "arguments": { "driver": "compress", "file": "node0", "node-name": "node1" } } > {"execute": "block-stream", "arguments": { "job-id": "stream0", "device": "node1" } } > EOF Originally-by: Stefan Reiter <s.reiter@proxmox.com> Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com> [FE: do update bytes and offset in any case add reproducer to commit message] Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Message-ID: <20240322095009.346989-2-f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit 3f934817c82c2f1bf1c238f8d1065a3be10a3c9e) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-03-19mirror: Don't call job_pause_point() under graph lockKevin Wolf
Calling job_pause_point() while holding the graph reader lock potentially results in a deadlock: bdrv_graph_wrlock() first drains everything, including the mirror job, which pauses it. The job is only unpaused at the end of the drain section, which is when the graph writer lock has been successfully taken. However, if the job happens to be paused at a pause point where it still holds the reader lock, the writer lock can't be taken as long as the job is still paused. Mark job_pause_point() as GRAPH_UNLOCKED and fix mirror accordingly. Cc: qemu-stable@nongnu.org Buglink: https://issues.redhat.com/browse/RHEL-28125 Fixes: 004915a96a7a ("block: Protect bs->backing with graph_lock") Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20240313153000.33121-1-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit ae5a40e8581185654a667fbbf7e4adbc2a2a3e45) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-02-09block/blkio: Make s->mem_region_alignment be 64 bitsRichard W.M. Jones
With GCC 14 the code failed to compile on i686 (and was wrong for any version of GCC): ../block/blkio.c: In function ‘blkio_file_open’: ../block/blkio.c:857:28: error: passing argument 3 of ‘blkio_get_uint64’ from incompatible pointer type [-Wincompatible-pointer-types] 857 | &s->mem_region_alignment); | ^~~~~~~~~~~~~~~~~~~~~~~~ | | | size_t * {aka unsigned int *} In file included from ../block/blkio.c:12: /usr/include/blkio.h:49:67: note: expected ‘uint64_t *’ {aka ‘long long unsigned int *’} but argument is of type ‘size_t *’ {aka ‘unsigned int *’} 49 | int blkio_get_uint64(struct blkio *b, const char *name, uint64_t *value); | ~~~~~~~~~~^~~~~ Signed-off-by: Richard W.M. Jones <rjones@redhat.com> Message-id: 20240130122006.2977938-1-rjones@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> (cherry picked from commit 615eaeab3d318ba239d54141a4251746782f65c1) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-01-26block/blklogwrites: Fix a bug when logging "write zeroes" operations.Ari Sundholm
There is a bug in the blklogwrites driver pertaining to logging "write zeroes" operations, causing log corruption. This can be easily observed by setting detect-zeroes to something other than "off" for the driver. The issue is caused by a concurrency bug pertaining to the fact that "write zeroes" operations have to be logged in two parts: first the log entry metadata, then the zeroed-out region. While the log entry metadata is being written by bdrv_co_pwritev(), another operation may begin in the meanwhile and modify the state of the blklogwrites driver. This is as intended by the coroutine-driven I/O model in QEMU, of course. Unfortunately, this specific scenario is mishandled. A short example: 1. Initially, in the current operation (#1), the current log sector number in the driver state is only incremented by the number of sectors taken by the log entry metadata, after which the log entry metadata is written. The current operation yields. 2. Another operation (#2) may start while the log entry metadata is being written. It uses the current log position as the start offset for its log entry. This is in the sector right after the operation #1 log entry metadata, which is bad! 3. After bdrv_co_pwritev() returns (#1), the current log sector number is reread from the driver state in order to find out the start offset for bdrv_co_pwrite_zeroes(). This is an obvious blunder, as the offset will be the sector right after the (misplaced) operation #2 log entry, which means that the zeroed-out region begins at the wrong offset. 4. As a result of the above, the log is corrupt. Fix this by only reading the driver metadata once, computing the offsets and sizes in one go (including the optional zeroed-out region) and setting the log sector number to the appropriate value for the next operation in line. Signed-off-by: Ari Sundholm <ari@tuxera.com> Cc: qemu-stable@nongnu.org Message-ID: <20240109184646.1128475-1-megari@gmx.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit a9c8ea95470c27a8a02062b67f9fa6940e828ab6) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-01-25block/io: clear BDRV_BLOCK_RECURSE flag after recursing in bdrv_co_block_statusFiona Ebner
Using fleecing backup like in [0] on a qcow2 image (with metadata preallocation) can lead to the following assertion failure: > bdrv_co_do_block_status: Assertion `!(ret & BDRV_BLOCK_ZERO)' failed. In the reproducer [0], it happens because the BDRV_BLOCK_RECURSE flag will be set by the qcow2 driver, so the caller will recursively check the file child. Then the BDRV_BLOCK_ZERO set too. Later up the call chain, in bdrv_co_do_block_status() for the snapshot-access driver, the assertion failure will happen, because both flags are set. To fix it, clear the recurse flag after the recursive check was done. In detail: > #0 qcow2_co_block_status Returns 0x45 = BDRV_BLOCK_RECURSE | BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID. > #1 bdrv_co_do_block_status Because of the data flag, bdrv_co_do_block_status() will now also set BDRV_BLOCK_ALLOCATED. Because of the recurse flag, bdrv_co_do_block_status() for the bdrv_file child will be called, which returns 0x16 = BDRV_BLOCK_ALLOCATED | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_ZERO. Now the return value inherits the zero flag. Returns 0x57 = BDRV_BLOCK_RECURSE | BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_ALLOCATED | BDRV_BLOCK_ZERO. > #2 bdrv_co_common_block_status_above > #3 bdrv_co_block_status_above > #4 bdrv_co_block_status > #5 cbw_co_snapshot_block_status > #6 bdrv_co_snapshot_block_status > #7 snapshot_access_co_block_status > #8 bdrv_co_do_block_status Return value is propagated all the way up to here, where the assertion failure happens, because BDRV_BLOCK_RECURSE and BDRV_BLOCK_ZERO are both set. > #9 bdrv_co_common_block_status_above > #10 bdrv_co_block_status_above > #11 block_copy_block_status > #12 block_copy_dirty_clusters > #13 block_copy_common > #14 block_copy_async_co_entry > #15 coroutine_trampoline [0]: > #!/bin/bash > rm /tmp/disk.qcow2 > ./qemu-img create /tmp/disk.qcow2 -o preallocation=metadata -f qcow2 1G > ./qemu-img create /tmp/fleecing.qcow2 -f qcow2 1G > ./qemu-img create /tmp/backup.qcow2 -f qcow2 1G > ./qemu-system-x86_64 --qmp stdio \ > --blockdev qcow2,node-name=node0,file.driver=file,file.filename=/tmp/disk.qcow2 \ > --blockdev qcow2,node-name=node1,file.driver=file,file.filename=/tmp/fleecing.qcow2 \ > --blockdev qcow2,node-name=node2,file.driver=file,file.filename=/tmp/backup.qcow2 \ > <<EOF > {"execute": "qmp_capabilities"} > {"execute": "blockdev-add", "arguments": { "driver": "copy-before-write", "file": "node0", "target": "node1", "node-name": "node3" } } > {"execute": "blockdev-add", "arguments": { "driver": "snapshot-access", "file": "node3", "node-name": "snap0" } } > {"execute": "blockdev-backup", "arguments": { "device": "snap0", "target": "node1", "sync": "full", "job-id": "backup0" } } > EOF Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-id: 20240116154839.401030-1-f.ebner@proxmox.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> (cherry picked from commit 8a9be7992426c8920d4178e7dca59306a18c7a3a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-12-22block: Fix crash when loading snapshot on inactive nodeKevin Wolf
bdrv_is_read_only() only checks if the node is configured to be read-only eventually, but even if it returns false, writing to the node may not be permitted at the moment (because it's inactive). bdrv_is_writable() checks that the node can be written to right now, and this is what the snapshot operations really need. Change bdrv_can_snapshot() to use bdrv_is_writable() to fix crashes like the following: $ ./qemu-system-x86_64 -hda /tmp/test.qcow2 -loadvm foo -incoming defer qemu-system-x86_64: ../block/io.c:1990: int bdrv_co_write_req_prepare(BdrvChild *, int64_t, int64_t, BdrvTrackedRequest *, int): Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed. The resulting error message after this patch isn't perfect yet, but at least it doesn't crash any more: $ ./qemu-system-x86_64 -hda /tmp/test.qcow2 -loadvm foo -incoming defer qemu-system-x86_64: Device 'ide0-hd0' is writable but does not support snapshots Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231201142520.32255-2-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit d3007d348adaaf04ee8b099a475282034a662414) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-11-28export/vhost-user-blk: Fix consecutive drainsKevin Wolf
The vhost-user-blk export implement AioContext switches in its drain implementation. This means that on drain_begin, it detaches the server from its AioContext and on drain_end, attaches it again and schedules the server->co_trip coroutine in the updated AioContext. However, nothing guarantees that server->co_trip is even safe to be scheduled. Not only is it unclear that the coroutine is actually in a state where it can be reentered externally without causing problems, but with two consecutive drains, it is possible that the scheduled coroutine didn't have a chance yet to run and trying to schedule an already scheduled coroutine a second time crashes with an assertion failure. Following the model of NBD, this commit makes the vhost-user-blk export shut down server->co_trip during drain so that resuming the export means creating and scheduling a new coroutine, which is always safe. There is one exception: If the drain call didn't poll (for example, this happens in the context of bdrv_graph_wrlock()), then the coroutine didn't have a chance to shut down. However, in this case the AioContext can't have changed; changing the AioContext always involves a polling drain. So in this case we can simply assert that the AioContext is unchanged and just leave the coroutine running or wake it up if it has yielded to wait for the AioContext to be attached again. Fixes: e1054cd4aad03a493a5d1cded7508f7c348205bf Fixes: https://issues.redhat.com/browse/RHEL-1708 Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231127115755.22846-1-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-28vmdk: Don't corrupt desc file in vmdk_write_cidFam Zheng
If the text description file is larger than DESC_SIZE, we force the last byte in the buffer to be 0 and write it out. This results in a corruption. Try to allocate a big buffer in this case. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1923 Signed-off-by: Fam Zheng <fam@euphon.net> Message-ID: <20231124115654.3239137-1-fam@euphon.net> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-21stream: Fix AioContext locking during bdrv_graph_wrlock()Kevin Wolf
In stream_prepare(), we need to temporarily drop the AioContext lock that job_prepare_locked() took for us while calling the graph write lock functions which can poll. All block nodes related to this block job are in the same AioContext, so we can pass any of them to bdrv_graph_wrlock()/ bdrv_graph_wrunlock(). Unfortunately, the one that we picked is base, which can be NULL - and in this case the AioContext lock is not released and deadlocks can occur. Fix this by passing s->target_bs, which is never NULL. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231115172012.112727-4-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-21block: Fix deadlocks in bdrv_graph_wrunlock()Kevin Wolf
bdrv_graph_wrunlock() calls aio_poll(), which may run callbacks that have a nested event loop. Nested event loops can depend on other iothreads making progress, so in order to allow them to make progress it must not hold the AioContext lock of another thread while calling aio_poll(). This introduces a @bs parameter to bdrv_graph_wrunlock() whose AioContext is temporarily dropped (which matches bdrv_graph_wrlock()), and a bdrv_graph_wrunlock_ctx() that can be used if the BlockDriverState doesn't necessarily exist any more when unlocking. This also requires a change to bdrv_schedule_unref(), which was relying on the incorrectly taken lock. It needs to take the lock itself now. While this is a separate bug, it can't be fixed a separate patch because otherwise the intermediate state would either deadlock or try to release a lock that we don't even hold. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231115172012.112727-3-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> [kwolf: Fixed up bdrv_schedule_unref()] Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-21block: Fix bdrv_graph_wrlock() call in blk_remove_bs()Kevin Wolf
While not all callers of blk_remove_bs() are correct in this respect, the assumption in the function is that callers hold the AioContext lock of the BlockBackend (this is required by the drain calls in it). In order to avoid deadlock in the nested event loop, bdrv_graph_wrlock() has then to be called with the root BlockDriverState as its parameter instead of NULL, so that this AioContext lock is temporarily dropped. Fixes: https://issues.redhat.com/browse/RHEL-1761 Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231115172012.112727-2-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-13block/snapshot: Fix compiler warning with -Wshadow=localThomas Huth
No need to declare a new variable in the the inner code block here, we can re-use the "ret" variable that has been declared at the beginning of the function. With this change, the code can now be successfully compiled with -Wshadow=local again. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-ID: <20231023175038.111607-1-thuth@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Commit message tweaked] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2023-11-08block: Protect bs->file with graph_lockKevin Wolf
Almost all functions that access bs->file already take the graph lock now. Add locking to the remaining users and finally annotate the struct field itself as protected by the graph lock. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-25-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-08block: Take graph lock for most of .bdrv_openKevin Wolf
Most implementations of .bdrv_open first open their file child (which is an operation that internally takes the write lock and therefore we shouldn't hold the graph lock while calling it), and afterwards many operations that require holding the graph lock, e.g. for accessing bs->file. This changes block drivers that follow this pattern to take the graph lock after opening the child node. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-24-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-08vhdx: Take locks for accessing bs->fileKevin Wolf
This updates the vhdx code to add GRAPH_RDLOCK annotations for all places that read bs->file. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-23-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-08qcow2: Take locks for accessing bs->fileKevin Wolf
This updates the qcow2 code to add GRAPH_RDLOCK annotations for all places that read bs->file. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-22-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-08block: Add missing GRAPH_RDLOCK annotationsKevin Wolf
This adds GRAPH_RDLOCK to some driver callbacks that are already called with the graph lock held, and which will need the annotation because they access bs->file, but don't have it yet. This also covers a few callbacks that were not marked GRAPH_RDLOCK before, but where updating BlockDriver is trivially possible. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-21-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-08block: Introduce bdrv_co_change_backing_file()Kevin Wolf
bdrv_change_backing_file() is called both inside and outside coroutine context. This makes it difficult for it to take the graph lock internally. It also means that driver implementations need to be able to run outside of coroutines, too. Switch it to the usual model with a coroutine based implementation and a co_wrapper instead. The new function is marked GRAPH_RDLOCK. As the co_wrapper now runs the function in the AioContext of the node (as it should always have done), this is not GLOBAL_STATE_CODE() any more. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-20-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-08blkverify: Add locking for request_fnKevin Wolf
This is either bdrv_co_preadv() or bdrv_co_pwritev() which both need to have the graph locked. Annotate the function pointer accordingly and add locking to its callers. This shouldn't actually have resulted in a bug because the graph lock is already held by blkverify_co_prwv(), which waits for the coroutines to terminate. Annotate with GRAPH_RDLOCK as well to make this clearer. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-19-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-08block: Protect bs->backing with graph_lockKevin Wolf
Almost all functions that access bs->backing already take the graph lock now. Add locking to the remaining users and finally annotate the struct field itself as protected by the graph lock. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-18-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-07block: Mark bdrv_replace_node() GRAPH_WRLOCKKevin Wolf
Instead of taking the writer lock internally, require callers to already hold it when calling bdrv_replace_node(). Its callers may already want to hold the graph lock and so wouldn't be able to call functions that take it internally. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-17-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-07block: Mark bdrv_set_backing_hd_drained() GRAPH_WRLOCKKevin Wolf
Instead of taking the writer lock internally, require callers to already hold it when calling bdrv_set_backing_hd_drained(). Basically everthing in the function needs the lock and its callers may already want to hold the graph lock and so wouldn't be able to call functions that take it internally. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-14-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-07block: Mark bdrv_cow_child() and callers GRAPH_RDLOCKKevin Wolf
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_cow_child() need to hold a reader lock for the graph because it accesses bs->backing. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-13-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-07block: Mark bdrv_chain_contains() and callers GRAPH_RDLOCKKevin Wolf
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_chain_contains() need to hold a reader lock for the graph because it calls bdrv_filter_or_cow_bs(), which accesses bs->file/backing. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-11-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-07block: Mark bdrv_(un)freeze_backing_chain() and callers GRAPH_RDLOCKKevin Wolf
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_(un)freeze_backing_chain() need to hold a reader lock for the graph because it calls bdrv_filter_or_cow_child(), which accesses bs->file/backing. Use the opportunity to make bdrv_is_backing_chain_frozen() static, it has no external callers. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-10-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-07block: Mark bdrv_skip_filters() and callers GRAPH_RDLOCKKevin Wolf
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_skip_filters() need to hold a reader lock for the graph because it calls bdrv_filter_child(), which accesses bs->file/backing. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-9-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-07block: Mark bdrv_skip_implicit_filters() and callers GRAPH_RDLOCKKevin Wolf
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_skip_implicit_filters() need to hold a reader lock for the graph because it calls bdrv_filter_child(), which accesses bs->file/backing. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-8-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-07block: Mark bdrv_filter_or_cow_bs() and callers GRAPH_RDLOCKKevin Wolf
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_filter_or_cow_bs() need to hold a reader lock for the graph because it calls bdrv_filter_or_cow_child(), which accesses bs->file/backing. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-7-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-07block: Mark block_job_add_bdrv() GRAPH_WRLOCKKevin Wolf
Instead of taking the writer lock internally, require callers to already hold it when calling block_job_add_bdrv(). These callers will typically already hold the graph lock once the locking work is completed, which means that they can't call functions that take it internally. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-6-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-07block: Mark bdrv_root_attach_child() GRAPH_WRLOCKKevin Wolf
Instead of taking the writer lock internally, require callers to already hold it when calling bdrv_root_attach_child(). These callers will typically already hold the graph lock once the locking work is completed, which means that they can't call functions that take it internally. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-5-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-07block: Mark bdrv_filter_bs() and callers GRAPH_RDLOCKKevin Wolf
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_filter_bs() need to hold a reader lock for the graph because it calls bdrv_filter_child(), which accesses bs->file/backing. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-4-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-07block: Mark bdrv_has_zero_init() and callers GRAPH_RDLOCKKevin Wolf
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_has_zero_init() need to hold a reader lock for the graph because it calls bdrv_filter_bs(), which accesses bs->file/backing. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-3-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-07block: Mark bdrv_probe_blocksizes() and callers GRAPH_RDLOCKKevin Wolf
This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_probe_blocksizes() need to hold a reader lock for the graph because it calls bdrv_filter_bs(), which accesses bs->file/backing. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-2-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-07Merge tag 'pull-block-2023-11-06' of https://gitlab.com/hreitz/qemu into stagingStefan Hajnoczi
Block patches: - One patch to make qcow2's discard-no-unref option do better what it is supposed to do (i.e. prevent fragmentation) - Two fixes for zoned requests # -----BEGIN PGP SIGNATURE----- # # iQJGBAABCAAwFiEEy2LXoO44KeRfAE00ofpA0JgBnN8FAmVJHbgSHGhyZWl0ekBy # ZWRoYXQuY29tAAoJEKH6QNCYAZzfLn4QAKxuUYZaXirv6K4U2tW4aAJtc5uESdwv # WYhG7YU7MleBGCY0fRoih5thrPrzRLC8o1QhbRcA36+/PAZf4BYrJEfqLUdzuN5x # 6Vb1n3NRUzPD1+VfL/B9hVZhFbtTOUZuxPGEqCoHAmqBaeKuYRT1bLZbtRtPVLSk # 5eTMiyrpRMlBWc7O71eGKLqU4k0vAznwHBGf2Z93qWAsKcRZCwbAWYa7Q6rJ9jJ8 # 1jNsQuAk0p74/uGEpFhoEVrFEcV6pMbI4+jB9i0t9YYxT0tLIdIX1VUx+AHJfItk # IF2stB6SFOaAy2W3Fn+0oJvz40aMLzg9VjEeTpGmdlKC67ZTYa6Obwzy5WNLPIap # k7VUheUEe8qoKUtxQNxGLR/HKEJSFXyhU0lgAGxE1gl2xc1QFFFsrimpwFd3d37j # 3PwfhjARHonf4ZXgsvtIjb7nG9seMZYO7Vht0OztJyW8c2XN5OFVPir9xLbd9VUg # wZNGB8jAsHgj77+S/mRIwpP+laKL8wB7zYZ1mgFI98QJIYqL8tGdV/IiUhLljHzc # XAmwekOhBMMbgHhliBy9zDuTy59+zZ0FoxZPn/JvBjqBAkEnz9EbhHxi2imQg+1d # XSoLbx1X1yEbepWz8mCGiveLIPkt+3qMJuuQF76nURaA+nm3tCl/nKca6QLnVKzU # 2QtPWS0qRmwd # =5w7S # -----END PGP SIGNATURE----- # gpg: Signature made Tue 07 Nov 2023 01:09:12 HKT # gpg: using RSA key CB62D7A0EE3829E45F004D34A1FA40D098019CDF # gpg: issuer "hreitz@redhat.com" # gpg: Good signature from "Hanna Reitz <hreitz@redhat.com>" [unknown] # gpg: WARNING: The key's User ID is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: CB62 D7A0 EE38 29E4 5F00 4D34 A1FA 40D0 9801 9CDF * tag 'pull-block-2023-11-06' of https://gitlab.com/hreitz/qemu: file-posix: fix over-writing of returning zone_append offset block/file-posix: fix update_zones_wp() caller qcow2: keep reference on zeroize with discard-no-unref enabled Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-11-07Merge tag 'pull-target-arm-20231106' of ↵Stefan Hajnoczi
https://git.linaro.org/people/pmaydell/qemu-arm into staging target-arm queue: * hw/arm/virt: fix PMU IRQ registration * hw/arm/virt: Report correct register sizes in ACPI DBG2/SPCR tables * hw/i386/intel_iommu: vtd_slpte_nonzero_rsvd(): assert no overflow * util/filemonitor-inotify: qemu_file_monitor_watch(): assert no overflow * mc146818rtc: rtc_set_time(): initialize tm to zeroes * block/nvme: nvme_process_completion() fix bound for cid * hw/core/loader: gunzip(): initialize z_stream * io/channel-socket: qio_channel_socket_flush(): improve msg validation * hw/arm/vexpress-a9: Remove useless mapping of RAM at address 0 * target/arm: Fix A64 LDRA immediate decode # -----BEGIN PGP SIGNATURE----- # # iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmVJBtUZHHBldGVyLm1h # eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3qYTEACYqLV57JezgRFXzMEwKX3l # 9IYbFje+lGemobdJOEHhRvXjCNb+5TwhEfQasri0FBzokw16S3WOOF7roGb6YOU1 # od1SGiS2AbrmiazlBpamVO8z0WAEgbnXIoQa/3xKAGPJXszD2zK+06KnXS5xuCuD # nHojzIx7Gv4HEIs4huY39/YL2HMaxrqvXC8IAu51eqY+TPnETT+WI3HxlZ2OMIsn # 1Jnn+FeZfA1bhKx4JsD9MyHM1ovbjOwYkHOlzjU6fmTFFPGKRy0nxnjMNCBcXHQ+ # unemc/9BhEFup76tkX+JIlSBrPre5Mnh93DsGKSapwKPKq+fQhUDmzXY2r3OvQZX # ryxO4PJkCNTM1wZU6GeEDPWVfhgBKHUMv+tr9Mf9iBlyXRsmXLSEl7AFUUaFlgAL # dSMyiAaUlfvGa7Gtta9eFAJ/GeaiuJu2CYq6lvtRrNIHflLm3gVCef8gmwM5Eqxm # 3PNzEoabKyQQfz69j9RCLpoutMBq1sg2IzxW8UjAFupugcIABjLf0Sl11qA0/B89 # YX67B0ynQD9ajI2GS8ULid/tvEiJVgdZ2Ua3U3xpG54vKG1/54EUiCP8TtoIuoMy # bKg8AU9EIPN962PxoAwS+bSSdCu7/zBjVpg4T/zIzWRdgSjRsE21Swu5Ca934ng5 # VpVUuiwtI/zvHgqaiORu+w== # =UbqJ # -----END PGP SIGNATURE----- # gpg: Signature made Mon 06 Nov 2023 23:31:33 HKT # gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE # gpg: issuer "peter.maydell@linaro.org" # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full] # gpg: aka "Peter Maydell <pmaydell@gmail.com>" [full] # gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full] # gpg: aka "Peter Maydell <peter@archaic.org.uk>" [unknown] # Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE * tag 'pull-target-arm-20231106' of https://git.linaro.org/people/pmaydell/qemu-arm: target/arm: Fix A64 LDRA immediate decode hw/arm/vexpress-a9: Remove useless mapping of RAM at address 0 io/channel-socket: qio_channel_socket_flush(): improve msg validation hw/core/loader: gunzip(): initialize z_stream block/nvme: nvme_process_completion() fix bound for cid mc146818rtc: rtc_set_time(): initialize tm to zeroes util/filemonitor-inotify: qemu_file_monitor_watch(): assert no overflow hw/i386/intel_iommu: vtd_slpte_nonzero_rsvd(): assert no overflow tests/qtest/bios-tables-test: Update virt SPCR and DBG2 golden references hw/arm/virt: Report correct register sizes in ACPI DBG2/SPCR tables. tests/qtest/bios-tables-test: Allow changes to virt SPCR and DBG2 hw/arm/virt: fix PMU IRQ registration Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-11-06file-posix: fix over-writing of returning zone_append offsetNaohiro Aota
raw_co_zone_append() sets "s->offset" where "BDRVRawState *s". This pointer is used later at raw_co_prw() to save the block address where the data is written. When multiple IOs are on-going at the same time, a later IO's raw_co_zone_append() call over-writes a former IO's offset address before raw_co_prw() completes. As a result, the former zone append IO returns the initial value (= the start address of the writing zone), instead of the proper address. Fix the issue by passing the offset pointer to raw_co_prw() instead of passing it through s->offset. Also, remove "offset" from BDRVRawState as there is no usage anymore. Fixes: 4751d09adcc3 ("block: introduce zone append write for zoned devices") Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Message-Id: <20231030073853.2601162-1-naohiro.aota@wdc.com> Reviewed-by: Sam Li <faithilikerun@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
2023-11-06block/file-posix: fix update_zones_wp() callerSam Li
When the zoned request fail, it needs to update only the wp of the target zones for not disrupting the in-flight writes on these other zones. The wp is updated successfully after the request completes. Fixed the callers with right offset and nr_zones. Signed-off-by: Sam Li <faithilikerun@gmail.com> Message-Id: <20230825040556.4217-1-faithilikerun@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> [hreitz: Rebased and fixed comment spelling] Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
2023-11-06qcow2: keep reference on zeroize with discard-no-unref enabledJean-Louis Dupond
When the discard-no-unref flag is enabled, we keep the reference for normal discard requests. But when a discard is executed on a snapshot/qcow2 image with backing, the discards are saved as zero clusters in the snapshot image. When committing the snapshot to the backing file, not discard_in_l2_slice is called but zero_in_l2_slice. Which did not had any logic to keep the reference when discard-no-unref is enabled. Therefor we add logic in the zero_in_l2_slice call to keep the reference on commit. Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1621 Signed-off-by: Jean-Louis Dupond <jean-louis@dupond.be> Message-Id: <20231003125236.216473-2-jean-louis@dupond.be> [hreitz: Made the documentation change more verbose, as discussed on-list] Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
2023-11-06block/nvme: nvme_process_completion() fix bound for cidVladimir Sementsov-Ogievskiy
NVMeQueuePair::reqs has length NVME_NUM_REQS, which less than NVME_QUEUE_SIZE by 1. Fixes: 1086e95da17050 ("block/nvme: switch to a NVMeRequest freelist") Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Maksim Davydov <davydov-max@yandex-team.ru> Message-id: 20231017125941.810461-5-vsementsov@yandex-team.ru Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-11-03util/uuid: Add UUID_STR_LEN definitionCédric Le Goater
qemu_uuid_unparse() includes a trailing NUL when writing the uuid string and the buffer size should be UUID_FMT_LEN + 1 bytes. Add a define for this size and use it where required. Cc: Fam Zheng <fam@euphon.net> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: "Denis V. Lunev" <den@openvz.org> Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-11-01cpr: relax blockdev migration blockersSteve Sistare
Some blockdevs block migration because they do not support sharing across hosts and/or do not support dirty bitmaps. These prohibitions do not apply if the old and new qemu processes do not run concurrently, and if new qemu starts on the same host as old, which is the case for cpr. Narrow the scope of these blockers so they only apply to normal mode. They will not block cpr modes when they are added in subsequent patches. No functional change until a new mode is added. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <1698263069-406971-4-git-send-email-steven.sistare@oracle.com>