Age  Commit message  Author
2019-11-14  Update version for 4.1.1 release  v4.1.1  Michael Roth
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-11-12  mirror: Keep mirror_top_bs drained after dropping permissions  Kevin Wolf
mirror_top_bs is currently implicitly drained through its connection to the source or the target node. However, the drain section for target_bs ends early after moving mirror_top_bs from src to target_bs, so that requests can already be restarted while mirror_top_bs is still present in the chain, but has dropped all permissions and therefore runs into an assertion failure like this: qemu-system-x86_64: block/io.c:1634: bdrv_co_write_req_prepare: Assertion `child->perm & BLK_PERM_WRITE' failed. Keep mirror_top_bs drained until all graph changes have completed. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit d2da5e288a2e71e82866c8fdefd41b5727300124) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-11-12  block/create: Do not abort if a block driver is not available  Philippe Mathieu-Daudé
The 'blockdev-create' QMP command was introduced as an experimental feature in commit b0292b851b8, using the assert() debug call. It got promoted to a 'stable' command in 3fb588a0f2c, but the assert call was not removed. Some block drivers are optional, and bdrv_find_format() might return a NULL value, triggering the assertion.

Stable code is not expected to abort, so return an error instead.

This is easily reproducible when libnfs is not installed:

    ./configure
    [...]
    module support    no
    Block whitelist (rw)
    Block whitelist (ro)
    libiscsi support  yes
    libnfs support    no
    [...]

Start QEMU:

    $ qemu-system-x86_64 -S -qmp unix:/tmp/qemu.qmp,server,nowait

Send the 'blockdev-create' with the 'nfs' driver:

    $ ( cat << 'EOF'
    {'execute': 'qmp_capabilities'}
    {'execute': 'blockdev-create', 'arguments': {'job-id': 'x', 'options': {'size': 0, 'driver': 'nfs', 'location': {'path': '/', 'server': {'host': '::1', 'type': 'inet'}}}}, 'id': 'x'}
    EOF
    ) | socat STDIO UNIX:/tmp/qemu.qmp
    {"QMP": {"version": {"qemu": {"micro": 50, "minor": 1, "major": 4}, "package": "v4.1.0-733-g89ea03a7dc"}, "capabilities": ["oob"]}}
    {"return": {}}

QEMU crashes:

    $ gdb qemu-system-x86_64 core
    Program received signal SIGSEGV, Segmentation fault.
    (gdb) bt
    #0 0x00007ffff510957f in raise () at /lib64/libc.so.6
    #1 0x00007ffff50f3895 in abort () at /lib64/libc.so.6
    #2 0x00007ffff50f3769 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
    #3 0x00007ffff5101a26 in .annobin_assert.c_end () at /lib64/libc.so.6
    #4 0x0000555555d7e1f1 in qmp_blockdev_create (job_id=0x555556baee40 "x", options=0x555557666610, errp=0x7fffffffc770) at block/create.c:69
    #5 0x0000555555c96b52 in qmp_marshal_blockdev_create (args=0x7fffdc003830, ret=0x7fffffffc7f8, errp=0x7fffffffc7f0) at qapi/qapi-commands-block-core.c:1314
    #6 0x0000555555deb0a0 in do_qmp_dispatch (cmds=0x55555645de70 <qmp_commands>, request=0x7fffdc005c70, allow_oob=false, errp=0x7fffffffc898) at qapi/qmp-dispatch.c:131
    #7 0x0000555555deb2a1 in qmp_dispatch (cmds=0x55555645de70 <qmp_commands>, request=0x7fffdc005c70, allow_oob=false) at qapi/qmp-dispatch.c:174

With this patch applied, QEMU returns a QMP error:

    {'execute': 'blockdev-create', 'arguments': {'job-id': 'x', 'options': {'size': 0, 'driver': 'nfs', 'location': {'path': '/', 'server': {'host': '::1', 'type': 'inet'}}}}, 'id': 'x'}
    {"id": "x", "error": {"class": "GenericError", "desc": "Block driver 'nfs' not found or not supported"}}

Cc: qemu-stable@nongnu.org
Reported-by: Xu Tian <xutian@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit d90d5cae2b10efc0e8d0b3cc91ff16201853d3ba)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-11-12  vhost: Fix memory region section comparison  Dr. David Alan Gilbert
Using memcmp to compare structures wasn't safe, as I found out on ARM when I was getting false miscompares. Use the helper function for comparing the MRSs. Fixes: ade6d081fc33948e56e6 ("vhost: Regenerate region list from changed sections list") Cc: qemu-stable@nongnu.org Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190814175535.2023-4-dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 3fc4a64cbaed2ddee4c60ddc06740b320e18ab82) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
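Why a byte-wise comparison of structures can report false mismatches is easy to reproduce outside QEMU. The following is a minimal sketch using Python's ctypes; the `Section` layout, `make_section` and `sections_equal` are hypothetical stand-ins chosen for illustration, not QEMU's MemoryRegionSection or its helper.

```python
import ctypes

class Section(ctypes.Structure):
    # Hypothetical layout: a 1-byte field followed by an 8-byte field
    # leaves unused padding bytes between the two.
    _fields_ = [("flag", ctypes.c_uint8), ("size", ctypes.c_uint64)]

def make_section(junk, flag, size):
    """Build a Section on top of memory pre-filled with `junk` bytes."""
    buf = bytearray([junk]) * ctypes.sizeof(Section)
    sec = Section.from_buffer(buf)
    sec.flag, sec.size = flag, size   # only the field bytes are overwritten
    return sec

def sections_equal(a, b):
    """Field-by-field comparison, ignoring padding."""
    return a.flag == b.flag and a.size == b.size

a = make_section(0x00, 1, 4096)
b = make_section(0xFF, 1, 4096)

raw_equal = bytes(a) == bytes(b)      # memcmp-style: padding is compared too
field_equal = sections_equal(a, b)    # helper-style: only real fields count
```

Both objects hold the same field values, yet the raw comparison differs because the padding bytes carry whatever was in memory before.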
2019-11-12  memory: Provide an equality function for MemoryRegionSections  Dr. David Alan Gilbert
Provide a comparison function that checks all the fields are the same. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190814175535.2023-3-dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 9366cf02e4e31c2a8128904d4d8290a0fad5f888) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-11-12  memory: Align MemoryRegionSections fields  Dr. David Alan Gilbert
MemoryRegionSection includes an Int128 'size' field; on some platforms the compiler aligns this to a 128-bit boundary, leaving 8 bytes of dead space. This dead space can be filled with junk. Move the size field to the top, avoiding unnecessary alignment. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190814175535.2023-2-dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 44f85d3276397cfa2cfa379c61430405dad4e644) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-11-12  tests: make filemonitor test more robust to event ordering  Daniel P. Berrangé
The ordering of events that are emitted during the rmdir test has changed with kernel >= 5.3. Semantically both new & old orderings are correct, so we must be able to cope with either. To cope with this, when we see an unexpected event, we push it back onto the queue and look at the subsequent event to see if that matches instead. Tested-by: Peter Xu <peterx@redhat.com> Tested-by: Wei Yang <richardw.yang@linux.intel.com> Tested-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> (cherry picked from commit bf9e0313c27d8e6ecd7f7de3d63e1cb25d8f6311) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
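The push-back-and-look-ahead idea can be sketched in a few lines. This is an illustration only: `match_events` and the event names are hypothetical, and the real test is written in C against QEMU's file monitor API.

```python
from collections import deque

def match_events(queue, expected):
    """Consume `expected` events from `queue`, tolerating a swap of two
    adjacent events: an unexpected head is pushed back and the following
    event is checked instead."""
    queue = deque(queue)
    for want in expected:
        if not queue:
            return False
        got = queue.popleft()
        if got != want:
            if not queue:
                return False
            nxt = queue.popleft()
            queue.appendleft(got)     # push the unexpected event back
            if nxt != want:
                return False
    return not queue                  # every emitted event must be consumed

old_order = ["deleted-file", "deleted-dir"]
new_order = ["deleted-dir", "deleted-file"]
```

With this matcher, an expectation written for one ordering also accepts the other, while a genuinely wrong event still fails.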
2019-11-12  block: posix: Always allocate the first block  Nir Soffer
When creating an image with preallocation "off" or "falloc", the first block of the image is typically not allocated. When using Gluster storage backed by an XFS filesystem, reading this block using direct I/O succeeds regardless of request length, fooling alignment detection.

In this case we fall back to a safe value (4096) instead of the optimal value (512), which may lead to unneeded data copying when aligning requests. Allocating the first block avoids the fallback.

Since we allocate the first block even with preallocation=off, we no longer create images with zero disk size:

    $ ./qemu-img create -f raw test.raw 1g
    Formatting 'test.raw', fmt=raw size=1073741824

    $ ls -lhs test.raw
    4.0K -rw-r--r--. 1 nsoffer nsoffer 1.0G Aug 16 23:48 test.raw

And converting the image requires an additional cluster:

    $ ./qemu-img measure -f raw -O qcow2 test.raw
    required size: 458752
    fully allocated size: 1074135040

When using a format like vmdk with multiple files per image, we allocate one block per file:

    $ ./qemu-img create -f vmdk -o subformat=twoGbMaxExtentFlat test.vmdk 4g
    Formatting 'test.vmdk', fmt=vmdk size=4294967296 compat6=off hwversion=undefined subformat=twoGbMaxExtentFlat

    $ ls -lhs test*.vmdk
    4.0K -rw-r--r--. 1 nsoffer nsoffer 2.0G Aug 27 03:23 test-f001.vmdk
    4.0K -rw-r--r--. 1 nsoffer nsoffer 2.0G Aug 27 03:23 test-f002.vmdk
    4.0K -rw-r--r--. 1 nsoffer nsoffer  353 Aug 27 03:23 test.vmdk

I did a quick performance test of copying disks with qemu-img convert to a new raw target image on Gluster storage with a sector size of 512 bytes:

    for i in $(seq 10); do
        rm -f dst.raw
        sleep 10
        time ./qemu-img convert -f raw -O raw -t none -T none src.raw dst.raw
    done

Here is a table comparing the total time spent:

    Type    Before(s)   After(s)    Diff(%)
    ---------------------------------------
    real      530.028    469.123      -11.4
    user       17.204     10.768      -37.4
    sys        17.881      7.011      -60.7

We can see a very clear improvement in CPU usage.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Message-id: 20190827010528.8818-2-nsoffer@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 3a20013fbb26d2a1bd11ef148eefdb1508783787)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
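The effect described above (a sparse image whose first block is nevertheless written) can be imitated with plain POSIX calls. This is a rough sketch, not QEMU's implementation: `create_raw_image` is a hypothetical helper, the 4096-byte block size is an assumption, and how many blocks the filesystem actually reports as allocated depends on the filesystem.

```python
import os
import tempfile

def create_raw_image(path, size, block=4096):
    """Create a sparse file of `size` bytes and write zeroes to its first block."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR | os.O_TRUNC, 0o644)
    try:
        os.ftruncate(fd, size)                      # sparse: no data blocks yet
        os.pwrite(fd, bytes(min(block, size)), 0)   # explicitly write the first block
    finally:
        os.close(fd)

with tempfile.TemporaryDirectory() as tmp:
    image = os.path.join(tmp, "test.raw")
    create_raw_image(image, 1024 * 1024)
    st = os.stat(image)
    image_size = st.st_size                 # logical size, unchanged by the write
    allocated_bytes = st.st_blocks * 512    # what `ls -s` would report, in bytes
    with open(image, "rb") as f:
        first_block = f.read(4096)
```

The logical size stays as requested while the first block reads back as zeroes from real storage rather than from a hole, which is what lets a later direct-I/O probe exercise the device's alignment rules.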
2019-11-12  file-posix: Handle undetectable alignment  Nir Soffer
In some cases buf_align or request_alignment cannot be detected:

1. With Gluster, buf_align cannot be detected since the actual I/O is done on the Gluster server, and qemu buffer alignment does not matter. Since we don't have an alignment requirement, buf_align=1 is the best value.

2. With a local XFS filesystem, buf_align cannot be detected if reading from an unallocated area. In this case we must align the buffer, but we don't know what the correct size is. Using the wrong alignment results in an I/O error.

3. With Gluster backed by XFS, request_alignment cannot be detected if reading from an unallocated area. In this case we need to use the correct alignment, and failing to do so results in I/O errors.

4. With NFS, the server does not use direct I/O, so neither buf_align nor request_alignment can be detected. In this case we don't need any alignment so we can use buf_align=1 and request_alignment=1.

These cases seem to work when the storage sector size is 512 bytes, because the current code starts checking align=512. If the check succeeds because alignment cannot be detected we use 512. But this does not work for storage with a 4k sector size.

To determine if we can detect the alignment, we probe first with align=1. If probing succeeds, either there is no alignment requirement (cases 1, 4) or we are probing an unallocated area (cases 2, 3). Since we don't have any way to tell, we treat this as undetectable alignment. If probing with align=1 fails with EINVAL, but probing with one of the expected alignments succeeds, we know that we found a working alignment.

Practically the alignment requirements are the same for buffer alignment, buffer length, and offset in file. So in case we cannot detect buf_align, we can use request alignment. If we cannot detect request alignment, we can fall back to a safe value. To use this logic, we probe request alignment first, instead of buf_align.

Here is a table showing the behaviour with current code (the value in parentheses is the optimal value).

    Case    Sector    buf_align (opt)   request_alignment (opt)     result
    ======================================================================
    1       512       512   (1)          512   (512)                 OK
    1       4096      512   (1)          4096  (4096)                FAIL
    ----------------------------------------------------------------------
    2       512       512   (512)        512   (512)                 OK
    2       4096      512   (4096)       4096  (4096)                FAIL
    ----------------------------------------------------------------------
    3       512       512   (1)          512   (512)                 OK
    3       4096      512   (1)          512   (4096)                FAIL
    ----------------------------------------------------------------------
    4       512       512   (1)          512   (1)                   OK
    4       4096      512   (1)          512   (1)                   OK

Same cases with this change:

    Case    Sector    buf_align (opt)   request_alignment (opt)     result
    ======================================================================
    1       512       512   (1)          512   (512)                 OK
    1       4096      4096  (1)          4096  (4096)                OK
    ----------------------------------------------------------------------
    2       512       512   (512)        512   (512)                 OK
    2       4096      4096  (4096)       4096  (4096)                OK
    ----------------------------------------------------------------------
    3       512       4096  (1)          4096  (512)                 OK
    3       4096      4096  (1)          4096  (4096)                OK
    ----------------------------------------------------------------------
    4       512       4096  (1)          4096  (1)                   OK
    4       4096      4096  (1)          4096  (1)                   OK

I tested that provisioning VMs and copying disks on local XFS and Gluster with 4k bytes sector size work now, resolving bugs [1],[2]. I tested also on XFS, NFS, Gluster with 512 bytes sector size.

[1] https://bugzilla.redhat.com/1737256
[2] https://bugzilla.redhat.com/1738657

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit a6b257a08e3d72219f03e461a52152672fec0612)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
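The probing order described in this message can be modelled as a small pure function. The sketch below is an interpretation of the text, not the C code in file-posix: the candidate list, `probe` and `detect` are assumptions, and the two callables stand in for real direct-I/O reads.

```python
CANDIDATES = (1, 512, 1024, 2048, 4096)   # assumed probe order, align=1 first
SAFE_ALIGN = 4096                         # safe fallback value from the message

def probe(works):
    """Return the first working alignment, or None when align=1 already
    succeeds, which the message treats as undetectable."""
    for align in CANDIDATES:
        if works(align):
            return None if align == 1 else align
    raise OSError("no candidate alignment worked")

def detect(request_works, buffer_works):
    """Probe request alignment first, then buffer alignment."""
    request_alignment = probe(request_works) or SAFE_ALIGN
    buf_align = probe(buffer_works) or request_alignment
    return buf_align, request_alignment

# Case 1, 4k sectors (Gluster): buffer alignment never matters, requests need 4k.
case1_4k = detect(lambda a: a % 4096 == 0, lambda a: True)
# Case 2, 512-byte sectors, allocated area (local XFS): both are detectable.
case2_512 = detect(lambda a: a % 512 == 0, lambda a: a % 512 == 0)
# Case 4 (NFS): nothing is detectable, so the safe value is used for both.
case4 = detect(lambda a: True, lambda a: True)
```

The three simulated cases reproduce the corresponding rows of the second table: undetectable values fall back to the request alignment or to 4096, never to an unverified 512.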
2019-11-12  block/file-posix: Let post-EOF fallocate serialize  Max Reitz
The XFS kernel driver has a bug that may cause data corruption for qcow2 images as of qemu commit c8bb23cbdbe32f. We can work around it by treating post-EOF fallocates as serializing up until infinity (INT64_MAX in practice). Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20191101152510.11719-4-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit 292d06b925b2787ee6f2430996b95651cae42fce) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-11-12  block: Add bdrv_co_get_self_request()  Max Reitz
Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20191101152510.11719-3-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit c28107e9e55b11cd35cf3dc2505e3e69d10dcf13) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-11-12  block: Make wait/mark serialising requests public  Max Reitz
Make both bdrv_mark_request_serialising() and bdrv_wait_serialising_requests() public so they can be used from block drivers. Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20191101152510.11719-2-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit 304d9d7f034ff7f5e1e66a65b7f720f63a72c57e) Conflicts: block/io.c *drop context dependency on 1acc3466a2 Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-11-12  block/io: refactor padding  Vladimir Sementsov-Ogievskiy
We have similar padding code in bdrv_co_pwritev, bdrv_co_do_pwrite_zeroes and bdrv_co_preadv. Let's combine and unify it. [Squashed in Vladimir's qemu-iotests 077 fix --Stefan] Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20190604161514.262241-4-vsementsov@virtuozzo.com Message-Id: <20190604161514.262241-4-vsementsov@virtuozzo.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> (cherry picked from commit 7a3f542fbdfd799be4fa6f8b96dc8c1e6933fce4) *prereq for 292d06b9 Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-11-12  util/iov: improve qemu_iovec_is_zero  Vladimir Sementsov-Ogievskiy
We'll need to check a part of qiov soon, so implement it now. Optimization with align down to 4 * sizeof(long) is dropped due to: 1. It is strange: it aligns length of the buffer, but where is a guarantee that buffer pointer is aligned itself? 2. buffer_is_zero() is a better place for optimizations and it has them. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20190604161514.262241-3-vsementsov@virtuozzo.com Message-Id: <20190604161514.262241-3-vsementsov@virtuozzo.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> (cherry picked from commit f76889e7b947d896db51be8a4d9c941c2f70365a) *prereq for 292d06b9 Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
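Checking only part of an I/O vector for zeroes amounts to walking the elements while tracking an offset and a remaining length. A minimal sketch, assuming a list of bytes objects as a stand-in for a QEMUIOVector; `iov_is_zero` is a hypothetical name and this is not QEMU's C implementation:

```python
def iov_is_zero(iov, offset, length):
    """Return True if `length` bytes starting at `offset` are all zero.
    `iov` is a list of bytes-like buffers."""
    for buf in iov:
        if length == 0:
            break
        if offset >= len(buf):
            offset -= len(buf)        # the range starts in a later element
            continue
        chunk = buf[offset:offset + length]
        if any(chunk):                # any non-zero byte ends the check
            return False
        length -= len(chunk)
        offset = 0
    assert length == 0, "range exceeds the vector"
    return True

iov = [b"\x01\x00\x00", b"\x00\x00", b"\x00\x07"]
```

In the sample vector the bytes at offsets 1 through 5 are zero even though the first and last bytes are not, so a partial check can succeed where a whole-vector check fails.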
2019-11-12  util/iov: introduce qemu_iovec_init_extended  Vladimir Sementsov-Ogievskiy
Introduce new initialization API, to create requests with padding. Will be used in the following patch. New API uses qemu_iovec_init_buf if resulting io vector has only one element, to avoid extra allocations. So, we need to update qemu_iovec_destroy to support destroying such QIOVs. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20190604161514.262241-2-vsementsov@virtuozzo.com Message-Id: <20190604161514.262241-2-vsementsov@virtuozzo.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> (cherry picked from commit d953169d4840f312d3b9a54952f4a7ccfcb3b311) *prereq for 292d06b9 Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
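The idea of building a padded request from a head buffer, a slice of an existing vector and a tail buffer can be illustrated as follows. The function name mirrors the commit subject, but the parameter list here is a simplified assumption and the Python lists only model the C structures.

```python
def iov_init_extended(head, mid_iov, mid_offset, mid_len, tail):
    """Build a new vector: `head` padding, then `mid_len` bytes of `mid_iov`
    starting at `mid_offset`, then `tail` padding. Empty parts are skipped."""
    parts = [head] if head else []
    for buf in mid_iov:
        if mid_len == 0:
            break
        if mid_offset >= len(buf):
            mid_offset -= len(buf)    # slice starts in a later element
            continue
        piece = buf[mid_offset:mid_offset + mid_len]
        parts.append(piece)
        mid_len -= len(piece)
        mid_offset = 0
    if tail:
        parts.append(tail)
    return parts

padded = iov_init_extended(b"HH", [b"abc", b"defg"], 1, 4, b"T")
single = iov_init_extended(b"", [b"abc"], 0, 3, b"")
```

When no padding is needed and the slice lies in one element, the result has a single element, which corresponds to the case where the message says the single-buffer initializer is used to avoid extra allocations.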
2019-11-12  qcow2-bitmap: Fix uint64_t left-shift overflow  Tuguoyi
There are two issues in check_constraints_on_bitmap(): 1) The sanity check on the granularity causes a uint64_t left-shift overflow when cluster_size is 2M and the granularity is bigger than 32K. 2) The calculation of the image size that the maximum supported bitmap can map is incorrect. This patch fixes both by adding a helper function that calculates the number of bytes a normal bitmap needs in the image and comparing it to the maximum bitmap size supported by qemu. Fixes: 5f72826e7fc62167cf3a Signed-off-by: Guoyi Tu <tu.guoyi@h3c.com> Message-id: 4ba40cd1e7ee4a708b40899952e49f22@h3c.com Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit 570542ecb11e04b61ef4b3f4d0965a6915232a88) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
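The overflow can be reproduced with a little arithmetic. In the sketch below, the table-size limit of 2^27 entries is an assumption made for illustration (the message does not state the constant), `shl_u64` emulates C uint64_t wraparound, and `bitmap_bytes` is a hypothetical helper in the spirit of the fix rather than the function added by the patch.

```python
U64_MASK = (1 << 64) - 1

def shl_u64(value, bits):
    """Left shift with C uint64_t wraparound."""
    return (value << bits) & U64_MASK

MAX_TABLE_ENTRIES = 1 << 27        # assumed limit, for illustration only
cluster_size = 2 * 1024 * 1024     # 2M clusters, as in the message

limit_32k = shl_u64(MAX_TABLE_ENTRIES * cluster_size, 15)   # 32K granularity
limit_64k = shl_u64(MAX_TABLE_ENTRIES * cluster_size, 16)   # 64K granularity

def bitmap_bytes(image_len, granularity_bits):
    """Bytes needed for a bitmap with one bit per granule, rounded up."""
    granules = -(-image_len // (1 << granularity_bits))
    return -(-granules // 8)

needed = bitmap_bytes(1 << 30, 16)  # 1 GiB image, 64K granularity
```

Under these assumptions the shifted limit is 2^63 at 32K granularity but wraps to 0 at 64K, so any image length would exceed it. Computing the bytes the bitmap needs and comparing that to a byte limit avoids the shift altogether.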
2019-11-12  iotests: Add peek_file* functions  Max Reitz
Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20191011152814.14791-16-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit fc8ba423ca6b37bee56ec9dc339b44043c39553d) *prereq for 570542ec Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-11-12  iotests: Add test for 4G+ compressed qcow2 write  Max Reitz
Test what qemu-img check says about an image after one has written compressed data to an offset above 4 GB. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20191028161841.1198-3-mreitz@redhat.com Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit b7cd2c11f76d27930f53d3cf26d7b695c78d613b) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-11-11  qcow2: Fix QCOW2_COMPRESSED_SECTOR_MASK  Max Reitz
Masks for L2 table entries should be 64 bits wide. Fixes: b6c246942b14d3e0dec46a6c5868ed84e7dbea19 Buglink: https://bugs.launchpad.net/qemu/+bug/1850000 Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20191028161841.1198-2-mreitz@redhat.com Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit 24552feb6ae2f615b76c2b95394af43901f75046) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
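A short calculation shows what a mask that is only 32 bits wide does to an offset above 4 GB. The 512-byte sector size and the variable names are assumptions for illustration; the macro's actual definition is not quoted in the message.

```python
SECTOR_SIZE = 512   # assumed compressed-sector size

mask32 = ~(SECTOR_SIZE - 1) & 0xFFFFFFFF            # mask computed in 32 bits
mask64 = ~(SECTOR_SIZE - 1) & 0xFFFFFFFFFFFFFFFF    # mask computed in 64 bits

offset = 5 * 1024**3 + 1000    # a host offset above 4 GB (0x1400003E8)

truncated = offset & mask32    # high bits are silently dropped
correct = offset & mask64      # only the low 9 bits are cleared
```

With the narrow mask the result points below 4 GB, which is the kind of wrong offset that a write of compressed data above 4 GB would then record.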
2019-11-05  virtio-blk: Cancel the pending BH when the dataplane is reset  Philippe Mathieu-Daudé
When 'system_reset' is called, the main loop clears the memory region cache before the BH has a chance to execute. Later when the deferred function is called, some assumptions that were made when scheduling it are no longer true.

This is what happens using a virtio-blk device (fresh RHEL7.8 install):

    $ (sleep 12.3; echo system_reset; sleep 12.3; echo system_reset; sleep 1; echo q) \
      | qemu-system-x86_64 -m 4G -smp 8 -boot menu=on \
        -device virtio-blk-pci,id=image1,drive=drive_image1 \
        -drive file=/var/lib/libvirt/images/rhel78.qcow2,if=none,id=drive_image1,format=qcow2,cache=none \
        -device virtio-net-pci,netdev=net0,id=nic0,mac=52:54:00:c4:e7:84 \
        -netdev tap,id=net0,script=/bin/true,downscript=/bin/true,vhost=on \
        -monitor stdio -serial null -nographic
    (qemu) system_reset
    (qemu) system_reset
    (qemu) qemu-system-x86_64: hw/virtio/virtio.c:225: vring_get_region_caches: Assertion `caches != NULL' failed.
    Aborted

    (gdb) bt
    Thread 1 (Thread 0x7f109c17b680 (LWP 10939)):
    #0 0x00005604083296d1 in vring_get_region_caches (vq=0x56040a24bdd0) at hw/virtio/virtio.c:227
    #1 0x000056040832972b in vring_avail_flags (vq=0x56040a24bdd0) at hw/virtio/virtio.c:235
    #2 0x000056040832d13d in virtio_should_notify (vdev=0x56040a240630, vq=0x56040a24bdd0) at hw/virtio/virtio.c:1648
    #3 0x000056040832d1f8 in virtio_notify_irqfd (vdev=0x56040a240630, vq=0x56040a24bdd0) at hw/virtio/virtio.c:1662
    #4 0x00005604082d213d in notify_guest_bh (opaque=0x56040a243ec0) at hw/block/dataplane/virtio-blk.c:75
    #5 0x000056040883dc35 in aio_bh_call (bh=0x56040a243f10) at util/async.c:90
    #6 0x000056040883dccd in aio_bh_poll (ctx=0x560409161980) at util/async.c:118
    #7 0x0000560408842af7 in aio_dispatch (ctx=0x560409161980) at util/aio-posix.c:460
    #8 0x000056040883e068 in aio_ctx_dispatch (source=0x560409161980, callback=0x0, user_data=0x0) at util/async.c:261
    #9 0x00007f10a8fca06d in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
    #10 0x0000560408841445 in glib_pollfds_poll () at util/main-loop.c:215
    #11 0x00005604088414bf in os_host_main_loop_wait (timeout=0) at util/main-loop.c:238
    #12 0x00005604088415c4 in main_loop_wait (nonblocking=0) at util/main-loop.c:514
    #13 0x0000560408416b1e in main_loop () at vl.c:1923
    #14 0x000056040841e0e8 in main (argc=20, argv=0x7ffc2c3f9c58, envp=0x7ffc2c3f9d00) at vl.c:4578

Fix this by cancelling the BH when the virtio dataplane is stopped.

[This version of the patch was modified as discussed with Philippe on the mailing list thread. --Stefan]

Reported-by: Yihuang Yu <yihyu@redhat.com>
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Fixes: https://bugs.launchpad.net/qemu/+bug/1839428
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190816171503.24761-1-philmd@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ebb6ff25cd888a52a64a9adc3692541c6d1d9a42)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-11-05  scsi: lsi: exit infinite loop while executing script (CVE-2019-12068)  Paolo Bonzini
When executing script in lsi_execute_script(), the LSI scsi adapter emulator advances 's->dsp' index to read next opcode. This can lead to an infinite loop if the next opcode is empty. Move the existing loop exit after 10k iterations so that it covers no-op opcodes as well. Reported-by: Bugs SysSec <bugs-syssec@rub.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit de594e47659029316bbf9391efb79da0a1a08e08) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
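Where the iteration cap sits relative to the no-op handling is the whole fix, and a toy interpreter makes the point. Everything below is hypothetical (opcode values, `execute_script`, the wrap-around fetch); only the 10k limit comes from the message.

```python
MAX_INSNS = 10000   # iteration cap mentioned in the message
NOP, HALT = 0, 1    # toy opcodes, not the LSI instruction set

def execute_script(script, start=0):
    """Run a toy script; returns 'halted' or 'aborted'."""
    dsp, executed = start, 0
    while True:
        executed += 1
        if executed > MAX_INSNS:
            return "aborted"          # checked first, so empty opcodes count too
        opcode = script[dsp % len(script)]
        dsp += 1                      # advance to the next opcode
        if opcode == NOP:
            continue                  # without the early check this path never ends
        if opcode == HALT:
            return "halted"

all_nops = execute_script([NOP] * 8)
normal = execute_script([NOP, NOP, HALT])
```

A script consisting only of empty opcodes now terminates after the cap instead of spinning forever, while a normal script is unaffected.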
2019-11-05  target/xtensa: regenerate and re-import test_mmuhifi_c3 core  Max Filippov
Overlay part of the test_mmuhifi_c3 core has GPL3 copyright headers in it. Fix that by regenerating test_mmuhifi_c3 core overlay and re-importing it. Fixes: d848ea776728 ("target/xtensa: add test_mmuhifi_c3 core") Reported-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> (cherry picked from commit d5eaec84e592bb0085f84bef54d0a41e31faa99a) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-11-04  target/arm: Allow reading flags from FPSCR for M-profile  Christophe Lyon
rt==15 is a special case when reading the flags: it means the destination is APSR. This patch avoids rejecting vmrs apsr_nzcv, fpscr as illegal instruction. Cc: qemu-stable@nongnu.org Signed-off-by: Christophe Lyon <christophe.lyon@linaro.org> Message-id: 20191025095711.10853-1-christophe.lyon@linaro.org [PMM: updated the comment] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit 2529ab43b8a05534494704e803e0332d111d8b91) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-11-04  hbitmap: handle set/reset with zero length  Vladimir Sementsov-Ogievskiy
Passing zero length to these functions leads to unpredictable results. Zero-length set/reset may occur in active-mirror, on zero-length write (which is unlikely, but not guaranteed to never happen). Let's just do nothing on zero-length request. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20191011090711.19940-2-vsementsov@virtuozzo.com Reviewed-by: Max Reitz <mreitz@redhat.com> Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit fed33bd175f663cc8c13f8a490a4f35a19756cfe) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
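The reason a zero length is dangerous is that range code usually computes the last covered position as start + count - 1, which ends up before the start (or underflows, with unsigned C types) when count is 0. A toy bitmap with an explicit early return, assuming one bit per `granularity` bytes; `TinyBitmap` is hypothetical and much simpler than QEMU's HBitmap:

```python
class TinyBitmap:
    """Toy dirty bitmap with one bit per `granularity` bytes."""

    def __init__(self, size, granularity):
        self.granularity = granularity
        self.bits = [False] * -(-size // granularity)

    def _fill(self, start, count, value):
        if count == 0:
            return                    # zero-length request: do nothing
        first = start // self.granularity
        last = (start + count - 1) // self.granularity
        for i in range(first, last + 1):
            self.bits[i] = value

    def set(self, start, count):
        self._fill(start, count, True)

    def reset(self, start, count):
        self._fill(start, count, False)

bm = TinyBitmap(256, 64)
bm.set(64, 0)                         # no-op
after_empty = list(bm.bits)
bm.set(10, 100)                       # touches granules 0 and 1
after_set = list(bm.bits)
```

Python's range happens to tolerate last < first, but the explicit check documents the intended behaviour and is what keeps the equivalent unsigned C arithmetic safe.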
2019-11-04  util/hbitmap: strict hbitmap_reset  Vladimir Sementsov-Ogievskiy
hbitmap_reset has an unobvious property: it rounds requested region up. It may provoke bugs, like in recently fixed write-blocking mode of mirror: user calls reset on unaligned region, not keeping in mind that there are possible unrelated dirty bytes, covered by rounded-up region and information of this unrelated "dirtiness" will be lost. Make hbitmap_reset strict: assert that arguments are aligned, allowing only one exception when @start + @count == hb->orig_size. It's needed to comfort users of hbitmap_next_dirty_area, which cares about hb->orig_size. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20190806152611.280389-1-vsementsov@virtuozzo.com> [Maintainer edit: Max's suggestions from on-list. --js] [Maintainer edit: Eric's suggestion for aligned macro. --js] Signed-off-by: John Snow <jsnow@redhat.com> (cherry picked from commit 48557b138383aaf69c2617ca9a88bfb394fc50ec) *prereq for fed33bd175f663cc8c13f8a490a4f35a19756cfe Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-11-04  COLO-compare: Fix incorrect `if` logic  Fan Yang
'colo_mark_tcp_pkt' should return 'true' when packets are the same, and 'false' otherwise. However, it returns 'true' when 'colo_compare_packet_payload' returns non-zero while 'colo_compare_packet_payload' is just a 'memcmp'. The result is that COLO-compare reports inconsistent TCP packets when they are actually the same. Fixes: f449c9e549c ("colo: compare the packet based on the tcp sequence number") Cc: qemu-stable@nongnu.org Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Fan Yang <Fan_Yang@sjtu.edu.cn> Signed-off-by: Jason Wang <jasowang@redhat.com> (cherry picked from commit 1e907a32b77e5d418538453df5945242e43224fa) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
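The inverted condition is the classic pitfall of treating a memcmp-style result as a boolean. A minimal sketch with hypothetical function names (the real code is C and compares packet payloads):

```python
def compare_payload(a, b):
    """memcmp-style result: 0 when equal, non-zero otherwise."""
    return (a > b) - (a < b)

def packets_match_buggy(a, b):
    # Treats "non-zero" as "same", which is the inverted logic described above.
    return bool(compare_payload(a, b))

def packets_match_fixed(a, b):
    # Equal payloads compare to 0.
    return compare_payload(a, b) == 0

same = (b"abc", b"abc")
different = (b"abc", b"abd")
```

With the buggy variant identical payloads are reported as inconsistent, matching the symptom in the message; the fixed variant returns true only for equal payloads.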
2019-11-04  virtio-net: prevent offloads reset on migration  Mikhail Sennikovsky
Currently offloads disabled by guest via the VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET command are not preserved on VM migration. Instead all offloads reported by guest features (via VIRTIO_PCI_GUEST_FEATURES) get enabled.

What happens is: first the VirtIONet::curr_guest_offloads gets restored and offloads are getting set correctly:

    #0 qemu_set_offload (nc=0x555556a11400, csum=1, tso4=0, tso6=0, ecn=0, ufo=0) at net/net.c:474
    #1 virtio_net_apply_guest_offloads (n=0x555557701ca0) at hw/net/virtio-net.c:720
    #2 virtio_net_post_load_device (opaque=0x555557701ca0, version_id=11) at hw/net/virtio-net.c:2334
    #3 vmstate_load_state (f=0x5555569dc010, vmsd=0x555556577c80 <vmstate_virtio_net_device>, opaque=0x555557701ca0, version_id=11) at migration/vmstate.c:168
    #4 virtio_load (vdev=0x555557701ca0, f=0x5555569dc010, version_id=11) at hw/virtio/virtio.c:2197
    #5 virtio_device_get (f=0x5555569dc010, opaque=0x555557701ca0, size=0, field=0x55555668cd00 <__compound_literal.5>) at hw/virtio/virtio.c:2036
    #6 vmstate_load_state (f=0x5555569dc010, vmsd=0x555556577ce0 <vmstate_virtio_net>, opaque=0x555557701ca0, version_id=11) at migration/vmstate.c:143
    #7 vmstate_load (f=0x5555569dc010, se=0x5555578189e0) at migration/savevm.c:829
    #8 qemu_loadvm_section_start_full (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2211
    #9 qemu_loadvm_state_main (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2395
    #10 qemu_loadvm_state (f=0x5555569dc010) at migration/savevm.c:2467
    #11 process_incoming_migration_co (opaque=0x0) at migration/migration.c:449

However later on the features are getting restored, and offloads get reset to everything supported by features:

    #0 qemu_set_offload (nc=0x555556a11400, csum=1, tso4=1, tso6=1, ecn=0, ufo=0) at net/net.c:474
    #1 virtio_net_apply_guest_offloads (n=0x555557701ca0) at hw/net/virtio-net.c:720
    #2 virtio_net_set_features (vdev=0x555557701ca0, features=5104441767) at hw/net/virtio-net.c:773
    #3 virtio_set_features_nocheck (vdev=0x555557701ca0, val=5104441767) at hw/virtio/virtio.c:2052
    #4 virtio_load (vdev=0x555557701ca0, f=0x5555569dc010, version_id=11) at hw/virtio/virtio.c:2220
    #5 virtio_device_get (f=0x5555569dc010, opaque=0x555557701ca0, size=0, field=0x55555668cd00 <__compound_literal.5>) at hw/virtio/virtio.c:2036
    #6 vmstate_load_state (f=0x5555569dc010, vmsd=0x555556577ce0 <vmstate_virtio_net>, opaque=0x555557701ca0, version_id=11) at migration/vmstate.c:143
    #7 vmstate_load (f=0x5555569dc010, se=0x5555578189e0) at migration/savevm.c:829
    #8 qemu_loadvm_section_start_full (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2211
    #9 qemu_loadvm_state_main (f=0x5555569dc010, mis=0x5555569eee20) at migration/savevm.c:2395
    #10 qemu_loadvm_state (f=0x5555569dc010) at migration/savevm.c:2467
    #11 process_incoming_migration_co (opaque=0x0) at migration/migration.c:449

Fix this by preserving the state in saved_guest_offloads field and pushing out offload initialization to the new post load hook.

Cc: qemu-stable@nongnu.org
Signed-off-by: Mikhail Sennikovsky <mikhail.sennikovskii@cloud.ionos.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 7788c3f2e21e35902d45809b236791383bbb613e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-11-04  virtio: new post_load hook  Michael S. Tsirkin
Post load hook in virtio vmsd is called early while device is processed, and when VirtIODevice core isn't fully initialized. Most device specific code isn't ready to deal with a device in such state, and behaves weirdly. Add a new post_load hook in a device class instead. Devices should use this unless they specifically want to verify the migration stream as it's processed, e.g. for bounds checking. Cc: qemu-stable@nongnu.org Suggested-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Cc: Mikhail Sennikovsky <mikhail.sennikovskii@cloud.ionos.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> (cherry picked from commit 1dd713837cac8ec5a97d3b8492d72ce5ac94803c) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-11-04  ui: Fix hanging up Cocoa display on macOS 10.15 (Catalina)  Hikaru Nishida
macOS API documentation says that no events are processed before applicationDidFinishLaunching is called. However, some events are fired before it is called in macOS Catalina. This causes a deadlock on iothread_lock in handleEvent, because the lock is released only after app_started_sem is posted. This patch avoids processing events before app_started_sem is posted to prevent this deadlock. Buglink: https://bugs.launchpad.net/qemu/+bug/1847906 Signed-off-by: Hikaru Nishida <hikarupsp@gmail.com> Message-id: 20191015010734.85229-1-hikarupsp@gmail.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> (cherry picked from commit dff742ad27efa474ec04accdbf422c9acfd3e30e) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-11-04  mirror: Do not dereference invalid pointers  Max Reitz
mirror_exit_common() may be called twice (if it is called from mirror_prepare() and fails, it will be called from mirror_abort() again). In such a case, many of the pointers in the MirrorBlockJob object will already be freed. This can be seen most reliably for s->target, which is set to NULL (and then dereferenced by blk_bs()). Cc: qemu-stable@nongnu.org Fixes: 737efc1eda23b904fbe0e66b37715fb0e5c3e58b Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20191014153931.20699-2-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit f93c3add3a773e0e3f6277e5517583c4ad3a43c2) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-11-04  iotests: Test large write request to qcow2 file  Max Reitz
Without HEAD^, the following happens when you attempt a large write request to a qcow2 file such that the number of bytes covered by all clusters involved in a single allocation will exceed INT_MAX: (A) handle_alloc_space() decides to fill the whole area with zeroes and fails because bdrv_co_pwrite_zeroes() fails (the request is too large). (B) If handle_alloc_space() does not do anything, but merge_cow() decides that the requests can be merged, it will create a too long IOV that later cannot be written. (C) Otherwise, all parts will be written separately, so those requests will work. In either B or C, though, qcow2_alloc_cluster_link_l2() will have an overflow: We use an int (i) to iterate over nb_clusters, and then calculate the L2 entry based on "i << s->cluster_bits" -- which will overflow if the range covers more than INT_MAX bytes. This then leads to image corruption because the L2 entry will be wrong (it will be recognized as a compressed cluster). Even if that were not the case, the .cow_end area would be empty (because handle_alloc() will cap avail_bytes and nb_bytes at INT_MAX, so their difference (which is the .cow_end size) will be 0). So this test checks that on such large requests, the image will not be corrupted. Unfortunately, we cannot check whether COW will be handled correctly, because that data is discarded when it is written to null-co (but we have to use null-co, because writing 2 GB of data in a test is not quite reasonable). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit a1406a9262a087d9ec9627b88da13c4590b61dae) Conflicts: tests/qemu-iotests/group *drop context dep. on tests not in 4.1 Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-11-04  qcow2: Limit total allocation range to INT_MAX  Max Reitz
When the COW areas are included, the size of an allocation can exceed INT_MAX. This is kind of limited by handle_alloc() in that it already caps avail_bytes at INT_MAX, but the number of clusters still reflects the original length. This can have all sorts of effects, ranging from the storage layer write call failing to image corruption. (If there were no image corruption, then I suppose there would be data loss because the .cow_end area is forced to be empty, even though there might be something we need to COW.) Fix all of it by limiting nb_clusters so the equivalent number of bytes will not exceed INT_MAX. Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit d1b9d19f99586b33795e20a79f645186ccbc070f) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
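The arithmetic behind the cap can be sketched as follows. This is an illustration under assumptions, not the code from the patch: the helper names are made up, and the byte offset is computed in 64 bits here specifically to sidestep the `int` overflow that `i << cluster_bits` would hit.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Largest cluster count whose total byte size still fits in an int. */
static uint64_t max_clusters_for_int(unsigned cluster_bits)
{
    return (uint64_t)INT_MAX >> cluster_bits;
}

/* Cap nb_clusters so that (nb_clusters << cluster_bits) <= INT_MAX. */
static uint64_t cap_nb_clusters(uint64_t nb_clusters, unsigned cluster_bits)
{
    uint64_t limit = max_clusters_for_int(cluster_bits);
    return nb_clusters < limit ? nb_clusters : limit;
}

/* Byte offset of cluster i, computed in 64 bits so it cannot wrap an int. */
static uint64_t cluster_byte_offset(uint64_t i, unsigned cluster_bits)
{
    return i << cluster_bits;
}
```

With 64 KiB clusters (`cluster_bits` of 16), a request for 40000 clusters covers more than INT_MAX bytes. After capping, the covered range fits again.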
2019-11-04  hw/core/loader: Fix possible crash in rom_copy()  Thomas Huth
Both "rom->addr" and "addr" are derived from the binary image that can be loaded with the "-kernel" parameter. The code in rom_copy() then calculates: d = dest + (rom->addr - addr); and uses "d" as the destination in a memcpy() some lines later. Now with bad kernel images, it is possible that rom->addr is smaller than addr, thus "rom->addr - addr" becomes negative and the memcpy() then tries to copy contents from the image to a bad memory location. This could maybe be used to inject code from a kernel image into the QEMU binary, so we better fix it with an additional sanity check here. Cc: qemu-stable@nongnu.org Reported-by: Guangming Liu Buglink: https://bugs.launchpad.net/qemu/+bug/1844635 Message-Id: <20190925130331.27825-1-thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com> (cherry picked from commit e423455c4f23a1a828901c78fe6d03b7dde79319) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
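A sketch of the kind of sanity check described above, assuming a simplified copy helper. `copy_rom_checked()` and its parameters are hypothetical and do not mirror the real rom_copy() signature. The point is only that the offset `rom_addr - addr` is validated before it is used as a destination offset:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy a ROM blob located at guest address rom_addr into dest, which
 * represents the guest range [addr, addr + size). Returns false and
 * copies nothing if the ROM does not start inside that range.
 */
static bool copy_rom_checked(uint8_t *dest, uint64_t addr, size_t size,
                             const uint8_t *rom, uint64_t rom_addr,
                             size_t rom_len)
{
    if (rom_addr < addr || rom_addr - addr >= size) {
        return false;  /* offset would be negative or past the buffer */
    }
    size_t off = (size_t)(rom_addr - addr);
    size_t room = size - off;
    size_t len = rom_len < room ? rom_len : room;  /* truncate at buffer end */
    memcpy(dest + off, rom, len);
    return true;
}
```

A ROM whose address lies below `addr` is rejected instead of producing a pointer in front of `dest`.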
2019-11-04  vhost-user: save features if the char dev is closed  Adrian Moreno
That way the state can be correctly restored when the device is opened again. This might happen if the backend is restarted. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1738768 Reported-by: Pei Zhang <pezhang@redhat.com> Fixes: 6ab79a20af3a ("do not call vhost_net_cleanup() on running net from char user event") Cc: ddstreet@canonical.com Cc: Michael S. Tsirkin <mst@redhat.com> Cc: qemu-stable@nongnu.org Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Message-Id: <20190924162044.11414-1-amorenoz@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit c6beefd674fff8d41b90365dfccad32e53a5abcb) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-11-04  iotests: Test internal snapshots with -blockdev  Kevin Wolf
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com> Tested-by: Peter Krempa <pkrempa@redhat.com> (cherry picked from commit 92b22e7b1789b0e5f20d245706e72eae70dbddce) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-11-04  block/snapshot: Restrict set of snapshot nodes  Kevin Wolf
Nodes involved in internal snapshots were those that were returned by bdrv_next(), inserted and not read-only. bdrv_next() in turn returns all nodes that are either the root node of a BlockBackend or monitor-owned nodes. With the typical -drive use, this worked well enough. However, in the typical -blockdev case, the user defines one node per option, making all nodes monitor-owned nodes. This includes protocol nodes etc. which often are not snapshottable, so "savevm" only returns an error. Change the conditions so that internal snapshot still include all nodes that have a BlockBackend attached (we definitely want to snapshot anything attached to a guest device and probably also the built-in NBD server; snapshotting block job BlockBackends is more of an accident, but a preexisting one), but other monitor-owned nodes are only included if they have no parents. This makes internal snapshots usable again with typical -blockdev configurations. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com> Tested-by: Peter Krempa <pkrempa@redhat.com> (cherry picked from commit 05f4aced658a02b02d3e89a6c7a2281008fcf26c) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
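The new selection rule can be written as a small predicate. This is a sketch over a hypothetical `Node` summary struct, not QEMU's BlockDriverState, and it leaves out the pre-existing "inserted and not read-only" conditions for brevity:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of a block node; the fields are illustrative. */
typedef struct Node {
    bool has_block_backend;  /* a BlockBackend is attached to this node */
    bool monitor_owned;      /* defined by the user, e.g. via -blockdev */
    int num_parents;         /* number of parents in the block graph */
} Node;

/* Decide whether a node takes part in an internal snapshot. */
static bool include_in_snapshot(const Node *n)
{
    if (n->has_block_backend) {
        return true;  /* anything with a BlockBackend is always included */
    }
    /* Other monitor-owned nodes only count if nothing sits on top of them. */
    return n->monitor_owned && n->num_parents == 0;
}
```

Under this rule a protocol node that was defined with -blockdev but serves as the child of a format node is skipped, while the format node on top of it is still snapshotted.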
2019-10-30  s390: PCI: fix IOMMU region init  Matthew Rosato
The fix in dbe9cf606c shrinks the IOMMU memory region to a size that seems reasonable on the surface, but is actually too small, as it is based on a 0-mapped address space. This causes breakage with small guests as they can overrun the IOMMU window. Let's go back to the prior method of initializing iommu for now. Fixes: dbe9cf606c ("s390x/pci: Set the iommu region size mpcifc request") Cc: qemu-stable@nongnu.org Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Reported-by: Boris Fiuczynski <fiuczy@linux.ibm.com> Tested-by: Boris Fiuczynski <fiuczy@linux.ibm.com> Reported-by: Stefan Zimmerman <stzi@linux.ibm.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Message-Id: <1569507036-15314-1-git-send-email-mjrosato@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> (cherry picked from commit 7df1dac5f1c85312474df9cb3a8fcae72303da62) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-10-30  roms/Makefile.edk2: don't pull in submodules when building from tarball  Michael Roth
Currently the `make efi` target pulls submodules nested under the roms/edk2 submodule as dependencies. However, when we attempt to build from a tarball this fails since we are no longer in a git tree. A preceding patch will pre-populate these submodules in the tarball, so assume this build dependency is only needed when building from a git tree. Cc: Laszlo Ersek <lersek@redhat.com> Cc: Bruce Rogers <brogers@suse.com> Cc: qemu-stable@nongnu.org # v4.1.0 Reported-by: Bruce Rogers <brogers@suse.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Message-Id: <20190912231202.12327-3-mdroth@linux.vnet.ibm.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> (cherry picked from commit f3e330e3c319160ac04954399b5a10afc965098c) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-10-30  make-release: pull in edk2 submodules so we can build it from tarballs  Michael Roth
The `make efi` target added by 536d2173 is built from the roms/edk2 submodule, which in turn relies on additional submodules nested under roms/edk2. The make-release script currently only pulls in top-level submodules, so these nested submodules are missing in the resulting tarball. We could try to address this situation more generally by recursively pulling in all submodules, but this doesn't necessarily ensure the end-result will build properly (this case also required other changes). Additionally, due to the nature of submodules, we may not always have control over how these sorts of things are dealt with, so for now we continue to handle it on a case-by-case basis in the make-release script. Cc: Laszlo Ersek <lersek@redhat.com> Cc: Bruce Rogers <brogers@suse.com> Cc: qemu-stable@nongnu.org # v4.1.0 Reported-by: Bruce Rogers <brogers@suse.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Message-Id: <20190912231202.12327-2-mdroth@linux.vnet.ibm.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> (cherry picked from commit 45c61c6c23918e3b05ed9ecac5b2328ebae5f774) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-10-30  hw/arm/boot.c: Set NSACR.{CP11,CP10} for NS kernel boots  Peter Maydell
If we're booting a Linux kernel directly into Non-Secure state on a CPU which has Secure state, then make sure we set the NSACR CP11 and CP10 bits, so that Non-Secure is allowed to access the FPU. Otherwise an AArch32 kernel will UNDEF as soon as it tries to use the FPU. It used to not matter that we didn't do this until commit fc1120a7f5f2d4b6, where we implemented actually honouring these NSACR bits. The problem only exists for CPUs where EL3 is AArch32; the equivalent AArch64 trap bits are in CPTR_EL3 and are "0 to not trap, 1 to trap", so the reset value of the register permits NS access, unlike NSACR. Fixes: fc1120a7f5 Fixes: https://bugs.launchpad.net/qemu/+bug/1844597 Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20190920174039.3916-1-peter.maydell@linaro.org (cherry picked from commit ece628fcf69cbbd4b3efb6fbd203af07609467a2) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
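As a small illustration of the register update described above: in the Arm architecture the NSACR coprocessor access bits sit at the bit position of the coprocessor number, so CP10 and CP11 are bits 10 and 11. The macro and function names below are made up for this sketch and are not taken from hw/arm/boot.c:

```c
#include <assert.h>
#include <stdint.h>

/* NSACR.cpN is bit N; CP10 and CP11 together gate FPU/SIMD access. */
#define NSACR_CP10 (1u << 10)
#define NSACR_CP11 (1u << 11)

/* Return the NSACR value with Non-Secure FPU access enabled. */
static uint32_t nsacr_allow_ns_fpu(uint32_t nsacr)
{
    return nsacr | NSACR_CP10 | NSACR_CP11;
}
```

Starting from a reset value of 0 this yields 0xC00. Other bits in the register are left untouched, and applying it twice changes nothing further.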
2019-10-30  block/backup: fix backup_cow_with_offload for last cluster  Vladimir Sementsov-Ogievskiy
We shouldn't try to copy bytes beyond EOF. Fix it. Fixes: 9ded4a0114968e Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20190920142056.12778-3-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit 1048ddf0a32dcdaa952e581bd503d49adad527cc) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
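The clamp for the last cluster amounts to a bounded subtraction. A minimal sketch, assuming a hypothetical helper `bytes_within_eof()` that is not part of the backup code:

```c
#include <assert.h>
#include <stdint.h>

/* Bytes to copy for a chunk starting at `start`, never reaching past EOF. */
static uint64_t bytes_within_eof(uint64_t start, uint64_t chunk, uint64_t len)
{
    if (start >= len) {
        return 0;  /* chunk lies entirely beyond the end of the image */
    }
    uint64_t remaining = len - start;
    return chunk < remaining ? chunk : remaining;
}
```

For an image of 100000 bytes and 64 KiB clusters, the second cluster starts at 65536 and only 34464 bytes of it exist, so that is all that may be copied.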
2019-10-30  block/backup: fix max_transfer handling for copy_range  Vladimir Sementsov-Ogievskiy
Of course, QEMU_ALIGN_UP is a typo, it should be QEMU_ALIGN_DOWN, as we are trying to find an aligned size which satisfies both source and target. Also, don't ignore a too small max_transfer. In this case it seems safer to disable copy_range. Fixes: 9ded4a0114968e Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20190920142056.12778-2-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit 981fb5810aa3f68797ee6e261db338bd78857614) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
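Why rounding up is wrong here can be seen with a few numbers. The macros below are local re-statements of the usual align-down and align-up idiom, and `copy_range_chunk()` is a hypothetical helper, so treat this as a sketch rather than the patched code:

```c
#include <assert.h>
#include <stdint.h>

/* Local helpers for this sketch: round n down or up to a multiple of m. */
#define ALIGN_DOWN(n, m) ((n) / (m) * (m))
#define ALIGN_UP(n, m)   ALIGN_DOWN((n) + (m) - 1, (m))

/*
 * Largest cluster-aligned chunk that respects both transfer limits.
 * A result of 0 means the limits are smaller than one cluster, in
 * which case the caller should not use copy_range at all.
 */
static uint64_t copy_range_chunk(uint64_t src_max, uint64_t dst_max,
                                 uint64_t cluster)
{
    uint64_t limit = src_max < dst_max ? src_max : dst_max;
    return ALIGN_DOWN(limit, cluster);
}
```

With a limit of 100000 bytes and 64 KiB clusters, rounding up gives 131072, which exceeds the limit, while rounding down gives 65536, which both sides can handle.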
2019-10-30  qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation()  Kevin Wolf
qcow2_detect_metadata_preallocation() calls qcow2_get_refcount() which requires s->lock to be taken to protect its accesses to the refcount table and refcount blocks. However, nothing in this code path actually took the lock. This could cause the same cache entry to be used by two requests at the same time, for different tables at different offsets, resulting in image corruption. As it would be preferable to base the detection on consistent data (even though it's just heuristics), let's take the lock not only around the qcow2_get_refcount() calls, but around the whole function. This patch takes the lock in qcow2_co_block_status() earlier and asserts in qcow2_detect_metadata_preallocation() that we hold the lock. Fixes: 69f47505ee66afaa513305de0c1895a224e52c45 Cc: qemu-stable@nongnu.org Reported-by: Michael Weiser <michael.weiser@gmx.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Michael Weiser <michael.weiser@gmx.de> Reviewed-by: Michael Weiser <michael.weiser@gmx.de> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit 5e9785505210e2477e590e61b1ab100d0ec22b01) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-10-30  coroutine: Add qemu_co_mutex_assert_locked()  Kevin Wolf
Some functions require that the caller holds a certain CoMutex for them to operate correctly. Add a function so that they can assert the lock is really held. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Michael Weiser <michael.weiser@gmx.de> Reviewed-by: Michael Weiser <michael.weiser@gmx.de> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit 944f3d5dd216fcd8cb007eddd4f82dced0a15b3d) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
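The idea of asserting lock ownership can be sketched with a toy mutex that records its holder. `OwnedMutex` and the functions around it are hypothetical and far simpler than a CoMutex: there is no waiting, no queueing and no atomics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy mutex that remembers who holds it; not a real synchronization type. */
typedef struct OwnedMutex {
    bool locked;
    const void *holder;
} OwnedMutex;

static void owned_mutex_lock(OwnedMutex *m, const void *self)
{
    assert(!m->locked);  /* sketch only: a real mutex would wait here */
    m->locked = true;
    m->holder = self;
}

static void owned_mutex_unlock(OwnedMutex *m, const void *self)
{
    assert(m->locked && m->holder == self);
    m->holder = NULL;
    m->locked = false;
}

static bool owned_mutex_held_by(const OwnedMutex *m, const void *self)
{
    return m->locked && m->holder == self;
}

/* A function that documents and enforces its locking requirement. */
static int read_refcount(const OwnedMutex *m, const void *self,
                         const int *table, int idx)
{
    assert(owned_mutex_held_by(m, self));  /* caller must hold the lock */
    return table[idx];
}
```

A caller that forgets to take the lock trips the assertion in `read_refcount()` immediately, rather than silently racing on shared state.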
2019-10-30  block/qcow2: Fix corruption introduced by commit 8ac0f15f335  Maxim Levitsky
This fixes subtle corruption introduced by luks threaded encryption in commit 8ac0f15f335. Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1745922
The corruption happens when we do a write that
* writes to two or more unallocated clusters at once
* doesn't fully cover the first sector
* doesn't fully cover the last sector
* uses luks encryption
In this case, when allocating the new clusters we COW both areas prior to the write and after the write, and we encrypt them. The above mentioned commit accidentally made it so we encrypt the second COW area using the physical cluster offset of the first area. The problem is that offset_in_cluster in do_perform_cow_encrypt can be larger than the cluster size, thus cluster_offset will no longer point to the start of the cluster at which the encrypted area starts. Next patch in this series will refactor the code to avoid all these assumptions. In the bug report, this was triggered by rebasing a luks image onto a new, zero-filled base, which performs a lot of such writes and causes some files with zero areas to contain garbage there instead. But as described above, it can happen elsewhere as well.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20190915203655.21638-2-mlevitsk@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit 38e7d54bdc518b5a05a922467304bcace2396945) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
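The offset mix-up can be illustrated with plain arithmetic. The sketch assumes a contiguous multi-cluster allocation and a power-of-two cluster size. `containing_cluster_start()` is a hypothetical helper, not a function from the qcow2 driver:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Start of the cluster that actually contains a byte, given the physical
 * offset of the first allocated cluster and a byte offset relative to it.
 * The relative offset may exceed one cluster when several clusters were
 * allocated at once. cluster_size must be a power of two.
 */
static uint64_t containing_cluster_start(uint64_t first_cluster_offset,
                                         uint64_t offset_in_allocation,
                                         uint64_t cluster_size)
{
    uint64_t abs = first_cluster_offset + offset_in_allocation;
    return abs & ~(cluster_size - 1);
}
```

With 64 KiB clusters and a first cluster at 0x50000, a COW area at relative offset 0x1F000 lives in the cluster starting at 0x60000. Using 0x50000 for it would tie the encryption to the wrong physical cluster.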
2019-10-30  blockjob: update nodes head while removing all bdrv  Sergio Lopez
block_job_remove_all_bdrv() iterates through job->nodes, calling bdrv_root_unref_child() for each entry. The call to the latter may reach child_job_[can_]set_aio_ctx(), which will also attempt to traverse job->nodes, potentially finding entries that were freed on previous iterations. To avoid this situation, update job->nodes head on each iteration to ensure that already freed entries are no longer linked to the list. RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1746631 Signed-off-by: Sergio Lopez <slp@redhat.com> Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190911100316.32282-1-mreitz@redhat.com Reviewed-by: Sergio Lopez <slp@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit d876bf676f5e7c6aa9ac64555e48cba8734ecb2f) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
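The pattern of advancing the list head before releasing each entry can be sketched with a plain singly linked list. `Entry`, `Owner`, and `on_unref()` are hypothetical stand-ins: `on_unref()` plays the role of a callback that walks the list again while removal is in progress.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Entry {
    int id;
    struct Entry *next;
} Entry;

typedef struct Owner {
    Entry *nodes;                /* head of the list */
    int visited_during_removal;  /* entries seen by re-entrant walks */
} Owner;

/* Prepend a new entry; aborts on allocation failure to keep this short. */
static void push(Owner *o, int id)
{
    Entry *e = malloc(sizeof(*e));
    if (!e) {
        abort();
    }
    e->id = id;
    e->next = o->nodes;
    o->nodes = e;
}

/* Stand-in for a callback that traverses the list during removal. */
static void on_unref(Owner *o)
{
    for (Entry *e = o->nodes; e; e = e->next) {
        o->visited_during_removal++;
    }
}

/* Remove everything; the head is moved on before an entry is released. */
static void remove_all(Owner *o)
{
    while (o->nodes) {
        Entry *e = o->nodes;
        o->nodes = e->next;  /* unlink first: e is no longer reachable */
        on_unref(o);         /* a re-entrant walk sees only live entries */
        free(e);
    }
}
```

Because each entry is unlinked before the callback runs, the walk inside `on_unref()` can never reach an entry that is already freed or about to be freed.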
2019-10-30  curl: Handle success in multi_check_completion  Max Reitz
Background: As of cURL 7.59.0, it verifies that several functions are not called from within a callback. Among these functions is curl_multi_add_handle(). curl_read_cb() is a callback from cURL and not a coroutine. Waking up acb->co will lead to entering it then and there, which means the current request will settle and the caller (if it runs in the same coroutine) may then issue the next request. In such a case, we will enter curl_setup_preadv() effectively from within curl_read_cb(). Calling curl_multi_add_handle() will then fail and the new request will not be processed. Fix this by not letting curl_read_cb() wake up acb->co. Instead, leave the whole business of settling the AIOCB objects to curl_multi_check_completion() (which is called from our timer callback and our FD handler, so not from any cURL callbacks). Reported-by: Natalie Gavrielov <ngavrilo@redhat.com> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1740193 Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190910124136.10565-7-mreitz@redhat.com Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit bfb23b480a49114315877aacf700b49453e0f9d9) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-10-30  curl: Report only ready sockets  Max Reitz
Instead of reporting all sockets to cURL, only report the one that has caused curl_multi_do_locked() to be called. This lets us get rid of the QLIST_FOREACH_SAFE() list, which was actually wrong: SAFE foreaches are only safe when the current element is removed in each iteration. If it is possible for the list to be concurrently modified, we cannot guarantee that only the current element will be removed. Therefore, we must not use QLIST_FOREACH_SAFE() here. Fixes: ff5ca1664af85b24a4180d595ea6873fd3deac57 Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190910124136.10565-6-mreitz@redhat.com Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit 9abaf9fc474c3dd53e8e119326abc774c977c331) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-10-30  curl: Pass CURLSocket to curl_multi_do()  Max Reitz
curl_multi_do_locked() currently marks all sockets as ready. That is not only inefficient, but in fact unsafe (the loop is). A follow-up patch will change that, but to do so, curl_multi_do_locked() needs to know exactly which socket is ready; and that is accomplished by this patch here. Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190910124136.10565-5-mreitz@redhat.com Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit 9dbad87d25587ff640ef878f7b6159fc368ff541) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-10-30  curl: Check completion in curl_multi_do()  Max Reitz
While it is more likely that transfers complete after some file descriptor has data ready to read, we probably should not rely on it. Better be safe than sorry and call curl_multi_check_completion() in curl_multi_do(), too, just like it is done in curl_multi_read(). With this change, curl_multi_do() and curl_multi_read() are actually the same, so drop curl_multi_read() and use curl_multi_do() as the sole FD handler. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190910124136.10565-4-mreitz@redhat.com Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit 948403bcb1c7e71dcbe8ab8479cf3934a0efcbb5) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>