path: root/block
Age  Commit message  Author
2023-07-04  block/blkio: fix module_block.py parsing  (Stefan Hajnoczi)
When QEMU is built with --enable-modules, the module_block.py script parses block/*.c to find block drivers that are built as modules. The script generates a table of block drivers called block_driver_modules[]. This table is used for block driver module loading. The blkio.c driver uses macros to define its BlockDriver structs. This was done to avoid code duplication but the module_block.py script is unable to parse the macro. The result is that libblkio-based block drivers can be built as modules but will not be found at runtime. One fix is to make the module_block.py script or build system fancier so it can parse C macros (e.g. by parsing the preprocessed source code). I chose not to do this because it raises the complexity of the build, making future issues harder to debug. Keep things simple: use the macro to avoid duplicating BlockDriver function pointers but define .format_name and .protocol_name manually for each BlockDriver. This way the module_block.py is able to parse the code. Also get rid of the block driver name macros (e.g. DRIVER_IO_URING) because module_block.py cannot parse them either. Fixes: fd66dbd424f5 ("blkio: add libblkio block driver") Reported-by: Qing Wang <qinwang@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20230704123436.187761-1-stefanha@redhat.com Cc: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
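A minimal sketch of the resulting pattern, assuming illustrative names (BLKIO_DRIVER_COMMON, BDRVBlkioState and the blkio_co_* callbacks stand in for whatever block/blkio.c really uses): the shared function pointers stay in a macro, while .format_name and .protocol_name are spelled out as literals so module_block.py's text-based parser can find them.

    #define BLKIO_DRIVER_COMMON \
        .instance_size   = sizeof(BDRVBlkioState), \
        .bdrv_co_preadv  = blkio_co_preadv, \
        .bdrv_co_pwritev = blkio_co_pwritev,

    static BlockDriver bdrv_io_uring = {
        .format_name   = "io_uring",   /* literal strings, visible to the script */
        .protocol_name = "io_uring",
        BLKIO_DRIVER_COMMON
    };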
2023-06-28  Merge tag 'for-upstream' of https://repo.or.cz/qemu/kevin into staging  (Richard Henderson)
Block layer patches - Re-enable the graph lock - More fixes to coroutine_fn marking # -----BEGIN PGP SIGNATURE----- # # iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmScQCQRHGt3b2xmQHJl # ZGhhdC5jb20ACgkQfwmycsiPL9bNSA//WIzPT45rFhl2U9QgyOJu26ho6ahsgwgI # Z3QM5kCDB1dAN9USRPxhGboLGo8CyY7eeSwSrR7RtwBGYrWrAoJfGp5gK/7d9s5Q # o0AGgRPnJGhFkBhRRMytsDsewM6Kk4IRmk4HMK3cOH3rsSM8RHs6KmDSBKesllu0 # QVGf3qW4u8LHyZyGM5OlPVUbtuDuK6/52FGhpXBp+x4oyNegOhjwO4mGOvTG+xIk # Q5zwWZaPfjxaEDkvW8iahB6/D7Tpt64BmMf1Ydhxcd5eKEp932CiBI36aAlNKoRD # Al5wztRx1GEh12ekN39jIi7Ypp3JX26keJcieKU0q656pT551UFRYjU0Rk08/Cca # qv2oiQDu6bHgQ9zCQ1nMfa9+K2MyBwx0b5qfYkvs2RzgCTl8ImgBQANHfw8tz6Bq # HUo1zsFBXCaK0boUB5iFwdf3rlx3t9UTEuDej/RaHqZjZD5xeG/smCcOlSfHaKUa # wXfYxvm8ZfefJn1D6io1A+7M956uvIQNtmh13cU44clgFX9Y/bBNMg/5lMRsJKo8 # xxjvqCAyxo/pPfUsVWx4pc8AXbfVa85gyoSiaLEYZnqP54sJ2lFccqykCsTy58Lo # VDcoPnoSc+LNqBOvtzxXgQbEWFCXU6fe0+TZgVYUvExWFIAOImeDWg2GD1JVrwsX # e9QrPhL3DXg= # =ZQcP # -----END PGP SIGNATURE----- # gpg: Signature made Wed 28 Jun 2023 04:13:56 PM CEST # gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6 # gpg: issuer "kwolf@redhat.com" # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full] * tag 'for-upstream' of https://repo.or.cz/qemu/kevin: (23 commits) block: use bdrv_co_debug_event in coroutine context block: use bdrv_co_getlength in coroutine context qcow2: mark more functions as coroutine_fns and GRAPH_RDLOCK vhdx: mark more functions as coroutine_fns and GRAPH_RDLOCK vmdk: mark more functions as coroutine_fns and GRAPH_RDLOCK dmg: mark more functions as coroutine_fns and GRAPH_RDLOCK cloop: mark more functions as coroutine_fns and GRAPH_RDLOCK block: mark another function as coroutine_fns and GRAPH_UNLOCKED bochs: mark more functions as coroutine_fns and GRAPH_RDLOCK vpc: mark more functions as coroutine_fns and GRAPH_RDLOCK qed: mark more functions as coroutine_fns and GRAPH_RDLOCK file-posix: remove incorrect coroutine_fn calls Revert "graph-lock: Disable locking for now" graph-lock: Unlock the AioContext while polling blockjob: Fix AioContext locking in block_job_add_bdrv() block: Fix AioContext locking in bdrv_open_backing_file() block: Fix AioContext locking in bdrv_open_inherit() block: Fix AioContext locking in bdrv_reopen_parse_file_or_backing() block: Fix AioContext locking in bdrv_attach_child_common() block: Fix AioContext locking in bdrv_open_child() ... Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-28  block: use bdrv_co_debug_event in coroutine context  (Paolo Bonzini)
bdrv_co_debug_event was recently introduced, with bdrv_debug_event becoming a wrapper for use in unknown context. Because most of the time bdrv_debug_event is used on a BdrvChild via the wrapper macro BLKDBG_EVENT, introduce a similar macro BLKDBG_CO_EVENT that calls bdrv_co_debug_event, and switch whenever possible. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20230601115145.196465-13-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
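A rough sketch of what such a wrapper can look like, modeled on the existing BLKDBG_EVENT macro (this is an illustration, not necessarily the exact definition that was merged):

    #define BLKDBG_CO_EVENT(child, evt) \
        do { \
            if (child) { \
                bdrv_co_debug_event(child->bs, evt); \
            } \
        } while (0)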
2023-06-28  block: use bdrv_co_getlength in coroutine context  (Paolo Bonzini)
bdrv_co_getlength was recently introduced, with bdrv_getlength becoming a wrapper for use in unknown context. Switch to bdrv_co_getlength when possible. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20230601115145.196465-12-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-06-28  qcow2: mark more functions as coroutine_fns and GRAPH_RDLOCK  (Paolo Bonzini)
Mark functions as coroutine_fn when they are only called by other coroutine_fns and they can suspend. Change calls to co_wrappers to use the non-wrapped functions, which in turn requires adding GRAPH_RDLOCK annotations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20230601115145.196465-11-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
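The pattern, shown on a hypothetical helper (the function name and body are invented for illustration): a function that is only reachable from coroutine context gets the coroutine_fn marker, calls the bdrv_co_* variant directly instead of the generated co_wrapper, and therefore also needs the GRAPH_RDLOCK annotation.

    /* Hypothetical example of the annotation pattern, not actual qcow2 code. */
    static int coroutine_fn GRAPH_RDLOCK
    qcow2_example_helper(BlockDriverState *bs)
    {
        int64_t len = bdrv_co_getlength(bs->file->bs); /* was bdrv_getlength() */

        return len < 0 ? len : 0;
    }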
2023-06-28  vhdx: mark more functions as coroutine_fns and GRAPH_RDLOCK  (Paolo Bonzini)
Mark functions as coroutine_fn when they are only called by other coroutine_fns and they can suspend. Change calls to co_wrappers to use the non-wrapped functions, which in turn requires adding GRAPH_RDLOCK annotations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20230601115145.196465-10-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-06-28  vmdk: mark more functions as coroutine_fns and GRAPH_RDLOCK  (Paolo Bonzini)
Mark functions as coroutine_fn when they are only called by other coroutine_fns and they can suspend. Change calls to co_wrappers to use the non-wrapped functions, which in turn requires adding GRAPH_RDLOCK annotations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20230601115145.196465-9-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-06-28  dmg: mark more functions as coroutine_fns and GRAPH_RDLOCK  (Paolo Bonzini)
Mark functions as coroutine_fn when they are only called by other coroutine_fns and they can suspend. Change calls to co_wrappers to use the non-wrapped functions, which in turn requires adding GRAPH_RDLOCK annotations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20230601115145.196465-8-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-06-28  cloop: mark more functions as coroutine_fns and GRAPH_RDLOCK  (Paolo Bonzini)
Mark functions as coroutine_fn when they are only called by other coroutine_fns and they can suspend. Change calls to co_wrappers to use the non-wrapped functions, which in turn requires adding GRAPH_RDLOCK annotations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20230601115145.196465-7-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-06-28  bochs: mark more functions as coroutine_fns and GRAPH_RDLOCK  (Paolo Bonzini)
Mark functions as coroutine_fn when they are only called by other coroutine_fns and they can suspend. Change calls to co_wrappers to use the non-wrapped functions, which in turn requires adding GRAPH_RDLOCK annotations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20230601115145.196465-5-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-06-28  vpc: mark more functions as coroutine_fns and GRAPH_RDLOCK  (Paolo Bonzini)
Mark functions as coroutine_fn when they are only called by other coroutine_fns and they can suspend. Change calls to co_wrappers to use the non-wrapped functions, which in turn requires adding GRAPH_RDLOCK annotations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20230601115145.196465-4-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-06-28  qed: mark more functions as coroutine_fns and GRAPH_RDLOCK  (Paolo Bonzini)
Mark functions as coroutine_fn when they are only called by other coroutine_fns and they can suspend. Change calls to co_wrappers to use the non-wrapped functions, which in turn requires adding GRAPH_RDLOCK annotations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20230601115145.196465-3-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-06-28  file-posix: remove incorrect coroutine_fn calls  (Paolo Bonzini)
raw_co_getlength is called by handle_aiocb_write_zeroes, which is not a coroutine function. This is harmless because raw_co_getlength does not actually suspend, but in the interest of clarity make it a non-coroutine_fn that is just wrapped by the coroutine_fn raw_co_getlength. Likewise, check_cache_dropped was only a coroutine_fn because it called raw_co_getlength, so it can be made non-coroutine as well. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20230601115145.196465-2-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
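Schematically the restructuring looks something like this (simplified sketch, not the exact file-posix code):

    /* The real work happens in a plain function with no suspension points... */
    static int64_t raw_getlength(BlockDriverState *bs)
    {
        /* lseek()/ioctl()-based size probing elided in this sketch */
        return 0;
    }

    /* ...and the coroutine_fn becomes a thin wrapper around it. */
    static int64_t coroutine_fn raw_co_getlength(BlockDriverState *bs)
    {
        return raw_getlength(bs);
    }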
2023-06-28  Revert "graph-lock: Disable locking for now"  (Kevin Wolf)
Now that bdrv_graph_wrlock() temporarily drops the AioContext lock that its caller holds, it can poll without causing deadlocks. We can now re-enable graph locking. This reverts commit ad128dff0bf4b6f971d05eb4335a627883a19c1d. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20230605085711.21261-12-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-06-28  graph-lock: Unlock the AioContext while polling  (Kevin Wolf)
If the caller keeps the AioContext lock for a block node in an iothread, polling in bdrv_graph_wrlock() deadlocks if the condition isn't fulfilled immediately. Now that all callers make sure to actually have the AioContext locked when they call bdrv_replace_child_noperm() like they should, we can change bdrv_graph_wrlock() to take a BlockDriverState whose AioContext lock the caller holds (NULL if it doesn't) and unlock it temporarily while polling. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20230605085711.21261-11-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
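Conceptually the new behaviour can be pictured like this (simplified sketch of the idea, not the actual graph-lock implementation):

    void bdrv_graph_wrlock(BlockDriverState *bs)
    {
        AioContext *ctx = bs ? bdrv_get_aio_context(bs) : NULL;

        if (ctx) {
            aio_context_release(ctx);   /* drop the caller's lock before polling */
        }
        /* ... poll until all readers have left their read-locked sections ... */
        if (ctx) {
            aio_context_acquire(ctx);   /* re-take it before returning */
        }
    }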
2023-06-26  vhost-user: fully use new backend/frontend naming  (Manos Pitsidianakis)
Slave/master nomenclature was replaced with backend/frontend in commit 1fc19b65279a ("vhost-user: Adopt new backend naming") This patch replaces all remaining uses of master and slave in the codebase. Signed-off-by: Emmanouil Pitsidianakis <manos.pitsidianakis@linaro.org> Message-Id: <20230613080849.2115347-1-manos.pitsidianakis@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2023-06-20  meson: Replace softmmu_ss -> system_ss  (Philippe Mathieu-Daudé)
We use the user_ss[] array to hold the user emulation sources, and the softmmu_ss[] array to hold the system emulation ones. Hold the latter in the 'system_ss[]' array for parity with user emulation. Mechanical change doing: $ sed -i -e s/softmmu_ss/system_ss/g $(git grep -l softmmu_ss) Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230613133347.82210-10-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-06  Merge tag 'pull-request-2023-06-06' of https://gitlab.com/thuth/qemu into ↵  (Richard Henderson)
staging * Fix emulated LCCB, LOCFHR, MXDB and MXDBR s390x instructions * Fix the malta machine on s390x (big endian) hosts * Emulate /proc/cpuinfo on s390x * Remove pointless QOM casts * Improve the inclusion logic for libkeyutils and ipmi-bt-test in meson.build # -----BEGIN PGP SIGNATURE----- # # iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmR+ycgRHHRodXRoQHJl # ZGhhdC5jb20ACgkQLtnXdP5wLbWXXw//WPz3ng50KLS+M1t3/ULEjO6XkGfP2LQZ # RsZq3hf9THFPZgcREk+6SQvttOSTuvHNfakfujS6U1Ou5thReWqLe4itFW6+hB5j # kQ+Sm6YJ+fpezkBnSefcUoL5nA9VVKZ6KE6kxq5CUBZNoIk1sSsfrU8y8wjzW0yg # 2nraOcG10aLpO2BfvKHVEAhJtwl9pHJsFANmHC2/h2wC9BZIAzdxiytzdcJ909gN # AAa0hIrLK/oFgJjkSSxu+QTaVGPARXqkx5WV546F/zmDMFUWd9nrXaegwqxjgPBN # m9Ua0SXll5hX2Z57vjJWlbTYkD+JUB22L0N7p5/xzhYRpLVSq1pdveo9psrzIC3E # Bt7chZB58acQepJHxxa3UHDOHcnfdfaN+Dd9wD29wHr7nK8lOcsen7/7V+5YXomc # qenkCtkpjKTl07OBxe6MDGZtPZYA8fK1CjEyYwHCe8QvxEzsyg96Bm3j4N2VPxQU # +f/sFPX7SgogZI4mB4wdoxOF1RmQ+DXQ2tnB970txZRkmFq2jJHpW86jkkbq2Jl1 # KIjgdIXjVgy+MPtuQzO5cT+jfhGQL7FQynGXHjv/UidBid5XD3TDVNa9AthN3Mng # +rPT90VJ7j9soMqvmNT1COSIRD+M49dQKBIQuq/gWplaTOHaAcJrCwYScwqe0u0P # zmjCNeuPVw8= # =dfJr # -----END PGP SIGNATURE----- # gpg: Signature made Mon 05 Jun 2023 10:53:12 PM PDT # gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5 # gpg: issuer "thuth@redhat.com" # gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [unknown] # gpg: aka "Thomas Huth <thuth@redhat.com>" [unknown] # gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown] # gpg: aka "Thomas Huth <huth@tuxfamily.org>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5 * tag 'pull-request-2023-06-06' of https://gitlab.com/thuth/qemu: linux-user: Emulate /proc/cpuinfo on s390x linux-user/elfload: Introduce elf_hwcap_str() on s390x linux-user/elfload: Expose get_elf_hwcap() on s390x s390x/tcg: Fix CPU address returned by STIDP bulk: Remove pointless QOM casts scripts: Add qom-cast-macro-clean-cocci-gen.py hw/mips/malta: Fix the malta machine on big endian hosts gitlab-ci: Remove unused Python package tests/qtest: Run ipmi-bt-test only if CONFIG_IPMI_EXTERN is set tests/tcg/s390x: Test MXDB and MXDBR target/s390x: Fix MXDB and MXDBR Add conditional dependency for libkeyutils tests/tcg/s390x: Test single-stepping SVC linux-user/s390x: Fix single-stepping SVC tests/tcg/s390x: Test LOCFHR target/s390x: Fix LOCFHR taking the wrong half of R2 tests/tcg/s390x: Test LCBB target/s390x: Fix LCBB overwriting the top 32 bits Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05  bulk: Remove pointless QOM casts  (Philippe Mathieu-Daudé)
Mechanical change running Coccinelle spatch with content generated from the qom-cast-macro-clean-cocci-gen.py added in the previous commit. Suggested-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230601093452.38972-3-philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-06-05  qcow2: add discard-no-unref option  (Jean-Louis Dupond)
When we have, for example, a sparse qcow2 image and discard: unmap is enabled, there can be a lot of fragmentation in the image after some time, especially on VMs that do a lot of writes/deletes. This causes the qcow2 image to grow even over 110% of its virtual size, because the free gaps in the image get too small to allocate new contiguous clusters, so it allocates new space at the end of the image. Disabling discard is not an option, as discard is needed to keep the incremental backup size as low as possible. Without discard, the incremental backups would become large, as qemu thinks the blocks are just dirty but doesn't know they are unneeded. So we need to avoid fragmentation but also 'empty' the unneeded blocks in the image to have a small incremental backup. In addition, we also want to send the discards further down the stack, so the underlying blocks are still discarded. Therefore we introduce a new qcow2 option "discard-no-unref". When setting this option to true, discards will no longer have the qcow2 driver relinquish cluster allocations. Other than that, the request is handled as normal: All clusters in range are marked as zero, and, if pass-discard-request is true, it is passed further down the stack. The only difference is that the now-zero clusters are preallocated instead of being unallocated. This avoids fragmentation in the qcow2 image. Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1621 Signed-off-by: Jean-Louis Dupond <jean-louis@dupond.be> Message-Id: <20230605084523.34134-2-jean-louis@dupond.be> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
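The effect on the discard path can be sketched as follows (schematic pseudo-C; the field and helper names are made up for illustration, only the option names come from this patch):

    if (s->discard_no_unref) {
        /* keep the host cluster allocated but make it read back as zeroes,
         * so the image does not fragment and later writes reuse the space */
        set_l2_entry_zero_keep_allocation(bs, l2_entry);   /* hypothetical helper */
    } else {
        /* default behaviour: drop the refcount and free the cluster */
        free_cluster(bs, l2_entry);                        /* hypothetical helper */
    }
    if (s->pass_discard_request) {
        /* in both cases the discard can still be passed down the stack */
        bdrv_co_pdiscard(s->data_file, offset, bytes);
    }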
2023-06-05  parallels: Incorrect condition in out-of-image check  (Alexander Ivanov)
All the offsets in the BAT must be lower than the file size. Fix the condition so that the check is correct. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Message-Id: <20230424093147.197643-13-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
2023-06-05  parallels: Replace qemu_co_mutex_lock by WITH_QEMU_LOCK_GUARD  (Alexander Ivanov)
Replace the way we use the mutex in parallels_co_check() to make the code simpler and less error-prone. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Message-Id: <20230424093147.197643-12-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
2023-06-05  parallels: Move statistic collection to a separate function  (Alexander Ivanov)
We will add more and more checks, so we need a better code structure in parallels_co_check. Let each check be performed in a separate loop in a separate helper. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20230424093147.197643-11-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
2023-06-05  parallels: Move check of leaks to a separate function  (Alexander Ivanov)
We will add more and more checks, so we need a better code structure in parallels_co_check. Let each check be performed in a separate loop in a separate helper. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Message-Id: <20230424093147.197643-10-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
2023-06-05  parallels: Fix statistics calculation  (Alexander Ivanov)
Exclude out-of-image clusters from the allocated and fragmented cluster calculations. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Message-Id: <20230424093147.197643-9-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
2023-06-05  parallels: Move check of cluster outside image to a separate function  (Alexander Ivanov)
We will add more and more checks, so we need a better code structure in parallels_co_check. Let each check be performed in a separate loop in a separate helper. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Message-Id: <20230424093147.197643-8-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
2023-06-05  parallels: Move check of unclean image to a separate function  (Alexander Ivanov)
We will add more and more checks, so we need a better code structure in parallels_co_check. Let each check be performed in a separate loop in a separate helper. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20230424093147.197643-7-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
2023-06-05  parallels: Use generic infrastructure for BAT writing in parallels_co_check()  (Alexander Ivanov)
The BAT is written in the context of conventional operations over the image inside bdrv_co_flush(), when it calls the parallels_co_flush_to_os() callback. Thus we should not modify the BAT array directly, but call the parallels_set_bat_entry() helper and bdrv_co_flush() further on. After that there is no need to manually write the BAT and track its modification. This makes the code more generic and allows splitting parallels_set_bat_entry() into independent pieces. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Message-Id: <20230424093147.197643-6-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
2023-06-05  parallels: create parallels_set_bat_entry_helper() to assign BAT value  (Alexander Ivanov)
This helper will be reused in the next patches during the parallels_co_check rework to simplify its code. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20230424093147.197643-5-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
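The shape of such a helper, sketched from the description above (the details are illustrative, not the exact parallels code):

    static void parallels_set_bat_entry(BDRVParallelsState *s,
                                        uint32_t index, uint32_t offset)
    {
        /* update the in-memory BAT entry (stored little-endian on disk) */
        s->bat_bitmap[index] = cpu_to_le32(offset);
        /* remember that this part of the BAT is dirty so that the next
         * flush (parallels_co_flush_to_os) writes it back to the file */
    }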
2023-06-05  parallels: Fix image_end_offset and data_end after out-of-image check  (Alexander Ivanov)
Set data_end to the end of the last cluster inside the image. This way we can be sure that corrupted offsets in the BAT can't affect the image size. If there are no allocated clusters, set image_end_offset to data_end. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Message-Id: <20230424093147.197643-4-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
2023-06-05  parallels: Fix high_off calculation in parallels_co_check()  (Alexander Ivanov)
Don't let high_off be more than the file size even if we don't fix the image. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20230424093147.197643-3-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
2023-06-05  parallels: Out of image offset in BAT leads to image inflation  (Alexander Ivanov)
The data_end field in BDRVParallelsState is set to the biggest offset present in the BAT. If this offset is outside of the image, any further write will create a cluster at this offset and/or the image will be truncated to this offset on close. This is definitely not correct. Raise an error in parallels_open() if data_end points outside the image and we are not running a check (let the check repair the image). Set data_end to the end of the cluster with the last correct offset. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Message-Id: <20230424093147.197643-2-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
2023-06-05  block: Collapse padded I/O vecs exceeding IOV_MAX  (Hanna Czenczek)
When processing vectored guest requests that are not aligned to the storage request alignment, we pad them by adding head and/or tail buffers for a read-modify-write cycle. The guest can submit I/O vectors up to IOV_MAX (1024) in length, but with this padding, the vector can exceed that limit. As of 4c002cef0e9abe7135d7916c51abce47f7fc1ee2 ("util/iov: make qemu_iovec_init_extended() honest"), we refuse to pad vectors beyond the limit, instead returning an error to the guest. To the guest, this appears as a random I/O error. We should not return an I/O error to the guest when it issued a perfectly valid request. Before 4c002cef0e9abe7135d7916c51abce47f7fc1ee2, we just made the vector longer than IOV_MAX, which generally seems to work (because the guest assumes a smaller alignment than we really have, file-posix's raw_co_prw() will generally see bdrv_qiov_is_aligned() return false, and so emulate the request, so that the IOV_MAX does not matter). However, that does not seem exactly great. I see two ways to fix this problem: 1. We split such long requests into two requests. 2. We join some elements of the vector into new buffers to make it shorter. I am wary of (1), because it seems like it may have unintended side effects. (2) on the other hand seems relatively simple to implement, with hopefully few side effects, so this patch does that. To do this, the use of qemu_iovec_init_extended() in bdrv_pad_request() is effectively replaced by the new function bdrv_create_padded_qiov(), which not only wraps the request IOV with padding head/tail, but also ensures that the resulting vector will not have more than IOV_MAX elements. Putting that functionality into qemu_iovec_init_extended() is infeasible because it requires allocating a bounce buffer; doing so would require many more parameters (buffer alignment, how to initialize the buffer, and out parameters like the buffer, its length, and the original elements), which is not reasonable. Conversely, it is not difficult to move qemu_iovec_init_extended()'s functionality into bdrv_create_padded_qiov() by using public qemu_iovec_* functions, so that is what this patch does. Because bdrv_pad_request() was the only "serious" user of qemu_iovec_init_extended(), the next patch will remove the latter function, so the functionality is not implemented twice. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2141964 Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230411173418.19549-3-hreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
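Very roughly, the collapsing step can be pictured like this (heavily simplified sketch with an invented helper; the real logic lives in bdrv_create_padded_qiov() and handles head/tail padding as well):

    /* Copy the data of the final elements into one bounce buffer and add it to
     * the destination vector as a single entry, so that the padded vector
     * stays within IOV_MAX. */
    static void collapse_tail(QEMUIOVector *dst, QEMUIOVector *src,
                              size_t bytes_to_collapse)
    {
        uint8_t *bounce = qemu_blockalign(NULL, bytes_to_collapse);

        qemu_iovec_to_buf(src, src->size - bytes_to_collapse,
                          bounce, bytes_to_collapse);
        /* ... the earlier, untouched elements are added to dst elsewhere ... */
        qemu_iovec_add(dst, bounce, bytes_to_collapse);
    }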
2023-06-02  cutils: Adjust signature of parse_uint[_full]  (Eric Blake)
It's already confusing that we have two very similar functions for wrapping the parse of a 64-bit unsigned value, differing mainly on whether they permit leading '-'. Adjust the signature of parse_uint() and parse_uint_full() to be like all of qemu_strto*(): put the result parameter last, use the same types (uint64_t and unsigned long long have the same width, but are not always the same type), and mark endptr const (this latter change only affects the rare caller of parse_uint). Adjust all callers in the tree. While at it, note that since cutils.c already includes: QEMU_BUILD_BUG_ON(sizeof(int64_t) != sizeof(long long)); we are guaranteed that the result of parse_uint* cannot exceed UINT64_MAX (or the build would have failed), so we can drop pre-existing dead comparisons in opts-visitor.c that were never false. Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230522190441.64278-8-eblake@redhat.com> [eblake: Drop dead code spotted by Markus] Signed-off-by: Eric Blake <eblake@redhat.com>
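The prototypes implied by that description would look roughly like this (a sketch; see util/cutils.c for the authoritative signatures):

    /* result parameter last, uint64_t result, const endptr */
    int parse_uint(const char *s, const char **endptr, int base, uint64_t *value);
    int parse_uint_full(const char *s, int base, uint64_t *value);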
2023-06-01  block/blkio: use qemu_open() to support fd passing for virtio-blk  (Stefano Garzarella)
Some virtio-blk drivers (e.g. virtio-blk-vhost-vdpa) support fd passing. Let's expose this to the user, so the management layer can pass the file descriptor of an already opened path. If the libblkio virtio-blk driver supports fd passing, let's always use qemu_open() to open the `path`, so we can handle fd passing from the management layer through the "/dev/fdset/N" special path. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20230530071941.8954-2-sgarzare@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
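In outline, the open path described above looks like this (illustrative fragment, not the exact blkio.c code; error handling is abbreviated):

    /* qemu_open() understands the /dev/fdset/N syntax, so fd passing from the
     * management layer works without any driver-specific code */
    int fd = qemu_open(path, O_RDWR, errp);
    if (fd < 0) {
        return -EINVAL;
    }
    /* hand the fd, rather than the path, to the libblkio driver */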
2023-06-01  block: remove bdrv_co_io_plug() API  (Stefan Hajnoczi)
No block driver implements .bdrv_co_io_plug() anymore. Get rid of the function pointers. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Message-id: 20230530180959.1108766-7-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-06-01  block/linux-aio: convert to blk_io_plug_call() API  (Stefan Hajnoczi)
Stop using the .bdrv_co_io_plug() API because it is not multi-queue block layer friendly. Use the new blk_io_plug_call() API to batch I/O submission instead. Note that a dev_max_batch check is dropped in laio_io_unplug() because the semantics of unplug_fn() are different from .bdrv_co_unplug(): 1. unplug_fn() is only called when the last blk_io_unplug() call occurs, not every time blk_io_unplug() is called. 2. unplug_fn() is per-thread, not per-BlockDriverState, so there is no way to get per-BlockDriverState fields like dev_max_batch. Therefore this condition cannot be moved to laio_unplug_fn(). It is not obvious that this condition affects performance in practice, so I am removing it instead of trying to come up with a more complex mechanism to preserve the condition. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20230530180959.1108766-6-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-06-01  block/io_uring: convert to blk_io_plug_call() API  (Stefan Hajnoczi)
Stop using the .bdrv_co_io_plug() API because it is not multi-queue block layer friendly. Use the new blk_io_plug_call() API to batch I/O submission instead. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Message-id: 20230530180959.1108766-5-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-06-01  block/blkio: convert to blk_io_plug_call() API  (Stefan Hajnoczi)
Stop using the .bdrv_co_io_plug() API because it is not multi-queue block layer friendly. Use the new blk_io_plug_call() API to batch I/O submission instead. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Message-id: 20230530180959.1108766-4-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-06-01  block/nvme: convert to blk_io_plug_call() API  (Stefan Hajnoczi)
Stop using the .bdrv_co_io_plug() API because it is not multi-queue block layer friendly. Use the new blk_io_plug_call() API to batch I/O submission instead. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Message-id: 20230530180959.1108766-3-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-06-01  block: add blk_io_plug_call() API  (Stefan Hajnoczi)
Introduce a new API for thread-local blk_io_plug() that does not traverse the block graph. The goal is to make blk_io_plug() multi-queue friendly. Instead of having block drivers track whether or not we're in a plugged section, provide an API that allows them to defer a function call until we're unplugged: blk_io_plug_call(fn, opaque). If blk_io_plug_call() is called multiple times with the same fn/opaque pair, then fn() is only called once at the end of the function - resulting in batching. This patch introduces the API and changes blk_io_plug()/blk_io_unplug(). blk_io_plug()/blk_io_unplug() no longer require a BlockBackend argument because the plug state is now thread-local. Later patches convert block drivers to blk_io_plug_call() and then we can finally remove .bdrv_co_io_plug() once all block drivers have been converted. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Message-id: 20230530180959.1108766-2-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
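A sketch of how a block driver is expected to use the new API (the driver-side types and helpers here are invented for illustration):

    /* Runs once when the last blk_io_unplug() of the section is reached, even
     * if blk_io_plug_call() was invoked many times with this fn/opaque pair. */
    static void ioq_submit_batch(void *opaque)
    {
        MyQueue *q = opaque;
        /* issue all queued requests to the kernel in one go */
        flush_queue(q);
    }

    static void my_submit_request(MyQueue *q, MyRequest *req)
    {
        enqueue(q, req);                        /* no I/O submitted yet */
        blk_io_plug_call(ioq_submit_batch, q);  /* deferred until unplug */
    }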
2023-05-30  aio: remove aio_disable_external() API  (Stefan Hajnoczi)
All callers now pass is_external=false to aio_set_fd_handler() and aio_set_event_notifier(). The aio_disable_external() API that temporarily disables fd handlers that were registered is_external=true is therefore dead code. Remove aio_disable_external(), aio_enable_external(), and the is_external arguments to aio_set_fd_handler() and aio_set_event_notifier(). The entire test-fdmon-epoll test is removed because its sole purpose was testing aio_disable_external(). Parts of this patch were generated using the following coccinelle (https://coccinelle.lip6.fr/) semantic patch: @@ expression ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque; @@ - aio_set_fd_handler(ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque) + aio_set_fd_handler(ctx, fd, io_read, io_write, io_poll, io_poll_ready, opaque) @@ expression ctx, notifier, is_external, io_read, io_poll, io_poll_ready; @@ - aio_set_event_notifier(ctx, notifier, is_external, io_read, io_poll, io_poll_ready) + aio_set_event_notifier(ctx, notifier, io_read, io_poll, io_poll_ready) Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230516190238.8401-21-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-30  block/fuse: do not set is_external=true on FUSE fd  (Stefan Hajnoczi)
This is part of ongoing work to remove the aio_disable_external() API. Use BlockDevOps .drained_begin/end/poll() instead of aio_set_fd_handler(is_external=true). As a side-effect the FUSE export now follows AioContext changes like the other export types. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230516190238.8401-16-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-30  block/export: don't require AioContext lock around blk_exp_ref/unref()  (Stefan Hajnoczi)
The FUSE export calls blk_exp_ref/unref() without the AioContext lock. Instead of fixing the FUSE export, adjust blk_exp_ref/unref() so they work without the AioContext lock. This way it's less error-prone. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230516190238.8401-15-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-30  block/export: rewrite vduse-blk drain code  (Stefan Hajnoczi)
vduse_blk_detach_ctx() waits for in-flight requests using AIO_WAIT_WHILE(). This is not allowed according to a comment in bdrv_set_aio_context_commit(): /* * Take the old AioContex when detaching it from bs. * At this point, new_context lock is already acquired, and we are now * also taking old_context. This is safe as long as bdrv_detach_aio_context * does not call AIO_POLL_WHILE(). */ Use this opportunity to rewrite the drain code in vduse-blk: - Use the BlockExport refcount so that vduse_blk_exp_delete() is only called when there are no more requests in flight. - Implement .drained_poll() so in-flight request coroutines are stopped by the time .bdrv_detach_aio_context() is called. - Remove AIO_WAIT_WHILE() from vduse_blk_detach_ctx() to solve the .bdrv_detach_aio_context() constraint violation. It's no longer needed due to the previous changes. - Always handle the VDUSE file descriptor, even in drained sections. The VDUSE file descriptor doesn't submit I/O, so it's safe to handle it in drained sections. This ensures that the VDUSE kernel code gets a fast response. - Suspend virtqueue fd handlers in .drained_begin() and resume them in .drained_end(). This eliminates the need for the aio_set_fd_handler(is_external=true) flag, which is being removed from QEMU. This is a long list but splitting it into individual commits would probably lead to git bisect failures - the changes are all related. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230516190238.8401-14-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-30  block: drain from main loop thread in bdrv_co_yield_to_drain()  (Stefan Hajnoczi)
For simplicity, always run BlockDevOps .drained_begin/end/poll() callbacks in the main loop thread. This makes it easier to implement the callbacks and avoids extra locks. Move the function pointer declarations from the I/O Code section to the Global State section for BlockDevOps, BdrvChildClass, and BlockDriver. Narrow IO_OR_GS_CODE() to GLOBAL_STATE_CODE() where appropriate. The test-bdrv-drain test case calls bdrv_drain() from an IOThread. This is now only allowed from coroutine context, so update the test case to run in a coroutine. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230516190238.8401-11-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-30  block: add blk_in_drain() API  (Stefan Hajnoczi)
The BlockBackend quiesce_counter is greater than zero during drained sections. Add an API to check whether the BlockBackend is in a drained section. The next patch will use this API. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230516190238.8401-10-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
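Based on that description, the helper is presumably little more than the following (sketch, not the verbatim implementation):

    bool blk_in_drain(BlockBackend *blk)
    {
        return qatomic_read(&blk->quiesce_counter) > 0;
    }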
2023-05-30  block/export: stop using is_external in vhost-user-blk server  (Stefan Hajnoczi)
vhost-user activity must be suspended during bdrv_drained_begin/end(). This prevents new requests from interfering with whatever is happening in the drained section. Previously this was done using aio_set_fd_handler()'s is_external argument. In a multi-queue block layer world the aio_disable_external() API cannot be used since multiple AioContext may be processing I/O, not just one. Switch to BlockDevOps->drained_begin/end() callbacks. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230516190238.8401-8-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-30  block/export: wait for vhost-user-blk requests when draining  (Stefan Hajnoczi)
Each vhost-user-blk request runs in a coroutine. When the BlockBackend enters a drained section we need to enter a quiescent state. Currently any in-flight requests race with bdrv_drained_begin() because it is unaware of vhost-user-blk requests. When blk_co_preadv/pwritev()/etc returns it wakes the bdrv_drained_begin() thread but vhost-user-blk request processing has not yet finished. The request coroutine continues executing while the main loop thread thinks it is in a drained section. One example where this is unsafe is for blk_set_aio_context() where bdrv_drained_begin() is called before .aio_context_detached() and .aio_context_attach(). If request coroutines are still running after bdrv_drained_begin(), then the AioContext could change underneath them and they race with new requests processed in the new AioContext. This could lead to virtqueue corruption, for example. (This example is theoretical, I came across this while reading the code and have not tried to reproduce it.) It's easy to make bdrv_drained_begin() wait for in-flight requests: add a .drained_poll() callback that checks the VuServer's in-flight counter. VuServer just needs an API that returns true when there are requests in flight. The in-flight counter needs to be atomic. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230516190238.8401-7-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
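The resulting hook can be pictured like this (sketch; the export type and the helper that queries the in-flight counter are assumptions based on the description above):

    /* Keep bdrv_drained_begin() polling until every request coroutine is done. */
    static bool vu_blk_drained_poll(void *opaque)
    {
        VuBlkExport *vexp = opaque;

        return vhost_user_server_has_in_flight(&vexp->vu_server);
    }

    static const BlockDevOps vu_blk_dev_ops = {
        .drained_poll = vu_blk_drained_poll,
        /* .drained_begin/.drained_end suspend and resume request processing */
    };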
2023-05-30  util/vhost-user-server: rename refcount to in_flight counter  (Stefan Hajnoczi)
The VuServer object has a refcount field and ref/unref APIs. The name is confusing because it's actually an in-flight request counter instead of a refcount. Normally a refcount destroys the object upon reaching zero. The VuServer counter is used to wake up the vhost-user coroutine when there are no more requests. Avoid confusing by renaming refcount and ref/unref to in_flight and inc/dec. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230516190238.8401-6-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>