aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-07-15linux-user: Rename mmap_reserve to mmap_reserve_or_unmapRichard Henderson
If !reserved_va, munmap instead and assert success. Update all callers. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230707204054.8792-22-richard.henderson@linaro.org>
2023-07-15linux-user: Rewrite mmap_reserveRichard Henderson
Use 'last' variables instead of 'end' variables; be careful about avoiding overflow. Assert that the mmap succeeded. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230707204054.8792-21-richard.henderson@linaro.org>
2023-07-15linux-user: Use 'last' instead of 'end' in target_mmapRichard Henderson
Complete the transition within the mmap functions to a formulation that does not overflow at the end of the address space. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230707204054.8792-20-richard.henderson@linaro.org>
2023-07-15linux-user: Use page_find_range_empty for mmap_find_vma_reservedRichard Henderson
Use the interval tree to find empty space, rather than probing each page in turn. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230707204054.8792-19-richard.henderson@linaro.org>
2023-07-15bsd-user: Use page_find_range_empty for mmap_find_vma_reservedRichard Henderson
Use the interval tree to find empty space, rather than probing each page in turn. Cc: Warner Losh <imp@bsdimp.com> Cc: Kyle Evans <kevans@freebsd.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-bt: Warner Losh <imp@bsdimp.com> Message-Id: <20230707204054.8792-18-richard.henderson@linaro.org>
2023-07-15accel/tcg: Introduce page_find_range_emptyRichard Henderson
Use the interval tree to locate an unused range in the VM. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230707204054.8792-17-richard.henderson@linaro.org>
2023-07-15linux-user: Rewrite mmap_fragRichard Henderson
Use 'last' variables instead of 'end' variables. Always zero MAP_ANONYMOUS fragments, which we previously failed to do if they were not writable; early exit in case we allocate a new page from the kernel, known zeros. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230707204054.8792-16-richard.henderson@linaro.org>
2023-07-15linux-user: Rewrite target_mprotectRichard Henderson
Use 'last' variables instead of 'end' variables. When host page size > guest page size, detect when adjacent host pages have the same protection and merge that expanded host range into fewer syscalls. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230707204054.8792-15-richard.henderson@linaro.org>
2023-07-15linux-user: Widen target_mmap offset argument to off_tRichard Henderson
We build with _FILE_OFFSET_BITS=64, so off_t = off64_t = uint64_t. With an extra cast, this fixes emulation of mmap2, which could overflow the computation of the full value of offset. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230707204054.8792-14-richard.henderson@linaro.org>
2023-07-15linux-user: Split out target_to_host_protRichard Henderson
Split out from validate_prot_to_pageflags, as there is not one single host_prot for the entire range. We need to adjust prot for every host page that overlaps multiple guest pages. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230707204054.8792-13-richard.henderson@linaro.org>
2023-07-15linux-user: Implement MAP_FIXED_NOREPLACERichard Henderson
Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230707204054.8792-12-richard.henderson@linaro.org>
2023-07-15bsd-user: Use page_check_range_empty for MAP_EXCLRichard Henderson
The previous check returned -1 when any page within [start, start+len) is unmapped, not when all are unmapped. Cc: Warner Losh <imp@bsdimp.com> Cc: Kyle Evans <kevans@freebsd.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Warner Losh <imp@bsdimp.com> Message-Id: <20230707204054.8792-11-richard.henderson@linaro.org>
2023-07-15accel/tcg: Introduce page_check_range_emptyRichard Henderson
Examine the interval tree to validate that a region has no existing mappings. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230707204054.8792-10-richard.henderson@linaro.org>
2023-07-15linux-user: Populate more bits in mmap_flags_tblRichard Henderson
Fix translation of TARGET_MAP_SHARED and TARGET_MAP_PRIVATE, which are types not single bits. Add TARGET_MAP_SHARED_VALIDATE, TARGET_MAP_SYNC, TARGET_MAP_NONBLOCK, TARGET_MAP_POPULATE, TARGET_MAP_FIXED_NOREPLACE, and TARGET_MAP_UNINITIALIZED. Update strace to match. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230707204054.8792-9-richard.henderson@linaro.org>
2023-07-15linux-user: Split TARGET_PROT_* out of syscall_defs.hRichard Henderson
Move the values into the per-target target_mman.h headers Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230707204054.8792-8-richard.henderson@linaro.org>
2023-07-15linux-user: Split TARGET_MAP_* out of syscall_defs.hRichard Henderson
Move the values into the per-target target_mman.h headers Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230707204054.8792-7-richard.henderson@linaro.org>
2023-07-15linux-user/strace: Expand struct flags to hold a maskRichard Henderson
A zero bit value does not make sense -- it must relate to some field in some way. Define FLAG_BASIC with a build-time sanity check. Adjust FLAG_GENERIC and FLAG_TARGET to use it. Add FLAG_GENERIC_MASK and FLAG_TARGET_MASK. Fix up the existing flag definitions for build errors. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230707204054.8792-6-richard.henderson@linaro.org>
2023-07-15linux-user: Fix formatting of mmap.cRichard Henderson
Fix all checkpatch.pl errors within mmap.c. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230707204054.8792-5-richard.henderson@linaro.org>
2023-07-15linux-user: Make sure initial brk(0) is page-alignedAndreas Schwab
Fixes: 86f04735ac ("linux-user: Fix brk() to release pages") Signed-off-by: Andreas Schwab <schwab@suse.de> Message-Id: <mvmpm55qnno.fsf@suse.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-15tcg: Fix info_in_idx increment in layout_arg_by_refRichard Henderson
Off by one error, failing to take into account that layout_arg_1 already incremented info_in_idx for the first piece. We only need care for the n-1 TCG_CALL_ARG_BY_REF_N pieces here. Cc: qemu-stable@nongnu.org Fixes: 313bdea84d2 ("tcg: Add TCG_CALL_{RET,ARG}_BY_REF") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1751 Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Peter Maydell <peter.maydell@linaro.org>
2023-07-15accel/tcg: Split out cpu_exec_longjmp_cleanupRichard Henderson
Share the setjmp cleanup between cpu_exec_step_atomic and cpu_exec_setjmp. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-15linux-user: Fix do_shmat type errorsRichard Henderson
The guest address, raddr, should be unsigned, aka abi_ulong. The host addresses should be cast via *intptr_t not long. Drop the inline and fix two other whitespace issues. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Anton Johansson <anjo@rev.ng> Message-Id: <20230626140250.69572-1-richard.henderson@linaro.org>
2023-07-15linux-user/syscall: Implement execve without execveatPierrick Bouvier
Support for execveat syscall was implemented in 55bbe4 and is available since QEMU 8.0.0. It relies on host execveat, which is widely available on most of Linux kernels today. However, this change breaks qemu-user self emulation, if "host" qemu version is less than 8.0.0. Indeed, it does not implement yet execveat. This strange use case happens with most of distribution today having binfmt support. With a concrete failing example: $ qemu-x86_64-7.2 qemu-x86_64-8.0 /bin/bash -c /bin/ls /bin/bash: line 1: /bin/ls: Function not implemented -> not implemented means execve returned ENOSYS qemu-user-static 7.2 and 8.0 can be conveniently grabbed from debian packages qemu-user-static* [1]. One usage of this is running wine-arm64 from linux-x64 (details [2]). This is by updating qemu embedded in docker image that we ran into this issue. The solution to update host qemu is not always possible. Either it's complicated or ask you to recompile it, or simply is not accessible (GitLab CI, GitHub Actions). Thus, it could be worth to implement execve without relying on execveat, which is the goal of this patch. This patch was tested with example presented in this commit message. [1] http://ftp.us.debian.org/debian/pool/main/q/qemu/ [1] https://www.linaro.org/blog/emulate-windows-on-arm/ Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Message-Id: <20230705121023.973284-1-pierrick.bouvier@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-15include/exec/user: Set ABI_LLONG_ALIGNMENT to 4 for nios2Richard Henderson
Based on gcc's nios2.h setting BIGGEST_ALIGNMENT to 32 bits. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-15include/exec/user: Set ABI_LLONG_ALIGNMENT to 4 for microblazeRichard Henderson
Based on gcc's microblaze.h setting BIGGEST_ALIGNMENT to 32 bits. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-15linux-user: Use abi_uint not unsigned in syscall_defs.hRichard Henderson
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-15linux-user: Use abi_short not short in syscall_defs.hRichard Henderson
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-15linux-user: Use abi_ushort not unsigned short in syscall_defs.hRichard Henderson
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-15linux-user: Use abi_int not int in syscall_defs.hRichard Henderson
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-15linux-user: Use abi_llong not long long in syscall_defs.hRichard Henderson
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-15linux-user: Use abi_ullong not unsigned long long in syscall_defs.hRichard Henderson
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-15linux-user: Use abi_uint not unsigned int in syscall_defs.hRichard Henderson
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-15linux-user: Use abi_llong not int64_t in syscall_defs.hRichard Henderson
Be careful not to change linux_dirent64, which is a host structure. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-15linux-user: Use abi_ullong not uint64_t in syscall_defs.hRichard Henderson
Be careful not to change linux_dirent64, which is a host structure. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-15linux-user: Use abi_int not int32_t in syscall_defs.hRichard Henderson
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-15linux-user: Use abi_uint not uint32_t in syscall_defs.hRichard Henderson
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-15linux-user: Remove #if 0 block in syscall_defs.hRichard Henderson
These definitions are in sparc/signal.c. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-15linux-user: Reformat syscall_defs.hRichard Henderson
Untabify and re-indent. We had a mix of 2, 3, 4, and 8 space indentation. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-14Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into stagingRichard Henderson
* SCSI unit attention fix * add PCIe devices to s390x emulator * IDE unplug fix for Xen # -----BEGIN PGP SIGNATURE----- # # iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmSxEWkUHHBib256aW5p # QHJlZGhhdC5jb20ACgkQv/vSX3jHroPDrAf/SyGEcBr1U2v0HBwfqGcHOVPwx5Dc # jk9628klLgRF9EqEoffFfJTf9LU5Su4WsjtGLvH+GBCV0thfaPrvQJxD4KWvxgUl # SKX5zepw9GY+uiTmbyuStLo5a8ksL6z5Zvw92gKh2PEKwuicerJL7OnK8drTMXXS # haL/UL3v3Qa3OwkxBIIq9uXdZjUiSib6PQD9/u7OoY67F6/ThmtUozgcMpqR/39Q # 0AdNibteN2XlUrysS9hreC0pAmqB6luAdo7wcUR53NV7Yp0yOa1jySJRxiNvHGrB # gK7jpHL/UBjTTkBodfZD21q5Ih4Vpya2FWpg4ZZlrIEJQc2AyxCl3zw3Bg== # =Ai1b # -----END PGP SIGNATURE----- # gpg: Signature made Fri 14 Jul 2023 10:12:09 AM BST # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [undefined] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [undefined] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * tag 'for-upstream' of https://gitlab.com/bonzini/qemu: scsi: clear unit attention only for REPORT LUNS commands scsi: cleanup scsi_clear_unit_attention() scsi: fetch unit attention when creating the request kconfig: Add PCIe devices to s390x machines hw/ide/piix: properly initialize the BMIBA register Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-14scsi: clear unit attention only for REPORT LUNS commandsStefano Garzarella
scsi_clear_unit_attention() now only handles REPORTED LUNS DATA HAS CHANGED. This only happens when we handle REPORT LUNS commands, so let's rename the function in scsi_clear_reported_luns_changed() and call it only in scsi_target_emulate_report_luns(). Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-ID: <20230712134352.118655-4-sgarzare@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-07-14scsi: cleanup scsi_clear_unit_attention()Stefano Garzarella
The previous commit moved the unit attention clearing when we create the request. So now we can clean scsi_clear_unit_attention() to handle only the case of the REPORT LUNS command: this is the only case in which a UNIT ATTENTION is cleared without having been reported. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-ID: <20230712134352.118655-3-sgarzare@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-07-14scsi: fetch unit attention when creating the requestStefano Garzarella
Commit 1880ad4f4e ("virtio-scsi: Batched prepare for cmd reqs") split calls to scsi_req_new() and scsi_req_enqueue() in the virtio-scsi device. No ill effects were observed until commit 8cc5583abe ("virtio-scsi: Send "REPORTED LUNS CHANGED" sense data upon disk hotplug events") added a unit attention that was easy to trigger with device hotplug and hot-unplug. Because the two calls were separated, all requests in the batch were prepared calling scsi_req_new() to report a sense. The first one submitted would report the right sense and reset it to NO_SENSE, while the others reported CHECK_CONDITION with no sense data. This caused SCSI errors in Linux. To solve this issue, let's fetch the unit attention as early as possible when we prepare the request, so that only the first request in the batch will use the unit attention SCSIReqOps and the others will not report CHECK CONDITION. Fixes: 1880ad4f4e ("virtio-scsi: Batched prepare for cmd reqs") Fixes: 8cc5583abe ("virtio-scsi: Send "REPORTED LUNS CHANGED" sense data upon disk hotplug events") Reported-by: Thomas Huth <thuth@redhat.com> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2176702 Co-developed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-ID: <20230712134352.118655-2-sgarzare@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-07-14kconfig: Add PCIe devices to s390x machinesCédric Le Goater
It is useful to extend the number of available PCIe devices to KVM guests for passthrough scenarios and also to expose these models to a different (big endian) architecture. Introduce a new config PCIE_DEVICES to select models, Intel Ethernet adapters and one USB controller. These devices all support MSI-X which is a requirement on s390x as legacy INTx are not supported. Cc: Matthew Rosato <mjrosato@linux.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Thomas Huth <thuth@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Message-ID: <20230712080146.839113-1-clg@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-07-14hw/ide/piix: properly initialize the BMIBA registerOlaf Hering
According to the 82371FB documentation (82371FB.pdf, 2.3.9. BMIBA-BUS MASTER INTERFACE BASE ADDRESS REGISTER, April 1997), the register is 32bit wide. To properly reset it to default values, all 32bit need to be cleared. Bit #0 "Resource Type Indicator (RTE)" needs to be enabled. The initial change wrote just the lower 8 bit, leaving parts of the "Bus Master Interface Base Address" address at bit 15:4 unchanged. Fixes: e6a71ae327 ("Add support for 82371FB (Step A1) and Improved support for 82371SB (Function 1)") Signed-off-by: Olaf Hering <olaf@aepfle.de> Reviewed-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20230712074721.14728-1-olaf@aepfle.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-07-12Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into stagingRichard Henderson
Pull request # -----BEGIN PGP SIGNATURE----- # # iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmSvAB0ACgkQnKSrs4Gr # c8hVzAgAomXGVhqm/qnQ99SIry+kec9a1Bom4ZprvpEtiHndoq8bw/ujeUlr/XK0 # CBKdYNYY3R1rSB6yLsV2ea45elk3x/iMqygbJF3QfWxpHfx0l8vs1WB6uSQFqo/E # ext1dvP8Czc0BP4MLaijvkW2u0j8qsLQnJcu9JDrRzgD8OqJSlhOxBSmb8VDvDvx # am0RMRkYxSl7jn2LFEE4mMfUjy9JJSFhnzP8lMoGH/m8C62Eult2PFDItnTAG8hN # IAyNDCDr2LKZwe6DP9JHUKCtqNYUHnGibgKH3k9NKWgUyOHSxqtDUC9vtoTPskGf # BRo0XZM7qnSUZCoAhEjvKVWcEkFIkw== # =aHUy # -----END PGP SIGNATURE----- # gpg: Signature made Wed 12 Jul 2023 08:33:49 PM BST # gpg: using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full] # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" [full] * tag 'block-pull-request' of https://gitlab.com/stefanha/qemu: virtio-blk: fix host notifier issues during dataplane start/stop Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-12virtio-blk: fix host notifier issues during dataplane start/stopStefan Hajnoczi
The main loop thread can consume 100% CPU when using --device virtio-blk-pci,iothread=<iothread>. ppoll() constantly returns but reading virtqueue host notifiers fails with EAGAIN. The file descriptors are stale and remain registered with the AioContext because of bugs in the virtio-blk dataplane start/stop code. The problem is that the dataplane start/stop code involves drain operations, which call virtio_blk_drained_begin() and virtio_blk_drained_end() at points where the host notifier is not operational: - In virtio_blk_data_plane_start(), blk_set_aio_context() drains after vblk->dataplane_started has been set to true but the host notifier has not been attached yet. - In virtio_blk_data_plane_stop(), blk_drain() and blk_set_aio_context() drain after the host notifier has already been detached but with vblk->dataplane_started still set to true. I would like to simplify ->ioeventfd_start/stop() to avoid interactions with drain entirely, but couldn't find a way to do that. Instead, this patch accepts the fragile nature of the code and reorders it so that vblk->dataplane_started is false during drain operations. This way the virtio_blk_drained_begin() and virtio_blk_drained_end() calls don't touch the host notifier. The result is that virtio_blk_data_plane_start() and virtio_blk_data_plane_stop() have complete control over the host notifier and stale file descriptors are no longer left in the AioContext. This patch fixes the 100% CPU consumption in the main loop thread and correctly moves host notifier processing to the IOThread. Fixes: 1665d9326fd2 ("virtio-blk: implement BlockDevOps->drained_begin()") Reported-by: Lukáš Doktor <ldoktor@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Lukas Doktor <ldoktor@redhat.com> Message-id: 20230704151527.193586-1-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-07-12Merge tag 'mem-2023-07-12' of https://github.com/davidhildenbrand/qemu into ↵Richard Henderson
staging Hi, "Host Memory Backends" and "Memory devices" queue ("mem"): - Memory device cleanups (especially around machine initialization) - "x-ignore-shared" migration support for virtio-mem - Add an abstract virtio-md-pci device as a common parent for virtio-mem-pci and virtio-pmem-pci (virtio based memory devices) - Device unplug support for virtio-mem-pci # -----BEGIN PGP SIGNATURE----- # # iQJFBAABCAAvFiEEG9nKrXNcTDpGDfzKTd4Q9wD/g1oFAmSuYAQRHGRhdmlkQHJl # ZGhhdC5jb20ACgkQTd4Q9wD/g1od9A/9HXT8IqKGup9is7P/mpobPWXczRGZ5sEg # /q21PzX6crr9aFa+fYRF/Dlm3G/cSMOVXFRKGz3royLjsvaEj/veEewfKF8KWbBf # eIS9udQTOwoD2kAhcv3pm0SwSJoVizpw2z7IodGVKE6iZxTXsmDksqQuFbrvVLSh # 2wtP4lizEXco/YsiCoAnStj2QtXBcHw7Ua7W2cDzxFmL+1pM5w3rjQ1ydCNz3bSG # l4CXXs1i8OmOZbFN78F/E9SEkzQnAuHSO0Sc1aeAJkwVzOt2lj/YMgt0jHjAY0at # pheWZ5pEE6hnQP740YXpt4Y6IIgO22pH23dLhq9A2reyRnwjt830uObHi3qAE8kB # KR+ZQ+Z5bI6ZNB/EFiUsC1dFsr2fF20zQlO02MctyJ+lUG6p3gpvwsGScQxt+zdF # QlkiSecGErYwC+nZ529SQB4gSEJTCjd/STDoidVYnZazdStaOaSyft02xRNzBPW/ # OnOY+6ZxZK6R11KfwGjnsftrovQIP3Pqi9TXGzW2xVlkWJHqlicy6G3ZfceTTlj9 # Gg2Ue694Wr1r4PDV2XlYcZ1IPLjSy5Msp5V2wERRrp3OItxnvegvTevQN7USEHC+ # BPGNMu11jriSY2pE5BSFN0hfGOvuvsk3GreLJiHFUXoje6gzAynuLjCN/CHdIVyK # 5i0AwdZ+xcA= # =ch6m # -----END PGP SIGNATURE----- # gpg: Signature made Wed 12 Jul 2023 09:10:44 AM BST # gpg: using RSA key 1BD9CAAD735C4C3A460DFCCA4DDE10F700FF835A # gpg: issuer "david@redhat.com" # gpg: Good signature from "David Hildenbrand <david@redhat.com>" [unknown] # gpg: aka "David Hildenbrand <davidhildenbrand@gmail.com>" [undefined] # gpg: aka "David Hildenbrand <hildenbr@in.tum.de>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 1BD9 CAAD 735C 4C3A 460D FCCA 4DDE 10F7 00FF 835A * tag 'mem-2023-07-12' of https://github.com/davidhildenbrand/qemu: (21 commits) virtio-mem-pci: Device unplug support virtio-mem: Prepare for device unplug support virtio-md-pci: Support unplug requests for compatible devices virtio-md-pci: Handle unplug of virtio based memory devices arm/virt: Use virtio-md-pci (un)plug functions pc: Factor out (un)plug handling of virtio-md-pci devices virtio-md-pci: New parent type for virtio-mem-pci and virtio-pmem-pci virtio-mem: Support "x-ignore-shared" migration migration/ram: Expose ramblock_is_ignored() as migrate_ram_is_ignored() virtio-mem: Skip most of virtio_mem_unplug_all() without plugged memory softmmu/physmem: Warn with ram_block_discard_range() on MAP_PRIVATE file mapping memory-device: Track used region size in DeviceMemoryState memory-device: Refactor memory_device_pre_plug() hw/i386/pc: Remove PC_MACHINE_DEVMEM_REGION_SIZE hw/i386/acpi-build: Rely on machine->device_memory when building SRAT hw/i386/pc: Use machine_memory_devices_init() hw/loongarch/virt: Use machine_memory_devices_init() hw/ppc/spapr: Use machine_memory_devices_init() hw/arm/virt: Use machine_memory_devices_init() memory-device: Introduce machine_memory_devices_init() ... Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-12virtio-mem-pci: Device unplug supportDavid Hildenbrand
Let's support device unplug by forwarding the unplug_request_check() callback to the virtio-mem device. Further, disallow changing the requested-size once an unplug request is pending. Disallowing requested-size changes handles corner cases such as (1) pausing the VM (2) requesting device unplug and (3) adjusting the requested size. If the VM would plug memory (due to the requested size change) before processing the unplug request, we would be in trouble. Message-ID: <20230711153445.514112-8-david@redhat.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2023-07-12virtio-mem: Prepare for device unplug supportDavid Hildenbrand
In many cases, blindly unplugging a virtio-mem device is problematic. We can only safely remove a device once: * The guest is not expecting to be able to read unplugged memory (unplugged-inaccessible == on) * The virtio-mem device does not have memory plugged (size == 0) * The virtio-mem device does not have outstanding requests to the VM to plug memory (requested-size == 0) So let's add a callback to the virtio-mem device class to check for that. We'll wire-up virtio-mem-pci next. Message-ID: <20230711153445.514112-7-david@redhat.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
2023-07-12virtio-md-pci: Support unplug requests for compatible devicesDavid Hildenbrand
Let's support unplug requests for virtio-md-pci devices that provide a unplug_request_check() callback. We'll wire that up for virtio-mem-pci next. Message-ID: <20230711153445.514112-6-david@redhat.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>