aboutsummaryrefslogtreecommitdiff
path: root/accel/tcg/cpu-exec.c
AgeCommit message (Collapse)Author
2023-07-17accel/tcg: Zero-pad PC in TCG CPU exec trace linesPeter Maydell
In commit f0a08b0913befbd we changed the type of the PC from target_ulong to vaddr. In doing so we inadvertently dropped the zero-padding on the PC in trace lines (the second item inside the [] in these lines). They used to look like this on AArch64, for instance: Trace 0: 0x7f2260000100 [00000000/0000000040000000/00000061/ff200000] and now they look like this: Trace 0: 0x7f4f50000100 [00000000/40000000/00000061/ff200000] and if the PC happens to be somewhere low like 0x5000 then the field is shown as /5000/. This is because TARGET_FMT_lx is a "%08x" or "%016x" specifier, depending on TARGET_LONG_SIZE, whereas VADDR_PRIx is just PRIx64 with no width specifier. Restore the zero-padding by adding an 016 width specifier to this tracing and a couple of others that were similarly recently changed to use VADDR_PRIx without a width specifier. We can't unfortunately restore the "32-bit guests are padded to 8 hex digits and 64-bit guests to 16 hex digits" behaviour so easily. Fixes: f0a08b0913befbd ("accel/tcg/cpu-exec.c: Widen pc to vaddr") Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Anton Johansson <anjo@rev.ng> Message-id: 20230711165434.4123674-1-peter.maydell@linaro.org
2023-07-15accel/tcg: Always lock pages before translationRichard Henderson
We had done this for user-mode by invoking page_protect within the translator loop. Extend this to handle system mode as well. Move page locking out of tb_link_page. Reported-by: Liren Wei <lrwei@bupt.edu.cn> Reported-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Richard W.M. Jones <rjones@redhat.com>
2023-07-15accel/tcg: Split out cpu_exec_longjmp_cleanupRichard Henderson
Share the setjmp cleanup between cpu_exec_step_atomic and cpu_exec_setjmp. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-26accel/tcg/cpu-exec.c: Widen pc to vaddrAnton Johansson
Signed-off-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230621135633.1649-7-anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-26target: Widen pc/cs_base in cpu_get_tb_cpu_stateAnton Johansson
Signed-off-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230621135633.1649-4-anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-20accel/tcg/cpu-exec: Use generic 'helper-proto-common.h' headerPhilippe Mathieu-Daudé
We only need lookup_tb_ptr() prototype. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230611085846.21415-3-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-20accel/tcg: Check for USER_ONLY definition instead of SOFTMMU onePhilippe Mathieu-Daudé
Since we *might* have user emulation with softmmu, replace the system emulation check by !user emulation one. Invert some if() ladders for clarity. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230613133347.82210-7-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-13util/log: Add vector registers to logIvan Klokov
Added QEMU option 'vpu' to log vector extension registers such as gpr\fpu. Signed-off-by: Ivan Klokov <ivan.klokov@syntacore.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20230410124451.15929-2-ivan.klokov@syntacore.com> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2023-06-06atomics: eliminate mb_read/mb_setPaolo Bonzini
qatomic_mb_read and qatomic_mb_set were the very first atomic primitives introduced for QEMU; their semantics are unclear and they provide a false sense of safety. The last use of qatomic_mb_read() has been removed, so delete it. qatomic_mb_set() instead can survive as an optimized qatomic_set()+smp_mb(), similar to Linux's smp_store_mb(), but rename it to qatomic_set_mb() to match the order of the two operations. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-06-05exec-all: Widen TranslationBlock pc and cs_base to 64-bitsRichard Henderson
This makes TranslationBlock agnostic to the address size of the guest. Use vaddr for pc, since that's always a virtual address. Use uint64_t for cs_base, since usage varies between guests. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-01accel/tcg: include cs_base in our hash calculationsAlex Bennée
We weren't using cs_base in the hash calculations before. Since the arm front end moved a chunk of flags in a378206a20 (target/arm: Move mode specific TB flags to tb->cs_base) they comprise of an important part of the execution state. Widen the tb_hash_func to include cs_base and expand to qemu_xxhash8() to accommodate it. My initial benchmark shows very little difference in the runtime. Before: armhf ➜ hyperfine -w 2 -m 20 "./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot" Benchmark 1: ./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot Time (mean ± σ): 24.627 s ± 2.708 s [User: 34.309 s, System: 1.797 s] Range (min … max): 22.345 s … 29.864 s 20 runs arm64 ➜ hyperfine -w 2 -n 20 "./qemu-system-aarch64 -cpu max,pauth-impdef=on -machine type=virt,virtualization=on,gic-version=3 -display none -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-arm64 -device scsi-hd,drive=hd -smp 4 -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image.gz -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark-pigz.service' -snapshot" Benchmark 1: 20 Time (mean ± σ): 62.559 s ± 2.917 s [User: 189.115 s, System: 4.089 s] Range (min … max): 59.997 s … 70.153 s 10 runs After: armhf Benchmark 1: ./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot Time (mean ± σ): 24.223 s ± 2.151 s [User: 34.284 s, System: 1.906 s] Range (min … max): 22.000 s … 28.476 s 20 runs arm64 hyperfine -w 2 -n 20 "./qemu-system-aarch64 -cpu max,pauth-impdef=on -machine type=virt,virtualization=on,gic-version=3 -display none -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-arm64 -device scsi-hd,drive=hd -smp 4 -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image.gz -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark-pigz.service' -snapshot" Benchmark 1: 20 Time (mean ± σ): 62.769 s ± 1.978 s [User: 188.431 s, System: 5.269 s] Range (min … max): 60.285 s … 66.868 s 10 runs Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20230526165401.574474-12-alex.bennee@linaro.org Message-Id: <20230524133952.3971948-11-alex.bennee@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-06-01tcg: remove the final vestiges of dstateAlex Bennée
Now we no longer have dynamic state affecting things we can remove the additional fields in cpu.h and simplify the TB hash calculation. For the benchmark: hyperfine -w 2 -m 20 \ "./arm-softmmu/qemu-system-arm -cpu cortex-a15 \ -machine type=virt,highmem=off \ -display none -m 2048 \ -serial mon:stdio \ -netdev user,id=unet,hostfwd=tcp::2222-:22 \ -device virtio-net-pci,netdev=unet \ -device virtio-scsi-pci \ -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf \ -device scsi-hd,drive=hd -smp 4 \ -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage \ -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' \ -snapshot" It has a marginal effect on runtime, before: Time (mean ± σ): 26.279 s ± 2.438 s [User: 41.113 s, System: 1.843 s] Range (min … max): 24.420 s … 32.565 s 20 runs after: Time (mean ± σ): 24.440 s ± 2.885 s [User: 34.474 s, System: 2.028 s] Range (min … max): 21.663 s … 29.937 s 20 runs Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1358 Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20230526165401.574474-10-alex.bennee@linaro.org Message-Id: <20230524133952.3971948-9-alex.bennee@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-05-23tcg: Remove DEBUG_DISASRichard Henderson
This had been set since the beginning, is never undefined, and it would seem to be harmful to debugging to do so. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-02accel/tcg: Use one_insn_per_tb global instead of old singlestep globalPeter Maydell
The only place left that looks at the old 'singlestep' global variable is the TCG curr_cflags() function. Replace the old global with a new 'one_insn_per_tb' which is defined in tcg-all.c and declared in accel/tcg/internal.h. This keeps it restricted to the TCG code, unlike 'singlestep' which was available to every file in the system and defined in multiple different places for softmmu vs linux-user vs bsd-user. While we're making this change, use qatomic_read() and qatomic_set() on the accesses to the new global, because TCG will read it without holding a lock. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20230417164041.684562-4-peter.maydell@linaro.org
2023-04-04accel/tcg: Fix jump cache set in cpu_exec_loopRichard Henderson
Assign pc and use store_release to assign tb. Fixes: 2dd5b7a1b91 ("accel/tcg: Move jmp-cache `CF_PCREL` checks to caller") Reported-by: Weiwei Li <liweiwei@iscas.ac.cn> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-03-22tcg: Clear plugin_mem_cbs on TB exitRichard Henderson
Do this in cpu_tb_exec (normal exit) and cpu_loop_exit (exception), adjacent to where we reset can_do_io. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1381 Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230310195252.210956-2-richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20230315174331.2959-12-alex.bennee@linaro.org> Reviewed-by: Emilio Cota <cota@braap.org>
2023-03-01accel/tcg: Replace `tb_pc()` with `tb->pc`Anton Johansson
Signed-off-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230227135202.9710-13-anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-03-01accel/tcg: Move jmp-cache `CF_PCREL` checks to callerAnton Johansson
tb-jmp-cache.h contains a few small functions that only exist to hide a CF_PCREL check, however the caller often already performs such a check. This patch moves CF_PCREL checks from the callee to the caller, and also removes these functions which now only hide an access of the jmp-cache. Signed-off-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230227135202.9710-12-anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-03-01accel/tcg: Replace `TARGET_TB_PCREL` with `CF_PCREL`Anton Johansson
Signed-off-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230227135202.9710-5-anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-02-27replay: Extract core API to 'exec/replay-core.h'Philippe Mathieu-Daudé
replay API is used deeply within TCG common code (common to user and system emulation). Unfortunately "sysemu/replay.h" requires some QAPI headers for few system-specific declarations, example: void replay_input_event(QemuConsole *src, InputEvent *evt); Since commit c2651c0eaa ("qapi/meson: Restrict UI module to system emulation and tools") the QAPI header defining the InputEvent is not generated anymore. To keep it simple, extract the 'core' replay prototypes to a new "exec/replay-core.h" header which we include in the TCG code that doesn't need the rest of the replay API. Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> Message-Id: <20221219170806.60580-5-philmd@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-02-27accel/tcg: Restrict 'qapi-commands-machine.h' to system emulationPhilippe Mathieu-Daudé
Since commit a0e61807a3 ("qapi: Remove QMP events and commands from user-mode builds") we don't generate the "qapi-commands-machine.h" header in a user-emulation-only build. Rename 'hmp.c' as 'monitor.c' and move the QMP functions from cpu-exec.c (which is always compiled) to monitor.c (which is only compiled when system-emulation is selected). Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221219170806.60580-4-philmd@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-02-27exec: Remove unused 'qemu/timer.h' timerPhilippe Mathieu-Daudé
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221219170806.60580-2-philmd@linaro.org>
2023-02-08Don't include headers already included by qemu/osdep.hMarkus Armbruster
This commit was created with scripts/clean-includes. Signed-off-by: Markus Armbruster <armbru@redhat.com> Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20230202133830.2152150-19-armbru@redhat.com>
2023-02-02cpu-exec: assert that plugin_mem_cbs is NULL after executionEmilio Cota
Fixes: #1381 Signed-off-by: Emilio Cota <cota@braap.org> Message-Id: <20230108165107.62488-1-cota@braap.org> [AJB: manually applied follow-up fix] Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230124180127.1881110-35-alex.bennee@linaro.org>
2023-02-02cpu: free cpu->tb_jmp_cache with RCUEmilio Cota
Fixes the appended use-after-free. The root cause is that during tb invalidation we use CPU_FOREACH, and therefore to safely free a vCPU we must wait for an RCU grace period to elapse. $ x86_64-linux-user/qemu-x86_64 tests/tcg/x86_64-linux-user/munmap-pthread ================================================================= ==1800604==ERROR: AddressSanitizer: heap-use-after-free on address 0x62d0005f7418 at pc 0x5593da6704eb bp 0x7f4961a7ac70 sp 0x7f4961a7ac60 READ of size 8 at 0x62d0005f7418 thread T2 #0 0x5593da6704ea in tb_jmp_cache_inval_tb ../accel/tcg/tb-maint.c:244 #1 0x5593da6704ea in do_tb_phys_invalidate ../accel/tcg/tb-maint.c:290 #2 0x5593da670631 in tb_phys_invalidate__locked ../accel/tcg/tb-maint.c:306 #3 0x5593da670631 in tb_invalidate_phys_page_range__locked ../accel/tcg/tb-maint.c:542 #4 0x5593da67106d in tb_invalidate_phys_range ../accel/tcg/tb-maint.c:614 #5 0x5593da6a64d4 in target_munmap ../linux-user/mmap.c:766 #6 0x5593da6dba05 in do_syscall1 ../linux-user/syscall.c:10105 #7 0x5593da6f564c in do_syscall ../linux-user/syscall.c:13329 #8 0x5593da49e80c in cpu_loop ../linux-user/x86_64/../i386/cpu_loop.c:233 #9 0x5593da6be28c in clone_func ../linux-user/syscall.c:6633 #10 0x7f496231cb42 in start_thread nptl/pthread_create.c:442 #11 0x7f49623ae9ff (/lib/x86_64-linux-gnu/libc.so.6+0x1269ff) 0x62d0005f7418 is located 28696 bytes inside of 32768-byte region [0x62d0005f0400,0x62d0005f8400) freed by thread T148 here: #0 0x7f49627b6460 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52 #1 0x5593da5ac057 in cpu_exec_unrealizefn ../cpu.c:180 #2 0x5593da81f851 (/home/cota/src/qemu/build/qemu-x86_64+0x484851) Signed-off-by: Emilio Cota <cota@braap.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230111151628.320011-2-cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20230124180127.1881110-27-alex.bennee@linaro.org>
2023-01-17tcg: Remove TCG_TARGET_HAS_direct_jumpRichard Henderson
We now have the option to generate direct or indirect goto_tb depending on the dynamic displacement, thus the define is no longer necessary or completely accurate. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg: Change tb_target_set_jmp_target argumentsRichard Henderson
Replace 'tc_ptr' and 'addr' with 'tb' and 'n'. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg: Add TranslationBlock.jmp_insn_offsetRichard Henderson
Stop overloading jmp_target_arg for both offset and address, depending on TCG_TARGET_HAS_direct_jump. Instead, add a new field to hold the jump insn offset and always set the target address in jmp_target_addr[]. This will allow a tcg backend to use either direct or indirect depending on displacement. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-16accel/tcg: Split out cpu_exec_{setjmp,loop}Richard Henderson
Recently the g_assert(cpu == current_cpu) test has been intermittently failing with gcc. Reorg the code around the setjmp to minimize the lifetime of the cpu variable affected by the setjmp. This appears to fix the existing issue with clang as well. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1147 Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-11-01accel/tcg: Complete cpu initialization before registrationRichard Henderson
Delay cpu_list_add until realize is complete, so that cross-cpu interaction does not happen with incomplete cpu state. For this, we must delay plugin initialization out of tcg_exec_realizefn, because no cpu_index has been assigned. Fixes a problem with cross-cpu jump cache flushing, when the jump cache has not yet been allocated. Fixes: a976a99a2975 ("include/hw/core: Create struct CPUJumpCache") Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Reported-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26accel/tcg: Introduce tb_{set_}page_addr{0,1}Richard Henderson
This data structure will be replaced for user-only: add accessors. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26accel/tcg: Add a quicker check for breakpointsLeandro Lupori
Profiling QEMU during Fedora 35 for PPC64 boot revealed that a considerable amount of time was being spent in check_for_breakpoints() (0.61% of total time on PPC64 and 2.19% on amd64), even though it was just checking that its queue was empty and returning, when no breakpoints were set. It turns out this function is not inlined by the compiler and it's always called by helper_lookup_tb_ptr(), one of the most called functions. By leaving only the check for empty queue in check_for_breakpoints() and moving the remaining code to check_for_breakpoints_slow(), called only when the queue is not empty, it's possible to avoid the call overhead. An improvement of about 3% in total time was measured on POWER9. Signed-off-by: Leandro Lupori <leandro.lupori@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221025202424.195984-2-leandro.lupori@eldorado.org.br> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-04accel/tcg: Introduce TARGET_TB_PCRELRichard Henderson
Prepare for targets to be able to produce TBs that can run in more than one virtual context. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-04accel/tcg: Introduce tb_pc and log_pcRichard Henderson
The availability of tb->pc will shortly be conditional. Introduce accessor functions to minimize ifdefs. Pass around a known pc to places like tcg_gen_code, where the caller must already have the value. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-04include/hw/core: Create struct CPUJumpCacheRichard Henderson
Wrap the bare TranslationBlock pointer into a structure. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-04accel/tcg: Do not align tb->page_addr[0]Richard Henderson
Let tb->page_addr[0] contain the address of the first byte of the translated block, rather than the address of the page containing the start of the translated block. We need to recover this value anyway at various points, and it is easier to discard a page offset when it is not needed, which happens naturally via the existing find_page shift. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-09-06accel/tcg: Document the faulting lookup in tb_lookup_cmpRichard Henderson
It was non-obvious to me why we can raise an exception in the middle of a comparison function, but it works. While nearby, use TARGET_PAGE_ALIGN instead of open-coding. Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-09-06accel/tcg: Make tb_htable_lookup staticRichard Henderson
The function is not used outside of cpu-exec.c. Move it and its subroutines up in the file, before the first use. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-09-06accel/tcg: Unlock mmap_lock after longjmpRichard Henderson
The mmap_lock is held around tb_gen_code. While the comment is correct that the lock is dropped when tb_gen_code runs out of memory, the lock is *not* dropped when an exception is raised reading code for translation. Acked-by: Alistair Francis <alistair.francis@wdc.com> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-06-11accel/tcg: Inline dump_opcount_info() and remove itBernhard Beschow
dump_opcount_info() is a one-line wrapper around tcg_dump_op_count() which is also exported. So use the latter directly. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Message-Id: <20220520180109.8224-10-shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-06-11accel/tcg/cpu-exec: Unexport dump_drift_info()Bernhard Beschow
Commit 3a841ab53f165910224dc4bebabf1a8f1d04200c 'qapi: introduce x-query-jit QMP command' basically moved the only function using dump_drift_info() to cpu-exec.c. Therefore, dump_drift_info() doesn't need to be exported any longer. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Message-Id: <20220520180109.8224-9-shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-04-20accel/tcg: Use cpu_dump_state between qemu_log_trylock/unlockRichard Henderson
Inside log_cpu_state, we perform qemu_log_trylock/unlock, which need not be done if we have already performed the lock beforehand. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220417183019.755276-15-richard.henderson@linaro.org>
2022-04-20*: Use fprintf between qemu_log_trylock/unlockRichard Henderson
Inside qemu_log, we perform qemu_log_trylock/unlock, which need not be done if we have already performed the lock beforehand. Always check the result of qemu_log_trylock -- only checking qemu_loglevel_mask races with the acquisition of the lock on the logfile. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220417183019.755276-10-richard.henderson@linaro.org>
2022-04-20util/log: Rename qemu_log_lock to qemu_log_trylockRichard Henderson
This function can fail, which makes it more like ftrylockfile or pthread_mutex_trylock than flockfile or pthread_mutex_lock, so rename it. To closer match the other trylock functions, release rcu_read_lock along the failure path, so that qemu_log_unlock need not be called on failure. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220417183019.755276-8-richard.henderson@linaro.org>
2022-04-06Remove qemu-common.h include from most unitsMarc-André Lureau
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-33-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-06accel/tcg: Remove pointless CPUArchState castsPhilippe Mathieu-Daudé
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220305233415.64627-2-philippe.mathieu.daude@gmail.com>
2022-02-28accel/tcg/cpu-exec: Fix precise single-stepping after interruptLuc Michel
In some cases, cpu->exit_request can be false after handling the interrupt, leading to another TB being executed instead of returning to the main loop. Fix this by returning true unconditionally when in single-step mode. Fixes: ba3c35d9c402 ("tcg/cpu-exec: precise single-stepping after an interrupt") Signed-off-by: Luc Michel <lmichel@kalray.eu> Message-Id: <20220214132656.11397-1-lmichel@kalray.eu> [rth: Unlock iothread mutex; simplify indentation] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-02-09replay: use CF_NOIRQ for special exception-replaying TBPavel Dovgalyuk
Commit aff0e204cb1f1c036a496c94c15f5dfafcd9b4b4 introduced CF_NOIRQ usage, but one case was forgotten. Record/replay uses one special TB which is not really executed, but used to cause a correct exception in replay mode. This patch adds CF_NOIRQ flag for such block. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <164362834054.1754532.7678416881159817273.stgit@pasha-ThinkPad-X280> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-01-18monitor: move x-query-profile into accel/tcg to fix buildAlex Bennée
As --enable-profiler isn't defended in CI we missed this breakage. Move the qmp handler into accel/tcg so we have access to the helpers we need. While we are at it ensure we gate the feature on CONFIG_TCG. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Fixes: 37087fde0e ("qapi: introduce x-query-profile QMP command") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/773 Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20220105135009.1584676-23-alex.bennee@linaro.org>
2021-11-29accel/tcg: suppress IRQ check for special TBsAlex Bennée
When we set cpu->cflags_next_tb it is because we want to carefully control the execution of the next TB. Currently there is a race that causes the second stage of watchpoint handling to get ignored if an IRQ is processed before we finish executing the instruction that triggers the watchpoint. Use the new CF_NOIRQ facility to avoid the race. We also suppress IRQs when handling precise self modifying code to avoid unnecessary bouncing. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Cc: Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/245 Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211129140932.4115115-3-alex.bennee@linaro.org>