aboutsummaryrefslogtreecommitdiff
path: root/accel/tcg
AgeCommit message (Collapse)Author
2023-06-05accel/tcg: Move most of gen-icount.h into translator.cRichard Henderson
The only usage of gen_tb_start and gen_tb_end are here. Move the static icount_start_insn variable into a local within translator_loop. Simplify the two subroutines by passing in the existing local cflags variable. Leave only the declaration of gen_io_start in gen-icount.h. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05exec-all: Widen TranslationBlock pc and cs_base to 64-bitsRichard Henderson
This makes TranslationBlock agnostic to the address size of the guest. Use vaddr for pc, since that's always a virtual address. Use uint64_t for cs_base, since usage varies between guests. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05tcg: Remove NO_CPU_IO_DEFSRichard Henderson
From this remove, it's no longer clear what this is attempting to protect. The last time a use of this define was added to the source tree, as opposed to merely moved around, was 2008. There have been many cleanups since that time and this is no longer required for the build to succeed. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05tcg: Add guest_mo to TCGContextRichard Henderson
This replaces of TCG_GUEST_DEFAULT_MO in tcg-op-ldst.c. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05tcg: Add insn_start_words to TCGContextRichard Henderson
This will enable replacement of TARGET_INSN_START_WORDS in tcg.c. Split out "tcg/insn-start-words.h" and use it in target/. Reviewed-by: Anton Johansson <anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05tcg: Split helper-proto.hRichard Henderson
Create helper-proto-common.h without the target specific portion. Use that in tcg-op-common.h. Include helper-proto.h in target/arm and target/hexagon before helper-info.c.inc; all other targets are already correct in this regard. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05tcg: Pass TCGHelperInfo to tcg_gen_callNRichard Henderson
In preparation for compiling tcg/ only once, eliminate the all_helpers array. Instantiate the info structs for the generic helpers in accel/tcg/, and the structs for the target-specific helpers in each translate.c. Since we don't see all of the info structs at startup, initialize at first use, using g_once_init_* to make sure we don't race while doing so. Reviewed-by: Anton Johansson <anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05tcg: Split out tcg/oversized-guest.hRichard Henderson
Move a use of TARGET_LONG_BITS out of tcg/tcg.h. Include the new file only where required. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05*: Add missing includes of tcg/tcg.hRichard Henderson
This had been pulled in from exec/cpu_ldst.h, via exec/exec-all.h, but the include of tcg.h will be removed. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05tcg: Add tlb_fast_offset to TCGContextRichard Henderson
Disconnect the layout of ArchCPU from TCG compilation. Pass the relative offset of 'env' and 'neg.tlb.f' as a parameter. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05tcg: Widen CPUTLBEntry comparators to 64-bitsRichard Henderson
This makes CPUTLBEntry agnostic to the address size of the guest. When 32-bit addresses are in effect, we can simply read the low 32 bits of the 64-bit field. Similarly when we need to update the field for setting TLB_NOTDIRTY. For TCG backends that could in theory be big-endian, but in practice are not (arm, loongarch, riscv), use QEMU_BUILD_BUG_ON to document and ensure this is not accidentally missed. For s390x, which is always big-endian, use HOST_BIG_ENDIAN anyway, to document the reason for the adjustment. For sparc64 and ppc64, always perform a 64-bit load, and rely on the following 32-bit comparison to ignore the high bits. Rearrange mips and ppc if ladders for clarity. Reviewed-by: Anton Johansson <anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05tcg: Move TCG_TYPE_TL from tcg.h to tcg-op.hRichard Henderson
Removes the only use of TARGET_LONG_BITS from tcg.h, which is to be target independent. Move the symbol to a define in tcg-op.h, which will continue to be target dependent. Rather than complicate matters for the use in tb_gen_code(), expand the definition there. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-01accel/tcg: include cs_base in our hash calculationsAlex Bennée
We weren't using cs_base in the hash calculations before. Since the arm front end moved a chunk of flags in a378206a20 (target/arm: Move mode specific TB flags to tb->cs_base) they comprise of an important part of the execution state. Widen the tb_hash_func to include cs_base and expand to qemu_xxhash8() to accommodate it. My initial benchmark shows very little difference in the runtime. Before: armhf ➜ hyperfine -w 2 -m 20 "./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot" Benchmark 1: ./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot Time (mean ± σ): 24.627 s ± 2.708 s [User: 34.309 s, System: 1.797 s] Range (min … max): 22.345 s … 29.864 s 20 runs arm64 ➜ hyperfine -w 2 -n 20 "./qemu-system-aarch64 -cpu max,pauth-impdef=on -machine type=virt,virtualization=on,gic-version=3 -display none -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-arm64 -device scsi-hd,drive=hd -smp 4 -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image.gz -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark-pigz.service' -snapshot" Benchmark 1: 20 Time (mean ± σ): 62.559 s ± 2.917 s [User: 189.115 s, System: 4.089 s] Range (min … max): 59.997 s … 70.153 s 10 runs After: armhf Benchmark 1: ./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot Time (mean ± σ): 24.223 s ± 2.151 s [User: 34.284 s, System: 1.906 s] Range (min … max): 22.000 s … 28.476 s 20 runs arm64 hyperfine -w 2 -n 20 "./qemu-system-aarch64 -cpu max,pauth-impdef=on -machine type=virt,virtualization=on,gic-version=3 -display none -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-arm64 -device scsi-hd,drive=hd -smp 4 -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image.gz -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark-pigz.service' -snapshot" Benchmark 1: 20 Time (mean ± σ): 62.769 s ± 1.978 s [User: 188.431 s, System: 5.269 s] Range (min … max): 60.285 s … 66.868 s 10 runs Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20230526165401.574474-12-alex.bennee@linaro.org Message-Id: <20230524133952.3971948-11-alex.bennee@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-06-01tcg: remove the final vestiges of dstateAlex Bennée
Now we no longer have dynamic state affecting things we can remove the additional fields in cpu.h and simplify the TB hash calculation. For the benchmark: hyperfine -w 2 -m 20 \ "./arm-softmmu/qemu-system-arm -cpu cortex-a15 \ -machine type=virt,highmem=off \ -display none -m 2048 \ -serial mon:stdio \ -netdev user,id=unet,hostfwd=tcp::2222-:22 \ -device virtio-net-pci,netdev=unet \ -device virtio-scsi-pci \ -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf \ -device scsi-hd,drive=hd -smp 4 \ -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage \ -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' \ -snapshot" It has a marginal effect on runtime, before: Time (mean ± σ): 26.279 s ± 2.438 s [User: 41.113 s, System: 1.843 s] Range (min … max): 24.420 s … 32.565 s 20 runs after: Time (mean ± σ): 24.440 s ± 2.885 s [User: 34.474 s, System: 2.028 s] Range (min … max): 21.663 s … 29.937 s 20 runs Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1358 Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20230526165401.574474-10-alex.bennee@linaro.org Message-Id: <20230524133952.3971948-9-alex.bennee@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-05-30accel/tcg: Extract store_atom_insert_al16 to host headerRichard Henderson
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-30accel/tcg: Extract load_atom_extract_al16_or_al8 to host headerRichard Henderson
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-30accel/tcg: Fix check for page writeability in load_atomic16_or_exitRichard Henderson
PAGE_WRITE is current writability, as modified by TB protection; PAGE_WRITE_ORG is the original page writability. Fixes: cdfac37be0d ("accel/tcg: Honor atomicity of loads") Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23tcg: Remove DEBUG_DISASRichard Henderson
This had been set since the beginning, is never undefined, and it would seem to be harmful to debugging to do so. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23accel/tcg: Correctly use atomic128.h in ldst_atomicity.c.incRichard Henderson
Remove the locally defined load_atomic16 and store_atomic16, along with HAVE_al16 and HAVE_al16_fast in favor of the routines defined in atomic128.h. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23accel/tcg: Eliminate #if on HAVE_ATOMIC128 and HAVE_CMPXCHG128Richard Henderson
These symbols will shortly become dynamic runtime tests and therefore not appropriate for the preprocessor. Use the matching CONFIG_* symbols for that purpose. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23accel/tcg: Remove prot argument to atomic_mmu_lookupRichard Henderson
Now that load/store are gone, we're always passing PAGE_READ | PAGE_WRITE for RMW atomic operations. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23accel/tcg: Remove cpu_atomic_{ld,st}o_*_mmuRichard Henderson
Atomic load/store of 128-byte quantities is now handled by cpu_{ld,st}16_mmu. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23accel/tcg: Unify cpu_{ld,st}*_{be,le}_mmuRichard Henderson
With the current structure of cputlb.c, there is no difference between the little-endian and big-endian entry points, aside from the assert. Unify the pairs of functions. The only use of the functions with explicit endianness was in target/sparc64, and that was only to satisfy the assert: the correct endianness is already built into memop. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23include/qemu: Move CONFIG_ATOMIC128_OPT handling to atomic128.hRichard Henderson
Not only the routines in ldst_atomicity.c.inc need markup, but also the ones in the headers. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-18accel/tcg: Fix append_mem_cbRichard Henderson
In fcdab382c8b9 we removed a tcg_gen_extu_tl_i64 from gen_empty_mem_cb, and failed to adjust the associated copy, leading to a failed assert. Fixes: fcdab382c8b9 ("accel/tcg: Widen plugin_gen_empty_mem_callback to i64") Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20230518145813.2940745-1-richard.henderson@linaro.org>
2023-05-18tcg: round-robin: do not use mb_read for rr_current_cpuPaolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-05-16tcg: Add tlb_dyn_max_bits to TCGContextRichard Henderson
Disconnect guest tlb parameters from TCG compilation. Reviewed-by: Anton Johansson <anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16tcg: Add page_bits and page_mask to TCGContextRichard Henderson
Disconnect guest page size from TCG compilation. While this could be done via exec/target_page.h, we want to cache the value across multiple memory access operations, so we might as well initialize this early. The changes within tcg/ are entirely mechanical: sed -i s/TARGET_PAGE_BITS/s->page_bits/g sed -i s/TARGET_PAGE_MASK/s->page_mask/g Reviewed-by: Anton Johansson <anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16tcg: Add addr_type to TCGContextRichard Henderson
This will enable replacement of TARGET_LONG_BITS within tcg/. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16accel/tcg: Widen plugin_gen_empty_mem_callback to i64Richard Henderson
Since we do this inside gen_empty_mem_cb anyway, let's do this earlier inside tcg expansion. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16accel/tcg: Merge do_gen_mem_cb into callerRichard Henderson
As do_gen_mem_cb is called once, merge it into gen_empty_mem_cb. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16accel/tcg: Merge gen_mem_wrapped with plugin_gen_empty_mem_callbackRichard Henderson
As gen_mem_wrapped is only used in plugin_gen_empty_mem_callback, we can avoid the curiosity of union mem_gen_fn by inlining it. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16tcg: Widen helper_atomic_* addresses to uint64_tRichard Henderson
Always pass the target address as uint64_t. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16tcg: Widen helper_{ld,st}_i128 addresses to uint64_tRichard Henderson
Always pass the target address as uint64_t. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16accel/tcg: Widen tcg-ldst.h addresses to uint64_tRichard Henderson
Always pass the target address as uint64_t. Adjust tcg_out_{ld,st}_helper_args to match. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16tcg: Widen gen_insn_data to uint64_tRichard Henderson
We already pass uint64_t to restore_state_to_opc; this changes all of the other uses from insn_start through the encoding to decoding. Reviewed-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16accel/tcg: Remove helper_unaligned_{ld,st}Richard Henderson
These functions are now unused. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16meson: Detect atomic128 support with optimizationRichard Henderson
There is an edge condition prior to gcc13 for which optimization is required to generate 16-byte atomic sequences. Detect this. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16tcg: Add 128-bit guest memory primitivesRichard Henderson
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16accel/tcg: Implement helper_{ld,st}*_mmu for user-onlyRichard Henderson
TCG backends may need to defer to a helper to implement the atomicity required by a given operation. Mirror the interface used in system mode. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16tcg: Unify helper_{be,le}_{ld,st}*Richard Henderson
With the current structure of cputlb.c, there is no difference between the little-endian and big-endian entry points, aside from the assert. Unify the pairs of functions. Hoist the qemu_{ld,st}_helpers arrays to tcg.c. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16accel/tcg: Honor atomicity of storesRichard Henderson
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16accel/tcg: Honor atomicity of loadsRichard Henderson
Create ldst_atomicity.c.inc. Not required for user-only code loads, because we've ensured that the page is read-only before beginning to translate code. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-11accel/tcg: Reorg system mode store helpersRichard Henderson
Instead of trying to unify all operations on uint64_t, use mmu_lookup() to perform the basic tlb hit and resolution. Create individual functions to handle access by size. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-11accel/tcg: Reorg system mode load helpersRichard Henderson
Instead of trying to unify all operations on uint64_t, pull out mmu_lookup() to perform the basic tlb hit and resolution. Create individual functions to handle access by size. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-11accel/tcg: Introduce tlb_read_idxRichard Henderson
Instead of playing with offsetof in various places, use MMUAccessType to index an array. This is easily defined instead of the previous dummy padding array in the union. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-11accel/tcg: Add cpu_in_serial_contextRichard Henderson
Like cpu_in_exclusive_context, but also true if there is no other cpu against which we could race. Use it in tb_flush as a direct replacement. Use it in cpu_loop_exit_atomic to ensure that there is no loop against cpu_exec_step_atomic. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-11accel/tcg/tcg-accel-ops-rr: ensure fairness with icountJamie Iles
The round-robin scheduler will iterate over the CPU list with an assigned budget until the next timer expiry and may exit early because of a TB exit. This is fine under normal operation but with icount enabled and SMP it is possible for a CPU to be starved of run time and the system live-locks. For example, booting a riscv64 platform with '-icount shift=0,align=off,sleep=on -smp 2' we observe a livelock once the kernel has timers enabled and starts performing TLB shootdowns. In this case we have CPU 0 in M-mode with interrupts disabled sending an IPI to CPU 1. As we enter the TCG loop, we assign the icount budget to next timer interrupt to CPU 0 and begin executing where the guest is sat in a busy loop exhausting all of the budget before we try to execute CPU 1 which is the target of the IPI but CPU 1 is left with no budget with which to execute and the process repeats. We try here to add some fairness by splitting the budget across all of the CPUs on the thread fairly before entering each one. The CPU count is cached on CPU list generation ID to avoid iterating the list on each loop iteration. With this change it is possible to boot an SMP rv64 guest with icount enabled and no hangs. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Jamie Iles <quic_jiles@quicinc.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230427020925.51003-3-quic_jiles@quicinc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-11accel/tcg: Fix atomic_mmu_lookup for readsRichard Henderson
A copy-paste bug had us looking at the victim cache for writes. Cc: qemu-stable@nongnu.org Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Fixes: 08dff435e2 ("tcg: Probe the proper permissions for atomic ops") Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20230505204049.352469-1-richard.henderson@linaro.org>
2023-05-08tb-maint: do not use mb_read/mb_setPaolo Bonzini
The load side can use a relaxed load, which will surely happen before the work item is run by async_safe_run_on_cpu() or before double-checking under mmap_lock. The store side can use an atomic RMW operation. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>