slackcoder/qemu - QEMU is a generic and open source machine & userspace emulator and virtualizer

Age	Commit message (Collapse)	Author
2023-06-05	accel/tcg: Move most of gen-icount.h into translator.c	Richard Henderson
	The only usage of gen_tb_start and gen_tb_end are here. Move the static icount_start_insn variable into a local within translator_loop. Simplify the two subroutines by passing in the existing local cflags variable. Leave only the declaration of gen_io_start in gen-icount.h. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05	exec-all: Widen TranslationBlock pc and cs_base to 64-bits	Richard Henderson
	This makes TranslationBlock agnostic to the address size of the guest. Use vaddr for pc, since that's always a virtual address. Use uint64_t for cs_base, since usage varies between guests. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05	tcg: Remove NO_CPU_IO_DEFS	Richard Henderson
	From this remove, it's no longer clear what this is attempting to protect. The last time a use of this define was added to the source tree, as opposed to merely moved around, was 2008. There have been many cleanups since that time and this is no longer required for the build to succeed. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05	tcg: Add guest_mo to TCGContext	Richard Henderson
	This replaces of TCG_GUEST_DEFAULT_MO in tcg-op-ldst.c. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05	tcg: Add insn_start_words to TCGContext	Richard Henderson
	This will enable replacement of TARGET_INSN_START_WORDS in tcg.c. Split out "tcg/insn-start-words.h" and use it in target/. Reviewed-by: Anton Johansson <anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05	tcg: Split helper-proto.h	Richard Henderson
	Create helper-proto-common.h without the target specific portion. Use that in tcg-op-common.h. Include helper-proto.h in target/arm and target/hexagon before helper-info.c.inc; all other targets are already correct in this regard. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05	tcg: Pass TCGHelperInfo to tcg_gen_callN	Richard Henderson
	In preparation for compiling tcg/ only once, eliminate the all_helpers array. Instantiate the info structs for the generic helpers in accel/tcg/, and the structs for the target-specific helpers in each translate.c. Since we don't see all of the info structs at startup, initialize at first use, using g_once_init_* to make sure we don't race while doing so. Reviewed-by: Anton Johansson <anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05	tcg: Split out tcg/oversized-guest.h	Richard Henderson
	Move a use of TARGET_LONG_BITS out of tcg/tcg.h. Include the new file only where required. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05	*: Add missing includes of tcg/tcg.h	Richard Henderson
	This had been pulled in from exec/cpu_ldst.h, via exec/exec-all.h, but the include of tcg.h will be removed. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05	tcg: Add tlb_fast_offset to TCGContext	Richard Henderson
	Disconnect the layout of ArchCPU from TCG compilation. Pass the relative offset of 'env' and 'neg.tlb.f' as a parameter. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05	tcg: Widen CPUTLBEntry comparators to 64-bits	Richard Henderson
	This makes CPUTLBEntry agnostic to the address size of the guest. When 32-bit addresses are in effect, we can simply read the low 32 bits of the 64-bit field. Similarly when we need to update the field for setting TLB_NOTDIRTY. For TCG backends that could in theory be big-endian, but in practice are not (arm, loongarch, riscv), use QEMU_BUILD_BUG_ON to document and ensure this is not accidentally missed. For s390x, which is always big-endian, use HOST_BIG_ENDIAN anyway, to document the reason for the adjustment. For sparc64 and ppc64, always perform a 64-bit load, and rely on the following 32-bit comparison to ignore the high bits. Rearrange mips and ppc if ladders for clarity. Reviewed-by: Anton Johansson <anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05	tcg: Move TCG_TYPE_TL from tcg.h to tcg-op.h	Richard Henderson
	Removes the only use of TARGET_LONG_BITS from tcg.h, which is to be target independent. Move the symbol to a define in tcg-op.h, which will continue to be target dependent. Rather than complicate matters for the use in tb_gen_code(), expand the definition there. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-01	accel/tcg: include cs_base in our hash calculations	Alex Bennée
	We weren't using cs_base in the hash calculations before. Since the arm front end moved a chunk of flags in a378206a20 (target/arm: Move mode specific TB flags to tb->cs_base) they comprise of an important part of the execution state. Widen the tb_hash_func to include cs_base and expand to qemu_xxhash8() to accommodate it. My initial benchmark shows very little difference in the runtime. Before: armhf ➜ hyperfine -w 2 -m 20 "./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot" Benchmark 1: ./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot Time (mean ± σ): 24.627 s ± 2.708 s [User: 34.309 s, System: 1.797 s] Range (min … max): 22.345 s … 29.864 s 20 runs arm64 ➜ hyperfine -w 2 -n 20 "./qemu-system-aarch64 -cpu max,pauth-impdef=on -machine type=virt,virtualization=on,gic-version=3 -display none -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-arm64 -device scsi-hd,drive=hd -smp 4 -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image.gz -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark-pigz.service' -snapshot" Benchmark 1: 20 Time (mean ± σ): 62.559 s ± 2.917 s [User: 189.115 s, System: 4.089 s] Range (min … max): 59.997 s … 70.153 s 10 runs After: armhf Benchmark 1: ./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot Time (mean ± σ): 24.223 s ± 2.151 s [User: 34.284 s, System: 1.906 s] Range (min … max): 22.000 s … 28.476 s 20 runs arm64 hyperfine -w 2 -n 20 "./qemu-system-aarch64 -cpu max,pauth-impdef=on -machine type=virt,virtualization=on,gic-version=3 -display none -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-arm64 -device scsi-hd,drive=hd -smp 4 -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image.gz -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark-pigz.service' -snapshot" Benchmark 1: 20 Time (mean ± σ): 62.769 s ± 1.978 s [User: 188.431 s, System: 5.269 s] Range (min … max): 60.285 s … 66.868 s 10 runs Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20230526165401.574474-12-alex.bennee@linaro.org Message-Id: <20230524133952.3971948-11-alex.bennee@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-06-01	tcg: remove the final vestiges of dstate	Alex Bennée
	Now we no longer have dynamic state affecting things we can remove the additional fields in cpu.h and simplify the TB hash calculation. For the benchmark: hyperfine -w 2 -m 20 \ "./arm-softmmu/qemu-system-arm -cpu cortex-a15 \ -machine type=virt,highmem=off \ -display none -m 2048 \ -serial mon:stdio \ -netdev user,id=unet,hostfwd=tcp::2222-:22 \ -device virtio-net-pci,netdev=unet \ -device virtio-scsi-pci \ -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf \ -device scsi-hd,drive=hd -smp 4 \ -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage \ -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' \ -snapshot" It has a marginal effect on runtime, before: Time (mean ± σ): 26.279 s ± 2.438 s [User: 41.113 s, System: 1.843 s] Range (min … max): 24.420 s … 32.565 s 20 runs after: Time (mean ± σ): 24.440 s ± 2.885 s [User: 34.474 s, System: 2.028 s] Range (min … max): 21.663 s … 29.937 s 20 runs Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1358 Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20230526165401.574474-10-alex.bennee@linaro.org Message-Id: <20230524133952.3971948-9-alex.bennee@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-05-30	accel/tcg: Extract store_atom_insert_al16 to host header	Richard Henderson
	Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-30	accel/tcg: Extract load_atom_extract_al16_or_al8 to host header	Richard Henderson
	Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-30	accel/tcg: Fix check for page writeability in load_atomic16_or_exit	Richard Henderson
	PAGE_WRITE is current writability, as modified by TB protection; PAGE_WRITE_ORG is the original page writability. Fixes: cdfac37be0d ("accel/tcg: Honor atomicity of loads") Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23	tcg: Remove DEBUG_DISAS	Richard Henderson
	This had been set since the beginning, is never undefined, and it would seem to be harmful to debugging to do so. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23	accel/tcg: Correctly use atomic128.h in ldst_atomicity.c.inc	Richard Henderson
	Remove the locally defined load_atomic16 and store_atomic16, along with HAVE_al16 and HAVE_al16_fast in favor of the routines defined in atomic128.h. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23	accel/tcg: Eliminate #if on HAVE_ATOMIC128 and HAVE_CMPXCHG128	Richard Henderson
	These symbols will shortly become dynamic runtime tests and therefore not appropriate for the preprocessor. Use the matching CONFIG_* symbols for that purpose. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23	accel/tcg: Remove prot argument to atomic_mmu_lookup	Richard Henderson
	Now that load/store are gone, we're always passing PAGE_READ \| PAGE_WRITE for RMW atomic operations. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23	accel/tcg: Remove cpu_atomic_{ld,st}o_*_mmu	Richard Henderson
	Atomic load/store of 128-byte quantities is now handled by cpu_{ld,st}16_mmu. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23	accel/tcg: Unify cpu_{ld,st}*_{be,le}_mmu	Richard Henderson
	With the current structure of cputlb.c, there is no difference between the little-endian and big-endian entry points, aside from the assert. Unify the pairs of functions. The only use of the functions with explicit endianness was in target/sparc64, and that was only to satisfy the assert: the correct endianness is already built into memop. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23	include/qemu: Move CONFIG_ATOMIC128_OPT handling to atomic128.h	Richard Henderson
	Not only the routines in ldst_atomicity.c.inc need markup, but also the ones in the headers. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-18	accel/tcg: Fix append_mem_cb	Richard Henderson
	In fcdab382c8b9 we removed a tcg_gen_extu_tl_i64 from gen_empty_mem_cb, and failed to adjust the associated copy, leading to a failed assert. Fixes: fcdab382c8b9 ("accel/tcg: Widen plugin_gen_empty_mem_callback to i64") Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20230518145813.2940745-1-richard.henderson@linaro.org>
2023-05-18	tcg: round-robin: do not use mb_read for rr_current_cpu	Paolo Bonzini
	Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-05-16	tcg: Add tlb_dyn_max_bits to TCGContext	Richard Henderson
	Disconnect guest tlb parameters from TCG compilation. Reviewed-by: Anton Johansson <anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16	tcg: Add page_bits and page_mask to TCGContext	Richard Henderson
	Disconnect guest page size from TCG compilation. While this could be done via exec/target_page.h, we want to cache the value across multiple memory access operations, so we might as well initialize this early. The changes within tcg/ are entirely mechanical: sed -i s/TARGET_PAGE_BITS/s->page_bits/g sed -i s/TARGET_PAGE_MASK/s->page_mask/g Reviewed-by: Anton Johansson <anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16	tcg: Add addr_type to TCGContext	Richard Henderson
	This will enable replacement of TARGET_LONG_BITS within tcg/. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16	accel/tcg: Widen plugin_gen_empty_mem_callback to i64	Richard Henderson
	Since we do this inside gen_empty_mem_cb anyway, let's do this earlier inside tcg expansion. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16	accel/tcg: Merge do_gen_mem_cb into caller	Richard Henderson
	As do_gen_mem_cb is called once, merge it into gen_empty_mem_cb. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16	accel/tcg: Merge gen_mem_wrapped with plugin_gen_empty_mem_callback	Richard Henderson
	As gen_mem_wrapped is only used in plugin_gen_empty_mem_callback, we can avoid the curiosity of union mem_gen_fn by inlining it. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16	tcg: Widen helper_atomic_* addresses to uint64_t	Richard Henderson
	Always pass the target address as uint64_t. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16	tcg: Widen helper_{ld,st}_i128 addresses to uint64_t	Richard Henderson
	Always pass the target address as uint64_t. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16	accel/tcg: Widen tcg-ldst.h addresses to uint64_t	Richard Henderson
	Always pass the target address as uint64_t. Adjust tcg_out_{ld,st}_helper_args to match. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16	tcg: Widen gen_insn_data to uint64_t	Richard Henderson
	We already pass uint64_t to restore_state_to_opc; this changes all of the other uses from insn_start through the encoding to decoding. Reviewed-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16	accel/tcg: Remove helper_unaligned_{ld,st}	Richard Henderson
	These functions are now unused. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16	meson: Detect atomic128 support with optimization	Richard Henderson
	There is an edge condition prior to gcc13 for which optimization is required to generate 16-byte atomic sequences. Detect this. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16	tcg: Add 128-bit guest memory primitives	Richard Henderson
	Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16	accel/tcg: Implement helper_{ld,st}*_mmu for user-only	Richard Henderson
	TCG backends may need to defer to a helper to implement the atomicity required by a given operation. Mirror the interface used in system mode. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16	tcg: Unify helper_{be,le}_{ld,st}*	Richard Henderson
	With the current structure of cputlb.c, there is no difference between the little-endian and big-endian entry points, aside from the assert. Unify the pairs of functions. Hoist the qemu_{ld,st}_helpers arrays to tcg.c. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16	accel/tcg: Honor atomicity of stores	Richard Henderson
	Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16	accel/tcg: Honor atomicity of loads	Richard Henderson
	Create ldst_atomicity.c.inc. Not required for user-only code loads, because we've ensured that the page is read-only before beginning to translate code. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-11	accel/tcg: Reorg system mode store helpers	Richard Henderson
	Instead of trying to unify all operations on uint64_t, use mmu_lookup() to perform the basic tlb hit and resolution. Create individual functions to handle access by size. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-11	accel/tcg: Reorg system mode load helpers	Richard Henderson
	Instead of trying to unify all operations on uint64_t, pull out mmu_lookup() to perform the basic tlb hit and resolution. Create individual functions to handle access by size. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-11	accel/tcg: Introduce tlb_read_idx	Richard Henderson
	Instead of playing with offsetof in various places, use MMUAccessType to index an array. This is easily defined instead of the previous dummy padding array in the union. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-11	accel/tcg: Add cpu_in_serial_context	Richard Henderson
	Like cpu_in_exclusive_context, but also true if there is no other cpu against which we could race. Use it in tb_flush as a direct replacement. Use it in cpu_loop_exit_atomic to ensure that there is no loop against cpu_exec_step_atomic. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-11	accel/tcg/tcg-accel-ops-rr: ensure fairness with icount	Jamie Iles
	The round-robin scheduler will iterate over the CPU list with an assigned budget until the next timer expiry and may exit early because of a TB exit. This is fine under normal operation but with icount enabled and SMP it is possible for a CPU to be starved of run time and the system live-locks. For example, booting a riscv64 platform with '-icount shift=0,align=off,sleep=on -smp 2' we observe a livelock once the kernel has timers enabled and starts performing TLB shootdowns. In this case we have CPU 0 in M-mode with interrupts disabled sending an IPI to CPU 1. As we enter the TCG loop, we assign the icount budget to next timer interrupt to CPU 0 and begin executing where the guest is sat in a busy loop exhausting all of the budget before we try to execute CPU 1 which is the target of the IPI but CPU 1 is left with no budget with which to execute and the process repeats. We try here to add some fairness by splitting the budget across all of the CPUs on the thread fairly before entering each one. The CPU count is cached on CPU list generation ID to avoid iterating the list on each loop iteration. With this change it is possible to boot an SMP rv64 guest with icount enabled and no hangs. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Jamie Iles <quic_jiles@quicinc.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230427020925.51003-3-quic_jiles@quicinc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-11	accel/tcg: Fix atomic_mmu_lookup for reads	Richard Henderson
	A copy-paste bug had us looking at the victim cache for writes. Cc: qemu-stable@nongnu.org Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Fixes: 08dff435e2 ("tcg: Probe the proper permissions for atomic ops") Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20230505204049.352469-1-richard.henderson@linaro.org>
2023-05-08	tb-maint: do not use mb_read/mb_set	Paolo Bonzini
	The load side can use a relaxed load, which will surely happen before the work item is run by async_safe_run_on_cpu() or before double-checking under mmap_lock. The store side can use an atomic RMW operation. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>