aboutsummaryrefslogtreecommitdiff
path: root/tcg
AgeCommit message (Collapse)Author
2023-01-23tcg/arm: Use register pair allocation for qemu_{ld,st}_i64Richard Henderson
Although we still can't use ldrd and strd for all operations, increase the chances by getting the register allocation correct. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-23tcg: Avoid recursion in tcg_gen_mulu2_i32Richard Henderson
We have a test for one of TCG_TARGET_HAS_mulu2_i32 or TCG_TARGET_HAS_muluh_i32 being defined, but the test became non-functional when we changed to always define all of these macros. Replace this with a build-time test in tcg_gen_mulu2_i32. Fixes: 25c4d9cc845 ("tcg: Always define all of the TCGOpcode enum members.") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1435 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-20tcg/riscv: Use tcg_pcrel_diff in tcg_out_ldstRichard Henderson
We failed to update this with the w^x split, so misses the fact that true pc-relative offsets are usually small. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20230117230415.354239-1-richard.henderson@linaro.org> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2023-01-17tcg/riscv: Implement direct branch for goto_tbRichard Henderson
Now that tcg can handle direct and indirect goto_tb simultaneously, we can optimistically leave space for a direct branch and fall back to loading the pointer from the TB for an indirect branch. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg/riscv: Introduce OPC_NOPRichard Henderson
Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg/arm: Implement direct branch for goto_tbRichard Henderson
Now that tcg can handle direct and indirect goto_tb simultaneously, we can optimistically leave space for a direct branch and fall back to loading the pointer from the TB for an indirect branch. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg/sparc64: Reorg goto_tb implementationRichard Henderson
The old sparc64 implementation may replace two insns, which leaves a race condition in which a thread could be stopped at a PC in the middle of the sequence, and when restarted does not see the complete address computation and branches to nowhere. The new implemetation replaces only one insn, swapping between a direct branch and a direct call. The TCG_REG_TB register is loaded from tb->jmp_target_addr[] in the delay slot. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg/sparc64: Remove USE_REG_TBRichard Henderson
This is always true for sparc64, so this is dead since 3a5f6805c7ca. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg/ppc: Reorg goto_tb implementationRichard Henderson
The old ppc64 implementation replaces 2 or 4 insns, which leaves a race condition in which a thread could be stopped at a PC in the middle of the sequence, and when restarted does not see the complete address computation and branches to nowhere. The new implemetation replaces only one insn, swapping between b <dest> and mtctr r31 falling through to a general-case indirect branch. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg/aarch64: Reorg goto_tb implementationRichard Henderson
The old implementation replaces two insns, swapping between b <dest> nop br x30 and adrp x30, <dest> addi x30, x30, lo12:<dest> br x30 There is a race condition in which a thread could be stopped at the PC of the second insn, and when restarted does not see the complete address computation and branches to nowhere. The new implemetation replaces only one insn, swapping between b <dest> br tmp and ldr tmp, <jmp_addr> br tmp Reported-by: hev <r@hev.cc> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg: Remove TCG_TARGET_HAS_direct_jumpRichard Henderson
We now have the option to generate direct or indirect goto_tb depending on the dynamic displacement, thus the define is no longer necessary or completely accurate. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg: Always define tb_target_set_jmp_targetRichard Henderson
Install empty versions for !TCG_TARGET_HAS_direct_jump hosts. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg: Move tb_target_set_jmp_target declaration to tcg.hRichard Henderson
Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg: Change tb_target_set_jmp_target argumentsRichard Henderson
Replace 'tc_ptr' and 'addr' with 'tb' and 'n'. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg: Add TranslationBlock.jmp_insn_offsetRichard Henderson
Stop overloading jmp_target_arg for both offset and address, depending on TCG_TARGET_HAS_direct_jump. Instead, add a new field to hold the jump insn offset and always set the target address in jmp_target_addr[]. This will allow a tcg backend to use either direct or indirect depending on displacement. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg: Add gen_tb to TCGContextRichard Henderson
This can replace four other variables that are references into the TranslationBlock structure. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg: Rename TB_JMP_RESET_OFFSET_INVALID to TB_JMP_OFFSET_INVALIDRichard Henderson
This will shortly be used for more than reset. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg: Split out tcg_out_goto_tbRichard Henderson
The INDEX_op_goto_tb opcode needs no register allocation. Split out a dedicated helper function for it. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg: Introduce get_jmp_target_addrRichard Henderson
Similar to the existing set_jmp_reset_offset. Include the rw->rx address space conversion done by arm and s390x, and forgotten by mips and riscv. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg: Introduce set_jmp_insn_offsetRichard Henderson
Similar to the existing set_jmp_reset_offset. Move any assert for TCG_TARGET_HAS_direct_jump into the new function (which now cannot be build-time). Will be unused if TCG_TARGET_HAS_direct_jump is constant 0, but we can't test for constant in the preprocessor, so just mark it G_GNUC_UNUSED. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg: Replace asserts on tcg_jmp_insn_offsetRichard Henderson
Test TCG_TARGET_HAS_direct_jump instead of testing an implementation pointer. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg/sparc64: Remove unused goto_tb code for indirect jumpRichard Henderson
Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg/ppc: Remove unused goto_tb code for indirect jumpRichard Henderson
Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg/i386: Remove unused goto_tb code for indirect jumpRichard Henderson
Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17tcg: Split out tcg_out_exit_tbRichard Henderson
The INDEX_op_exit_tb opcode needs no register allocation. Split out a dedicated helper function for it. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-16tcg: add perfmap and jitdumpIlya Leoshkevich
Add ability to dump /tmp/perf-<pid>.map and jit-<pid>.dump. The first one allows the perf tool to map samples to each individual translation block. The second one adds the ability to resolve symbol names, line numbers and inspect JITed code. Example of use: perf record qemu-x86_64 -perfmap ./a.out perf report or perf record -k 1 qemu-x86_64 -jitdump ./a.out DEBUGINFOD_URLS= perf inject -j -i perf.data -o perf.data.jitted perf report -i perf.data.jitted Co-developed-by: Vanderson M. do Rosario <vandersonmr2@gmail.com> Co-developed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Message-Id: <20230112152013.125680-4-iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-08Merge tag 'pull-tcg-20230106' of https://gitlab.com/rth7680/qemu into stagingPeter Maydell
tcg/s390x improvements: - drop support for pre-z196 cpus (eol before 2017) - add support for misc-instruction-extensions-3 - misc cleanups # gpg: Signature made Sat 07 Jan 2023 07:47:59 GMT # gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F # gpg: issuer "richard.henderson@linaro.org" # gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full] # Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F * tag 'pull-tcg-20230106' of https://gitlab.com/rth7680/qemu: (27 commits) tcg/s390x: Avoid the constant pool in tcg_out_movi tcg/s390x: Cleanup tcg_out_movi tcg/s390x: Tighten constraints for 64-bit compare tcg/s390x: Implement ctpop operation tcg/s390x: Use tgen_movcond_int in tgen_clz tcg/s390x: Support SELGR instruction in movcond tcg/s390x: Generalize movcond implementation tcg/s390x: Create tgen_cmp2 to simplify movcond tcg/s390x: Support MIE3 logical operations tcg/s390x: Tighten constraints for and_i64 tcg/s390x: Tighten constraints for or_i64 and xor_i64 tcg/s390x: Issue XILF directly for xor_i32 tcg/s390x: Support MIE2 MGRK instruction tcg/s390x: Support MIE2 multiply single instructions tcg/s390x: Distinguish RIE formats tcg/s390x: Distinguish RRF-a and RRF-c formats tcg/s390x: Use LARL+AGHI for odd addresses tcg/s390x: Remove DISTINCT_OPERANDS facility check tcg/s390x: Remove FAST_BCR_SER facility check tcg/s390x: Check for load-on-condition facility at startup ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-01-06tcg/s390x: Avoid the constant pool in tcg_out_moviRichard Henderson
Load constants in no more than two insns, which turns out to be faster than using the constant pool. Suggested-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06tcg/s390x: Cleanup tcg_out_moviRichard Henderson
Merge maybe_out_small_movi, as it no longer has additional users. Use is_const_p{16,32}. Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06tcg/s390x: Tighten constraints for 64-bit compareRichard Henderson
Give 64-bit comparison second operand a signed 33-bit immediate. This is the smallest superset of uint32_t and int32_t, as used by CLGFI and CGFI respectively. The rest of the 33-bit space can be loaded into TCG_TMP0. Drop use of the constant pool. Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06tcg/s390x: Implement ctpop operationRichard Henderson
There is an older form that produces per-byte results, and a newer form that produces per-register results. Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06tcg/s390x: Use tgen_movcond_int in tgen_clzRichard Henderson
Reuse code from movcond to conditionally copy a2 to dest, based on the condition codes produced by FLOGR. Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06tcg/s390x: Support SELGR instruction in movcondRichard Henderson
The new select instruction provides two separate register inputs, whereas the old load-on-condition instruction overlaps one of the register inputs with the destination. Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06tcg/s390x: Generalize movcond implementationRichard Henderson
Generalize movcond to support pre-computed conditions, and the same set of arguments at all times. This will be assumed by a following patch, which needs to reuse tgen_movcond_int. Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06tcg/s390x: Create tgen_cmp2 to simplify movcondRichard Henderson
Return both regular and inverted condition codes from tgen_cmp2. This lets us choose after the fact which comparision we want. Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06tcg/s390x: Support MIE3 logical operationsRichard Henderson
This is andc, orc, nand, nor, eqv. We can use nor for implementing not. Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06tcg/s390x: Tighten constraints for and_i64Richard Henderson
Let the register allocator handle such immediates by matching only what one insn can achieve. Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06tcg/s390x: Tighten constraints for or_i64 and xor_i64Richard Henderson
Drop support for sequential OR and XOR, as the serial dependency is slower than loading the constant first. Let the register allocator handle such immediates by matching only what one insn can achieve. Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06tcg/s390x: Issue XILF directly for xor_i32Richard Henderson
There is only one instruction that is applicable to a 32-bit immediate xor. Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06tcg/s390x: Support MIE2 MGRK instructionRichard Henderson
The MIE2 facility adds a 3-operand signed 64x64->128 multiply. Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06tcg/s390x: Support MIE2 multiply single instructionsRichard Henderson
The MIE2 facility adds 3-operand versions of multiply. Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06tcg/s390x: Distinguish RIE formatsRichard Henderson
There are multiple variations, with different fields. Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06tcg/s390x: Distinguish RRF-a and RRF-c formatsRichard Henderson
One has 3 register arguments; the other has 2 plus an m3 field. Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06tcg/s390x: Use LARL+AGHI for odd addressesRichard Henderson
Add one instead of dropping odd addresses to the constant pool. Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06tcg/s390x: Remove DISTINCT_OPERANDS facility checkRichard Henderson
The distinct-operands facility is bundled into facility 45, along with load-on-condition. We are checking this at startup. Remove the a0 == a1 checks for 64-bit sub, and, or, xor, as there is no space savings for avoiding the distinct-operands insn. Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06tcg/s390x: Remove FAST_BCR_SER facility checkRichard Henderson
The fast-bcr-serialization facility is bundled into facility 45, along with load-on-condition. We are checking this at startup. Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06tcg/s390x: Check for load-on-condition facility at startupRichard Henderson
The general-instruction-extension facility was introduced in z196, which itself was end-of-life in 2021. In addition, z196 is the minimum CPU supported by our set of supported operating systems: RHEL 7 (z196), SLES 12 (z196) and Ubuntu 16.04 (zEC12). Check for facility number 45, which will be the consilidated check for several facilities. Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06tcg/s390x: Check for general-instruction-extension facility at startupRichard Henderson
The general-instruction-extension facility was introduced in z10, which itself was end-of-life in 2019. Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06tcg/s390x: Check for extended-immediate facility at startupRichard Henderson
The extended-immediate facility was introduced in z9-109, which itself was end-of-life in 2017. Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06tcg/s390x: Check for long-displacement facility at startupRichard Henderson
We are already assuming the existance of long-displacement, but were not being explicit about it. This has been present since z990. Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>