aboutsummaryrefslogtreecommitdiff
path: root/tcg
AgeCommit message (Collapse)Author
2021-06-29tcg/riscv: Remove MO_BSWAP handlingRichard Henderson
TCG_TARGET_HAS_MEMORY_BSWAP is already unset for this backend, which means that MO_BSWAP be handled by the middle-end and will never be seen by the backend. Thus the indexes used with qemu_{ld,st}_helpers will always be zero. Tidy the comments and asserts in tcg_out_qemu_{ld,st}_direct. It is not that we do not handle bswap "yet", but never will. Acked-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg/aarch64: Unset TCG_TARGET_HAS_MEMORY_BSWAPRichard Henderson
The memory bswap support in the aarch64 backend merely dates from a time when it was required. There is nothing special about the backend support that could not have been provided by the middle-end even prior to the introduction of the bswap flags. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg/arm: Unset TCG_TARGET_HAS_MEMORY_BSWAPRichard Henderson
Now that the middle-end can replicate the same tricks as tcg/arm used for optimizing bswap for signed loads and for stores, do not pretend to have these memory ops in the backend. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg: Make use of bswap flags in tcg_gen_qemu_st_*Richard Henderson
By removing TCG_BSWAP_IZ we indicate that the input is not zero-extended, and thus can remove an explicit extend. By removing TCG_BSWAP_OZ, we allow the implementation to leave high bits set, which will be ignored by the store. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg: Make use of bswap flags in tcg_gen_qemu_ld_*Richard Henderson
We can perform any required sign-extension via TCG_BSWAP_OS. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg: Add flags argument to tcg_gen_bswap16_*, tcg_gen_bswap32_i64Richard Henderson
Implement the new semantics in the fallback expansion. Change all callers to supply the flags that keep the semantics unchanged locally. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg: Handle new bswap flags during optimizeRichard Henderson
Notice when the input is known to be zero-extended and force the TCG_BSWAP_IZ flag on. Honor the TCG_BSWAP_OS bit during constant folding. Propagate the input to the output mask. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg/tci: Support bswap flagsRichard Henderson
The existing interpreter zero-extends, ignoring high bits. Simply add a separate sign-extension opcode if required. Ensure that the interpreter supports ext16s when bswap16 is enabled. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg/mips: Support bswap flags in tcg_out_bswap32Richard Henderson
Merge tcg_out_bswap32 and tcg_out_bswap32s. Use the flags in the internal uses for loads and stores. For mips32r2 bswap32 with zero-extension, standardize on WSBH+ROTR+DEXT. This is the same number of insns as the previous DSBH+DSHD+DSRL but fits in better with the flags check. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg/mips: Support bswap flags in tcg_out_bswap16Richard Henderson
Merge tcg_out_bswap16 and tcg_out_bswap16s. Use the flags in the internal uses for loads and stores. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg/s390: Support bswap flagsRichard Henderson
For INDEX_op_bswap16_i64, use 64-bit instructions so that we can easily provide the extension to 64-bits. Drop the special case, previously used, where the input is already zero-extended -- the minor code size savings is not worth the complication. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg/ppc: Use power10 byte-reverse instructionsRichard Henderson
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg/ppc: Support bswap flagsRichard Henderson
For INDEX_op_bswap32_i32, pass 0 for flags: input not zero-extended, output does not need extension within the host 64-bit register. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg/ppc: Split out tcg_out_bswap64Richard Henderson
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg/ppc: Split out tcg_out_bswap32Richard Henderson
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg/ppc: Split out tcg_out_bswap16Richard Henderson
With the use of a suitable temporary, we can use the same algorithm when src overlaps dst. The result is the same number of instructions either way. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg/ppc: Split out tcg_out_sari{32,64}Richard Henderson
We will shortly require sari in other context; split out both for cleanliness sake. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg/ppc: Split out tcg_out_ext{8,16,32}sRichard Henderson
We will shortly require these in other context; make the expansion as clear as possible. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg/arm: Support bswap flagsRichard Henderson
Combine the three bswap16 routines, and differentiate via the flags. Use the correct flags combination from the load/store routines, and pass along the constant parameter from tcg_out_op. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg/aarch64: Support bswap flagsRichard Henderson
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg/aarch64: Merge tcg_out_rev{16,32,64}Richard Henderson
Pass in the input and output size. We currently use 3 of the 5 possible combinations; the others may be used by new tcg opcodes. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg/i386: Support bswap flagsRichard Henderson
Retain the current rorw bswap16 expansion for the zero-in/zero-out case. Otherwise, perform a wider bswap plus a right-shift or extend. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg: Add flags argument to bswap opcodesRichard Henderson
This will eventually simplify front-end usage, and will allow backends to unset TCG_TARGET_HAS_MEMORY_BSWAP without loss of optimization. The argument is added during expansion, not currently exposed to the front end translators. The backends currently only support a flags value of either TCG_BSWAP_IZ, or (TCG_BSWAP_IZ | TCG_BSWAP_OZ), since they all require zero top bytes and leave them that way. At the existing call sites we pass in (TCG_BSWAP_IZ | TCG_BSWAP_OZ), except for the flags-ignored cases of a 32-bit swap of a 32-bit value and or a 64-bit swap of a 64-bit value, where we pass 0. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg: Add tcg_gen_vec_shl{shr}{sar}8i_i32LIU Zhiwei
Implement tcg_gen_vec_shl{shr}{sar}8i_tl by adding corresponging i32 OP. Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com> Message-Id: <20210624105023.3852-5-zhiwei_liu@c-sky.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg: Add tcg_gen_vec_shl{shr}{sar}16i_i32LIU Zhiwei
Implement tcg_gen_vec_shl{shr}{sar}16i_tl by adding corresponging i32 OP. Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com> Message-Id: <20210624105023.3852-4-zhiwei_liu@c-sky.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg: Add tcg_gen_vec_add{sub}8_i32LIU Zhiwei
Implement tcg_gen_vec_add{sub}8_tl by adding corresponging i32 OP. Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com> Message-Id: <20210624105023.3852-3-zhiwei_liu@c-sky.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-29tcg: Add tcg_gen_vec_add{sub}16_i32LIU Zhiwei
Implement tcg_gen_vec_add{sub}16_tl by adding corresponding i32 OP. Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com> Message-Id: <20210624105023.3852-2-zhiwei_liu@c-sky.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-24Merge remote-tracking branch ↵Peter Maydell
'remotes/pmaydell/tags/pull-target-arm-20210624' into staging target-arm queue: * Don't require 'virt' board to be compiled in for ACPI GHES code * docs: Document which architecture extensions we emulate * Fix bugs in M-profile FPCXT_NS accesses * First slice of MVE patches * Implement MTE3 * docs/system: arm: Add nRF boards description # gpg: Signature made Thu 24 Jun 2021 14:59:16 BST # gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE # gpg: issuer "peter.maydell@linaro.org" # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate] # gpg: aka "Peter Maydell <pmaydell@gmail.com>" [ultimate] # gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate] # Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE * remotes/pmaydell/tags/pull-target-arm-20210624: (57 commits) docs/system: arm: Add nRF boards description target/arm: Implement MTE3 target/arm: Make VMOV scalar <-> gpreg beatwise for MVE target/arm: Implement MVE VADDV target/arm: Implement MVE VHCADD target/arm: Implement MVE VCADD target/arm: Implement MVE VADC, VSBC target/arm: Implement MVE VRHADD target/arm: Implement MVE VQDMULL (vector) target/arm: Implement MVE VQDMLSDH and VQRDMLSDH target/arm: Implement MVE VQDMLADH and VQRDMLADH target/arm: Implement MVE VRSHL target/arm: Implement MVE VSHL insn target/arm: Implement MVE VQRSHL target/arm: Implement MVE VQSHL (vector) target/arm: Implement MVE VQADD, VQSUB (vector) target/arm: Implement MVE VQDMULH, VQRDMULH (vector) target/arm: Implement MVE VQDMULL scalar target/arm: Implement MVE VQDMULH and VQRDMULH (scalar) target/arm: Implement MVE VQADD and VQSUB ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-21tcg: Make gen_dup_i32/i64() public as tcg_gen_dup_i32/i64Peter Maydell
The Arm MVE VDUP implementation would like to be able to emit code to duplicate a byte or halfword value into an i32. We have code to do this already in tcg-op-gvec.c, so all we need to do is make the functions global. For consistency with other functions made available to the frontends: * we rename to tcg_gen_dup_* * we expose both the _i32 and _i64 forms * we provide the #define for a _tl form Suggested-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20210617121628.20116-10-peter.maydell@linaro.org
2021-06-19tcg: Restart when exhausting the stack frameRichard Henderson
Assume that we'll have fewer temps allocated after restarting with a fewer number of instructions. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-19tcg: Allocate sufficient storage in temp_allocate_frameRichard Henderson
This function should have been updated for vector types when they were introduced. Fixes: d2fd745fe8b Resolves: https://gitlab.com/qemu-project/qemu/-/issues/367 Cc: qemu-stable@nongnu.org Tested-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-19tcg/sparc: Fix temp_allocate_frame vs sparc stack biasRichard Henderson
We should not be aligning the offset in temp_allocate_frame, because the odd offset produces an aligned address in the end. Instead, pass the logical offset into tcg_set_frame and add the stack bias last. Cc: qemu-stable@nongnu.org Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-19tcg/tci: Use {set,clear}_helper_retaddrRichard Henderson
Wrap guest memory operations for tci like we do for cpu_ld*_data. We cannot actually use the cpu_ldst.h interface without duplicating the memory trace operations performed within, which will already have been expanded into the tcg opcode stream. Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-19tcg/tci: Remove the qemu_ld/st_type macrosRichard Henderson
These macros are only used in one place. By expanding, we get to apply some common-subexpression elimination and create some local variables. Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-19Revert "tcg/tci: Use exec/cpu_ldst.h interfaces"Richard Henderson
This reverts commit dc09f047eddec8f4a1991c4f5f4a428d7aa3f2c0. For tcg, tracepoints are expanded inline in tcg opcodes. Using a helper which generates a second tracepoint is incorrect. For system mode, the extraction and re-packing of MemOp and mmu_idx lost the alignment information from MemOp. So we were no longer raising alignment exceptions for !TARGET_ALIGNED_ONLY guests. This can be seen in tests/tcg/xtensa/test_load_store.S. For user mode, we must update to the new signature of g2h() so that the revert compiles. We can leave set_helper_retaddr for later. Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-19tcg/tci: Split out tci_qemu_ld, tci_qemu_stRichard Henderson
We can share this code between 32-bit and 64-bit loads and stores. Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-19tcg/tci: Implement add2, sub2Richard Henderson
We already had the 32-bit versions for a 32-bit host; expand this to 64-bit hosts as well. The 64-bit opcodes are new. Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-19tcg/tci: Implement mulu2, muls2Richard Henderson
We already had mulu2_i32 for a 32-bit host; expand this to 64-bit hosts as well. The muls2_i32 and the 64-bit opcodes are new. Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-19tcg/tci: Implement clz, ctz, ctpopRichard Henderson
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-19tcg/tci: Implement extract, sextractRichard Henderson
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-19tcg/tci: Implement andc, orc, eqv, nand, norRichard Henderson
These were already present in tcg-target.c.inc, but not in the interpreter. Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-19tcg/tci: Implement movcondRichard Henderson
When this opcode is not available in the backend, tcg middle-end will expand this as a series of 5 opcodes. So implementing this saves bytecode space. Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-19tcg/tci: Implement goto_ptrRichard Henderson
This operation is critical to staying within the interpretation loop longer, which avoids the overhead of setup and teardown for many TBs. The check in tcg_prologue_init is disabled because TCI does want to use NULL to indicate exit, as opposed to branching to a real epilogue. Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-19tcg/tci: Change encoding to uint32_t unitsRichard Henderson
This removes all of the problems with unaligned accesses to the bytecode stream. With an 8-bit opcode at the bottom, we have 24 bits remaining, which are generally split into 6 4-bit slots. This fits well with the maximum length opcodes, e.g. INDEX_op_add2_i32, which have 6 register operands. We have, in previous patches, rearranged things such that there are no operations with a label which have more than one other operand. Which leaves us with a 20-bit field in which to encode a label, giving us a maximum TB size of 512k -- easily large. Change the INDEX_op_tci_movi_{i32,i64} opcodes to tci_mov[il]. The former puts the immediate in the upper 20 bits of the insn, like we do for the label displacement. The later uses a label to reference an entry in the constant pool. Thus, in the worst case we still have a single memory reference for any constant, but now the constants are out-of-line of the bytecode and can be shared between different moves saving space. Change INDEX_op_call to use a label to reference a pair of pointers in the constant pool. This removes the only slightly dodgy link with the layout of struct TCGHelperInfo. The re-encode cannot be done in pieces. Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-19tcg/tci: Remove tci_write_regRichard Henderson
Inline it into its one caller, tci_write_reg64. Drop the asserts that are redundant with tcg_read_r. Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-19tcg/tci: Emit setcond before brcondRichard Henderson
The encoding planned for tci does not have enough room for brcond2, with 4 registers and a condition as input as well as the label. Resolve the condition into TCG_REG_TMP, and relax brcond to one register plus a label, considering the condition to always be reg != 0. Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-19tcg/tci: Reserve r13 for a temporaryRichard Henderson
We're about to adjust the offset range on host memory ops, and the format of branches. Both will require a temporary. Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-19tcg/tci: Use ffi for callsRichard Henderson
This requires adjusting where arguments are stored. Place them on the stack at left-aligned positions. Adjust the stack frame to be at entirely positive offsets. Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-19tcg/tci: Move call-return regs to end of tcg_target_reg_alloc_orderRichard Henderson
As the only call-clobbered regs for TCI, these should receive the least priority. Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-19tcg/tci: Improve tcg_target_call_clobber_regsRichard Henderson
The current setting is much too pessimistic. Indicating only the one or two registers that are actually assigned after a call should avoid unnecessary movement between the register array and the stack array. Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>