aboutsummaryrefslogtreecommitdiff
path: root/target/arm/translate-vfp.inc.c
AgeCommit message (Collapse)Author
2019-12-16target/arm: Handle trapping to EL2 of AArch32 VMRS instructionsMarc Zyngier
HCR_EL2.TID3 requires that AArch32 reads of MVFR[012] are trapped to EL2, and HCR_EL2.TID0 does the same for reads of FPSID. In order to handle this, introduce a new TCG helper function that checks for these control bits before executing the VMRC instruction. Tested with a hacked-up version of KVM/arm64 that sets the control bits for 32bit guests. Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20191201122018.25808-4-maz@kernel.org [PMM: move helper declaration to helper.h; make it TCG_CALL_NO_WG] Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-11-01target/arm: Allow reading flags from FPSCR for M-profileChristophe Lyon
rt==15 is a special case when reading the flags: it means the destination is APSR. This patch avoids rejecting vmrs apsr_nzcv, fpscr as illegal instruction. Cc: qemu-stable@nongnu.org Signed-off-by: Christophe Lyon <christophe.lyon@linaro.org> Message-id: 20191025095711.10853-1-christophe.lyon@linaro.org [PMM: updated the comment] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-09-03target/arm: Free TCG temps in trans_VMOV_64_sp()Peter Maydell
The function neon_store_reg32() doesn't free the TCG temp that it is passed, so the caller must do that. We got this right in most places but forgot to free the TCG temps in trans_VMOV_64_sp(). Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190827121931.26836-1-peter.maydell@linaro.org
2019-09-03target/arm: Factor out unallocated_encoding for aarch32Richard Henderson
Make this a static function private to translate.c. Thus we can use the same idiom between aarch64 and aarch32 without actually sharing function implementations. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com> Message-id: 20190826151536.6771-3-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-09-03Revert "target/arm: Use unallocated_encoding for aarch32"Richard Henderson
This reverts commit 3cb36637157088892e9e33ddb1034bffd1251d3b. Despite the fact that the text for the call to gen_exception_insn is identical for aarch64 and aarch32, the implementation inside gen_exception_insn is totally different. This fixes exceptions raised from aarch64. Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com> Message-id: 20190826151536.6771-2-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-08-16target/arm: Use unallocated_encoding for aarch32Richard Henderson
Promote this function from aarch64 to fully general use. Use it to unify the code sequences for generating illegal opcode exceptions. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190807045335.1361-11-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-08-16target/arm: Replace offset with pc in gen_exception_insnRichard Henderson
The offset is variable depending on the instruction set, whereas we have stored values for the current pc and the next pc. Passing in the actual value is clearer in intent. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190807045335.1361-8-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-08-16target/arm: Introduce add_reg_for_litRichard Henderson
Provide a common routine for the places that require ALIGN(PC, 4) as the base address as opposed to plain PC. The two are always the same for A32, but the difference is meaningful for thumb mode. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190807045335.1361-5-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-07-04target/arm: Correct VMOV_imm_dp handling of short vectorsPeter Maydell
Coverity points out (CID 1402195) that the loop in trans_VMOV_imm_dp() that iterates over the destination registers in a short-vector VMOV accidentally throws away the returned updated register number from vfp_advance_dreg(). Add the missing assignment. (We got this correct in trans_VMOV_imm_sp().) Fixes: 18cf951af9a27ae573a Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20190702105115.9465-1-peter.maydell@linaro.org
2019-06-18target/arm: Check for dp support for dp VFM, not spPeter Maydell
In commit 1120827fa182f0e7622 we accidentally put the "UNDEF unless FPU has double-precision support" check in the single-precision VFM function. Put it in the dp function where it belongs. Fixes: 1120827fa182f0e7622 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20190617160130.3207-1-peter.maydell@linaro.org
2019-06-17target/arm: Only implement doubles if the FPU supports themPeter Maydell
The architecture permits FPUs which have only single-precision support, not double-precision; Cortex-M4 and Cortex-M33 are both like that. Add the necessary checks on the MVFR0 FPDP field so that we UNDEF any double-precision instructions on CPUs like this. Note that even if FPDP==0 the insns like VMOV-to/from-gpreg, VLDM/VSTM, VLDR/VSTR which take double precision registers still exist. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20190614104457.24703-3-peter.maydell@linaro.org
2019-06-17target/arm: Fix typos in trans function prototypesPeter Maydell
In several places cut and paste errors meant we were using the wrong type for the 'arg' struct in trans_ functions called by the decodetree decoder, because we were using the _sp version of the struct in the _dp function. These were harmless, because the two structs were identical and so decodetree made them typedefs of the same underlying structure (and we'd have had a compile error if they were not harmless), but we should clean them up anyway. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190614104457.24703-2-peter.maydell@linaro.org
2019-06-17target/arm: Use vfp_expand_imm() for AArch32 VFP VMOV_immPeter Maydell
The AArch32 VMOV (immediate) instruction uses the same VFP encoded immediate format we already handle in vfp_expand_imm(). Use that function rather than hand-decoding it. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190613163917.28589-3-peter.maydell@linaro.org
2019-06-17target/arm: Move vfp_expand_imm() to translate.[ch]Peter Maydell
We want to use vfp_expand_imm() in the AArch32 VFP decode; move it from the a64-only header/source file to the AArch32 one (which is always compiled even for AArch64). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190613163917.28589-2-peter.maydell@linaro.org
2019-06-13target/arm: Fix short-vector increment behaviourPeter Maydell
For VFP short vectors, the VFP registers are divided into a series of banks: for single-precision these are s0-s7, s8-s15, s16-s23 and s24-s31; for double-precision they are d0-d3, d4-d7, ... d28-d31. Some banks are "scalar" meaning that use of a register within them triggers a pure-scalar or mixed vector-scalar operation rather than a full vector operation. The scalar banks are s0-s7, d0-d3 and d16-d19. When using a bank as part of a vector operation, we iterate through it, increasing the register number by the specified stride each time, and wrapping around to the beginning of the bank. Unfortunately our calculation of the "increment" part of this was incorrect: vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask) will only do the intended thing if bank_mask has exactly one set high bit. For instance for doubles (bank_mask = 0xc), if we start with vd = 6 and delta_d = 2 then vd is updated to 12 rather than the intended 4. This only causes problems in the unlikely case that the starting register is not the first in its bank: if the register number doesn't have to wrap around then the expression happens to give the right answer. Fix this bug by abstracting out the "check whether register is in a scalar bank" and "advance register within bank" operations to utility functions which use the right bit masking operations. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert float-to-integer VCVT insns to decodetreePeter Maydell
Convert the float-to-integer VCVT instructions to decodetree. Since these are the last unconverted instructions, we can delete the old decoder structure entirely now. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert VCVT fp/fixed-point conversion insns to decodetreePeter Maydell
Convert the VCVT (between floating-point and fixed-point) instructions to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert VJCVT to decodetreePeter Maydell
Convert the VJCVT instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert integer-to-float insns to decodetreePeter Maydell
Convert the VCVT integer-to-float instructions to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert double-single precision conversion insns to decodetreePeter Maydell
Convert the VCVT double/single precision conversion insns to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert VFP round insns to decodetreePeter Maydell
Convert the VFP round-to-integer instructions VRINTR, VRINTZ and VRINTX to decodetree. These instructions were only introduced as part of the "VFP misc" additions in v8A, so we check this. The old decoder's implementation was incorrectly providing them even for v7A CPUs. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert the VCVT-to-f16 insns to decodetreePeter Maydell
Convert the VCVTT and VCVTB instructions which convert from f32 and f64 to f16 to decodetree. Since we're no longer constrained to the old decoder's style using cpu_F0s and cpu_F0d we can perform a direct 16 bit store of the right half of the input single-precision register rather than doing a load/modify/store sequence on the full 32 bits. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert the VCVT-from-f16 insns to decodetreePeter Maydell
Convert the VCVTT, VCVTB instructions that deal with conversion from half-precision floats to f32 or 64 to decodetree. Since we're no longer constrained to the old decoder's style using cpu_F0s and cpu_F0d we can perform a direct 16 bit load of the right half of the input single-precision register rather than loading the full 32 bits and then doing a separate shift or sign-extension. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert VFP comparison insns to decodetreePeter Maydell
Convert the VFP comparison instructions to decodetree. Note that comparison instructions should not honour the VFP short-vector length and stride information: they are scalar-only operations. This applies to all the 2-operand instructions except for VMOV, VABS, VNEG and VSQRT. (In the old decoder this is implemented via the "if (op == 15 && rn > 3) { veclen = 0; }" check.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert VMOV (register) to decodetreePeter Maydell
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert VSQRT to decodetreePeter Maydell
Convert the VSQRT instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert VNEG to decodetreePeter Maydell
Convert the VNEG instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert VABS to decodetreePeter Maydell
Convert the VFP VABS instruction to decodetree. Unlike the 3-op versions, we don't pass fpst to the VFPGen2OpSPFn or VFPGen2OpDPFn because none of the operations which use this format and support short vectors will need it. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert VMOV (imm) to decodetreePeter Maydell
Convert the VFP VMOV (immediate) instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert VFP fused multiply-add insns to decodetreePeter Maydell
Convert the VFP fused multiply-add instructions (VFNMA, VFNMS, VFMA, VFMS) to decodetree. Note that in the old decode structure we were implementing these to honour the VFP vector stride/length. These instructions were introduced in VFPv4, and in the v7A architecture they are UNPREDICTABLE if the vector stride or length are non-zero. In v8A they must UNDEF if stride or length are non-zero, like all VFP instructions; we choose to UNDEF always. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert VDIV to decodetreePeter Maydell
Convert the VDIV instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert VSUB to decodetreePeter Maydell
Convert the VSUB instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert VADD to decodetreePeter Maydell
Convert the VADD instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert VNMUL to decodetreePeter Maydell
Convert the VNMUL instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert VMUL to decodetreePeter Maydell
Convert the VMUL instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert VFP VNMLA to decodetreePeter Maydell
Convert the VFP VNMLA instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert VFP VNMLS to decodetreePeter Maydell
Convert the VFP VNMLS instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert VFP VMLS to decodetreePeter Maydell
Convert the VFP VMLS instruction to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert VFP VMLA to decodetreePeter Maydell
Convert the VFP VMLA instruction to decodetree. This is the first of the VFP 3-operand data processing instructions, so we include in this patch the code which loops over the elements for an old-style VFP vector operation. The existing code to do this looping uses the deprecated cpu_F0s/F0d/F1s/F1d TCG globals; since we are going to be converting instructions one at a time anyway we can take the opportunity to make the new loop use TCG temporaries, which means we can do that conversion one operation at a time rather than needing to do it all in one go. We include an UNDEF check which was missing in the old code: short-vector operations (with stride or length non-zero) were deprecated in v7A and must UNDEF in v8A, so if the MVFR0 FPShVec field does not indicate that support for short vectors is present we UNDEF the operations that would use them. (This is a change of behaviour for Cortex-A7, Cortex-A15 and the v8 CPUs, which previously were all incorrectly allowing short-vector operations.) Note that the conversion fixes a bug in the old code for the case of VFP short-vector "mixed scalar/vector operations". These happen where the destination register is in a vector bank but but the second operand is in a scalar bank. For example vmla.f64 d10, d1, d16 with length 2 stride 2 is equivalent to the pair of scalar operations vmla.f64 d10, d1, d16 vmla.f64 d8, d3, d16 where the destination and first input register cycle through their vector but the second input is scalar (d16). In the old decoder the gen_vfp_F1_mul() operation uses cpu_F1{s,d} as a temporary output for the multiply, which trashes the second input operand. For the fully-scalar case (where we never do a second iteration) and the fully-vector case (where the loop loads the new second input operand) this doesn't matter, but for the mixed scalar/vector case we will end up using the wrong value for later loop iterations. In the new code we use TCG temporaries and so avoid the bug. This bug is present for all the multiply-accumulate insns that operate on short vectors: VMLA, VMLS, VNMLA, VNMLS. Note 2: the expression used to calculate the next register number in the vector bank is not in fact correct; we leave this behaviour unchanged from the old decoder and will fix this bug later in the series. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Remove VLDR/VSTR/VLDM/VSTM use of cpu_F0s and cpu_F0dPeter Maydell
Expand out the sequences in the new decoder VLDR/VSTR/VLDM/VSTM trans functions which perform the memory accesses by going via the TCG globals cpu_F0s and cpu_F0d, to use local TCG temps instead. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert the VFP load/store multiple insns to decodetreePeter Maydell
Convert the VFP load/store multiple insns to decodetree. This includes tightening up the UNDEF checking for pre-VFPv3 CPUs which only have D0-D15 : they now UNDEF for any access to D16-D31, not merely when the smallest register in the transfer list is in D16-D31. This conversion does not try to share code between the single precision and the double precision versions; this looks a bit duplicative of code, but it leaves the door open for a future refactoring which gets rid of the use of the "F0" registers by inlining the various functions like gen_vfp_ld() and gen_mov_F0_reg() which are hiding "if (dp) { ... } else { ... }" conditionalisation. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert VFP VLDR and VSTR to decodetreePeter Maydell
Convert the VFP single load/store insns VLDR and VSTR to decodetree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert VFP two-register transfer insns to decodetreePeter Maydell
Convert the VFP two-register transfer instructions to decodetree (in the v8 Arm ARM these are the "Advanced SIMD and floating-point 64-bit move" encoding group). Again, we expand out the sequences involving gen_vfp_msr() and gen_msr_vfp(). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert "single-precision" register moves to decodetreePeter Maydell
Convert the "single-precision" register moves to decodetree: * VMSR * VMRS * VMOV between general purpose register and single precision Note that the VMSR/VMRS conversions make our handling of the "should this UNDEF?" checks consistent between the two instructions: * VMSR to MVFR0, MVFR1, MVFR2 now UNDEF from EL0 (previously was a nop) * VMSR to FPSID now UNDEFs from EL0 or if VFPv3 or better (previously was a nop) * VMSR to FPINST and FPINST2 now UNDEF if VFPv3 or better (previously would write to the register, which had no guest-visible effect because we always UNDEF reads) We also tighten up the decode: we were previously underdecoding some SBZ or SBO bits. The conversion of VMOV_single includes the expansion out of the gen_mov_F0_vreg()/gen_vfp_mrs() and gen_mov_vreg_F0()/gen_vfp_msr() sequences into the simpler direct load/store of the TCG temp via neon_{load,store}_reg32(): we know in the new function that we're always single-precision, we don't need to use the old-and-deprecated cpu_F0* TCG globals, and we don't happen to have the declaration of gen_vfp_msr() and gen_vfp_mrs() at the point in the file where the new function is. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert "double-precision" register moves to decodetreePeter Maydell
Convert the "double-precision" register moves to decodetree: this covers VMOV scalar-to-gpreg, VMOV gpreg-to-scalar and VDUP. Note that the conversion process has tightened up a few of the UNDEF encoding checks: we now correctly forbid: * VMOV-to-gpr with U:opc1:opc2 == 10x00 or x0x10 * VMOV-from-gpr with opc1:opc2 == 0x10 * VDUP with B:E == 11 * VDUP with Q == 1 and Vn<0> == 1 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> --- The accesses of elements < 32 bits could be improved by doing direct ld/st of the right size rather than 32-bit read-and-shift or read-modify-write, but we leave this for later cleanup, since this series is generally trying to stick to fixing the decode. Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Add helpers for VFP register loads and storesPeter Maydell
The current VFP code has two different idioms for loading and storing from the VFP register file: 1 using the gen_mov_F0_vreg() and similar functions, which load and store to a fixed set of TCG globals cpu_F0s, CPU_F0d, etc 2 by direct calls to tcg_gen_ld_f64() and friends We want to phase out idiom 1 (because the use of the fixed globals is a relic of a much older version of TCG), but idiom 2 is quite longwinded: tcg_gen_ld_f64(tmp, cpu_env, vfp_reg_offset(true, reg)) requires us to specify the 64-bitness twice, once in the function name and once by passing 'true' to vfp_reg_offset(). There's no guard against accidentally passing the wrong flag. Instead, let's move to a convention of accessing 64-bit registers via the existing neon_load_reg64() and neon_store_reg64(), and provide new neon_load_reg32() and neon_store_reg32() for the 32-bit equivalents. Implement the new functions and use them in the code in translate-vfp.inc.c. We will convert the rest of the VFP code as we do the decodetree conversion in subsequent commits. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Move the VFP trans_* functions to translate-vfp.inc.cPeter Maydell
Move the trans_*() functions we've just created from translate.c to translate-vfp.inc.c. This is pure code motion with no textual changes (this can be checked with 'git show --color-moved'). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Convert the VSEL instructions to decodetreePeter Maydell
Convert the VSEL instructions to decodetree. We leave trans_VSEL() in translate.c for now as this allows the patch to show just the changes from the old handle_vsel(). In the old code the check for "do D16-D31 exist" was hidden in the VFP_DREG macro, and assumed that VFPv3 always implied that D16-D31 exist. In the new code we do the correct ID register test. This gives identical behaviour for most of our CPUs, and fixes previously incorrect handling for Cortex-R5F, Cortex-M4 and Cortex-M33, which all implement VFPv3 or better with only 16 double-precision registers. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Factor out VFP access checking codePeter Maydell
Factor out the VFP access checking code so that we can use it in the leaf functions of the decodetree decoder. We call the function full_vfp_access_check() so we can keep the more natural vfp_access_check() for a version which doesn't have the 'ignore_vfp_enabled' flag -- that way almost all VFP insns will be able to use vfp_access_check(s) and only the special-register access function will have to use full_vfp_access_check(s, ignore_vfp_enabled). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-13target/arm: Add stubs for AArch32 VFP decodetreePeter Maydell
Add the infrastructure for building and invoking a decodetree decoder for the AArch32 VFP encodings. At the moment the new decoder covers nothing, so we always fall back to the existing hand-written decode. We need to have one decoder for the unconditional insns and one for the conditional insns, as otherwise the patterns for conditional insns would incorrectly match against the unconditional ones too. Since translate.c is over 14,000 lines long and we're going to be touching pretty much every line of the VFP code as part of the decodetree conversion, we create a new translate-vfp.inc.c to hold the code which deals with VFP in the new scheme. It should be possible to convert this into a standalone translation unit eventually, but the conversion process will be much simpler if we simply #include it midway through translate.c to start with. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>