aboutsummaryrefslogtreecommitdiff
path: root/target/ppc/translate
AgeCommit message (Collapse)Author
2021-11-09target/ppc: Implement cnttzdmLuis Pires
Implement the following PowerISA v3.1 instruction: cnttzdm: Count Trailing Zeros Doubleword Under Bit Mask Suggested-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Luis Pires <luis.pires@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211029202424.175401-9-matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09target/ppc: Implement cntlzdmLuis Pires
Implement the following PowerISA v3.1 instruction: cntlzdm: Count Leading Zeros Doubleword Under Bit Mask Suggested-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Luis Pires <luis.pires@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211029202424.175401-8-matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09target/ppc: Implement PLQ and PSTQMatheus Ferst
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211029202424.175401-7-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09target/ppc: Move LQ and STQ to decodetreeMatheus Ferst
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211029202424.175401-6-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09target/ppc: Implement PLFS, PLFD, PSTFS and PSTFD instructionsFernando Eckhardt Valle
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Fernando Eckhardt Valle <fernando.valle@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211029202424.175401-5-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09target/ppc: Move load and store floating point instructions to decodetreeFernando Eckhardt Valle
Move load floating point instructions (lfs, lfsu, lfsx, lfsux, lfd, lfdu, lfdx, lfdux) and store floating point instructions(stfs, stfsu, stfsx, stfsux, stfd, stfdu, stfdx, stfdux) from legacy system to decodetree. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Fernando Eckhardt Valle <fernando.valle@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211029202424.175401-4-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09target/ppc: move resolve_PLS_D to translate.cFernando Eckhardt Valle
Move resolve_PLS_D from fixedpoint-impl.c.inc to translate.c because this way the function can be used not only by fixed point instructions. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Fernando Eckhardt Valle <phervalle@gmail.com> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211029202424.175401-3-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09target/ppc: introduce do_ea_calcFernando Eckhardt Valle
The do_ea_calc function will calculate the effective address(EA) according to PowerIsa 3.1. With that, it was replaced part of do_ldst() that calculates the EA by this new function. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Fernando Eckhardt Valle (pherde) <phervalle@gmail.com> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211029202424.175401-2-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-07-29target/ppc: Ease L=0 requirement on cmp/cmpi/cmpl/cmpli for ppc32Matheus Ferst
In commit 8f0a4b6a9b, we started to require L=0 for ppc32 to match what The Programming Environments Manual say: "For 32-bit implementations, the L field must be cleared, otherwise the instruction form is invalid." The stricter behavior, however, broke AROS boot on sam460ex, which is a regression from 6.0. This patch partially reverts the change, raising the exception only for CPUs known to require L=0 (e500 and e500mc) and logging a guest error for other cases. Both behaviors are acceptable by the PowerISA, which allows "the system illegal instruction error handler to be invoked or yield boundedly undefined results." Reported-by: BALATON Zoltan <balaton@eik.bme.hu> Fixes: 8f0a4b6a9b ("target/ppc: Move cmp/cmpi/cmpl/cmpli to decodetree") Tested-by: BALATON Zoltan <balaton@eik.bme.hu> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20210720135507.2444635-1-matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-06-03target/ppc: Move cmp/cmpi/cmpl/cmpli to decodetreeMatheus Ferst
Additionally, REQUIRE_64BIT when L=1 to match what is specified in The Programming Environments Manual: "For 32-bit implementations, the L field must be cleared, otherwise the instruction form is invalid." Some CPUs are known to deviate from this specification by ignoring the L bit [1]. The stricter behavior, however, can help users that test software with qemu, making it more likely to detect bugs that would otherwise be silent. If deemed necessary, a future patch can adapt this behavior based on the specific CPU model. [1] The 601 manual is the only one I've found that explicitly states that the L bit is ignored, but we also observe this behavior in a 7447A v1.2. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20210601193528.2533031-15-matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> [dwg: Corrected whitespace error] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-06-03target/ppc: Move addpcis to decodetreeMatheus Ferst
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20210601193528.2533031-14-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-06-03target/ppc: Implement vcfuged instructionMatheus Ferst
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20210601193528.2533031-13-matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-06-03target/ppc: Implement cfuged instructionMatheus Ferst
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20210601193528.2533031-12-matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-06-03target/ppc: Implement setbc/setbcr/stnbc/setnbcr instructionsMatheus Ferst
Implements the following PowerISA v3.1 instructions: setbc: Set Boolean Condition setbcr: Set Boolean Condition Reverse setnbc: Set Negative Boolean Condition setnbcr: Set Negative Boolean Condition Reverse Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20210601193528.2533031-11-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-06-03target/ppc: Implement prefixed integer store instructionsRichard Henderson
Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20210601193528.2533031-10-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-06-03target/ppc: Move D/DS/X-form integer stores to decodetreeRichard Henderson
These are all connected by macros in the legacy decoding. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20210601193528.2533031-9-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-06-03target/ppc: Implement prefixed integer load instructionsRichard Henderson
Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20210601193528.2533031-8-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-06-03target/ppc: Move D/DS/X-form integer loads to decodetreeRichard Henderson
These are all connected by macros in the legacy decoding. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20210601193528.2533031-7-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-06-03target/ppc: Implement PNOPRichard Henderson
The illegal suffix behavior matches what was observed in a POWER10 DD2.0 machine. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20210601193528.2533031-6-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-06-03target/ppc: Move ADDI, ADDIS to decodetree, implement PADDIRichard Henderson
Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20210601193528.2533031-5-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-06-03target/ppc: Add infrastructure for prefixed insnsRichard Henderson
Signed-off-by: Luis Pires <luis.pires@eldorado.org.br> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20210601193528.2533031-4-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-05-19target/ppc: Fix load endianness for lxvwsx/lxvdsxGiuseppe Musacchio
TARGET_WORDS_BIGENDIAN may not match the machine endianness if that's a runtime-configurable parameter. Fixes: bcb0b7b1a1c05707304f80ca6f523d557816f85c Fixes: afae37d98ae991c0792c867dbd9f32f988044318 Resolves: https://gitlab.com/qemu-project/qemu/-/issues/212 Signed-off-by: Giuseppe Musacchio <thatlemon@gmail.com> Message-Id: <20210518133020.58927-1-thatlemon@gmail.com> Tested-by: Paul A. Clarke <pc@us.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-12-14ppc/translate: Rewrite gen_lxvdsx to use gvec primitivesGiuseppe Musacchio
Make the implementation match the lxvwsx one. The code is now shorter smaller and potentially faster as the translation will use the host SIMD capabilities if available. No functional change. Signed-off-by: Giuseppe Musacchio <thatlemon@gmail.com> Message-Id: <a463dea379da4cb3a22de49c678932f74fb15dd7.1604912739.git.thatlemon@gmail.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-11-24ppc/translate: Implement lxvwsx opcodeLemonBoy
Implement the "Load VSX Vector Word & Splat Indexed" opcode, introduced in Power ISA v3.0. Buglink: https://bugs.launchpad.net/qemu/+bug/1793608 Signed-off-by: Giuseppe Musacchio <thatlemon@gmail.com> Message-Id: <d7d533e18c2bc10d924ee3e09907ff2b41fddb3a.1604912739.git.thatlemon@gmail.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-24Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-5.2-20200818' ↵Peter Maydell
into staging ppc patch queue 2020-08-18 Here's my first pull request for qemu-5.2, which has quite a few accumulated things. Highlights are: * Preliminary support for POWER10 (Power ISA 3.1) instruction emulation * Add documentation on the (very confusing) pseries NUMA configuration * Fix some bugs handling edge cases with XICS, XIVE and kernel_irqchip * Fix icount for a number of POWER registers * Many cleanups to error handling in XIVE code * Validate size of -prom-env data # gpg: Signature made Tue 18 Aug 2020 05:18:36 BST # gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full] # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full] # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full] # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown] # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-5.2-20200818: (40 commits) spapr/xive: Use xive_source_esb_len() nvram: Exit QEMU if NVRAM cannot contain all -prom-env data spapr/xive: Simplify error handling of kvmppc_xive_cpu_synchronize_state() ppc/xive: Simplify error handling in xive_tctx_realize() spapr/xive: Simplify error handling in kvmppc_xive_connect() ppc/xive: Fix error handling in vmstate_xive_tctx_*() callbacks spapr/xive: Fix error handling in kvmppc_xive_post_load() spapr/kvm: Fix error handling in kvmppc_xive_pre_save() spapr/xive: Rework error handling of kvmppc_xive_set_source_config() spapr/xive: Rework error handling in kvmppc_xive_get_queues() spapr/xive: Rework error handling of kvmppc_xive_[gs]et_queue_config() spapr/xive: Rework error handling of kvmppc_xive_cpu_[gs]et_state() spapr/xive: Rework error handling of kvmppc_xive_mmap() spapr/xive: Rework error handling of kvmppc_xive_source_reset() spapr/xive: Rework error handling of kvmppc_xive_cpu_connect() spapr: Simplify error handling in spapr_phb_realize() spapr/xive: Convert KVM device fd checks to assert() ppc/xive: Introduce dedicated kvm_irqchip_in_kernel() wrappers ppc/xive: Rework setup of XiveSource::esb_mmio target/ppc: Integrate icount to purr, vtb, and tbu40 ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-08-21meson: rename included C source files to .c.incPaolo Bonzini
With Makefiles that have automatically generated dependencies, you generated includes are set as dependencies of the Makefile, so that they are built before everything else and they are available when first building the .c files. Alternatively you can use a fine-grained dependency, e.g. target/arm/translate.o: target/arm/decode-neon-shared.inc.c With Meson you have only one choice and it is a third option, namely "build at the beginning of the corresponding target"; the way you express it is to list the includes in the sources of that target. The problem is that Meson decides if something is a source vs. a generated include by looking at the extension: '.c', '.cc', '.m', '.C' are sources, while everything else is considered an include---including '.inc.c'. Use '.c.inc' to avoid this, as it is consistent with our other convention of using '.rst.inc' for included reStructuredText files. The editorconfig file is adjusted. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-08-12target/ppc: Fix SPE unavailable exception triggeringMatthieu Bucchianeri
When emulating certain floating point instructions or vector instructions on PowerPC machines, QEMU did not properly generate the SPE/Embedded Floating- Point Unavailable interrupt. See the buglink further below for references to the relevant NXP documentation. This patch fixes the behavior of some evfs* instructions that were incorrectly emitting the interrupt. More importantly, this patch fixes the behavior of several efd* and ev* instructions that were not generating the interrupt. Triggering the interrupt for these instructions fixes lazy FPU/vector context switching on some operating systems like Linux. Without this patch, the result of some double-precision arithmetic could be corrupted due to the lack of proper saving and restoring of the upper 32-bit part of the general-purpose registers. Buglink: https://bugs.launchpad.net/qemu/+bug/1888918 Buglink: https://bugs.launchpad.net/qemu/+bug/1611394 Signed-off-by: Matthieu Bucchianeri <matthieu.bucchianeri@leostella.com> Message-Id: <20200727175553.32276-1-matthieu.bucchianeri@leostella.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-12target/ppc: add vmulh{su}d instructionsLijun Pan
vmulhsd: Vector Multiply High Signed Doubleword vmulhud: Vector Multiply High Unsigned Doubleword Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Message-Id: <20200724045845.89976-5-ljp@linux.ibm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-12target/ppc: add vmulh{su}w instructionsLijun Pan
vmulhsw: Vector Multiply High Signed Word vmulhuw: Vector Multiply High Unsigned Word Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Message-Id: <20200724045845.89976-4-ljp@linux.ibm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-12target/ppc: add vmulld instructionLijun Pan
vmulld: Vector Multiply Low Doubleword. Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Message-Id: <20200701234344.91843-6-ljp@linux.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-12target/ppc: convert vmuluwm to tcg_gen_gvec_mulLijun Pan
Convert the original implementation of vmuluwm to the more generic tcg_gen_gvec_mul. Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Message-Id: <20200701234344.91843-5-ljp@linux.ibm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-12target/ppc: Fix TCG leak with the evmwsmiaa instructionMatthieu Bucchianeri
Fix double-call to tcg_temp_new_i64(), where a temp is allocated both at declaration time and further down the implementation of gen_evmwsmiaa(). Note that gen_evmwsmia() and gen_evmwsmiaa() are still not implemented correctly, as they invoke gen_evmwsmi() which may return early, but the return is not propagated. This will be fixed in my patch for bug #1888918. Signed-off-by: Matthieu Bucchianeri <matthieu.bucchianeri@leostella.com> Message-Id: <20200727172114.31415-1-matthieu.bucchianeri@leostella.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-06-02target/ppc: Use tcg_gen_gvec_rotlvRichard Henderson
Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2020-05-06target/ppc: Use tcg_gen_gvec_dup_immRichard Henderson
We can now unify the implementation of the 3 VSPLTI instructions. Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2020-02-21target/ppc: Fix typo in commentsBALATON Zoltan
"Deferred" was misspelled as "differed" in some comments, correct this typo, Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu> Message-Id: <20200214155748.0896B745953@zero.eik.bme.hu> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-10-24target/ppc: Fix for optimized vsl/vsr instructionsStefan Brankovic
In previous implementation, invocation of TCG shift function could request shift of TCG variable by 64 bits when variable 'sh' is 0, which is not supported in TCG (values can be shifted by 0 to 63 bits). This patch fixes this by using two separate invocation of TCG shift functions, with maximum shift amount of 32. Name of variable 'shifted' is changed to 'carry' so variable naming is similar to old helper implementation. Variables 'avrA' and 'avrB' are replaced with variable 'avr'. Fixes: 4e6d0920e7547e6af4bbac5ffe9adfe6ea621822 Reported-by: "Paul A. Clark" <pc@us.ibm.com> Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Suggested-by: Aleksandar Markovic <aleksandar.markovic@rt-rk.com> Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Message-Id: <1570196639-7025-2-git-send-email-stefan.brankovic@rt-rk.com> Tested-by: Paul A. Clarke <pc@us.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-10-04ppc: Add support for 'mffsce' instructionPaul A. Clarke
ISA 3.0B added a set of Floating-Point Status and Control Register (FPSCR) instructions: mffsce, mffscdrn, mffscdrni, mffscrn, mffscrni, mffsl. This patch adds support for 'mffsce' instruction. 'mffsce' is identical to 'mffs', except that it also clears the exception enable bits in the FPSCR. On CPUs without support for 'mffsce' (below ISA 3.0), the instruction will execute identically to 'mffs'. Signed-off-by: Paul A. Clarke <pc@us.ibm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1568817082-1384-1-git-send-email-pc@us.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-10-04ppc: Add support for 'mffscrn','mffscrni' instructionsPaul A. Clarke
ISA 3.0B added a set of Floating-Point Status and Control Register (FPSCR) instructions: mffsce, mffscdrn, mffscdrni, mffscrn, mffscrni, mffsl. This patch adds support for 'mffscrn' and 'mffscrni' instructions. 'mffscrn' and 'mffscrni' are similar to 'mffsl', except they do not return the status bits (FI, FR, FPRF) and they also set the rounding mode in the FPSCR. On CPUs without support for 'mffscrn'/'mffscrni' (below ISA 3.0), the instructions will execute identically to 'mffs'. Signed-off-by: Paul A. Clarke <pc@us.ibm.com> Message-Id: <1568817081-1345-1-git-send-email-pc@us.ibm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-29target/ppc: Refactor emulation of vmrgew and vmrgow instructionsStefan Brankovic
Since I found this two instructions implemented with tcg, I refactored them so they are consistent with other similar implementations that I introduced in this patch. Also, a new dual macro GEN_VXFORM_TRANS_DUAL is added. This macro is used if one instruction is realized with direct translation, and second one with a helper. Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Message-Id: <1566898663-25858-4-git-send-email-stefan.brankovic@rt-rk.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-29ppc: Fix xsmaddmdp and friendsPaul A. Clarke
A class of instructions of the form: op Target,A,B which operate like: Target = Target * A + B have a bit set which distinguishes them from instructions that operate as: Target = Target * B + A This bit is not being checked properly (using PPC_BIT macro), so all instructions in this class are operating incorrectly as the second form above. The bit was being checked as if it were part of a 64-bit instruction opcode, rather than a proper 32-bit opcode. Fix by using the macro (PPC_BIT32) which treats the opcode as a 32-bit quantity. Fixes: c9f4e4d8b632 ("target/ppc: improve VSX_FMADD with new GEN_VSX_HELPER_VSX_MADD macro") Signed-off-by: Paul A. Clarke <pc@us.ibm.com> Message-Id: <1566401321-22419-1-git-send-email-pc@us.ibm.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Tested-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21ppc: Add support for 'mffsl' instructionPaul A. Clarke
ISA 3.0B added a set of Floating-Point Status and Control Register (FPSCR) instructions: mffsce, mffscdrn, mffscdrni, mffscrn, mffscrni, mffsl. This patch adds support for 'mffsl'. 'mffsl' is identical to 'mffs', except it only returns mode, status, and enable bits from the FPSCR. On CPUs without support for 'mffsl' (below ISA 3.0), the 'mffsl' instruction will execute identically to 'mffs'. Note: I renamed FPSCR_RN to FPSCR_RN0 so I could create an FPSCR_RN mask which is both bits of the FPSCR rounding mode, as defined in the ISA. I also fixed a typo in the definition of FPSCR_FR. Signed-off-by: Paul A. Clarke <pc@us.ibm.com> v4: - nit: added some braces to resolve a checkpatch complaint. v3: - Changed tcg_gen_and_i64 to tcg_gen_andi_i64, eliminating the need for a temporary, per review from Richard Henderson. v2: - I found that I copied too much of the 'mffs' implementation. The 'Rc' condition code bits are not needed for 'mffsl'. Removed. - I now free the (renamed) 'tmask' temporary. - I now bail early for older ISA to the original 'mffs' implementation. Message-Id: <1565982203-11048-1-git-send-email-pc@us.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21target/ppc: Optimize emulation of vclzw instructionStefan Brankovic
Optimize Altivec instruction vclzw (Vector Count Leading Zeros Word). This instruction counts the number of leading zeros of each word element in source register and places result in the appropriate word element of destination register. Counting is to be performed in four iterations of for loop(one for each word elemnt of source register vB). Every iteration consists of loading appropriate word element from source register, counting leading zeros with tcg_gen_clzi_i32, and saving the result in appropriate word element of destination register. Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1563200574-11098-7-git-send-email-stefan.brankovic@rt-rk.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21target/ppc: Optimize emulation of vclzd instructionStefan Brankovic
Optimize Altivec instruction vclzd (Vector Count Leading Zeros Doubleword). This instruction counts the number of leading zeros of each doubleword element in source register and places result in the appropriate doubleword element of destination register. Using tcg-s count leading zeros instruction two times(once for each doubleword element of source register vB) and placing result in appropriate doubleword element of destination register vD. Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1563200574-11098-6-git-send-email-stefan.brankovic@rt-rk.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21target/ppc: Optimize emulation of vgbbd instructionStefan Brankovic
Optimize altivec instruction vgbbd (Vector Gather Bits by Bytes by Doubleword) All ith bits (i in range 1 to 8) of each byte of doubleword element in source register are concatenated and placed into ith byte of appropriate doubleword element in destination register. Following solution is done for both doubleword elements of source register in parallel, in order to reduce the number of instructions needed(that's why arrays are used): First, both doubleword elements of source register vB are placed in appropriate element of array avr. Bits are gathered in 2x8 iterations(2 for loops). In first iteration bit 1 of byte 1, bit 2 of byte 2,... bit 8 of byte 8 are in their final spots so avr[i], i={0,1} can be and-ed with tcg_mask. For every following iteration, both avr[i] and tcg_mask variables have to be shifted right for 7 and 8 places, respectively, in order to get bit 1 of byte 2, bit 2 of byte 3.. bit 7 of byte 8 in their final spots so shifted avr values(saved in tmp) can be and-ed with new value of tcg_mask... After first 8 iteration(first loop), all the first bits are in their final places, all second bits but second bit from eight byte are in their places... only 1 eight bit from eight byte is in it's place). In second loop we do all operations symmetrically, in order to get other half of bits in their final spots. Results for first and second doubleword elements are saved in result[0] and result[1] respectively. In the end those results are saved in appropriate doubleword element of destination register vD. Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1563200574-11098-5-git-send-email-stefan.brankovic@rt-rk.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21target/ppc: Optimize emulation of vsl and vsr instructionsStefan Brankovic
Optimization of altivec instructions vsl and vsr(Vector Shift Left/Rigt). Perform shift operation (left and right respectively) on 128 bit value of register vA by value specified in bits 125-127 of register vB. Lowest 3 bits in each byte element of register vB must be identical or result is undefined. For vsl instruction, the first step is bits 125-127 of register vB have to be saved in variable sh. Then, the highest sh bits of the lower doubleword element of register vA are saved in variable shifted, in order not to lose those bits when shift operation is performed on the lower doubleword element of register vA, which is the next step. After shifting the lower doubleword element shift operation is performed on higher doubleword element of vA, with replacement of the lowest sh bits(that are now 0) with bits saved in shifted. For vsr instruction, firstly, the bits 125-127 of register vB have to be saved in variable sh. Then, the lowest sh bits of the higher doubleword element of register vA are saved in variable shifted, in odred not to lose those bits when the shift operation is performed on the higher doubleword element of register vA, which is the next step. After shifting higher doubleword element, shift operation is performed on lower doubleword element of vA, with replacement of highest sh bits(that are now 0) with bits saved in shifted. Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1563200574-11098-3-git-send-email-stefan.brankovic@rt-rk.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21target/ppc: Optimize emulation of lvsl and lvsr instructionsStefan Brankovic
Adding simple macro that is calling tcg implementation of appropriate instruction if altivec support is active. Optimization of altivec instruction lvsl (Load Vector for Shift Left). Place bytes sh:sh+15 of value 0x00 || 0x01 || 0x02 || ... || 0x1E || 0x1F in destination register. Sh is calculated by adding 2 source registers and getting bits 60-63 of result. First, the bits [28-31] are placed from EA to variable sh. After that, the bytes are created in the following way: sh:(sh+7) of X(from description) by multiplying sh with 0x0101010101010101 followed by addition of the result with 0x0001020304050607. Value obtained is placed in higher doubleword element of vD. (sh+8):(sh+15) by adding the result of previous multiplication with 0x08090a0b0c0d0e0f. Value obtained is placed in lower doubleword element of vD. Optimization of altivec instruction lvsr (Load Vector for Shift Right). Place bytes 16-sh:31-sh of value 0x00 || 0x01 || 0x02 || ... || 0x1E || 0x1F in destination register. Sh is calculated by adding 2 source registers and getting bits 60-63 of result. First, the bits [28-31] are placed from EA to variable sh. After that, the bytes are created in the following way: sh:(sh+7) of X(from description) by multiplying sh with 0x0101010101010101 followed by substraction of the result from 0x1011121314151617. Value obtained is placed in higher doubleword element of vD. (sh+8):(sh+15) by substracting the result of previous multiplication from 0x18191a1b1c1d1e1f. Value obtained is placed in lower doubleword element of vD. Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1563200574-11098-2-git-send-email-stefan.brankovic@rt-rk.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02target/ppc: improve VSX_FMADD with new GEN_VSX_HELPER_VSX_MADD macroMark Cave-Ayland
Introduce a new GEN_VSX_HELPER_VSX_MADD macro for the generator function which enables the source and destination registers to be decoded at translation time. This enables the determination of a or m form to be made at translation time so that a single helper function can now be used for both variants. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-16-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02target/ppc: decode target register in VSX_EXTRACT_INSERT at translation timeMark Cave-Ayland
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-15-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02target/ppc: decode target register in VSX_VECTOR_LOAD_STORE_LENGTH at ↵Mark Cave-Ayland
translation time Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-14-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02target/ppc: introduce GEN_VSX_HELPER_R2_AB macro to fpu_helper.cMark Cave-Ayland
Rather than perform the VSR register decoding within the helper itself, introduce a new GEN_VSX_HELPER_R2_AB macro which performs the decode based upon rA and rB at translation time. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-13-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>