aboutsummaryrefslogtreecommitdiff
path: root/target/ppc/translate/vsx-impl.c.inc
AgeCommit message (Collapse)Author
2022-10-28target/ppc: Use gvec to decode XVTSTDC[DS]PLucas Mateus Castro (alqotel)
Used gvec to translate XVTSTDCSP and XVTSTDCDP. xvtstdcsp: rept loop imm master version prev version current version 25 4000 0 0,206200 0,040730 (-80.2%) 0,040740 (-80.2%) 25 4000 1 0,205120 0,053650 (-73.8%) 0,053510 (-73.9%) 25 4000 3 0,206160 0,058630 (-71.6%) 0,058570 (-71.6%) 25 4000 51 0,217110 0,191490 (-11.8%) 0,192320 (-11.4%) 25 4000 127 0,206160 0,191490 (-7.1%) 0,192640 (-6.6%) 8000 12 0 1,234719 0,418833 (-66.1%) 0,386365 (-68.7%) 8000 12 1 1,232417 1,435979 (+16.5%) 1,462792 (+18.7%) 8000 12 3 1,232760 1,766073 (+43.3%) 1,743990 (+41.5%) 8000 12 51 1,239281 1,319562 (+6.5%) 1,423479 (+14.9%) 8000 12 127 1,231708 1,315760 (+6.8%) 1,426667 (+15.8%) xvtstdcdp: rept loop imm master version prev version current version 25 4000 0 0,159930 0,040830 (-74.5%) 0,040610 (-74.6%) 25 4000 1 0,160640 0,053670 (-66.6%) 0,053480 (-66.7%) 25 4000 3 0,160020 0,063030 (-60.6%) 0,062960 (-60.7%) 25 4000 51 0,160410 0,128620 (-19.8%) 0,127470 (-20.5%) 25 4000 127 0,160330 0,127670 (-20.4%) 0,128690 (-19.7%) 8000 12 0 1,190365 0,422146 (-64.5%) 0,388417 (-67.4%) 8000 12 1 1,191292 1,445312 (+21.3%) 1,428698 (+19.9%) 8000 12 3 1,188687 1,980656 (+66.6%) 1,975354 (+66.2%) 8000 12 51 1,191250 1,264500 (+6.1%) 1,355083 (+13.8%) 8000 12 127 1,197313 1,266729 (+5.8%) 1,349156 (+12.7%) Overall, these instructions are the hardest ones to measure performance as the gvec implementation is affected by the immediate. Above there are 5 different scenarios when it comes to immediate and 2 when it comes to rept/loop combination. The immediates scenarios are: all bits are 0 therefore the target register should just be changed to 0, with 1 bit set, with 2 bits set in a combination the new implementation can deal with using gvec, 4 bits set and the new implementation can't deal with it using gvec and all bits set. The rept/loop scenarios are high loop and low rept (so it should spend more time executing it than translating it) and high rept low loop (so it should spend more time translating it than executing this code). These comparisons are between the upstream version, a previous similar implementation and a one with a cleaner code(this one). For a comparison with o previous different implementation: <20221010191356.83659-13-lucas.araujo@eldorado.org.br> Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221019125040.48028-13-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: Moved XSTSTDC[QDS]P to decodetreeLucas Mateus Castro (alqotel)
Moved XSTSTDCSP, XSTSTDCDP and XSTSTDCQP to decodetree and moved some of its decoding away from the helper as previously the DCMX, XB and BF were calculated in the helper with the help of cpu_env, now that part was moved to the decodetree with the rest. xvtstdcsp: rept loop master patch 8 12500 1,85393600 1,94683600 (+5.0%) 25 4000 1,78779800 1,92479000 (+7.7%) 100 1000 2,12775000 2,28895500 (+7.6%) 500 200 2,99655300 3,23102900 (+7.8%) 2500 40 6,89082200 7,44827500 (+8.1%) 8000 12 17,50585500 18,95152100 (+8.3%) xvtstdcdp: rept loop master patch 8 12500 1,39043100 1,33539800 (-4.0%) 25 4000 1,35731800 1,37347800 (+1.2%) 100 1000 1,51514800 1,56053000 (+3.0%) 500 200 2,21014400 2,47906000 (+12.2%) 2500 40 5,39488200 6,68766700 (+24.0%) 8000 12 13,98623900 18,17661900 (+30.0%) xvtstdcdp: rept loop master patch 8 12500 1,35123800 1,34455800 (-0.5%) 25 4000 1,36441200 1,36759600 (+0.2%) 100 1000 1,49763500 1,54138400 (+2.9%) 500 200 2,19020200 2,46196400 (+12.4%) 2500 40 5,39265700 6,68147900 (+23.9%) 8000 12 14,04163600 18,19669600 (+29.6%) As some values are now decoded outside the helper and passed to it as an argument the number of arguments of the helper increased, the number of TCGop needed to load the arguments increased. I suspect that's why the slow-down in the tests with a high REPT but low LOOP. Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221019125040.48028-12-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: Moved XVTSTDC[DS]P to decodetreeLucas Mateus Castro (alqotel)
Moved XVTSTDCSP and XVTSTDCDP to decodetree an restructured the helper to be simpler and do all decoding in the decodetree (so XB, XT and DCMX are all calculated outside the helper). Obs: The tests in this one are slightly different, these are the sum of these instructions with all possible immediate and those instructions are repeated 10 times. xvtstdcsp: rept loop master patch 8 12500 2,76402100 2,70699100 (-2.1%) 25 4000 2,64867100 2,67884100 (+1.1%) 100 1000 2,73806300 2,78701000 (+1.8%) 500 200 3,44666500 3,61027600 (+4.7%) 2500 40 5,85790200 6,47475500 (+10.5%) 8000 12 15,22102100 17,46062900 (+14.7%) xvtstdcdp: rept loop master patch 8 12500 2,11818000 1,61065300 (-24.0%) 25 4000 2,04573400 1,60132200 (-21.7%) 100 1000 2,13834100 1,69988100 (-20.5%) 500 200 2,73977000 2,48631700 (-9.3%) 2500 40 5,05067000 5,25914100 (+4.1%) 8000 12 14,60507800 15,93704900 (+9.1%) Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221019125040.48028-11-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: Use gvec to decode XVCPSGN[SD]PLucas Mateus Castro (alqotel)
Moved XVCPSGNSP and XVCPSGNDP to decodetree and used gvec to translate them. xvcpsgnsp: rept loop master patch 8 12500 0,00561400 0,00537900 (-4.2%) 25 4000 0,00562100 0,00400000 (-28.8%) 100 1000 0,00696900 0,00416300 (-40.3%) 500 200 0,02211900 0,00840700 (-62.0%) 2500 40 0,09328600 0,02728300 (-70.8%) 8000 12 0,27295300 0,06867800 (-74.8%) xvcpsgndp: rept loop master patch 8 12500 0,00556300 0,00584200 (+5.0%) 25 4000 0,00482700 0,00431700 (-10.6%) 100 1000 0,00585800 0,00464400 (-20.7%) 500 200 0,01565300 0,00839700 (-46.4%) 2500 40 0,05766500 0,02430600 (-57.8%) 8000 12 0,19875300 0,07947100 (-60.0%) Like the previous instructions there seemed to be a improvement on translation time. Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221019125040.48028-10-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: Use gvec to decode XV[N]ABS[DS]P/XVNEG[DS]PLucas Mateus Castro (alqotel)
Moved XVABSSP, XVABSDP, XVNABSSP,XVNABSDP, XVNEGSP and XVNEGDP to decodetree and used gvec to translate them. xvabssp: rept loop master patch 8 12500 0,00477900 0,00476000 (-0.4%) 25 4000 0,00442800 0,00353300 (-20.2%) 100 1000 0,00478700 0,00366100 (-23.5%) 500 200 0,00973200 0,00649400 (-33.3%) 2500 40 0,03165200 0,02226700 (-29.7%) 8000 12 0,09315900 0,06674900 (-28.3%) xvabsdp: rept loop master patch 8 12500 0,00475000 0,00474400 (-0.1%) 25 4000 0,00355600 0,00367500 (+3.3%) 100 1000 0,00444200 0,00366000 (-17.6%) 500 200 0,00942700 0,00732400 (-22.3%) 2500 40 0,02990000 0,02308500 (-22.8%) 8000 12 0,08770300 0,06683800 (-23.8%) xvnabssp: rept loop master patch 8 12500 0,00494500 0,00492900 (-0.3%) 25 4000 0,00397700 0,00338600 (-14.9%) 100 1000 0,00421400 0,00353500 (-16.1%) 500 200 0,01048000 0,00707100 (-32.5%) 2500 40 0,03251500 0,02238300 (-31.2%) 8000 12 0,08889100 0,06469800 (-27.2%) xvnabsdp: rept loop master patch 8 12500 0,00511000 0,00492700 (-3.6%) 25 4000 0,00398800 0,00381500 (-4.3%) 100 1000 0,00390500 0,00365900 (-6.3%) 500 200 0,00924800 0,00784600 (-15.2%) 2500 40 0,03138900 0,02391600 (-23.8%) 8000 12 0,09654200 0,05684600 (-41.1%) xvnegsp: rept loop master patch 8 12500 0,00493900 0,00452800 (-8.3%) 25 4000 0,00369100 0,00366800 (-0.6%) 100 1000 0,00371100 0,00380000 (+2.4%) 500 200 0,00991100 0,00652300 (-34.2%) 2500 40 0,03025800 0,02422300 (-19.9%) 8000 12 0,09251100 0,06457600 (-30.2%) xvnegdp: rept loop master patch 8 12500 0,00474900 0,00454400 (-4.3%) 25 4000 0,00353100 0,00325600 (-7.8%) 100 1000 0,00398600 0,00366800 (-8.0%) 500 200 0,01032300 0,00702400 (-32.0%) 2500 40 0,03125000 0,02422400 (-22.5%) 8000 12 0,09475100 0,06173000 (-34.9%) This one to me seemed the opposite of the previous instructions, as it looks like there was an improvement in the translation time (itself not a surprise as operations were done twice before so there was the need to translate twice as many TCGop) Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221019125040.48028-9-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-17target/ppc: Fix xvcmp* clearing FI bitVíctor Colombo
Vector instructions in general are not supposed to change the FI bit. However, xvcmp* instructions are calling gen_helper_float_check_status, which is leading to a cleared FI flag where it should be kept unchanged. As helper_float_check_status only affects inexact, overflow and underflow, and the xvcmp* instructions don't change these flags, this issue can be fixed by removing the call to helper_float_check_status. By doing this, the FI bit in FPSCR will be preserved as expected. Fixes: 00084a25adf ("target/ppc: introduce separate VSX_CMP macro for xvcmp* instructions") Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221005121551.27957-1-victor.colombo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26target/ppc: Implemented [pm]xvbf16ger2*Lucas Mateus Castro (alqotel)
Implement the following PowerISA v3.1 instructions: xvbf16ger2: VSX Vector bfloat16 GER (rank-2 update) xvbf16ger2nn: VSX Vector bfloat16 GER (rank-2 update) Negative multiply, Negative accumulate xvbf16ger2np: VSX Vector bfloat16 GER (rank-2 update) Negative multiply, Positive accumulate xvbf16ger2pn: VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Negative accumulate xvbf16ger2pp: VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Positive accumulate pmxvbf16ger2: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) pmxvbf16ger2nn: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) Negative multiply, Negative accumulate pmxvbf16ger2np: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) Negative multiply, Positive accumulate pmxvbf16ger2pn: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Negative accumulate pmxvbf16ger2pp: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220524140537.27451-8-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26target/ppc: Implemented pmxvf*ger*Lucas Mateus Castro (alqotel)
Implement the following PowerISA v3.1 instructions: pmxvf16ger2: Prefixed Masked VSX Vector 16-bit Floating-Point GER (rank-2 update) pmxvf16ger2nn: Prefixed Masked VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative multiply, Negative accumulate pmxvf16ger2np: Prefixed Masked VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative multiply, Positive accumulate pmxvf16ger2pn: Prefixed Masked VSX Vector 16-bit Floating-Point GER (rank-2 update) Positive multiply, Negative accumulate pmxvf16ger2pp: Prefixed Masked VSX Vector 16-bit Floating-Point GER (rank-2 update) Positive multiply, Positive accumulate pmxvf32ger: Prefixed Masked VSX Vector 32-bit Floating-Point GER (rank-1 update) pmxvf32gernn: Prefixed Masked VSX Vector 32-bit Floating-Point GER (rank-1 update) Negative multiply, Negative accumulate pmxvf32gernp: Prefixed Masked VSX Vector 32-bit Floating-Point GER (rank-1 update) Negative multiply, Positive accumulate pmxvf32gerpn: Prefixed Masked VSX Vector 32-bit Floating-Point GER (rank-1 update) Positive multiply, Negative accumulate pmxvf32gerpp: Prefixed Masked VSX Vector 32-bit Floating-Point GER (rank-1 update) Positive multiply, Positive accumulate pmxvf64ger: Prefixed Masked VSX Vector 64-bit Floating-Point GER (rank-1 update) pmxvf64gernn: Prefixed Masked VSX Vector 64-bit Floating-Point GER (rank-1 update) Negative multiply, Negative accumulate pmxvf64gernp: Prefixed Masked VSX Vector 64-bit Floating-Point GER (rank-1 update) Negative multiply, Positive accumulate pmxvf64gerpn: Prefixed Masked VSX Vector 64-bit Floating-Point GER (rank-1 update) Positive multiply, Negative accumulate pmxvf64gerpp: Prefixed Masked VSX Vector 64-bit Floating-Point GER (rank-1 update) Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220524140537.27451-7-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26target/ppc: Implemented xvf16ger*Lucas Mateus Castro (alqotel)
Implement the following PowerISA v3.1 instructions: xvf16ger2: VSX Vector 16-bit Floating-Point GER (rank-2 update) xvf16ger2nn: VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative multiply, Negative accumulate xvf16ger2np: VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative multiply, Positive accumulate xvf16ger2pn: VSX Vector 16-bit Floating-Point GER (rank-2 update) Positive multiply, Negative accumulate xvf16ger2pp: VSX Vector 16-bit Floating-Point GER (rank-2 update) Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220524140537.27451-6-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26target/ppc: Implemented xvf*ger*Lucas Mateus Castro (alqotel)
Implement the following PowerISA v3.1 instructions: xvf32ger: VSX Vector 32-bit Floating-Point GER (rank-1 update) xvf32gernn: VSX Vector 32-bit Floating-Point GER (rank-1 update) Negative multiply, Negative accumulate xvf32gernp: VSX Vector 32-bit Floating-Point GER (rank-1 update) Negative multiply, Positive accumulate xvf32gerpn: VSX Vector 32-bit Floating-Point GER (rank-1 update) Positive multiply, Negative accumulate xvf32gerpp: VSX Vector 32-bit Floating-Point GER (rank-1 update) Positive multiply, Positive accumulate xvf64ger: VSX Vector 64-bit Floating-Point GER (rank-1 update) xvf64gernn: VSX Vector 64-bit Floating-Point GER (rank-1 update) Negative multiply, Negative accumulate xvf64gernp: VSX Vector 64-bit Floating-Point GER (rank-1 update) Negative multiply, Positive accumulate xvf64gerpn: VSX Vector 64-bit Floating-Point GER (rank-1 update) Positive multiply, Negative accumulate xvf64gerpp: VSX Vector 64-bit Floating-Point GER (rank-1 update) Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220524140537.27451-5-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26target/ppc: Implemented pmxvi*ger* instructionsLucas Mateus Castro (alqotel)
Implement the following PowerISA v3.1 instructions: pmxvi4ger8: Prefixed Masked VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) pmxvi4ger8pp: Prefixed Masked VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) Positive multiply, Positive accumulate pmxvi8ger4: Prefixed Masked VSX Vector 4-bit Signed Integer GER (rank-8 update) pmxvi8ger4pp: Prefixed Masked VSX Vector 4-bit Signed Integer GER (rank-8 update) Positive multiply, Positive accumulate pmxvi8ger4spp: Prefixed Masked VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) with Saturate Positive multiply, Positive accumulate pmxvi16ger2: Prefixed Masked VSX Vector 16-bit Signed Integer GER (rank-2 update) pmxvi16ger2pp: Prefixed Masked VSX Vector 16-bit Signed Integer GER (rank-2 update) Positive multiply, Positive accumulate pmxvi16ger2s: Prefixed Masked VSX Vector 16-bit Signed Integer GER (rank-2 update) with Saturation pmxvi16ger2spp: Prefixed Masked VSX Vector 16-bit Signed Integer GER (rank-2 update) with Saturation Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220524140537.27451-4-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26target/ppc: Implemented xvi*ger* instructionsLucas Mateus Castro (alqotel)
Implement the following PowerISA v3.1 instructions: xvi4ger8: VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) xvi4ger8pp: VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) Positive multiply, Positive accumulate xvi8ger4: VSX Vector 4-bit Signed Integer GER (rank-8 update) xvi8ger4pp: VSX Vector 4-bit Signed Integer GER (rank-8 update) Positive multiply, Positive accumulate xvi8ger4spp: VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) with Saturate Positive multiply, Positive accumulate xvi16ger2: VSX Vector 16-bit Signed Integer GER (rank-2 update) xvi16ger2pp: VSX Vector 16-bit Signed Integer GER (rank-2 update) Positive multiply, Positive accumulate xvi16ger2s: VSX Vector 16-bit Signed Integer GER (rank-2 update) with Saturation xvi16ger2spp: VSX Vector 16-bit Signed Integer GER (rank-2 update) with Saturation Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220524140537.27451-3-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26target/ppc: Implement xxm[tf]acc and xxsetacczLucas Mateus Castro (alqotel)
Implement the following PowerISA v3.1 instructions: xxmfacc: VSX Move From Accumulator xxmtacc: VSX Move To Accumulator xxsetaccz: VSX Set Accumulator to Zero The PowerISA 3.1 mentions that for the current version of the architecture, "the hardware implementation provides the effect of ACC[i] and VSRs 4*i to 4*i + 3 logically containing the same data" and "The Accumulators introduce no new logical state at this time" (page 501). For now it seems unnecessary to create new structures, so this patch just uses ACC[i] as VSRs 4*i to 4*i+3 and therefore move to and from accumulators are no-ops. Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220524140537.27451-2-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26target/ppc: declare xxextractuw and xxinsertw helpers with call flagsMatheus Ferst
Move xxextractuw and xxinsertw to decodetree, declare both helpers with TCG_CALL_NO_RWG, and drop the unused env argument. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220517123929.284511-9-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26target/ppc: declare xvxsigsp helper with call flagsMatheus Ferst
Move xvxsigsp to decodetree, declare helper_xvxsigsp with TCG_CALL_NO_RWG, and drop the unused env argument. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220517123929.284511-8-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26target/ppc: declare xscvspdpn helper with call flagsMatheus Ferst
Move xscvspdpn to decodetree, declare helper_xscvspdpn with TCG_CALL_NO_RWG_SE and drop the unused env argument. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220517123929.284511-7-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-04-20target/ppc: implement xscvqp[su]qzMatheus Ferst
Implement the following PowerISA v3.1 instructions: xscvqpsqz: VSX Scalar Convert with round to zero Quad-Precision to Signed Quadword xscvqpuqz: VSX Scalar Convert with round to zero Quad-Precision to Unsigned Quadword Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220330175932.6995-9-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-04-20target/ppc: implement xscv[su]qqpMatheus Ferst
Implement the following PowerISA v3.1 instructions: xscvsqqp: VSX Scalar Convert with round Signed Quadword to Quad-Precision xscvuqqp: VSX Scalar Convert with round Unsigned Quadword to Quad-Precision format Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220330175932.6995-8-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-04-06Replace config-time define HOST_WORDS_BIGENDIANMarc-André Lureau
Replace a config-time define with a compile time condition define (compatible with clang and gcc) that must be declared prior to its usage. This avoids having a global configure time define, but also prevents from bad usage, if the config header wasn't included before. This can help to make some code independent from qemu too. gcc supports __BYTE_ORDER__ from about 4.6 and clang from 3.2. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> [ For the s390x parts I'm involved in ] Acked-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220323155743.1585078-7-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-26target/ppc: fix helper_xvmadd* argument orderMatheus Ferst
When the xsmadd* insns were moved to decodetree, the helper arguments were reordered to better match the PowerISA description. The same macro is used to declare xvmadd* helpers, but the translation macro of these insns was not changed accordingly. Reported-by: Víctor Colombo <victor.colombo@eldorado.org.br> Fixes: e4318ab2e423 ("target/ppc: move xs[n]madd[am][ds]p/xs[n]msub[am][ds]p to decodetree") Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Tested-by: Víctor Colombo <victor.colombo@eldorado.org.br> Message-Id: <20220325111851.718966-1-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-14target/ppc: fix xxspltw for big endian hostsMatheus Ferst
Fix a typo in the host endianness macro and add a simple test to detect regressions. Fixes: 9bb0048ec6f8 ("target/ppc: convert xxspltw to vector operations") Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220310172047.61094-1-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-05target/ppc: split XXGENPCV macros for readabilityMatheus Ferst
Fixes: b090f4f1e3c9 ("target/ppc: Implement xxgenpcv[bhwd]m instruction") Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220304175156.2012315-6-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: implement lxvr[bhwd]/stxvr[bhwd]xLucas Coutinho
Implement the following PowerISA v3.1 instuctions: lxvrbx: Load VSX Vector Rightmost Byte Indexed X-form lxvrhx: Load VSX Vector Rightmost Halfword Indexed X-form lxvrwx: Load VSX Vector Rightmost Word Indexed X-form lxvrdx: Load VSX Vector Rightmost Doubleword Indexed X-form stxvrbx: Store VSX Vector Rightmost Byte Indexed X-form stxvrhx: Store VSX Vector Rightmost Halfword Indexed X-form stxvrwx: Store VSX Vector Rightmost Word Indexed X-form stxvrdx: Store VSX Vector Rightmost Doubleword Indexed X-form Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Lucas Coutinho <lucas.coutinho@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-50-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: implement plxssp/pstxsspLeandro Lupori
Implement instructions plxssp/pstxssp and port lxssp/stxssp to decode tree. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Leandro Lupori <leandro.lupori@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-49-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: implement plxsd/pstxsdLeandro Lupori
Implement instructions plxsd/pstxsd and port lxsd/stxsd to decode tree. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Leandro Lupori <leandro.lupori@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-48-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: Implement xvcvbf16spn and xvcvspbf16 instructionsVíctor Colombo
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220225210936.1749575-47-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: Implement xs{max,min}cqpVíctor Colombo
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-46-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: Move xs{max, min}[cj]dp to use do_helper_XX3Víctor Colombo
Also, fixes these instructions not being capitalized. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-44-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: Move xscmp{eq,ge,gt}dp to decodetreeVíctor Colombo
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-43-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: Implement xscmp{eq,ge,gt}qpVíctor Colombo
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-42-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: Remove xscmpnedp instructionVíctor Colombo
xscmpnedp was added in ISA v3.0 but removed in v3.0B. This patch removes this instruction as it was not in the final version of v3.0. Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Acked-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-40-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: Implement xvtlsbb instructionVíctor Colombo
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-39-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: implement xs[n]maddqp[o]/xs[n]msubqp[o]Matheus Ferst
Implement the following PowerISA v3.0 instuctions: xsmaddqp[o]: VSX Scalar Multiply-Add Quad-Precision [using round to Odd] xsmsubqp[o]: VSX Scalar Multiply-Subtract Quad-Precision [using round to Odd] xsnmaddqp[o]: VSX Scalar Negative Multiply-Add Quad-Precision [using round to Odd] xsnmsubqp[o]: VSX Scalar Negative Multiply-Subtract Quad-Precision [using round to Odd] Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-38-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: move xs[n]madd[am][ds]p/xs[n]msub[am][ds]p to decodetreeMatheus Ferst
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-37-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: Implement xxgenpcv[bhwd]m instructionMatheus Ferst
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220225210936.1749575-36-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: Implement xxevalMatheus Ferst
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220225210936.1749575-35-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: Implement xxpermx instructionMatheus Ferst
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-33-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: Move xxpermdi to decodetreeMatheus Ferst
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-32-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: move xxperm/xxpermr to decodetreeMatheus Ferst
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-31-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: Move xxsel to decodetreeMatheus Ferst
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-30-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: Introduce TRANS*FLAGS macrosLuis Pires
New macros that add FLAGS and FLAGS2 checking were added for both TRANS and TRANS64. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Luis Pires <luis.pires@eldorado.org.br> [ferst: - TRANS_FLAGS2 instead of TRANS_FLAGS_E - Use the new macros in load/store vector insns ] Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-2-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-09target/ppc: Change VSX instructions behavior to fill with zerosVíctor Colombo
ISA v3.1 changed some VSX instructions behavior by changing what the other words/doubleword in the result should contain when the result is only one word/doubleword. e.g. xsmaxdp operates on doubleword 0 and saves the result also in doubleword 0. Before, the second doubleword result was undefined according to the ISA, but now it's stated that it should be zeroed. Even tough the result was undefined before, hardware implementing these instructions already filled these fields with 0s. Changing every ISA version in QEMU to this behavior makes the results match what happens in hardware. Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220204181944.65063-1-victor.colombo@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-08exec/memop: Adding signedness to quad definitionsFrédéric Pétrot
Renaming defines for quad in their various forms so that their signedness is now explicit. Done using git grep as suggested by Philippe, with a bit of hand edition to keep assignments aligned. Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-id: 20220106210108.138226-2-frederic.petrot@univ-grenoble-alpes.fr Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2021-12-17target/ppc: move xscvqpdp to decodetreeMatheus Ferst
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211213120958.24443-5-victor.colombo@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17target/ppc: fix xscvqpdp register accessMatheus Ferst
This instruction has VRT and VRB fields instead of T/TX and B/BX. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211213120958.24443-4-victor.colombo@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17target/ppc: Move xs{max,min}[cj]dp to decodetreeVictor Colombo
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Victor Colombo <victor.colombo@eldorado.org.br> Message-Id: <20211213120958.24443-3-victor.colombo@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17target/ppc: Fix xs{max, min}[cj]dp to use VSX registersVictor Colombo
PPC instruction xsmaxcdp, xsmincdp, xsmaxjdp, and xsminjdp are using vector registers when they should be using VSX ones. This happens because the instructions are using GEN_VSX_HELPER_R3, which adds 32 to the register numbers, effectively making them vector registers. This patch fixes it by changing these instructions to use GEN_VSX_HELPER_X3. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Victor Colombo <victor.colombo@eldorado.org.br> Message-Id: <20211213120958.24443-2-victor.colombo@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-11-09target/ppc: Implement lxvkq instructionMatheus Ferst
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Luis Pires <luis.pires@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211104123719.323713-25-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09target/ppc: Implement xxblendvb/xxblendvh/xxblendvw/xxblendvd instructionsMatheus Ferst
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Bruno Larsen (billionai) <bruno.larsen@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211104123719.323713-24-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09target/ppc: implemented XXSPLTIDP instructionBruno Larsen (billionai)
Implemented the instruction XXSPLTIDP using decodetree. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Bruno Larsen (billionai) <bruno.larsen@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211104123719.323713-23-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>