Age | Commit message (Collapse) | Author |
|
Currently, we pass env to every generated helper. When the semantics of
the instruction only depend on the arguments, this is unnecessary and
adds extra overhead to the helper call.
We add the TCG_CALL_NO_RWG_SE flag to any non-HVX helpers that don't get
the ptr to env.
The A2_nop and SA1_setin1 instructions end up with no arguments. This
results in a "old-style function definition" error from the compiler, so
we write overrides for them.
With this change, the number of helpers with env argument is
idef-parser enabled: 329 total, 23 with env
idef-parser disabled: 1543 total, 550 with env
Signed-off-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Tested-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20240214042726.19290-4-ltaylorsimpson@gmail.com>
Signed-off-by: Brian Cain <bcain@quicinc.com>
|
|
Currently, the register number (MuN) for modifier registers is the
modifier register number rather than the index into hex_gpr. This
patch changes MuN to the hex_gpr index, which is consistent with
the handling of control registers.
Note that HELPER(fcircadd) needs the CS register corresponding to the
modifier register specified in the instruction. We create a TCGv
variable "CS" to hold the value to pass to the helper.
Reviewed-by: Brian Cain <bcain@quicinc.com>
Signed-off-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Message-Id: <20231210220712.491494-2-ltaylorsimpson@gmail.com>
Signed-off-by: Brian Cain <bcain@quicinc.com>
|
|
Allow the name 'cpu_env' to be used for something else.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
The new_pred_value array in the CPUHexagonState is only used for
bookkeeping within the translation of a packet. With recent changes
that eliminate the need to free TCGv variables, these make more sense
to be transient and kept in DisasContext.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-19-tsimpson@quicinc.com>
|
|
The following have overrides
S2_insert
S2_insert_rp
S2_asr_r_svw_trun
A2_swiz
These instructions have semantics that write to the destination
before all the operand reads have been completed. Therefore,
the idef-parser versions were disabled with the short-circuit patch.
Test cases added to tests/tcg/hexagon/read_write_overlap.c
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-16-tsimpson@quicinc.com>
|
|
In certain cases, we can avoid the overhead of writing to hex_new_value
and write directly to hex_gpr. We add need_commit field to DisasContext
indicating if the end-of-packet commit is needed. If it is not needed,
get_result_gpr() and get_result_gpr_pair() can return hex_gpr.
We pass the ctx->need_commit to helpers when needed.
Finally, we can early-exit from gen_reg_writes during packet commit.
There are a few instructions whose semantics write to the result before
reading all the inputs. Therefore, the idef-parser generated code is
incompatible with short-circuit. We tell idef-parser to skip them.
For debugging purposes, we add a cpu property to turn off short-circuit.
When the short-circuit property is false, we skip the analysis and force
the end-of-packet commit.
Here's a simple example of the TCG generated for
0x004000b4: 0x7800c020 { R0 = #0x1 }
BEFORE:
---- 004000b4
movi_i32 new_r0,$0x1
mov_i32 r0,new_r0
AFTER:
---- 004000b4
movi_i32 r0,$0x1
This patch reintroduces a use of check_for_attrib, so we remove the
G_GNUC_UNUSED added earlier in this series.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <20230427230012.3800327-12-tsimpson@quicinc.com>
|
|
These instructions have implicit writes to registers, so we don't
want them to be helpers when idef-parser is off.
The following instructions are overriden
S2_cabacdecbin
SA1_cmpeqi
Remove the log_pred_write function from op_helper.c
Remove references in macros.h
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-8-tsimpson@quicinc.com>
|
|
These instructions have implicit reads from p0, so we don't want
them in helpers when idef-parser is off.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-6-tsimpson@quicinc.com>
|
|
These instructions have implicit writes to registers, so we don't
want them to be helpers when idef-parser is off.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-5-tsimpson@quicinc.com>
|
|
These instructions have implicit writes to registers, so we don't
want them to be helpers when idef-parser is off.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-4-tsimpson@quicinc.com>
|
|
Add DisasContext arg to gen_log_reg_write_pair also
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427230012.3800327-3-tsimpson@quicinc.com>
|
|
The following instructions are added
J2_callrh
J2_junprh
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230427224057.3766963-9-tsimpson@quicinc.com>
|
|
The following instructions are added
L2_loadw_aq
L4_loadd_aq
R6_release_at_vi
R6_release_st_vi
S2_storew_rl_at_vi
S4_stored_rl_at_vi
S2_storew_rl_st_vi
S4_stored_rl_st_vi
The release instructions are nop's in qemu. The others behave as
loads/stores.
The encodings for these instructions changed some "don't care" bits
L2_loadw_locked
L4_loadd_locked
S2_storew_locked
S4_stored_locked
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230427224057.3766963-3-tsimpson@quicinc.com>
|
|
Most of these are not modelled in QEMU, so save the overhead of
calling a helper.
The only exception is dczeroa. It assigns to hex_dczero_addr, which
is handled during packet commit.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230410202402.2856852-1-tsimpson@quicinc.com>
|
|
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230405164211.30015-3-tsimpson@quicinc.com>
|
|
The following instructions are overriden
S2_ct0 Count trailing zeros
S2_ct1 Count trailing ones
S2_ct0p Count trailing zeros (register pair)
S2_ct1p Count trailing ones (register pair)
These instructions are not handled by idef-parser because the
imported semantics uses bit-reverse. However, they are
straightforward to implement in TCG with tcg_gen_ctzi_*
Test cases added to tests/tcg/hexagon/misc.c
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230405164211.30015-1-tsimpson@quicinc.com>
|
|
We assign the instruction destination register to hex_new_value[num]
instead of a TCG temp that gets copied back to hex_new_value[num].
We introduce new functions get_result_gpr[_pair] to facilitate getting
the proper destination register.
Since we preload hex_new_value for predicated instructions, we don't
need the check for slot_cancelled. So, we call gen_log_reg_write instead.
We update the helper function generation and gen_tcg.h to maintain the
disable-hexagon-idef-parser configuration.
Here is a simple example of the differences in the TCG code generated:
IN:
0x00400094: 0xf900c102 { if (P0) R2 = and(R0,R1) }
BEFORE
---- 00400094
mov_i32 slot_cancelled,$0x0
mov_i32 new_r2,r2
mov_i32 loc2,$0x0
and_i32 tmp0,p0,$0x1
brcond_i32 tmp0,$0x0,eq,$L1
and_i32 tmp0,r0,r1
mov_i32 loc2,tmp0
br $L2
set_label $L1
or_i32 slot_cancelled,slot_cancelled,$0x8
set_label $L2
and_i32 tmp0,slot_cancelled,$0x8
movcond_i32 new_r2,tmp0,$0x0,loc2,new_r2,eq
mov_i32 r2,new_r2
AFTER
---- 00400094
mov_i32 slot_cancelled,$0x0
mov_i32 new_r2,r2
and_i32 tmp0,p0,$0x1
brcond_i32 tmp0,$0x0,eq,$L1
and_i32 tmp0,r0,r1
mov_i32 new_r2,tmp0
br $L2
set_label $L1
or_i32 slot_cancelled,slot_cancelled,$0x8
set_label $L2
mov_i32 r2,new_r2
We'll remove the unnecessary manipulation of slot_cancelled in a
subsequent patch.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230307025828.1612809-13-tsimpson@quicinc.com>
|
|
These instructions perform a deallocframe+return (jumpr r31)
Add overrides for
L4_return
SL2_return
L4_return_t
L4_return_f
L4_return_tnew_pt
L4_return_fnew_pt
L4_return_tnew_pnt
L4_return_fnew_pnt
SL2_return_t
SL2_return_f
SL2_return_tnew
SL2_return_fnew
This patch eliminates the last helper that uses write_new_pc, so we
remove it from op_helper.c
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230307025828.1612809-5-tsimpson@quicinc.com>
|
|
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230307025828.1612809-4-tsimpson@quicinc.com>
|
|
Add overrides for
J2_callr
J2_callrt
J2_callrf
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230307025828.1612809-3-tsimpson@quicinc.com>
|
|
Add overrides for
SL2_jumpr31 Unconditional
SL2_jumpr31_t Predicated true (old value)
SL2_jumpr31_f Predicated false (old value)
SL2_jumpr31_tnew Predicated true (new value)
SL2_jumpr31_fnew Predicated false (new value)
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230307025828.1612809-2-tsimpson@quicinc.com>
|
|
The --disable-hexagon-idef-parser configuration was broken by this patch
2feacf60c23ba6 (target/hexagon: Drop tcg_temp_free from C code)
That config is not tested by CI
Fix is simple: Mark a few TCGv variables as unused
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230306172515.346813-1-tsimpson@quicinc.com>
|
|
Translators are no longer required to free tcg temporaries.
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Since tcg_temp_new_* is now identical, use those.
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Direct block chaining is documented here
https://qemu.readthedocs.io/en/latest/devel/tcg.html#direct-block-chaining
Hexagon inner loops end with the endloop0 instruction
To go back to the beginning of the loop, this instructions writes to PC
from register SA0 (start address 0). To use direct block chaining, we
have to assign PC with a constant value. So, we specialize the code
generation when the start of the translation block is equal to SA0.
When this is the case, we defer the compare/branch from endloop0 to
gen_end_tb. When this is done, we can assign the start address of the TB
to PC.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <20221108162906.3166-12-tsimpson@quicinc.com>
|
|
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <20221108162906.3166-10-tsimpson@quicinc.com>
|
|
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <20221108162906.3166-9-tsimpson@quicinc.com>
|
|
Add overrides for
J2_call
J2_callt
J2_callf
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <20221108162906.3166-8-tsimpson@quicinc.com>
|
|
The imported files don't properly mark all CONDEXEC instructions, so
we add some logic to hex_common.py to add the attribute.
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <20221108162906.3166-7-tsimpson@quicinc.com>
|
|
Add pc field to Packet structure
For helpers that need PC, pass an extra argument
Remove slot arg from conditional jump helpers
On a trap0, copy pkt->pc into hex_gpr[HEX_REG_PC]
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <20221108162906.3166-6-tsimpson@quicinc.com>
|
|
These instructions will not be generated by idef-parser, so we override
them manually.
Test cases added to tests/tcg/hexagon/usr.c
Co-authored-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221108162906.3166-4-tsimpson@quicinc.com>
|
|
The semantics of a mem_noshuf packet are that the store effectively
happens before the load. However, in cases where the load raises an
exception, we cannot simply execute the store first.
This change adds a probe to check that the load will not raise an
exception before executing the store.
If the load is predicated, this requires special handling. We check
the condition before performing the probe. Since, we need the EA to
perform the check, we move the GET_EA portion inside CHECK_NOSHUF_PRED.
Test case added in tests/tcg/hexagon/mem_noshuf_exception.c
Suggested-by: Alessandro Di Federico <ale@rev.ng>
Suggested-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220707210546.15985-3-tsimpson@quicinc.com>
|
|
Call the CHECK_NOSHUF macro multiple times: once in the
fGEN_TCG_PRED_LOAD() and again in fLOAD().
Before this commit, a packet with a store and a predicated
load with mem_noshuf that gets encoded like this:
{ P0 = cmp.eq(R17,#0x0)
memw(R18+#0x0) = R2
if (!P0.new) R3 = memw(R17+#0x4) }
... would end up generating a branch over both the load
and the store like so:
...
brcond_i32 loc17,$0x0,eq,$L1
mov_i32 loc18,store_addr_1
qemu_st_i32 store_val32_1,store_addr_1,leul,0
qemu_ld_i32 loc16,loc7,leul,0
set_label $L1
...
Test cases added to tests/tcg/hexagon/mem_noshuf.c
Co-authored-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Brian Cain <bcain@quicinc.com>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220707210546.15985-2-tsimpson@quicinc.com>
|
|
Change additional tcg_const_tl to tcg_constant_tl
Note that gen_pred_cancal had slot_mask initialized with tcg_const_tl.
However, it is not constant throughout, so we initialize it with
tcg_temp_new and replace the first use with the constant value.
Inspired-by: Richard Henderson <richard.henderson@linaro.org>
Inspired-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
|
|
Replace uses of tcg_const_* with the allocate and free close together.
Inspired-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20211003004750.3608983-3-f4bug@amsat.org>
|
|
Previously the store-conditional code was writing to hex_pred[prednum].
Then, the fGEN_TCG override was reading from there to the destination
variable so that the packet commit logic would handle it properly.
The correct implementation is to write to the destination variable
and don't have the extra read in the override.
Remove the unused arguments from gen_store_conditional[48]
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1622589584-22571-4-git-send-email-tsimpson@quicinc.com>
|
|
Y4_l2fetch == l2fetch(Rs32, Rt32)
Y5_l2fetch == l2fetch(Rs32, Rtt32)
The semantics for these instructions are present, but the encodings
are missing.
Note that these are treated as nops in qemu, so we add overrides.
Test case added to tests/tcg/hexagon/misc.c
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1622589584-22571-3-git-send-email-tsimpson@quicinc.com>
|
|
The following instructions are added
L2_loadalignb_io Ryy32 = memb_fifo(Rs32+#s11:1)
L2_loadalignh_io Ryy32 = memh_fifo(Rs32+#s11:1)
L4_loadalignb_ur Ryy32 = memb_fifo(Rt32<<#u2+#U6)
L4_loadalignh_ur Ryy32 = memh_fifo(Rt32<<#u2+#U6)
L4_loadalignb_ap Ryy32 = memb_fifo(Re32=#U6)
L4_loadalignh_ap Ryy32 = memh_fifo(Re32=#U6)
L2_loadalignb_pr Ryy32 = memb_fifo(Rx32++Mu2)
L2_loadalignh_pr Ryy32 = memh_fifo(Rx32++Mu2)
L2_loadalignb_pbr Ryy32 = memb_fifo(Rx32++Mu2:brev)
L2_loadalignh_pbr Ryy32 = memh_fifo(Rx32++Mu2:brev)
L2_loadalignb_pi Ryy32 = memb_fifo(Rx32++#s4:1)
L2_loadalignh_pi Ryy32 = memh_fifo(Rx32++#s4:1)
L2_loadalignb_pci Ryy32 = memb_fifo(Rx32++#s4:1:circ(Mu2))
L2_loadalignh_pci Ryy32 = memh_fifo(Rx32++#s4:1:circ(Mu2))
L2_loadalignb_pcr Ryy32 = memb_fifo(Rx32++I:circ(Mu2))
L2_loadalignh_pcr Ryy32 = memh_fifo(Rx32++I:circ(Mu2))
Test cases in tests/tcg/hexagon/load_align.c
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1617930474-31979-26-git-send-email-tsimpson@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
The following instructions are added
L2_loadbzw2_io Rd32 = memubh(Rs32+#s11:1)
L2_loadbzw4_io Rdd32 = memubh(Rs32+#s11:1)
L2_loadbsw2_io Rd32 = membh(Rs32+#s11:1)
L2_loadbsw4_io Rdd32 = membh(Rs32+#s11:1)
L4_loadbzw2_ur Rd32 = memubh(Rt32<<#u2+#U6)
L4_loadbzw4_ur Rdd32 = memubh(Rt32<<#u2+#U6)
L4_loadbsw2_ur Rd32 = membh(Rt32<<#u2+#U6)
L4_loadbsw4_ur Rdd32 = membh(Rt32<<#u2+#U6)
L4_loadbzw2_ap Rd32 = memubh(Re32=#U6)
L4_loadbzw4_ap Rdd32 = memubh(Re32=#U6)
L4_loadbsw2_ap Rd32 = membh(Re32=#U6)
L4_loadbsw4_ap Rdd32 = membh(Re32=#U6)
L2_loadbzw2_pr Rd32 = memubh(Rx32++Mu2)
L2_loadbzw4_pr Rdd32 = memubh(Rx32++Mu2)
L2_loadbsw2_pr Rd32 = membh(Rx32++Mu2)
L2_loadbsw4_pr Rdd32 = membh(Rx32++Mu2)
L2_loadbzw2_pbr Rd32 = memubh(Rx32++Mu2:brev)
L2_loadbzw4_pbr Rdd32 = memubh(Rx32++Mu2:brev)
L2_loadbsw2_pbr Rd32 = membh(Rx32++Mu2:brev)
L2_loadbsw4_pbr Rdd32 = membh(Rx32++Mu2:brev)
L2_loadbzw2_pi Rd32 = memubh(Rx32++#s4:1)
L2_loadbzw4_pi Rdd32 = memubh(Rx32++#s4:1)
L2_loadbsw2_pi Rd32 = membh(Rx32++#s4:1)
L2_loadbsw4_pi Rdd32 = membh(Rx32++#s4:1)
L2_loadbzw2_pci Rd32 = memubh(Rx32++#s4:1:circ(Mu2))
L2_loadbzw4_pci Rdd32 = memubh(Rx32++#s4:1:circ(Mu2))
L2_loadbsw2_pci Rd32 = membh(Rx32++#s4:1:circ(Mu2))
L2_loadbsw4_pci Rdd32 = membh(Rx32++#s4:1:circ(Mu2))
L2_loadbzw2_pcr Rd32 = memubh(Rx32++I:circ(Mu2))
L2_loadbzw4_pcr Rdd32 = memubh(Rx32++I:circ(Mu2))
L2_loadbsw2_pcr Rd32 = membh(Rx32++I:circ(Mu2))
L2_loadbsw4_pcr Rdd32 = membh(Rx32++I:circ(Mu2))
Test cases in tests/tcg/hexagon/load_unpack.c
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1617930474-31979-25-git-send-email-tsimpson@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
The following instructions are added
L2_loadrub_pbr Rd32 = memub(Rx32++Mu2:brev)
L2_loadrb_pbr Rd32 = memb(Rx32++Mu2:brev)
L2_loadruh_pbr Rd32 = memuh(Rx32++Mu2:brev)
L2_loadrh_pbr Rd32 = memh(Rx32++Mu2:brev)
L2_loadri_pbr Rd32 = memw(Rx32++Mu2:brev)
L2_loadrd_pbr Rdd32 = memd(Rx32++Mu2:brev)
S2_storerb_pbr memb(Rx32++Mu2:brev).=.Rt32
S2_storerh_pbr memh(Rx32++Mu2:brev).=.Rt32
S2_storerf_pbr memh(Rx32++Mu2:brev).=.Rt.H32
S2_storeri_pbr memw(Rx32++Mu2:brev).=.Rt32
S2_storerd_pbr memd(Rx32++Mu2:brev).=.Rt32
S2_storerinew_pbr memw(Rx32++Mu2:brev).=.Nt8.new
S2_storerbnew_pbr memw(Rx32++Mu2:brev).=.Nt8.new
S2_storerhnew_pbr memw(Rx32++Mu2:brev).=.Nt8.new
Test cases in tests/tcg/hexagon/brev.c
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1617930474-31979-24-git-send-email-tsimpson@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
The following instructions are added
L2_loadrub_pci Rd32 = memub(Rx32++#s4:0:circ(Mu2))
L2_loadrb_pci Rd32 = memb(Rx32++#s4:0:circ(Mu2))
L2_loadruh_pci Rd32 = memuh(Rx32++#s4:1:circ(Mu2))
L2_loadrh_pci Rd32 = memh(Rx32++#s4:1:circ(Mu2))
L2_loadri_pci Rd32 = memw(Rx32++#s4:2:circ(Mu2))
L2_loadrd_pci Rdd32 = memd(Rx32++#s4:3:circ(Mu2))
S2_storerb_pci memb(Rx32++#s4:0:circ(Mu2)) = Rt32
S2_storerh_pci memh(Rx32++#s4:1:circ(Mu2)) = Rt32
S2_storerf_pci memh(Rx32++#s4:1:circ(Mu2)) = Rt.H32
S2_storeri_pci memw(Rx32++#s4:2:circ(Mu2)) = Rt32
S2_storerd_pci memd(Rx32++#s4:3:circ(Mu2)) = Rtt32
S2_storerbnew_pci memb(Rx32++#s4:0:circ(Mu2)) = Nt8.new
S2_storerhnew_pci memw(Rx32++#s4:1:circ(Mu2)) = Nt8.new
S2_storerinew_pci memw(Rx32++#s4:2:circ(Mu2)) = Nt8.new
L2_loadrub_pcr Rd32 = memub(Rx32++I:circ(Mu2))
L2_loadrb_pcr Rd32 = memb(Rx32++I:circ(Mu2))
L2_loadruh_pcr Rd32 = memuh(Rx32++I:circ(Mu2))
L2_loadrh_pcr Rd32 = memh(Rx32++I:circ(Mu2))
L2_loadri_pcr Rd32 = memw(Rx32++I:circ(Mu2))
L2_loadrd_pcr Rdd32 = memd(Rx32++I:circ(Mu2))
S2_storerb_pcr memb(Rx32++I:circ(Mu2)) = Rt32
S2_storerh_pcr memh(Rx32++I:circ(Mu2)) = Rt32
S2_storerf_pcr memh(Rx32++I:circ(Mu2)) = Rt32.H32
S2_storeri_pcr memw(Rx32++I:circ(Mu2)) = Rt32
S2_storerd_pcr memd(Rx32++I:circ(Mu2)) = Rtt32
S2_storerbnew_pcr memb(Rx32++I:circ(Mu2)) = Nt8.new
S2_storerhnew_pcr memh(Rx32++I:circ(Mu2)) = Nt8.new
S2_storerinew_pcr memw(Rx32++I:circ(Mu2)) = Nt8.new
Test cases in tests/tcg/hexagon/circ.c
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1617930474-31979-23-git-send-email-tsimpson@quicinc.com>
[rth: Squash <1619667142-29636-1-git-send-email-tsimpson@quicinc.com>
removing gen_read_reg and gen_set_byte to avoid clang Werror.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Rdd32 = add(Rss32, Rtt32, Px4):carry
Add with carry
Rdd32 = sub(Rss32, Rtt32, Px4):carry
Sub with carry
Test cases in tests/tcg/hexagon/multi_result.c
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1617930474-31979-22-git-send-email-tsimpson@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Rdd32,Pe4 = vminub(Rtt32, Rss32)
Vector min of bytes
Test cases in tests/tcg/hexagon/multi_result.c
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1617930474-31979-21-git-send-email-tsimpson@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Rxx32,Pe4 = vacsh(Rss32, Rtt32)
Add compare and select elements of two vectors
Test cases in tests/tcg/hexagon/multi_result.c
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1617930474-31979-20-git-send-email-tsimpson@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Rd32,Pe4 = sfinvsqrta(Rs32)
Square root approx
The helper packs the 2 32-bit results into a 64-bit value,
and the fGEN_TCG override unpacks them into the proper results.
Test cases in tests/tcg/hexagon/multi_result.c
FP exception tests added to tests/tcg/hexagon/fpstuff.c
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1617930474-31979-19-git-send-email-tsimpson@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Rd32,Pe4 = sfrecipa(Rs32, Rt32)
Recripocal approx
Test cases in tests/tcg/hexagon/multi_result.c
FP exception tests added to tests/tcg/hexagon/fpstuff.c
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1617930474-31979-18-git-send-email-tsimpson@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reported-by: Richard Henderson <<richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <1615784100-26459-1-git-send-email-tsimpson@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
The imported code uses host floating point. We override them
to use qemu softfloat
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1612763186-18161-29-git-send-email-tsimpson@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Helpers won't work if there are multiple definitions, so we override these
instructions using #define fGEN_TCG_<tag>.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <1612763186-18161-28-git-send-email-tsimpson@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|