Age | Commit message (Collapse) | Author |
|
The code reorganization in commit 4a6fd938 broke handling of PREFIX_ADR.
While fixing this, tidy and comment the code so that it's more obvious
what's going on in setting both aflag and dflag.
The TARGET_X86_64 ifdef can be eliminated because CODE64 expands to the
constant zero when TARGET_X86_64 is undefined.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1369855851-21400-1-git-send-email-rth@twiddle.net
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
|
|
Fix EFLAGS corruption by ROR r8/r16 imm instruction located at the end
of the TB, similarly to commit 089305ac for the non-immediate case.
Reported-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
|
|
This replaces the feature-bit fields on both X86CPU and x86_def_t
structs with an array.
With this, we will be able to simplify code that simply does the same
operation on all feature words (e.g. kvm_check_features_against_host(),
filter_features_for_kvm(), add_flagname_to_bitmaps(), CPU feature-bit
property lookup/registration, and the proposed "feature-words" property)
The following field replacements were made on X86CPU and x86_def_t:
(cpuid_)features -> features[FEAT_1_EDX]
(cpuid_)ext_features -> features[FEAT_1_ECX]
(cpuid_)ext2_features -> features[FEAT_8000_0001_EDX]
(cpuid_)ext3_features -> features[FEAT_8000_0001_ECX]
(cpuid_)ext4_features -> features[FEAT_C000_0001_EDX]
(cpuid_)kvm_features -> features[FEAT_KVM]
(cpuid_)svm_features -> features[FEAT_SVM]
(cpuid_)7_0_ebx_features -> features[FEAT_7_0_EBX]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
|
|
Fixed EFLAGS corruption by ROR r8/r16 instruction located at the end of the TB.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
|
|
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
|
|
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
|
|
gen_op_mov_TN_reg() loads the value in cpu_T[0], so this temporary should
be used instead of cpu_tmp0.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
|
|
When starting from CC_OP_DYNAMIC, and issuing adox before adcx,
a typo used the wrong value for the resulting CC_OP.
Cc: Blue Swirl <blauwirbel@gmail.com>
Reported-by: Torbjorn Granlund <tg@gmplib.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
Fix various typos and misspellings. The bulk of these were found with
codespell.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
The gen_icount_start/end functions are now somewhat misnamed since they
are useful for generic "start/end of TB" code, used for more than just
icount. Rename them to gen_tb_start/end.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
These correspond very closely to the insns that we're emulating.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
With this being all straight-line code, it can get deleted
when the cc variables die.
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
The shift and rotate insns use movcond to set CC_OP, and thus
achieve a conditional EFLAGS setting. By discarding CC_OP in
a later flags setting insn, we can discard that movcond.
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
With this being all straight-line code, it can get deleted
when the cc variables die.
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
With this being all straight-line code, it can get deleted
when the cc variables die.
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Special case xor with self. We need not even store the known
zero into cc_src.
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
We weren't computing flags for lzcnt at all. At the same time,
adjust the implementation of bsf/bsr to avoid the local branch,
using movcond instead.
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Do all of group 17 at one time for ease.
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
As this is the first of the BMI insns to be implemented,
this carries quite a bit more baggage than normal.
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
No actual required uses of these encodings yet.
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Avoid duplicating switch statement between 32 and 64-bit modes.
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Add another slot in ENV and store two of the three inputs. This lets us
do less work when carry-out is not needed, and avoids the unpredictable
CC_OP after translating these insns.
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Pass the data in explicitly, rather than indirectly via env.
This avoids all sorts of unnecessary register spillage.
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
After a comparison or subtraction, the original value of the LHS will
currently be reconstructed using an addition. However, in most cases
it is already available: store it in a temp-local variable and save 1
or 2 TCG ops (2 if the result of the addition needs to be extended).
The temp-local can be declared dead as soon as the cc_op changes again,
or also before the translation block ends because gen_prepare_cc will
always make a copy before returning it. All this magic, plus copy
propagation and dead-code elimination, ensures that the temp local will
(almost) never be spilled.
Example (cmp $0x21,%rax + jbe):
Before After
----------------------------------------------------------------------------
movi_i64 tmp1,$0x21 movi_i64 tmp1,$0x21
movi_i64 cc_src,$0x21 movi_i64 cc_src,$0x21
sub_i64 cc_dst,rax,tmp1 sub_i64 cc_dst,rax,tmp1
add_i64 tmp7,cc_dst,cc_src
movi_i32 cc_op,$0x11 movi_i32 cc_op,$0x11
brcond_i64 tmp7,cc_src,leu,$0x0 discard loc11
brcond_i64 rax,cc_src,leu,$0x0
Before After
----------------------------------------------------------------------------
mov (%r14),%rbp mov (%r14),%rbp
mov %rbp,%rbx mov %rbp,%rbx
sub $0x21,%rbx sub $0x21,%rbx
lea 0x21(%rbx),%r12
movl $0x11,0xa0(%r14) movl $0x11,0xa0(%r14)
movq $0x21,0x90(%r14) movq $0x21,0x90(%r14)
mov %rbx,0x98(%r14) mov %rbx,0x98(%r14)
cmp $0x21,%r12 | cmp $0x21,%rbp
jbe ... jbe ...
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Placing the CC_OP_DYNAMIC at the join is less effective than
before the branch, as the branch will have forced global registers
to their home locations. This way we have a chance to discard
CC_SRC2 before it gets stored.
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
A jump that ends a basic block or otherwise falls back to CC_OP_DYNAMIC
will always have to call gen_op_set_cc_op. However, not all jumps end
a basic block, so introduce a variant that does not do this.
This was partially undone earlier (i386: drop cc_op argument of gen_jcc1),
redo it now also to prepare for the introduction of src2.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Replace low-level ops with a higher-level "cmp %al, (A0)" in the case
of scas, and "cmp T0, (A0)" in the case of cmps.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
It is almost unused, and it is simpler to pass a TCG value directly
to gen_shiftd_rm_T1_T3. This value is then written to t2 without
going through a temporary register.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
This simplifies all the jump generation code. CCPrepare allows the
code to create an efficient brcond always, so there is no need to
duplicate the setcc and jcc code.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
This makes the i386 front-end able to create CCPrepare structs for all
condition, not just those that come from a single flag. In particular,
JCC_L and JCC_LE can be optimized because gen_prepare_cc is not forced
to return a result in bit 0 (unlike gen_setcc_slow).
However, for now the slow jcc operations will still go through CC
computation in a single-bit temporary, followed by a brcond if the
temporary is nonzero.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Introduce a struct that describes how to build a *cond operation
that checks for a given x86 condition code. For now, just change
gen_compute_eflags_* to return the new struct, generate code for
the CCPrepare struct, and go on as before.
[rth: Use ctz with the proper width rather than ffs.]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Reconstruct the arguments for complex conditions involving CC_OP_SUBx (BE,
L, LE). In the others do it via setcond and gen_setcc_slow (which is
not that slow in many cases).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
And allow gen_setcc_slow to operate on cpu_cc_src.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
This is looking at EFLAGS, but it can do so more efficiently with
setcond.
Reviewed-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Do not hard code the destination register.
Reviewed-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Do the switch at translation time, converting the helper templates to
TCG opcodes. In some cases CF can be computed with a single setcond,
though others it may require a little more work.
In the CC_OP_DYNAMIC case, compute the whole EFLAGS, same as for ZF/SF/PF.
Reviewed-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Make gen_compute_eflags_z and gen_compute_eflags_s able to compute the
inverted condition, and use this in gen_setcc_slow_T0. We cannot do it
yet in gen_compute_eflags_c, but prepare the code for it anyway. It is
not worthwhile for PF, as usual.
shr+and+xor could be replaced by and+setcond. I'm not doing it yet.
Reviewed-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|