Age | Commit message (Collapse) | Author |
|
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
For the normal case of no additional scaling, this reduces the
profile contribution of int64_to_float64 to the testcase in the
linked issue from 0.81% to 0.04%.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/134
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Rename to parts$N_modrem. This was the last use of a lot
of the legacy infrastructure, so remove it as required.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Rename to parts$N_log2. Though this is partly a ruse, since I do not
believe the code will succeed for float128 without work. Which is ok
for now, because we do not need this for more than float32 and float64.
Since berkeley-testfloat-3 doesn't support log2, compare float64_log2
vs the system log2. Fix the errors for inputs near 1.0:
test: 3ff00000000000b0 +0x1.00000000000b0p+0
sf: 3d2fa00000000000 +0x1.fa00000000000p-45
libm: 3d2fbd422b1bd36f +0x1.fbd422b1bd36fp-45
Error in fraction: 32170028290927 ulp
test: 3feec24f6770b100 +0x1.ec24f6770b100p-1
sf: bfad3740d13c9ec0 -0x1.d3740d13c9ec0p-5
libm: bfad3740d13c9e98 -0x1.d3740d13c9e98p-5
Error in fraction: 40 ulp
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Keep the intermediate results in FloatParts instead of
converting back and forth between float64. Use muladd
instead of separate mul+add.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
This is the last use of commonNaNT and all of the routines
that use it, so remove all of them for Werror.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Since this is the first such, this includes all of the
packing and unpacking routines as well.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
With floatx80_precision_x, the rounding happens across
the break between words. Notice this case with
frac_lsb = round_mask + 1 -> 0
and check the bits in frac_hi as needed.
In addition, since frac_shift == 0, we won't implicitly clear
round_mask via the right-shift, so explicitly clear those bits.
This fixes rounding for floatx80_precision_[sd].
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Use an enumeration instead of raw 32/64/80 values.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Remove frac_lsb, frac_lsbm1, roundeven_mask. Compute
these from round_mask in parts$N_uncanon_normal.
With floatx80, round_mask will not be tied to frac_shift.
Everything else is easily computable.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
We will need to treat the non-normal cases of floatx80 specially,
so split out the normal case that we can reuse.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Rename to parts$N_sqrt.
Reimplement float128_sqrt with FloatParts128.
Reimplement with the inverse sqrt newton-raphson algorithm from musl.
This is significantly faster than even the berkeley sqrt n-r algorithm,
because it does not use division instructions, only multiplication.
Ordinarily, changing algorithms at the same time as migrating code is
a bad idea, but this is the only way I found that didn't break one of
the routines at the same time.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Rename to parts$N_scalbn.
Reimplement float128_scalbn with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Rename to parts$N_compare. Rename all of the intermediate
functions to ftype_do_compare. Rename the hard-float functions
to ftype_hs_compare. Convert float128 to FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
The float128 implementation is straight-forward.
Unfortuantely, we don't have any tests we can simply adjust/unlock.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210517142739.38597-24-david@redhat.com>
[rth: Update for changed parts_minmax return value]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Rename to parts$N_minmax. Combine 3 bool arguments to a bitmask.
Introduce ftype_minmax functions as a common optimization point.
Fold bfloat16 expansions into the same macro as the other types.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Rename to parts$N_uint_to_float.
Reimplement uint64_to_float128 with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Rename to parts$N_sint_to_float.
Reimplement int{32,64}_to_float128 with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Rename to parts$N_float_to_uint. Reimplement
float128_to_uint{32,64}{_round_to_zero} with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
For Arm BFDOT and BFMMLA, we need a version of round-to-odd
that overflows to infinity, instead of the max normal number.
Cc: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210525225817.400336-6-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Rename to parts$N_float_to_sint. Reimplement
float128_to_int{32,64}{_round_to_zero} with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
At the same time, convert to pointers, split out
parts$N_round_to_int_normal, define a macro for
parts_round_to_int using QEMU_GENERIC.
This necessarily meant some rearrangement to the
rount_to_{,u}int_and_pack routines, so go ahead and
convert to parts_round_to_int_normal, which in turn
allows cleaning up of the raised exception handling.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Introduce parts_float_to_float_widen and parts_float_to_float_narrow.
Use them for float128_to_float{32,64} and float{32,64}_to_float128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Split out parts_float_to_ahp and parts_float_to_float.
Convert to pointers.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Rename to parts$N_div.
Implement float128_div with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Have x86_64 assembly for them, with a fallback.
This avoids shuffling values through %cl in the x86 case.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Rename to parts$N_muladd.
Implement float128_muladd with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Rename to parts$N_mul.
Reimplement float128_mul with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Replace the existing Berkeley implementation with the
FloatParts implementation.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
In preparation for implementing multiple sizes. Rename to parts_addsub,
split out parts_add/sub_normal for future reuse with muladd.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
At the same time, convert to pointers, renaming to parts$N_uncanon,
and define a macro for parts_uncanon using QEMU_GENERIC.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
At the same time, convert to pointers, rename to parts$N_canonicalize
and define a macro for parts_canonicalize using QEMU_GENERIC.
Rearrange the cases to recognize float_class_normal as
early as possible.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
At the same time, convert to pointers, rename to pick_nan_muladd$N
and define a macro for pick_nan_muladd using QEMU_GENERIC.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
At the same time, convert to pointers, rename to parts$N_pick_nan
and define a macro for parts_pick_nan using QEMU_GENERIC.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
At the same time, convert to pointers, rename to return_nan$N
and define a macro for return_nan using QEMU_GENERIC.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
This is the minimal change that also introduces float128_params,
float128_unpack_raw, and float128_pack_raw without running into
unused symbol Werrors.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Shuffle the fraction to the end, otherwise sort by size.
Add frac_hi and frac_lo members to alias frac.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
At the same time, rename to parts64_silence_nan, split out
parts_silence_nan_frac, and define a macro for parts_silence_nan.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|
|
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
|