author | Emilio G. Cota <cota@braap.org> | 2018-03-17 01:17:30 -0400 |
---|---|---|
committer | Alex Bennée <alex.bennee@linaro.org> | 2018-12-17 08:25:25 +0000 |
commit | ccf770ba7396c240ca8a1564740083742dd04c08 (patch) | |
tree | 51ea3f1279259942c56df222e2a4e10eb71b7b22 /include/crypto | |
parent | 4a6295613f533a6841de5968c50e1ca36748807e (diff) |
hardfloat: implement float32/64 fused multiply-add
Performance results for fp-bench:
1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
   - before:
     fma-single: 74.73 MFlops
     fma-double: 74.54 MFlops
   - after:
     fma-single: 203.37 MFlops
     fma-double: 169.37 MFlops
2. ARM Aarch64 A57 @ 2.4GHz
   - before:
     fma-single: 23.24 MFlops
     fma-double: 23.70 MFlops
   - after:
     fma-single: 66.14 MFlops
     fma-double: 63.10 MFlops
3. IBM POWER8E @ 2.1 GHz
   - before:
     fma-single: 37.26 MFlops
     fma-double: 37.29 MFlops
   - after:
     fma-single: 48.90 MFlops
     fma-double: 59.51 MFlops
Here having 3FP64 set to 1 pays off for x86_64:
[1] 170.15 vs [0] 153.12 MFlops
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Diffstat (limited to 'include/crypto')
0 files changed, 0 insertions, 0 deletions