Age | Commit message (Collapse) | Author |
|
|
|
Also rename AArch64 intrinsics to ARMv8 intrinsics
as these are not necessarily limited to 64-bit
|
|
|
|
-BEGIN VERIFY SCRIPT-
./contrib/devtools/copyright_header.py update ./
-END VERIFY SCRIPT-
|
|
|
|
|
|
|
|
66b2cf1ccfad545a8ec3f2a854e23f647322bf30 Use immintrin.h everywhere for intrinsics (Pieter Wuille)
4c935e2eee456ff66cdfb908b0edffdd1e8a6c04 Add SHA256 implementation using using Intel SHA intrinsics (Pieter Wuille)
268400d3188200c9e3dcd3482c4853354388a721 [Refactor] CPU feature detection logic for SHA256 (Pieter Wuille)
Pull request description:
Based on #13191.
This adds SHA256 implementations that use Intel's SHA Extension instructions (using intrinsics). This needs GCC 4.9 or Clang 3.4.
In addition to #13191, two extra implementations are provided:
* (a) A variable-length SHA256 implementation using SHA extensions.
* (b) A 2-way 64-byte input double-SHA256 implementation using SHA extensions.
Benchmarks for 9001-element Merkle tree root computation on an AMD Ryzen 1800X system:
* Using generic C++ code (pre-#10821): 6.1ms
* Using SSE4 (master, #10821): 4.6ms
* Using 4-way SSE4 specialized for 64-byte inputs (#13191): 2.8ms
* Using 8-way AVX2 specialized for 64-byte inputs (#13191): 2.1ms
* Using 2-way SHA-NI specialized for 64-byte inputs (this PR): 0.56ms
Benchmarks for 32-byte SHA256 on the same system:
* Using SSE4 (master, #10821): 190ns
* Using SHA-NI (this PR): 53ns
Benchmarks for 1000000-byte SHA256 on the same system:
* Using SSE4 (master, #10821): 2.5ms
* Using SHA-NI (this PR): 0.51ms
Tree-SHA512: 2b319e33b22579f815d91f9daf7994a5e1e799c4f73c13e15070dd54ba71f3f6438ccf77ae9cbd1ce76f972d9cbeb5f0edfea3d86f101bbc1055db70e42743b7
|
|
|
|
|
|
|
|
32d153fa360f73b4999701b97d55b12318fd2659 For AVX2 code, also check for AVX, XSAVE, and OS support (Pieter Wuille)
Pull request description:
Fixes #12903.
Tree-SHA512: 01e71efb5d3a43c49a145a5b1dc4fe7d0a491e1e78479e7df830a2aaac57c3dcfc316e28984c695206c76f93b68e4350fc037ca36756ca579b7070e39c835da2
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
-BEGIN VERIFY SCRIPT-
for f in \
src/*.cpp \
src/*.h \
src/bench/*.cpp \
src/bench/*.h \
src/compat/*.cpp \
src/compat/*.h \
src/consensus/*.cpp \
src/consensus/*.h \
src/crypto/*.cpp \
src/crypto/*.h \
src/crypto/ctaes/*.h \
src/policy/*.cpp \
src/policy/*.h \
src/primitives/*.cpp \
src/primitives/*.h \
src/qt/*.cpp \
src/qt/*.h \
src/qt/test/*.cpp \
src/qt/test/*.h \
src/rpc/*.cpp \
src/rpc/*.h \
src/script/*.cpp \
src/script/*.h \
src/support/*.cpp \
src/support/*.h \
src/support/allocators/*.h \
src/test/*.cpp \
src/test/*.h \
src/wallet/*.cpp \
src/wallet/*.h \
src/wallet/test/*.cpp \
src/wallet/test/*.h \
src/zmq/*.cpp \
src/zmq/*.h
do
base=${f%/*}/ relbase=${base#src/} sed -i "s:#include \"\(.*\)\"\(.*\):if test -e \$base'\\1'; then echo \"#include <\"\$relbase\"\\1>\\2\"; else echo \"#include <\\1>\\2\"; fi:e" $f
done
-END VERIFY SCRIPT-
|
|
Now that 0.15 is branched off, enable assembler SHA256 optimizations by default.
|
|
|
|
|
|
|
|
|
|
Extracted from a patch by Wladimir van der Laan.
|
|
Github-Pull: #5494
Rebased-From: 15de949bb9277e442302bdd8dee299a8d6deee60
|
|
|