Age | Commit message | Author |
|
Co-authored-by: Peter Yordanov <ppyordanov@yahoo.com>
|
This replaces the current benchmarking framework with nanobench [1], an
MIT-licensed single-header benchmarking library, of which I am the
author. In my opinion this has several advantages, especially on Linux:
* fast: Running all benchmarks takes ~6 seconds instead of 4m13s on
  an Intel i7-8700 CPU @ 3.20GHz.
* accurate: I ran e.g. the benchmark for SipHash_32b 10 times and
  calculated the coefficient of variation (standard deviation / mean):
  * 0.57% CV for the old benchmarking framework
  * 0.20% CV for nanobench
  So the benchmark results with nanobench seem to vary less than with
  the old framework.
* It automatically determines the runtime based on clock precision; there
  is no need to specify the number of evaluations.
* It measures instructions, cycles, branches, instructions per cycle, and
  branch misses (Linux only, when performance counters are available).
* Output is in Markdown table format.
* It warns about an unstable environment (frequency scaling, turbo, ...).
* For better profiling, the environment variable NANOBENCH_ENDLESS can be
  set to force endless running of a particular benchmark without the need
  to recompile. This makes it easy to e.g. run "perf top" and look at
  hotspots.
Here is an example, copy-pasted from the terminal output:
| ns/byte | byte/s | err% | ins/byte | cyc/byte | IPC | bra/byte | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 2.52 | 396,529,415.94 | 0.6% | 25.42 | 8.02 | 3.169 | 0.06 | 0.0% | 0.03 | `bench/crypto_hash.cpp RIPEMD160`
| 1.87 | 535,161,444.83 | 0.3% | 21.36 | 5.95 | 3.589 | 0.06 | 0.0% | 0.02 | `bench/crypto_hash.cpp SHA1`
| 3.22 | 310,344,174.79 | 1.1% | 36.80 | 10.22 | 3.601 | 0.09 | 0.0% | 0.04 | `bench/crypto_hash.cpp SHA256`
| 2.01 | 496,375,796.23 | 0.0% | 18.72 | 6.43 | 2.911 | 0.01 | 1.0% | 0.00 | `bench/crypto_hash.cpp SHA256D64_1024`
| 7.23 | 138,263,519.35 | 0.1% | 82.66 | 23.11 | 3.577 | 1.63 | 0.1% | 0.00 | `bench/crypto_hash.cpp SHA256_32b`
| 3.04 | 328,780,166.40 | 0.3% | 35.82 | 9.69 | 3.696 | 0.03 | 0.0% | 0.03 | `bench/crypto_hash.cpp SHA512`
[1] https://github.com/martinus/nanobench
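For reference, here is a minimal sketch of a benchmark written directly against nanobench's public API (the PR keeps the existing BENCHMARK macro on top of this; the workload below is made up for illustration):
```
// Minimal standalone nanobench usage sketch (illustrative workload).
#define ANKERL_NANOBENCH_IMPLEMENT // emit nanobench's implementation in this translation unit
#include <nanobench.h>

#include <cstdint>

int main()
{
    uint64_t x = 123456789;
    ankerl::nanobench::Bench().run("xorshift64", [&] {
        // cheap deterministic work to measure
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        ankerl::nanobench::doNotOptimizeAway(x);
    });
}
```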
* Adds support for asymptotes
This adds support for calculating the asymptotic complexity of a
benchmark. This is similar to #17375, but currently only one asymptote
is supported, and I have added support in the benchmark `ComplexMemPool`
as an example.
Usage is e.g. like this:
```
./bench_bitcoin -filter=ComplexMemPool -asymptote=25,50,100,200,400,600,800
```
This runs the benchmark `ComplexMemPool` several times, each time with a
different `complexityN` setting. The benchmark can extract that number
and use it accordingly; here, it is used for `childTxs` (see the sketch
after the output below). The output is this:
| complexityN | ns/op | op/s | err% | ins/op | cyc/op | IPC | total | benchmark
|------------:|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|----------:|:----------
| 25 | 1,064,241.00 | 939.64 | 1.4% | 3,960,279.00 | 2,829,708.00 | 1.400 | 0.01 | `ComplexMemPool`
| 50 | 1,579,530.00 | 633.10 | 1.0% | 6,231,810.00 | 4,412,674.00 | 1.412 | 0.02 | `ComplexMemPool`
| 100 | 4,022,774.00 | 248.58 | 0.6% | 16,544,406.00 | 11,889,535.00 | 1.392 | 0.04 | `ComplexMemPool`
| 200 | 15,390,986.00 | 64.97 | 0.2% | 63,904,254.00 | 47,731,705.00 | 1.339 | 0.17 | `ComplexMemPool`
| 400 | 69,394,711.00 | 14.41 | 0.1% | 272,602,461.00 | 219,014,691.00 | 1.245 | 0.76 | `ComplexMemPool`
| 600 | 168,977,165.00 | 5.92 | 0.1% | 639,108,082.00 | 535,316,887.00 | 1.194 | 1.86 | `ComplexMemPool`
| 800 | 310,109,077.00 | 3.22 | 0.1% |1,149,134,246.00 | 984,620,812.00 | 1.167 | 3.41 | `ComplexMemPool`
| coefficient | err% | complexity
|--------------:|-------:|------------
| 4.78486e-07 | 4.5% | O(n^2)
| 6.38557e-10 | 21.7% | O(n^3)
| 3.42338e-05 | 38.0% | O(n log n)
| 0.000313914 | 46.9% | O(n)
| 0.0129823 | 114.4% | O(log n)
| 0.0815055 | 133.8% | O(1)
The best fitting curve is O(n^2), so the algorithm seems to scale
quadratically with `childTxs` in the range 25 to 800.
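As an illustration, here is a rough sketch of how a benchmark can pick up the -asymptote value, assuming nanobench's `complexityN()` getter; the workload is a made-up stand-in, not the real `ComplexMemPool` code:
```
#include <nanobench.h>

#include <cstdint>

// Hypothetical stand-in for the real mempool workload.
static void BuildMempoolWith(int child_txs)
{
    volatile uint64_t sink = 0;
    for (int i = 0; i < child_txs; ++i) sink += static_cast<uint64_t>(i) * i;
}

static void ComplexMemPoolLike(ankerl::nanobench::Bench& bench)
{
    int childTxs = 800; // default when -asymptote is not given
    if (bench.complexityN() > 1) childTxs = static_cast<int>(bench.complexityN());

    bench.run("ComplexMemPoolLike", [&] { BuildMempoolWith(childTxs); });
}
```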
|
-BEGIN VERIFY SCRIPT-
./contrib/devtools/copyright_header.py update ./
-END VERIFY SCRIPT-
|
|
e6e622e5a0e22c2ac1b50b96af818e412d67ac54 Implement O(1) OP_IF/NOTIF/ELSE/ENDIF logic (Pieter Wuille)
d0e8f4d5d8ddaccb37f98b7989fb944081e41ab8 [refactor] interpreter: define interface for vfExec (Anthony Towns)
89fb241c54fc85befacfa3703d8e21bf3b8a76eb Benchmark script verification with 100 nested IFs (Pieter Wuille)
Pull request description:
While investigating what mechanisms are possible to maximize the per-opcode verification cost of scripts, I noticed that the logic for determining whether a particular opcode is to be executed is O(n) in the nesting depth. This issue was also pointed out by Sergio Demian Lerner in https://bitslog.wordpress.com/2017/04/17/new-quadratic-delays-in-bitcoin-scripts/, and this PR implements a variant of the O(1) algorithm suggested there.
This is not a problem currently, because even with a nesting depth of 100 (the maximum possible right now due to the 201 ops limit), the slowdown caused by this on my machine is around 70 ns per opcode (or 0.25 s per block) at worst, far lower than what is possible with other opcodes.
This PR mostly serves as a proof of concept that it's possible to avoid it, which may be relevant in discussions around increasing the opcode limits in future script versions. Without it, the execution time of scripts can grow quadratically with the nesting depth, which very quickly becomes unreasonable.
This improves upon #14245 by completely removing the `vfExec` vector.
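For context, here is a compact sketch of the O(1) technique; the PR introduces a small condition-stack class along these lines (details may differ from the actual code):
```
#include <cstdint>
#include <limits>

// Instead of scanning a vector<bool> of nested conditions on every opcode,
// track only the stack size and the position of the first "false" entry;
// execution only needs to know whether *all* entries are true.
class ConditionStack
{
    static constexpr uint32_t NO_FALSE = std::numeric_limits<uint32_t>::max();
    uint32_t m_stack_size = 0;
    uint32_t m_first_false_pos = NO_FALSE;

public:
    bool empty() const { return m_stack_size == 0; }
    bool all_true() const { return m_first_false_pos == NO_FALSE; }
    void push_back(bool f) // OP_IF / OP_NOTIF
    {
        if (m_first_false_pos == NO_FALSE && !f) m_first_false_pos = m_stack_size;
        ++m_stack_size;
    }
    void pop_back() // OP_ENDIF
    {
        --m_stack_size;
        if (m_first_false_pos == m_stack_size) m_first_false_pos = NO_FALSE;
    }
    void toggle_top() // OP_ELSE
    {
        if (m_first_false_pos == NO_FALSE) {
            m_first_false_pos = m_stack_size - 1; // top was true, becomes false
        } else if (m_first_false_pos == m_stack_size - 1) {
            m_first_false_pos = NO_FALSE; // top was the only relevant false, becomes true
        }
        // otherwise a deeper entry is already false; the top's value is irrelevant
    }
};
```
Every operation above is O(1), so script execution stays linear in the number of opcodes regardless of nesting depth.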
ACKs for top commit:
jnewbery:
Code review ACK e6e622e5a0e22c2ac1b50b96af818e412d67ac54
MarcoFalke:
ACK e6e622e5a0e22c2ac1b50b96af818e412d67ac54 🐴
fjahr:
ACK e6e622e5a0e22c2ac1b50b96af818e412d67ac54
ajtowns:
ACK e6e622e5a0e22c2ac1b50b96af818e412d67ac54
laanwj:
concept and code review ACK e6e622e5a0e22c2ac1b50b96af818e412d67ac54
jonatack:
ACK e6e622e5a0e22c2ac1b50b96af818e412d67ac54 code review, build, benches, fuzzing
Tree-SHA512: 1dcfac3411ff04773de461959298a177f951cb5f706caa2734073bcec62224d7cd103767cfeef85cd129813e70c14c74fa8f1e38e4da70ec38a0f615aab1f7f7
|
|
-BEGIN VERIFY SCRIPT-
./contrib/devtools/copyright_header.py update ./
-END VERIFY SCRIPT-
|
-BEGIN VERIFY SCRIPT-
# Move files
for f in $(git ls-files src/test/lib/); do git mv $f src/test/util/; done
git mv src/test/setup_common.cpp src/test/util/
git mv src/test/setup_common.h src/test/util/
# Replace Windows paths
sed -i -e 's|\\setup_common|\\util\\setup_common|g' $(git grep -l '\\setup_common')
sed -i -e 's|src\\test\\lib\\|src\\test\\util\\|g' build_msvc/test_bitcoin/test_bitcoin.vcxproj
# Everything else
sed -i -e 's|/setup_common|/util/setup_common|g' $(git grep -l 'setup_common')
sed -i -e 's|test/lib/|test/util/|g' $(git grep -l 'test/lib/')
# Fix include guard
sed -i -e 's|BITCOIN_TEST_SETUP_COMMON_H|BITCOIN_TEST_UTIL_SETUP_COMMON_H|g' ./src/test/util/setup_common.h
sed -i -e 's|BITCOIN_TEST_LIB_|BITCOIN_TEST_UTIL_|g' $(git grep -l 'BITCOIN_TEST_LIB_')
-END VERIFY SCRIPT-
|
|
prototypes used in src/test/script_tests.cpp:
- CMutableTransaction BuildCreditingTransaction(const CScript& scriptPubKey, int nValue = 0);
- CMutableTransaction BuildSpendingTransaction(const CScript& scriptSig, const CScriptWitness& scriptWitness, const CTransaction& txCredit);
prototypes used in bench/verify_script.cpp:
- CMutableTransaction BuildCreditingTransaction(const CScript& scriptPubKey);
- CMutableTransaction BuildSpendingTransaction(const CScript& scriptSig, const CMutableTransaction& txCredit);
The more generic versions from the script tests are moved into a new file pair
transaction_utils.cpp/h and the calls are adapted accordingly in the
verify_script benchmark (passing the nValue of 1 explicitly to
BuildCreditingTransaction(), passing an empty scriptWitness explicitly, and
converting the txCredit parameter to CTransaction in BuildSpendingTransaction()).
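A sketch of what the adapted call sites in the verify_script benchmark look like after the move (header path as described above; surrounding benchmark code omitted):
```
#include <test/util/transaction_utils.h> // BuildCreditingTransaction, BuildSpendingTransaction

void ExampleCallSites(const CScript& scriptPubKey, const CScript& scriptSig)
{
    // nValue of 1 is now passed explicitly.
    CMutableTransaction txCredit = BuildCreditingTransaction(scriptPubKey, 1);
    // Empty witness passed explicitly; txCredit converted to CTransaction.
    CMutableTransaction txSpend =
        BuildSpendingTransaction(scriptSig, CScriptWitness(), CTransaction(txCredit));
    (void)txSpend;
}
```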
|
e306be742932d4ea5aca0ea4768e54b2fc3dc6a0 Use 72 byte dummy signatures when watching only inputs may be used (Andrew Chow)
48b1473c898129a99212e2db36c61cf93625ea17 Use 71 byte signature for DUMMY_SIGNATURE_CREATOR (Andrew Chow)
18dfea0dd082af18dfb02981b7ee1cd44d514388 Always create 70 byte signatures with low R values (Andrew Chow)
Pull request description:
When creating signatures for transactions, always make one which has a 32 byte or smaller R and 32 byte or smaller S value. This results in signatures that are always at most 71 bytes (32 byte R + 32 byte S + 6 bytes DER + 1 byte sighash) with low R values. In most cases, the signature will be exactly 71 bytes.
Because R is not mutable in the same way that S is, a low R value can only be found by trying different nonces. RFC 6979 for deterministic nonce generation has the option to specify additional entropy, so we simply use that and add a uint32_t counter which we increment in order to try different nonces. Nonces are still deterministically generated, as the nonce used will be the first one where the counter results in a low R value. Because different nonces need to be tried, the time to produce a signature does increase: on average it takes twice as long, since two signatures need to be created, on average, to find one with a low R.
Having a fixed size signature makes size calculations easier and also saves half a byte of transaction size, on average.
DUMMY_SIGNATURE_CREATOR has been modified to produce 71 byte dummy signatures instead of 72 byte signatures.
Tree-SHA512: 3cd791505126ce92da7c631856a97ba0b59e87d9c132feff6e0eef1dc47768e81fbb38bfbe970371bedf9714b7f61a13a5fe9f30f962c81734092a4d19a4ef33
|
|
When extra entropy is not specified by the caller, CKey::Sign will
now always create a signature that has a low R value and is at most
70 bytes. The resulting signature on the stack will be 71 bytes when
the sighash byte is included.
Using low R signatures means that the resulting DER encoded signature
will never need to have additional padding to account for high R
values.
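For illustration, here is a rough sketch of the grinding idea written against the raw libsecp256k1 API (this is not the PR's code; the real CKey::Sign uses Bitcoin's own signing context and a little-endian write helper for the counter):
```
#include <secp256k1.h>

#include <cstdint>
#include <cstring>

// A signature has a "low R" when the big-endian R value's top bit is clear,
// so its DER encoding needs no extra padding byte.
static bool SigHasLowR(const secp256k1_context* ctx, const secp256k1_ecdsa_signature* sig)
{
    unsigned char compact[64];
    secp256k1_ecdsa_signature_serialize_compact(ctx, compact, sig);
    return compact[0] < 0x80;
}

bool SignWithLowR(const secp256k1_context* ctx, const unsigned char* hash32,
                  const unsigned char* seckey32, secp256k1_ecdsa_signature* sig)
{
    int ret = secp256k1_ecdsa_sign(ctx, sig, hash32, seckey32,
                                   secp256k1_nonce_function_rfc6979, nullptr);
    unsigned char extra_entropy[32] = {0};
    uint32_t counter = 0;
    // Grind: retry with different RFC6979 extra entropy until R is low.
    while (ret && !SigHasLowR(ctx, sig)) {
        ++counter;
        std::memcpy(extra_entropy, &counter, sizeof(counter)); // real code writes little-endian
        ret = secp256k1_ecdsa_sign(ctx, sig, hash32, seckey32,
                                   secp256k1_nonce_function_rfc6979, extra_entropy);
    }
    return ret == 1;
}
```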
|
In addition to having the scriptSig and scriptWitness, have SignatureData
also be able to store just the signatures (pubkeys mapped to sigs) and
scripts (script ids mapped to scripts).
Also have DataFromTransaction be able to extract signatures and scripts
from the scriptSig and scriptWitness of an input to put them in SignatureData.
Adds a new SignatureChecker which takes a SignatureData and puts pubkeys
and signatures into it when it successfully verifies a signature.
Adds a new field in SignatureData which stores whether the SignatureData
is complete. This allows us to also update the scriptSig and
scriptWitness to their final form when combining one SignatureData with
another.
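A compact sketch of the extended SignatureData described above, in terms of existing Core types (field names and header paths are approximate, not copied from the commit):
```
#include <pubkey.h>          // CPubKey, CKeyID
#include <script/script.h>   // CScript, CScriptWitness
#include <script/standard.h> // CScriptID

#include <map>
#include <utility>
#include <vector>

typedef std::pair<CPubKey, std::vector<unsigned char>> SigPair; // pubkey and its signature

struct SignatureData {
    bool complete = false;                // whether scriptSig/scriptWitness are final
    CScript scriptSig;                    // partial or final scriptSig
    CScriptWitness scriptWitness;         // partial or final witness stack
    std::map<CKeyID, SigPair> signatures; // signatures keyed by pubkey hash
    std::map<CScriptID, CScript> scripts; // redeem/witness scripts keyed by script hash
};
```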
|
-BEGIN VERIFY SCRIPT-
sed -i 's/enum DBErrors/enum class DBErrors/g' src/wallet/walletdb.h
git grep -l DB_ | xargs sed -i 's/DB_\(LOAD_OK\|CORRUPT\|NONCRITICAL_ERROR\|TOO_NEW\|LOAD_FAIL\|NEED_REWRITE\)/DBErrors::\1/g'
sed -i 's/^ DBErrors::/ /g' src/wallet/walletdb.h
sed -i 's/enum VerifyResult/enum class VerifyResult/g' src/wallet/db.h
sed -i 's/\(VERIFY_OK\|RECOVER_OK\|RECOVER_FAIL\)/VerifyResult::\1/g' src/wallet/db.cpp
sed -i 's/enum ThresholdState/enum class ThresholdState/g' src/versionbits.h
git grep -l THRESHOLD_ | xargs sed -i 's/THRESHOLD_\(DEFINED\|STARTED\|LOCKED_IN\|ACTIVE\|FAILED\)/ThresholdState::\1/g'
sed -i 's/^ ThresholdState::/ /g' src/versionbits.h
sed -i 's/enum SigVersion/enum class SigVersion/g' src/script/interpreter.h
git grep -l SIGVERSION_ | xargs sed -i 's/SIGVERSION_\(BASE\|WITNESS_V0\)/SigVersion::\1/g'
sed -i 's/^ SigVersion::/ /g' src/script/interpreter.h
sed -i 's/enum RetFormat {/enum class RetFormat {/g' src/rest.cpp
sed -i 's/RF_\(UNDEF\|BINARY\|HEX\|JSON\)/RetFormat::\1/g' src/rest.cpp
sed -i 's/^ RetFormat::/ /g' src/rest.cpp
sed -i 's/enum HelpMessageMode {/enum class HelpMessageMode {/g' src/init.h
git grep -l HMM_ | xargs sed -i 's/HMM_BITCOIN/HelpMessageMode::BITCOIN/g'
sed -i 's/^ HelpMessageMode::/ /g' src/init.h
sed -i 's/enum FeeEstimateHorizon/enum class FeeEstimateHorizon/g' src/policy/fees.h
sed -i 's/enum RBFTransactionState/enum class RBFTransactionState/g' src/policy/rbf.h
git grep -l RBF_ | xargs sed -i 's/RBF_TRANSACTIONSTATE_\(UNKNOWN\|REPLACEABLE_BIP125\|FINAL\)/RBFTransactionState::\1/g'
sed -i 's/^ RBFTransactionState::/ /g' src/policy/rbf.h
sed -i 's/enum BlockSource {/enum class BlockSource {/g' src/qt/clientmodel.h
git grep -l BLOCK_SOURCE_ | xargs sed -i 's/BLOCK_SOURCE_\(NONE\|REINDEX\|DISK\|NETWORK\)/BlockSource::\1/g'
sed -i 's/^ BlockSource::/ /g' src/qt/clientmodel.h
sed -i 's/enum FlushStateMode {/enum class FlushStateMode {/g' src/validation.cpp
sed -i 's/FLUSH_STATE_\(NONE\|IF_NEEDED\|PERIODIC\|ALWAYS\)/FlushStateMode::\1/g' src/validation.cpp
sed -i 's/^ FlushStateMode::/ /g' src/validation.cpp
sed -i 's/enum WitnessMode {/enum class WitnessMode {/g' src/test/script_tests.cpp
sed -i 's/WITNESS_\(NONE\|PKH\|SH\)/WitnessMode::\1/g' src/test/script_tests.cpp
sed -i 's/^ WitnessMode::/ /g' src/test/script_tests.cpp
-END VERIFY SCRIPT-
|
* inline performance critical code
* Average runtime is specified and used to calculate iterations.
* Console: show median of multiple runs
* plot: show box plot
* filter benchmarks
* specify scaling factor
* ignore src/test and src/bench in command line check script
* number of iterations instead of time
* Replaced runtime in the BENCHMARK macro with the number of iterations (see the sketch below).
* Added -? to bench_bitcoin
* Benchmark plotly.js URL, width, height can be customized
* Fixed incorrect precision warning
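A rough sketch of what a benchmark looks like under this framework, with an illustrative iteration count as the second BENCHMARK argument (the exact macro signature and the numbers are assumptions, not copied from the commit):
```
#include <bench/bench.h>

#include <cstdint>

static void TrivialAdd(benchmark::State& state)
{
    uint64_t x = 0;
    while (state.KeepRunning()) {
        x += 1; // the measured work
    }
}
// Second argument: number of iterations expected to take roughly one second.
BENCHMARK(TrivialAdd, 100 * 1000 * 1000);
```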
|
|
-BEGIN VERIFY SCRIPT-
for f in \
src/*.cpp \
src/*.h \
src/bench/*.cpp \
src/bench/*.h \
src/compat/*.cpp \
src/compat/*.h \
src/consensus/*.cpp \
src/consensus/*.h \
src/crypto/*.cpp \
src/crypto/*.h \
src/crypto/ctaes/*.h \
src/policy/*.cpp \
src/policy/*.h \
src/primitives/*.cpp \
src/primitives/*.h \
src/qt/*.cpp \
src/qt/*.h \
src/qt/test/*.cpp \
src/qt/test/*.h \
src/rpc/*.cpp \
src/rpc/*.h \
src/script/*.cpp \
src/script/*.h \
src/support/*.cpp \
src/support/*.h \
src/support/allocators/*.h \
src/test/*.cpp \
src/test/*.h \
src/wallet/*.cpp \
src/wallet/*.h \
src/wallet/test/*.cpp \
src/wallet/test/*.h \
src/zmq/*.cpp \
src/zmq/*.h
do
base=${f%/*}/ relbase=${base#src/} sed -i "s:#include \"\(.*\)\"\(.*\):if test -e \$base'\\1'; then echo \"#include <\"\$relbase\"\\1>\\2\"; else echo \"#include <\\1>\\2\"; fi:e" $f
done
-END VERIFY SCRIPT-
|
|
Avoid static analyzer warnings regarding "Function call argument
is a pointer to uninitialized value" in cases where we are
intentionally using such arguments.
This is achieved by using ...
`f(b.begin(), b.end())` (`std::array<char, N>`)
... instead of ...
`f(b, b + N)` (`char b[N]`)
Rationale:
* Reduce false positives by guiding static analyzers regarding our
intentions.
Before this commit:
```
$ clang-tidy-3.5 -checks=* src/bench/base58.cpp
bench/base58.cpp:23:9: warning: Function call argument is a pointer to uninitialized value [clang-analyzer-core.CallAndMessage]
EncodeBase58(b, b + 32);
^
$ clang-tidy-3.5 -checks=* src/bench/verify_script.cpp
bench/verify_script.cpp:59:5: warning: Function call argument is a pointer to uninitialized value [clang-analyzer-core.CallAndMessage]
key.Set(vchKey, vchKey + 32, false);
^
$
```
After this commit:
```
$ clang-tidy-3.5 -checks=* src/bench/base58.cpp
$ clang-tidy-3.5 -checks=* src/bench/verify_script.cpp
$
```
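A small self-contained illustration of the pattern, using the document's placeholder f(begin, end) rather than the real EncodeBase58/CKey::Set signatures:
```
#include <array>

// Hypothetical consumer standing in for EncodeBase58 / CKey::Set above.
template <typename It>
void f(It begin, It end) { (void)begin; (void)end; }

void before_style()
{
    char b[32];   // intentionally uninitialized benchmark input
    f(b, b + 32); // raw pointers: analyzer warns "pointer to uninitialized value"
}

void after_style()
{
    std::array<char, 32> b; // still intentionally uninitialized
    f(b.begin(), b.end());  // iterator form conveys the intent and avoids the warning
}
```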
|
|
5113474 wallet: Use CDataStream.data() (Wladimir J. van der Laan)
e2300ff bench: Use CDataStream.data() (Wladimir J. van der Laan)
adff950 dbwrapper: Use new .data() method of CDataStream (Wladimir J. van der Laan)
a2141e4 streams: Remove special cases for ancient MSVC (Wladimir J. van der Laan)
af4c44c streams: Add data() method to CDataStream (Wladimir J. van der Laan)
|
|
f6fb7ac Move CTxInWitness inside CTxIn (Pieter Wuille)
|
The new benchmarks exercise script validation, CCoinsViewDB caching,
mempool eviction, and wallet coin selection code.
All of the benchmarks added here are extremely simple and don't
necessarily mirror common real world conditions or interesting
performance edge cases. Details about how specific benchmarks can be
improved are noted in comments.
Github-Issue: #7883
|