Age | Commit message (Collapse) | Author |
|
determine available cores
937bf4335 Use std::thread::hardware_concurrency, instead of Boost, to determine available cores (fanquake)
Pull request description:
Following discussion on IRC about replacing Boost usage for detecting available system cores, I've opened this to collect some benchmarks + further discussion.
The current method for detecting available cores was introduced in #6361.
Recap of the IRC chat:
```
21:14:08 fanquake: Since we seem to be giving Boost removal a good shot for 0.15, does anyone have suggestions for replacing GetNumCores?
21:14:26 fanquake: There is std::thread::hardware_concurrency(), but that seems to count virtual cores, which I don't think we want.
21:14:51 BlueMatt: fanquake: I doubt we'll do boost removal for 0.15
21:14:58 BlueMatt: shit like BOOST_FOREACH, sure
21:15:07 BlueMatt: but all of boost? doubtful, there are still things we need
21:16:36 fanquake: Yea sorry, not the whole lot, but we can remove a decent chunk. Just looking into what else needs to be done to replace some of the less involved Boost usage.
21:16:43 BlueMatt: fair
21:17:14 wumpus: yes, it makes sense to plan ahead a bit, without immediately doing it
21:18:12 wumpus: right, don't count virtual cores, that used to be the case but it makes no sense for our usage
21:19:15 wumpus: it'd create a swarm of threads overwhelming any machine with hyperthreading (+accompanying thread stack overhead), for script validation, and there was no gain at all for that
21:20:03 sipa: BlueMatt: don't worry, there is no hurry
21:59:10 morcos: wumpus: i don't think that is correct
21:59:24 morcos: suppose you have 4 cores (8 virtual cores)
21:59:24 wumpus: fanquake: indeed seems that std has no equivalent to physical_concurrency, on any standard. That's annoying as it is non-trivial to implement
21:59:35 morcos: i think running par=8 (if it let you) would be notably faster
21:59:59 morcos: jeremyrubin and i discussed this at length a while back... i think i commented about it on irc at the time
22:00:21 wumpus: morcos: I think the conclusion at the time was that it made no difference, but sure would make sense to benchmark
22:00:39 morcos: perhaps historical testing on the virtual vs actual cores was polluted by concurrency issues that have now improved
22:00:47 wumpus: I think there are not more ALUs, so there is not really a point in having more threads
22:01:40 wumpus: hyperthreads are basically just a stored register state right?
22:02:23 sipa: wumpus: yes but it helps the scheduler
22:02:27 wumpus: in which case the only speedup using "number of cores" threads would give you is, possibly, excluding other software from running on the cores on the same time
22:02:37 morcos: well this is where i get out of my depth
22:02:50 sipa: if one of the threads is waiting on a read from ram, the other can use the arithmetic unit for example
22:02:54 morcos: wumpus: i'm pretty sure though that the speed up is considerably more than what you might expect from that
22:02:59 wumpus: sipa: ok, I back down, I didn't want to argue this at all
22:03:35 morcos: the reason i haven't tested it myself, is the machine i usually use has 16 cores... so not easy due to remaining concurrency issues to get much more speedup
22:03:36 wumpus: I'm fine with restoring it to number of virtual threads if that's faster
22:03:54 morcos: we should have somene with 4 cores (and  actually test it though, i agree
22:03:58 sipa: i would expect (but we should benchmark...) that if 8 scriot validation threads instead of 4 on a quadcore hyperthreading is not faster, it's due to lock contention
22:04:20 morcos: sipa: yeah thats my point, i think lock contention isn't that bad with 8 now
22:04:22 wumpus: on 64-bit systems the additional thread overhead wouldn't be important at least
22:04:23 gmaxwell: I previously benchmarked, a long time ago, it was faster.
22:04:33 gmaxwell: (to use the HT core count)
22:04:44 wumpus: why was this changed at all then?
22:04:47 wumpus: I'm confused
22:05:04 sipa: good question!
22:05:06 gmaxwell: I had no idea we changed it.
22:05:25 wumpus: sigh 
22:05:54 gmaxwell: What PR changed it?
22:06:51 gmaxwell: In any case, on 32-bit it's probably a good tradeoff... the extra ram overhead is worth avoiding.
22:07:22 wumpus: https://github.com/bitcoin/bitcoin/pull/6361
22:07:28 gmaxwell: PR 6461 btw.
22:07:37 gmaxwell: er lol at least you got it right.
22:07:45 wumpus: the complaint was that systems became unsuably slow when using that many thread
22:07:51 wumpus: so at least I got one thing right, woohoo
22:07:55 sipa: seems i even acked it!
22:07:57 BlueMatt: wumpus: there are more alus
22:08:38 BlueMatt: but we need to improve lock contention first
22:08:40 morcos: anywya, i think in the past the lock contention made 8 threads regardless of cores a bit dicey.. now that is much better (although more still to be done)
22:09:01 BlueMatt: or we can just merge #10192, thats fee
22:09:04 gribble: https://github.com/bitcoin/bitcoin/issues/10192 | Cache full script execution results in addition to signatures by TheBlueMatt · Pull Request #10192 · bitcoin/bitcoin · GitHub
22:09:11 BlueMatt: s/fee/free/
22:09:21 morcos: no, we do not need to improve lock contention first. but we should probably do that before we increase the max beyond 16
22:09:26 BlueMatt: then we can toss concurrency issues out the window and get more speedup anyway
22:09:35 gmaxwell: wumpus: yea, well in QT I thought we also diminished the count by 1 or something? but yes, if the motivation was to reduce how heavily the machine was used, thats fair.
22:09:56 sipa: the benefit of using HT cores is certainly not a factor 2
22:09:58 wumpus: gmaxwell: for the default I think this makes a lot of sense, yes
22:10:10 gmaxwell: morcos: right now on my 24/28 physical core hosts going beyond 16 still reduces performance.
22:10:11 wumpus: gmaxwell: do we also restrict the maximum par using this? that'd make less sense
22:10:51 wumpus: if someone *wants* to use the virtual cores they should be able to by setting -par=
22:10:51 sipa: *flies to US*
22:10:52 BlueMatt: sipa: sure, but the shared cache helps us get more out of it than some others, as morcos points out
22:11:30 BlueMatt: (because it means our thread contention issues are less)
22:12:05 morcos: gmaxwell: yeah i've been bogged down in fee estimation as well (and the rest of life) for a while now.. otherwise i would have put more effort into jeremy's checkqueue
22:12:36 BlueMatt: morcos: heh, well now you can do other stuff while the rest of us get bogged down in understanding fee estimation enough to review it 
22:12:37 wumpus: [to answer my own question: no, the limit for par is MAX_SCRIPTCHECK_THREADS, or 16]
22:12:54 morcos: but to me optimizing for more than 16 cores is pretty valuable as miners could use beefy machines and be less concerned by block validation time
22:14:38 BlueMatt: morcos: i think you may be surprised by the number of mining pools that are on VPSes that do not have 16 cores 
22:15:34 gmaxwell: I assume right now most of the time block validation is bogged in the parts that are not as concurrent. simple because caching makes the concurrent parts so fast. (and soon to hopefully increase with bluematt's patch)
22:17:55 gmaxwell: improving sha2 speed, or transaction malloc overhead are probably bigger wins now for connection at the tip than parallelism beyond 16 (though I'd like that too).
22:18:21 BlueMatt: sha2 speed is big
22:18:27 morcos: yeah lots of things to do actually...
22:18:57 gmaxwell: BlueMatt: might be a tiny bit less big if we didn't hash the block header 8 times for every block. 
22:21:27 BlueMatt: ehh, probably, but I'm less rushed there
22:21:43 BlueMatt: my new cache thing is about to add a bunch of hashing
22:21:50 BlueMatt: 1 sha round per tx
22:22:25 BlueMatt: and sigcache is obviously a ton
```
Tree-SHA512: a594430e2a77d8cc741ea8c664a2867b1e1693e5050a4bbc8511e8d66a2bffe241a9965f6dff1e7fbb99f21dd1fdeb95b826365da8bd8f9fab2d0ffd80d5059c
|
|
Add a unit test for LockDirectory, introduced in #11281.
|
|
Most commandline/config args are interpreted as relative to datadir if
not passed absolute. Consolidate the logic for this normalization.
|
|
|
|
|
|
available cores
|
|
be9f38c Do not make it trivial for inbound peers to generate log entries (Matt Corallo)
Pull request description:
Based on #11580 because I'm lazy.
We should generally avoid writing to debug.log unconditionally for
inbound peers which misbehave (the peer being about to be banned
being an exception, since they cannot do this twice).
Tree-SHA512: 8e59c8d08d00b1527951b30f4842d010a4c2fc440503ade112baa2c1b9afd0e0d1c5c2df83dde25183a242af45089cf9b9f873b71796771232ffb6c5fc6cc0cc
|
|
This patch adds an option to configure the name and/or directory of the
debug log.
The user can specify either a relative path, in which case the path
is relative to the data directory. They can also specify an absolute
path to put the log anywhere else in the file system.
|
|
|
|
-BEGIN VERIFY SCRIPT-
for f in \
src/*.cpp \
src/*.h \
src/bench/*.cpp \
src/bench/*.h \
src/compat/*.cpp \
src/compat/*.h \
src/consensus/*.cpp \
src/consensus/*.h \
src/crypto/*.cpp \
src/crypto/*.h \
src/crypto/ctaes/*.h \
src/policy/*.cpp \
src/policy/*.h \
src/primitives/*.cpp \
src/primitives/*.h \
src/qt/*.cpp \
src/qt/*.h \
src/qt/test/*.cpp \
src/qt/test/*.h \
src/rpc/*.cpp \
src/rpc/*.h \
src/script/*.cpp \
src/script/*.h \
src/support/*.cpp \
src/support/*.h \
src/support/allocators/*.h \
src/test/*.cpp \
src/test/*.h \
src/wallet/*.cpp \
src/wallet/*.h \
src/wallet/test/*.cpp \
src/wallet/test/*.h \
src/zmq/*.cpp \
src/zmq/*.h
do
base=${f%/*}/ relbase=${base#src/} sed -i "s:#include \"\(.*\)\"\(.*\):if test -e \$base'\\1'; then echo \"#include <\"\$relbase\"\\1>\\2\"; else echo \"#include <\\1>\\2\"; fi:e" $f
done
-END VERIFY SCRIPT-
|
|
We should generally avoid writing to debug.log unconditionally for
inbound peers which misbehave (the peer being about to be banned
being an exception, since they cannot do this twice).
To avoid removing logs for outbound peers, a new log is added to
notify users when a new outbound peer is connected which mimics
the version print.
|
|
From @ryanofsky:s #10973. Thanks!
|
|
a622a1768 Fix constness of ArgsManager methods (João Barbosa)
Pull request description:
Make `cs_args` mutex mutable so that const methods can acquire it.
There's also tiny performance improvement by avoiding two map lookups when retrieving an argument value.
Tree-SHA512: ece58469745f2743b4b643242b51889a3d9c5b76492ed70bb74d4e5b378fff59da79fc129e499da779bf9f488c9435dda17ad1f3a804c1c30f56af422389e8bd
|
|
|
|
instead of the macro NULL
-BEGIN VERIFY SCRIPT-
sed -i 's/\<NULL\>/nullptr/g' src/*.cpp src/*.h src/*/*.cpp src/*/*.h src/qt/*/*.cpp src/qt/*/*.h src/wallet/*/*.cpp src/wallet/*/*.h src/support/allocators/*.h
sed -i 's/Prefer nullptr, otherwise SAFECOOKIE./Prefer NULL, otherwise SAFECOOKIE./g' src/torcontrol.cpp
sed -i 's/tor: Using nullptr authentication/tor: Using NULL authentication/g' src/torcontrol.cpp
sed -i 's/METHODS=nullptr/METHODS=NULL/g' src/test/torcontrol_tests.cpp src/torcontrol.cpp
sed -i 's/nullptr certificates/NULL certificates/g' src/qt/paymentserver.cpp
sed -i 's/"nullptr"/"NULL"/g' src/torcontrol.cpp src/test/torcontrol_tests.cpp
-END VERIFY SCRIPT-
|
|
|
|
match actual parameter names.
|
|
has been running
|
|
d5711f4 Filter subtrees and and benchmarks from coverage report (Andrew Chow)
405b86a Replace lcov -r commands with faster way (Andrew Chow)
c8914b9 Have `make cov` optionally include branch coverage statistics (Andrew Chow)
Tree-SHA512: 9c349a7baeb7430ea586617c52f91177df58e3546d6dc573e26815ddb79e30ab1873542d85ac1daca5e1fb2c6d6c8965824b42d027b6b0496a744af57b095852
|
|
|
|
1d1ea9f Turn TryCreateDirectory() into TryCreateDirectories() (Marko Bencun)
Tree-SHA512: 49a524167bcf66e351a964c88d09cb3bcee12769a32da83410e3ba649fa4bcdbf0478d41e4d09bb55adb9b3f122e742271db6feb30bbafe2a7973542b5f10f79
|
|
Use case: TryCreateDirectory(GetDataDir() / "blocks" / "index") would
fail if the blocks directory was not explicitly created before.
The line that did so was in a weird location and could be removed as a
result.
|
|
|
|
Added an option to configure to allow for branch coverage statistics gathering.
Disabled logprint macro when coverage testing is on so that unnecessary branches are not analyzed.
|
|
|
|
- Set ArgsManager::mapMultiArgs in ArgsManager::SoftSetArg, ForceSetArg, SoftSetBoolArg
|
|
- Introduce ArgsManager::GetArgs()
- Adapt util_tests.cpp to ArgsManager
|
|
Adds an RPC to get and set currently active logging categories.
|
|
Step two in abstracting away boost::filesystem.
To repeat this, simply run:
```
git ls-files \*.cpp \*.h | xargs sed -i 's/boost::filesystem/fs/g'
```
|
|
This is step one in abstracting the use of boost::filesystem.
|
|
This changes the logging categories to boolean flags instead of strings.
This simplifies the acceptance testing by avoiding accessing a scoped
static thread local pointer to a thread local set of strings. It
eliminates the only use of boost::thread_specific_ptr outside of
lockorder debugging.
This change allows log entries to be directed to multiple categories
and makes it easy to change the logging flags at runtime (e.g. via
an RPC, though that isn't done by this commit.)
It also eliminates the fDebug global.
Configuration of unknown logging categories now produces a warning.
|
|
|
|
Throw tinyformat::format_error on formatting error instead of the
`std::runtime_error`.
|
|
Instead of having an exception propagate into the program when an
error happens while formatting a log message, just print a message to
the log.
Addresses #9423.
|
|
407cdd6 Do not evaluate hidden LogPrint arguments (Pieter Wuille)
|
|
Edited via:
$ contrib/devtools/copyright_header.py update .
|
|
|
|
|
|
|
|
Swap mapMultiArgs for a const-reference to a _mapMultiArgs which is
only accessed in util.cpp
|
|
|
|
|
|
|
|
This moves all access to these datastructures through accessor functions
and protects them with a lock.
|
|
strMiscWarning.
This is a first step in avoiding racy accesses to strMiscWarning.
The change required moving GetWarnings and related globals to util.
|
|
There is no need to store this flag globally, the variable is only used
inside the initialization process.
Thanks to Alex Morcos for the idea.
|
|
|
|
|
|
Changes in tinyformat, recently imported from upstream have made the
zero-argument versions of formatting functions unnecessary. Remove them.
This is a slight semantic change: `%` characters in the zero-argument
call are now regarded and need to be escaped. As for as I know, the only
use of this is in `main.cpp`.
|
|
|