author | Andrew Chow <github@achow101.com> | 2023-01-27 01:44:17 -0500 |
---|---|---|
committer | Andrew Chow <github@achow101.com> | 2023-01-27 01:53:21 -0500 |
commit | 835212cd1d8f8fc7f19775f5ff8cc21c099122b2 (patch) | |
tree | 52e4bff58bddf2c42d1a827791f5a6dd2c60124d /src/util/check.h | |
parent | ffc22b7d42c6360223508293b8c1f88b1a1a468b (diff) | |
parent | 39b93649c4b98cd82c64b957fd9f6a6fd3c2a359 (diff) |
Merge bitcoin/bitcoin#25880: p2p: Make stalling timeout adaptive during IBD
39b93649c4b98cd82c64b957fd9f6a6fd3c2a359 test: add functional test for IBD stalling logic (Martin Zumsande)
0565951f34e6d155dc825964c5d8b1dd00931682 p2p: Make block stalling timeout adaptive (Martin Zumsande)
Pull request description:
During IBD, the following stalling mechanism applies if we can't proceed with assigning blocks from the 1024-block lookahead window because all of these blocks are either already downloaded or in flight: we mark the peer from which we expect the current block that would allow us to advance our tip (and thereby move the 1024-block window ahead) as a possible staller. We then give this peer 2 more seconds to deliver a block (`BLOCK_STALLING_TIMEOUT`) and, if it doesn't, disconnect it and assign the critical block we need to another peer.
Now the problem is that this second peer is immediately marked as a potential staller using the same mechanism and given 2 seconds as well - if our own connection is so slow that it simply takes us more than 2 seconds to download this block, that peer will also be disconnected (and so on...), leading to repeated disconnections and no progress in IBD. This has been described in #9213, and I have observed this when doing IBD on slower connections or with Tor - sometimes there would be several minutes without progress, where all we did was disconnect peers and find new ones.
The `2s` stalling timeout was introduced in #4468, when blocks weren't full and before Segwit increased the maximum possible physical size of blocks - so I think it made a lot of sense back then.
But it would be good to revisit this timeout now.
This PR makes the timeout adaptive (idea by sipa):
If we disconnect a peer for stalling, we now double the timeout for the next peer (up to a maximum of 64s). If we connect a block, we halve it again, down to the old value of 2 seconds at minimum. That way, peers that are comparatively slower will still get disconnected, but long phases of disconnecting all peers shouldn't happen anymore.
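The doubling and halving rule described above can be sketched roughly as follows. This is a minimal illustration based only on the PR description, not the merged code: the class name `AdaptiveStallingTimeout`, its methods, and the two constant names are hypothetical, and the actual implementation may differ in its details (for example, in how it handles concurrent access or in the exact decay step).

```cpp
#include <algorithm>
#include <chrono>

// Hypothetical constants; the values (2s floor, 64s cap) come from the PR description.
constexpr std::chrono::seconds STALLING_TIMEOUT_DEFAULT{2};
constexpr std::chrono::seconds STALLING_TIMEOUT_MAX{64};

// Hypothetical helper that tracks the current stalling timeout.
class AdaptiveStallingTimeout
{
public:
    // Timeout to grant the next peer marked as a possible staller.
    std::chrono::seconds Get() const { return m_timeout; }

    // A peer was disconnected for stalling: double the timeout, capped at 64s.
    void OnStallerDisconnected()
    {
        m_timeout = std::min(m_timeout * 2, STALLING_TIMEOUT_MAX);
    }

    // A block was connected: halve the timeout, but never go below 2s.
    void OnBlockConnected()
    {
        m_timeout = std::max(m_timeout / 2, STALLING_TIMEOUT_DEFAULT);
    }

private:
    std::chrono::seconds m_timeout{STALLING_TIMEOUT_DEFAULT};
};
```

Under these assumptions, five consecutive stalling disconnections raise the timeout from 2s to 64s (2, 4, 8, 16, 32, 64), and each connected block then halves it until it is back at 2s.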
Fixes #9213
ACKs for top commit:
achow101:
ACK 39b93649c4b98cd82c64b957fd9f6a6fd3c2a359
RandyMcMillan:
Strong Concept ACK 39b93649c4b98cd82c64b957fd9f6a6fd3c2a359
vasild:
ACK 39b93649c4b98cd82c64b957fd9f6a6fd3c2a359
naumenkogs:
ACK 39b93649c4b98cd82c64b957fd9f6a6fd3c2a359
Tree-SHA512: 85bc57093b2fb1d28d7409ed8df5a91543909405907bc129de7c6285d0810dd79bc05219e4d5aefcb55c85512b0ad5bed43a4114a17e46c35b9a3f9a983d5754
Diffstat (limited to 'src/util/check.h')
0 files changed, 0 insertions, 0 deletions