Merge bitcoin/bitcoin#25957: wallet: fast rescan with BIP157 block filters for descriptor wallets

0582932260e7de4e8aba01d63e7c8a9ddb9c3685 test: add test for fast rescan using block filters (top-up detection) (Sebastian Falbesoner) ca48a4694f73e5be8f971ae482ebc2cce4caef44 rpc: doc: mention rescan speedup using `blockfilterindex=1` in affected wallet RPCs (Sebastian Falbesoner) 3449880b499d54bfbcf6caeed52851ce55259ed7 wallet: fast rescan: show log message for every non-skipped block (Sebastian Falbesoner) 935c6c4b234bbb0565cda6f58ee298048856acae wallet: take use of `FastWalletRescanFilter` (Sebastian Falbesoner) 70b35139040a2351c845a1cec1dafd2fbcd16e93 wallet: add `FastWalletRescanFilter` class for speeding up rescans (Sebastian Falbesoner) c051026586fb269584bcba41de8a4a90280f5a7e wallet: add method for retrieving the end range for a ScriptPubKeyMan (Sebastian Falbesoner) 845279132b494f03b84d689c666fdcfad37f5a42 wallet: support fetching scriptPubKeys with minimum descriptor range index (Sebastian Falbesoner) 088e38d3bbea9694b319bc34e0d2e70d210c38b4 add chain interface methods for using BIP 157 block filters (Sebastian Falbesoner) Pull request description: ## Description This PR is another take of using BIP 157 block filters (enabled by `-blockfilterindex=1`) for faster wallet rescans and is a modern revival of #15845. For reviewers new to this topic I can highly recommend to read the corresponding PR review club (https://bitcoincore.reviews/15845). The basic idea is to skip blocks for deeper inspection (i.e. looking at every single tx for matches) if our block filter doesn't match any of the block's spent or created UTXOs are relevant for our wallet. Note that there can be false-positives (see https://bitcoincore.reviews/15845#l-199 for a PR review club discussion about false-positive rates), but no false-negatives, i.e. it is safe to skip blocks if the filter doesn't match; if the filter *does* match even though there are no wallet-relevant txs in the block, no harm is done, only a little more time is spent extra. In contrast to #15845, this solution only supports descriptor wallets, which are way more widespread now than back in the time >3 years ago. With that approach, we don't have to ever derive the relevant scriptPubKeys ourselves from keys before populating the filter, and can instead shift the full responsibility to that to the `DescriptorScriptPubKeyMan` which already takes care of that automatically. Compared to legacy wallets, the `IsMine` logic for descriptor wallets is as trivial as checking if a scriptPubKey is included in the ScriptPubKeyMan's set of scriptPubKeys (`m_map_script_pub_keys`): https://github.com/bitcoin/bitcoin/blob/e191fac4f3c37820f0618f72f0a8e8b524531ab8/src/wallet/scriptpubkeyman.cpp#L1703-L1710 One of the unaddressed issues of #15845 was that [the filter was only created once outside the loop](https://github.com/bitcoin/bitcoin/pull/15845#discussion_r343265997) and as such didn't take into account possible top-ups that have happened. This is solved here by keeping a state of ranged `DescriptorScriptPubKeyMan`'s descriptor end ranges and check at each iteration whether that range has increased since last time. If yes, we update the filter with all scriptPubKeys that have been added since the last filter update with a range index equal or higher than the last end range. Note that finding new scriptPubKeys could be made more efficient than linearly iterating through the whole `m_script_pub_keys` map (e.g. by introducing a bidirectional map), but this would mean introducing additional complexity and state and it's probably not worth it at this time, considering that the performance gain is already significant. Output scripts from non-ranged `DescriptorScriptPubKeyMan`s (i.e. ones with a fixed set of output scripts that is never extended) are added only once when the filter is created first. ## Benchmark results Obviously, the speed-up indirectly correlates with the wallet tx frequency in the scanned range: the more blocks contain wallet-related transactions, the less blocks can be skipped due to block filter detection. In a [simple benchmark](https://github.com/theStack/bitcoin/blob/fast_rescan_functional_test_benchmark/test/functional/pr25957_benchmark.py), a regtest chain with 1008 blocks (corresponding to 1 week) is mined with 20000 scriptPubKeys contained (25 txs * 800 outputs) each. The blocks each have a weight of ~2500000 WUs and hence are about 62.5% full. A global constant `WALLET_TX_BLOCK_FREQUENCY` defines how often wallet-related txs are included in a block. The created descriptor wallet (default setting of `keypool=1000`, we have 8*1000 = 8000 scriptPubKeys at the start) is backuped via the `backupwallet` RPC before the mining starts and imported via `restorewallet` RPC after. The measured time for taking this import process (which involves a rescan) once with block filters (`-blockfilterindex=1`) and once without block filters (`-blockfilterindex=0`) yield the relevant result numbers for the benchmark. The following table lists the results, sorted from worst-case (all blocks contain wallte-relevant txs, 0% can be skipped) to best-case (no blocks contain walltet-relevant txs, 100% can be skipped) where the frequencies have been picked arbitrarily: wallet-related tx frequency; 1 tx per... | ratio of irrelevant blocks | w/o filters | with filters | speed gain --------------------------------------------|-----------------------------|-------------|--------------|------------- ~ 10 minutes (every block) | 0% | 56.806s | 63.554s | ~0.9x ~ 20 minutes (every 2nd block) | 50% (1/2) | 58.896s | 36.076s | ~1.6x ~ 30 minutes (every 3rd block) | 66.67% (2/3) | 56.781s | 25.430s | ~2.2x ~ 1 hour (every 6th block) | 83.33% (5/6) | 58.193s | 15.786s | ~3.7x ~ 6 hours (every 36th block) | 97.22% (35/36) | 57.500s | 6.935s | ~8.3x ~ 1 day (every 144th block) | 99.31% (143/144) | 68.881s | 6.107s | ~11.3x (no txs) | 100% | 58.529s | 5.630s | ~10.4x Since even the (rather unrealistic) worst-case scenario of having wallet-related txs in _every_ block of the rescan range obviously doesn't take significantly longer, I'd argue it's reasonable to always take advantage of block filters if they are available and there's no need to provide an option for the user. Feedback about the general approach (but also about details like naming, where I struggled a lot) would be greatly appreciated. Thanks fly out to furszy for discussing this subject and patiently answering basic question about descriptor wallets! ACKs for top commit: achow101: ACK 0582932260e7de4e8aba01d63e7c8a9ddb9c3685 Sjors: re-utACK 0582932260e7de4e8aba01d63e7c8a9ddb9c3685 aureleoules: ACK 0582932260e7de4e8aba01d63e7c8a9ddb9c3685 - minor changes, documentation and updated test since last review w0xlt: re-ACK https://github.com/bitcoin/bitcoin/pull/25957/commits/0582932260e7de4e8aba01d63e7c8a9ddb9c3685 Tree-SHA512: 3289ba6e4572726e915d19f3e8b251d12a4cec8c96d041589956c484b5575e3708b14f6e1e121b05fe98aff1c8724de4564a5a9123f876967d33343cbef242e1
author: Andrew Chow <github@achow101.com> 2022-10-26 11:12:28 -0400
committer: Andrew Chow <github@achow101.com> 2022-10-26 11:19:19 -0400
commit: 48af307481af3b3e9455596b515e8ecb001672ef (patch)
tree: 6d842cbd6a0ce8eef9c017799cc5c4bf30e1240d /src/node
parent: 69b10212ea5370606c7a5aa500a70c36b4cbb58f (diff)
parent: 0582932260e7de4e8aba01d63e7c8a9ddb9c3685 (diff)
download: bitcoin-48af307481af3b3e9455596b515e8ecb001672ef.tar.xz
1 files changed, 16 insertions, 0 deletions
diff --git a/src/node/interfaces.cpp b/src/node/interfaces.cpp
index 8a0011a629..979c625463 100644
--- a/src/node/interfaces.cpp
+++ b/src/node/interfaces.cpp
@@ -4,10 +4,12 @@
 
 #include <addrdb.h>
 #include <banman.h>
+#include <blockfilter.h>
 #include <chain.h>
 #include <chainparams.h>
 #include <deploymentstatus.h>
 #include <external_signer.h>
+#include <index/blockfilterindex.h>
 #include <init.h>
 #include <interfaces/chain.h>
 #include <interfaces/handler.h>
@@ -536,6 +538,20 @@ public:
         }
         return std::nullopt;
     }
+    bool hasBlockFilterIndex(BlockFilterType filter_type) override
+    {
+        return GetBlockFilterIndex(filter_type) != nullptr;
+    }
+    std::optional<bool> blockFilterMatchesAny(BlockFilterType filter_type, const uint256& block_hash, const GCSFilter::ElementSet& filter_set) override
+    {
+        const BlockFilterIndex* block_filter_index{GetBlockFilterIndex(filter_type)};
+        if (!block_filter_index) return std::nullopt;
+
+        BlockFilter filter;
+        const CBlockIndex* index{WITH_LOCK(::cs_main, return chainman().m_blockman.LookupBlockIndex(block_hash))};
+        if (index == nullptr || !block_filter_index->LookupFilter(index, filter)) return std::nullopt;
+        return filter.GetFilter().MatchAny(filter_set);
+    }
     bool findBlock(const uint256& hash, const FoundBlock& block) override
     {
         WAIT_LOCK(cs_main, lock);
author	Andrew Chow <github@achow101.com>	2022-10-26 11:12:28 -0400
committer	Andrew Chow <github@achow101.com>	2022-10-26 11:19:19 -0400
commit	48af307481af3b3e9455596b515e8ecb001672ef (patch)
tree	6d842cbd6a0ce8eef9c017799cc5c4bf30e1240d /src/node
parent	69b10212ea5370606c7a5aa500a70c36b4cbb58f (diff)
parent	0582932260e7de4e8aba01d63e7c8a9ddb9c3685 (diff)
download	bitcoin-48af307481af3b3e9455596b515e8ecb001672ef.tar.xz