diff options
author | MarcoFalke <falke.marco@gmail.com> | 2021-11-03 13:01:47 +0100 |
---|---|---|
committer | MarcoFalke <falke.marco@gmail.com> | 2021-11-03 13:01:53 +0100 |
commit | 23ae7931be50376fa6bda692c641a3d2538556ee (patch) | |
tree | 0b97976cce0ac0df287b0be41a214ef81bb27df8 | |
parent | e2b5192d1c078c92beb002f47f75666a2ee7474d (diff) | |
parent | 9ab440199d5c888363a42c957433d0e46cd0d2ff (diff) |
Merge bitcoin/bitcoin#23154: doc: add assumeutxo notes
9ab440199d5c888363a42c957433d0e46cd0d2ff doc: add assumeutxo notes (James O'Beirne)
Pull request description:
This is part of the [assumeutxo project](https://github.com/bitcoin/bitcoin/projects/11) (parent PR: #15606)
---
Adds some notes on assumeutxo design.
Related: https://github.com/bitcoin/bitcoin/pull/21526#discussion_r715558994
ACKs for top commit:
ariard:
ACK 9ab4401
naumenkogs:
ACK 9ab4401
michaelfolkson:
ACK 9ab440199d5c888363a42c957433d0e46cd0d2ff
fjahr:
ACK 9ab440199d5c888363a42c957433d0e46cd0d2ff
Tree-SHA512: 2fca8373b78701754957d12bc43ce18aa6928507965448741cb4e8c56589ad61d261f8542e348094fc9631d46ee6a7afee75c965c0db993fc816758569137b74
-rw-r--r-- | doc/README.md | 1 | ||||
-rw-r--r-- | doc/assumeutxo.md | 138 |
2 files changed, 139 insertions, 0 deletions
diff --git a/doc/README.md b/doc/README.md index aabfe220bc..4845f00ade 100644 --- a/doc/README.md +++ b/doc/README.md @@ -71,6 +71,7 @@ The Bitcoin repo's [root README](/README.md) contains relevant information on th ### Miscellaneous - [Assets Attribution](assets-attribution.md) +- [Assumeutxo design](assumeutxo.md) - [bitcoin.conf Configuration File](bitcoin-conf.md) - [Files](files.md) - [Fuzz-testing](fuzzing.md) diff --git a/doc/assumeutxo.md b/doc/assumeutxo.md new file mode 100644 index 0000000000..2726cf779b --- /dev/null +++ b/doc/assumeutxo.md @@ -0,0 +1,138 @@ +# assumeutxo + +Assumeutxo is a feature that allows fast bootstrapping of a validating bitcoind +instance with a very similar security model to assumevalid. + +The RPC commands `dumptxoutset` and `loadtxoutset` are used to respectively generate +and load UTXO snapshots. The utility script `./contrib/devtools/utxo_snapshot.sh` may +be of use. + +## General background + +- [assumeutxo proposal](https://github.com/jamesob/assumeutxo-docs/tree/2019-04-proposal/proposal) +- [Github issue](https://github.com/bitcoin/bitcoin/issues/15605) +- [draft PR](https://github.com/bitcoin/bitcoin/pull/15606) + +## Design notes + +- A new block index `nStatus` flag is introduced, `BLOCK_ASSUMED_VALID`, to mark block + index entries that are required to be assumed-valid by a chainstate created + from a UTXO snapshot. This flag is mostly used as a way to modify certain + CheckBlockIndex() logic to account for index entries that are pending validation by a + chainstate running asynchronously in the background. We also use this flag to control + which index entries are added to setBlockIndexCandidates during LoadBlockIndex(). + +- Indexing implementations via BaseIndex can no longer assume that indexation happens + sequentially, since background validation chainstates can submit BlockConnected + events out of order with the active chain. + +- The concept of UTXO snapshots is treated as an implementation detail that lives + behind the ChainstateManager interface. The external presentation of the changes + required to facilitate the use of UTXO snapshots is the understanding that there are + now certain regions of the chain that can be temporarily assumed to be valid (using + the nStatus flag mentioned above). In certain cases, e.g. wallet rescanning, this is + very similar to dealing with a pruned chain. + + Logic outside ChainstateManager should try not to know about snapshots, instead + preferring to work in terms of more general states like assumed-valid. + + +## Chainstate phases + +Chainstate within the system goes through a number of phases when UTXO snapshots are +used, as managed by `ChainstateManager`. At various points there can be multiple +`CChainState` objects in existence to facilitate both maintaining the network tip and +performing historical validation of the assumed-valid chain. + +It is worth noting that though there are multiple separate chainstates, those +chainstates share use of a common block index (i.e. they hold the same `BlockManager` +reference). + +The subheadings below outline the phases and the corresponding changes to chainstate +data. + +### "Normal" operation via initial block download + +`ChainstateManager` manages a single CChainState object, for which +`m_snapshot_blockhash` is null. This chainstate is (maybe obviously) +considered active. This is the "traditional" mode of operation for bitcoind. + +| | | +| ---------- | ----------- | +| number of chainstates | 1 | +| active chainstate | ibd | + +### User loads a UTXO snapshot via `loadtxoutset` RPC + +`ChainstateManager` initializes a new chainstate (see `ActivateSnapshot()`) to load the +snapshot contents into. During snapshot load and validation (see +`PopulateAndValidateSnapshot()`), the new chainstate is not considered active and the +original chainstate remains in use as active. + +| | | +| ---------- | ----------- | +| number of chainstates | 2 | +| active chainstate | ibd | + +Once the snapshot chainstate is loaded and validated, it is promoted to active +chainstate and a sync to tip begins. A new chainstate directory is created in the +datadir for the snapshot chainstate called +`chainstate_[SHA256 blockhash of snapshot base block]`. + +| | | +| ---------- | ----------- | +| number of chainstates | 2 | +| active chainstate | snapshot | + +The snapshot begins to sync to tip from its base block, technically in parallel with +the original chainstate, but it is given priority during block download and is +allocated most of the cache (see `MaybeRebalanceCaches()` and usages) as our chief +consideration is getting to network tip. + +**Failure consideration:** if shutdown happens at any point during this phase, both +chainstates will be detected during the next init and the process will resume. + +### Snapshot chainstate hits network tip + +Once the snapshot chainstate leaves IBD, caches are rebalanced +(via `MaybeRebalanceCaches()` in `ActivateBestChain()`) and more cache is given +to the background chainstate, which is responsible for doing full validation of the +assumed-valid parts of the chain. + +**Note:** at this point, ValidationInterface callbacks will be coming in from both +chainstates. Considerations here must be made for indexing, which may no longer be happening +sequentially. + +### Background chainstate hits snapshot base block + +Once the tip of the background chainstate hits the base block of the snapshot +chainstate, we stop use of the background chainstate by setting `m_stop_use` (not yet +committed - see #15606), in `CompleteSnapshotValidation()`, which is checked in +`ActivateBestChain()`). We hash the background chainstate's UTXO set contents and +ensure it matches the compiled value in `CMainParams::m_assumeutxo_data`. + +The background chainstate data lingers on disk until shutdown, when in +`ChainstateManager::Reset()`, the background chainstate is cleaned up with +`ValidatedSnapshotShutdownCleanup()`, which renames the `chainstate_[hash]` datadir as +`chainstate`. + +| | | +| ---------- | ----------- | +| number of chainstates | 2 (ibd has `m_stop_use=true`) | +| active chainstate | snapshot | + +**Failure consideration:** if bitcoind unexpectedly halts after `m_stop_use` is set on +the background chainstate but before `CompleteSnapshotValidation()` can finish, the +need to complete snapshot validation will be detected on subsequent init by +`ChainstateManager::CheckForUncleanShutdown()`. + +### Bitcoind restarts sometime after snapshot validation has completed + +When bitcoind initializes again, what began as the snapshot chainstate is now +indistinguishable from a chainstate that has been built from the traditional IBD +process, and will be initialized as such. + +| | | +| ---------- | ----------- | +| number of chainstates | 1 | +| active chainstate | ibd | |