diff options
author | glozow <gloriajzhao@gmail.com> | 2022-10-12 14:13:35 -0400 |
---|---|---|
committer | glozow <gloriajzhao@gmail.com> | 2022-10-12 14:13:54 -0400 |
commit | cc12b8947b6d3f6c4b9cd4d147543dce693c6758 (patch) | |
tree | 6703f7abf1ae67bc1a24172956911255e999ac1a /src/test | |
parent | 1d277f42236d6074ea84c407ae0863ae943f27e4 (diff) | |
parent | bcb0cacac28e98a39dc856c574a0872fe17059e9 (diff) |
Merge bitcoin/bitcoin#24858: incorrect blk file size calculation during reindex results in recoverable blk file corruption
bcb0cacac28e98a39dc856c574a0872fe17059e9 reindex, log, test: fixes #21379 (mruddy)
Pull request description:
Fixes #21379.
The blocks/blk?????.dat files are mutated and become increasingly malformed, or corrupt, as a result of running the re-indexing process.
The mutations occur after the re-indexing process has finished, as new blocks are appended, but are a result of a re-indexing process miscalculation that lingers in the block manager's `m_blockfile_info` `nSize` data until node restart.
These additions to the blk files are non-fatal, but also not desirable.
That is, this is a form of data corruption that the reading code is lenient enough to process (it skips the extra bytes), but it adds some scary looking log messages as it encounters them.
The summary of the problem is that the re-index process double counts the size of the serialization header (magic message start bytes [4 bytes] + length [4 bytes] = 8 bytes) while calculating the blk data file size (both values already account for the serialization header's size, hence why it is over accounted).
This bug manifests itself in a few different ways, after re-indexing, when a new block from a peer is processed:
1. If the new block will not fit into the last blk file processed while re-indexing, while remaining under the 128MiB limit, then the blk file is flushed to disk and truncated to a size that is 8 greater than it should be. The truncation adds zero bytes (see `FlatFileSeq::Flush` and `TruncateFile`).
1. If the last blk file processed while re-indexing has logical space for the new block under the 128 MiB limit:
1. If the blk file was not already large enough to hold the new block, then the zeros are, in effect, added by `fseek` when the file is opened for writing. Eight zero bytes are added to the end of the last blk file just before the new block is written. This happens because the write offset is 8 too great due to the miscalculation. The result is 8 zero bytes between the end of the last block and the beginning of the next block's magic + length + block.
1. If the blk file was already large enough to hold the new block, then the current existing file contents remain in the 8 byte gap between the end of the last block and the beginning of the next block's magic + length + block. Commonly, when this occcurs, it is due to the blk file containing blocks that are not connected to the block tree during reindex and are thus left behind by the reindex process and later overwritten when new blocks are added. The orphaned blocks can be valid blocks, but due to the nature of concurrent block download, the parent may not have been retrieved and written by the time the node was previously shutdown.
ACKs for top commit:
LarryRuane:
tested code-review ACK bcb0cacac28e98a39dc856c574a0872fe17059e9
ryanofsky:
Code review ACK bcb0cacac28e98a39dc856c574a0872fe17059e9. This is a disturbing bug with an easy fix which seems well-worth merging.
mzumsande:
ACK bcb0cacac28e98a39dc856c574a0872fe17059e9 (reviewed code and did some testing, I agree that it fixes the bug).
w0xlt:
tACK https://github.com/bitcoin/bitcoin/pull/24858/commits/bcb0cacac28e98a39dc856c574a0872fe17059e9
Tree-SHA512: acc97927ea712916506772550451136b0f1e5404e92df24cc05e405bb09eb6fe7c3011af3dd34a7723c3db17fda657ae85fa314387e43833791e9169c0febe51
Diffstat (limited to 'src/test')
-rw-r--r-- | src/test/blockmanager_tests.cpp | 42 |
1 files changed, 42 insertions, 0 deletions
diff --git a/src/test/blockmanager_tests.cpp b/src/test/blockmanager_tests.cpp new file mode 100644 index 0000000000..dd7c32cc46 --- /dev/null +++ b/src/test/blockmanager_tests.cpp @@ -0,0 +1,42 @@ +// Copyright (c) 2022 The Bitcoin Core developers +// Distributed under the MIT software license, see the accompanying +// file COPYING or http://www.opensource.org/licenses/mit-license.php. + +#include <chainparams.h> +#include <node/blockstorage.h> +#include <node/context.h> +#include <validation.h> + +#include <boost/test/unit_test.hpp> +#include <test/util/setup_common.h> + +using node::BlockManager; +using node::BLOCK_SERIALIZATION_HEADER_SIZE; + +// use BasicTestingSetup here for the data directory configuration, setup, and cleanup +BOOST_FIXTURE_TEST_SUITE(blockmanager_tests, BasicTestingSetup) + +BOOST_AUTO_TEST_CASE(blockmanager_find_block_pos) +{ + const auto params {CreateChainParams(ArgsManager{}, CBaseChainParams::MAIN)}; + BlockManager blockman {}; + CChain chain {}; + // simulate adding a genesis block normally + BOOST_CHECK_EQUAL(blockman.SaveBlockToDisk(params->GenesisBlock(), 0, chain, *params, nullptr).nPos, BLOCK_SERIALIZATION_HEADER_SIZE); + // simulate what happens during reindex + // simulate a well-formed genesis block being found at offset 8 in the blk00000.dat file + // the block is found at offset 8 because there is an 8 byte serialization header + // consisting of 4 magic bytes + 4 length bytes before each block in a well-formed blk file. + FlatFilePos pos{0, BLOCK_SERIALIZATION_HEADER_SIZE}; + BOOST_CHECK_EQUAL(blockman.SaveBlockToDisk(params->GenesisBlock(), 0, chain, *params, &pos).nPos, BLOCK_SERIALIZATION_HEADER_SIZE); + // now simulate what happens after reindex for the first new block processed + // the actual block contents don't matter, just that it's a block. + // verify that the write position is at offset 0x12d. + // this is a check to make sure that https://github.com/bitcoin/bitcoin/issues/21379 does not recur + // 8 bytes (for serialization header) + 285 (for serialized genesis block) = 293 + // add another 8 bytes for the second block's serialization header and we get 293 + 8 = 301 + FlatFilePos actual{blockman.SaveBlockToDisk(params->GenesisBlock(), 1, chain, *params, nullptr)}; + BOOST_CHECK_EQUAL(actual.nPos, BLOCK_SERIALIZATION_HEADER_SIZE + ::GetSerializeSize(params->GenesisBlock(), CLIENT_VERSION) + BLOCK_SERIALIZATION_HEADER_SIZE); +} + +BOOST_AUTO_TEST_SUITE_END() |