author | Anthony Towns <aj@erisian.com.au> | 2020-06-01 14:18:32 +1000 |
---|---|---|
committer | Elliott Jin <elliott.jin@gmail.com> | 2020-10-01 22:22:56 -0700 |
commit | d9141a0002bb508b2e94e206a1bd28ef8f97ffde (patch) | |
tree | 3f7d590d4ab45a1e495327f5fe5d5744c26edc57 /src | |
parent | 652c45fdbbd55bde95c8c6cf08a5feb6055ac112 (diff) |
doc: clarify CRollingBloomFilter size estimate
Diffstat (limited to 'src')
-rw-r--r-- | src/bloom.h | 13 |
1 files changed, 12 insertions, 1 deletions
diff --git a/src/bloom.h b/src/bloom.h
index 9307257852..24dc607cd9 100644
--- a/src/bloom.h
+++ b/src/bloom.h
@@ -94,7 +94,18 @@ public:
  * insert()'ed ... but may also return true for items that were not inserted.
  *
  * It needs around 1.8 bytes per element per factor 0.1 of false positive rate.
- * (More accurately: 3/(log(256)*log(2)) * log(1/fpRate) * nElements bytes)
+ * For example, if we want 1000 elements, we'd need:
+ * - ~1800 bytes for a false positive rate of 0.1
+ * - ~3600 bytes for a false positive rate of 0.01
+ * - ~5400 bytes for a false positive rate of 0.001
+ *
+ * If we make these simplifying assumptions:
+ * - logFpRate / log(0.5) doesn't get rounded or clamped in the nHashFuncs calculation
+ * - nElements is even, so that nEntriesPerGeneration == nElements / 2
+ *
+ * Then we get a more accurate estimate for filter bytes:
+ *
+ * 3/(log(256)*log(2)) * log(1/fpRate) * nElements
  */
 class CRollingBloomFilter
 {
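As a quick sanity check on the figures in the new comment, the closed-form estimate can be evaluated directly. The sketch below is illustrative only: `EstimateRollingBloomBytes` is a hypothetical helper that is not part of `src/bloom.h`, and it inherits the comment's simplifying assumptions (no rounding or clamping of the hash-function count, even `nElements`), so it approximates the real allocation rather than reproducing it.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of bloom.h: evaluates the estimate
//   3/(log(256)*log(2)) * log(1/fpRate) * nElements
// quoted in the CRollingBloomFilter comment. log() is the natural logarithm.
double EstimateRollingBloomBytes(double nElements, double fpRate)
{
    // The formula is only meaningful for a false positive rate in (0, 1).
    assert(fpRate > 0.0 && fpRate < 1.0);
    return 3.0 / (std::log(256.0) * std::log(2.0)) * std::log(1.0 / fpRate) * nElements;
}
```

For 1000 elements this gives roughly 1797, 3594, and 5392 bytes at false positive rates of 0.1, 0.01, and 0.001, which is consistent with the rounded ~1800, ~3600, and ~5400 figures in the comment.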