Diffstat (limited to 'bip-0158.mediawiki')
-rw-r--r-- bip-0158.mediawiki | 18
1 file changed, 9 insertions, 9 deletions
diff --git a/bip-0158.mediawiki b/bip-0158.mediawiki
index bf2e856..2062c6e 100644
--- a/bip-0158.mediawiki
+++ b/bip-0158.mediawiki
@@ -65,11 +65,10 @@ For each block, compact filters are derived containing sets of items associated
with the block (eg. addresses sent to, outpoints spent, etc.). A set of such
data objects is compressed into a probabilistic structure called a
''Golomb-coded set'' (GCS), which matches all items in the set with probability
-1, and matches other items with probability <code>2^(-P)</code> for some
-integer parameter <code>P</code>. We also introduce parameter <code>M</code>
-which allows filter to uniquely tune the range that items are hashed onto
-before compressing. Each defined filter also selects distinct parameters for P
-and M.
+1, and matches other items with probability <code>1/M</code> for some
+integer parameter <code>M</code>. The encoding is also parameterized by
+<code>P</code>, the bit length of the remainder code. Each defined filter
+specifies values for <code>P</code> and <code>M</code>.
At a high level, a GCS is constructed from a set of <code>N</code> items by:
# hashing all items to 64-bit integers in the range <code>[0, N * M)</code>
@@ -88,8 +87,8 @@ one is able to select both Parameters independently, then more optimal values can be
selected<ref>https://gist.github.com/sipa/576d5f09c3b86c3b1b75598d799fc845</ref>.
Set membership queries against the hash outputs will have a false positive rate
-of <code>2^(-P)</code>. To avoid integer overflow, the
-number of items <code>N</code> MUST be <2^32 and <code>M</code> MUST be <2^32.
+of <code>1/M</code>. To avoid integer overflow, the number of items <code>N</code>
+MUST be <2^32 and <code>M</code> MUST be <2^32.
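The bounds on <code>N</code> and <code>M</code> can be illustrated with a minimal sketch of mapping a 64-bit hash into <code>[0, N * M)</code>. The helper name <code>hash_to_range</code> and the multiply-and-shift reduction are assumptions made for illustration; this hunk does not state how the mapping is performed. With both inputs below 2^32, the product <code>N * M</code> fits in 64 bits, so the intermediate product with a 64-bit hash fits in 128 bits. A non-member's hash then lands on one of at most <code>N</code> occupied values out of <code>N * M</code>, giving a match probability of at most <code>1/M</code>.

```python
# Illustrative sketch only: hash_to_range is a hypothetical helper name, and
# the multiply-and-shift reduction is an assumed way to map into [0, N * M).

def hash_to_range(h64: int, n: int, m: int) -> int:
    """Map a uniformly distributed 64-bit value h64 into [0, n * m)."""
    if not (0 <= h64 < 2**64):
        raise ValueError("h64 must be a 64-bit unsigned value")
    if not (0 <= n < 2**32 and 0 < m < 2**32):
        raise ValueError("N and M must both be below 2^32")
    f = n * m                 # below 2^64 because n < 2^32 and m < 2^32
    return (h64 * f) >> 64    # 128-bit product, keep the upper 64 bits


def max_false_positive_rate(n: int, m: int) -> float:
    """Upper bound on the match probability for an item not in the set."""
    # At most n distinct values are occupied out of n * m possible values.
    return n / (n * m)
```

In a language with fixed-width integers, the same reduction needs a 64x64 to 128-bit multiply, which is why the 2^32 limits matter there.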
The items are first passed through the pseudorandom function ''SipHash'', which
takes a 128-bit key <code>k</code> and a variable-sized byte vector and produces
@@ -189,9 +188,10 @@ golomb_decode(stream, P: uint) -> uint64:
==== Set Construction ====
-A GCS is constructed from three parameters:
+A GCS is constructed from four parameters:
* <code>L</code>, a vector of <code>N</code> raw items
-* <code>P</code>, which determines the false positive rate
+* <code>P</code>, the bit parameter of the Golomb-Rice coding
+* <code>M</code>, the inverse of the target false positive rate
* <code>k</code>, the 128-bit key used to randomize the SipHash outputs
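To make the role of <code>P</code> concrete, the following is a minimal sketch of Golomb-Rice coding applied to the differences between sorted values. The function names, the bit-string representation, and the bit order (unary quotient written as 1-bits followed by a 0-bit, then the remainder most significant bit first) are assumptions made for illustration, not text taken from this hunk. The sketch shows why every item costs at least <code>P + 1</code> bits: one terminator bit for the quotient plus <code>P</code> remainder bits.

```python
# Illustrative sketch only: names and the '0'/'1' string representation are
# hypothetical; a real encoder would write into a bit stream.

def golomb_rice_bits(value: int, p: int) -> str:
    """Return the Golomb-Rice code of a non-negative value as a bit string."""
    quotient = value >> p
    remainder = value & ((1 << p) - 1)
    # Quotient in unary: `quotient` one-bits, then a terminating zero-bit.
    unary = "1" * quotient + "0"
    # Remainder in exactly p bits, most significant bit first.
    binary = format(remainder, "b").zfill(p) if p > 0 else ""
    return unary + binary


def encode_sorted_deltas(values, p: int) -> str:
    """Sort the values and encode each difference from its predecessor."""
    bits = []
    previous = 0
    for v in sorted(values):
        bits.append(golomb_rice_bits(v - previous, p))
        previous = v
    return "".join(bits)
```

For example, with <code>P = 2</code> the values 3, 7 and 10 have differences 3, 4 and 3, which encode to 3, 4 and 3 bits respectively: 10 bits in total, against the floor of <code>3 * (2 + 1) = 9</code>.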
The result is a byte vector with a minimum size of <code>N * (P + 1)</code>