author Peter Todd <pete@petertodd.org> 2013-10-21 00:56:42 -0400
committer Peter Todd <pete@petertodd.org> 2013-10-21 00:56:42 -0400
commit e6abf7e66d7a499696dc9ef152f2c9eaa5b1b8ef
tree d4c3a357c70ac18fbb43edc265f3b9cf5d105177 /bip-0039.mediawiki
parent 2cdc696476cebedf20a621053f8156692cf20239
Archive Revision as of 22:27, 7 October 2013
https://en.bitcoin.it/w/index.php?title=BIP_0039&oldid=41584
Diffstat (limited to 'bip-0039.mediawiki')
-rw-r--r-- bip-0039.mediawiki 140
1 files changed, 140 insertions, 0 deletions
diff --git a/bip-0039.mediawiki b/bip-0039.mediawiki
new file mode 100644
index 0000000..3c91099
--- /dev/null
+++ b/bip-0039.mediawiki
@@ -0,0 +1,140 @@
+{{bip}}
+
+<pre>
+ BIP: BIP-0039
+ Title: Mnemonic code for generating deterministic keys
+ Author: Pavol Rusnak <stick@gk2.sk>
+ Marek Palatinus <info@bitcoin.cz>
+ Aaron Voisine <voisine@gmail.com>
+ Status: Draft
+ Type: Standards Track
+ Created: 10-09-2013
+</pre>
+
+==Abstract==
+
+This BIP proposes a scheme for translating binary data (usually master seeds
+for deterministic keys, but it can be applied to any binary data) into a group
+of easy-to-remember words, also known as a mnemonic code or mnemonic sentence.
+
+==Motivation==
+
+A mnemonic code or mnemonic sentence is much easier to work with than the raw
+binary data (or its hexadecimal representation). The sentence can be written
+down on paper (e.g. for storage in a secure location such as a safe), dictated
+over the telephone or another voice channel, or committed to memory (this
+method is called a brainwallet).
+
+==Backwards Compatibility==
+
+At the time of writing, only one Bitcoin client (Electrum) implements mnemonic
+codes, but it uses a different wordlist than the one proposed here.
+
+For compatibility reasons we propose adding a checkbox to Electrum that lets
+the user indicate, during import, whether the code being entered is a legacy
+code or a new BIP-0039 compatible one. For exporting, only the new format
+will be used, so this is not an issue.
+
+==Rationale==
+
+Our proposal is inspired by the implementation used in Electrum, but we enhanced
+the wordlist and algorithm so that they meet the following criteria:
+
+a) smart selection of words
+ - the wordlist is created in such a way that typing the first four letters
+ is enough to unambiguously identify the word
+
+b) similar words avoided
+ - word pairs such as "build" and "built", "woman" and "women" or "quick" and
+ "quickly" not only make the sentence harder to remember, but are also more
+ error prone and more difficult to guess (see point below)
+ - we avoid such pairs by vetting each word before it is added to the wordlist
+
+c) sorted wordlists
+ - the wordlist is sorted, which allows more efficient lookup of the code words
+ (i.e. an implementation can use binary search instead of linear search)
+ - this also allows a trie (prefix tree) to be used, e.g. for better compression
+
+d) localized wordlists
+ - we would like to allow localized wordlists, so it is easier for users
+ to remember the code in their native language
+ - by using wordlists with no colliding words among languages, it's easy to
+ determine which language was used just by checking the first word of
+ the sentence
+
+e) mnemonic checksum
+ - this leads to a better user experience, because the user can be notified
+ that the mnemonic sequence is wrong instead of being shown confusing
+ data generated from the wrong sequence.
+
+f) seed stretching
+ - before encoding and after decoding, the binary sequence is stretched
+ using a symmetric cipher (Blowfish) in order to prevent brute-force
+ attacks in case some of the mnemonic words are leaked
+
+==Specification==
+
+<pre>
+Our proposal implements two methods - "encode" and "decode".
+
+The first method takes binary data whose length in bytes (L) must be divisible
+by four and returns a sentence that consists of (L/4*3) words from the wordlist.
+
+The second method takes a sentence generated by the first method (the number of
+words in the sentence has to be divisible by 3) and reconstructs the original
+binary data.
+
+A word may appear in the sentence more than once.
+
+The wordlist contains 2048 words (instead of the 1626 words in Electrum), so each
+word carries 11 bits, which leaves room for a checksum of the whole mnemonic
+sequence. Each 32 bits of input data add 1 bit of checksum.
+
+See the following table for the relation between input lengths, output lengths
+and checksum sizes for the most common use cases:
+
++--------+---------+---------+----------+
+| input  | input   | output  | checksum |
+| (bits) | (bytes) | (words) | (bits)   |
++--------+---------+---------+----------+
+| 128    | 16      | 12      | 4        |
+| 192    | 24      | 18      | 6        |
+| 256    | 32      | 24      | 8        |
++--------+---------+---------+----------+
+</pre>
+
+===Algorithm:===
+
+<pre>
+Encoding:
+1. Read input data (I).
+2. Make sure its length (L) is divisible by 64 bits.
+3. Encrypt the input data 1000x with Blowfish (ECB) using the word "mnemonic" as key.
+   From here on, I denotes the encrypted data.
+4. Compute the length of the checksum (LC). LC = L/32
+5. Split I into chunks of LC bits (I1, I2, I3, ...).
+6. XOR them all together to produce the checksum C. C = I1 xor I2 xor I3 ... xor In.
+7. Concatenate I and C into encoded data (E). The length of E is divisible by 33 bits.
+8. Take 11 bits at a time from E until there are none left.
+9. Treat each 11-bit group as an integer W and add the word with index W to the output.
+
+Decoding:
+1. Read input mnemonic (M).
+2. Make sure its word count is divisible by 6.
+3. Look up the index of each word in the wordlist and output the indexes, 11 bits each, as binary stream E.
+4. The length of E (L) is divisible by 33 bits.
+5. Split E into two parts: B and C, where B is the first L/33*32 bits and C is the last L/33 bits.
+6. Make sure C is the checksum of B (computed as in steps 4-6 of the encoding procedure).
+7. If it is not, the mnemonic code is invalid.
+8. Treat B as binary data.
+9. Decrypt this data 1000x with Blowfish (ECB) using the word "mnemonic" as key.
+10. Return the result as output.
+</pre>
+
+==Test vectors==
+
+See https://github.com/trezor/python-mnemonic/blob/master/vectors.json
+
+==Reference Implementation==
+
+The reference implementation, including wordlists, is available from
+
+http://github.com/trezor/python-mnemonic
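
Editor's illustration (not part of the patch): the checksum and 11-bit packing steps from the Algorithm section above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the reference implementation. The function names `checksum`, `to_indexes` and `from_indexes` are hypothetical. The Blowfish stretching steps are omitted, because the Python standard library has no Blowfish, so the input stands in for already-encrypted data. The sketch works on word indexes rather than words, since the wordlist is not reproduced here, and it assumes big-endian bit order, which the draft does not state. It accepts any multiple of 32 bits, as the Specification section does; the Algorithm steps ask for multiples of 64 bits (6 words), which matches the 64-bit block size of Blowfish.

```python
WORD_BITS = 11   # 2048 == 2**11, so one word carries 11 bits
CHUNKS = 32      # L / LC == L / (L/32): the data always splits into 32 chunks


def checksum(data: bytes) -> tuple[int, int]:
    """Return (C, LC): the XOR of all LC-bit chunks of data, with LC = bits/32."""
    bits = len(data) * 8
    if bits == 0 or bits % 32:
        raise ValueError("length must be a non-zero multiple of 32 bits")
    lc = bits // 32
    value = int.from_bytes(data, "big")  # assumption: big-endian bit order
    c = 0
    for _ in range(CHUNKS):
        c ^= value & ((1 << lc) - 1)     # XOR in the lowest LC-bit chunk
        value >>= lc
    return c, lc


def to_indexes(data: bytes) -> list[int]:
    """Encoding steps 4-9: append the checksum, then cut E into 11-bit indexes."""
    c, lc = checksum(data)
    e = (int.from_bytes(data, "big") << lc) | c   # E = I followed by C
    e_bits = len(data) * 8 + lc                   # always a multiple of 33
    return [(e >> shift) & 0x7FF
            for shift in range(e_bits - WORD_BITS, -1, -WORD_BITS)]


def from_indexes(indexes: list[int]) -> bytes:
    """Decoding steps 3-8: rebuild E, split off C, and verify it against B."""
    if not indexes or len(indexes) % 3:
        raise ValueError("word count must be a non-zero multiple of 3")
    e = 0
    for w in indexes:
        if not 0 <= w < 2048:
            raise ValueError("word index out of range")
        e = (e << WORD_BITS) | w
    e_bits = len(indexes) * WORD_BITS
    lc = e_bits // 33                             # checksum length in bits
    data = (e >> lc).to_bytes((e_bits - lc) // 8, "big")
    c, _ = checksum(data)
    if c != e & ((1 << lc) - 1):
        raise ValueError("checksum mismatch: invalid mnemonic code")
    return data


indexes = to_indexes(bytes(range(16)))            # 128 bits of sample data
assert len(indexes) == 12                         # 128 / 32 * 3 words
assert from_indexes(indexes) == bytes(range(16))  # round trip
```

Mapping each index to a word, and a word back to its index, would then be a plain lookup in the sorted 2048-word list.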