summaryrefslogtreecommitdiff
path: root/bip-0039.mediawiki
blob: 679572b5bfb7b32992938a861f4a9a531e6e6e3a (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
<pre>
  BIP:     BIP-0039
  Title:   Mnemonic code for generating deterministic keys
  Authors: Marek Palatinus <slush@satoshilabs.com>
           Pavol Rusnak <stick@satoshilabs.com>
           ThomasV <thomasv@bitcointalk.org>
           Aaron Voisine <voisine@gmail.com>
  Status:  Draft
  Type:    Standards Track
  Created: 10-09-2013
</pre>

==Abstract==

This BIP describes an usage of mnemonic code or mnemonic sentence - a group of
easy to remember words - to generate deterministic wallets.

It consists of two parts: generating the mnemonic and converting it into
a binary seed. This seed can be later used to generate deterministic wallets
using BIP-0032 or similar methods.

==Motivation==

Such mnemonic code or mnemonic sentence is much easier to work with than working
with the binary data directly (or its hexadecimal interpretation). The sentence
could be writen down on paper (e.g. for storing in a secure location such as
safe), told over telephone or other voice communication method, or memorized
in ones memory (this method is called brainwallet).

==Generating the mnemonic==

First, we decide how much entropy we want mnemonic to encode. Recommended size
is 128-256 bits, but basically any multiple of 32 bits will do. More bits
mean more security, but also longer word sentence.

We take initial entropy of ENT bits and compute its checksum by taking first
ENT / 32 bits of its SHA256 hash. We append these bits to the end of the initial
entropy. Next we take these concatenated bits and split them into groups of 11
bits. Each group encodes number from 0-2047 which is a position in a wordlist.
We convert numbers into words and use joined words as mnemonic sentence.

The following table describes the relation between initial entropy length (ENT),
checksum length (CS) and length of the generated mnemonic sentence (MS) in words.

<pre>
CS = ENT / 32
MS = (ENT + CS) / 11

|  ENT  | CS | ENT+CS |  MS  |
+-------+----+--------+------+
|  128  |  4 |   132  |  12  |
|  160  |  5 |   165  |  15  |
|  192  |  6 |   198  |  18  |
|  224  |  7 |   231  |  21  |
|  256  |  8 |   264  |  24  |
</pre>

==Wordlist==

In previous section we described how to pick words from a wordlist. Now we
describe how does a good wordlist look like.

a) smart selection of words
   - wordlist is created in such way that it's enough to type just first four
     letters to unambiguously identify the word

b) similar words avoided
   - words as "build" and "built", "woman" and "women" or "quick" or "quickly"
     not only make remembering the sentence difficult, but are also more error
     prone and more difficult to guess (see point below)
   - we avoid these words by carefully selecting them during addition

c) sorted wordlists
   - wordlist is sorted which allow more efficient lookup of the code words
     (i.e. implementation can use binary search instead of linear search)
   - this also allows trie (prefix tree) to be used, e.g. for better compression

Wordlist can contain native characters, but they have to be encoded using UTF-8.

==From mnemonic to seed==

User can decide to protect his mnemonic by passphrase. If passphrase is not present
an empty string "" is used instead.

To create binary seed from mnemonic, we use PBKDF2 function with mnemonic sentence
(in UTF-8) used as a password and string "mnemonic" + passphrase (again in UTF-8)
used as a salt. Iteration count is set to 4096 and HMAC-SHA512 is used as a pseudo-
random function. Desired length of the derived key is 512 bits (= 64 bytes).

This seed can be later used to generate deterministic wallets using BIP-0032 or
similar methods.

The conversion of the mnemonic sentence to binary seed is completely independent
from generating the sentence. This results in rather simple code, there are no
constraints on sentence structure and clients are free to implement their own
wordlists or even whole sentence generators (they'll lose the proposed method
for typo detection in that case, but they can come up with their own).

Described method also provides plausable deniability, because every passphrase
generates a valid seed (and thus deterministic wallet) but only the correct one
will make the desired wallet available.

==Test vectors==

See https://github.com/trezor/python-mnemonic/blob/master/vectors.json

==Reference Implementation==

Reference implementation including wordlists is available from

http://github.com/trezor/python-mnemonic