1 files changed, 57 insertions, 16 deletions
diff --git a/bip-0173.mediawiki b/bip-0173.mediawiki
index af2516d..c3ee060 100644
--- a/bip-0173.mediawiki
+++ b/bip-0173.mediawiki
@@ -6,9 +6,9 @@
           Greg Maxwell <greg@xiph.org>
   Comments-Summary: No comments yet.
   Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0173
-  Status: Draft
+  Status: Final
   Type: Informational
-  Created: 2016-03-20
+  Created: 2017-03-20
   License: BSD-2-Clause
   Replaces: 142
 </pre>
@@ -76,7 +76,7 @@ increase, but that does not matter when copy-pasting addresses.</ref> format cal
 
 A Bech32<ref>'''Why call it Bech32?''' "Bech" contains the characters BCH (the error
 detection algorithm used) and sounds a bit like "base".</ref> string is at most 90 characters long and consists of:
-* The '''human-readable part''', which is intended to convey the type of data or anything else that is relevant for the reader. Its validity (including the used set of characters) is application specific, but restricted to ASCII characters with values in the range 33-126.
+* The '''human-readable part''', which is intended to convey the type of data, or anything else that is relevant to the reader. This part MUST contain 1 to 83 US-ASCII characters, with each character having a value in the range [33-126]. HRP validity may be further restricted by specific applications.
 * The '''separator''', which is always "1". In case "1" is allowed inside the human-readable part, the last one in the string is the separator<ref>'''Why include a separator in addresses?''' That way the human-readable
 part is unambiguously separated from the data part, avoiding potential
 collisions with other human-readable parts that share a prefix. It also
@@ -153,7 +153,7 @@ guarantees detection of '''any error affecting at most 4 characters'''
 and has less than a 1 in 10<sup>9</sup> chance of failing to detect more
 errors. More details about the properties can be found in the
 Checksum Design appendix. The human-readable part is processed by first
-feeding the higher bits of each character's ASCII value into the
+feeding the higher bits of each character's US-ASCII value into the
 checksum calculation followed by a zero and then the lower bits of each<ref>'''Why are the high bits of the human-readable part processed first?'''
 This results in the actually checksummed data being ''[high hrp] 0 [low hrp] [data]''. This means that under the assumption that errors to the
 human readable part only change the low 5 bits (like changing an alphabetical character into another), errors are restricted to the ''[low hrp] [data]''
@@ -182,11 +182,15 @@ to make.
 
 '''Uppercase/lowercase'''
 
-Decoders MUST accept both uppercase and lowercase strings, but
-not mixed case. The lowercase form is used when determining a character's
-value for checksum purposes. For presentation, lowercase is usually
-preferable, but inside QR codes uppercase SHOULD be used, as those permit
-the use of
+The lowercase form is used when determining a character's value for checksum purposes.
+
+Encoders MUST always output an all-lowercase Bech32 string.
+If an uppercase version of the encoding result is desired (e.g., for presentation purposes or QR code use),
+then an uppercasing procedure can be performed external to the encoding process.
+
+Decoders MUST NOT accept strings where some characters are uppercase and some are lowercase (such strings are referred to as mixed case strings).
+
+For presentation, lowercase is usually preferable, but inside QR codes uppercase SHOULD be used, as those permit the use of
 ''[http://www.thonky.com/qr-code-tutorial/alphanumeric-mode-encoding alphanumeric mode]'', which is 45% more compact than the normal
 ''[http://www.thonky.com/qr-code-tutorial/byte-mode-encoding byte mode]''.
 
@@ -204,8 +208,8 @@ be of the same length as the mainnet counterpart (to simplify
 implementations' assumptions about lengths), but still be visually
 distinct.</ref> for testnet.
 * The data-part values:
-** 1 value: the witness version
-** A conversion of the the 2-to-40-byte witness program (as defined by [https://github.com/bitcoin/bips/blob/master/bip-0141.mediawiki BIP141]) to base32:
+** 1 byte: the witness version
+** A conversion of the 2-to-40-byte witness program (as defined by [https://github.com/bitcoin/bips/blob/master/bip-0141.mediawiki BIP141]) to base32:
 *** Start with the bits of the witness program, most significant bit per byte first.
 *** Re-arrange those bits into groups of 5, and pad with zeroes at the end if needed.
 *** Translate those bits to characters using the table above.
@@ -227,6 +231,12 @@ program is neither 20 nor 32 bytes, the script must fail.''
 As a result of the previous rules, addresses are always between 14 and 74 characters long, and their length modulo 8 cannot be 0, 3, or 5.
 Version 0 witness addresses are always 42 or 62 characters, but implementations MUST allow the use of any version.
 
+Implementations should take special care when converting the address to a
+scriptPubKey, where witness version ''n'' is stored as ''OP_n''. OP_0 is
+encoded as 0x00, but OP_1 through OP_16 are encoded as 0x51 through 0x60
+(81 to 96 in decimal). If a Bech32 address is converted to an incorrect
+scriptPubKey, the result will likely be either unspendable or insecure.
+
 ===Compatibility===
 
 Only new software will be able to use these addresses, and only for
@@ -241,32 +251,62 @@ P2PKH addresses can be used.
 
 * Reference encoder and decoder:
 ** [https://github.com/sipa/bech32/tree/master/ref/c For C]
+** [https://github.com/sipa/bech32/tree/master/ref/c++ For C++]
 ** [https://github.com/sipa/bech32/tree/master/ref/javascript For JavaScript]
+** [https://github.com/sipa/bech32/tree/master/ref/go For Go]
 ** [https://github.com/sipa/bech32/tree/master/ref/python For Python]
 ** [https://github.com/sipa/bech32/tree/master/ref/haskell For Haskell]
+** [https://github.com/sipa/bech32/tree/master/ref/ruby For Ruby]
 ** [https://github.com/sipa/bech32/tree/master/ref/rust For Rust]
 
 * Fancy decoder that localizes errors:
 ** [https://github.com/sipa/bech32/tree/master/ecc/javascript For JavaScript] ([http://bitcoin.sipa.be/bech32/demo/demo.html demo website])
 
+==Registered Human-readable Prefixes==
+
+SatoshiLabs maintains a full list of registered human-readable parts for other cryptocurrencies:
+
+[https://github.com/satoshilabs/slips/blob/master/slip-0173.md SLIP-0173 : Registered human-readable parts for BIP-0173]
+
 ==Appendices==
 
 ===Test vectors===
 
-The following strings have a valid Bech32 checksum.
+The following strings are valid Bech32:
 * <tt>A12UEL5L</tt>
+* <tt>a12uel5l</tt>
 * <tt>an83characterlonghumanreadablepartthatcontainsthenumber1andtheexcludedcharactersbio1tt5tgs</tt>
 * <tt>abcdef1qpzry9x8gf2tvdw0s3jn54khce6mua7lmqqqxw</tt>
 * <tt>11qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqc8247j</tt>
 * <tt>split1checkupstagehandshakeupstreamerranterredcaperred2y9e3w</tt>
+* <tt>?1ezyfcl</tt> WARNING: During conversion to US-ASCII some encoders may set unmappable characters to a valid US-ASCII character, such as '?'. For example:
+
+<pre>
+>>> bech32_encode('\x80'.encode('ascii', 'replace').decode('ascii'), [])
+'?1ezyfcl'
+</pre>
+
+The following strings are not valid Bech32 (with reason for invalidity):
+* 0x20 + <tt>1nwldj5</tt>: HRP character out of range
+* 0x7F + <tt>1axkwrx</tt>: HRP character out of range
+* 0x80 + <tt>1eym55h</tt>: HRP character out of range
+* <tt>an84characterslonghumanreadablepartthatcontainsthenumber1andtheexcludedcharactersbio1569pvx</tt>: overall max length exceeded
+* <tt>pzry9x0s0muk</tt>: No separator character
+* <tt>1pzry9x0s0muk</tt>: Empty HRP
+* <tt>x1b4n0q5v</tt>: Invalid data character
+* <tt>li1dgmt3</tt>: Too short checksum
+* <tt>de1lg7wt</tt> + 0xFF: Invalid character in checksum
+* <tt>A1G7SGD8</tt>: checksum calculated with uppercase form of HRP
+* <tt>10a06t8</tt>: empty HRP
+* <tt>1qzzfhee</tt>: empty HRP
 
 The following list gives valid segwit addresses and the scriptPubKey that they
 translate to in hex.
 * <tt>BC1QW508D6QEJXTDG4Y5R3ZARVARY0C5XW7KV8F3T4</tt>: <tt>0014751e76e8199196d454941c45d1b3a323f1433bd6</tt>
 * <tt>tb1qrp33g0q5c5txsp9arysrx4k6zdkfs4nce4xj0gdcccefvpysxf3q0sl5k7</tt>: <tt>00201863143c14c5166804bd19203356da136c985678cd4d27a1b8c6329604903262</tt>
-* <tt>bc1pw508d6qejxtdg4y5r3zarvary0c5xw7kw508d6qejxtdg4y5r3zarvary0c5xw7k7grplx</tt>: <tt>8128751e76e8199196d454941c45d1b3a323f1433bd6751e76e8199196d454941c45d1b3a323f1433bd6</tt>
-* <tt>BC1SW50QA3JX3S</tt>: <tt>9002751e</tt>
-* <tt>bc1zw508d6qejxtdg4y5r3zarvaryvg6kdaj</tt>: <tt>8210751e76e8199196d454941c45d1b3a323</tt>
+* <tt>bc1pw508d6qejxtdg4y5r3zarvary0c5xw7kw508d6qejxtdg4y5r3zarvary0c5xw7k7grplx</tt>: <tt>5128751e76e8199196d454941c45d1b3a323f1433bd6751e76e8199196d454941c45d1b3a323f1433bd6</tt>
+* <tt>BC1SW50QA3JX3S</tt>: <tt>6002751e</tt>
+* <tt>bc1zw508d6qejxtdg4y5r3zarvaryvg6kdaj</tt>: <tt>5210751e76e8199196d454941c45d1b3a323</tt>
 * <tt>tb1qqqqqp399et2xygdj5xreqhjjvcmzhxw4aywxecjdzew6hylgvsesrxh6hy</tt>: <tt>0020000000c4a5cad46221b2a187905e5266362b99d5e91c6ce24d165dab93e86433</tt>
 
 The following list gives invalid segwit addresses and the reason for
@@ -278,8 +318,9 @@ their invalidity.
 * <tt>bc10w508d6qejxtdg4y5r3zarvary0c5xw7kw508d6qejxtdg4y5r3zarvary0c5xw7kw5rljs90</tt>: Invalid program length
 * <tt>BC1QR508D6QEJXTDG4Y5R3ZARVARYV98GJ9P</tt>: Invalid program length for witness version 0 (per BIP141)
 * <tt>tb1qrp33g0q5c5txsp9arysrx4k6zdkfs4nce4xj0gdcccefvpysxf3q0sL5k7</tt>: Mixed case
-* <tt>tb1pw508d6qejxtdg4y5r3zarqfsj6c3</tt>: zero padding of more than 4 bits
+* <tt>bc1zw508d6qejxtdg4y5r3zarvaryvqyzf3du</tt>: zero padding of more than 4 bits
 * <tt>tb1qrp33g0q5c5txsp9arysrx4k6zdkfs4nce4xj0gdcccefvpysxf3pjxtptv</tt>: Non-zero padding in 8-to-5 conversion
+* <tt>bc1gmk9yu</tt>: Empty data section
 
 ===Checksum design===