diff options
Diffstat (limited to 'bip-0380.mediawiki')
-rw-r--r-- | bip-0380.mediawiki | 335 |
1 files changed, 335 insertions, 0 deletions
diff --git a/bip-0380.mediawiki b/bip-0380.mediawiki new file mode 100644 index 0000000..27b7908 --- /dev/null +++ b/bip-0380.mediawiki @@ -0,0 +1,335 @@ +<pre> + BIP: 380 + Layer: Applications + Title: Output Script Descriptors General Operation + Author: Pieter Wuille <pieter@wuille.net> + Ava Chow <me@achow101.com> + Comments-Summary: No comments yet. + Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0380 + Status: Draft + Type: Informational + Created: 2021-06-27 + License: BSD-2-Clause +</pre> + +==Abstract== + +Output Script Descriptors are a simple language which can be used to describe collections of output scripts. +There can be many different descriptor fragments and functions. +This document describes the general syntax for descriptors, descriptor checksums, and common expressions. + +==Copyright== + +This BIP is licensed under the BSD 2-clause license. + +==Motivation== + +Bitcoin wallets traditionally have stored a set of keys which are later serialized and mutated to produce the output scripts that the wallet watches and the addresses it provides to users. +Typically backups have consisted of solely the private keys, nowadays primarily in the form of BIP 39 mnemonics. +However this backup solution is insufficient, especially since the introduction of Segregated Witness which added new output types. +Given just the private keys, it is not possible for restored wallets to know which kinds of output scripts and addresses to produce. +This has lead to incompatibilities between wallets when restoring a backup or exporting data for a watch only wallet. + +Further complicating matters are BIP 32 derivation paths. +Although BIPs 44, 49, and 84 have specified standard BIP 32 derivation paths for different output scripts and addresses, not all wallets support them nor use those derivation paths. +The lack of derivation path information in these backups and exports leads to further incompatibilities between wallets. + +Current solutions to these issues have not been generic and can be viewed as being layer violations. +Solutions such as introducing different version bytes for extended key serialization both are a layer violation (key derivation should be separate from script type meaning) and specific only to a particular derivation path and script type. + +Output Script Descriptors introduces a generic solution to these issues. +Script types are specified explicitly through the use of Script Expressions. +Key derivation paths are specified explicitly in Key Expressions. +These allow for creating wallet backups and exports which specify the exact scripts, subscripts (redeemScript, witnessScript, etc.), and keys to produce. +With the general structure specified in this BIP, new Script Expressions can be introduced as new script types are added. +Lastly, the use of common terminology and existing standards allow for Output Script Descriptors to be engineer readable so that the results can be understood at a glance. + +==Specification== + +Descriptors consist of several types of expressions. +The top level expression is a <tt>SCRIPT</tt>. +This expression may be followed by <tt>#CHECKSUM</tt>, where <tt>CHECKSUM</tt> is an 8 character alphanumeric descriptor checksum. +Although the checksum is optional for parsing, applications may choose to reject descriptors that do not contain a checksum. + +===Script Expressions=== + +Script Expressions (denoted <tt>SCRIPT</tt>) are expressions which correspond directly with a Bitcoin script. +These expressions are written as functions and take arguments. +Such expressions have a script template which is filled with the arguments correspondingly. +Expressions are written with a human readable identifier string with the arguments enclosed with parentheses. +The identifier string should be alphanumeric and may include underscores. + +The arguments to a script expression are defined by that expression itself. +They could be a script expression, a key expression, or some other expression entirely. + +===Key Expressions=== + +A common expression used as an argument to script expressions are key expressions (denoted <tt>KEY</tt>). +These represent a public or private key and, optionally, information about the origin of that key. +Key expressions can only be used as arguments to script expressions. + +Key expressions consist of: +* Optionally, key origin information, consisting of: +** An open bracket <tt>[</tt> +** Exactly 8 hex characters for the fingerprint of the key where the derivation starts (see BIP 32 for details) +** Followed by zero or more <tt>/NUM</tt> or <tt>/NUMh</tt> path elements to indicate the unhardened or hardened derivation steps between the fingerprint and the key that follows. +** A closing bracket <tt>]</tt> +* Followed by the actual key, which is either: +** A hex encoded public key, which depending the script expression, may be either: +*** 66 hex character string beginning with <tt>02</tt> or <tt>03</tt> representing a compressed public key +*** 130 hex character string beginning with <tt>04</tt> representing an uncompressed public key +** A [[https://en.bitcoin.it/wiki/Wallet_import_format|WIF]] encoded private key +** <tt>xpub</tt> encoded extended public key or <tt>xprv</tt> encoded extended private key (as defined in BIP 32) +*** Followed by zero or more <tt>/NUM</tt> or <tt>/NUMh</tt> path elements indicating BIP 32 derivation steps to be taken after the given extended key. +*** Optionally followed by a single <tt>/*</tt> or <tt>/*h</tt> final step to denote all direct unhardened or hardened children. + +If the <tt>KEY</tt> is a BIP 32 extended key, before output scripts can be created, child keys must be derived using the derivation information that follows the extended key. +When the final step is <tt>/*</tt> or <tt>/*'</tt>, an output script will be produced for every child key index. +The derived key must be not be serialized as an uncompressed public key. +Script Expressions may have further requirements on how derived public keys are serialized for script creation. + +In the above specification, the hardened indicator <tt>h</tt> may be replaced with alternative hardened indicators of <tt>H</tt> or <tt>'</tt>. + +====Normalization of Key Expressions with Hardened Derivation==== + +When a descriptor is exported without private keys, it is necessary to do additional derivation to remove any intermediate hardened derivation steps for the exported descriptor to be useful. +The exporter should derive the extended public key at the last hardened derivation step and use that extended public key as the key in the descriptor. +The derivation steps that were taken to get to that key must be added to the previous key origin information. +If there is no key origin information, then one must be added for the newly derived extended public key. +If the final derivation is hardened, then it is not necessary to do additional derivation. + +===Character Set=== + +The expressions used in descriptors must only contain characters within this character set so that the descriptor checksum will work. + +The allowed characters are: +<pre> +0123456789()[],'/*abcdefgh@:$%{} +IJKLMNOPQRSTUVWXYZ&+-.;<=>?!^_|~ +ijklmnopqrstuvwxyzABCDEFGH`#"\<space> +</pre> +Note that <tt><space></tt> on the last line is a space character. + +This character set is written as 3 groups of 32 characters in this specific order so that the checksum below can identify more errors. +The first group are the most common "unprotected" characters (i.e. things such as hex and keypaths that do not already have their own checksums). +Case errors cause an offset that is a multiple of 32 while as many alphabetic characters are in the same group while following the previous restrictions. + +===Checksum=== + +Following the top level script expression is a single octothorpe (<tt>#</tt>) followed by the 8 character checksum. +The checksum is an error correcting checksum similar to bech32. + +The checksum has the following properties: +* Mistakes in a descriptor string are measured in "symbol errors". The higher the number of symbol errors, the harder it is to detect: +** An error substituting a character from <tt>0123456789()[],'/*abcdefgh@:$%{}</tt> for another in that set always counts as 1 symbol error. +*** Note that hex encoded keys are covered by these characters. Extended keys (<tt>xpub</tt> and <tt>xprv</tt>) use other characters too, but also have their own checksum mechanism. +*** <tt>SCRIPT</tt> expression function names use other characters, but mistakes in these would generally result in an unparsable descriptor. +** A case error always counts as 1 symbol error. +** Any other 1 character substitution error counts as 1 or 2 symbol errors. +* Any 1 symbol error is always detected. +* Any 2 or 3 symbol error in a descriptor of up to 49154 characters is always detected. +* Any 4 symbol error in a descriptor of up to 507 characters is always detected. +* Any 5 symbol error in a descriptor of up to 77 characters is always detected. +* Is optimized to minimize the chance of a 5 symbol error in a descriptor up to 387 characters is undetected +* Random errors have a chance of 1 in 2<super>40</super> of being undetected. + +The checksum itself uses the same character set as bech32: <tt>qpzry9x8gf2tvdw0s3jn54khce6mua7l</tt> + +Valid descriptor strings with a checksum must pass the criteria for validity specified by the Python3 code snippet below. +The function <tt>descsum_check</tt> must return true when its argument <tt>s</tt> is a descriptor consisting in the form <tt>SCRIPT#CHECKSUM</tt>. + +<pre> +INPUT_CHARSET = "0123456789()[],'/*abcdefgh@:$%{}IJKLMNOPQRSTUVWXYZ&+-.;<=>?!^_|~ijklmnopqrstuvwxyzABCDEFGH`#\"\\ " +CHECKSUM_CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l" +GENERATOR = [0xf5dee51989, 0xa9fdca3312, 0x1bab10e32d, 0x3706b1677a, 0x644d626ffd] + +def descsum_polymod(symbols): + """Internal function that computes the descriptor checksum.""" + chk = 1 + for value in symbols: + top = chk >> 35 + chk = (chk & 0x7ffffffff) << 5 ^ value + for i in range(5): + chk ^= GENERATOR[i] if ((top >> i) & 1) else 0 + return chk + +def descsum_expand(s): + """Internal function that does the character to symbol expansion""" + groups = [] + symbols = [] + for c in s: + if not c in INPUT_CHARSET: + return None + v = INPUT_CHARSET.find(c) + symbols.append(v & 31) + groups.append(v >> 5) + if len(groups) == 3: + symbols.append(groups[0] * 9 + groups[1] * 3 + groups[2]) + groups = [] + if len(groups) == 1: + symbols.append(groups[0]) + elif len(groups) == 2: + symbols.append(groups[0] * 3 + groups[1]) + return symbols + +def descsum_check(s): + """Verify that the checksum is correct in a descriptor""" + if s[-9] != '#': + return False + if not all(x in CHECKSUM_CHARSET for x in s[-8:]): + return False + symbols = descsum_expand(s[:-9]) + [CHECKSUM_CHARSET.find(x) for x in s[-8:]] + return descsum_polymod(symbols) == 1 +</pre> + +This implements a BCH code that has the properties described above. +The entire descriptor string is first processed into an array of symbols. +The symbol for each character is its position within its group. +After every 3rd symbol, a 4th symbol is inserted which represents the group numbers combined together. +This means that a change that only affects the position within a group, or only a group number change, will only affect a single symbol. + +To construct a valid checksum given a script expression, the code below can be used: + +<pre> +def descsum_create(s): + """Add a checksum to a descriptor without""" + symbols = descsum_expand(s) + [0, 0, 0, 0, 0, 0, 0, 0] + checksum = descsum_polymod(symbols) ^ 1 + return s + '#' + ''.join(CHECKSUM_CHARSET[(checksum >> (5 * (7 - i))) & 31] for i in range(8)) + +</pre> + +==Test Vectors== + +The following tests cover the checksum and character set: + +* Valid checksum: <tt>raw(deadbeef)#89f8spxm</tt> +* No checksum: <tt>raw(deadbeef)</tt> +* Missing checksum: <tt>raw(deadbeef)#</tt> +* Too long checksum (9 chars): <tt>raw(deadbeef)#89f8spxmx</tt> +* Too short checksum (7 chars): <tt>raw(deadbeef)#89f8spx</tt> +* Error in payload: <tt>raw(deedbeef)#89f8spxm</tt> +* Error in checksum: <tt>raw(deedbeef)##9f8spxm</tt> +* Invalid characters in payload: <tt>raw(Ü)#00000000</tt> + +The following tests cover key expressions: + +Valid expressions: + +* Compressed public key: <tt>0260b2003c386519fc9eadf2b5cf124dd8eea4c4e68d5e154050a9346ea98ce600</tt> +* Uncompressed public key: <tt>04a34b99f22c790c4e36b2b3c2c35a36db06226e41c692fc82b8b56ac1c540c5bd5b8dec5235a0fa8722476c7709c02559e3aa73aa03918ba2d492eea75abea235</tt> +* Public key with key origin: <tt>[deadbeef/0h/0h/0h]0260b2003c386519fc9eadf2b5cf124dd8eea4c4e68d5e154050a9346ea98ce600</tt> +* Public key with key origin (<tt>'</tt> as hardened indicator): <tt>[deadbeef/0'/0'/0']0260b2003c386519fc9eadf2b5cf124dd8eea4c4e68d5e154050a9346ea98ce600</tt> +* Public key with key origin (mixed hardened indicator): <tt>[deadbeef/0'/0h/0']0260b2003c386519fc9eadf2b5cf124dd8eea4c4e68d5e154050a9346ea98ce600</tt> +* WIF uncompressed private key <tt>5KYZdUEo39z3FPrtuX2QbbwGnNP5zTd7yyr2SC1j299sBCnWjss</tt> +* WIF compressed private key <tt>L4rK1yDtCWekvXuE6oXD9jCYfFNV2cWRpVuPLBcCU2z8TrisoyY1</tt> +* Extended public key: <tt>xpub6ERApfZwUNrhLCkDtcHTcxd75RbzS1ed54G1LkBUHQVHQKqhMkhgbmJbZRkrgZw4koxb5JaHWkY4ALHY2grBGRjaDMzQLcgJvLJuZZvRcEL</tt> +* Extended public key with key origin: <tt>[deadbeef/0h/1h/2h]xpub6ERApfZwUNrhLCkDtcHTcxd75RbzS1ed54G1LkBUHQVHQKqhMkhgbmJbZRkrgZw4koxb5JaHWkY4ALHY2grBGRjaDMzQLcgJvLJuZZvRcEL</tt> +* Extended public key with derivation: <tt>[deadbeef/0h/1h/2h]xpub6ERApfZwUNrhLCkDtcHTcxd75RbzS1ed54G1LkBUHQVHQKqhMkhgbmJbZRkrgZw4koxb5JaHWkY4ALHY2grBGRjaDMzQLcgJvLJuZZvRcEL/3/4/5</tt> +* Extended public key with derivation and children: <tt>[deadbeef/0h/1h/2h]xpub6ERApfZwUNrhLCkDtcHTcxd75RbzS1ed54G1LkBUHQVHQKqhMkhgbmJbZRkrgZw4koxb5JaHWkY4ALHY2grBGRjaDMzQLcgJvLJuZZvRcEL/3/4/5/*</tt> +* Extended public key with hardened derivation and unhardened children: <tt>xpub6ERApfZwUNrhLCkDtcHTcxd75RbzS1ed54G1LkBUHQVHQKqhMkhgbmJbZRkrgZw4koxb5JaHWkY4ALHY2grBGRjaDMzQLcgJvLJuZZvRcEL/3h/4h/5h/*</tt> +* Extended public key with hardened derivation and children: <tt>xpub6ERApfZwUNrhLCkDtcHTcxd75RbzS1ed54G1LkBUHQVHQKqhMkhgbmJbZRkrgZw4koxb5JaHWkY4ALHY2grBGRjaDMzQLcgJvLJuZZvRcEL/3h/4h/5h/*h</tt> +* Extended public key with key origin, hardened derivation and children: <tt>[deadbeef/0h/1h/2]xpub6ERApfZwUNrhLCkDtcHTcxd75RbzS1ed54G1LkBUHQVHQKqhMkhgbmJbZRkrgZw4koxb5JaHWkY4ALHY2grBGRjaDMzQLcgJvLJuZZvRcEL/3h/4h/5h/*h</tt> +* Extended private key: <tt>xprvA1RpRA33e1JQ7ifknakTFpgNXPmW2YvmhqLQYMmrj4xJXXWYpDPS3xz7iAxn8L39njGVyuoseXzU6rcxFLJ8HFsTjSyQbLYnMpCqE2VbFWc</tt> +* Extended private key with key origin: <tt>[deadbeef/0h/1h/2h]xprvA1RpRA33e1JQ7ifknakTFpgNXPmW2YvmhqLQYMmrj4xJXXWYpDPS3xz7iAxn8L39njGVyuoseXzU6rcxFLJ8HFsTjSyQbLYnMpCqE2VbFWc</tt> +* Extended private key with derivation: <tt>[deadbeef/0h/1h/2h]xprvA1RpRA33e1JQ7ifknakTFpgNXPmW2YvmhqLQYMmrj4xJXXWYpDPS3xz7iAxn8L39njGVyuoseXzU6rcxFLJ8HFsTjSyQbLYnMpCqE2VbFWc/3/4/5</tt> +* Extended private key with derivation and children: <tt>[deadbeef/0h/1h/2h]xprvA1RpRA33e1JQ7ifknakTFpgNXPmW2YvmhqLQYMmrj4xJXXWYpDPS3xz7iAxn8L39njGVyuoseXzU6rcxFLJ8HFsTjSyQbLYnMpCqE2VbFWc/3/4/5/*</tt> +* Extended private key with hardened derivation and unhardened children: <tt>xprvA1RpRA33e1JQ7ifknakTFpgNXPmW2YvmhqLQYMmrj4xJXXWYpDPS3xz7iAxn8L39njGVyuoseXzU6rcxFLJ8HFsTjSyQbLYnMpCqE2VbFWc/3h/4h/5h/*</tt> +* Extended private key with hardened derivation and children: <tt>xprvA1RpRA33e1JQ7ifknakTFpgNXPmW2YvmhqLQYMmrj4xJXXWYpDPS3xz7iAxn8L39njGVyuoseXzU6rcxFLJ8HFsTjSyQbLYnMpCqE2VbFWc/3h/4h/5h/*h</tt> +* Extended private key with key origin, hardened derivation and children: <tt>[deadbeef/0h/1h/2]xprvA1RpRA33e1JQ7ifknakTFpgNXPmW2YvmhqLQYMmrj4xJXXWYpDPS3xz7iAxn8L39njGVyuoseXzU6rcxFLJ8HFsTjSyQbLYnMpCqE2VbFWc/3h/4h/5h/*h</tt> + +Invalid expression: + +* Children indicator in key origin: <tt>[deadbeef/0h/0h/0h/*]0260b2003c386519fc9eadf2b5cf124dd8eea4c4e68d5e154050a9346ea98ce600</tt> +* Trailing slash in key origin: <tt>[deadbeef/0h/0h/0h/]0260b2003c386519fc9eadf2b5cf124dd8eea4c4e68d5e154050a9346ea98ce600</tt> +* Too short fingerprint: <tt>[deadbef/0h/0h/0h]0260b2003c386519fc9eadf2b5cf124dd8eea4c4e68d5e154050a9346ea98ce600</tt> +* Too long fingerprint: <tt>[deadbeeef/0h/0h/0h]0260b2003c386519fc9eadf2b5cf124dd8eea4c4e68d5e154050a9346ea98ce600</tt> +* Invalid hardened indicators: <tt>[deadbeef/0f/0f/0f]0260b2003c386519fc9eadf2b5cf124dd8eea4c4e68d5e154050a9346ea98ce600</tt> +* Invalid hardened indicators: <tt>[deadbeef/0H/0H/0H]0260b2003c386519fc9eadf2b5cf124dd8eea4c4e68d5e154050a9346ea98ce600</tt> +* Invalid hardened indicators: <tt>[deadbeef/-0/-0/-0]0260b2003c386519fc9eadf2b5cf124dd8eea4c4e68d5e154050a9346ea98ce600</tt> +* Private key with derivation: <tt>L4rK1yDtCWekvXuE6oXD9jCYfFNV2cWRpVuPLBcCU2z8TrisoyY1/0</tt> +* Private key with derivation children: <tt>L4rK1yDtCWekvXuE6oXD9jCYfFNV2cWRpVuPLBcCU2z8TrisoyY1/*</tt> +* Derivation index out of range: <tt>xprv9s21ZrQH143K31xYSDQpPDxsXRTUcvj2iNHm5NUtrGiGG5e2DtALGdso3pGz6ssrdK4PFmM8NSpSBHNqPqm55Qn3LqFtT2emdEXVYsCzC2U/2147483648</tt> +* Invalid derivation index: <tt>xprv9s21ZrQH143K31xYSDQpPDxsXRTUcvj2iNHm5NUtrGiGG5e2DtALGdso3pGz6ssrdK4PFmM8NSpSBHNqPqm55Qn3LqFtT2emdEXVYsCzC2U/1aa</tt> +* Multiple key origins: <tt>[aaaaaaaa][aaaaaaaa]xprv9s21ZrQH143K31xYSDQpPDxsXRTUcvj2iNHm5NUtrGiGG5e2DtALGdso3pGz6ssrdK4PFmM8NSpSBHNqPqm55Qn3LqFtT2emdEXVYsCzC2U/2147483647'/0</tt> +* Missing key origin start: <tt>aaaaaaaa]xprv9s21ZrQH143K31xYSDQpPDxsXRTUcvj2iNHm5NUtrGiGG5e2DtALGdso3pGz6ssrdK4PFmM8NSpSBHNqPqm55Qn3LqFtT2emdEXVYsCzC2U/2147483647'/0</tt> +* Non hex fingerprint: <tt>[gaaaaaaa]xprv9s21ZrQH143K31xYSDQpPDxsXRTUcvj2iNHm5NUtrGiGG5e2DtALGdso3pGz6ssrdK4PFmM8NSpSBHNqPqm55Qn3LqFtT2emdEXVYsCzC2U/2147483647'/0</tt> +* Key origin with no public key: <tt>[deadbeef]</tt> + +==Backwards Compatibility== + +Output script descriptors are an entirely new language which is not compatible with any existing software. +However many components of the expressions reuse encodings and serializations defined by previous BIPs. + +Output script descriptors are designed for future extension with further fragment types and new script expressions. +These will be specified in additional BIPs. + +==Reference Implementation== + +Descriptors have been implemented in Bitcoin Core since version 0.17. + +==Appendix A: Index of Expressions== + +Future BIPs may specify additional types of expressions. +All available expression types are listed in this table. + +{| +! Name +! Denoted As +! BIP +|- +| Script +| <tt>SCRIPT</tt> +| 380 +|- +| Key +| <tt>KEY</tt> +| 380 +|- +| Tree +| <tt>TREE</tt> +| [[bip-0386.mediawiki|386]] +|} + +==Appendix B: Index of Script Expressions== + +Script expressions will be specified in additional BIPs. +This Table lists all available Script expressions and the BIPs specifying them. + +{| +! Expression +! BIP +|- +| <tt>pk(KEY)</tt> +| [[bip-0381.mediawiki|381]] +|- +| <tt>pkh(KEY)</tt> +| [[bip-0381.mediawiki|381]] +|- +| <tt>sh(SCRIPT)</tt> +| [[bip-0381.mediawiki|381]] +|- +| <tt>wpkh(KEY)</tt> +| [[bip-0382.mediawiki|382]] +|- +| <tt>wsh(SCRIPT)</tt> +| [[bip-0382.mediawiki|382]] +|- +| <tt>multi(NUM, KEY, ..., KEY)</tt> +| [[bip-0383.mediawiki|383]] +|- +| <tt>sortedmulti(NUM, KEY, ..., KEY)</tt> +| [[bip-0383.mediawiki|383]] +|- +| <tt>combo(KEY)</tt> +| [[bip-0384.mediawiki|384]] +|- +| <tt>raw(HEX)</tt> +| [[bip-0385.mediawiki|385]] +|- +| <tt>addr(ADDR)</tt> +| [[bip-0385.mediawiki|385]] +|- +| <tt>tr(KEY)</tt>, <tt>tr(KEY, TREE)</tt> +| [[bip-0386.mediawiki|386]] +|} |