summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorLuke Dashjr <luke-jr+git@utopios.org>2020-01-24 00:01:16 +0000
committerLuke Dashjr <luke-jr+git@utopios.org>2020-01-24 00:01:16 +0000
commit33308e75f8e741a9c092f0a0dc9701c614a00175 (patch)
treec4f6aa0cfd07e99a273b31268a66c168c8ba3b75
parent6a802329e468b6a5fa24a1f28d2464736e490fbb (diff)
parent97ef6783dfe3825d600ccdc968a29b6340f4441a (diff)
downloadbips-33308e75f8e741a9c092f0a0dc9701c614a00175.tar.xz
Merge BIPs 340-342
-rw-r--r--README.mediawiki21
-rw-r--r--bip-0340.mediawiki263
-rw-r--r--bip-0340/reference.py169
-rw-r--r--bip-0340/speedup-batch.pngbin0 -> 11914 bytes
-rw-r--r--bip-0340/test-vectors.csv16
-rw-r--r--bip-0340/test-vectors.py241
-rw-r--r--bip-0341.mediawiki311
-rw-r--r--bip-0341/tree.pngbin0 -> 78937 bytes
-rw-r--r--bip-0342.mediawiki140
-rwxr-xr-xscripts/buildtable.pl1
10 files changed, 1162 insertions, 0 deletions
diff --git a/README.mediawiki b/README.mediawiki
index 46cc389..4a086af 100644
--- a/README.mediawiki
+++ b/README.mediawiki
@@ -931,6 +931,27 @@ Those proposing changes should consider that ultimately consent may rest with th
| Gleb Naumenko, Pieter Wuille
| Standard
| Draft
+|-
+| [[bip-0340.mediawiki|340]]
+|
+| Schnorr Signatures for secp256k1
+| Pieter Wuille, Jonas Nick, Tim Ruffing
+| Standard
+| Draft
+|-
+| [[bip-0341.mediawiki|341]]
+| Consensus (soft fork)
+| Taproot: SegWit version 1 spending rules
+| Pieter Wuille, Jonas Nick, Anthony Towns
+| Standard
+| Draft
+|-
+| [[bip-0342.mediawiki|342]]
+| Consensus (soft fork)
+| Validation of Taproot Scripts
+| Pieter Wuille, Jonas Nick, Anthony Towns
+| Standard
+| Draft
|}
<!-- IMPORTANT! See the instructions at the top of this page, do NOT JUST add BIPs here! -->
diff --git a/bip-0340.mediawiki b/bip-0340.mediawiki
new file mode 100644
index 0000000..9e0a73e
--- /dev/null
+++ b/bip-0340.mediawiki
@@ -0,0 +1,263 @@
+<pre>
+ BIP: 340
+ Title: Schnorr Signatures for secp256k1
+ Author: Pieter Wuille <pieter.wuille@gmail.com>
+ Jonas Nick <Jonas Nick <jonasd.nick@gmail.com>
+ Tim Ruffing <crypto@timruffing.de>
+ Comments-Summary: No comments yet.
+ Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0340
+ Status: Draft
+ Type: Standards Track
+ License: BSD-2-Clause
+ Created: 2020-01-19
+ Post-History: 2018-07-06: https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-July/016203.html [bitcoin-dev] Schnorr signatures BIP
+</pre>
+
+== Introduction ==
+
+=== Abstract ===
+
+This document proposes a standard for 64-byte Schnorr signatures over the elliptic curve ''secp256k1''.
+
+=== Copyright ===
+
+This document is licensed under the 2-clause BSD license.
+
+=== Motivation ===
+
+Bitcoin has traditionally used
+[https://en.wikipedia.org/wiki/Elliptic_Curve_Digital_Signature_Algorithm ECDSA] signatures over the [https://www.secg.org/sec2-v2.pdf secp256k1 curve] with [https://en.wikipedia.org/wiki/SHA-2 SHA256] hashes for authenticating
+transactions. These are [https://www.secg.org/sec1-v2.pdf standardized], but have a number of downsides
+compared to [http://publikationen.ub.uni-frankfurt.de/opus4/files/4280/schnorr.pdf Schnorr signatures] over the same curve:
+
+* '''Provable security''': Schnorr signatures are provably secure. In more detail, they are ''strongly unforgeable under chosen message attack (SUF-CMA)''<ref>Informally, this means that without knowledge of the secret key but given valid signatures of arbitrary messages, it is not possible to come up with further valid signatures.</ref> [https://www.di.ens.fr/~pointche/Documents/Papers/2000_joc.pdf in the random oracle model assuming the hardness of the elliptic curve discrete logarithm problem (ECDLP)] and [http://www.neven.org/papers/schnorr.pdf in the generic group model assuming variants of preimage and second preimage resistance of the used hash function]<ref>A detailed security proof in the random oracle model, which essentially restates [https://www.di.ens.fr/~pointche/Documents/Papers/2000_joc.pdf the original security proof by Pointcheval and Stern] more explicitly, can be found in [https://eprint.iacr.org/2016/191 a paper by Kiltz, Masny and Pan]. All these security proofs assume a variant of Schnorr signatures that use ''(e,s)'' instead of ''(R,s)'' (see Design above). Since we use a unique encoding of ''R'', there is an efficiently computable bijection that maps ''(R,s)'' to ''(e,s)'', which allows to convert a successful SUF-CMA attacker for the ''(e,s)'' variant to a successful SUF-CMA attacker for the ''(R,s)'' variant (and vice-versa). Furthermore, the proofs consider a variant of Schnorr signatures without key prefixing (see Design above), but it can be verified that the proofs are also correct for the variant with key prefixing. As a result, all the aforementioned security proofs apply to the variant of Schnorr signatures proposed in this document.</ref>. In contrast, the [https://nbn-resolving.de/urn:nbn:de:hbz:294-60803 best known results for the provable security of ECDSA] rely on stronger assumptions.
+* '''Non-malleability''': The SUF-CMA security of Schnorr signatures implies that they are non-malleable. On the other hand, ECDSA signatures are inherently malleable<ref>If ''(r,s)'' is a valid ECDSA signature for a given message and key, then ''(r,n-s)'' is also valid for the same message and key. If ECDSA is restricted to only permit one of the two variants (as Bitcoin does through a policy rule on the network), it can be [https://nbn-resolving.de/urn:nbn:de:hbz:294-60803 proven] non-malleable under stronger than usual assumptions.</ref>; a third party without access to the secret key can alter an existing valid signature for a given public key and message into another signature that is valid for the same key and message. This issue is discussed in [[bip-0062.mediawiki|BIP62]] and [[bip-0146.mediawiki|BIP146]].
+* '''Linearity''': Schnorr signatures provide a simple and efficient method that enables multiple collaborating parties to produce a signature that is valid for the sum of their public keys. This is the building block for various higher-level constructions that improve efficiency and privacy, such as multisignatures and others (see Applications below).
+
+For all these advantages, there are virtually no disadvantages, apart
+from not being standardized. This document seeks to change that. As we
+propose a new standard, a number of improvements not specific to Schnorr signatures can be
+made:
+
+* '''Signature encoding''': Instead of using [https://en.wikipedia.org/wiki/X.690#DER_encoding DER]-encoding for signatures (which are variable size, and up to 72 bytes), we can use a simple fixed 64-byte format.
+* '''Public key encoding''': Instead of using ''compressed'' 33-byte encodings of elliptic curve points which are common in Bitcoin today, public keys in this proposal are encoded as 32 bytes.
+* '''Batch verification''': The specific formulation of ECDSA signatures that is standardized cannot be verified more efficiently in batch compared to individually, unless additional witness data is added. Changing the signature scheme offers an opportunity to address this.
+* '''Completely specified''': To be safe for usage in consensus systems, the verification algorithm must be completely specified at the byte level. This guarantees that nobody can construct a signature that is valid to some verifiers but not all. This is traditionally not a requirement for digital signature schemes, and the lack of exact specification for the DER parsing of ECDSA signatures has caused problems for Bitcoin [https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2015-July/009697.html in the past], needing [[bip-0066.mediawiki|BIP66]] to address it. In this document we aim to meet this property by design. For batch verification, which is inherently non-deterministic as the verifier can choose their batches, this property implies that the outcome of verification may only differ from individual verifications with negligible probability, even to an attacker who intentionally tries to make batch- and non-batch verification differ.
+
+By reusing the same curve and hash function as Bitcoin uses for ECDSA, we are able to retain existing mechanisms for choosing secret and public keys, and we avoid introducing new assumptions about the security of elliptic curves and hash functions.
+
+== Description ==
+
+We first build up the algebraic formulation of the signature scheme by
+going through the design choices. Afterwards, we specify the exact
+encodings and operations.
+
+=== Design ===
+
+'''Schnorr signature variant''' Elliptic Curve Schnorr signatures for message ''m'' and public key ''P'' generally involve a point ''R'', integers ''e'' and ''s'' picked by the signer, and the base point ''G'' which satisfy ''e = hash(R || m)'' and ''s⋅G = R + e⋅P''. Two formulations exist, depending on whether the signer reveals ''e'' or ''R'':
+# Signatures are pairs ''(e, s)'' that satisfy ''e = hash(s⋅G - e⋅P || m)''. This variant avoids minor complexity introduced by the encoding of the point ''R'' in the signature (see paragraphs "Encoding R and public key point P" and "Implicit Y coordinates" further below in this subsection). Moreover, revealing ''e'' instead of ''R'' allows for potentially shorter signatures: Whereas an encoding of ''R'' inherently needs about 32 bytes, the hash ''e'' can be tuned to be shorter than 32 bytes, and [http://www.neven.org/papers/schnorr.pdf a short hash of only 16 bytes suffices to provide SUF-CMA security at the target security level of 128 bits]. However, a major drawback of this optimization is that finding collisions in a short hash function is easy. This complicates the implementation of secure signing protocols in scenarios in which a group of mutually distrusting signers work together to produce a single joint signature (see Applications below). In these scenarios, which are not captured by the SUF-CMA model due its assumption of a single honest signer, a promising attack strategy for malicious co-signers is to find a collision in the hash function in order to obtain a valid signature on a message that an honest co-signer did not intent to sign.
+# Signatures are pairs ''(R, s)'' that satisfy ''s⋅G = R + hash(R || m)⋅P''. This supports batch verification, as there are no elliptic curve operations inside the hashes. Batch verification enables significant speedups.
+
+[[File:bip-0340/speedup-batch.png|center|frame|This graph shows the ratio between the time it takes to verify ''n'' signatures individually and to verify a batch of ''n'' signatures. This ratio goes up logarithmically with the number of signatures, or in other words: the total time to verify ''n'' signatures grows with ''O(n / log n)''.]]
+
+Since we would like to avoid the fragility that comes with short hashes, the ''e'' variant does not provide significant advantages. We choose the ''R''-option, which supports batch verification.
+
+'''Key prefixing''' Using the verification rule above directly makes Schnorr signatures vulnerable to "related-key attacks" in which a third party can convert a signature ''(R, s)'' for public key ''P'' into a signature ''(R, s + a⋅hash(R || m))'' for public key ''P + a⋅G'' and the same message ''m'', for any given additive tweak ''a'' to the signing key. This would render signatures insecure when keys are generated using [[bip-0032.mediawiki#public-parent-key--public-child-key|BIP32's unhardened derivation]] and other methods that rely on additive tweaks to existing keys such as Taproot.
+
+To protect against these attacks, we choose ''key prefixed''<ref>A limitation of committing to the public key (rather than to a short hash of it, or not at all) is that it removes the ability for public key recovery or verifying signatures against a short public key hash. These constructions are generally incompatible with batch verification.</ref> Schnorr signatures; changing the equation to ''s⋅G = R + hash(R || P || m)⋅P''. [https://eprint.iacr.org/2015/1135.pdf It can be shown] that key prefixing protects against related-key attacks with additive tweaks. In general, key prefixing increases robustness in multi-user settings, e.g., it seems to be a requirement for proving the MuSig multisignature scheme secure (see Applications below).
+
+We note that key prefixing is not strictly necessary for transaction signatures as used in Bitcoin currently, because signed transactions indirectly commit to the public keys already, i.e., ''m'' contains a commitment to ''pk''. However, this indirect commitment should not be relied upon because it may change with proposals such as SIGHASH_NOINPUT ([[bip-0118.mediawiki|BIP118]]), and would render the signature scheme unsuitable for other purposes than signing transactions, e.g., [https://bitcoin.org/en/developer-reference#signmessage signing ordinary messages].
+
+'''Encoding R and public key point P''' There exist several possibilities for encoding elliptic curve points:
+# Encoding the full X and Y coordinates of ''P'' and ''R'', resulting in a 64-byte public key and a 96-byte signature.
+# Encoding the full X coordinate and one bit of the Y coordinate to determine one of the two possible Y coordinates. This would result in 33-byte public keys and 65-byte signatures.
+# Encoding only the X coordinate, resulting in 32-byte public keys and 64-byte signatures.
+
+Using the first option would be slightly more efficient for verification (around 10%), but we prioritize compactness, and therefore choose option 3.
+
+'''Implicit Y coordinates''' In order to support efficient verification and batch verification, the Y coordinate of ''P'' and of ''R'' cannot be ambiguous (every valid X coordinate has two possible Y coordinates). We have a choice between several options for symmetry breaking:
+# Implicitly choosing the Y coordinate that is in the lower half.
+# Implicitly choosing the Y coordinate that is even<ref>Since ''p'' is odd, negation modulo ''p'' will map even numbers to odd numbers and the other way around. This means that for a valid X coordinate, one of the corresponding Y coordinates will be even, and the other will be odd.</ref>.
+# Implicitly choosing the Y coordinate that is a quadratic residue (has a square root modulo the field size, or "is a square" for short)<ref>A product of two numbers is a square when either both or none of the factors are squares. As ''-1'' is not a square modulo secp256k1's field size ''p'', and the two Y coordinates corresponding to a given X coordinate are each other's negation, this means exactly one of the two must be a square.</ref>.
+
+In the case of ''R'' the third option is slower at signing time but a bit faster to verify, as it is possible to directly compute whether the Y coordinate is a square when the points are represented in
+[https://en.wikibooks.org/wiki/Cryptography/Prime_Curve/Jacobian_Coordinates Jacobian coordinates] (a common optimization to avoid modular inverses
+for elliptic curve operations). The two other options require a possibly
+expensive conversion to affine coordinates first. This would even be the case if the sign or oddness were explicitly coded (option 2 in the list above). We therefore choose option 3.
+
+For ''P'' the speed of signing and verification does not significantly differ between any of the three options because affine coordinates of the point have to be computed anyway. For consistency reasons we choose the same option as for ''R''. The signing algorithm ensures that the signature is valid under the correct public key by negating the secret key if necessary.
+
+Implicit Y coordinates are not a reduction in security when expressed as the number of elliptic curve operations an attacker is expected to perform to compute the secret key. An attacker can normalize any given public key to a point whose Y coordinate is a square by negating the point if necessary. This is just a subtraction of field elements and not an elliptic curve operation<ref>This can be formalized by a simple reduction that reduces an attack on Schnorr signatures with implicit Y coordinates to an attack to Schnorr signatures with explicit Y coordinates. The reduction works by reencoding public keys and negating the result of the hash function, which is modeled as random oracle, whenever the challenge public key has an explicit Y coordinate that is not a square. A proof sketch can be found [https://medium.com/blockstream/reducing-bitcoin-transaction-sizes-with-x-only-pubkeys-f86476af05d7 here].</ref>.
+
+'''Tagged Hashes''' Cryptographic hash functions are used for multiple purposes in the specification below and in Bitcoin in general. To make sure hashes used in one context can't be reinterpreted in another one, hash functions can be tweaked with a context-dependent tag name, in such a way that collisions across contexts can be assumed to be infeasible. Such collisions obviously can not be ruled out completely, but only for schemes using tagging with a unique name. As for other schemes collisions are at least less likely with tagging than without.
+
+For example, without tagged hashing a BIP340 signature could also be valid for a signature scheme where the only difference is that the arguments to the hash function are reordered. Worse, if the BIP340 nonce derivation function was copied or independently created, then the nonce could be accidentally reused in the other scheme leaking the secret key.
+
+This proposal suggests to include the tag by prefixing the hashed data with ''SHA256(tag) || SHA256(tag)''. Because this is a 64-byte long context-specific constant and the ''SHA256'' block size is also 64 bytes, optimized implementations are possible (identical to SHA256 itself, but with a modified initial state). Using SHA256 of the tag name itself is reasonably simple and efficient for implementations that don't choose to use the optimization.
+
+'''Final scheme''' As a result, our final scheme ends up using public key ''pk'' which is the X coordinate of a point ''P'' on the curve whose Y coordinate is a square and signatures ''(r,s)'' where ''r'' is the X coordinate of a point ''R'' whose Y coordinate is a square. The signature satisfies ''s⋅G = R + tagged_hash(r || pk || m)⋅P''.
+
+=== Specification ===
+
+The following conventions are used, with constants as defined for [https://www.secg.org/sec2-v2.pdf secp256k1]:
+* Lowercase variables represent integers or byte arrays.
+** The constant ''p'' refers to the field size, ''0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F''.
+** The constant ''n'' refers to the curve order, ''0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141''.
+* Uppercase variables refer to points on the curve with equation ''y<sup>2</sup> = x<sup>3</sup> + 7'' over the integers modulo ''p''.
+** ''is_infinite(P)'' returns whether or not ''P'' is the point at infinity.
+** ''x(P)'' and ''y(P)'' are integers in the range ''0..p-1'' and refer to the X and Y coordinates of a point ''P'' (assuming it is not infinity).
+** The constant ''G'' refers to the base point, for which ''x(G) = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798'' and ''y(G) = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8''.
+** Addition of points refers to the usual [https://en.wikipedia.org/wiki/Elliptic_curve#The_group_law elliptic curve group operation].
+** [https://en.wikipedia.org/wiki/Elliptic_curve_point_multiplication Multiplication (⋅) of an integer and a point] refers to the repeated application of the group operation.
+* Functions and operations:
+** ''||'' refers to byte array concatenation.
+** The function ''x[i:j]'', where ''x'' is a byte array, returns a ''(j - i)''-byte array with a copy of the ''i''-th byte (inclusive) to the ''j''-th byte (exclusive) of ''x''.
+** The function ''bytes(x)'', where ''x'' is an integer, returns the 32-byte encoding of ''x'', most significant byte first.
+** The function ''bytes(P)'', where ''P'' is a point, returns ''bytes(x(P))''.
+** The function ''int(x)'', where ''x'' is a 32-byte array, returns the 256-bit unsigned integer whose most significant byte first encoding is ''x''.
+** The function ''is_square(x)'', where ''x'' is an integer, returns whether or not ''x'' is a quadratic residue modulo ''p''. Since ''p'' is prime, it is equivalent to the [https://en.wikipedia.org/wiki/Legendre_symbol Legendre symbol] ''(x / p) = x<sup>(p-1)/2</sup> mod p'' being equal to ''1''<ref>For points ''P'' on the secp256k1 curve it holds that ''y(P)<sup>(p-1)/2</sup> &ne; 0 mod p''.</ref>.
+** The function ''has_square_y(P)'', where ''P'' is a point, is defined as ''not is_infinite(P) and is_square(y(P))''<ref>For points ''P'' on the secp256k1 curve it holds that ''has_square_y(P) = not has_square_y(-P)''.</ref>.
+** The function ''lift_x(x)'', where ''x'' is an integer in range ''0..p-1'', returns the point ''P'' for which ''x(P) = x''<ref>
+ Given a candidate X coordinate ''x'' in the range ''0..p-1'', there exist either exactly two or exactly zero valid Y coordinates. If no valid Y coordinate exists, then ''x'' is not a valid X coordinate either, i.e., no point ''P'' exists for which ''x(P) = x''. The valid Y coordinates for a given candidate ''x'' are the square roots of ''c = x<sup>3</sup> + 7 mod p'' and they can be computed as ''y = &plusmn;c<sup>(p+1)/4</sup> mod p'' (see [https://en.wikipedia.org/wiki/Quadratic_residue#Prime_or_prime_power_modulus Quadratic residue]) if they exist, which can be checked by squaring and comparing with ''c''.
+</ref> and ''has_square_y(P)''<ref>
+ If ''P := lift_x(x)'' does not fail, then ''y := y(P) = c<sup>(p+1)/4</sup> mod p'' is square. Proof: If ''lift_x'' does not fail, ''y'' is a square root of ''c'' and therefore the [https://en.wikipedia.org/wiki/Legendre_symbol Legendre symbol] ''(c / p)'' is ''c<sup>(p-1)/2</sup> = 1 mod p''. Because the Legendre symbol ''(y / p)'' is ''y<sup>(p-1)/2</sup> mod p = c<sup>((p+1)/4)((p-1)/2)</sup> mod p = 1<sup>((p+1)/4)</sup> mod p = 1 mod p'', ''y'' is square.
+</ref>, or fails if no such point exists. The function ''lift_x(x)'' is equivalent to the following pseudocode:
+*** Let ''c = x<sup>3</sup> + 7 mod p''.
+*** Let ''y = c<sup>(p+1)/4</sup> mod p''.
+*** Fail if ''c &ne; y<sup>2</sup> mod p''.
+*** Return the unique point ''P'' such that ''x(P) = x'' and ''y(P) = y'', or fail if no such point exists.
+** The function ''point(x)'', where ''x'' is a 32-byte array, returns the point ''P = lift_x(int(x))''.
+** The function ''hash<sub>tag</sub>(x)'' where ''tag'' is a UTF-8 encoded tag name and ''x'' is a byte array returns the 32-byte hash ''SHA256(SHA256(tag) || SHA256(tag) || x)''.
+
+==== Public Key Generation ====
+
+Input:
+* The secret key ''sk'': a 32-byte array, generated uniformly at random
+
+The algorithm ''PubKey(sk)'' is defined as:
+* Let ''d = int(sk)''.
+* Fail if ''d = 0'' or ''d &ge; n''.
+* Return ''bytes(d⋅G)''.
+
+Note that we use a very different public key format (32 bytes) than the ones used by existing systems (which typically use elliptic curve points as public keys, or 33-byte or 65-byte encodings of them). A side effect is that ''PubKey(sk) = PubKey(bytes(n - int(sk))'', so every public key has two corresponding secret keys.
+
+As an alternative to generating keys randomly, it is also possible and safe to repurpose existing key generation algorithms for ECDSA in a compatible way. The secret keys constructed by such an algorithm can be used as ''sk'' directly. The public keys constructed by such an algorithm (assuming they use the 33-byte compressed encoding) need to be converted by dropping the first byte. Specifically, [[bip-0032.mediawiki|BIP32]] and schemes built on top of it remain usable.
+
+==== Default Signing ====
+
+Input:
+* The secret key ''sk'': a 32-byte array
+* The message ''m'': a 32-byte array
+
+The algorithm ''Sign(sk, m)'' is defined as:
+* Let ''d' = int(sk)''
+* Fail if ''d' = 0'' or ''d' &ge; n''
+* Let ''P = d'⋅G''
+* Let ''d = d' '' if ''has_square_y(P)'', otherwise let ''d = n - d' ''.
+* Let ''rand = hash<sub>BIPSchnorrDerive</sub>(bytes(d) || m)''.
+* Let ''k' = int(rand) mod n''<ref>Note that in general, taking a uniformly random 256-bit integer modulo the curve order will produce an unacceptably biased result. However, for the secp256k1 curve, the order is sufficiently close to ''2<sup>256</sup>'' that this bias is not observable (''1 - n / 2<sup>256</sup>'' is around ''1.27 * 2<sup>-128</sup>'').</ref>.
+* Fail if ''k' = 0''.
+* Let ''R = k'⋅G''.
+* Let ''k = k' '' if ''has_square_y(R)'', otherwise let ''k = n - k' ''.
+* Let ''e = int(hash<sub>BIPSchnorr</sub>(bytes(R) || bytes(P) || m)) mod n''.
+* Return the signature ''bytes(R) || bytes((k + ed) mod n)''.
+
+==== Alternative Signing ====
+
+It should be noted that various alternative signing algorithms can be used to produce equally valid signatures. The algorithm in the previous section is deterministic, i.e., it will always produce the same signature for a given message and secret key. This method does not need a random number generator (RNG) at signing time and is thus trivially robust against failures of RNGs. Alternatively the 32-byte ''rand'' value may be generated in other ways, producing a different but still valid signature (in other words, this is not a ''unique'' signature scheme). '''No matter which method is used to generate the ''rand'' value, the value must be a fresh uniformly random 32-byte string which is not even partially predictable for the attacker.'''
+
+'''Synthetic nonces''' For instance when a RNG is available, 32 bytes of RNG output can be appended to the input to ''hash<sub>BIPSchnorrDerive</sub>''. This will change the corresponding line in the signing algorithm to ''rand = hash<sub>BIPSchnorrDerive</sub>(bytes(d) || m || get_32_bytes_from_rng())'', where ''get_32_bytes_from_rng()'' is the call to the RNG. Adding RNG output may improve protection against [https://moderncrypto.org/mail-archive/curves/2017/000925.html fault injection attacks and side-channel attacks], and it is safe to add the output of a low-entropy RNG.
+
+'''Nonce exfiltration protection''' It is possible to strengthen the nonce generation algorithm using a second device. In this case, the second device contributes randomness which the actual signer provably incorporates into its nonce. This prevents certain attacks where the signer device is compromised and intentionally tries to leak the secret key through its nonce selection.
+
+'''Multisignatures''' This signature scheme is compatible with various types of multisignature and threshold schemes such as [https://eprint.iacr.org/2018/068 MuSig], where a single public key requires holders of multiple secret keys to participate in signing (see Applications below).
+'''It is important to note that multisignature signing schemes in general are insecure with the ''rand'' generation from the default signing algorithm above (or any other deterministic method).'''
+
+==== Verification ====
+
+Input:
+* The public key ''pk'': a 32-byte array
+* The message ''m'': a 32-byte array
+* A signature ''sig'': a 64-byte array
+
+The algorithm ''Verify(pk, m, sig)'' is defined as:
+* Let ''P = point(pk)''; fail if ''point(pk)'' fails.
+* Let ''r = int(sig[0:32])''; fail if ''r &ge; p''.
+* Let ''s = int(sig[32:64])''; fail if ''s &ge; n''.
+* Let ''e = int(hash<sub>BIPSchnorr</sub>(bytes(r) || bytes(P) || m)) mod n''.
+* Let ''R = s⋅G - e⋅P''.
+* Fail if ''not has_square_y(R)'' or ''x(R) &ne; r''.
+* Return success iff no failure occurred before reaching this point.
+
+For every valid secret key ''sk'' and message ''m'', ''Verify(PubKey(sk),m,Sign(sk,m))'' will succeed.
+
+Note that the correctness of verification relies on the fact that ''point(pk)'' always returns a point with a square Y coordinate. A hypothetical verification algorithm that treats points as public keys, and takes the point ''P'' directly as input would fail any time a point with non-square Y is used. While it is possible to correct for this by negating points with non-square Y coordinate before further processing, this would result in a scheme where every (message, signature) pair is valid for two public keys (a type of malleability that exists for ECDSA as well, but we don't wish to retain). We avoid these problems by treating just the X coordinate as public key.
+
+==== Batch Verification ====
+
+Input:
+* The number ''u'' of signatures
+* The public keys ''pk<sub>1..u</sub>'': ''u'' 32-byte arrays
+* The messages ''m<sub>1..u</sub>'': ''u'' 32-byte arrays
+* The signatures ''sig<sub>1..u</sub>'': ''u'' 64-byte arrays
+
+The algorithm ''BatchVerify(pk<sub>1..u</sub>, m<sub>1..u</sub>, sig<sub>1..u</sub>)'' is defined as:
+* Generate ''u-1'' random integers ''a<sub>2...u</sub>'' in the range ''1...n-1''. They are generated deterministically using a [https://en.wikipedia.org/wiki/Cryptographically_secure_pseudorandom_number_generator CSPRNG] seeded by a cryptographic hash of all inputs of the algorithm, i.e. ''seed = seed_hash(pk<sub>1</sub>..pk<sub>u</sub> || m<sub>1</sub>..m<sub>u</sub> || sig<sub>1</sub>..sig<sub>u</sub> )''. A safe choice is to instantiate ''seed_hash'' with SHA256 and use [https://tools.ietf.org/html/rfc8439 ChaCha20] with key ''seed'' as a CSPRNG to generate 256-bit integers, skipping integers not in the range ''1...n-1''.
+* For ''i = 1 .. u'':
+** Let ''P<sub>i</sub> = point(pk<sub>i</sub>)''; fail if ''point(pk<sub>i</sub>)'' fails.
+** Let ''r<sub>i</sub> = int(sig<sub>i</sub>[0:32])''; fail if ''r<sub>i</sub> &ge; p''.
+** Let ''s<sub>i</sub> = int(sig<sub>i</sub>[32:64])''; fail if ''s<sub>i</sub> &ge; n''.
+** Let ''e<sub>i</sub> = int(hash<sub>BIPSchnorr</sub>(bytes(r<sub>i</sub>) || bytes(P<sub>i</sub>) || m<sub>i</sub>)) mod n''.
+** Let ''R<sub>i</sub> = lift_x(r<sub>i</sub>)''; fail if ''lift_x(r<sub>i</sub>)'' fails.
+* Fail if ''(s<sub>1</sub> + a<sub>2</sub>s<sub>2</sub> + ... + a<sub>u</sub>s<sub>u</sub>)⋅G &ne; R<sub>1</sub> + a<sub>2</sub>⋅R<sub>2</sub> + ... + a<sub>u</sub>⋅R<sub>u</sub> + e<sub>1</sub>⋅P<sub>1</sub> + (a<sub>2</sub>e<sub>2</sub>)⋅P<sub>2</sub> + ... + (a<sub>u</sub>e<sub>u</sub>)⋅P<sub>u</sub>''.
+* Return success iff no failure occurred before reaching this point.
+
+If all individual signatures are valid (i.e., ''Verify'' would return success for them), ''BatchVerify'' will always return success. If at least one signature is invalid, ''BatchVerify'' will return success with at most a negligible probability.
+
+=== Optimizations ===
+
+Many techniques are known for optimizing elliptic curve implementations. Several of them apply here, but are out of scope for this document. Two are listed below however, as they are relevant to the design decisions:
+
+'''Squareness testing''' The function ''is_square(x)'' is defined as above, but can be computed more efficiently using an [https://en.wikipedia.org/wiki/Jacobi_symbol#Calculating_the_Jacobi_symbol extended GCD algorithm].
+
+'''Jacobian coordinates''' Elliptic Curve operations can be implemented more efficiently by using [https://en.wikibooks.org/wiki/Cryptography/Prime_Curve/Jacobian_Coordinates Jacobian coordinates]. Elliptic Curve operations implemented this way avoid many intermediate modular inverses (which are computationally expensive), and the scheme proposed in this document is in fact designed to not need any inversions at all for verification. When operating on a point ''P'' with Jacobian coordinates ''(x,y,z)'' which is not the point at infinity and for which ''x(P)'' is defined as ''x / z<sup>2</sup>'' and ''y(P)'' is defined as ''y / z<sup>3</sup>'':
+* ''has_square_y(P)'' can be implemented as ''is_square(yz mod p)''.
+* ''x(P) &ne; r'' can be implemented as ''(0 &le; r < p) and (x &ne; z<sup>2</sup>r mod p)''.
+
+== Applications ==
+
+There are several interesting applications beyond simple signatures.
+While recent academic papers claim that they are also possible with ECDSA, consensus support for Schnorr signature verification would significantly simplify the constructions.
+
+=== Multisignatures and Threshold Signatures ===
+
+By means of an interactive scheme such as [https://eprint.iacr.org/2018/068 MuSig], participants can aggregate their public keys into a single public key which they can jointly sign for. This allows ''n''-of-''n'' multisignatures which, from a verifier's perspective, are no different from ordinary signatures, giving improved privacy and efficiency versus ''CHECKMULTISIG'' or other means.
+
+Moreover, Schnorr signatures are compatible with [https://web.archive.org/web/20031003232851/http://www.research.ibm.com/security/dkg.ps distributed key generation], which enables interactive threshold signatures schemes, e.g., the schemes described by [http://cacr.uwaterloo.ca/techreports/2001/corr2001-13.ps Stinson and Strobl (2001)] or [https://web.archive.org/web/20060911151529/http://theory.lcs.mit.edu/~stasio/Papers/gjkr03.pdf Genaro, Jarecki and Krawczyk (2003)]. These protocols make it possible to realize ''k''-of-''n'' threshold signatures, which ensure that any subset of size ''k'' of the set of ''n'' signers can sign but no subset of size less than ''k'' can produce a valid Schnorr signature. However, the practicality of the existing schemes is limited: most schemes in the literature have been proven secure only for the case ''k-1 < n/2'', are not secure when used concurrently in multiple sessions, or require a reliable broadcast mechanism to be secure. Further research is necessary to improve this situation.
+
+=== Adaptor Signatures ===
+
+[https://download.wpsoftware.net/bitcoin/wizardry/mw-slides/2018-05-18-l2/slides.pdf Adaptor signatures] can be produced by a signer by offsetting his public nonce with a known point ''T = t⋅G'', but not offsetting his secret nonce.
+A correct signature (or partial signature, as individual signers' contributions to a multisignature are called) on the same message with same nonce will then be equal to the adaptor signature offset by ''t'', meaning that learning ''t'' is equivalent to learning a correct signature.
+This can be used to enable atomic swaps or even [https://eprint.iacr.org/2018/472 general payment channels] in which the atomicity of disjoint transactions is ensured using the signatures themselves, rather than Bitcoin script support. The resulting transactions will appear to verifiers to be no different from ordinary single-signer transactions, except perhaps for the inclusion of locktime refund logic.
+
+Adaptor signatures, beyond the efficiency and privacy benefits of encoding script semantics into constant-sized signatures, have additional benefits over traditional hash-based payment channels. Specifically, the secret values ''t'' may be reblinded between hops, allowing long chains of transactions to be made atomic while even the participants cannot identify which transactions are part of the chain. Also, because the secret values are chosen at signing time, rather than key generation time, existing outputs may be repurposed for different applications without recourse to the blockchain, even multiple times.
+
+=== Blind Signatures ===
+
+A blind signature protocol is an interactive protocol that enables a signer to sign a message at the behest of another party without learning any information about the signed message or the signature. Schnorr signatures admit a very [https://www.math.uni-frankfurt.de/~dmst/research/papers/schnorr.blind_sigs_attack.2001.pdf simple blind signature scheme] which is however insecure because it's vulnerable to [https://www.iacr.org/archive/crypto2002/24420288/24420288.pdf Wagner's attack]. A known mitigation is to let the signer abort a signing session with a certain probability, and the resulting scheme can be [https://eprint.iacr.org/2019/877 proven secure under non-standard cryptographic assumptions].
+
+Blind Schnorr signatures could for example be used in [https://github.com/ElementsProject/scriptless-scripts/blob/master/md/partially-blind-swap.md Partially Blind Atomic Swaps], a construction to enable transferring of coins, mediated by an untrusted escrow agent, without connecting the transactors in the public blockchain transaction graph.
+
+== Test Vectors and Reference Code ==
+
+For development and testing purposes, we provide a [[bip-0340/test-vectors.csv|collection of test vectors in CSV format]] and a naive, highly inefficient, and non-constant time [[bip-0340/reference.py|pure Python 3.7 reference implementation of the signing and verification algorithm]].
+The reference implementation is for demonstration purposes only and not to be used in production environments.
+
+== Footnotes ==
+
+<references />
+
+== Acknowledgements ==
+
+This document is the result of many discussions around Schnorr based signatures over the years, and had input from Johnson Lau, Greg Maxwell, Andrew Poelstra, Rusty Russell, and Anthony Towns. The authors further wish to thank all those who provided valuable feedback and reviews, including the participants of the [https://github.com/ajtowns/taproot-review structured reviews].
diff --git a/bip-0340/reference.py b/bip-0340/reference.py
new file mode 100644
index 0000000..f2a944f
--- /dev/null
+++ b/bip-0340/reference.py
@@ -0,0 +1,169 @@
+import hashlib
+import binascii
+
+p = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F
+n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
+
+# Points are tuples of X and Y coordinates and the point at infinity is
+# represented by the None keyword.
+G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798, 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)
+
+# This implementation can be sped up by storing the midstate after hashing
+# tag_hash instead of rehashing it all the time.
+def tagged_hash(tag, msg):
+ tag_hash = hashlib.sha256(tag.encode()).digest()
+ return hashlib.sha256(tag_hash + tag_hash + msg).digest()
+
+def is_infinity(P):
+ return P is None
+
+def x(P):
+ return P[0]
+
+def y(P):
+ return P[1]
+
+def point_add(P1, P2):
+ if (P1 is None):
+ return P2
+ if (P2 is None):
+ return P1
+ if (x(P1) == x(P2) and y(P1) != y(P2)):
+ return None
+ if (P1 == P2):
+ lam = (3 * x(P1) * x(P1) * pow(2 * y(P1), p - 2, p)) % p
+ else:
+ lam = ((y(P2) - y(P1)) * pow(x(P2) - x(P1), p - 2, p)) % p
+ x3 = (lam * lam - x(P1) - x(P2)) % p
+ return (x3, (lam * (x(P1) - x3) - y(P1)) % p)
+
+def point_mul(P, n):
+ R = None
+ for i in range(256):
+ if ((n >> i) & 1):
+ R = point_add(R, P)
+ P = point_add(P, P)
+ return R
+
+def bytes_from_int(x):
+ return x.to_bytes(32, byteorder="big")
+
+def bytes_from_point(P):
+ return bytes_from_int(x(P))
+
+def point_from_bytes(b):
+ x = int_from_bytes(b)
+ if x >= p:
+ return None
+ y_sq = (pow(x, 3, p) + 7) % p
+ y = pow(y_sq, (p + 1) // 4, p)
+ if pow(y, 2, p) != y_sq:
+ return None
+ return [x, y]
+
+def int_from_bytes(b):
+ return int.from_bytes(b, byteorder="big")
+
+def hash_sha256(b):
+ return hashlib.sha256(b).digest()
+
+def is_square(x):
+ return pow(x, (p - 1) // 2, p) == 1
+
+def has_square_y(P):
+ return not is_infinity(P) and is_square(y(P))
+
+def pubkey_gen(seckey):
+ x = int_from_bytes(seckey)
+ if not (1 <= x <= n - 1):
+ raise ValueError('The secret key must be an integer in the range 1..n-1.')
+ P = point_mul(G, x)
+ return bytes_from_point(P)
+
+def schnorr_sign(msg, seckey0):
+ if len(msg) != 32:
+ raise ValueError('The message must be a 32-byte array.')
+ seckey0 = int_from_bytes(seckey0)
+ if not (1 <= seckey0 <= n - 1):
+ raise ValueError('The secret key must be an integer in the range 1..n-1.')
+ P = point_mul(G, seckey0)
+ seckey = seckey0 if has_square_y(P) else n - seckey0
+ k0 = int_from_bytes(tagged_hash("BIPSchnorrDerive", bytes_from_int(seckey) + msg)) % n
+ if k0 == 0:
+ raise RuntimeError('Failure. This happens only with negligible probability.')
+ R = point_mul(G, k0)
+ k = n - k0 if not has_square_y(R) else k0
+ e = int_from_bytes(tagged_hash("BIPSchnorr", bytes_from_point(R) + bytes_from_point(P) + msg)) % n
+ return bytes_from_point(R) + bytes_from_int((k + e * seckey) % n)
+
+def schnorr_verify(msg, pubkey, sig):
+ if len(msg) != 32:
+ raise ValueError('The message must be a 32-byte array.')
+ if len(pubkey) != 32:
+ raise ValueError('The public key must be a 32-byte array.')
+ if len(sig) != 64:
+ raise ValueError('The signature must be a 64-byte array.')
+ P = point_from_bytes(pubkey)
+ if (P is None):
+ return False
+ r = int_from_bytes(sig[0:32])
+ s = int_from_bytes(sig[32:64])
+ if (r >= p or s >= n):
+ return False
+ e = int_from_bytes(tagged_hash("BIPSchnorr", sig[0:32] + pubkey + msg)) % n
+ R = point_add(point_mul(G, s), point_mul(P, n - e))
+ if R is None or not has_square_y(R) or x(R) != r:
+ return False
+ return True
+
+#
+# The following code is only used to verify the test vectors.
+#
+import csv
+
+def test_vectors():
+ all_passed = True
+ with open('test-vectors.csv', newline='') as csvfile:
+ reader = csv.reader(csvfile)
+ reader.__next__()
+ for row in reader:
+ (index, seckey, pubkey, msg, sig, result, comment) = row
+ pubkey = bytes.fromhex(pubkey)
+ msg = bytes.fromhex(msg)
+ sig = bytes.fromhex(sig)
+ result = result == 'TRUE'
+ print('\nTest vector #%-3i: ' % int(index))
+ if seckey != '':
+ seckey = bytes.fromhex(seckey)
+ pubkey_actual = pubkey_gen(seckey)
+ if pubkey != pubkey_actual:
+ print(' * Failed key generation.')
+ print(' Expected key:', pubkey.hex().upper())
+ print(' Actual key:', pubkey_actual.hex().upper())
+ sig_actual = schnorr_sign(msg, seckey)
+ if sig == sig_actual:
+ print(' * Passed signing test.')
+ else:
+ print(' * Failed signing test.')
+ print(' Expected signature:', sig.hex().upper())
+ print(' Actual signature:', sig_actual.hex().upper())
+ all_passed = False
+ result_actual = schnorr_verify(msg, pubkey, sig)
+ if result == result_actual:
+ print(' * Passed verification test.')
+ else:
+ print(' * Failed verification test.')
+ print(' Expected verification result:', result)
+ print(' Actual verification result:', result_actual)
+ if comment:
+ print(' Comment:', comment)
+ all_passed = False
+ print()
+ if all_passed:
+ print('All test vectors passed.')
+ else:
+ print('Some test vectors failed.')
+ return all_passed
+
+if __name__ == '__main__':
+ test_vectors()
diff --git a/bip-0340/speedup-batch.png b/bip-0340/speedup-batch.png
new file mode 100644
index 0000000..fe672d4
--- /dev/null
+++ b/bip-0340/speedup-batch.png
Binary files differ
diff --git a/bip-0340/test-vectors.csv b/bip-0340/test-vectors.csv
new file mode 100644
index 0000000..3970803
--- /dev/null
+++ b/bip-0340/test-vectors.csv
@@ -0,0 +1,16 @@
+index,secret key,public key,message,signature,verification result,comment
+0,0000000000000000000000000000000000000000000000000000000000000001,79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,0000000000000000000000000000000000000000000000000000000000000000,528F745793E8472C0329742A463F59E58F3A3F1A4AC09C28F6F8514D4D0322A258BD08398F82CF67B812AB2C7717CE566F877C2F8795C846146978E8F04782AE,TRUE,
+1,B7E151628AED2A6ABF7158809CF4F3C762E7160F38B4DA56A784D9045190CFEF,DFF1D77F2A671C5F36183726DB2341BE58FEAE1DA2DECED843240F7B502BA659,243F6A8885A308D313198A2E03707344A4093822299F31D0082EFA98EC4E6C89,667C2F778E0616E611BD0C14B8A600C5884551701A949EF0EBFD72D452D64E844160BCFC3F466ECB8FACD19ADE57D8699D74E7207D78C6AEDC3799B52A8E0598,TRUE,
+2,C90FDAA22168C234C4C6628B80DC1CD129024E088A67CC74020BBEA63B14E5C9,DD308AFEC5777E13121FA72B9CC1B7CC0139715309B086C960E18FD969774EB8,5E2D58D8B3BCDF1ABADEC7829054F90DDA9805AAB56C77333024B9D0A508B75C,2D941B38E32624BF0AC7669C0971B990994AF6F9B18426BF4F4E7EC10E6CDF386CF646C6DDAFCFA7F1993EEB2E4D66416AEAD1DDAE2F22D63CAD901412D116C6,TRUE,
+3,0B432B2677937381AEF05BB02A66ECD012773062CF3FA2549E44F58ED2401710,25D1DFF95105F5253C4022F628A996AD3A0D95FBF21D468A1B33F8C160D8F517,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,8BD2C11604B0A87A443FCC2E5D90E5328F934161B18864FB48CE10CB59B45FB9B5B2A0F129BD88F5BDC05D5C21E5C57176B913002335784F9777A24BD317CD36,TRUE,test fails if msg is reduced modulo p or n
+4,,D69C3509BB99E412E68B0FE8544E72837DFA30746D8BE2AA65975F29D22DC7B9,4DF3C3F68FCC83B27E9D42C90431A72499F17875C81A599B566C9889B9696703,00000000000000000000003B78CE563F89A0ED9414F5AA28AD0D96D6795F9C63EE374AC7FAE927D334CCB190F6FB8FD27A2DDC639CCEE46D43F113A4035A2C7F,TRUE,
+5,,EEFDEA4CDB677750A420FEE807EACF21EB9898AE79B9768766E4FAA04A2D4A34,243F6A8885A308D313198A2E03707344A4093822299F31D0082EFA98EC4E6C89,667C2F778E0616E611BD0C14B8A600C5884551701A949EF0EBFD72D452D64E844160BCFC3F466ECB8FACD19ADE57D8699D74E7207D78C6AEDC3799B52A8E0598,FALSE,public key not on the curve
+6,,DFF1D77F2A671C5F36183726DB2341BE58FEAE1DA2DECED843240F7B502BA659,243F6A8885A308D313198A2E03707344A4093822299F31D0082EFA98EC4E6C89,F9308A019258C31049344F85F89D5229B531C845836F99B08601F113BCE036F9935554D1AA5F0374E5CDAACB3925035C7C169B27C4426DF0A6B19AF3BAEAB138,FALSE,has_square_y(R) is false
+7,,DFF1D77F2A671C5F36183726DB2341BE58FEAE1DA2DECED843240F7B502BA659,243F6A8885A308D313198A2E03707344A4093822299F31D0082EFA98EC4E6C89,10AC49A6A2EBF604189C5F40FC75AF2D42D77DE9A2782709B1EB4EAF1CFE9108D7003B703A3499D5E29529D39BA040A44955127140F81A8A89A96F992AC0FE79,FALSE,negated message
+8,,DFF1D77F2A671C5F36183726DB2341BE58FEAE1DA2DECED843240F7B502BA659,243F6A8885A308D313198A2E03707344A4093822299F31D0082EFA98EC4E6C89,667C2F778E0616E611BD0C14B8A600C5884551701A949EF0EBFD72D452D64E84BE9F4303C0B9913470532E6521A827951D39F5C631CFD98CE39AC4D7A5A83BA9,FALSE,negated s value
+9,,DFF1D77F2A671C5F36183726DB2341BE58FEAE1DA2DECED843240F7B502BA659,243F6A8885A308D313198A2E03707344A4093822299F31D0082EFA98EC4E6C89,000000000000000000000000000000000000000000000000000000000000000099D2F0EBC2996808208633CD9926BF7EC3DAB73DAAD36E85B3040A698E6D1CE0,FALSE,sG - eP is infinite. Test fails in single verification if has_square_y(inf) is defined as true and x(inf) as 0
+10,,DFF1D77F2A671C5F36183726DB2341BE58FEAE1DA2DECED843240F7B502BA659,243F6A8885A308D313198A2E03707344A4093822299F31D0082EFA98EC4E6C89,000000000000000000000000000000000000000000000000000000000000000124E81D89F01304695CE943F7D5EBD00EF726A0864B4FF33895B4E86BEADC5456,FALSE,sG - eP is infinite. Test fails in single verification if has_square_y(inf) is defined as true and x(inf) as 1
+11,,DFF1D77F2A671C5F36183726DB2341BE58FEAE1DA2DECED843240F7B502BA659,243F6A8885A308D313198A2E03707344A4093822299F31D0082EFA98EC4E6C89,4A298DACAE57395A15D0795DDBFD1DCB564DA82B0F269BC70A74F8220429BA1D4160BCFC3F466ECB8FACD19ADE57D8699D74E7207D78C6AEDC3799B52A8E0598,FALSE,sig[0:32] is not an X coordinate on the curve
+12,,DFF1D77F2A671C5F36183726DB2341BE58FEAE1DA2DECED843240F7B502BA659,243F6A8885A308D313198A2E03707344A4093822299F31D0082EFA98EC4E6C89,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F4160BCFC3F466ECB8FACD19ADE57D8699D74E7207D78C6AEDC3799B52A8E0598,FALSE,sig[0:32] is equal to field size
+13,,DFF1D77F2A671C5F36183726DB2341BE58FEAE1DA2DECED843240F7B502BA659,243F6A8885A308D313198A2E03707344A4093822299F31D0082EFA98EC4E6C89,667C2F778E0616E611BD0C14B8A600C5884551701A949EF0EBFD72D452D64E84FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141,FALSE,sig[32:64] is equal to curve order
+14,,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC30,243F6A8885A308D313198A2E03707344A4093822299F31D0082EFA98EC4E6C89,667C2F778E0616E611BD0C14B8A600C5884551701A949EF0EBFD72D452D64E844160BCFC3F466ECB8FACD19ADE57D8699D74E7207D78C6AEDC3799B52A8E0598,FALSE,public key is not a valid X coordinate because it exceeds the field size
diff --git a/bip-0340/test-vectors.py b/bip-0340/test-vectors.py
new file mode 100644
index 0000000..195b61b
--- /dev/null
+++ b/bip-0340/test-vectors.py
@@ -0,0 +1,241 @@
+import sys
+from reference import *
+
+def vector0():
+ seckey = bytes_from_int(1)
+ msg = bytes_from_int(0)
+ sig = schnorr_sign(msg, seckey)
+ pubkey = pubkey_gen(seckey)
+
+ # The point reconstructed from the public key has an even Y coordinate.
+ pubkey_point = point_from_bytes(pubkey)
+ assert(pubkey_point[1] & 1 == 0)
+
+ return (seckey, pubkey, msg, sig, "TRUE", None)
+
+def vector1():
+ seckey = bytes_from_int(0xB7E151628AED2A6ABF7158809CF4F3C762E7160F38B4DA56A784D9045190CFEF)
+ msg = bytes_from_int(0x243F6A8885A308D313198A2E03707344A4093822299F31D0082EFA98EC4E6C89)
+ sig = schnorr_sign(msg, seckey)
+ pubkey = pubkey_gen(seckey)
+
+ # The point reconstructed from the public key has an odd Y coordinate.
+ pubkey_point = point_from_bytes(pubkey)
+ assert(pubkey_point[1] & 1 == 1)
+
+ return (seckey, pubkey, msg, sig, "TRUE", None)
+
+def vector2():
+ seckey = bytes_from_int(0xC90FDAA22168C234C4C6628B80DC1CD129024E088A67CC74020BBEA63B14E5C9)
+ msg = bytes_from_int(0x5E2D58D8B3BCDF1ABADEC7829054F90DDA9805AAB56C77333024B9D0A508B75C)
+ sig = schnorr_sign(msg, seckey)
+
+ # This signature vector would not verify if the implementer checked the
+ # squareness of the X coordinate of R instead of the Y coordinate.
+ R = point_from_bytes(sig[0:32])
+ assert(not is_square(R[0]))
+
+ return (seckey, pubkey_gen(seckey), msg, sig, "TRUE", None)
+
+def vector3():
+ seckey = bytes_from_int(0x0B432B2677937381AEF05BB02A66ECD012773062CF3FA2549E44F58ED2401710)
+ msg = bytes_from_int(0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF)
+ sig = schnorr_sign(msg, seckey)
+ return (seckey, pubkey_gen(seckey), msg, sig, "TRUE", "test fails if msg is reduced modulo p or n")
+
+# Signs with a given nonce. This can be INSECURE and is only INTENDED FOR
+# GENERATING TEST VECTORS. Results in an invalid signature if y(kG) is not
+# square.
+def insecure_schnorr_sign_fixed_nonce(msg, seckey0, k):
+ if len(msg) != 32:
+ raise ValueError('The message must be a 32-byte array.')
+ seckey0 = int_from_bytes(seckey0)
+ if not (1 <= seckey0 <= n - 1):
+ raise ValueError('The secret key must be an integer in the range 1..n-1.')
+ P = point_mul(G, seckey0)
+ seckey = seckey0 if has_square_y(P) else n - seckey0
+ R = point_mul(G, k)
+ e = int_from_bytes(tagged_hash("BIPSchnorr", bytes_from_point(R) + bytes_from_point(P) + msg)) % n
+ return bytes_from_point(R) + bytes_from_int((k + e * seckey) % n)
+
+# Creates a singature with a small x(R) by using k = 1/2
+def vector4():
+ one_half = 0x7fffffffffffffffffffffffffffffff5d576e7357a4501ddfe92f46681b20a0
+ seckey = bytes_from_int(0x763758E5CBEEDEE4F7D3FC86F531C36578933228998226672F13C4F0EBE855EB)
+ msg = bytes_from_int(0x4DF3C3F68FCC83B27E9D42C90431A72499F17875C81A599B566C9889B9696703)
+ sig = insecure_schnorr_sign_fixed_nonce(msg, seckey, one_half)
+ return (None, pubkey_gen(seckey), msg, sig, "TRUE", None)
+
+default_seckey = bytes_from_int(0xB7E151628AED2A6ABF7158809CF4F3C762E7160F38B4DA56A784D9045190CFEF)
+default_msg = bytes_from_int(0x243F6A8885A308D313198A2E03707344A4093822299F31D0082EFA98EC4E6C89)
+
+# Public key is not on the curve
+def vector5():
+ # This creates a dummy signature that doesn't have anything to do with the
+ # public key.
+ seckey = default_seckey
+ msg = default_msg
+ sig = schnorr_sign(msg, seckey)
+
+ pubkey = bytes_from_int(0xEEFDEA4CDB677750A420FEE807EACF21EB9898AE79B9768766E4FAA04A2D4A34)
+ assert(point_from_bytes(pubkey) is None)
+
+ return (None, pubkey, msg, sig, "FALSE", "public key not on the curve")
+
+def vector6():
+ seckey = default_seckey
+ msg = default_msg
+ k = 3
+ sig = insecure_schnorr_sign_fixed_nonce(msg, seckey, k)
+
+ # Y coordinate of R is not a square
+ R = point_mul(G, k)
+ assert(not has_square_y(R))
+
+ return (None, pubkey_gen(seckey), msg, sig, "FALSE", "has_square_y(R) is false")
+
+def vector7():
+ seckey = default_seckey
+ msg = int_from_bytes(default_msg)
+ neg_msg = bytes_from_int(n - msg)
+ sig = schnorr_sign(neg_msg, seckey)
+ return (None, pubkey_gen(seckey), bytes_from_int(msg), sig, "FALSE", "negated message")
+
+def vector8():
+ seckey = default_seckey
+ msg = default_msg
+ sig = schnorr_sign(msg, seckey)
+ sig = sig[0:32] + bytes_from_int(n - int_from_bytes(sig[32:64]))
+ return (None, pubkey_gen(seckey), msg, sig, "FALSE", "negated s value")
+
+def bytes_from_point_inf0(P):
+ if P == None:
+ return bytes_from_int(0)
+ return bytes_from_int(P[0])
+
+def vector9():
+ seckey = default_seckey
+ msg = default_msg
+
+ # Override bytes_from_point in schnorr_sign to allow creating a signature
+ # with k = 0.
+ k = 0
+ bytes_from_point_tmp = bytes_from_point.__code__
+ bytes_from_point.__code__ = bytes_from_point_inf0.__code__
+ sig = insecure_schnorr_sign_fixed_nonce(msg, seckey, k)
+ bytes_from_point.__code__ = bytes_from_point_tmp
+
+ return (None, pubkey_gen(seckey), msg, sig, "FALSE", "sG - eP is infinite. Test fails in single verification if has_square_y(inf) is defined as true and x(inf) as 0")
+
+def bytes_from_point_inf1(P):
+ if P == None:
+ return bytes_from_int(1)
+ return bytes_from_int(P[0])
+
+def vector10():
+ seckey = default_seckey
+ msg = default_msg
+
+ # Override bytes_from_point in schnorr_sign to allow creating a signature
+ # with k = 0.
+ k = 0
+ bytes_from_point_tmp = bytes_from_point.__code__
+ bytes_from_point.__code__ = bytes_from_point_inf1.__code__
+ sig = insecure_schnorr_sign_fixed_nonce(msg, seckey, k)
+ bytes_from_point.__code__ = bytes_from_point_tmp
+
+ return (None, pubkey_gen(seckey), msg, sig, "FALSE", "sG - eP is infinite. Test fails in single verification if has_square_y(inf) is defined as true and x(inf) as 1")
+
+# It's cryptographically impossible to create a test vector that fails if run
+# in an implementation which merely misses the check that sig[0:32] is an X
+# coordinate on the curve. This test vector just increases test coverage.
+def vector11():
+ seckey = default_seckey
+ msg = default_msg
+ sig = schnorr_sign(msg, seckey)
+
+ # Replace R's X coordinate with an X coordinate that's not on the curve
+ x_not_on_curve = bytes_from_int(0x4A298DACAE57395A15D0795DDBFD1DCB564DA82B0F269BC70A74F8220429BA1D)
+ assert(point_from_bytes(x_not_on_curve) is None)
+ sig = x_not_on_curve + sig[32:64]
+
+ return (None, pubkey_gen(seckey), msg, sig, "FALSE", "sig[0:32] is not an X coordinate on the curve")
+
+# It's cryptographically impossible to create a test vector that fails if run
+# in an implementation which merely misses the check that sig[0:32] is smaller
+# than the field size. This test vector just increases test coverage.
+def vector12():
+ seckey = default_seckey
+ msg = default_msg
+ sig = schnorr_sign(msg, seckey)
+
+ # Replace R's X coordinate with an X coordinate that's equal to field size
+ sig = bytes_from_int(p) + sig[32:64]
+
+ return (None, pubkey_gen(seckey), msg, sig, "FALSE", "sig[0:32] is equal to field size")
+
+# It's cryptographically impossible to create a test vector that fails if run
+# in an implementation which merely misses the check that sig[32:64] is smaller
+# than the curve order. This test vector just increases test coverage.
+def vector13():
+ seckey = default_seckey
+ msg = default_msg
+ sig = schnorr_sign(msg, seckey)
+
+ # Replace s with a number that's equal to the curve order
+ sig = sig[0:32] + bytes_from_int(n)
+
+ return (None, pubkey_gen(seckey), msg, sig, "FALSE", "sig[32:64] is equal to curve order")
+
+# Test out of range pubkey
+# It's cryptographically impossible to create a test vector that fails if run
+# in an implementation which accepts out of range pubkeys because we can't find
+# a secret key for such a public key and therefore can not create a signature.
+# This test vector just increases test coverage.
+def vector14():
+ # This creates a dummy signature that doesn't have anything to do with the
+ # public key.
+ seckey = default_seckey
+ msg = default_msg
+ sig = schnorr_sign(msg, seckey)
+
+ pubkey_int = p + 1
+ pubkey = bytes_from_int(pubkey_int)
+ assert(point_from_bytes(pubkey) is None)
+ # If an implementation would reduce a given public key modulo p then the
+ # pubkey would be valid
+ assert(point_from_bytes(bytes_from_int(pubkey_int % p)) is not None)
+
+ return (None, pubkey, msg, sig, "FALSE", "public key is not a valid X coordinate because it exceeds the field size")
+
+vectors = [
+ vector0(),
+ vector1(),
+ vector2(),
+ vector3(),
+ vector4(),
+ vector5(),
+ vector6(),
+ vector7(),
+ vector8(),
+ vector9(),
+ vector10(),
+ vector11(),
+ vector12(),
+ vector13(),
+ vector14()
+ ]
+
+# Converts the byte strings of a test vector into hex strings
+def bytes_to_hex(seckey, pubkey, msg, sig, result, comment):
+ return (seckey.hex().upper() if seckey is not None else None, pubkey.hex().upper(), msg.hex().upper(), sig.hex().upper(), result, comment)
+
+vectors = list(map(lambda vector: bytes_to_hex(vector[0], vector[1], vector[2], vector[3], vector[4], vector[5]), vectors))
+
+def print_csv(vectors):
+ writer = csv.writer(sys.stdout)
+ writer.writerow(("index", "secret key", "public key", "message", "signature", "verification result", "comment"))
+ for (i,v) in enumerate(vectors):
+ writer.writerow((i,)+v)
+
+print_csv(vectors)
diff --git a/bip-0341.mediawiki b/bip-0341.mediawiki
new file mode 100644
index 0000000..8f27c35
--- /dev/null
+++ b/bip-0341.mediawiki
@@ -0,0 +1,311 @@
+<pre>
+ BIP: 341
+ Layer: Consensus (soft fork)
+ Title: Taproot: SegWit version 1 spending rules
+ Author: Pieter Wuille <pieter.wuille@gmail.com>
+ Jonas Nick <jonasd.nick@gmail.com>
+ Anthony Towns <aj@erisian.com.au>
+ Comments-Summary: No comments yet.
+ Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0341
+ Status: Draft
+ Type: Standards Track
+ Created: 2020-01-19
+ License: BSD-3-Clause
+ Requires: 340
+ Post-History: 2019-05-06: https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2019-May/016914.html [bitcoin-dev] Taproot proposal
+</pre>
+
+==Introduction==
+
+===Abstract===
+
+This document proposes a new SegWit version 1 output type, with spending rules based on Taproot, Schnorr signatures, and Merkle branches.
+
+===Copyright===
+
+This document is licensed under the 3-clause BSD license.
+
+===Motivation===
+
+This proposal aims to improve privacy, efficiency, and flexibility of Bitcoin's scripting capabilities without adding new security assumptions<ref>'''What does not adding security assumptions mean?''' Unforgeability of signatures is a necessary requirement to prevent theft. At least when treating script execution as a digital signature scheme itself, unforgeability can be [https://github.com/apoelstra/taproot proven] in the Random Oracle Model assuming the Discrete Logarithm problem is hard. A [https://nbn-resolving.de/urn:nbn:de:hbz:294-60803 proof] for unforgeability of ECDSA in the current script system needs non-standard assumptions on top of that. Note that it is hard in general to model exactly what security for script means, as it depends on the policies and protocols used by wallet software.</ref>. Specifically, it seeks to minimize how much information about the spendability conditions of a transaction output is revealed on chain at creation or spending time and to add a number of upgrade mechanisms, while fixing a few minor but long-standing issues.
+
+==Design==
+
+A number of related ideas for improving Bitcoin's scripting capabilities have been previously proposed: Schnorr signatures ([[bip-0340|BIP340]]), Merkle branches ("MAST", [[bip-0114|BIP114]], [[bip-0117|BIP117]]), new sighash modes ([[bip-0118|BIP118]]), new opcodes like CHECKSIGFROMSTACK, [https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-January/015614.html Taproot], [https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-February/015700.html Graftroot], [https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-July/016249.html G'root], and [https://bitcointalk.org/index.php?topic=1377298.0 cross-input aggregation].
+
+Combining all these ideas in a single proposal would be an extensive change, be hard to review, and likely miss new discoveries that otherwise could have been made along the way. Not all are equally mature as well. For example, cross-input aggregation [https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-March/015838.html interacts] in complex ways with upgrade mechanisms, and solutions to that are still [https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-October/016461.html in flux]. On the other hand, separating them all into independent upgrades would reduce the efficiency and privacy gains to be had, and wallet and service providers may not be inclined to go through many incremental updates. Therefore, we're faced with a tradeoff between functionality and scope creep. In this design we strike a balance by focusing on the structural script improvements offered by Taproot and Merkle branches, as well as changes necessary to make them usable and efficient. For things like sighashes and opcodes we include fixes for known problems, but exclude new features that can be added independently with no downsides.
+
+As a result we choose this combination of technologies:
+* '''Merkle branches''' let us only reveal the actually executed part of the script to the blockchain, as opposed to all possible ways a script can be executed. Among the various known mechanisms for implementing this, one where the Merkle tree becomes part of the script's structure directly maximizes the space savings, so that approach is chosen.
+* '''Taproot''' on top of that lets us merge the traditionally separate pay-to-pubkey and pay-to-scripthash policies, making all outputs spendable by either a key or (optionally) a script, and indistinguishable from each other. As long as the key-based spending path is used for spending, it is not revealed whether a script path was permitted as well, resulting in space savings and an increase in scripting privacy at spending time.
+* Taproot's advantages become apparent under the assumption that most applications involve outputs that could be spent by all parties agreeing. That's where '''Schnorr''' signatures come in, as they permit [https://eprint.iacr.org/2018/068 key aggregation]: a public key can be constructed from multiple participant public keys, and which requires cooperation between all participants to sign for. Such multi-party public keys and signatures are indistinguishable from their single-party equivalents. This means that with taproot most applications can use the key-based spending path, which is both efficient and private. This can be generalized to arbitrary M-of-N policies, as Schnorr signatures support threshold signing, at the cost of more complex setup protocols.
+* As Schnorr signatures also permit '''batch validation''', allowing multiple signatures to be validated together more efficiently than validating each one independently, we make sure all parts of the design are compatible with this.
+* Where unused bits appear as a result of the above changes, they are reserved for mechanisms for '''future extensions'''. As a result, every script in the Merkle tree has an associated version such that new script versions can be introduced with a soft fork while remaining compatible with BIP 341. Additionally, future soft forks can make use of the currently unused <code>annex</code> in the witness (see [[bip-0341.mediawiki#Rationale|BIP341]]).
+* While the core semantics of the '''signature hashing algorithm''' are not changed, a number of improvements are included in this proposal. The new signature hashing algorithm fixes the verification capabilities of offline signing devices by including amount and scriptPubKey in the signature message, avoids unnecessary hashing, uses '''tagged hashes''' and defines a default sighash byte.
+* The '''public key is directly included in the output''' in contrast to typical earlier constructions which store a hash of the public key or script in the output. This has the same cost for senders and is more space efficient overall if the key-based spending path is taken. <ref>'''Why is the public key directly included in the output?''' While typical earlier constructions store a hash of a script or a public key in the output, this is rather wasteful when a public key is always involved. To guarantee batch verifiability, the public key must be known to every verifier, and thus only revealing its hash as an output would imply adding an additional 32 bytes to the witness. Furthermore, to maintain [https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2016-January/012198.html 128-bit collision security] for outputs, a 256-bit hash would be required anyway, which is comparable in size (and thus in cost for senders) to revealing the public key directly. While the usage of public key hashes is often said to protect against ECDLP breaks or quantum computers, this protection is very weak at best: transactions are not protected while being confirmed, and a very [https://twitter.com/pwuille/status/1108097835365339136 large portion] of the currency's supply is not under such protection regardless. Actual resistance to such systems can be introduced by relying on different cryptographic assumptions, but this proposal focuses on improvements that do not change the security model.</ref>
+
+Informally, the resulting design is as follows: a new witness version is added (version 1), whose programs consist of 32-byte encodings of points ''Q''. ''Q'' is computed as ''P + hash(P||m)G'' for a public key ''P'', and the root ''m'' of a Merkle tree whose leaves consist of a version number and a script. These outputs can be spent directly by providing a signature for ''Q'', or indirectly by revealing ''P'', the script and leaf version, inputs that satisfy the script, and a Merkle path that proves ''Q'' committed to that leaf. All hashes in this construction (the hash for computing ''Q'' from ''P'', the hashes inside the Merkle tree's inner nodes, and the signature hashes used) are tagged to guarantee domain separation.
+
+== Specification ==
+
+This section specifies the Taproot consensus rules. Validity is defined by exclusion: a block or transaction is valid if no condition exists that marks it failed.
+
+The notation below follows that of [[bip-0340.mediawiki#design|BIP340]]. This includes the ''hash<sub>tag</sub>(x)'' notation to refer to ''SHA256(SHA256(tag) || SHA256(tag) || x)''. To the best of the authors' knowledge, no existing use of SHA256 in Bitcoin feeds it a message that starts with two single SHA256 outputs, making collisions between ''hash<sub>tag</sub>'' with other hashes extremely unlikely.
+
+=== Script validation rules ===
+
+A Taproot output is a native SegWit output (see [[bip-0141.mediawiki|BIP141]]) with version number 1, and a 32-byte witness program.
+The following rules only apply when such an output is being spent. Any other outputs, including version 1 outputs with lengths other than 32 bytes, or P2SH-wrapped version 1 outputs<ref>'''Why is P2SH-wrapping not supported?''' Using P2SH-wrapped outputs only provides 80-bit collision security due to the use of a 160-bit hash. This is considered low, and becomes a security risk whenever the output includes data from more than a single party (public keys, hashes, ...).</ref>, remain unencumbered.
+
+* Let ''q'' be the 32-byte array containing the witness program (the second push in the scriptPubKey) which represents a public key according to [[bip-0340.mediawiki#design|BIP340]].
+* Fail if the witness stack has 0 elements.
+* If there are at least two witness elements, and the first byte of the last element is 0x50<ref>'''Why is the first byte of the annex <code>0x50</code>?''' The <code>0x50</code> is chosen as it could not be confused with a valid P2WPKH or P2WSH spending. As the control block's initial byte's lowest bit is used to indicate the public key's Y squareness, each leaf version needs an even byte value and the immediately following odd byte value that are both not yet used in P2WPKH or P2WSH spending. To indicate the annex, only an "unpaired" available byte is necessary like <code>0x50</code>. This choice maximizes the available options for future script versions.</ref>, this last element is called ''annex'' ''a''<ref>'''What is the purpose of the annex?''' The annex is a reserved space for future extensions, such as indicating the validation costs of computationally expensive new opcodes in a way that is recognizable without knowing the scriptPubKey of the output being spent. Until the meaning of this field is defined by another softfork, users SHOULD NOT include <code>annex</code> in transactions, or it may lead to PERMANENT FUND LOSS.</ref> and is removed from the witness stack. The annex (or the lack of thereof) is always covered by the signature and contributes to transaction weight, but is otherwise ignored during taproot validation.
+* If there is exactly one element left in the witness stack, key path spending is used:
+** The single witness stack element is interpreted as the signature and must be valid (see the next section) for the public key ''q'' (see the next subsection).
+* If there are at least two witness elements left, script path spending is used:
+** Call the second-to-last stack element ''s'', the script.
+** The last stack element is called the control block ''c'', and must have length ''33 + 32m'', for a value of ''m'' that is an integer between 0 and 128<ref>'''Why is the Merkle path length limited to 128?''' The optimally space-efficient Merkle tree can be constructed based on the probabilities of the scripts in the leaves, using the Huffman algorithm. This algorithm will construct branches with lengths approximately equal to ''log<sub>2</sub>(1/probability)'', but to have branches longer than 128 you would need to have scripts with an execution chance below 1 in ''2<sup>128</sup>''. As that is our security bound, scripts that truly have such a low chance can probably be removed entirely.</ref>, inclusive. Fail if it does not have such a length.
+** Let ''p = c[1:33]'' and let ''P = point(p)'' where ''point'' and ''[:]'' are defined as in [bip-0340.mediawiki#design|BIP340]. Fail if this point is not on the curve.
+** Let ''v = c[0] & 0xfe'' and call it the ''leaf version''<ref>'''What constraints are there on the leaf version?''' First, the leaf version cannot be odd as ''c[0] & 0xfe'' will always be even, and cannot be ''0x50'' as that would result in ambiguity with the annex. In addition, in order to support some forms of static analysis that rely on being able to identify script spends without access to the output being spent, it is recommended to avoid using any leaf versions that would conflict with a valid first byte of either a valid P2WPKH pubkey or a valid P2WSH script (that is, both ''v'' and ''v | 1'' should be an undefined, invalid or disabled opcode or an opcode that is not valid as the first opcode). The values that comply to this rule are the 32 even values between ''0xc0'' and ''0xfe'' and also ''0x66'', ''0x7e'', ''0x80'', ''0x84'', ''0x96'', ''0x98'', ''0xba'', ''0xbc'', ''0xbe''. Note also that this constraint implies that leaf versions should be shared amongst different witness versions, as knowing the witness version requires access to the output being spent.</ref>.
+** Let ''k<sub>0</sub> = hash<sub>TapLeaf</sub>(v || compact_size(size of s) || s)''; also call it the ''tapleaf hash''.
+** For ''j'' in ''[0,1,...,m-1]'':
+*** Let ''e<sub>j</sub> = c[33+32j:65+32j]''.
+*** Let ''k<sub>j+1</sub> depend on whether ''k<sub>j</sub> < e<sub>j</sub>'' (lexicographically)<ref>'''Why are child elements sorted before hashing in the Merkle tree?''' By doing so, it is not necessary to reveal the left/right directions along with the hashes in revealed Merkle branches. This is possible because we do not actually care about the position of specific scripts in the tree; only that they are actually committed to.</ref>:
+**** If ''k<sub>j</sub> < e<sub>j</sub>'': ''k<sub>j+1</sub> = hash<sub>TapBranch</sub>(k<sub>j</sub> || e<sub>j</sub>)''<ref>'''Why not use a more efficient hash construction for inner Merkle nodes?''' The chosen construction does require two invocations of the SHA256 compression functions, one of which can be avoided in theory (see [[bip-0098.mediawiki|BIP98]]). However, it seems preferable to stick to constructions that can be implemented using standard cryptographic primitives, both for implementation simplicity and analyzability. If necessary, a significant part of the second compression function can be optimized out by [https://github.com/bitcoin/bitcoin/pull/13191 specialization] for 64-byte inputs.</ref>.
+**** If ''k<sub>j</sub> &ge; e<sub>j</sub>'': ''k<sub>j+1</sub> = hash<sub>TapBranch</sub>(e<sub>j</sub> || k<sub>j</sub>)''.
+** Let ''t = hash<sub>TapTweak</sub>(p || k<sub>m</sub>)''.
+** If ''t &ge; 0xFFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFE BAAEDCE6 AF48A03B BFD25E8C D0364141'' (order of secp256k1), fail.
+** Let ''Q = point(q) if (c[0] & 1) = 1 and -point(q) otherwise''<ref>'''Why is it necessary to reveal a bit to indicate if the point represented by the output public key is negated in a script path spend?''' The ''point'' function (defined in [[bip-0340.mediawiki#design|BIP340]]) always constructs a point with a square Y coordinate, but because ''Q'' is constructed by adding the taproot tweak to the internal public key ''P'', it cannot easily be guaranteed that ''Q'' in fact has such a Y coordinate. Therefore, before verifying the taproot tweak the original point is restored by negating if necessary. We can not ignore the Y coordinate because it would prevent batch verification. Trying out multiple internal keys until there's such a ''Q'' is possible but undesirable and unnecessary since this information about the Y coordinate only consumes an unused bit.</ref>. Fail if this point is not on the curve.
+** If ''Q &ne; P + int(t)G'', fail.
+** Execute the script, according to the applicable script rules<ref>'''What are the applicable script rules in script path spends?''' [[bip-0342.mediawiki|BIP342]] specifies validity rules that apply for leaf version 0xc0, but future proposals can introduce rules for other leaf versions.</ref>, using the witness stack elements excluding the script ''s'', the control block ''c'', and the annex ''a'' if present, as initial stack.
+
+''q'' is referred to as ''taproot output key'' and ''p'' as ''taproot internal key''.
+
+=== Signature validation rules ===
+
+We first define a reusable common signature message calculation function, followed by the actual signature validation as it's used in key path spending.
+
+==== Common signature message ====
+
+The function ''SigMsg(hash_type, ext_flag)'' computes the message being signed as a byte array. It is implicitly also a function of the spending transaction and the outputs it spends, but these are not listed to keep notation simple.
+
+The parameter ''hash_type'' is an 8-bit unsigned value. The <code>SIGHASH</code> encodings from the legacy script system are reused, including <code>SIGHASH_ALL</code>, <code>SIGHASH_NONE</code>, <code>SIGHASH_SINGLE</code>, and <code>SIGHASH_ANYONECANPAY</code>, plus the default ''hash_type'' value ''0x00'' which results in signing over the whole transaction just as for <code>SIGHASH_ALL</code>. The following restrictions apply, which cause validation failure if violated:
+* Using any undefined ''hash_type'' (not ''0x00'', ''0x01'', ''0x02'', ''0x03'', ''0x81'', ''0x82'', or ''0x83''<ref>'''Why reject unknown ''hash_type'' values?''' By doing so, it is easier to reason about the worst case amount of signature hashing an implementation with adequate caching must perform.</ref>).
+* Using <code>SIGHASH_SINGLE</code> without a "corresponding output" (an output with the same index as the input being verified).
+
+The parameter ''ext_flag'' is an integer in range 0-127, and is used for indicating (in the message) that extensions are added at the end of the message<ref>'''What extensions use the ''ext_flag'' mechanism?''' [[bip-0342.mediawiki|BIP342]] reuses the same common signature message algorithm, but adds BIP342-specific data at the end, which is indicated using ''ext_flag = 1''.</ref>.
+
+If the parameters take acceptable values, the message is the concatenation of the following data, in order(with byte size of each item listed in parentheses). Numerical values in 2, 4, or 8-byte are encoded in little-endian.
+
+* Control:
+** ''hash_type'' (1).
+* Transaction data:
+** ''nVersion'' (4): the ''nVersion'' of the transaction.
+** ''nLockTime'' (4): the ''nLockTime'' of the transaction.
+** If the ''hash_type & 0x80'' does not equal <code>SIGHASH_ANYONECANPAY</code>:
+*** ''sha_prevouts'' (32): the SHA256 of the serialization of all input outpoints.
+*** ''sha_amounts'' (32): the SHA256 of the serialization of all input amounts.
+*** ''sha_sequences'' (32): the SHA256 of the serialization of all input ''nSequence''.
+** If ''hash_type & 3'' does not equal <code>SIGHASH_NONE</code> or <code>SIGHASH_SINGLE</code>:
+*** ''sha_outputs'' (32): the SHA256 of the serialization of all outputs in <code>CTxOut</code> format.
+* Data about this input:
+** ''spend_type'' (1): equal to ''(ext_flag * 2) + annex_present'', where ''annex_present'' is 0 if no annex is present, or 1 otherwise (the original witness stack has two or more witness elements, and the first byte of the last element is ''0x50'')
+** ''scriptPubKey'' (35): ''scriptPubKey'' of the previous output spent by this input, serialized as script inside <code>CTxOut</code>. Its size is always 35 bytes.
+** If ''hash_type & 0x80'' equals <code>SIGHASH_ANYONECANPAY</code>:
+*** ''outpoint'' (36): the <code>COutPoint</code> of this input (32-byte hash + 4-byte little-endian).
+*** ''amount'' (8): value of the previous output spent by this input.
+*** ''nSequence'' (4): ''nSequence'' of this input.
+** If ''hash_type & 0x80'' does not equal <code>SIGHASH_ANYONECANPAY</code>:
+*** ''input_index'' (4): index of this input in the transaction input vector. Index of the first input is 0.
+** If an annex is present (the lowest bit of ''spend_type'' is set):
+*** ''sha_annex'' (32): the SHA256 of ''(compact_size(size of annex) || annex)'', where ''annex'' includes the mandatory ''0x50'' prefix.
+* Data about this output:
+** If ''hash_type & 3'' equals <code>SIGHASH_SINGLE</code>:
+*** ''sha_single_output'' (32): the SHA256 of the corresponding output in <code>CTxOut</code> format.
+
+The total length of ''SigMsg()'' is at most ''209'' bytes<ref>'''What is the output length of ''SigMsg()''?''' The total length of ''SigMsg()'' can be computed using the following formula: ''177 - is_anyonecanpay * 52 - is_none * 32 + has_annex * 32''.</ref>. Note that this does not include the size of sub-hashes such as ''sha_prevouts'', which may be cached across signatures of the same transaction.
+
+In summary, the semantics of the [[bip-0143.mediawiki|BIP143]] sighash types remain unchanged, except the following:
+# The way and order of serialization is changed.<ref>'''Why is the serialization in the signature message changed?''' Hashes that go into the signature message and the message itself are now computed with a single SHA256 invocation instead of double SHA256. There is no expected security improvement by doubling SHA256 because this only protects against length-extension attacks against SHA256 which are not a concern for signature messages because there is no secret data. Therefore doubling SHA256 is a waste of resources. The message computation now follows a logical order with transaction level data first, then input data and output data. This allows to efficiently cache the transaction part of the message across different inputs using the SHA256 midstate. Additionally, sub-hashes can be skipped when calculating the message (for example `sha_prevouts` if <code>SIGHASH_ANYONECANPAY</code> is set) instead of setting them to zero and then hashing them as in BIP143. Despite that, collisions are made impossible by committing to the length of the data (implicit in ''hash_type'' and ''spend_type'') before the variable length data.</ref>
+# The signature message commits to the ''scriptPubKey''<ref>'''Why does the signature message commit to the ''scriptPubKey''?''' This prevents lying to offline signing devices about output being spent, even when the actually executed script (''scriptCode'' in BIP143) is correct. This means it's possible to compactly prove to a hardware wallet what (unused) execution paths existed.</ref>.
+# If the <code>SIGHASH_ANYONECANPAY</code> flag is not set, the message commits to the amounts of ''all'' transaction inputs.<ref>'''Why does the signature message commit to the amounts of all transaction inputs?''' This eliminates the possibility to lie to offline signing devices about the fee of a transaction.</ref>
+# The signature message commits to all input ''nSequence'' if <code>SIGHASH_NONE</code> or <code>SIGHASH_SINGLE</code> are set (unless <code>SIGHASH_ANYONECANPAY</code> is set as well).<ref>'''Why does the signature message commit to all input ''nSequence'' if <code>SIGHASH_SINGLE</code> or <code>SIGHASH_NONE</code> are set?''' Because setting them already makes the message commit to the <code>prevouts</code> part of all transaction inputs, it is not useful to treat the ''nSequence'' any different. Moreover, this change makes ''nSequence'' consistent with the view that <code>SIGHASH_SINGLE</code> and <code>SIGHASH_NONE</code> only modify the signature message with respect to transaction outputs and not inputs.</ref>
+# The signature message includes commitments to the taproot-specific data ''spend_type'' and ''annex'' (if present).
+
+==== Taproot key path spending signature validation ====
+
+To validate a signature ''sig'' with public key ''q'':
+* If the ''sig'' is 64 bytes long, return ''Verify(q, hash<sub>TapSigHash</sub>(0x00 || SigMsg(0x00, 0)), sig)''<ref>'''Why is the input to ''hash<sub>TapSigHash</sub>'' prefixed with 0x00?''' This prefix is called the sighash epoch, and allows reusing the ''hash<sub>TapSigHash</sub>'' tagged hash in future signature algorithms that make invasive changes to how hashing is performed (as opposed to the ''ext_flag'' mechanism that is used for incremental extensions). An alternative is having them use a different tag, but supporting a growing number of tags may become undesirable.</ref>, where ''Verify'' is defined in [[bip-0340.mediawiki#design|BIP340]].
+* If the ''sig'' is 65 bytes long, return ''sig[64] &ne; 0x00<ref>'''Why can the <code>hash_type</code> not be <code>0x00</code> in 65-byte signatures?''' Permitting that would enable malleating (by third parties, including miners) 64-byte signatures into 65-byte ones, resulting in a different `wtxid` and a different fee rate than the creator intended</ref> and Verify(q, hash<sub>TapSighash</sub>(0x00 || SigMsg(sig[64], 0)), sig[0:64])''.
+* Otherwise, fail<ref>'''Why permit two signature lengths?''' By making the most common type of <code>hash_type</code> implicit, a byte can often be saved.</ref>.
+
+== Constructing and spending Taproot outputs ==
+
+This section discusses how to construct and spend Taproot outputs. It only affects wallet software that chooses to implement receiving and spending,
+and is not consensus critical in any way.
+
+Conceptually, every Taproot output corresponds to a combination of a single public key condition (the internal key), and zero or more general conditions encoded in scripts organized in a tree.
+Satisfying any of these conditions is sufficient to spend the output.
+
+'''Initial steps''' The first step is determining what the internal key and the organization of the rest of the scripts should be. The specifics are likely application dependent, but here are some general guidelines:
+* When deciding between scripts with conditionals (<code>OP_IF</code> etc.) and splitting them up into multiple scripts (each corresponding to one execution path through the original script), it is generally preferable to pick the latter.
+* When a single condition requires signatures with multiple keys, key aggregation techniques like MuSig can be used to combine them into a single key. The details are out of scope for this document, but note that this may complicate the signing procedure.
+* If one or more of the spending conditions consist of just a single key (after aggregation), the most likely one should be made the internal key. If no such condition exists, it may be worthwhile adding one that consists of an aggregation of all keys participating in all scripts combined; effectively adding an "everyone agrees" branch. If that is inacceptable, pick as internal key a point with unknown discrete logarithm. One example of such a point is ''H = point(0x0250929b74c1a04954b78b4b6035e97a5e078a5a0f28ec96d547bfee9ace803ac0)'' which is [https://github.com/ElementsProject/secp256k1-zkp/blob/11af7015de624b010424273be3d91f117f172c82/src/modules/rangeproof/main_impl.h#L16 constructed] by taking the hash of the standard uncompressed encoding of the [https://www.secg.org/sec2-v2.pdf secp256k1] base point ''G'' as X coordinate. In order to avoid leaking the information that key path spending is not possible it is recommended to pick a fresh integer ''r'' in the range ''0...n-1'' uniformly at random and use ''H + rG'' as internal key. It is possible to prove that this internal key does not have a known discrete logarithm with respect to ''G'' by revealing ''r'' to a verifier who can then reconstruct how the internal key was created.
+* If the spending conditions do not require a script path, the output key should commit to an unspendable script path instead of having no script path. This can be achieved by computing the output key point as ''Q = P + int(hash<sub>TapTweak</sub>(bytes(P)))G''. <ref>'''Why should the output key always have a taproot commitment, even if there is no script path?'''
+If the taproot output key is an aggregate of keys, there is the possibility for a malicious party to add a script path without being noticed by the other parties.
+This allows to bypass the multiparty policy and to steal the coin.
+MuSig key aggregation does not have this issue because it already causes the internal key to be randomized.
+
+The attack works as follows: Assume Alice and Mallory want to aggregate their keys into a taproot output key without a script path.
+In order to prevent key cancellation and related attacks they use [https://eprint.iacr.org/2018/483.pdf MSDL-pop] instead of MuSig.
+The MSDL-pop protocol requires all parties to provide a proof of possession of their corresponding secret key and the aggregated key is just the sum of the individual keys.
+After Mallory receives Alice's key ''A'', Mallory creates ''M = M<sub>0</sub> + int(t)G'' where ''M<sub>0</sub>'' is Mallory's original key and ''t'' allows a script path spend with internal key ''P = A + M<sub>0</sub>'' and a script that only contains Mallory's key.
+Mallory sends a proof of possession of ''M'' to Alice and both parties compute output key ''Q = A + M = P + int(t)G''.
+Alice will not be able to notice the script path, but Mallory can unilaterally spend any coin with output key ''Q''.
+</ref>
+* The remaining scripts should be organized into the leaves of a binary tree. This can be a balanced tree if each of the conditions these scripts correspond to are equally likely. If probabilities for each condition are known, consider constructing the tree as a Huffman tree.
+
+'''Computing the output script''' Once the spending conditions are split into an internal key <code>internal_pubkey</code> and a binary tree whose leaves are (leaf_version, script) tuples, the output script can be computed using the Python3 algorithms below. These algorithms take advantage of helper functions from the [bip-0340/referency.py BIP340 reference code] for integer conversion, point multiplication, and tagged hashes.
+
+First, we define <code>taproot_tweak_pubkey</code> for 32-byte [[bip-0340.mediawiki|BIP340]] public key arrays.
+In addition to the tweaked public key byte array, the function returns a boolean indicating whether the public key represents the tweaked point or its negation.
+This will be required for spending the output with a script path.
+In order to allow spending with the key path, we define <code>taproot_tweak_seckey</code> to compute the secret key for a tweaked public key.
+For any byte string <code>h</code> it holds that <code>taproot_tweak_pubkey(pubkey_gen(seckey), h)[0] == pubkey_gen(taproot_tweak_seckey(seckey, h))</code>.
+
+<source lang="python">
+def taproot_tweak_pubkey(pubkey, h):
+ t = int_from_bytes(tagged_hash("TapTweak", pubkey + h))
+ if t >= SECP256K1_ORDER:
+ raise ValueError
+ Q = point_add(point(pubkey), point_mul(G, t))
+ is_negated = not has_square_y(Q)
+ return bytes_from_int(x(Q)), is_negated
+
+def taproot_tweak_seckey(seckey0, h):
+ P = point_mul(G, int_from_bytes(seckey0))
+ seckey = SECP256K1_ORDER - seckey0 if not has_square_y(P) else seckey
+ t = int_from_bytes(tagged_hash("TapTweak", bytes_from_int(x(P)) + h))
+ if t >= SECP256K1_ORDER:
+ raise ValueError
+ return (seckey + t) % SECP256K1_ORDER
+</source>
+
+The following function, <code>taproot_output_script</code>, returns a byte array with the scriptPubKey (see [[bip-0141.mediawiki|BIP141]]).
+<code>ser_script</code> refers to a function that prefixes its input with a CCompactSize-encoded length.
+
+<source lang="python">
+def taproot_tree_helper(script_tree):
+ if isinstance(script_tree, tuple):
+ leaf_version, script = script_tree
+ h = tagged_hash("TapLeaf", bytes([leaf_version]) + ser_script(script))
+ return ([((leaf_version, script), bytes())], h)
+ left, left_h = taproot_tree_helper(script_tree[0])
+ right, right_h = taproot_tree_helper(script_tree[1])
+ ret = [(l, c + right_h) for l, c in left] + [(l, c + left_h) for l, c in right]
+ if right_h < left_h:
+ left_h, right_h = right_h, left_h
+ return (ret, tagged_hash("TapBranch", left_h + right_h))
+
+def taproot_output_script(internal_pubkey, script_tree):
+ """Given a internal public key and a tree of scripts, compute the output script.
+ script_tree is either:
+ - a (leaf_version, script) tuple (leaf_version is 0xc0 for [[bip-0342.mediawiki|BIP342]] scripts)
+ - a list of two elements, each with the same structure as script_tree itself
+ - None
+ """
+ if script_tree is None:
+ h = bytes()
+ else:
+ _, h = taproot_tree_helper(script_tree)
+ output_pubkey, _ = taproot_tweak_pubkey(internal_pubkey, h)
+ return bytes([0x51, 0x20]) + output_pubkey
+</source>
+
+[[File:bip-0341/tree.png|frame|This diagram shows the hashing structure to obtain the tweak from an internal key ''P'' and a Merkle tree consisting of 5 script leaves. ''A'', ''B'', ''C'' and ''E'' are ''TapLeaf'' hashes similar to ''D'' and ''AB'' is a ''TapBranch'' hash. Note that when ''CDE'' is computed ''E'' is hashed first because ''E'' is less than ''CD''.]]
+
+To spend this output using script ''D'', the control block would contain the following data in this order:
+
+ <control byte with leaf version and negation bit> <internal key p> <C> <E> <AB>
+
+The TapTweak would then be computed as described [[bip-0341.mediawiki#script-validation-rules|above]] like so:
+
+<source>
+D = tagged_hash("TapLeaf", bytes([leaf_version]) + ser_script(script))
+CD = tagged_hash("TapBranch", C + D)
+CDE = tagged_hash("TapBranch", E + CD)
+ABCDE = tagged_hash("TapBranch", AB + CDE)
+TapTweak = tagged_hash("TapTweak", p + ABCDE)
+</source>
+
+'''Spending using the key path''' A Taproot output can be spent with the secret key corresponding to the <code>internal_pubkey</code>. To do so, a witness stack consists of a single element: a [[bip-0340.mediawiki|BIP340]] signature on the signature hash as defined above, with the secret key tweaked by the same <code>h</code> as in the above snippet. See the code below:
+
+<source lang="python">
+def taproot_sign_key(script_tree, internal_seckey, hash_type):
+ _, h = taproot_tree_helper(script_tree)
+ output_seckey = taproot_tweak_seckey(internal_seckey, h)
+ sig = schnorr_sign(sighash(hash_type), output_seckey)
+ if hash_type != 0:
+ sig += bytes([hash_type])
+ return [sig]
+</source>
+
+This function returns the witness stack necessary and a <code>sighash</code> function to compute the signature hash as defined above (for simplicity, the snippet above ignores passing information like the transaction, the input position, ... to the sighashing code).
+
+'''Spending using one of the scripts''' A Taproot output can be spent by satisfying any of the scripts used in its construction. To do so, a witness stack consisting of the script's inputs, plus the script itself and the control block are necessary. See the code below:
+
+<source lang="python">
+def taproot_sign_script(internal_pubkey, script_tree, script_num, inputs):
+ info, h = taproot_tree_helper(script_tree)
+ (leaf_version, script), path = info[script_num]
+ _, is_negated = taproot_tweak_pubkey(internal_pubkey, h)
+ output_pubkey_tag = 0 if is_negated else 1
+ pubkey_data = bytes([output_pubkey_tag + leaf_version]) + internal_pubkey
+ return inputs + [script, pubkey_data + path]
+</source>
+
+== Security ==
+
+Taproot improves the privacy of Bitcoin because instead of revealing all possible conditions for spending an output, only the satisfied spending condition has to be published.
+Ideally, outputs are spent using the key path which prevents observers from learning the spending conditions of a coin.
+A key path spend could be a "normal" payment from a single- or multi-signature wallet or the cooperative settlement of hidden multiparty contract.
+
+A script path spend leaks that there is a script path and that the key path was not applicable - for example because the involved parties failed to reach agreement.
+Moreover, the depth of a script in the Merkle root leaks information including the minimum depth of the tree, which suggests specific wallet software that created the output and helps clustering.
+Therefore, the privacy of script spends can be improved by deviating from the optimal tree determined by the probability distribution over the leaves.
+
+Just like other existing output types, taproot outputs should never reuse keys, for privacy reasons.
+This does not only apply to the particular leaf that was used to spend an output but to all leaves committed to in the output.
+If leaves were reused, it could happen that spending a different output would reuse the same Merkle branches in the Merkle proof.
+Using fresh keys implies that taproot output construction does not need to take special measures to randomizing leaf positions because they are already randomized due to the branch-sorting Merkle tree construction used in taproot.
+This does not avoid leaking information through the leaf depth and therefore only applies to balanced (sub-) trees.
+In addition, every leaf should have a set of keys distinct from every other leaf.
+The reason for this is to increase leaf entropy and prevent an observer from learning an undisclosed script using brute-force search.
+
+== Test vectors ==
+
+Examples with creation transaction and spending transaction pairs, valid and invalid.
+
+Examples of preimage for sighashing for each of the sighash modes.
+
+== Rationale ==
+
+<references />
+
+== Deployment ==
+
+TODO
+
+== Backwards compatibility ==
+As a soft fork, older software will continue to operate without modification.
+Non-upgraded nodes, however, will consider all SegWit version 1 witness programs as anyone-can-spend scripts.
+They are strongly encouraged to upgrade in order to fully validate the new programs.
+
+Non-upgraded wallets can receive and send bitcoin from non-upgraded and upgraded wallets using SegWit version 0 programs, traditional pay-to-pubkey-hash, etc.
+Depending on the implementation non-upgraded wallets may be able to send to Segwit version 1 programs if they support sending to [[bip-0173.mediawiki|BIP173]] Bech32 addresses.
+
+== Acknowledgements ==
+
+This document is the result of discussions around script and signature improvements with many people, and had direct contributions from Greg Maxwell and others. It further builds on top of earlier published proposals such as Taproot by Greg Maxwell, and Merkle branch constructions by Russell O'Connor, Johnson Lau, and Mark Friedenbach.
+
+The authors wish the thank Arik Sosman for suggesting to sort Merkle node children before hashes, removing the need to transfer the position in the tree, as well as all those who provided valuable feedback and reviews, including the participants of the [https://github.com/ajtowns/taproot-review structured reviews].
diff --git a/bip-0341/tree.png b/bip-0341/tree.png
new file mode 100644
index 0000000..af56eda
--- /dev/null
+++ b/bip-0341/tree.png
Binary files differ
diff --git a/bip-0342.mediawiki b/bip-0342.mediawiki
new file mode 100644
index 0000000..3da9904
--- /dev/null
+++ b/bip-0342.mediawiki
@@ -0,0 +1,140 @@
+<pre>
+ BIP: 342
+ Layer: Consensus (soft fork)
+ Title: Validation of Taproot Scripts
+ Author: Pieter Wuille <pieter.wuille@gmail.com>
+ Jonas Nick <jonasd.nick@gmail.com>
+ Anthony Towns <aj@erisian.com.au>
+ Comments-Summary: No comments yet.
+ Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0342
+ Status: Draft
+ Type: Standards Track
+ Created: 2020-01-19
+ License: BSD-3-Clause
+ Requires: 340, 341
+ Post-History: 2019-05-06: https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2019-May/016914.html [bitcoin-dev] Taproot proposal
+</pre>
+
+==Introduction==
+
+===Abstract===
+
+This document specifies the semantics of the initial scripting system under [[bip-0341.mediawiki|BIP341]].
+
+===Copyright===
+
+This document is licensed under the 3-clause BSD license.
+
+===Motivation===
+
+[[bip-0341.mediawiki|BIP341]] proposes improvements to just the script structure, but some of its goals are incompatible with the semantics of certain opcodes within the scripting language itself.
+While it is possible to deal with these in separate optional improvements, their impact is not guaranteed unless they are addressed simultaneously with [[bip-0341.mediawiki|BIP341]] itself.
+
+Specifically, the goal is making '''Schnorr signatures''', '''batch validation''', and '''signature hash''' improvements available to spends that use the script system as well.
+
+==Design==
+
+In order to achieve these goals, signature opcodes <code>OP_CHECKSIG</code> and <code>OP_CHECKSIGVERIFY</code> are modified to verify Schnorr signatures as specified in [[bip-0340.mediawiki|BIP340]] and to use a signature message algorithm based on the common message calculation in [[bip-0341.mediawiki|BIP341]].
+The tapscript signature message also simplifies <code>OP_CODESEPARATOR</code> handling and makes it more efficient.
+
+The inefficient <code>OP_CHECKMULTISIG</code> and <code>OP_CHECKMULTISIGVERIFY</code> opcodes are disabled.
+Instead, a new opcode <code>OP_CHECKSIGADD</code> is introduced to allow creating the same multisignature policies in a batch-verifiable way.
+Tapscript uses a new, simpler signature opcode limit fixing complicated interactions with transaction weight.
+Furthermore, a potential malleability vector is eliminated by requiring MINIMALIF.
+
+Tapscript can be upgraded through soft forks by defining unknown key types, for example to add new <code>hash_types</code> or signature algorithms.
+Additionally, the new tapscript <code>OP_SUCCESS</code> opcodes allow introducing new opcodes more cleanly than through <code>OP_NOP</code>.
+
+==Specification==
+
+The rules below only apply when validating a transaction input for which all of the conditions below are true:
+* The transaction input is a '''segregated witness spend''' (i.e., the scriptPubKey contains a witness program as defined in [[bip-0141.mediawiki|BIP141]]).
+* It is a '''taproot spend''' as defined in [[bip-0341.mediawiki#design|BIP341]] (i.e., the witness version is 1, the witness program is 32 bytes, and it is not P2SH wrapped).
+* It is a '''script path spend''' as defined in [[bip-0341.mediawiki#design|BIP341]] (i.e., after removing the optional annex from the witness stack, two or more stack elements remain).
+* The leaf version is ''0xc0'' (i.e. the first byte of the last witness element after removing the optional annex is ''0xc0'' or ''0xc1''), marking it as a '''tapscript spend'''.
+
+Validation of such inputs must be equivalent to performing the following steps in the specified order.
+# If the input is invalid due to BIP141 or BIP341, fail.
+# The script as defined in BIP341 (i.e., the penultimate witness stack element after removing the optional annex) is called the '''tapscript''' and is decoded into opcodes, one by one:
+## If any opcode numbered ''80, 98, 126-129, 131-134, 137-138, 141-142, 149-153, 187-254'' is encountered, validation succeeds (none of the rules below apply). This is true even if later bytes in the tapscript would fail to decode otherwise. These opcodes are renamed to <code>OP_SUCCESS80</code>, ..., <code>OP_SUCCESS254</code>, and collectively known as <code>OP_SUCCESSx</code><ref>'''<code>OP_SUCCESSx</code>''' <code>OP_SUCCESSx</code> is a mechanism to upgrade the Script system. Using an <code>OP_SUCCESSx</code> before its meaning is defined by a softfork is insecure and leads to fund loss. The inclusion of <code>OP_SUCCESSx</code> in a script will pass it unconditionally. It precedes any script execution rules to avoid the difficulties in specifying various edge cases, for example: <code>OP_SUCCESSx</code> in a script with an input stack larger than 1000 elements, <code>OP_SUCCESSx</code> after too many signature opcodes, or even scripts with conditionals lacking <code>OP_ENDIF</code>. The mere existence of an <code>OP_SUCCESSx</code> anywhere in the script will guarantee a pass for all such cases. <code>OP_SUCCESSx</code> are similar to the <code>OP_RETURN</code> in very early bitcoin versions (v0.1 up to and including v0.3.5). The original <code>OP_RETURN</code> terminates script execution immediately, and return pass or fail based on the top stack element at the moment of termination. This was one of a major design flaws in the original bitcoin protocol as it permitted unconditional third party theft by placing an <code>OP_RETURN</code> in <code>scriptSig</code>. This is not a concern in the present proposal since it is not possible for a third party to inject an <code>OP_SUCCESSx</code> to the validation process, as the <code>OP_SUCCESSx</code> is part of the script (and thus committed to by the taproot output), implying the consent of the coin owner. <code>OP_SUCCESSx</code> can be used for a variety of upgrade possibilities:
+* An <code>OP_SUCCESSx</code> could be turned into a functional opcode through a softfork. Unlike <code>OP_NOPx</code>-derived opcodes which only have read-only access to the stack, <code>OP_SUCCESSx</code> may also write to the stack. Any rule changes to an <code>OP_SUCCESSx</code>-containing script may only turn a valid script into an invalid one, and this is always achievable with softforks.
+* Since <code>OP_SUCCESSx</code> precedes size check of initial stack and push opcodes, an <code>OP_SUCCESSx</code>-derived opcode requiring stack elements bigger than 520 bytes may uplift the limit in a softfork.
+* <code>OP_SUCCESSx</code> may also redefine the behavior of existing opcodes so they could work together with the new opcode. For example, if an <code>OP_SUCCESSx</code>-derived opcode works with 64-bit integers, it may also allow the existing arithmetic opcodes in the ''same script'' to do the same.
+* Given that <code>OP_SUCCESSx</code> even causes potentially unparseable scripts to pass, it can be used to introduce multi-byte opcodes, or even a completely new scripting language when prefixed with a specific <code>OP_SUCCESSx</code> opcode.</ref>.
+## If any push opcode fails to decode because it would extend past the end of the tapscript, fail.
+# If the '''initial stack''' as defined in BIP341 (i.e., the witness stack after removing both the optional annex and the two last stack elements after that) violates any resource limits (stack size, and size of the elements in the stack; see "Resource Limits" below), fail. Note that this check can be bypassed using <code>OP_SUCCESSx</code>.
+# The tapscript is executed according to the rules in the following section, with the initial stack as input.
+## If execution fails for any reason, fail.
+## If the execution results in anything but exactly one element on the stack which evaluates to true with <code>CastToBool()</code>, fail.
+# If this step is reached without encountering a failure, validation succeeds.
+
+===Script execution===
+
+The execution rules for tapscript are based on those for P2WSH according to BIP141, including the <code>OP_CHECKLOCKTIMEVERIFY</code> and <code>OP_CHECKSEQUENCEVERIFY</code> opcodes defined in [[bip-0065.mediawiki|BIP65]] and [[bip-0112.mediawiki|BIP112]], but with the following modifications:
+* '''Disabled script opcodes''' The following script opcodes are disabled in tapscript: <code>OP_CHECKMULTISIG</code> and <code>OP_CHECKMULTISIGVERIFY</code><ref>'''Why are <code>OP_CHECKMULTISIG</code> and <code>OP_CHECKMULTISIGVERIFY</code> disabled, and not turned into OP_SUCCESSx?''' This is a precaution to make sure people who accidentally keep using <code>OP_CHECKMULTISIG</code> in Tapscript notice a problem immediately. It also avoids the complication of script disassemblers needing to become context-dependent.</ref>. The disabled opcodes behave in the same way as <code>OP_RETURN</code>, by failing and terminating the script immediately when executed, and being ignored when found in unexecuted branch of the script.
+* '''Consensus-enforced MINIMALIF''' The MINIMALIF rules, which are only a standardness rule in P2WSH, are consensus enforced in tapscript. This means that the input argument to the <code>OP_IF</code> and <code>OP_NOTIF</code> opcodes must be either exactly 0 (the empty vector) or exactly 1 (the one-byte vector with value 1)<ref>'''Why make MINIMALIF consensus?''' This makes it considerably easier to write non-malleable scripts that take branch information from the stack.</ref>.
+* '''OP_SUCCESSx opcodes''' As listed above, some opcodes are renamed to <code>OP_SUCCESSx</code>, and make the script unconditionally valid.
+* '''Signature opcodes'''. The <code>OP_CHECKSIG</code> and <code>OP_CHECKSIGVERIFY</code> are modified to operate on Schnorr public keys and signatures (see [[bip-0340.mediawiki|BIP340]]) instead of ECDSA, and a new opcode <code>OP_CHECKSIGADD</code> is added.
+** The opcode 186 (<code>0xba</code>) is named as <code>OP_CHECKSIGADD</code>. <ref>'''<code>OP_CHECKSIGADD</code>''' This opcode is added to compensate for the loss of <code>OP_CHECKMULTISIG</code>-like opcodes, which are incompatible with batch verification. <code>OP_CHECKSIGADD</code> is functionally equivalent to <code>OP_ROT OP_SWAP OP_CHECKSIG OP_ADD</code>, but only takes 1 byte. All <code>CScriptNum</code>-related behaviours of <code>OP_ADD</code> are also applicable to <code>OP_CHECKSIGADD</code>.</ref><ref>'''Alternatives to <code>CHECKMULTISIG</code>''' There are multiple ways of implementing a threshold ''k''-of-''n'' policy using Taproot and Tapscript:
+* '''Using a single <code>OP_CHECKSIGADD</code>-based script''' A <code>CHECKMULTISIG</code> script <code>m <pubkey_1> ... <pubkey_n> n CHECKMULTISIG</code> with witness <code>0 <signature_1> ... <signature_m></code> can be rewritten as script <code><pubkey_1> CHECKSIG ... <pubkey_n> CHECKSIGADD m NUMEQUAL</code> with witness <code><w_1> ... <w_n></code>. Every witness element <code>w_i</code> is either a signature corresponding to <code>pubkey_i</code> or an empty vector. A similar <code>CHECKMULTISIGVERIFY</code> script can be translated to BIP342 by replacing <code>NUMEQUAL</code> with <code>NUMEQUALVERIFY</code>. This approach has very similar characteristics to the existing <code>OP_CHECKMULTISIG</code>-based scripts.
+* '''Using a ''k''-of-''k'' script for every combination''' A ''k''-of-''n'' policy can be implemented by splitting the script into several leaves of the Merkle tree, each implementing a ''k''-of-''k'' policy using <code><pubkey_1> CHECKSIGVERIFY ... <pubkey_(n-1)> CHECKSIGVERIFY <pubkey_n> CHECKSIG</code>. This may be preferable for privacy reasons over the previous approach, as it only exposes the participating public keys, but it is only more cost effective for small values of ''k'' (1-of-''n'' for any ''n'', 2-of-''n'' for ''n &ge; 6'', 3-of-''n'' for ''n &ge; 9'', ...). Furthermore, the signatures here commit to the branch used, which means signers need to be aware of which other signers will be participating, or produce signatures for each of the tree leaves.
+* '''Using an aggregated public key for every combination''' Instead of building a tree where every leaf consists of ''k'' public keys, it is possible instead build a tree where every leaf contains a single ''aggregate'' of those ''k'' keys using [https://eprint.iacr.org/2018/068 MuSig]. This approach is far more efficient, but does require a 3-round interactive signing protocol to jointly produce the (single) signature.
+* '''Native Schnorr threshold signatures''' Multisig policies can also be realized with [http://cacr.uwaterloo.ca/techreports/2001/corr2001-13.ps threshold signatures] using verifiable secret sharing. This results in outputs and inputs that are indistinguishable from single-key payments, but at the cost of needing an interactive protocol (and associated backup procedures) before determining the address to send to.</ref>
+
+===Rules for signature opcodes===
+
+The following rules apply to <code>OP_CHECKSIG</code>, <code>OP_CHECKSIGVERIFY</code>, and <code>OP_CHECKSIGADD</code>.
+
+* For <code>OP_CHECKSIGVERIFY</code> and <code>OP_CHECKSIG</code>, the public key (top element) and a signature (second to top element) are popped from the stack.
+** If fewer than 2 elements are on the stack, the script MUST fail and terminate immediately.
+* For <code>OP_CHECKSIGADD</code>, the public key (top element), a <code>CScriptNum</code> <code>n</code> (second to top element), and a signature (third to top element) are popped from the stack.
+** If fewer than 3 elements are on the stack, the script MUST fail and terminate immediately.
+** If <code>n</code> is larger than 4 bytes, the script MUST fail and terminate immediately.
+* If the public key size is zero, the script MUST fail and terminate immediately.
+* If the public key size is 32 bytes, it is considered to be a public key as described in BIP340:
+** If the signature is not the empty vector, the signature is validated against the public key (see the next subsection). Validation failure in this case immediately terminates script execution with failure.
+* If the public key size is not zero and not 32 bytes, the public key is of an ''unknown public key type''<ref>'''Unknown public key types''' allow adding new signature validation rules through softforks. A softfork could add actual signature validation which either passes or makes the script fail and terminate immediately. This way, new <code>SIGHASH</code> modes can be added, as well as [https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-December/016549.html NOINPUT-tagged public keys] and a public key constant which is replaced by the taproot internal key for signature validation.</ref> and no actual signature verification is applied. During script execution of signature opcodes they behave exactly as known public key types except that signature validation is considered to be successful.
+* If the script did not fail and terminate before this step, regardless of the public key type:
+** If the signature is the empty vector:
+*** For <code>OP_CHECKSIGVERIFY</code>, the script MUST fail and terminate immediately.
+*** For <code>OP_CHECKSIG</code>, an empty vector is pushed onto the stack, and execution continues with the next opcode.
+*** For <code>OP_CHECKSIGADD</code>, a <code>CScriptNum</code> with value <code>n</code> is pushed onto the stack, and execution continues with the next opcode.
+** If the signature is not the empty vector, the opcode is counted towards the sigops budget (see further).
+*** For <code>OP_CHECKSIGVERIFY</code>, execution continues without any further changes to the stack.
+*** For <code>OP_CHECKSIG</code>, a 1-byte value <code>0x01</code> is pushed onto the stack.
+*** For <code>OP_CHECKSIGADD</code>, a <code>CScriptNum</code> with value of <code>n + 1</code> is pushed onto the stack.
+
+===Signature validation===
+
+To validate a signature ''sig'' with public key ''p'':
+* Compute the tapscript message extension ''ext'', consisting of the concatenation of:
+** ''tapleaf_hash'' (32): the tapleaf hash as defined in [[bip-0341.mediawiki#design|BIP341]]
+** ''key_version'' (1): a constant value ''0x00'' representing the current version of public keys in the tapscript signature opcode execution.
+** ''codesep_pos'' (4): the opcode position of the last executed <code>OP_CODESEPARATOR</code> before the currently executed signature opcode, with the value in little endian (or ''0xffffffff'' if none executed). The first opcode in a script has a position of 0. A multi-byte push opcode is counted as one opcode, regardless of the size of data being pushed.
+* If the ''sig'' is 64 bytes long, return ''Verify(p, hash<sub>TapSigHash</sub>(0x00 || SigMsg(0x00, 1) || ext), sig)'', where ''Verify'' is defined in [[bip-0340.mediawiki#design|BIP340]].
+* If the ''sig'' is 65 bytes long, return ''sig[64] &ne; 0x00 and Verify(p, hash<sub>TapSighash</sub>(0x00 || SigMsg(sig[64], 1) || ext), sig[0:64])''.
+* Otherwise, fail.
+
+In summary, the semantics of signature validation is identical to BIP340, except the following:
+# The signature message includes the tapscript-specific data ''key_version''.<ref>'''Why does the signature message commit to the ''key_version''?''' This is for future extensions that define unknown public key types, making sure signatures can't be moved from one key type to another.</ref>
+# The signature message commits to the executed script through the ''tapleaf_hash'' which includes the leaf version and script instead of ''scriptCode''. This implies that this commitment is unaffected by <code>OP_CODESEPARATOR</code>.
+# The signature message includes the opcode position of the last executed <code>OP_CODESEPARATOR</code>.<ref>'''Why does the signature message include the position of the last executed <code>OP_CODESEPARATOR</code>?''' This allows continuing to use <code>OP_CODESEPARATOR</code> to sign the executed path of the script. Because the <code>codeseparator_position</code> is the last input to the hash, the SHA256 midstate can be efficiently cached for multiple <code>OP_CODESEPARATOR</code>s in a single script. In contrast, the BIP143 handling of <code>OP_CODESEPARATOR</code> is to commit to the executed script only from the last executed <code>OP_CODESEPARATOR</code> onwards which requires unnecessary rehashing of the script. It should be noted that the one known <code>OP_CODESEPARATOR</code> use case of saving a second public key push in a script by sharing the first one between two code branches can be most likely expressed even cheaper by moving each branch into a separate taproot leaf.</ref>
+
+===Resource limits===
+
+In addition to changing the semantics of a number of opcodes, there are also some changes to the resource limitations:
+* '''Script size limit''' The maximum script size of 10000 bytes does not apply. Their size is only implicitly bounded by the block weight limit.<ref>'''Why is a limit on script size no longer needed?''' Since there is no <code>scriptCode</code> directly included in the signature hash (only indirectly through a precomputable tapleaf hash), the CPU time spent on a signature check is no longer proportional to the size of the script being executed.</ref>
+* '''Non-push opcodes limit''' The maximum non-push opcodes limit of 201 per script does not apply.<ref>'''Why is a limit on the number of opcodes no longer needed?''' An opcode limit only helps to the extent that it can prevent data structures from growing unboundedly during execution (both because of memory usage, and because of time that may grow in proportion to the size of those structures). The size of stack and altstack is already independently limited. By using O(1) logic for <code>OP_IF</code>, <code>OP_NOTIF</code>, <code>OP_ELSE</code>, and <code>OP_ENDIF</code> as suggested [https://bitslog.com/2017/04/17/new-quadratic-delays-in-bitcoin-scripts/ here] and implemented [https://github.com/bitcoin/bitcoin/pull/16902 here], the only other instance can be avoided as well.</ref>
+* '''Sigops limit''' The sigops in tapscripts do not count towards the block-wide limit of 80000 (weighted). Instead, there is a per-script sigops ''budget''. The budget equals 50 + the total serialized size in bytes of the transaction input's witness (including the <code>CCompactSize</code> prefix). Executing a signature opcode (<code>OP_CHECKSIG</code>, <code>OP_CHECKSIGVERIFY</code>, or <code>OP_CHECKSIGADD</code>) with a non-empty signature decrements the budget by 50. If that brings the budget below zero, the script fails immediately. Signature opcodes with unknown public key type and non-empty signature are also counted.<ref>'''The tapscript sigop limit''' The signature opcode limit protects against scripts which are slow to verify due to excessively many signature operations. In tapscript the number of signature opcodes does not count towards the BIP141 or legacy sigop limit. The old sigop limit makes transaction selection in block construction unnecessarily difficult because it is a second constraint in addition to weight. Instead, the number of tapscript signature opcodes is limited by witness weight. Additionally, the limit applies to the transaction input instead of the block and only actually executed signature opcodes are counted. Tapscript execution allows one signature opcode per 50 witness weight units plus one free signature opcode.</ref><ref>'''Parameter choice of the sigop limit''' Regular witnesses are unaffected by the limit as their weight is composed of public key and (<code>SIGHASH_ALL</code>) signature pairs with ''33 + 65'' weight units each (which includes a 1 weight unit <code>CCompactSize</code> tag). This is also the case if public keys are reused in the script because a signature's weight alone is 65 or 66 weight units. However, the limit increases the fees of abnormal scripts with duplicate signatures (and public keys) by requiring additional weight. The weight per sigop factor 50 corresponds to the ratio of BIP141 block limits: 4 mega weight units divided by 80,000 sigops. The "free" signature opcode permitted by the limit exists to account for the weight of the non-witness parts of the transaction input.</ref><ref>'''Why are only signature opcodes counted toward the budget, and not for example hashing opcodes or other expensive operations?''' It turns out that the CPU cost per witness byte for verification of a script consisting of the maximum density of signature checking opcodes (taking the 50 WU/sigop limit into account) is already very close to that of scripts packed with other opcodes, including hashing opcodes (taking the 520 byte stack element limit into account) and <code>OP_ROLL</code> (taking the 1000 stack element limit into account). That said, the construction is very flexible, and allows adding new signature opcodes like <code>CHECKSIGFROMSTACK</code> to count towards the limit through a soft fork. Even if in the future new opcodes are introduced which change normal script cost there is no need to stuff the witness with meaningless data. Instead, the taproot annex can be used to add weight to the witness without increasing the actual witness size.</ref>.
+* '''Stack + altstack element count limit''' The existing limit of 1000 elements in the stack and altstack together after every executed opcode remains. It is extended to also apply to the size of initial stack.
+* '''Stack element size limit''' The existing limit of maximum 520 bytes per stack element remains, both in the initial stack and in push opcodes.
+
+==Rationale==
+
+<references />
+
+==Examples==
+
+==Acknowledgements==
+
+This document is the result of many discussions and contains contributions by a number of people. The authors wish to thank all those who provided valuable feedback and reviews, including the participants of the [https://github.com/ajtowns/taproot-review structured reviews].
diff --git a/scripts/buildtable.pl b/scripts/buildtable.pl
index 144ede6..75ace26 100755
--- a/scripts/buildtable.pl
+++ b/scripts/buildtable.pl
@@ -34,6 +34,7 @@ my %MiscField = (
'Discussions-To' => undef,
'Post-History' => undef,
'Replaces' => undef,
+ 'Requires' => undef,
'Superseded-By' => undef,
);