Subject: A breakdown of Bitcoin "standard" script types
When challenged recently to provide a little-known bitcoin fact, I offered that "Addresses are not stored anywhere in the blockchain". This got me thinking a bit more about the bitcoin op codes and the scripting language they describe. There is a good wiki article on it all as a refresher. It's basically a stack-based language similar to Forth or RPL. Here's an example of a Mancala game I wrote in RPL to show more complex code.
Pay to Pubkey
The original bitcoin client defined two fields, scriptSig and scriptPubKey, which each contained half of the script needed to validate a transaction. The two scripts were concatenated together to create a complete script. Here's an example of a Pay to Pubkey script:
P2PK | size | script |
---|---|---|
scriptSig | 72 | <sig> |
scriptPubKey | 35 | <pubkey> OP_CHECKSIG |
assembled | | <scriptSig> <scriptPubKey> |
btc_address | | b58_encode(pfx + hash160(spk[1:34])) |
Test | | len(spk) == 35 and (spk[0:1] + spk[34:35]).hex() == '21ac' |
Total vB | 107 | 72 + 35 |
Since the OP_CHECKSIG operation takes two arguments, this can be interpreted as txin.OP_CHECKSIG(<pubkey>, <sig>) from a non-stack-based language perspective. As for TXN size, the total size of one of these assembled scripts is 107 vB (bytes). As for bitcoin addresses, the address is derived by chopping off the first and last bytes (op codes) from the scriptPubKey (spk), then performing a Hash160 operation on the remaining data. The script is recognized by its length and its first and last op codes (OP_PUSH, OP_CHECKSIG).
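To make the concatenation concrete, here is a minimal Python sketch of a toy stack evaluator. It is not consensus code: `run_script` and `fake_check_sig` are hypothetical names invented for this illustration, the signature check is a stub, and the truthiness rule is simplified.

```python
# Toy evaluator: NOT consensus code. Signature checking is stubbed out.
def run_script(script, check_sig):
    """Evaluate a list of items: bytes are data pushes, strings are op codes."""
    stack = []
    for item in script:
        if isinstance(item, bytes):
            stack.append(item)          # data push
        elif item == "OP_CHECKSIG":
            pubkey = stack.pop()        # top of the stack
            sig = stack.pop()           # next item down
            stack.append(b"\x01" if check_sig(pubkey, sig) else b"")
        else:
            raise ValueError(f"unsupported op code: {item}")
    # Simplified rule: the script passes if the top item is non-empty.
    return bool(stack) and stack[-1] != b""


def fake_check_sig(pubkey, sig):
    # Hypothetical stand-in: a real node verifies an ECDSA signature here.
    return sig == b"signed-by:" + pubkey


script_sig = [b"signed-by:alice-pubkey"]           # <sig>
script_pubkey = [b"alice-pubkey", "OP_CHECKSIG"]   # <pubkey> OP_CHECKSIG
assembled = script_sig + script_pubkey             # <scriptSig> <scriptPubKey>

print(run_script(assembled, fake_check_sig))       # True
```

Because the scriptSig runs first, the signature sits below the pubkey on the stack by the time OP_CHECKSIG executes, which is why the two pops come out in that order.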
In the original client, P2PK was used for what was termed "Pay to IP". In this process, you would enter an IP address in the PayTo field, and the client would connect to the remote node to receive a scriptPubKey from them.
Pay to Public Key Hash
Along with P2PK, the original client also supported P2PKH, termed "Pay to address". Since addresses were always stored as the Hash160 of the pubkey, this format had the advantage of requiring no secondary piece of information. All the sender needed was the bitcoin address, whereas in P2PK the sender needed the pubkey and could derive the address from it. But pubkeys are long and generally not checksummed the way bitcoin address notation is. Having the sender need only a small checksummed hash was simpler and became much more widely used, although it does require a larger scriptSig (the pubkey must now be supplied along with the signature), making it more expensive to spend.
P2PKH | size | script |
---|---|---|
scriptSig | 106 | <sig> <pubkey> |
scriptPubKey | 25 | OP_DUP OP_HASH160 <pkHash> OP_EQUALVERIFY OP_CHECKSIG |
assembled | | <scriptSig> <scriptPubKey> |
btc_address | | b58_encode(pfx + spk[3:23]) |
Test | | len(spk) == 25 and (spk[0:3] + spk[23:25]).hex() == '76a91488ac' |
Total vB | 131 | 106 + 25 |
The total size of one of these assembled scripts is 131 vB (bytes). As for bitcoin addresses, the address is derived by chopping off the first 3 and last 2 bytes (op codes) from the scriptPubKey (spk). The script is recognized by its length and by its first 3 and last 2 bytes (OP_DUP, OP_HASH160, OP_PUSH, OP_EQUALVERIFY, OP_CHECKSIG).
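The address derivation can be sketched in Python as follows. This is a minimal illustration, assuming that b58_encode in the table means Base58Check (payload plus the first four bytes of a double SHA-256) and that pfx is the mainnet version byte 0x00. The function names and the sample 20-byte hash are chosen for the example.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58check_encode(payload: bytes) -> str:
    """Base58Check: append a 4-byte double-SHA256 checksum, then base58."""
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    data = payload + checksum
    num = int.from_bytes(data, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = B58_ALPHABET[rem] + encoded
    # Each leading zero byte is written as a literal '1'.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + encoded


def p2pkh_address(spk: bytes, pfx: bytes = b"\x00") -> str:
    """Recognize a P2PKH scriptPubKey and derive its address."""
    if not (len(spk) == 25 and (spk[0:3] + spk[23:25]).hex() == "76a91488ac"):
        raise ValueError("not a P2PKH scriptPubKey")
    return b58check_encode(pfx + spk[3:23])


# Sample scriptPubKey wrapping an example 20-byte pubkey hash.
spk = bytes.fromhex("76a914" + "62e907b15cbf27d5425399ebf6f0fb50ebb88f18" + "88ac")
print(p2pkh_address(spk))
```

Note that no hashing of the key happens here: the Hash160 is already sitting in the scriptPubKey, so the only hashing left is the checksum.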
Pay to Script Hash
So this two-script concatenation worked well for the first three years, but eventually more flexibility was desired and BIP-16 was introduced. It was a simple enough concept, but if you're looking at a scripting engine defined 100% by the stack and the two TXN script segments, a completed script cannot be created. You need to invent a new op code, OP_DESERIALIZE, and then insert some op codes that were never provided in the script at all and exist purely in this scripting engine. The concept of OP_DESERIALIZE is to take the top data element, redeemScript, and reinterpret it as code instead of data.
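As a rough sketch of what "reinterpret data as code" means, the Python below parses a byte string into a list of op codes and data pushes. The `deserialize` function is a hypothetical helper named after the article's invented OP_DESERIALIZE. It only knows the handful of op codes used in this post and only handles direct pushes of 1 to 75 bytes.

```python
# Only the op codes that appear in this article; real Script has many more.
OP_NAMES = {
    0x00: "OP_0",
    0x76: "OP_DUP",
    0x87: "OP_EQUAL",
    0x88: "OP_EQUALVERIFY",
    0xA9: "OP_HASH160",
    0xAC: "OP_CHECKSIG",
}


def deserialize(script: bytes) -> list:
    """Turn raw script bytes into a list of op code names and pushed data."""
    ops = []
    i = 0
    while i < len(script):
        byte = script[i]
        i += 1
        if 0x01 <= byte <= 0x4B:
            # Direct push: the byte itself is the length of the data.
            data = script[i:i + byte]
            if len(data) != byte:
                raise ValueError("truncated push")
            ops.append(data)
            i += byte
        elif byte in OP_NAMES:
            ops.append(OP_NAMES[byte])
        else:
            raise ValueError(f"op code 0x{byte:02x} not handled in this sketch")
    return ops


# A redeemScript that is just a P2PK script; the 33-byte key is a placeholder.
redeem_script = bytes([0x21]) + b"\x02" * 33 + bytes([0xAC])
print(deserialize(redeem_script))
```

In the scriptSig the redeemScript is a single opaque data element. Only after its hash matches does the engine run something like this over it and execute the result.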
P2SH | size | script |
---|---|---|
scriptSig | ?? | <sig> <<redeemScript>> |
scriptPubKey | 23 | OP_HASH160 <rsHash> OP_EQUAL |
assembled | | <scriptSig> OP_DUP <scriptPubKey> OP_VERIFY OP_DESERIALIZE |
btc_address | | b58_encode(pfx + spk[2:22]) |
Test | | len(spk) == 23 and (spk[0:2] + spk[22:23]).hex() == 'a91487' |
Total vB | 96+ | 73 + len(redeemScript) + 23 |
The total size on the blockchain for a P2SH spent output will be at least 97 bytes. The actual size depends on the size of redeemScript. The majority of non-segwit P2SH transactions are multisig related. At the time of BIP-16, multisig (P2MS) was already widely adopted, though it was mostly done in the scriptPubKey element. As before, this put the burden on the sender to maintain an intricate scriptPubKey instead of a simple bitcoin address. P2SH allows complex scripts to be used while still providing basic pay-to-address semantics. The address is derived like most pay-to-address outputs, though a different prefix (pfx) is used. The script is recognized by its length and by its first two bytes and last byte (OP_HASH160, OP_PUSH, OP_EQUAL).
Pay to Witness Public Key Hash
The last four script types were all introduced with Segregated Witness (BIP-141). In order for Segwit to allow backward compatibility, the scriptSig and scriptPubKey elements are either empty or consist of nothing more than data elements (OP_PUSH). Since non-zero data will always pass validation, this makes all segwit TXNs default to valid if witness data is not included. Like P2SH, a lot of the op codes are implied, and to make the point I'll artificially insert them here as we did with P2SH.
The P2WPKH is modeled after the P2PKH, but the contents of the scriptSig are moved to the witness and most of the op codes are implied. Many scripts are also prefixed with OP_0 to signify segwit enablement. The goal of segwit was to allow blocks to expand to something approaching 4MiB while not breaking older implementations. So you can still only have 1MiB of "legacy" block data, but you can have up to 3MiB of witness data... well kinda... the real WU math is a bit more complex.
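The WU math itself is short. The sketch below follows the BIP-141 rule that each non-witness byte costs 4 weight units and each witness byte costs 1, with a block capped at 4,000,000 WU. The function names are made up for this example, and the fractional result mirrors the per-script arithmetic used in this post (real nodes round a whole transaction's virtual size up).

```python
MAX_BLOCK_WEIGHT = 4_000_000  # BIP-141 block limit, in weight units (WU)


def weight_units(non_witness_bytes: int, witness_bytes: int) -> int:
    """Non-witness bytes count four times, witness bytes count once."""
    return 4 * non_witness_bytes + witness_bytes


def virtual_bytes(non_witness_bytes: int, witness_bytes: int) -> float:
    """Virtual size is weight divided by four."""
    return weight_units(non_witness_bytes, witness_bytes) / 4


# P2WPKH: 22-byte scriptPubKey plus a 107-byte witness.
print(virtual_bytes(22, 107))   # 48.75
# P2PKH: 25 + 106 bytes, none of it witness data.
print(virtual_bytes(131, 0))    # 131.0
# With no witness data at all, a block still tops out at 1,000,000 bytes.
print(MAX_BLOCK_WEIGHT // 4)    # 1000000
```

This is why "3MiB of witness data" is only kinda true: a block can only approach 4MB if it is almost entirely witness bytes.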
P2WPKH | size | script |
---|---|---|
witness | 107 | <sig> <pubkey> |
scriptPubKey | 22 | OP_0 <pkHash> |
assembled | | <witness> OP_DUP OP_HASH160 <scriptPubKey> OP_SWAP OP_DROP OP_EQUALVERIFY OP_CHECKSIG |
btc_address | | b32_encode(pfx + spk[2:22]) |
Test | | len(spk) == 22 and (spk[0:2]).hex() == '0014' |
Total vB | 48.75 | 22 + 107/4 |
For those keeping score, you'll notice that the witness is 107 bytes, yet the same scriptSig elsewhere is 106. This is because the witness has to include an element count (0x02) so it can be deserialized. I won't get into those specifics since I think we are already getting off in the weeds. You'll also notice with the WU math, we get to apply a 75% discount to the witness. This gives our "virtual size" in the block at 48.75, making P2WPKH far and away the least expensive script type. The address is derived from the last 20 bytes of scriptPubKey, but by identifying the scriptPubKey as a P2WPKH type, the address will use bech32 encoding instead of base58 encoding.
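For the curious, here is a compact Python sketch of the bech32 encoding, adapted from the BIP-173 reference algorithm. It assumes the mainnet human-readable part "bc" and a version 0 program only (later witness versions use the bech32m variant, which is not covered here). `p2wpkh_address` is a name chosen for this example.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GENERATOR = [0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3]


def bech32_polymod(values):
    """Checksum polynomial from BIP-173."""
    chk = 1
    for value in values:
        top = chk >> 25
        chk = (chk & 0x1FFFFFF) << 5 ^ value
        for i in range(5):
            chk ^= GENERATOR[i] if ((top >> i) & 1) else 0
    return chk


def bech32_checksum(hrp, data):
    """Six 5-bit checksum symbols over the expanded hrp and the data."""
    expanded = [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]
    polymod = bech32_polymod(expanded + data + [0] * 6) ^ 1
    return [(polymod >> 5 * (5 - i)) & 31 for i in range(6)]


def convert_8_to_5(data: bytes):
    """Regroup 8-bit bytes into 5-bit symbols, padding the tail with zeros."""
    acc, bits, out = 0, 0, []
    for byte in data:
        acc = (acc << 8) | byte
        bits += 8
        while bits >= 5:
            bits -= 5
            out.append((acc >> bits) & 31)
    if bits:
        out.append((acc << (5 - bits)) & 31)
    return out


def p2wpkh_address(spk: bytes, hrp: str = "bc") -> str:
    """Recognize a P2WPKH scriptPubKey and bech32-encode its 20-byte hash."""
    if not (len(spk) == 22 and spk[0:2].hex() == "0014"):
        raise ValueError("not a P2WPKH scriptPubKey")
    data = [0] + convert_8_to_5(spk[2:22])   # witness version 0, then the hash
    symbols = data + bech32_checksum(hrp, data)
    return hrp + "1" + "".join(CHARSET[s] for s in symbols)


# scriptPubKey from the BIP-173 example addresses.
spk = bytes.fromhex("0014751e76e8199196d454941c45d1b3a323f1433bd6")
print(p2wpkh_address(spk))   # bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4
```

Unlike the base58 case, the version (the OP_0) is encoded as its own 5-bit symbol, which is why every version 0 mainnet address starts with "bc1q".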
Pay to Witness Script Hash
P2WSH | size | script |
---|---|---|
witness | ?? | <sig> <<witnessScript>> |
scriptPubKey | 34 | OP_0 <wsHash> |
assembled | | <witness> OP_DUP OP_SHA256 <scriptPubKey> OP_SWAP OP_DROP OP_EQUALVERIFY OP_DESERIALIZE |
btc_address | | b32_encode(pfx + spk[2:34]) |
Test | | len(spk) == 34 and (spk[0:2]).hex() == '0020' |
Total vB | 52+ | 34 + (74 + len(witnessScript))/4 |
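A small Python sketch of how a P2WSH scriptPubKey is built and sized. BIP-141 defines the 32-byte program as a single SHA-256 of the witness script, which is why this scriptPubKey is 34 bytes rather than 22. The function names and the placeholder witness script are assumptions for illustration only.

```python
import hashlib


def p2wsh_script_pubkey(witness_script: bytes) -> bytes:
    """OP_0 followed by a 32-byte push of SHA256(witnessScript)."""
    ws_hash = hashlib.sha256(witness_script).digest()
    return b"\x00\x20" + ws_hash


def p2wsh_virtual_bytes(witness_script_len: int) -> float:
    """Size formula from the table: 34 + (74 + len(witnessScript)) / 4."""
    return 34 + (74 + witness_script_len) / 4


# Placeholder witness script: <33-byte pubkey> OP_CHECKSIG (not a real key).
witness_script = bytes([0x21]) + b"\x02" * 33 + bytes([0xAC])
spk = p2wsh_script_pubkey(witness_script)

print(len(spk), spk[0:2].hex())                  # 34 0020
print(p2wsh_virtual_bytes(len(witness_script)))  # 61.25
```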
P2SH Encapsulating Pay to Witness Public Key Hash
P2SH-P2WPKH | size | script |
---|---|---|
witness | 107 | <sig> <pubkey> |
scriptSig | 23 | <OP_0 <pkHash>> |
scriptPubKey | 23 | OP_HASH160 <ssHash> OP_EQUAL |
assembled | | <witness> OP_DUP OP_HASH160 <scriptSig> OP_DUP <scriptPubKey> OP_VERIFY OP_DESERIALIZE OP_SWAP OP_DROP OP_EQUALVERIFY OP_CHECKSIG |
btc_address | | b58_encode(pfx + spk[2:22]) |
Test | | is_p2sh() and len(ss) == 23 and (ss[0:3]).hex() == '160014' |
Total vB | 72.75 | 23 + 23 + 107/4 |
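The "Test" rows from all of the tables can be collected into one Python sketch. The `classify` function is a hypothetical helper, not something from any client, and it uses the same byte-pattern checks shown above with spk for the scriptPubKey and ss for the scriptSig. The hashes in the usage lines are zero-filled placeholders.

```python
def classify(spk: bytes, ss: bytes = b"") -> str:
    """Name the script type from the scriptPubKey (and scriptSig, if given)."""
    if len(spk) == 35 and (spk[0:1] + spk[34:35]).hex() == "21ac":
        return "P2PK"
    if len(spk) == 25 and (spk[0:3] + spk[23:25]).hex() == "76a91488ac":
        return "P2PKH"
    if len(spk) == 23 and (spk[0:2] + spk[22:23]).hex() == "a91487":
        # A P2SH output whose scriptSig is a single push of OP_0 <pkHash>.
        if len(ss) == 23 and ss[0:3].hex() == "160014":
            return "P2SH-P2WPKH"
        return "P2SH"
    if len(spk) == 22 and spk[0:2].hex() == "0014":
        return "P2WPKH"
    if len(spk) == 34 and spk[0:2].hex() == "0020":
        return "P2WSH"
    return "unknown"


# Placeholder scripts: the hashes are all zero bytes.
p2sh_spk = bytes.fromhex("a914") + bytes(20) + bytes.fromhex("87")
nested_ss = bytes.fromhex("160014") + bytes(20)

print(classify(bytes.fromhex("0014") + bytes(20)))   # P2WPKH
print(classify(p2sh_spk))                            # P2SH
print(classify(p2sh_spk, nested_ss))                 # P2SH-P2WPKH
```

Note that a P2SH-P2WPKH output is indistinguishable from any other P2SH output until it is spent, since only the scriptSig reveals what is wrapped inside.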
- https://satoshi.nakamotoinstitute.org/code/
- https://blockgeeks.com/guides/best-bitcoin-script-guide/
- https://blockgeeks.com/guides/bitcoin-script-guide-part-2/
- https://en.bitcoin.it/wiki/Script
- https://github.com/bitcoin/bitcoin
- https://github.com/bitcoin/bips/blob/master/bip-0016.mediawiki
- https://github.com/bitcoin/bips/blob/master/bip-0141.mediawiki
- https://en.bitcoin.it/wiki/Script#Obsolete_pay-to-pubkey_transaction