The Python Quants

Hashing, Encryption, Blockchain and Bitcoin Mining with Python

A free Webinar from The Python Quants

Dr. Yves J. Hilpisch

team@tpq.io | http://tpq.io

The Python Quants GmbH

Some Bitcoin Data

Total Number of Bitcoins

We work with data from http://quandl.com. First, the total number of bitcoins mined.

In [2]:
import quandl as q
bn = q.get('BCHAIN/TOTBC') / 1e6  # in millions
In [3]:
bn.plot(figsize=(10, 6));

Total Number of Transactions

The total number of transactions.

In [4]:
bt = q.get('BCHAIN/NTRAT') / 1e6  # in millions
In [5]:
bt.plot(figsize=(10, 6));

Number of Unique Addresses

The number of unique bitcoin addresses.

In [6]:
ba = q.get('BCHAIN/NADDU')
In [7]:
ba.plot(figsize=(10, 6));

Bitcoin Value in USD

The USD/Bitcoin exchange rate.

In [8]:
be = q.get('BCHAIN/MKPRU')
In [9]:
be.plot(figsize=(10, 6));

Bitcoin Value in USD

The market capitalization in USD.

In [10]:
bm = q.get('BCHAIN/MKTCP') / 1e9  # in billions
In [11]:
bm.plot(figsize=(10, 6));

Bitcoin Transaction Volume (Units)

The Bitcoin transaction volume (in Bitcoins).

In [12]:
bv = q.get('BCHAIN/ETRAV')
In [13]:
bv.plot(figsize=(10, 6));

Bitcoin Transaction Volume (USD)

The Bitcoin transaction volume (in USD).

In [14]:
v = be * bv / 1e9  # in billions
In [15]:
v.plot(figsize=(10, 6));

Bitcoin Mining Difficulty

The bitcoin mining difficulty. How hard is it to mine a new bitcoin block?

In [16]:
bd = q.get('BCHAIN/DIFF') / 1e9  # in billions
In [17]:
bd.plot(figsize=(10, 6));

Bitcoin Network Hashrate

The hashrate of the Bitcoin mining network. How many giga hashes per second (GH/s) does the bitcoin mining network calculate per second?

In [18]:
bh = q.get('BCHAIN/HRATE')
In [19]:
bh.plot(figsize=(10, 6));

Hashing

From https://en.wikipedia.org/wiki/Hash_function:

A hash function is any function that can be used to map data of arbitrary size to data of fixed size. The values returned by a hash function are called hash values, hash codes, hash sums, or simply hashes.

A Simple Hash Function

The first simplistic hash function that we consider maps any string to a three digit integer. It uses ordinal numbers of one-character string objects.

In [20]:
ord('a')
Out[20]:
97
In [21]:
chr(97)
Out[21]:
'a'

The implementation of the simplistic hash function ("average integer ordinal number").

In [22]:
def hash_function(text):
    value = sum([ord(l) for l in text]) / len(text)
    return '%03d' % value

Some examples.

In [23]:
hash_function('!')
Out[23]:
'033'
In [24]:
hash_function('yves')
Out[24]:
'113'
In [25]:
hash_function('yves2')
Out[25]:
'101'
In [26]:
hash_function('Quant4711')
Out[26]:
'080'

Collisions are easily found.

In [27]:
hash_function('yves')
Out[27]:
'113'
In [28]:
hash_function('sevy')
Out[28]:
'113'

Our function has a target space of 10 bits (only).

In [29]:
bin(999)
Out[29]:
'0b1111100111'
In [30]:
2 ** 10
Out[30]:
1024

Modern hash functions have a target space of 128 bits, i.e. $2^{128} - 1$ possible values.

In [31]:
2 ** 128
Out[31]:
340282366920938463463374607431768211456
In [32]:
hex(2 ** 128)
Out[32]:
'0x100000000000000000000000000000000'
In [33]:
len(hex(2 ** 128)) - 4
Out[33]:
31

Or 256 bits.

In [34]:
2 ** 256
Out[34]:
115792089237316195423570985008687907853269984665640564039457584007913129639936
In [35]:
hex(2 ** 256)
Out[35]:
'0x10000000000000000000000000000000000000000000000000000000000000000'
In [36]:
len(hex(2 ** 256)) - 4
Out[36]:
63

Or even 384 bits.

In [37]:
2 ** 384
Out[37]:
39402006196394479212279040100143613805079739270465446667948293404245721771497210611414266254884915640806627990306816
In [38]:
hex(2 ** 384)
Out[38]:
'0x1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000'
In [39]:
len(hex(2 ** 384)) - 4
Out[39]:
95

The universe is assumed to consist of $10^{80}$ atoms.

In [40]:
10 ** 80
Out[40]:
100000000000000000000000000000000000000000000000000000000000000000000000000000000
In [41]:
hex(10 ** 80)
Out[41]:
'0x35f9dea3e1f6bdfef70cdd17b25efa418ca63a22764cec100000000000000000000'
In [42]:
len(hex(10 ** 80)) - 3  # close to 2 ** 256
Out[42]:
66

We require the following properties from a hash function:

  • collision resistance: it is highly unlikely to find to inputs that yield the same output
  • hiding: it is really difficult to find the exact input if you know an output
  • puzzle friendliness: if someone targets a certain output value and if parts of the input are randomly chosen then it is difficult to find another input value that hits exactly that target

MD5 Hash Codes

First, importing Python's hashing function library.

In [43]:
import hashlib

The first MD5 hash codes (cf. https://en.wikipedia.org/wiki/MD5).

In [44]:
md5_1 = hashlib.md5(b'yves').hexdigest()
md5_1
Out[44]:
'afe3bd960b4c46a68580c4e564cca24e'
In [45]:
md5_2 = hashlib.md5(b'yves2').hexdigest()
md5_2
Out[45]:
'664d839e06bada38ce04f7208896efdf'
In [46]:
hashlib.md5(b'Dr. Yves Johannes Hilpisch').hexdigest()
Out[46]:
'642394f6d25fd9fe4b88e359c8aaf051'

Brute Force Hash (Password) Cracking

We define first a character set of lower case letters only.

In [47]:
import string
charset = string.ascii_lowercase
charset
Out[47]:
'abcdefghijklmnopqrstuvwxyz'

Hash codes for all single characters in charset.

In [48]:
for c in charset:
    md5 = hashlib.md5(c.encode('ascii'))
    print(c, md5.hexdigest())
a 0cc175b9c0f1b6a831c399e269772661
b 92eb5ffee6ae2fec3ad71c777531578f
c 4a8a08f09d37b73795649038408b5f33
d 8277e0910d750195b448797616e091ad
e e1671797c52e15f763380b45e841ec32
f 8fa14cdd754f91cc6554c9e71929cce7
g b2f5ff47436671b6e533d8dc3614845d
h 2510c39011c5be704182423e3a695e91
i 865c0c0b4ab0e063e5caa3387c1a8741
j 363b122c528f54df4a0446b6bab05515
k 8ce4b16b22b58894aa86c421e8759df3
l 2db95e8e1a9267b7a1188556b2013b33
m 6f8f57715090da2632453988d9a1501b
n 7b8b965ad4bca0e41ab51de7b31363a1
o d95679752134a2d9eb61dbd7b91c4bcc
p 83878c91171338902e0fe0fb97a8c47a
q 7694f4a66316e53c8cdd9d9954bd611d
r 4b43b0aee35624cd95b910189b3dc231
s 03c7c0ace395d80182db07ae2c30f034
t e358efa489f58062f10dd7316b65649e
u 7b774effe4a349c6dd82ad4f4f21d34c
v 9e3669d19b675bd57058fd4664205d2a
w f1290186a5d0b1ceab27f4e77c0c5d68
x 9dd4e461268c8034f5c8564e155c67a6
y 415290769594460e2e485922904f345d
z fbade9e36a3f36d3d676c1b808451dd7

Now doing brute force hash code cracking — 'knowing' that the relevant word has a maximum of 4 characters.

In [49]:
import itertools as it
In [50]:
%%time
for i in range(1, 5):
    print('%d CHARACTERS USED NOW' % i)
    pm = it.product(charset, repeat=i)
    for comb in pm:
        comb = ''.join(comb)
        md5 = hashlib.md5(comb.encode('ascii')).hexdigest()
        if md5 == md5_1:
            print('SUCCESS')
            print(comb, md5)
            break
1 CHARACTERS USED NOW
2 CHARACTERS USED NOW
3 CHARACTERS USED NOW
4 CHARACTERS USED NOW
SUCCESS
yves afe3bd960b4c46a68580c4e564cca24e
CPU times: user 549 ms, sys: 2.65 ms, total: 552 ms
Wall time: 553 ms

Let us enlarge the character set to include digits as well.

In [51]:
charset2 = string.ascii_lowercase + string.digits
charset2
Out[51]:
'abcdefghijklmnopqrstuvwxyz0123456789'

Time to crack the second hash code increases due to greater passworword length and larger character set.

In [52]:
from itertools import product
In [53]:
import time
t0 = time.time()
z = 0
for i in range(1, 6):
    print('%d CHARACTERS USED NOW' % i)
    pm = it.product(charset2, repeat=i)
    for comb in pm:
        comb = ''.join(comb)
        md5 = hashlib.md5(comb.encode('ascii')).hexdigest()
        z += 1
        if md5 == md5_2:
            print('SUCCESS')
            print(comb, md5)
            break
sec = time.time() - t0
print('time in sec: %.1f' % sec)
1 CHARACTERS USED NOW
2 CHARACTERS USED NOW
3 CHARACTERS USED NOW
4 CHARACTERS USED NOW
5 CHARACTERS USED NOW
SUCCESS
yves2 664d839e06bada38ce04f7208896efdf
time in sec: 52.4

The algorithm has checked about 43 mn hashes before being successful. This represents a speed of about 850,000 hashes per second.

In [54]:
z
Out[54]:
43024025
In [55]:
z / sec
Out[55]:
821666.3604596546

Using Dedicated Password Cracking Tools ("Better Software")

Dedicated password cracking tools like Hashcat (cf. http://hashcat.net) allow for a much faster and more intelligent/targeted approach.

Let us check how long Hashcat needs to find the password yves2 stored as an MD5 hash. We assume:

  • password length is 5 letters
  • only lower case letters or digits
In [56]:
r1 = '''
664d839e06bada38ce04f7208896efdf:yves2

Session.Name...: hashcat
Status.........: Cracked
Input.Mode.....: Mask (?1?1?1?1?1) [5]
Custom.Chars...: -1 ?l?d, -2 Undefined, -3 Undefined, -4 Undefined
Hash.Target....: 664d839e06bada38ce04f7208896efdf
Hash.Type......: MD5
Time.Started...: Mon Aug 22 19:34:16 2016 (2 secs)
Speed.Dev.#1...: 13710.9 kH/s (0.94ms)
Recovered......: 1/1 (100.00%) Digests, 1/1 (100.00%) Salts
Progress.......: 28803600/60466176 (47.64%)
Rejected.......: 0/28803600 (0.00%)
Restore.Point..: 799470/1679616 (47.60%)

Started: Mon Aug 22 19:34:16 2016
Stopped: Mon Aug 22 19:34:22 2016

real    0m6.055s
user    0m1.750s
sys     0m1.490s
'''

Make Use of Human Traits ("Social Cracking")

Mask attacks are some of the most powerful tools (strategies) in password cracking. It relies on the fact that human beings like to use certain (easy to remember) structures for their passwords. An interesting analysis is found here: https://www.praetorian.com/blog/statistics-will-crack-your-password-mask-structure. The major finding is:

  • 50% of all passwords analyzed rely on
  • 13 password masks only out of
  • 260,000+ masks analyzed in total

An example: lisa2008 = name of daughter born in 2008

Let us consider the following case. A password is assumed to consist of upper case letters, lower case letters and digits. In this case, we make use of insights about "humanly generated passwords". I.e. we do a so-called mask attack where we implement the following rules:

  • password length is 9 letters
  • first letter is upper case (like A)
  • next four letters are lower case (like bbbb)
  • last four letters are digits (like 1992)

We assume a structure like Abbbb1992.

In [57]:
r2 = '''
8769d3723ec8f853d91f28208e97acce:Quant4711

Session.Name...: hashcat
Status.........: Cracked
Input.Mode.....: Mask (?u?l?l?l?l?d?d?d?d) [9]
Hash.Target....: 8769d3723ec8f853d91f28208e97acce
Hash.Type......: MD5
Time.Started...: Mon Aug 22 17:39:32 2016 (10 mins, 23 secs)
Speed.Dev.#1...: 42944.7 kH/s (13.43ms)
Recovered......: 1/1 (100.00%) Digests, 1/1 (100.00%) Salts
Progress.......: 27138885360/118813760000 (22.84%)
Rejected.......: 0/27138885360 (0.00%)
Restore.Point..: 1543910/6760000 (22.84%)

Started: Mon Aug 22 17:39:32 2016
Stopped: Mon Aug 22 17:50:03 2016

real    10m30.744s
user    12m24.880s
sys     0m35.770s
'''

Using GPUs and ASICs ("Better Hardware")

Cheapest Nvidia GPU (about 30 EUR net of VAT) reaches a MD5 hashing speed of about 400MH/s.

High-end Nvidia GPU cluster Brutalis reaches a MD5 hashing speed of about 200GH/s (cf. https://gist.github.com/epixoip/a83d38f412b4737e99bbef804a270c40).

Dedicated Application Specific Integrated Circuit (ASIC) chips reach even higher speed at much lower cost. AntMiner S5 achieves a speed of 1,155 GH/s for Bitcoin mining (SHA256). Used at Amazon for about 200 EUR.

Basics of RSA Encryption

On the Wikipedia web site https://en.wikipedia.org/wiki/RSA_(cryptosystem) you find:

RSA is one of the first practical public-key cryptosystems and is widely used for secure data transmission. In such a cryptosystem, the encryption key is public and differs from the decryption key which is kept secret. In RSA, this asymmetry is based on the practical difficulty of factoring the product of two large prime numbers, the factoring problem. RSA is made of the initial letters of the surnames of Ron Rivest, Adi Shamir, and Leonard Adleman, who first publicly described the algorithm in 1977.

Introductory Example

In the respective Wikipedia article, you find the following example. Consider two (small) prime numbers.

In [58]:
p = 61
q = 53

Compute the product of these numbers as:

In [59]:
n = p * q
n
Out[59]:
3233

Compute the so-called totient of the product as:

In [60]:
t = (p - 1) * (q - 1)
t
Out[60]:
3120

Choose any number $1 < e < t$ which is coprime to $t$ like:

In [61]:
e = 17

Compute the modular multiplicative inverse $d$ (cf. https://en.wikipedia.org/wiki/Modular_multiplicative_inverse) for which we have

$$d \cdot e \mod t = 1$$

as

In [62]:
d = 2753

Let us check:

In [63]:
d * e % t
Out[63]:
1

The public key the is $n=3233, e=17$. The encryption function for a $m$ is

$$c(m) = m^{17} \mod 3233$$
In [64]:
m = 77  # our message
c = m ** e % n  # encryption
c
Out[64]:
3123

The private key is $d = 2753$. The decryption function for a ciphertext message $c$ is

$$m(c) = c^{2753} \mod 3233$$
In [65]:
m = c ** d % n
m
Out[65]:
77

Working with "Real" RSA Key Pairs

The library PyCryptodome provides a multitude of security algorithms and techniques for easy use with Python. Cf. https://pycryptodome.readthedocs.io/en/latest/ (pip install pycryptodomex).

In [66]:
from Cryptodome.PublicKey import RSA
In [67]:
secret_code = 'PythonQuants'
key = RSA.generate(2048)

The public key.

In [68]:
pub_key = key.publickey().exportKey()
print(pub_key)
b'-----BEGIN RSA PUBLIC KEY-----\nMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAn4h9j236X4ydU324cbZ8\nqKQ9wH6rln0SCSl9CmR/yepvyVzA5SeX/MCtyw/R5XGEGQo/1sz25ntJwD5e+ExO\nZ0KlIpVNMLB9JGbm9J3Q0y9aEs7QDYzoRQU7oeiCQRA1SfK8viK9B4lVXusyQghS\n0ViTjjK9ND4H1LGXvPsH9OV+YX1dNtpR1hrtka36ByJTh6YEbL/7GbULG2HUaNos\nmcXKZLWhoYjilV+nMe9Qi2vRjbS30pOqTQ3joYYwibYTzqlBZQoUqT4yjUMBH8gl\noO86Ow2RdLFurfcmDhKXJxqUOZxSfKcS0OyxmfMejKDlexY5GY6d7yWZsolaopIq\njQIDAQAB\n-----END RSA PUBLIC KEY-----'

The private key.

In [69]:
priv_key = key.exportKey()
print(priv_key)
b'-----BEGIN RSA PRIVATE KEY-----\nMIIEowIBAAKCAQEAn4h9j236X4ydU324cbZ8qKQ9wH6rln0SCSl9CmR/yepvyVzA\n5SeX/MCtyw/R5XGEGQo/1sz25ntJwD5e+ExOZ0KlIpVNMLB9JGbm9J3Q0y9aEs7Q\nDYzoRQU7oeiCQRA1SfK8viK9B4lVXusyQghS0ViTjjK9ND4H1LGXvPsH9OV+YX1d\nNtpR1hrtka36ByJTh6YEbL/7GbULG2HUaNosmcXKZLWhoYjilV+nMe9Qi2vRjbS3\n0pOqTQ3joYYwibYTzqlBZQoUqT4yjUMBH8gloO86Ow2RdLFurfcmDhKXJxqUOZxS\nfKcS0OyxmfMejKDlexY5GY6d7yWZsolaopIqjQIDAQABAoIBACON0fvb9QuD5VEk\n6O0Q0D3uYqPKpzeR9qnLkQY2qSMpKBIupX+8n/sQvvrNMwhpUNkax2j6A4GQ0vow\n/E3mhaeuOurW7t3Ny4oIE+tK264hNLgR1tzaPS6asLRQAu14rQOmwSk3yTyLfxb5\nIFmGjNyCmtn7VV71/x1IGgrgQGDzxUFIMhIZTTqbK9y2Qug3OVQPSysUPlGlko31\nSuEhcfbMrkxgwnW8n3spqEwfwzgotlAzUtBsoBG3Nbm9KMz/2Mg1XrZvSZ3TVBXx\nknsE9gK/GT+XozV6CPwLnTrru7K1fItoGB8/qiT2Ag9NQ7JeBP2XBAXxxZT0Lc11\n/ZWbSckCgYEAvbKmJgLUhHWfSGJW8nwnFe+xCOoYpeuDefaPwr2AqUI/kqiPhQhh\nh8Nyr1r0kKk9jRUFzWeFhvNoxgVjGiPq6M8d7MDaFztt+KONfzjQ3l42kO3JSVDI\nFDBY4kYkWjLKoyElJxzWTlvLpB0hR7X3s3EGjD0UsSZkHZ5O2HhvYxkCgYEA10rV\nvLZsiHTkFR3YcwQ/FXN0QfWZeYr3vr/bZFw9GpyZ9MeLhfxhAvNQFOHMFNOn+TOn\nX9Ic2RgPwMgX4Mh4UM7gfjTczwTaEuIuwRCUzOqH1Ir16WloVduRQ28nv2O0Pe/z\n986lBoHTM0MoZeT2vSpBJOk9YXJWSG/HCZr+BZUCgYBwmXtf8qwc2AitGZpMov88\nWSN3jCUHT62iFoWs7XlSfr1nm2BdceUahhwsFEw5FgwwEzt3eixbN16ItCfoG0vT\n1yUHJE4QHjmfrc1Op/XrGHdBPbQOLcIlobEQ1tu7Ioz/mawI9GgM6RYOmy/m9g9K\n1xsw4QzIPmaM4iwg+BP9+QKBgQDUd5kp7Qp9PIafRF4JXEZ5gABa2+uDpJ7M1ADL\nMbBn4+aYSJmsOB7xT3hXs2K5pwwdhlmG2g/4O6PISHAwOQdsEE5Cnx7O+2XPvwIP\nLLL86/Py0NTRbmI2YUMhvsAGRHprqbQmefwoyNTKwVtL+8N5egriR4B4++KlCBLw\nqtJIZQKBgGmQYvD1n5KQ0v38Xxmwwzf8EwgRS9HGi5ml0E2mzc9qHZrQ+QmhqQCQ\nYrWDVqi9Xea1ov8s+jM5r2qOKZhFO4db4h+xo5XXsxElZW06s0sQ494pK0Iopafc\n4ycpALVfzriKRa6n8LdV8HKSkBkaBrGgu2fLgKSeoyzrHYe0MUdq\n-----END RSA PRIVATE KEY-----'

Let us now encrypt and decrypt a message with these keys. First, the encryption step using the public key — assuming that someone sends a message to The Python Quants who own the private key.

In [70]:
from Cryptodome.Cipher import PKCS1_OAEP
In [71]:
m = b'Hello TO The Python Quants. ' * 5
In [72]:
cipher = PKCS1_OAEP.new(key.publickey())  # public key
In [73]:
ciphertext = cipher.encrypt(m)
ciphertext
Out[73]:
b'\x9af\xa8e\x98\x18\x03+\xd2\xfc\xae\xbd\x92\xc0\xa9\xdc\xe2\xd6"}\xf9\x1f\x07b\x90QGs\x03,\\\x88\xe8\x9f\x12\x9dh\x03HZ\xc2D\xd8\x86\x9d\x03\xa1\xcde\xb7g\'\x81\x9d\x96\xfb@\x12\xce~b\xb6^\xcb\xe5\xca\xa6\xf8\x9eI\xda\xd7\x89\x9f\xc6:\xaaX\xcc\xffgt\x1a)\xc6\x98j\xa5\xfb\xe4\x81\xf1\xd7\xfb\xdbn\x8c\xfd\x89\xb3\t\x1f\x02@r\xba\xa6<6\xa2w\xcd\xb8\x0cUz\xe3X\xbc\x0e\xaa5b\x1f\xd9\x84\xc3\x9f\x9e\xba+<@\xd5\x1a\xf9\x86\x12\x0c\xd1.\x8c\xcb\x1c1\x12\xe8\x05\xc2YY\\\xa8[\x94\xa91\x1e\x11\x01\x7f\xad\x02+\xa8f\\g\xb0N>s,\x16\x1e\xc7\x95\x89\x08=\xa6r\xd6`\x98\x9f\r|\xef\x81\x1cj\xcb\x9f\xb9\xcc\x7f}@3\xc5%tsN\xf6\xb5\xaa(\x0eH\xa440\xc3\xf5\xf6g\xca]\xc1YT\x19\xcc\x91\xa3\xe5\xa4\xe96\x16\x9bK\x9b\x11\x94#+je\x98y\x93\x03B\xe3\x9f\xd0\x1fz\xf0j\xe0AR'

Second, the decryption step using the private key.

In [74]:
cipher = PKCS1_OAEP.new(key)  # private key
In [75]:
message = cipher.decrypt(ciphertext)
message
Out[75]:
b'Hello TO The Python Quants. Hello TO The Python Quants. Hello TO The Python Quants. Hello TO The Python Quants. Hello TO The Python Quants. '

Using openssl

Beyond Python, it is easy to generate RSA key pairs by the use of openssl (cf. https://www.openssl.org/).

In [76]:
# !openssl req -x509 -nodes -days 365 -newkey rsa:1024 -keyout cert.key -out cert.pem -b

Signing Messages with Hashes and RSA Keys

Let us electronically sign a message The Python Quants send to someone else. To this end, we combine hashing with RSA encryption.

In [77]:
m = b'Hello FROM The Python Quants.' * 5

We generate a hash code for the message.

In [78]:
from Cryptodome.Hash import SHA256
In [79]:
h = SHA256.new(m)
In [80]:
hd = h.hexdigest()
hd
Out[80]:
'6d7e450cf23b6f867b03ac30e613c618fb5d91cdd56cdd9f79859dc12e6a4083'

We next sign the message, i.e. we encrypt the hash code with the private key from before.

In [81]:
from Cryptodome.Signature import PKCS1_v1_5
In [82]:
signer = PKCS1_v1_5.new(key)
In [83]:
signature = signer.sign(h)  # private key
signature
Out[83]:
b'\x85\x9f\xfe;\xcc\xbb\xfd\x96\xa5b\xa9r\xa1\xcaXe\x9a\xa2a\x14\xaa\xd3\xfe\x02P\x91\xb6\xb8\xfe\x12U\x88a52\xafWpfM\x91U\xd3\xb6o\x95\x7fA\x93\xb1\x82\x815qm\x95\xef\xce\x03\\\x01\x95\x1a\x8f\xda\x1f\xfa\x0fM\x98\x8f\xd7\xf1\x1d.\xfblP\xb8\xa5\xf2\x85\x08\x1eJu\xf0\x94e\xb8\\\x80\x13~]\x19)l\xc9\xcc\x82\xf7\x04i\xde\x1dY\xa7\xd6\x1d\xe0\r\xcca\xc8{\t\x03\xa8}\x98N*\x88\x1aW@\x93\xcb\xc1\xfe\xd1\x02\xf5\x91\xa1\xcc\xe9\xa9Mb\x87\xaaip\x0b\x8a6A\xe2\x16\xb8\xe9\xf9\x9e)\xf1\xb3\x08\xf3S\xecLRV$\x0e\x8f8\xb3\xba\xc4wp\x81F\xcc\xc0+\x90\xffR\xd5\xa8\xcd(\x17#q\xbc\xfb\t\x1dB\xd9?\x90\x94\xab\x88\xdc\xfd<q\x95\xb0!\x1d\xe5\xdf\xb2\xe7\x0f\x99\x9d\xc5\x07"\xc1\xcd\xe5V\xb57\x95]\x1c=\xf8o\x99\x12\xddA\xc9r\x02\xa7qhL\x1d\xfd\xf0\x8a\xf2\xe9\xc6\x01\xde\xcb\xceM\xf57\n'

Someone else — i.e. the receiver of our message — who knows our public key can now verify that we have signed the message as follows.

In [84]:
hashcode = SHA256.new(m)
hashcode.hexdigest()
Out[84]:
'6d7e450cf23b6f867b03ac30e613c618fb5d91cdd56cdd9f79859dc12e6a4083'
In [85]:
PKCS1_v1_5.new(key.publickey()).verify(hashcode, signature)  # public key
Out[85]:
True

Basic Idea Behind Block Chains

Very Simple Example

Let us now illustrate the basic idea behind a block chain based on a very simple example first. Recall the properties of hash functions that we require (collision resistence, hiding, puzzle friendliness). Let us focus on collision resistence and hiding for the moment.

As a starter, it is obviously easy to calculate the hash value of a string ("first block"), for instance, as follows:

In [86]:
import hashlib
In [87]:
b1 = 'Jil, 2004'  # our first dog
h1 = hashlib.md5(b1.encode('ascii')).hexdigest()  # hash for first block
h1
Out[87]:
'db29bc3f87a84f227d3b4bc7b19a3c6a'

It is highly unlikely that another input yields the same output. It is also really difficult ("almost impossible") to find the input given a certain ouput.

A block chain can be used to document events over time (e.g. transactions, new dogs). To this end, we take the hash code from the first block, add the second block information and calculate a new hash value:

In [88]:
b2 = h1 + ', Liz, 2009'
h2 = hashlib.md5(b2.encode('ascii')).hexdigest()
h2
Out[88]:
'db71b4766173bd93b965af8888262b51'

A third block is as easily added:

In [89]:
b3 = h2 + ', Phineas, 2011'
h3 = hashlib.md5(b3.encode('ascii')).hexdigest()
h3
Out[89]:
'7d00c38e077282a822ee91c69fccd547'

Our block chain now is:

In [90]:
print(b1)
Jil, 2004
In [91]:
print(b2)
db29bc3f87a84f227d3b4bc7b19a3c6a, Liz, 2009
In [92]:
print(b3)
db71b4766173bd93b965af8888262b51, Phineas, 2011

There is one major problem with this approach: the block chain is really easy to manipulate since you only need to re-calculate the whole chain. You have all the information needed.

We need add therefore one security measure to avoid manipulation (in theory): signing of the last hash value.

In [93]:
h3
Out[93]:
'7d00c38e077282a822ee91c69fccd547'
In [94]:
# key generation
from Cryptodome.PublicKey import RSA
key = RSA.generate(2048)
In [95]:
# signing
from Cryptodome.Hash import MD5
from Cryptodome.Signature import PKCS1_v1_5
md5 = MD5.new(b3.encode('ascii'))
signature = PKCS1_v1_5.new(key).sign(md5)  # private key
signature
Out[95]:
b"p\x8a\xdf\xe8$\xe7\r\xec3x\x04\xf1\x80]Nu\x19\x01\t\x1aP\xfc_\xa1N\xc9\x95\x12Yh\xceP\x17\x127\x96\xb8?\xda\x86|\n\xcd!y\x98f\xf49v;\xc3\xa1\xe04\xb0\x8d\x8f\x15\x85\x0b\x84+;'w}\xdf\xd4O-\x12\x9e%\x1e\x9fp\xf4-\xb0dG\x8f\xf8\x94N3]\x03rd\xf0\xd4R\x11\x0f\x91+\xa6Jg\x01\x19\x0e\x181z7\xb8\x9f\xf4\xba\xb5\x89\xd1\x0e\x98,\xed0\x1a\xb3\xe7\xdff\x10)=\x17\x17u\xd1\xdd\xb1s3\xa3\xc8\xad\xff<R+]/'\xde\xde\xbb\xd1\xa3\xf5\xc6A\xa1\x90zQ0r\x9b]\xc0TL\xf1Fp\xc1U$p!{\x15\x0f68\x9f\x00AW\xde\xf4\xa5`/'\x12C\xe7R\xbe\x02\x08\x9b:4\xa6@<\xe0\x8b\xb8t0\x98z\xf4\xacO\xdc$BO\n\x83\xe6\x19\x9d@\xdd25\xcb\xea`\x0c\x99\x99\xd5^f\x02\xff\x89pW\xb1\t\xfdz8\x1a\xbc\x10k&\x15\xd8\x07\x9c>OjU"

If we can make sure that the private key is safe, then the block chain plus the signature for the final hash value are "almost impossible" to manipulate — although all the information is publicly available.

In [96]:
b1, b2, b3, h3, signature
Out[96]:
('Jil, 2004',
 'db29bc3f87a84f227d3b4bc7b19a3c6a, Liz, 2009',
 'db71b4766173bd93b965af8888262b51, Phineas, 2011',
 '7d00c38e077282a822ee91c69fccd547',
 b"p\x8a\xdf\xe8$\xe7\r\xec3x\x04\xf1\x80]Nu\x19\x01\t\x1aP\xfc_\xa1N\xc9\x95\x12Yh\xceP\x17\x127\x96\xb8?\xda\x86|\n\xcd!y\x98f\xf49v;\xc3\xa1\xe04\xb0\x8d\x8f\x15\x85\x0b\x84+;'w}\xdf\xd4O-\x12\x9e%\x1e\x9fp\xf4-\xb0dG\x8f\xf8\x94N3]\x03rd\xf0\xd4R\x11\x0f\x91+\xa6Jg\x01\x19\x0e\x181z7\xb8\x9f\xf4\xba\xb5\x89\xd1\x0e\x98,\xed0\x1a\xb3\xe7\xdff\x10)=\x17\x17u\xd1\xdd\xb1s3\xa3\xc8\xad\xff<R+]/'\xde\xde\xbb\xd1\xa3\xf5\xc6A\xa1\x90zQ0r\x9b]\xc0TL\xf1Fp\xc1U$p!{\x15\x0f68\x9f\x00AW\xde\xf4\xa5`/'\x12C\xe7R\xbe\x02\x08\x9b:4\xa6@<\xe0\x8b\xb8t0\x98z\xf4\xacO\xdc$BO\n\x83\xe6\x19\x9d@\xdd25\xcb\xea`\x0c\x99\x99\xd5^f\x02\xff\x89pW\xb1\t\xfdz8\x1a\xbc\x10k&\x15\xd8\x07\x9c>OjU")

Another idea is to make it hard to construct another block chain with the same inputs (but in different sequence) or other inputs that satisfies a certain property. Let us define that only hash values with five zeros at the end are allowed. To this end, we must allow for an additional input parameter.

In [97]:
%%time
n = 0
while True:
    b = str(n) + ', ' + b1
    md5 = hashlib.md5(b.encode('ascii')).hexdigest()
    if md5[-5:] == '00000':
        print(b + ' --> ' + md5)
        break
    n += 1
1300915, Jil, 2004 --> 34b87ffb902576b7f2fdeea557500000
CPU times: user 1.86 s, sys: 2.31 ms, total: 1.86 s
Wall time: 1.86 s

Someone wanting to manipulate the block chain must put in much more effort in this case than without such a requirement. The difficulty can easily be increased by requiring eg more trailing zeros.

In [98]:
%%time
n = 0
while True:
    b = str(n) + ', ' + b1
    md5 = hashlib.md5(b.encode('ascii')).hexdigest()
    if md5[-6:] == '000000':
        print(b + ' --> ' + md5)
        break
    n += 1
15086119, Jil, 2004 --> ec1ddd68e721e015e44b58adf7000000
CPU times: user 22.7 s, sys: 33 ms, total: 22.7 s
Wall time: 22.8 s

The first security measure "signing" is vulnerable to stealing the private key (especially when multiple versions exist, e.g. due to backups). The second one "targeting" to sheer brute force. A combination of both, of course, works as well — and is probably more secure.

Another approach that adds some security is to use a random, publicly known, fixed initial hash.

In [99]:
import os
h0 = os.urandom(16).hex()
h0
Out[99]:
'c87664f669182a2b05d92fa3db16ba87'
In [100]:
b1 = h0 + ', Jil, 2004'  # our first dog
h1 = hashlib.md5(b1.encode('ascii')).hexdigest()  # hash for first block
h1
Out[100]:
'725b4f903914b4a2e1bf5a24e70ccc4e'

Simple Cryptocurrency based on a Block Chain

Let us now implement a simple cryptocurrency based on the block chain idea. The example is from the book Narayanan et al. (2016): Bitcoin and Cryptocurrency Technologies.

Consider a central authority, The Python Quants (TPQ), that issues a cryptocurrency calles TPQ Coins. Two basic transactions are possible CreateCoins and PayCoins in the system.

There are many participants in the system that are identified by their public key of a RSA key pair.

Consider an original transaction by which TPQ create 10 TPQ Coins. We have transaction id of 0. The coins created have a coin id of 0(0). The value is 10 and the recipient are TPQ represented by their public key. We also generate an initial hash value to be included in the first block. We define the initial block as follows:

In [101]:
# h = h0 = os.urandom(16).hex()
h = hashlib.md5('my secret passphrase'.encode('ascii')).hexdigest()
# prev. hash, transID, transaction, value, coinID, recipient
b0 = h + '\n0, CreateCoins, 10, 0a, 0xTPQ'
print(b0)
f83e9190098974434be56e5738d89f5f
0, CreateCoins, 10, 0a, 0xTPQ

This initial block gets hashed and the hash value gets signed by TPQ.

TPQ now pays another participant in the system 5 TPQ Coins. This is a PayCoins transaction with id 1.

In [102]:
h = hashlib.md5(b0.encode('ascii')).hexdigest()
b1 = h + '\n1, Consumed Coins: 0a\n'
# prev. hash, transID, transaction, value, transID, recipient
b1 += '1, PayCoins, 5, 1a, 0xRE1\n'
b1 += '1, PayCoins, 5, 1b, 0xTP2'  # another address of TPQ
print(b1)
784170c0b8ec70fea0cf81ce26435dfe
1, Consumed Coins: 0a
1, PayCoins, 5, 1a, 0xRE1
1, PayCoins, 5, 1b, 0xTP2

This second block gets hashed and the hash value gets signed by TPQ.

Now, the participant with address (public key) of 0xRE1 is able to spend TPQ Coins to e.g. buy goods or services from someone identified by 0xSEL.

In [103]:
h = hashlib.md5(b1.encode('ascii')).hexdigest()
b2 = h + '\n2, Consumed Coins: 1a\n'
# prev. hash, transID, transaction, value, coinID, recipient
b2 += '2, PayCoins, 2.5, 2a, 0xSEL\n'  # payment to seller
b2 += '2, PayCoins, 2.5, 2b, 0xRE2'  # another address of 0xRE1
print(b2)
d95f7bf5739fab813afd8b3bda668113
2, Consumed Coins: 1a
2, PayCoins, 2.5, 2a, 0xSEL
2, PayCoins, 2.5, 2b, 0xRE2

This block gets hashed and signed by the payer 0xRE1. It then gets added to the block chain by TPQ, signed by TPQ and published. Everybody now can recover the complete history of the TPQ Coins.

The block chain respresenting the history of the TPQ Coins now looks as follows.

In [104]:
print(b0)
f83e9190098974434be56e5738d89f5f
0, CreateCoins, 10, 0a, 0xTPQ
In [105]:
print(b1)
784170c0b8ec70fea0cf81ce26435dfe
1, Consumed Coins: 0a
1, PayCoins, 5, 1a, 0xRE1
1, PayCoins, 5, 1b, 0xTP2
In [106]:
print(b2)
d95f7bf5739fab813afd8b3bda668113
2, Consumed Coins: 1a
2, PayCoins, 2.5, 2a, 0xSEL
2, PayCoins, 2.5, 2b, 0xRE2

Bitcoin Addresses

Bitcoin addresses are based on Elliptic Curve Digital Signature Algorithm (ECDSA), cf. https://en.wikipedia.org/wiki/Elliptic_Curve_Digital_Signature_Algorithm. The starting point, however, is a SHA256 hash code. Using respective libraries you can generate it randomly or via a passphrase.

In what follows, we are going to use a Python 3 implementation for the generation of a Bitcoin address. See:

In [107]:
# %load bckeys.py
#
# Bitcoin Address Generation
# with Python 3.5
#
# Source: https://github.com/Shultzi/Bitpy
# Blog Post: http://zeltsinger.com/2016/07/18/keys/
#
import os
import ecdsa  # pip install ecdsa
import hashlib
import base58  # pip install base58
import binascii


class Key(object):

    def __init__(self, private_key=0):
        if private_key == 0:
            self.private_key = os.urandom(32)
            self.printable_pk = str(
                binascii.hexlify(self.private_key), "ascii")
        else:
            self.printable_pk = private_key
            self.private_key = binascii.unhexlify(private_key.encode('ascii'))

        self.sk = ecdsa.SigningKey.from_string(
            self.private_key, curve=ecdsa.SECP256k1)
        self.vk = self.sk.verifying_key
        self.public_key = b"04" + binascii.hexlify(self.vk.to_string())
        ripemd160 = hashlib.new('ripemd160')
        ripemd160.update(hashlib.sha256(
            binascii.unhexlify(self.public_key)).digest())
        self.hashed_public_key = b"00" + binascii.hexlify(ripemd160.digest())
        self.checksum = binascii.hexlify(hashlib.sha256(hashlib.sha256(
            binascii.unhexlify(self.hashed_public_key)).digest()).digest()[:4])
        self.binary_addr = binascii.unhexlify(
            self.hashed_public_key + self.checksum)
        self.addr = base58.b58encode(self.binary_addr)

    def get_addr(self):
        return self.addr

An address can be generated based on a random private key.

In [108]:
key = Key()
In [109]:
key.printable_pk
Out[109]:
'52f7c924c13b6b8f156d76b330ec8a76e9815b1576fc9b81dcca834ea7d0ea67'

Having generated the private key, the public key is generated then via ECDSA, i.e. via multiplying the private key with the respective generator point G (x,y) of the elliptic curve. The idea is that the multiplication is easy and straightforward when the factorization is an almost impossible to solve problem.

In [110]:
key.public_key
Out[110]:
b'04a832b0f0b8088b2fef4d78e89d0857ac65413e9c9cf5b7b2bd8cf476f58ffd82e1689cf31853e59f57335fc957145097e354ee2f120fcf5e69561db6a8663bc8'

Bitcoin addresses are represented in the so-called Wallet Import Format (WIF) (https://en.bitcoin.it/wiki/Wallet_import_format) which is based on the Base58Check encoding format (https://en.bitcoin.it/wiki/Base58Check_encoding).

In [111]:
key.addr
Out[111]:
'1Ae9Uj2cviKLivyFqJZPm4cnMzngD8emPh'

Of course, you can generate a private key based on your own passphrase.

In [112]:
pk = hashlib.sha256('Bitcoin is really cool.'.encode('ascii')).hexdigest()
pk
Out[112]:
'3098b5f8ae81c805cc4ab9e43c3ae08e37a2e0a2901abb9804eb55613a60d25b'
In [113]:
key = Key(pk)
In [114]:
key.printable_pk
Out[114]:
'3098b5f8ae81c805cc4ab9e43c3ae08e37a2e0a2901abb9804eb55613a60d25b'
In [115]:
key.public_key
Out[115]:
b'045bbb1c8c7ca479c705885d8a40ac14899a8bff51437d59eb2229fc1dfee86843267ec1788740d11783360813322c28c2ac78080e30bbca747b8b95c3010f4aa2'
In [116]:
key.addr
Out[116]:
'19QiM7YmzSV6fARZnVvaV5u9Y8gqw56g25'

Bitcoin Transactions

Bitcoin addresses are the basis for bitcoin transations.

In [117]:
o1 = '''
$ bitcoin-cli -regtest getnewaddress
mvbnrCX3bg1cDRUu8pkecrvP6vQkSLDSou

$ NEW_ADDRESS=mvbnrCX3bg1cDRUu8pkecrvP6vQkSLDSou
'''

Here, Bitcoins are sent to the previously generated address.

In [118]:
o2 = '''
$ bitcoin-cli -regtest sendtoaddress $NEW_ADDRESS 10.00
263c018582731ff54dc72c7d67e858c002ae298835501d80200f05753de0edf0
'''

10 Bitcoins belong now to the generated address.

In [119]:
o3 = '''
$ bitcoin-cli -regtest listunspent 0
[
    {
        "txid" : "263c018582731ff54dc72c7d67e858c002ae298835501d\
                  80200f05753de0edf0",
        "vout" : 0,
        "address" : "muhtvdmsnbQEPFuEmxcChX58fGvXaaUoVt",
        "scriptPubKey" : "76a9149ba386253ea698158b6d34802bb9b550\
                          f5ce36dd88ac",
        "amount" : 40.00000000,
        "confirmations" : 0
    },
    {
        "txid" : "263c018582731ff54dc72c7d67e858c002ae298835501d\
                  80200f05753de0edf0",
        "vout" : 1,
        "address" : "mvbnrCX3bg1cDRUu8pkecrvP6vQkSLDSou",
        "account" : "",
        "scriptPubKey" : "76a914a57414e5ffae9ef5074bacbe10a320bb\
                          2614e1f388ac",
        "amount" : 10.00000000,
        "confirmations" : 0
    }
]
'''

Basic Idea Behind Mining

Bitcoin mining is based on SHA256 hash codes (cf. https://en.wikipedia.org/wiki/SHA-2)

In [120]:
sha256 = hashlib.sha256('yves'.encode('ascii'))
sha256.hexdigest()
Out[120]:
'9195c7b9bb56d375948ba058cddea1982b49864e12c700de58d69c5c72dbd075'

The idea behind mining is to find a hash code that is 'small enough', i.e. lies below a certain target level (mainly defined by 'leading zeros' in the target hex value).

In [121]:
target = '%064x' % (1000000000 << 200)
target
Out[121]:
'0000003b9aca0000000000000000000000000000000000000000000000000000'

The original hash code for python is not small enough.

In [122]:
target
Out[122]:
'0000003b9aca0000000000000000000000000000000000000000000000000000'
In [123]:
sha256.hexdigest() < target
Out[123]:
False

However, adding (a) certain number(s) to the string, yields a hash code small enough.

In [124]:
sh = hashlib.sha256(b'%dyves' % 23240167).hexdigest()
sh
Out[124]:
'00000003b04fad4b30a527760fea6ee5beec8035ef636316c2bf2577b2789611'
In [125]:
sh < target
Out[125]:
True

The following code simulates a mining procedure.

In [126]:
%%time
i = 0
while True:
    sha256 = hashlib.sha256(b'%d' % i + b'yves')
    if sha256.hexdigest() < target:
        print('SUCCESS')
        print(i, sha256.hexdigest())
        # break
    if i % 2500000 == 0:
        print(i)
    i += 1
    if i > 55000000:
        break
0
2500000
5000000
7500000
10000000
12500000
15000000
17500000
20000000
22500000
SUCCESS
23240167 00000003b04fad4b30a527760fea6ee5beec8035ef636316c2bf2577b2789611
25000000
SUCCESS
27090678 00000007a53f0b5163e1cb7ade64881e4eb3e06f9c102ad72a19c942223ba82b
27500000
30000000
32500000
35000000
37500000
40000000
42500000
45000000
47500000
50000000
SUCCESS
50427211 000000099f6760417f3b2161a9ba9e989c62da1745d949267d5b648edaa21496
52500000
55000000
CPU times: user 1min 17s, sys: 115 ms, total: 1min 17s
Wall time: 1min 17s

The time to find a suitable hash code depends on the input string.

In [127]:
%%time
i = 0
while True:
    sha256 = hashlib.sha256(b'%d' % i + b'yveshilpisch')
    if sha256.hexdigest() < target:
        print('SUCCESS')
        print(i, sha256.hexdigest())
        # break
    if i % 2500000 == 0:
        print(i)
    i += 1
    if i > 55000000:
        break
0
2500000
5000000
7500000
10000000
12500000
15000000
17500000
20000000
22500000
25000000
27500000
30000000
32500000
35000000
37500000
40000000
42500000
45000000
47500000
50000000
52500000
55000000
CPU times: user 1min 14s, sys: 58.5 ms, total: 1min 14s
Wall time: 1min 14s

Mining a Bitcoin Block

The following example is from http://www.righto.com/2014/02/bitcoin-mining-hard-way-algorithms.html and is about a 'real' bitcoin block and how to mine it wih Python.

In what follows we need the struct module (cf. https://docs.python.org/3/library/struct.html).

In [128]:
import struct
import binascii
import hashlib

The basic elements of a bitcoin block.

The elements translated into Python code. Cf. also https://en.bitcoin.it/wiki/Block_hashing_algorithm.

In [129]:
ver = 2
prev_block = b'000000000000000117c80378b8da0e33559b5997f2ad55e2f7d18ec1975b9717'
mrkl_root = b'871714dcbae6c8193a2bb9b2a69fe1c0440399f38d94b3a0f1b447275a29978a'
time_ = 0x53058b35  # 2014-02-20 04:57:25
bits = 0x19015f53  # difficulty
bits
Out[129]:
419520339

Cf. https://bitcoinwisdom.com/bitcoin/difficulty for data about mining difficulty and hash rates. See also https://en.bitcoin.it/wiki/Difficulty.

The following code snippets illustrate the derivation of the target value which is the upper limit for a successful hash code. Cf. https://www.codecademy.com/courses/python-intermediate-en-KE1UJ/0/1 for bitwise operations in Python.

In [130]:
ex = bits >> 24
ex
Out[130]:
25
In [131]:
mant = bits & 0xffffff
mant
Out[131]:
89939
In [132]:
hex(mant)
Out[132]:
'0x15f53'
In [133]:
8 * (ex - 3)
Out[133]:
176

The concrete values.

In [134]:
target_hexstr = '%064x' % (mant * (1 << (8 * (ex - 3))))
target_hexstr
Out[134]:
'00000000000000015f5300000000000000000000000000000000000000000000'
In [135]:
target_str = binascii.unhexlify(target_hexstr)
target_str  # C struct
Out[135]:
b'\x00\x00\x00\x00\x00\x00\x00\x01_S\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

The Python module struct can be used to convert Python values to C structs represented as Python strings and vice versa.

In [136]:
pa = struct.pack('LLL', 1, 20, 300)  # packing data
pa
Out[136]:
b'\x01\x00\x00\x00\x00\x00\x00\x00\x14\x00\x00\x00\x00\x00\x00\x00,\x01\x00\x00\x00\x00\x00\x00'
In [137]:
struct.unpack('LLL', pa)  # unpacking data
Out[137]:
(1, 20, 300)

The nonce is the value which is to be added to the other block elements during the hash code generation. One looks for the nonce that gives a hash code smaller than the target level.

In [138]:
nonce = 850000000
In [139]:
# nonce = 856192328
# Block 286819
# check under https://blockexplorer.com

Finally, the Python code to do the mining activitiy. Basically, the nonce values gets increased by 1 during the look, a new hash code is generated and compared to the target level.

In [140]:
%%time
while nonce < 0x100000000:
    header = (struct.pack("<L", ver)
            + binascii.unhexlify(prev_block)[::-1]
            + binascii.unhexlify(mrkl_root)[::-1]
            + struct.pack("<LLL", time_, bits, nonce))
    hs = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    if nonce % 200000 == 0:
        print(nonce, binascii.hexlify(hs[::-1]))
    if binascii.hexlify(hs[::-1]) < binascii.hexlify(target_str):
        print(nonce, binascii.hexlify(hs[::-1]))
        print('success')
        break
    nonce += 1
850000000 b'2d06d5717ef51ce987ec0f0e4823620f8d9d2d6556174103297a6099900a04c0'
850200000 b'879860b11769268f1e5ca7df9a763a7daf63911d2e25eb599db5bd295eba15ee'
850400000 b'b00da7ec454b36fd7f7f96fd973361e56c8f19d252b95b4ada21d49dc6751001'
850600000 b'ed546943a811deed9a2e37a74fda8a66b60d93f80db8f98d02fac8209c116862'
850800000 b'275944cb52b35cc3993bd286e3861afbc246c7de7ed274da917cc22d05cd2bf5'
851000000 b'2caff003e29dfc2ec7130a20d84118580a48d81627e0a766a6450a14404aa030'
851200000 b'5328e4e15f9143668ee761fcd6703b4bbb491962d18d0198acb036eaddec27d4'
851400000 b'961beb37d43cb3e9ec00b12ad6999e7599f2671dc9bd0529cbb4825e045ef6ad'
851600000 b'58c3abc58a3fcccda18f205cc124bcfce61d244b46bb70161da040e346429b61'
851800000 b'4e0bc5299ef60ee2e542b3db8c17a1fc37b423d7d04a90b8afdca7d637a58879'
852000000 b'47c01a53e024c253601f54332790483091888c463d8b8647c8d6f7e078eafb9b'
852200000 b'53c762141e8c9d615dc50b9f3dbc7923d770e8909b1e9a2d3a07e4bb5bc4cbe1'
852400000 b'1bd5babc0d87efd848b56e09cf7cc46c95923864cc2924aa2a1c5fc71835ec21'
852600000 b'1906382ef5f88e718a0a3f511300278828b9e066f0f17538bb0a004789ce03f1'
852800000 b'b9cd34c3e42796b3acfdbade72f80f58d6d25210dd69bd7e2e07a1b1f11dcb44'
853000000 b'4691fb160b2d339e2a54e95f81b891d46acd7f68754b0bd1f4b6942ca0c804ca'
853200000 b'61988103dd17e23ded0344340f3b0f264cb8d96132730cf48ebc34b06377a75b'
853400000 b'ee63a5132d8126c255f73d48e02eec5c7b3d3fb91c447a74aa13b9c982dbbbf6'
853600000 b'6cac0791a530a9e3b8b53d72cbaa2fb46c14c5e65d843e08ab656444bea9a9ea'
853800000 b'75ebeafea5b28c3121000dcf48e023aa829a2a2644afbe530d2b631ed8906bdf'
854000000 b'b261dbf2a4d3178730c06c7d37b3d3f104b9b0c8266cc0bf8ddcffb88ee3e620'
854200000 b'4e7174dc8891c4cf1e6e97dc62166961a289f88206bdd383ba0c31a027610e4f'
854400000 b'0da8de618a00dd0aa71af0ff3350cd78435f10b160a06dee0578caf5bcab38b9'
854600000 b'20e6560f3b64596d9632005d9464d83a3125eb2faef9e34349fd7409869fef51'
854800000 b'e1f78c86fb5afb69ade79f1ba6a0b2f059fd04c1826b74dc7dd99f1ce1148c39'
855000000 b'c56c40696263177bce80ee786c31f5d488d5bab570efb4abee25acb004088645'
855200000 b'96519548a733cc53746e0b07b62f7c0ebb1389a6c4cb8d30309b1f24d78eff26'
855400000 b'18379a22c13f193c9cdb0d5e6eccc2745bc0c95402162ca9721c0dc286c8d61d'
855600000 b'0233a14315cf2b61fce61b534d530c6087e805731bf52fd8f273a24c0806024c'
855800000 b'b0603f071da6f70c98dd66d59be05bc9da7e5a144a7cd4088eac3fccf85f2579'
856000000 b'634192b589ea881333b5e566873ecbaaf8addc5d7ef4240f8e8c1325c29c4c7a'
856192328 b'0000000000000000e067a478024addfecdc93628978aa52d91fabd4292982a50'
success
CPU times: user 29.6 s, sys: 111 ms, total: 29.8 s
Wall time: 29.8 s

Merkle Hash

The following are the hard-coded transaction hashes for the Bitcoin block under consideration (http://www.righto.com/2014/02/bitcoin-mining-hard-way-algorithms.html).

In [141]:
# https://blockexplorer.com/rawblock/0000000000000000e067a478024addfecdc93628978aa52d91fabd4292982a50
txHashes = [
  "00baf6626abc2df808da36a518c69f09b0d2ed0a79421ccfde4f559d2e42128b",
  "91c5e9f288437262f218c60f986e8bc10fb35ab3b9f6de477ff0eb554da89dea",
  "46685c94b82b84fa05b6a0f36de6ff46475520113d5cb8c6fb060e043a0dbc5c",
  "ba7ed2544c78ad793ef5bb0ebe0b1c62e8eb9404691165ffcb08662d1733d7a8",
  "b8dc1b7b7ed847c3595e7b02dbd7372aa221756b718c5f2943c75654faf48589",
  "25074ef168a061fcc8663b4554a31b617683abc33b72d2e2834f9329c93f8214",
  "0fb8e311bffffadc6dc4928d7da9e142951d3ba726c8bde2cf1489b62fb9ebc5",
  "c67c79204e681c8bb453195db8ca7d61d4692f0098514ca198ccfd1b59dbcee3",
  "bd27570a6cbd8ad026bfdb8909fdae9321788f0643dea195f39cd84a60a1901b",
  "41a06e53ffc5108358ddcec05b029763d714ae9f33c5403735e8dee78027fe74",
  "cc2696b44cb07612c316f24c07092956f7d8b6e0d48f758572e0d611d1da6fb9",
  "8fc508772c60ace7bfeb3f5f3a507659285ea6f351ac0474a0a9710c7673d4fd",
  "62fed508c095446d971580099f976428fc069f32e966a40a991953b798b28684",
  "928eadbc39196b95147416eedf6f635dcff818916da65419904df8fde977d5db",
  "b137e685df7c1dffe031fb966a0923bb5d0e56f381e730bc01c6d5244cfe47c1",
  "b92207cee1f9e0bfbd797b05a738fab9de9c799b74f54f6b922f20bd5ec23dd6",
  "29d6f37ada0481375b6903c6480a81f8deaf2dcdba03411ed9e8d3e5684d02dd",
  "48158deb116e4fd0429fbbbae61e8e68cb6d0e0c4465ff9a6a990037f88c489c",
  "be64ea86960864cc0a0236bbb11f232faf5b19ae6e2c85518628f5fae37ec1ca",
  "081363552e9fff7461f1fc6663e1abd0fb2dd1c54931e177479a18c4c26260e8",
  "eb87c25dd2b2537b1ff3dbabc420e422e2a801f1bededa6fa49ef7980feaef70",
  "339e16fcc11deb61ccb548239270af43f5ad34c321416bada4b8d66467b1c697",
  "4ad6417a3a04179482ed2e4b7251c396e38841c6fba8d2ce9543337ab7c93c02",
  "c28a45cded020bf424b400ffc9cb6f2f85601934f18c34a4f78283247192056a",
  "882037cc9e3ee6ddc2d3eba86b7ca163533b5d3cbb16eaa38696bb0a2ea1137e",
  "179bb936305b46bb0a9df330f8701984c725a60e063ad5892fa97461570b5c04",
  "9517c585d1578cb327b7988f38e1a15c663955ea288a2292b40d27f232fbb980",
  "2c7e07d0cf42e5520bcbfe2f5ef63761a9ab9d7ccb00ea346195eae030f3b86f",
  "534f631fc42ae2d309670e01c7a0890e4bfb65bae798522ca14df09c81b09734",
  "104643385619adb848593eb668a8066d1f32650edf35e74b0fc3306cb6719448",
  "87ac990808239c768182a752f4f71cd98558397072883c7e137efb49d22b9231",
  "9b3e2f1c47d59a444e9b6dc725f0ac6baf160d22f3a9d399434e5e65b14eccb0",
  "fbe123066ae5add633a542f151663db4eb5a7053e388faadb40240671ae1b09b",
  "1dd07e92e20b3cb9208af040031f7cfc4efd46cc31ec27be20a1047965a42849",
  "2709bb9ed27353c1fd76b9240cab7576a44de68945e256ad44b2cb8d849a8060",
  "d0174db2c712573432a7869c1508f371f3a1058aeedddc1b53a7e04d7c56c725",
  "b4a16f724cddb8f77ddf3d2146a12c4be13d503885eaba3518a03da005009f62",
  "2aa706d75decbe57745e01d46f9f5d30a08dedaf3288cee14cc4948e3684e1d4",
  "ee49c5f6a5129ccaf2abebbc1d6d07a402a600af6221476b89aafaa683ca95b7",
  "bea1011c77874845e9b4c876ed2ceebd530d428dd4a564ad003d9211d40bb091",
  "f1e88ffc2b1de2aa4827002f06943ce5468735f7433f960bf01e75885b9f832b",
  "19247d017e002fb9143d1a89eb921222a94f8a3d0faaf2e05b0f594989edc4c4",
  "13f714ff62ee7d26b6d69ca980c141ebc54e9f71d2697083fe6c5efc1b02bd0f",
  "0c78cbb8246572f015fbdc53dc9798fa54d1119ec77c1f07ac310bcbcc40dbf8",
  "4bcde0ef92a6d24a2be7be50ac5e5299d776df2e6229ba5d475c2491da94f255",
  "0cfd7d1058502730cf0b2ffa880c78ef534651e06832b5d87c0d7eb84eac5b0c",
  "3a168f794d6e0c614429ad874317cc4cd67a8177214880ff6ea1704d29228c2f",
  "f9a555d817334397b402518d6fd959dc73d981ee7f5fe67969b63974ebbef127",
  "24b52691f66eaed4ce391a473902e309018257c98b9f02aaa33b399c9e6f3168",
  "a37b5e623dc26a180d9e2c9510d06885b014e86e533adb63ec40511e10b55046",
  "9dbaeb485e51d9e25a5621dc46e0bc0aaf51fb26be5acc4e370b96f62c469b80",
  "a6431d3d39f6c38c5df48405090752cab03bfdf5c77cf881b18a946807fba74a",
  "faa77e309f125373acf19855dd496fffe2f74962e545420844557a3adc7ebc11",
  "3523f52543ecfea2f78486dc91550fad0e6467d46d9d9c82ca63b2e0230bfa71",
  "a0583e358e42d77d18d1fd0533ff0a65615fc3b3112061ef92f168a00bf640c1",
  "42ae900888d5e5dde59c8e3d06e13db9e84ef05d27726d4b67fd00c50cd9406a",
  "154940777d3ff78f592ef02790131a59263c36b4958bbc836f9a767ea1a9f178",
  "6a0337de6ac75eecf748306e8ebc5bfe5c811a1481ae50f6956a9e7f26a679f5",
  "c99530c2148e09688d0b88795625943371183bf1f5d56c7446c6ed51ea133589",
  "626421dbe8ad6a0fd0d622d5dd3308a1cdc00b98575a41a91fe01a439e6f40bd",
  "b2f3a559f605a158cc395126c3cf394a7e92a53b7514c75157e1dc43a6c7f93e",
  "dffe06d1bea81f2a01c76786404bb867258f9e68013bf25454097ce935090738",
  "0860159ec7a2a51ce107c182a988c40b4bc2057a734354a1219b6c65e72640ed",
  "a405ff1bb51846b1867acc0b0da17f6f9616e592a0a7ff5ef3297c1ecfd60911",
  "a7d451924263284765f6343bca8a21b79b89ebfe611c7355dd88e0ec1c29e232",
  "41c758d08a4d3fe4d90645711589b832a2cd54dd25bd5b66e463e5d389a53aff",
  "a05c1a93a521fa5dbc1790cfbb808893453a428a65f2c6b2d51249fbb12db309",
  "90997920aa9786e10f513cfdd14e294feee6739cee1ab61b3fb1e3f42e7a915d",
  "99fcb9cb62c20a3135484a70bd3f73983f8f3b7b26266dad34f3993958a7642c",
  "e05f9a668b37e5f78bd3b9d047f29f92b33a87f11dd48390410006f858188b7b",
  "56dbc65895f7992da4a6985e7edba4d1c00879f1b28442c644c8a07658ceab27",
  "5e9004fe262b829563d0804656ba68b1de1690401f08a1915273230d8c902fc0",
  "1ea9ed3717523c5e304b7a7ac8058a87fb4f3fed8c6004769f226c9bb67e79c5",
  "f0f1a4c009b3f1b2729e89898e2f5c0fcdc312edea5df884a9c897cb90e4c566",
  "b5bb4ddf04863e6a60f33cb96c20dac8175d3bae55f335781503143c97a50e43",
  "f14cc97a20c6f627b4b78301352ae35463bc359362589cd178a06c0fa90850b7",
  "628801c8f614015c0fa0ccb2768cccc3e7b9d41ceed06071ce2534d31f7236d6",
  "3be1013c8f8da150e2195408093153b55b08b037fd92db8bb5e803f4c2538aae",
  "c9e1f8777685f54ba65c4e02915fd649ee1edcbf9c77ddf584b943d27efb86c3",
  "4274e92ed3bd02eb101baa5fb8ff7b96236830762d08273749fbb5166db8ab0b",
  "aa84c955bea04c7cee8f5bbbec97d25930fcaca363eed1b8cad37b931556d3e3",
  "d6a29c948677fb1f71aaf16debc3d071a4dd349458eb9e056dce3a000ff853da",
  "ba84bdb3d78367ca365016ac4bff9269576eb010f874c2967af73e0de5638de0",
  "1546c79951e3b541bc64d1957b565b7a2850fc87192c7b374aee6cfc69b9805e",
  "f119227d492ebe27fe9aae321980802454dfa64b2691efbe796c5075d5b07f62",
  "b8cf13d64818b32f96bbb585998b1bc9505f6a94055488e5a71fee9479c6f2a9",
  "1aaf459705b6afef2d7b83e3f181f1af55be0813daf55edce104cc59abc28ed7",
  "61ac185c8f520b5e3134953dc52ff292a40e1e96b088dab259558a9d240ec02f",
  "2da96e3154d7ec2329f787b73cb8a436b92d64cf3cc28e920d073279ea73b5f8",
  "1c4d72ce733b971b9ec4e24f37d733355f6f2ea635cc67ffb3e22748484df446",
  "2a6f89769f3272ac8c7a36a42a57627eca6b260ab2c76d8046a27d44d4034893",
  "f8d11df51a2cc113698ebf39a958fe81179d7d973d2044322771c0fe63f4d7c9",
  "f2287f17a4fa232dca5715c24a92f7112402a8101b9a7b276fb8c8f617376b90",
  "bb5ee510a4fda29cae30c97e7eee80569d3ec3598465f2d7e0674c395e0256e9",
  "647ab8c84365620d60f2523505d14bd230b5e650c96dee48be47770063ee7461",
  "34b06018fcc33ba6ebb01198d785b0629fbdc5d1948f688059158f053093f08b",
  "ff58b258dab0d7f36a2908e6c75229ce308d34806289c912a1a5f39a5aa71f9f",
  "232fc124803668a9f23b1c3bcb1134274303f5c0e1b0e27c9b6c7db59f0e2a4d",
  "27a0797cc5b042ba4c11e72a9555d13a67f00161550b32ede0511718b22dbc2c",
]

Function to generate the (pair-wise) Merkle hash.

In [142]:
# hash pairs of items recursively until a single value is obtained
def merkle(hashList):
    if len(hashList) == 1:
        return hashList[0]
    newHashList = []
    # process pairs; for odd length, the last is skipped
    for i in range(0, len(hashList)-1, 2):
        newHashList.append(hash2(hashList[i], hashList[i+1]))
    if len(hashList) % 2 == 1: # odd, hash last item twice
        newHashList.append(hash2(hashList[-1], hashList[-1]))
    return merkle(newHashList)

def hash2(a, b):
    # reverse inputs before and after hashing
    # due to big-endian / little-endian nonsense
    a1 = binascii.unhexlify(a)[::-1]
    b1 = binascii.unhexlify(b)[::-1]
    h = hashlib.sha256(hashlib.sha256(a1 + b1).digest()).digest()
    return binascii.hexlify(h[::-1])

Finally, the Merkle hash code for the above transaction hashes as found in the block header.

In [143]:
print(merkle(txHashes))
b'871714dcbae6c8193a2bb9b2a69fe1c0440399f38d94b3a0f1b447275a29978a'

The Python Quants

http://tpq.io | @dyjh | team@tpq.io

Python Quant Platform | http://quant-platform.com

Python for Finance | Python for Finance @ O'Reilly

Derivatives Analytics with Python | Derivatives Analytics @ Wiley Finance

Listed Volatility and Variance Derivatives | Listed VV Derivatives @ Wiley Finance