Building blockchain in Go. Part 2: Proof of Work

Table of Contents

Introduction
Proof-of-Work
Hashing
Hashcash
Implementation

Introduction

In the previous article we built a very simple data structure, which is the essence of blockchain database. And we made it possible to add blocks to it with the chain-like relation between them: each block is linked to the previous one. Alas, our blockchain implementation has one significant flaw: adding blocks to the chain is easy and cheap. One of the keystones of blockchain and Bitcoin is that adding new blocks is a hard work. Today we're going to fix this flaw.

Proof-of-Work

A key idea of blockchain is that one has to perform some hard work to put data in it. It is this hard work that makes blockchain secure and consistent. Also, a reward is paid for this hard work (this is how people get coins for mining).

This explanation will focus on proof of work as it functions in the bitcoin network. Bitcoin is a digital currency that is underpinned by a kind of distributed ledger known as a "blockchain." This ledger contains a record of all bitcoin transactions, arranged in sequential "blocks," so that no user is allowed to spend any of their holdings twice. In order to prevent tampering, the ledger is public, or "distributed"; an altered version would quickly be rejected by other users.

The way that users detect tampering in practice is through hashes, long strings of numbers that serve as proof of work. Put a given set of data through a hash function (bitcoin uses SHA-256), and it will only ever generate one hash. Due to the "avalanche effect," however, even a tiny change to any portion of the original data will result in a totally unrecognizable hash. Whatever the size of the original data set, the hash generated by a given function will be the same length. The hash is a one-way function: it cannot be used to obtain the original data, only to check that the data that generated the hash matches the original data.

Generating just any hash for a set of bitcoin transactions would be trivial for a modern computer, so in order to turn the process into "work," the bitcoin network sets a certain level of "difficulty." This setting is adjusted so that a new block is "mined"—added to the blockchain by generating a valid hash—approximately every 10 minutes. Setting difficulty is accomplished by establishing a "target" for the hash: the lower the target, the smaller the set of valid hashes, and the harder it is to generate one. In practice, this means a hash that starts with a very long string of zeros.

This whole “do hard work and prove" mechanism is called proof-of-work. It's hard because it requires a lot of computational power: even high performance computers cannot do it quickly. Moreover, the difficulty of this work increases from time to time to keep new blocks rate at about 6 blocks per hour. In Bitcoin, the goal of such work is to find a hash for a block, that meets some requirements. And it's this hash that serves as a proof. Thus, finding a proof is the actual work.

One last thing to note. Proof-of-Work algorithms must meet a requirement: doing the work is hard, but verifying the proof is easy. A proof is usually handed to someone else, so for them, it shouldn't take much time to verify it.

Hashing

Hashing is a process of obtaining a hash for specified data. A hash is a unique representation of the data it was calculated on. A hash function is a function that takes data of arbitrary size and produces a fixed size hash. Here are some key features of hashing:

Original data cannot be restored from a hash. Thus, hashing is not encryption.
Certain data can have only one hash and the hash is unique.
Changing even one byte in the input data will result in a completely different hash.

Hashing functions are widely used to check the consistency of data. Some software providers publish checksums in addition to a software package. After downloading a file you can feed it to a hashing function and compare produced hash with the one provided by the software developer.

In blockchain, hashing is used to guarantee the consistency of a block. The input data for a hashing algorithm contains the hash of the previous block, thus making it impossible (or, at least, quite difficult) to modify a block in the chain: one has to recalculate its hash and hashes of all the blocks after it.

Hashcash

Bitcoin uses Hashcash, a Proof-of-Work algorithm that was initially developed to prevent email spam. It can be split into the following steps:

Take some publicly known data (in case of email, it's receiver's email address; in case of Bitcoin, it's block headers).
Add a counter to it. The counter starts at 0.
Get a hash of the data + counter combination.
Check that the hash meets certain requirements.
- If it does, you're done.
- If it doesn't, increase the counter and repeat the steps 3 and 4.

Thus, this is a brute force algorithm: you change the counter, calculate a new hash, check it, increment the counter, calculate a hash, etc. That's why it's computationally expensive.

Now let's look closer at the requirements a hash has to meet. In the original Hashcash implementation, the requirement sounds like "first 20 bits of a hash must be zeros". In Bitcoin, the requirement is adjusted from time to time, because, by design, a block must be generated every 10 minutes, despite computation power increasing with time and more and more miners joining the network.

To demonstrate this algorithm, I took the data from the previous example ("I like donuts") and found a hash that starts with 3 zero-bytes:

ca07ca is the hexadecimal value of the counter, which is 13240266 in the decimal system.

Implementation

Ok, we're done with the theory, let's write code! First, let's define the difficulty of mining:

const targetBits = 18

In Bitcoin, "target bits" is the block header storing the difficulty at which the block was mined. We won't implement a target adjusting algorithm, for now, so we can just define the difficulty as a global constant.

24 is an arbitrary number, our goal is to have a target that takes less than 256 bits in memory. And we want the difference to be significant enough, but not too big, because the bigger the difference the more difficult it's to find a proper hash.

// pkg/blockchain/proofofwork.go

type ProofOfWork struct {
	block  *Block
	target *big.Int
}

func NewProofOfWork(b *Block) *ProofOfWork {
	target := big.NewInt(1)
	target.Lsh(target, uint(256-targetBits))

	return &ProofOfWork{
		block:  b,
		target: target,
	}
}

Now, we need the data to hash. Let's prepare it:

// pkg/blockchain/proofofwork.go

func (pow *ProofOfWork) prepareData(nonce int) []byte {
	data := bytes.Join(
		[][]byte{
			pow.block.PrevBlockHash,
			pow.block.Data,
			utils.IntToHex(pow.block.Timestamp),
			utils.IntToHex(int64(targetBits)),
			utils.IntToHex(int64(nonce)),
		},
		[]byte{},
	)

	return data
}

This piece is straightforward: we just merge block fields with the target and nonce. nonce here is the counter from the Hashcash description above, this is a cryptographic term.

Ok, all preparations are done, let's implement the core of the PoW algorithm:

// pkg/blockchain/proofofwork.go

func (pow *ProofOfWork) Run() (int, []byte) {
	var hashInt big.Int
	var hash [32]byte
	nonce := 0

	fmt.Printf("Mining the block containing \"%s\"\n", pow.block.Data)
	for nonce < maxNonce {
		data := pow.prepareData(nonce)
		hash = sha256.Sum256(data)
		fmt.Printf("\r%x", hash)
		hashInt.SetBytes(hash[:])

		if hashInt.Cmp(pow.target) == -1 {
			break
		} else {
			nonce++
		}
	}

	fmt.Printf("\n\n")
	return nonce, hash[:]
}

First, we initialize variables: hashInt is the integer representation of hash; nonce is the counter. Next, we run an “infinite" loop: it's limited by maxNonce, which equals to math.MaxInt64; this is done to avoid a possible overflow of nonce. Although the difficulty of our PoW implementation is too low for the counter to overflow, it's still better to have this check, just in case.

In the loop we:

Prepare data.
Hash it with SHA-256.
Convert the hash to a big integer.
Compare the integer with the target.

As easy as it was explained earlier. Now we can remove the SetHash method of Block and modify the NewBlock function:

// pkg/blockchain/block.go

func NewBlock(data string, prevBlockHash []byte) *Block {
	block := &Block{
		Timestamp:     time.Now().Unix(),
		Data:          []byte(data),
		PrevBlockHash: prevBlockHash,
		Hash:          []byte{},
		Nonce:         0,
	}
	pow := NewProofOfWork(block)
	nonce, hash := pow.Run()

	block.Nonce = nonce
	block.Hash = hash[:]

	return block
}

Here you can see that nonce is saved as a Block property. This is necessary because nonce is required to verify a proof. The Block structure now looks so:

// pkg/blockchain/block.go

type Block struct {
	Timestamp     int64
	Data          []byte
	PrevBlockHash []byte
	Hash          []byte
	Nonce         int
}

Alright! Let's run the program to see if everything works fine:

Mining the block containing "Genesis Block"
000021840de32a6277935a5e4febbef7b14bb34828e4cd5851f11a5d36f2fe38

Mining the block containing "Send 1 BTC to Ivan"
00002ce36466d2e9cc12eb0961cb10e9be699885bb273f461fdef5c9d38ba856

Mining the block containing "Send 2 more BTC to Ivan"
000000adf85a0bda632f43b9150522fe42bdcf3d14e58f85cb2df4402fe02ab7

Prev. hash:
Data: Genesis Block
Hash: 000021840de32a6277935a5e4febbef7b14bb34828e4cd5851f11a5d36f2fe38

Prev. hash: 000021840de32a6277935a5e4febbef7b14bb34828e4cd5851f11a5d36f2fe38
Data: Send 1 BTC to Ivan
Hash: 00002ce36466d2e9cc12eb0961cb10e9be699885bb273f461fdef5c9d38ba856

Prev. hash: 00002ce36466d2e9cc12eb0961cb10e9be699885bb273f461fdef5c9d38ba856
Data: Send 2 more BTC to Ivan
Hash: 000000adf85a0bda632f43b9150522fe42bdcf3d14e58f85cb2df4402fe02ab7

There's one more thing left to do: let's make it possible to validate proof of works.

// pkg/blockchain/proofofwork.go

func (pow *ProofOfWork) Validate() bool {
	var hashInt big.Int

	data := pow.prepareData(pow.block.Nonce)
	hash := sha256.Sum256(data)
	hashInt.SetBytes(hash[:])

	isValid := hashInt.Cmp(pow.target) == -1

	return isValid
}

And this is where we need the saved nonce.

Finally, let's check one more time that everything's ok:

// pkg/blockchain/main.go

func main() {
	...

	for _, block := range bc.blocks {
		...
		pow := NewProofOfWork(block)
		fmt.Printf("PoW: %s\n", strconv.FormatBool(pow.Validate()))
		fmt.Println()
	}

}

Output:

...

Prev. hash:
Data: Genesis Block
Hash: 00000f2889a19b505c176009034d2aa83ad4a992b8796845a63fe64a6444a8d5
PoW: true

Prev. hash: 00000f2889a19b505c176009034d2aa83ad4a992b8796845a63fe64a6444a8d5
Data: Send 1 BTC to Ivan
Hash: 000021f6776030be7f22aacc8cf37b2eca03b4c42ec670b62b056143d487125b
PoW: true

Prev. hash: 000021f6776030be7f22aacc8cf37b2eca03b4c42ec670b62b056143d487125b
Data: Send 2 more BTC to Ivan
Hash: 00003351be743bc826512f0cb0d85fd3cd8d1fb528e3586d82f7b8e3207a2812
PoW: true

References:

Full Source Code

Proof of Work (PoW)

Hashcash

Building Blockchain in Go