Blockchain Difficulty Algorithms - Part 2: Mathematics

In Part 1 of this series, we described the basic ideas of what a difficulty algorithm is and why we need one to help us control block times in a decentralized network. In this post I will formalize some of the mathematics. The definitions presented here will form the basis for the rest of this series on difficulty algorithms, but feel free to skip ahead if you don't need the deep dive.

Definitions

Difficulty Function

First, let's define the difficulty algorithm itself. A difficulty algorithm is a function \( D \) of the form:
\[ D(\Delta t_i) = \frac{1}{T} \]
Where \( \Delta t_i \) is the block time interval, and \( T \) is a target value. Though it is not strictly necessary, it's generally practical to have a restriction that \( \Delta t_i > 0 \). It is common to consider that \( \Delta t_i \) may be a vector, i.e. \( \Delta \vec{t} \) though we will forego this notation.

Difficulty

We'll define difficulty \( d \) in terms of the target value as:
\[ d = \frac{1}{T} \]
Thus, the difficulty function \( D \) is a function which maps one or more block time intervals \( \Delta t_i \) to a difficulty \( d \).
\[ D(\Delta t_i) = \frac{1}{T} = d \]
Each block in a blockchain stores a difficulty value. The difficulty function is used to determine the correct difficulty value for the next block. There are many nodes which need to independently agree on the precise value of the next difficulty, so a difficulty function must be deterministic. For this reason we usually consider that \( d \) and \( T \) must be integers, so there can be no rounding errors or disagreements about the representation of a floating point value.
\[ d,T \in \mathbb{Z} \]

Target Value

The target value is a scalar value equal to the inverse of the difficulty. The target value is the value we use in Proof of Work, Proof of Stake, or other Sybil-resistance mechanisms to control how difficult it is to produce a certain block. Specifically, the target value is used in the difficulty condition to determine whether a block is valid or not.

Difficulty Condition

The difficulty condition describes a simple check that can determine whether or not a block is valid. In Bitcoin this is defined as:
\[ \text{SHA256}(b) \le T \]

If a block meets this difficulty condition, then it is considered valid. Otherwise, it must be rejected. The choice of difficulty condition in Bitcoin is somewhat arbitrary. What is really important is the property of the difficulty condition which is that the larger the difficulty, the less likely it is that a given block will satisfy the difficulty condition. Furthermore, it is desirable that this probability condition be linear, so that the difficulty function is more simple. Remember, the goal is to control the block time interval. So we can say that the probability of satisfying the difficulty condition is linearly proportional to the target value, and inversely proportional to the difficulty:
\[ P( \text{SHA256}(b) \le T ) \propto T \]

Mining

The job of a miner is to find a block \( b \) which meets the difficulty condition. Since all the nodes in the network abide by the same difficulty function, there is an agreed upon difficulty for the next block. From this a node computes the target value as the inverse, and tries to create a block which satisfies the difficulty condition. Upon doing so, a miner is generally rewarded for contributing this block to the network if it is accepted into the main blockchain. The difficulty condition is based on the block hash, so any piece of data that a miner changes within a block will result in a new block hash. The process of mining generally refers to repeatedly changing a number in the block reserved for the purpose, the nonce, and checking to see if the difficulty condition is satisfied yet. The miner may also change the block timestamp to produce different hashes. In a Proof of Work system like Bitcoin, miners must spend energy and time to compute hashes in the hope of finding a block which satisfies the difficulty condition. The more hashes a block producer can compute, they higher likelihood they have of solving the difficulty condition. In a Proof of Stake system like Energi, the difficulty condition is more likely to be solved by higher balances. In both cases, as time progresses there are more opportunities to solve the block. So we can define mining a search function over two dimensions: a weight \( W \) which measures the relative strength of a miner, and time \( t \).

Problem Statement

With the above definitions, we have all we need to fully define a system which is capable of decentralized time keeping. A well ordered blockchain is a linked list of blocks in chronological order, each block keeping track of its own timestamp. For some timestamp \( t_i \) we can compute \( \Delta t_i = t_i - t_{i-1} \).
The goal of a difficulty algorithm is to produce a series of difficulty values \( d_i \) such that if there are no changes to the miners, the average block time interval will converge upon some predetermined value \( t \):
\[\lim_{i \to \infty} \overline{\Delta t_i} = t\]
In principle, a difficulty algorithm simply needs to increase the difficulty if \( \Delta t_i < t \), and decrease the difficulty if \( \Delta t_i > t \). This describes the basic problem statement behind many engineering problems classified as control theory.

Ryan Lucchese's Blog

Crypto, Mathematics, Physics, AI