Blockchain difficulty algorithms have been the subject of much research and many incremental improvements. Beyond those increments, there has recently been a significant leap forward that is unlikely to be surpassed any time soon: at Energi, I created a difficulty algorithm based on PID control that outperforms every other algorithm in the industry. In this series of posts I plan to cover difficulty algorithms in depth, discuss the motivation and technical details of Energi's difficulty algorithm, and go over some performance results.
If the term "difficulty algorithm" feels a bit vague, don't worry; we'll start there. A blockchain generally has the characteristic that blocks should be produced at some predetermined rate. The very reason we put "blocks" in a "chain" is so that a decentralized network can reach consensus. But what exactly is a block? A block is nothing more than a small package of data that follows certain rules, which every node on the decentralized network can verify.
The blocks are linked together to form a chain, very much like a linked list for those familiar with data structures. Each successive block in the chain packages some data which may update the state of the network, according to the rules established by the network participants.
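To make the linked-list picture concrete, here is a minimal sketch in Python (the field names and payloads are illustrative, not any real chain's format): each block stores the hash of its predecessor, so the links can be verified by any node.

```python
import hashlib
import json
from dataclasses import dataclass

@dataclass
class Block:
    """A minimal block: a payload plus the hash of the previous block."""
    prev_hash: str
    payload: str

    def hash(self) -> str:
        # Hash the block's contents so the next block can link to it.
        data = json.dumps({"prev": self.prev_hash, "payload": self.payload})
        return hashlib.sha256(data.encode()).hexdigest()

# Build a tiny three-block chain, each block pointing at its predecessor.
genesis = Block(prev_hash="0" * 64, payload="genesis")
second = Block(prev_hash=genesis.hash(), payload="alice pays bob 5")
third = Block(prev_hash=second.hash(), payload="bob pays carol 2")

# A node can verify the links; tampering with any earlier block would
# break every hash that follows it.
assert third.prev_hash == second.hash()
assert second.prev_hash == genesis.hash()
```

Because each link is a hash of the previous block's contents, changing any historical block invalidates the rest of the chain, which is what makes the structure verifiable.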
Why Does It Matter?
There are a number of reasons we want to control the speed at which blocks are allowed to be added to the chain. One reason is economic: in most blockchains the producer of a block, often called a miner, is rewarded for doing so. This reward serves to ensure that there will be nodes out there validating the rules of the network and packaging data to be added to the chain. In a truly decentralized network like Energi, anyone can produce a block. But we don't want people producing useless blocks just to collect a reward; there has to be some limit on when a block can be produced, so that block producers compete for the reward.
Another reason we want to control the speed at which blocks are produced relates to the security of the network itself. Since anyone can produce a block, if there were no limit on how fast blocks could be produced, the network participants could be overwhelmed with different potential blocks. Nodes would have to sort out which blocks belong in the chain, sifting through many possible candidates. The network could become fragmented, with nodes receiving new block candidates faster than they can determine which block to add to the chain. Without some way to limit block production, a decentralized network cannot function properly.
Someone unfamiliar with distributed systems might initially think this is an easy enough problem to solve: just program the nodes to only accept a block every so often, for example 1 block per minute. However, a system designed this way would run into a big problem: clocks always differ between network participants. With potentially thousands of nodes with different clocks, and varying latency between them, we could never agree on the time at which a block may be produced. To make matters worse, because block producers have an incentive to produce each block, they might lie about their own clock in order to gain an advantage. We need some kind of decentralized clock suitable for a distributed system.
This is what a "difficulty algorithm" really is: a mechanism that allows us to build a decentralized clock suitable for use in large-scale distributed systems. We call it a difficulty algorithm because it allows us to adjust how difficult it is to produce a block. The higher the difficulty, the longer it takes block producers to produce a block; the lower the difficulty, the sooner they can produce one.
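As a sketch of how difficulty typically limits block production in proof-of-work systems (this is a generic illustration, not Energi's actual consensus code): a candidate block is only valid if its hash, interpreted as a number, falls below a target, and lowering the target makes valid blocks rarer and thus slower to find.

```python
import hashlib

def meets_target(header: bytes, nonce: int, target: int) -> bool:
    """A candidate is valid when its hash, read as an integer,
    falls below the current target."""
    h = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(h, "big") < target

def mine(header: bytes, target: int) -> int:
    """Try nonces until one satisfies the target; a lower (harder)
    target takes more attempts on average."""
    nonce = 0
    while not meets_target(header, nonce, target):
        nonce += 1
    return nonce

# Halving the target roughly doubles the expected number of attempts.
easy = 2**256 // 4    # about 1 in 4 hashes qualifies
hard = 2**256 // 64   # about 1 in 64 hashes qualifies

# Any hash that meets the hard target also meets the easy one, so the
# easy search can never take more attempts than the hard search.
assert mine(b"example header", easy) <= mine(b"example header", hard)
```

The key point is that the target only controls the *probability* of success per attempt; no clock is consulted, which is exactly why this works as a decentralized timing mechanism.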
An Analogy From Cryptography
There is an analogy here to some concepts in other cryptographic systems, for example the rounds in bcrypt. The password hashing scheme bcrypt uses a variable number of rounds to indirectly determine the amount of processing power needed to compute a bcrypt hash. For password hashes, we want an algorithm that is fast enough to quickly verify a piece of data, but not so quick that an attacker could brute-force the data by trying a large number of guesses. The variable number of rounds in bcrypt allows us to increase or decrease the difficulty of computing a hash, and in doing so indirectly control how much time it takes.
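bcrypt itself isn't in Python's standard library, but PBKDF2 (available via `hashlib`) uses the same idea: an iteration count that tunes how expensive each hash computation is. A quick sketch of that knob in action:

```python
import hashlib
import time

def stretch(password: bytes, iterations: int) -> bytes:
    """Derive a key from a password; the iteration count tunes the
    cost, just as bcrypt's rounds parameter does."""
    # The salt is fixed here only to keep the example deterministic.
    return hashlib.pbkdf2_hmac("sha256", password, b"demo-salt", iterations)

# Time the same derivation at two different costs; the second run
# does 100x the work of the first.
for iters in (1_000, 100_000):
    start = time.perf_counter()
    stretch(b"hunter2", iters)
    print(f"{iters:>7} iterations: {time.perf_counter() - start:.4f}s")
```

Raising the iteration count does not change what the function computes conceptually; it only changes how long each guess costs an attacker, which is precisely the lever the analogy is pointing at.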
In an analogous fashion, the difficulty algorithm allows a decentralized network to control the time it takes to produce a block. Difficulty does not control block timing directly, but rather determines the probability that a block producer will be able to produce a block per unit time. By controlling this probability, we can achieve an expected value for the time between blocks. Since block producers can join or leave the network at any time, a difficulty algorithm must be able to adjust this probability in response to conditions on the network. In Part 2 of this series we'll dive deep into the mathematics of a difficulty algorithm.
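The relationship between per-unit-time probability and expected block time can be seen in a toy simulation (the parameters here are illustrative, not any real network's): if a block producer succeeds each second with probability p, the time between blocks follows a geometric distribution with mean 1/p.

```python
import random

def time_to_block(p: float, rng: random.Random) -> int:
    """Count the seconds until a success, where each second succeeds
    with probability p (a geometric random variable with mean 1/p)."""
    t = 1
    while rng.random() >= p:
        t += 1
    return t

rng = random.Random(42)
p = 1 / 60  # aim for one block per minute
samples = [time_to_block(p, rng) for _ in range(10_000)]
mean = sum(samples) / len(samples)
print(f"average time between blocks: {mean:.1f}s (target {1 / p:.0f}s)")
```

Note that only the *average* lands near the target; individual intervals vary widely, which is one reason a difficulty algorithm has to keep re-estimating network conditions rather than set the probability once.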