A state machine is a conceptual model that describes a system's various states and the transitions between those states. In the context of blockchain, a state machine represents the changing states of the blockchain network as transactions are processed and the overall system evolves.
One of the fundamental problems with state machines, however, is the “state explosion problem.” With regard to blockchain, the state explosion problem arises when the size of the blockchain's state grows exponentially due to increasing transaction volume, smart contract execution, and data storage within the blockchain. This growth in state size presents significant challenges in terms of scalability, storage requirements, network performance, and potential centralization concerns, necessitating innovative solutions to mitigate the impact of state explosion.
In this series of articles, we talk about state explosion, how EVM chains are challenged with this issue, and explore the different strategies to resolve this problem.
In part one of this series, we discuss in depth:
Blockchain technology has gained significant attention and adoption across various industries due to its decentralized and immutable nature. However, as blockchain networks scale and handle an increasing number of transactions, a critical challenge known as "state explosion" arises. This phenomenon refers to the exponential growth in the size of the blockchain's state, leading to various performance and scalability concerns.
Before we dive into the details of the state explosion problem, it is important to clarify what “state” actually means in a blockchain. It is well known that data in blockchains is stored in blocks and that, once saved, the data becomes immutable. The question is whether all saved data counts as “state” and whether none of it can ever change.
The state doesn’t only refer to current data but also to data in use. Data saved on the blockchain can be divided into two categories:
To answer our question of which data is immutable: in a blockchain, history is immutable, whereas the state changes constantly as new transactions are processed. To better understand the impact of state explosion, it is important to understand the storage layout of the state trie of the Ethereum Virtual Machine (EVM), which is the core of several blockchains like Ethereum, BNB Chain, etc.
The Ethereum Virtual Machine (EVM) is a computer program that executes smart contracts on EVM-compatible blockchains, such as Ethereum, where it originates, BNB Smart Chain, Polygon, etc. The EVM stores information about the blockchain network using a special type of tree called a Merkle Patricia tree (MPT).
This tree structure helps keep track of the current state of the system and the transactions that occur. In this tree, the bottom-most (leaf) nodes store the actual data, while higher-level nodes contain hashes of their child nodes, as shown in the figure below. When data is changed, the corresponding node hashes are updated all the way up to the top of the tree. By comparing the topmost hash (the root), we can check whether two copies of the data are the same.
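The hash propagation described above can be illustrated with a minimal sketch. It assumes a plain binary Merkle tree and uses SHA-256 as a stand-in hash (the EVM uses Keccak-256 and a Merkle Patricia trie, which is more involved); the helper names `h` and `merkle_root` and the sample leaves are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash helper. SHA-256 is a stand-in here; the EVM uses Keccak-256."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Compute the root of a simple binary Merkle tree over `leaves`."""
    if not leaves:
        return h(b"")
    # Leaf level: hash the actual data.
    level = [h(leaf) for leaf in leaves]
    # Each higher level holds hashes of pairs of child hashes.
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Hypothetical sample data: four "account" records.
accounts = [b"alice:10", b"bob:5", b"carol:7", b"dave:0"]
root_before = merkle_root(accounts)

# Changing a single leaf changes every hash on its path up to the root.
accounts[1] = b"bob:6"
root_after = merkle_root(accounts)

print(root_before != root_after)
```

Two parties holding the same data compute the same root, so comparing one 32-byte value is enough to detect any difference in the underlying records.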
This tree also allows us to prove the validity of specific data without storing all the information, which saves storage space and ensures the integrity of the data. For more in-depth details, refer here.
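As a rough illustration of proving one piece of data without holding all of it, the sketch below builds a Merkle proof for a single leaf and verifies it against the root. It again assumes a plain binary Merkle tree with SHA-256 as a stand-in hash, not the EVM's actual MPT proof format; `build_levels`, `make_proof`, and `verify` are hypothetical names.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Stand-in hash (SHA-256); the EVM uses Keccak-256."""
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return all levels of a binary Merkle tree, leaf hashes first.

    Expects a non-empty list of leaves.
    """
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # pad odd levels with the last node
            levels[-1] = level
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def make_proof(levels, index):
    """Collect the sibling hash at every level on the path to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the root from one leaf plus its sibling hashes."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


leaves = [b"tx0", b"tx1", b"tx2", b"tx3"]
levels = build_levels(leaves)
root = levels[-1][0]
proof = make_proof(levels, 2)

print(verify(b"tx2", proof, root))       # valid leaf
print(verify(b"tampered", proof, root))  # altered leaf
```

The verifier only needs the leaf, the root, and one sibling hash per level, so the proof size grows with the depth of the tree rather than with the total amount of data.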
Now that we have a brief overview of Merkle trees, let's dive into the main objects in the EVM state storage layout. Remember, all of the storage tries use MPT as their data storage structure.
The state trie is the core of an EVM-based blockchain network. There are four types of trie: the world state trie, the account storage trie, the transaction trie, and the transaction receipt trie. Each trie is constructed as a Merkle Patricia Tree/Trie (MPT), and only the hash of the root node (the top node of the trie) is stored in the block for efficient use of storage.
The roots of the three main tries (world state trie, transaction trie, and receipt trie) are stored in the block, whereas each account storage trie is referenced from a leaf node of the world state trie.
The World State Trie represents the global state of a blockchain network, encompassing current account states, balances, contract codes, and more. It plays a crucial role in determining transaction outcomes and smart contract execution.
When transactions or smart contracts are modified or created, the World State is updated accordingly and recorded in a new block. This ensures consistency among participants, allowing independent verification of transactions and contracts by the network participants.
The World State Trie is connected to the Account Storage Trie through the "storageRoot" field, with the Account Storage Trie serving as the leaf nodes in the World State Trie.
EVM has two account types: Externally Owned Accounts (EOA) and Contract Accounts. EOAs are controlled by private keys, while Contract Accounts are smart contracts controlled by code.
The account state contains information about an EVM account, such as its balance and transaction count. Each account has its own state, with all fields except "codeHash" being mutable. Once code is deployed on the EVM, it cannot be changed; changing it requires a new deployment.
For Contract Accounts on EVM, the Account Storage Trie is used to store associated data. EOAs, on the other hand, have an empty "storageRoot" field, and the "codeHash" field represents the hash of an empty string. EOAs do not have persistent storage and their state is primarily defined by their balance.
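The account fields discussed above can be sketched as a small data structure. This is an illustrative model, not client code: the class and method names are hypothetical, "nonce" is the usual name for the transaction count, and the two constants are the well-known Keccak-256 hash of an empty string and the root hash of an empty trie, which is what an "empty" storageRoot holds in Ethereum clients.

```python
from dataclasses import dataclass

# Keccak-256 of the empty string: the codeHash of an account without code.
EMPTY_CODE_HASH = bytes.fromhex(
    "c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470"
)
# Root hash of an empty Merkle Patricia trie: the storageRoot of an
# account that has no storage.
EMPTY_TRIE_ROOT = bytes.fromhex(
    "56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421"
)


@dataclass
class AccountState:
    """Illustrative model of the four account fields."""
    nonce: int = 0                         # transaction count
    balance: int = 0                       # balance in the smallest unit
    storage_root: bytes = EMPTY_TRIE_ROOT  # root of the account storage trie
    code_hash: bytes = EMPTY_CODE_HASH     # hash of the contract code

    def is_contract(self) -> bool:
        # Simplification: an account counts as a contract if it has code.
        return self.code_hash != EMPTY_CODE_HASH


eoa = AccountState(nonce=3, balance=10**18)
# Hypothetical placeholder hashes standing in for a deployed contract.
contract = AccountState(code_hash=b"\x01" * 32, storage_root=b"\x02" * 32)

print(eoa.is_contract(), contract.is_contract())
```

Only contract accounts carry a non-empty storage trie, which is why contract storage is the part of an account that can grow without bound.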
Contract Accounts rely on the Account Storage Trie to store and manipulate data. It maps 32-byte keys (storage slots) to 32-byte values, which enables flexible and structured data storage within a contract. For more details, refer here.
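A contract's storage can be pictured as a key-value map over 32-byte words. The sketch below is a simplified in-memory model under that assumption: the `ContractStorage` class and its methods are hypothetical, slots are represented as Python integers below 2**256, and the underlying trie is left out entirely.

```python
WORD_MAX = 2**256 - 1  # largest value that fits in a 32-byte word


class ContractStorage:
    """Toy model of contract storage: 32-byte slot -> 32-byte value."""

    def __init__(self) -> None:
        self._slots: dict[int, int] = {}

    def load(self, key: int) -> int:
        # Slots that were never written read as zero.
        return self._slots.get(key, 0)

    def store(self, key: int, value: int) -> None:
        if not (0 <= key <= WORD_MAX and 0 <= value <= WORD_MAX):
            raise ValueError("key and value must fit in 32 bytes")
        if value == 0:
            # Writing zero clears the slot, so it no longer occupies state.
            self._slots.pop(key, None)
        else:
            self._slots[key] = value

    def used_slots(self) -> int:
        return len(self._slots)


storage = ContractStorage()
storage.store(0, 42)          # e.g. a counter kept in slot 0
storage.store(2**255, 7)      # slots can sit anywhere in the key space
storage.store(0, 0)           # clearing a slot frees it again

print(storage.load(0), storage.load(2**255), storage.used_slots())
```

Every non-zero slot is data that each full node has to keep, so contracts that keep writing new slots and never clear them are a direct source of state growth.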
Transactions are essential in a blockchain network and are key to providing transparency and security, as they are responsible for changing the state in the EVM. Further, once a transaction is added to a block, it becomes immutable and cannot be modified. This immutability ensures the integrity of account balances (world state).
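The relationship between immutable transactions and a changing state can be sketched as a tiny state machine. This is a deliberately simplified model: it ignores gas, signatures, and contract execution, represents the state as a plain dictionary of `(balance, nonce)` pairs, and uses the hypothetical names `Transfer` and `apply_transfer`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a recorded transaction never changes
class Transfer:
    sender: str
    recipient: str
    value: int
    nonce: int


def apply_transfer(state: dict, tx: Transfer) -> dict:
    """State transition: return the new state after applying `tx`."""
    balance, nonce = state.get(tx.sender, (0, 0))
    if tx.nonce != nonce:
        raise ValueError("unexpected nonce")
    if tx.value > balance:
        raise ValueError("insufficient balance")
    new_state = dict(state)
    new_state[tx.sender] = (balance - tx.value, nonce + 1)
    r_balance, r_nonce = new_state.get(tx.recipient, (0, 0))
    new_state[tx.recipient] = (r_balance + tx.value, r_nonce)
    return new_state


state = {"alice": (10, 0)}  # hypothetical genesis state
history = []                # append-only list of applied transactions

for tx in [Transfer("alice", "bob", 3, 0), Transfer("alice", "carol", 2, 1)]:
    state = apply_transfer(state, tx)
    history.append(tx)

print(state)
print(len(history))
```

The history only ever grows by appending, while the state is overwritten in place; note that the state also gained two entries here, because each first-time recipient becomes a new account.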
The Transaction Trie stores transaction information in an MPT for efficient retrieval and verification on an EVM chain. Each leaf node represents a unique transaction containing sender and recipient addresses, values, gas prices, etc. These nodes are hashed with their parents to create the trie. The root node's hash, called "transactionsRoot," is stored in the block's header, referencing all transactions in the block. The EVM has one transaction trie per block. For more details, refer here.
The Transaction Receipt Trie organizes and stores transaction receipts in a block. Each leaf node corresponds to a unique transaction and holds the receipt data. These leaf nodes are combined and hashed with parent nodes to form the entire Receipt Trie. The "receiptsRoot," stored in the block header, serves as a reference to the transaction receipts.
Receipt data includes transaction status, gas consumed, logs generated, and other metadata. The Receipt Trie validates transaction execution and facilitates efficient validation and auditing with Merkle proofs in the EVM-based blockchain. For more details, refer here.
In a blockchain network, nodes play a crucial role in managing the state trie. While the goal is to allow consumer-grade devices to function as nodes, higher hardware requirements are typically necessary due to the local maintenance of the state trie. The state trie represents the current state of the blockchain network, ensuring its integrity. This section briefly explains the connection between nodes and the state trie.
The state of the blockchain is maintained by nodes within the network through the use of the state trie. This trie is a crucial component that allows nodes to confirm transactions, carry out smart contracts, and maintain the network's security. By working together to manage the state trie, nodes ensure the secure and decentralized operation of the blockchain network.
The state explosion problem refers to the state growing rapidly and out of control. Blockchain platforms that offer smart contract programmability, like Ethereum, BNB Chain, etc., face this problem because their users save all kinds of data on-chain, e.g., state data, history data, contract data, account data, transactions, etc. With mass adoption, this problem gains severity and requires attention to make sure the blockchain platform maintains its scalability and decentralization.
When a node participates in a blockchain network, it maintains a copy of the state trie locally. The state trie can be quite large, especially as the number of accounts and transactions increases over time. Each node needs to store and update the state trie to remain in sync with the network and perform various operations, such as transaction validation and execution of smart contracts.
With mass adoption and the state trie growing at an accelerated pace, state explosion can impact several different aspects of blockchain technology.
BNB Smart Chain (BSC) is one of the rapidly growing EVM blockchain platforms. Offering lower gas costs, faster finality, complete EVM compatibility, smart contract programmability, and several innovative solutions to Web3 developers, it is one of the biggest competitors of Ethereum, the pioneer of EVM chains.
BSC daily transactions peaked at 16 million on 25th Nov 2021, which corresponds to ~188 TPS sustained for 24 hours. None of the other EVM blockchains have faced such large online traffic yet.
Due to the higher volume of traffic, the storage size requirements on BSC are also growing very rapidly. As of the end of 2022, a pruned BSC full node snapshot file is approximately 1.6TB in size, compared to approximately 1TB a year earlier.
The 1.6TB storage consists mainly of two parts:
With a higher influx of transactions and smart contract deployments/interactions, storage requirements will also increase linearly. This makes it a problem that needs quick attention and a forward-compatible solution: one that will keep the storage size and hardware requirements for the nodes in check.
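To get a feel for what linear growth would mean for node operators, the snapshot sizes quoted above can be extrapolated. This is a back-of-the-envelope sketch, not a forecast: it assumes the observed one-year increase (about 1TB to about 1.6TB) simply repeats every year, and the function name `project_linear` is hypothetical.

```python
# Approximate pruned full node snapshot sizes quoted in the article.
SIZE_END_2021_TB = 1.0
SIZE_END_2022_TB = 1.6


def project_linear(years_ahead: int) -> float:
    """Project the snapshot size, assuming the same yearly increase repeats."""
    yearly_growth = SIZE_END_2022_TB - SIZE_END_2021_TB
    return SIZE_END_2022_TB + yearly_growth * years_ahead


for years in range(1, 4):
    print(f"end of {2022 + years}: ~{project_linear(years):.1f} TB")
```

Even under this simple assumption, the required disk space keeps climbing every year unless old state is pruned or expired, which is what motivates the proposals discussed next.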
Over the years, BSC has implemented several solutions to keep this problem at bay, like sharding, pruning, layer 2 solutions, storage data structure optimizations, etc. However, BEP-206 and BEP-215 have surfaced as the latest and most suitable solutions to maintain the storage size and scalability of the BNB Chain.
BEP-206 proposes a practical solution to address the problem of increasing world state storage on BSC by removing expired storage state, whereas BEP-215 aims to introduce a state revival transaction type based on BEP-206. The details of these proposals will be covered in the upcoming parts of this article series. Note that both of these proposals are still in progress and under community discussion.
Over the years, blockchain technology has gained immense popularity. However, with this growth, one of the fundamental issues that has surfaced and is of prime concern is the explosion in the size of the state trie, giving rise to the state explosion problem, which is inherent to state machines.
Addressing the challenges posed by state explosion is crucial for ensuring the scalability, performance, and decentralization of blockchain networks. In brief, state explosion can undermine the core promises of blockchain technology, like decentralization, high throughput, scalability, etc.
These challenges require innovative solutions to make sure the mass adoption of blockchain technology is maintained. In the next part of this series, we discuss the different proposals that have been suggested to mitigate the impact of state explosion and enable the widespread adoption of blockchain technology.