Sharding is a method of splitting a single logical data set across multiple databases. It is also known as horizontal data partitioning.
When was sharding invented, and by whom?
The concept of sharding has been applied in the management of traditional centralized databases since the late 1990s. The term “shard” (fragment) was popularized by one of the first massively multiplayer online role-playing games, Ultima Online, in which the developers assigned players to different servers (different “worlds” in the game) to cope with traffic.
A popular scenario for using sharding in business is dividing the user database by geographic location. Users belonging to one geographic location are combined into one group and placed on a dedicated server.
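As a rough illustration of geographic sharding, the sketch below routes each user to the server that holds their region. The region names, host names, and function names are all hypothetical and chosen only for this example.

```python
# Hypothetical mapping from a user's region to the database server (shard)
# that stores that region's users. All names here are illustrative.
REGION_TO_SHARD = {
    "europe": "db-eu.example.internal",
    "north_america": "db-na.example.internal",
    "asia": "db-asia.example.internal",
}


def shard_for_user(region: str) -> str:
    """Return the server that holds users from the given region."""
    try:
        return REGION_TO_SHARD[region]
    except KeyError:
        raise ValueError(f"no shard configured for region {region!r}") from None


def group_users_by_shard(users: list[dict]) -> dict[str, list[str]]:
    """Group user names by the shard that would store them."""
    groups: dict[str, list[str]] = {}
    for user in users:
        groups.setdefault(shard_for_user(user["region"]), []).append(user["name"])
    return groups


if __name__ == "__main__":
    users = [
        {"name": "alice", "region": "europe"},
        {"name": "bob", "region": "asia"},
        {"name": "carol", "region": "europe"},
    ]
    print(group_users_by_shard(users))
```

Each server only ever stores and queries its own region's users, which is what keeps the load on any one machine bounded.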
What is sharding in the context of blockchain?
A blockchain is a database whose nodes are individual servers. In the context of blockchain, sharding means dividing the network into individual segments (shards). Each shard contains a unique set of smart contracts and account balances.
Nodes are assigned to each shard and verify only that shard’s transactions and operations, as opposed to a scheme in which every node is responsible for verifying every transaction in the entire network.
Dividing the blockchain into more manageable segments makes it possible to increase transaction throughput and thereby address the scalability problem faced by most modern blockchains.
How does sharding work?
Consider Ethereum as an example:
The Ethereum blockchain consists of thousands of computers or nodes, each of which “lends” a certain amount of hash rate to the network. It is this hash rate that allows the Ethereum Virtual Machine (EVM) to function — to execute smart contracts and manage decentralized applications (DApps).
Ethereum currently operates on a sequential execution basis in which each of the nodes must calculate each operation and process each transaction. Therefore, it takes a significant amount of time for a transaction to complete the verification process: Ethereum carries out approximately 10 transactions per second, while Visa, for example, has this figure in the region of 24,000.
Adding computers to the network does not necessarily increase efficiency, since the entire ledger is stored on each device, and the verification chain just gets longer.
The idea behind sharding is to move away from a model in which each node has to compute every operation, in favor of a parallel execution model in which the nodes process only certain computations. This allows multiple transactions to be processed in parallel.
The blockchain is split into separate shards (subdomains or segments). Nodes only manage the part of the ledger to which they are attached (execute processes and commit transactions), and do not maintain the entire ledger.
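The parallel model described above can be sketched in a few lines. This is a toy illustration, not how Ethereum or any other network implements sharding: the shard count, the rule of assigning an account to a shard by hashing its address, and the trivial validity check are all assumptions made for the example.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 4  # assumed shard count, chosen only for illustration


def shard_of(account: str) -> int:
    """Map an account to a shard by hashing its address (an assumed rule)."""
    digest = hashlib.sha256(account.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def split_by_shard(transactions):
    """Group transactions by the shard of the sending account."""
    per_shard = {shard: [] for shard in range(NUM_SHARDS)}
    for tx in transactions:
        per_shard[shard_of(tx["sender"])].append(tx)
    return per_shard


def process_shard(shard_txs):
    """Stand-in for a shard's nodes validating only their own transactions.

    Returns how many transactions pass a trivial check (positive amount).
    """
    return sum(1 for tx in shard_txs if tx["amount"] > 0)


def process_in_parallel(transactions):
    """Validate every shard's transactions concurrently.

    Returns a dict mapping shard number to its count of valid transactions.
    """
    per_shard = split_by_shard(transactions)
    with ThreadPoolExecutor(max_workers=NUM_SHARDS) as pool:
        results = list(pool.map(process_shard, per_shard.values()))
    return dict(zip(per_shard.keys(), results))


if __name__ == "__main__":
    txs = [{"sender": f"account-{i}", "amount": i} for i in range(10)]
    print(process_in_parallel(txs))
```

No worker ever sees another shard's transactions, which mirrors the idea that a node manages only the part of the ledger it is attached to.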
What problems does sharding solve?
Sharding is a potential solution to the scaling problem.
The more popular a blockchain becomes, the more users initiate transactions and launch decentralized applications and other processes on the network. As a result, transaction speed drops, which hinders the expansion of the blockchain in the long term. Growing transactional activity forces nodes to verify more and more transactions. There is a risk that such blockchains could become clogged, as happened with Ethereum during the CryptoKitties boom, when the game accounted for 11% of the network’s transactions.
If groups of nodes are responsible for individual segments, then each node does not need to maintain the entire ledger or perform every operation. Transactions can therefore be validated in parallel rather than sequentially, which increases network speed and addresses the scaling problem.
What are the disadvantages of sharding?
The main problems of sharding are communication and security. If you divide the blockchain into isolated segments, then each shard becomes a separate network. Users and applications of one subdomain will not be able to communicate with users and applications of another subdomain without using a special communication mechanism.
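One common way to picture such a communication mechanism is a receipt passed between shards: the sending shard debits the account and emits a receipt, and the receiving shard applies that receipt exactly once. The sketch below is a simplified assumption for illustration only. The `Shard` class, its method names, and the receipt format are hypothetical and do not describe any specific protocol.

```python
class Shard:
    """A toy shard that only knows the balances of its own accounts."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.seen_receipts = set()  # guards against applying a receipt twice

    def debit(self, receipt_id, sender, recipient, amount):
        """Withdraw on the sending shard and emit a receipt for the other shard."""
        if amount <= 0 or self.balances.get(sender, 0) < amount:
            raise ValueError("invalid amount or insufficient funds")
        self.balances[sender] -= amount
        return {"id": receipt_id, "recipient": recipient, "amount": amount}

    def credit(self, receipt):
        """Apply a receipt on the receiving shard; return False if already applied."""
        if receipt["id"] in self.seen_receipts:
            return False
        self.seen_receipts.add(receipt["id"])
        recipient = receipt["recipient"]
        self.balances[recipient] = self.balances.get(recipient, 0) + receipt["amount"]
        return True


if __name__ == "__main__":
    shard_a = Shard({"alice": 10})
    shard_b = Shard({"bob": 0})
    receipt = shard_a.debit("receipt-1", "alice", "bob", 4)
    shard_b.credit(receipt)
    print(shard_a.balances, shard_b.balances)
```

A real system would also have to prove to the receiving shard that the receipt is genuine, which is where much of the difficulty lies.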
A segmented blockchain also poses a security problem: it is easier for attackers to capture a single shard, because controlling one segment requires far less hash rate than controlling the whole network (the so-called 1% attack).
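The arithmetic behind the name can be made explicit. The sketch below assumes, purely for illustration, that the network's power is split evenly across shards and that a 51% share of one shard is enough to control it. The function name and parameters are hypothetical.

```python
def attack_share_of_total(num_shards: int, majority: float = 0.51) -> float:
    """Fraction of the whole network's power needed to control one shard.

    Assumes power is divided evenly among shards and that holding
    `majority` of a single shard's power is enough to control it.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    return majority / num_shards


if __name__ == "__main__":
    for shards in (1, 10, 100):
        share = attack_share_of_total(shards)
        print(f"{shards} shard(s): {share:.2%} of total network power")
```

Under these assumptions, a network with 100 shards could have one shard taken over with well under 1% of the total power, compared with 51% for an unsharded network.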
Once a segment is captured, attackers can forward invalid transactions to the main network. Data in that segment may also become invalid and be irretrievably lost. Ethereum’s proposed answer is random sampling: the nodes that confirm blocks in each shard are assigned at random, so an attacker cannot choose in advance which shard to target.
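Why random assignment helps can be shown with a small probability calculation. The sketch below is an assumption-laden model rather than Ethereum's actual design: each seat in a shard's validating group is filled independently at random, the attacker controls a fixed fraction of all nodes, and a strict majority of seats is needed to capture the shard. The function name and parameters are hypothetical.

```python
from math import comb


def takeover_probability(committee_size: int, attacker_fraction: float) -> float:
    """Probability that more than half of a random committee is attacker-controlled.

    Uses the binomial distribution, i.e. it assumes each seat is filled
    independently and belongs to the attacker with probability `attacker_fraction`.
    """
    threshold = committee_size // 2 + 1  # smallest strict majority
    return sum(
        comb(committee_size, k)
        * attacker_fraction**k
        * (1 - attacker_fraction) ** (committee_size - k)
        for k in range(threshold, committee_size + 1)
    )


if __name__ == "__main__":
    for size in (11, 51, 101):
        print(size, takeover_probability(size, 0.3))
```

In this model, an attacker holding a minority of nodes becomes less and less likely to win a majority as the randomly chosen group grows, which is the intuition behind sampling validators at random.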
What are the alternatives to sharding?
Developers have proposed two other ways to improve blockchain performance and transaction speed.
The first solution is to increase the block size. The key idea is that the larger the block size, the more transactions can be placed in it and, therefore, the higher the number of transactions per second.
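The relationship between block size and throughput is simple arithmetic. The sketch below uses made-up numbers (a 1 MB block, 250-byte transactions, one block every 600 seconds) that are assumptions for illustration, not figures from any particular network.

```python
def transactions_per_second(
    block_size_bytes: int,
    avg_tx_size_bytes: int,
    block_interval_seconds: float,
) -> float:
    """Estimate throughput from block size, average transaction size, and block time."""
    if avg_tx_size_bytes <= 0 or block_interval_seconds <= 0:
        raise ValueError("transaction size and block interval must be positive")
    txs_per_block = block_size_bytes // avg_tx_size_bytes  # whole transactions only
    return txs_per_block / block_interval_seconds


if __name__ == "__main__":
    # Illustrative inputs only: 1 MB vs. 2 MB blocks.
    small = transactions_per_second(1_000_000, 250, 600)
    large = transactions_per_second(2_000_000, 250, 600)
    print(f"1 MB blocks: {small:.2f} tx/s, 2 MB blocks: {large:.2f} tx/s")
```

With the transaction size and block interval held constant, doubling the block size doubles the estimated throughput.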
However, the larger the block, the more computing power is needed to verify it. If the block size is increased significantly, then only the most powerful computers will be able to manage the processing power required to act as nodes.
The high cost of such computing hardware means that node pools will inevitably become smaller and more centralized, increasing the risk of a 51% attack. Increasing the block size also requires a hard fork, which threatens to split the community: if not all users accept the update, there will be two different chains using different coins. Increasing the block size may therefore not be a long-term solution.
The second proposal is to use altcoins so that various functions and applications are implemented on their own networks with their own coins.
Such a model improves performance because no single blockchain is overloaded, but it also increases security risks, because computing power is spread across multiple blockchains. Each network becomes easier to compromise, since far less computing power is needed to carry out a 51% attack on it.
Who uses sharding?
Zilliqa was the first platform to implement sharding. At the testnet stage, it reached 2,828 transactions per second.
The Near blockchain ecosystem enables developers to build and deploy decentralized applications. Near calls itself a “PoS sharded blockchain” and claims that its sharding technology keeps nodes small enough to function on low-performance devices — potentially even mobile phones.
Ethereum offers a blockchain ecosystem for implementing smart contract-based DApps. The Ethereum Foundation plans to include sharding in the updated version of the Ethereum 2.0 protocol.
Other projects working with sharding include Cardano, QuarkChain, and PChain.
What is the future of sharding?
Sharding technology is featured in the white paper of the Libra digital currency. Ahead of the launch, Facebook acquired Chainspace, whose development team specializes in sharding. The specific details are still unknown, but it can be assumed that a kind of sharding will be introduced into the Libra blockchain.
Sharding could theoretically be a solution to the so-called blockchain trilemma.
The blockchain trilemma, as described by Vitalik Buterin, is that a blockchain can achieve only two of three key properties at once: security, decentralization, and scalability. If the difficulties facing sharding are overcome, distributed networks could be scaled without sacrificing decentralization or security.