Turbine Block Propagation and Data Availability

Alperen Tunçkıran
14 min readFeb 24, 2024

--

1) Introduction to Solana

Solana is an incredibly fast and innovative decentralized blockchain project that has revolutionized the ecosystem. With its unique perspective on scalability, Solana has gained a significant advantage over its competitors.

In this article, we’ll dive into the exciting world of the Turbine Block Propagation mechanism and explore how Solana is tackling the Data Availability Problem with ease. Before we proceed, let me explain how Solana works using an analogy!

Solana operates on a democratic system where Members of Parliament (MPs) have voting power proportional to the amount of money they hold. You can vouch for an MP who will vote on your behalf in national decisions. The MP you choose will have more voting power and will represent you in the vote.

MPs record their votes on paper. With each vote requiring the purchase of a sheet. After each vote, a sum of money is printed and distributed to the MPs. The amount received by each MP is proportional to the money pledged, and those who pledged receive a share of the MP’s earnings.
All trades between individuals and all votes cast by MPs are subject to taxation. Half of the taxes are paid to the MPs who vote, and to the people who vouch for the MP in proportion to the money vouched for.

But here’s the exciting part; For a bill to be approved, at least 66% of the votes must be in favor. However, if less than 66% of the MPs votes, the parliament stops working.

Every 4 votes, the Speaker of the Parliament changes and the new Speaker takes over in a predetermined order among MPs. The MPs submit the demands of the people to the current Speaker and the Speaker sorts them. There is a limited time for this sorting. At the end of this time, the Speaker has to deliver the sorted list of demands to the MPs. The Speaker breaks down the list of demands in such a way that even if a few of the pieces are lost, the rest can be collected and the whole list can be reconstructed. These pieces go to first 5, then 25, then 125 MPs (these numbers are adjusted according to the intensity of the parliament at that moment). The first 5 MPs send them to the next 25 MPs, who in turn send them to the next 125 MPs. These groups also share among themselves the pieces they have received and reconstruct the list of demands.

Then the voting starts. If a list of demands is followed by +31 more lists of demands and these are approved, the first list of demands is considered accepted.

Now let’s dive into Solana!

  • The parliament is composed of validator nodes, and the amount of money is represented by the amount of SOL staked/delegated on the validator.
  • The public, who are daily users, delegate SOL coins to the validators, making it a truly decentralized system.
  • Transaction fees are referred to as tax, but don’t worry, they’re minimal.
  • List of demands are equivalent to blocks, which are created by requests from the public, i.e. our transactions. Solana uses the cutting-edge Tower-BFT Consensus, ensuring fast and secure transactions.
  • And to top it off, for a block to be approved, at least 66% of the validator vote weight must vote in favor. If more than 33% of validator core weight fail to vote, Tower-BFT and Solana will stop.

Solana distributes blocks to validators using a block propagation method called Turbine. The leader node divides the block into structures called “shreds” using Reed-Solomon Erasure Coding. These shreds are distributed to validators.

Thanks to Reed Solomon Erasure Coding, even if some shreds are lost or do not reach the validators, the block can still be reconstructed using the available shreds!

Now that we have a basic understanding of how Solana works, let’s explore its deeper aspects!

2) Turbine Block Propagation

Turbine Block Propagation is an absolutely crucial aspect of Solana Blockchain.

The distribution of block data to nodes in the network determines the speed and efficiency of the network, so it’s essential to choose the most optimal solution.

Solana produces blocks in as little as 400 milliseconds! Let’s consider a block of 250,000 transactions, 64MB in size, where each transaction consists of 256 bytes. Let the current leader in the network distribute this block to 1000 validators.

The simplest approach to this is for the leader to send this 64MB data to 1000 validators one by one. But how efficient is this for a network like Solana that produces a block every 400ms?

Instead, Solana uses Turbine to solve this problem. For a 64MB block, the current leader divides it into 1000 equal chunks of 64KB. It then distributes each small piece of data to validators, using randomized roadmaps generated by Turbine.

2.1) Transport Layer Network Protocols

If you are programming a blockchain protocol, you need to use a data transfer protocol for data transfers on the network. The three most commonly used protocols are TCP (Transmission Control Protocol), UDP (User Datagram Protocol) and QUIC (Quick UDP Internet Connections).

Solana uses QUIC Protocol to transfer data quickly and with low probability of data loss.

QUIC is more secure than UDP. It is faster than TCP. But it is neither as fast as UDP nor as secure as TCP in terms of data transmission.

->TCP sends data completely and securely, due to the security protocols used on it. Data integrity is important for TCP.

-> UDP, on the other hand, does not care much about security protocols and just wants to make sure that the data is sent.

  • > QUIC is right in between. It ensures both security and ensures that data packets reach the other party as completely as possible.

2.2) Shred Structure

The Leader Node breaks blocks into MTU (Maximum Transmission Unit)-sized data shreds through a process called “shredding”.

The Leader Node also generates “Recovery Shreds” through Reed-Solomon Erasure Coding. If any of shreds are lost or corrupted during transfer between validators, the entire block can be reconstructed using the recovery shreds without requiring the missing pieces.

This implementation is absolutely crucial for two reasons.

  • Firstly, in decentralized networks, malicious participants may modify data during transmission and send incorrect data to validators. When a block is reconstructed using shreds, this incorrect data is deemed invalid and cannot be included in the block, thus preventing fraud.
  • Secondly, fast data transmission protocols like QUIC can result in damage or loss of transmitted data. Recovery Shreds can reconstruct the entire block without the need for lost or damaged data.

Solana uses a 32:32 FEC ratio with Reed-Solomon Erasure Coding, a specific type of Forward Error Correction algorithm.

If 32 out of 64 Shreds are lost or damaged, the remaining 32 can reconstruct the entire block! Reed Solomon Erasure Coding will be discussed in detail in the Data Availability Problem section.

To increase network efficiency, we have the Turbine Tree. The Leader Node sends generated shreds to the Turbine Tree, which then distributes them to “Neighborhood” validators for processing. Validators are randomly shuffled into subgroups after each slot. Validators also receive shreds from other “Neighborhood” validators.

Nodes represent the Validators. In every hop, nodes share their shreds with each other.

The number of validators in a group is determined by Turbine using the DATA_PLANE_OUT parameter, which is based on the total number of validator nodes in the network!

In a network with DATA_PLANE_OUT=200, the leader can reach up to 40,000 validators in 2 hops on the resulting tree. Assuming each connection takes an average of 100ms to resolve, this deployment takes an average of 200ms, which is half the block time!

3) Data Availability Problem

Turbine’s solution to block distribution is not only efficient and secure, but also exciting! It ensures lightning-fast delivery to nodes, making it a top choice for blockchain enthusiasts.

While the validity of the data inside the blocks is dependent on the consensus reached by the validators, we must remain vigilant against malicious actors who may try to manipulate the system.

But who do we, as users, trust for the data inside the blocks? Of course, the consensus reached by the validators.

But what if the majority of validators’ votes fell into the hands of malicious actors and applications such as Double Spending existed on the network?
Or what if these malicious people censor the network and list only the transactions they want in the way they want?

The first is to run a full node and become a validator. However, this option may be unnecessary and costly for most people, especially those in countries with poor economic situations. Therefore, we need to find a simpler and more accessible method.

The second method is to run a Light Client. Light Clients can be run on lower hardware requirements as they don’t download the entire block data, don’t participate in consensus, or don’t execute! They only verify data from validator nodes by Data Availability Sampling.

3.1) Data Availability Sampling

There is a Block Header section within blocks. The Block Header contains a single hash value that serves as a common proof for all transactions. This is created using a data structure called a Merkle Tree.

For example, in a block with 8 transactions, the hash outputs are grouped in binary and a common hash output is taken for each group.

This process is repeated until a single hash output is obtained, which represents the Merkle Root. The Merkle Root is located in the Block Header.

When Light Clients download the Block Header, they get to request a Merkle Proof for specific transactions. To verify the accuracy and completeness of the 8th transaction, they request a Merkle Proof from their connected validators.

These validators provide hash values as proof to the Light Clients to demonstrate the existence and accuracy of the requested transaction. The image shows that hash outputs coloured in blue MUST be sent to the Light Clients to prove the integrity and correctness of the 8th transaction!

In the event that most of the network has been compromised by malicious actors and only one honest validator node remains, the validator has the opportunity to restore the entire network by preparing a Fraud Proof.

Let’s simulate a blockchain. The network’s task is to determine whether a number is prime or not, starting from 2 and increasing the number. While it is easy to reach a consensus for small numbers, larger numbers such
as 2147483651 pose a challenge if the majority of nodes may be controlled by malicious parties. If there is only one honest node remaining in the network and the malicious nodes declare 2147483651 as prime, the honest node will detect fraud. 2147483651 is divisible by 105991.

The role of our Light Clients is to monitor the network and request an exact divisor from our validator nodes for every non-prime number. Our Honest Node generates a Fraud Proof, which states that 105991 is a prime divisor of 2147483651.

Our Light Clients verify this proof. If the proof is correct, the Honest Nodes will stop listening to nodes that accept 2147483651 as a prime number in the consensus vote.

Solana’s block structure and block distribution model are slightly different from other blockchain networks. For this reason, the Light Client design also differs. Let’s try to understand how Solana Light Clients work through TinyDancer, Solana’s first Light Client. Then, let’s examine the methods planned to be implemented.

3.2 ) TinyDancer

TinyDancer is Solana’s first Light Client. TinyDancer is specifically a Diet Client. The idea was presented by Anatoly Yakovenko. It performs Data Availability Sampling with Reed-Solomon Erasure Coding, which Solana already uses with Turbine.

Solana shreds contain a payload section. The Shred Root in the payload is a root created by turning the shreds of that block into a Merkle Tree. This Shred Root is signed by the Leader Node that created the shreds and the root.

TinyDancer clients connect to Solana Clusters and request a shred from these nodes. This is how they obtain the Shred Root.

The working logic of Shred Tree and Shred Root is the same as Merkle Tree and Merkle Root described above.

Validators create a Merkle Proof that goes through the shred requested by the Light Client to the Shred Root and share this Proof with Light Clients. The Light Clients compare the Shred Root resulting from the proof with the Root they have. They also check the signature of the Leader Node.

TinyDancer Clients store the shreds they receive from validators throughout the epoch. They also send these shreds to other TinyDancer Clients. When a new validator node tries to synchronise with the network, it can quickly receive these shreds from Light Clients and synchronise with the network.

In addition, if a validator falls off the network or fails to collect enough shreds to create the block, it can retrieve the missing shreds from Light Clients and quickly create and validate blocks.

3.3) Other Approaches

For full Data Availability Sampling, a Protocol-Level Data Availability Layer must be established.

TinyDancer, Solana’s first and only Light Client, checks the integrity of shreds and therefore the integrity of blocks. For advanced Data Availability Sampling we need to be able to check individual transactions.

There are proposals for this on Solana that have been accepted but not yet implemented.

3.3.1) Simple payment and State Verification

I mentioned Light Clients above. They allow us to validate transactions at a much lower cost by requesting small proofs of specific transactions from validators via RPCs, rather than validating the entire chain.

These little proofs on the proposal are called “receipts”.

To access these receipts, we first need to create a Merkle Tree. Solana offers a different approach to other networks due to its account structure.

The “Entry Merkle” is a list of all transactions in an entry sorted by validator signatures. Solana nodes already store transactions with hash outputs.

With Entry Merkle we can verify that a transaction “T” is contained in an entry “E”.

The root of all Entry Merkle structures within a block is called the “Block Merkle”. We can verify that a transaction “T” is contained in a Block Merkle in the same way using Merkle Proof.

Block Merkle transforms the hash output of the transactions of a block in the Solana network into a Merkle tree.

Due to the Solana account structure, we need to keep a common account hash, called “Account Hash”, for accounts whose state changes within a slot.

The common hash output of the Account Hash and Block Hash values gives us the “Bank Hash” value.

Bank Hash is the common summary of all account state changes and transactions in a slot.

The Block Header structure contains the Bank Hash. In order to validate a transaction, Light Clients need to download the Block Header and request a Merkle Proof, according to the Bank Hash value block header contains, from the validator cluster they are connected to.

This Merkle Proof is called the Transaction Inclusion Proof.

Of course, there is a prerequisite for Transaction Inclusion Proof; the block must be optimistically verified.

In Solana, a block is Optimistically Verified when more than 66% of the Validator vote weight in the network votes for approval.

Light clients must request an Optimistic Confirmation Proof before requesting a Transaction Inclusion Proof. Or both proofs should be sent to light clients at the same time.

According to the solution presented in the proposal, light clients can also perform “Account State Verification”. First, a new instruction should be added to the network. This instruction records the state change of an account as a transaction in the network. Then, with Transaction Inclusion Proof, light clients will be able to verify account states.

3.3.2) Inter-chain Transaction Verification

The light client structure we’ve been talking about so far consists of off-chain nodes. We have been sending various proofs and data to these clients via RPCs.

But we can also do this proofing with smart contracts that we will deploy on the Solana. So we don’t have to rely on a third-party network and its participants. With its high performance and efficiency, the Solana network is well suited for generating different proofs on-chain and storing them for a certain period of time.

The Solana Developers have accepted a proposal where we can store proofs on-chain that can be used for Data Availability Sampling.

Within the proposal, an on-chain SPV Program will be deployed as a smart contract. The SPV program will act as a marketplace for SPV Proofs. It will be used for chain users/nodes to submit and request proofs.

Multiple SPV programs will be deployed and will be able to connect to networks outside of Solana. Transactions of simple payment networks like Bitcoin as well as network state changes of complex state machines like Ethereum can be controlled with high-level APIs.

The SPV Program will rely on a component called the SPV Engine to statelessly validate the SPV Proofs within it in a way that does not overwhelm and bind the Solana network. The structure of the SPV Engine can be tailored to the network whose proofs it processes, but it can also process proofs from within a standard SPV Program to develop a multi-chain ecosystem.

For the purposes of Proof Requests, the Proof Requester will most often be a Solana Contract, and these contracts will be referred to as “Program Clients”. Clients will be able to filter the proofs they request, creating a simple verification scheme for atomic swaps or programs connected to the banking system.

Once the client request has been sent, if the proof is successfully verified, the progress of the request can be tracked through the SPV Program and a “proof request account” will be created. Provers write and store the details of the verified Proof Request in the Proof Request Account. Clients can follow the process through the SPV Engine and query valid transactions with their proofs.

4) Conculusion

Solana was announced with new 8 technology for blockchains. My favourite innovation among these is Turbine.

Solana is generating blocks like an engine every 400ms. It needs to send these blocks to hundreds of validator clusters distributed around the world and these validators need to validate the blocks. There is a continuous processing and communication load on the network.

With Turbine, we reduce this communication and processing load very well. We also offer the opportunity to maximize the security of the network with Reed-Solomon Erasure Coding. Light Clients can quickly collect block fragments that have already been erasure coded and divided into small shreds and verify consensus and transactions.

5) Further Reading & References

Except the in-text references, you can find the sources I used to write this article. To undertand the topic better, I suggest you to read these articles/papers.

-> Turbine: Block Propagation by Ryan Chern (Helius)

-> Fraud and Data Availability Proofs: Maximising Light Client Security and Scaling Blockchains with Dishonest Majorities by Mustafa Al-Bassam (Celestia Labs), Vitalik Buterin (Ethereum) & Alberto Sonnino (Mysten Labs)

-> Turbine — Solana’s Block Propagation Protocol Solves the Scalability Trilemma by Anatoly Yakovenko (Solana Labs)

-> Solana and the 8 Key Innovations by BlockofChain :)

--

--

No responses yet