Motivation
The primary motivation of this suggestion is to let StarkNet have fast finality, which is economically backed by a lot of stake. We also try to leverage Tendermint - a well-established BFT protocol deployed on numerous blockchains.
On the downside, due to the quadratic message complexity of Tendermint, this suggestion limits (in practice) the number of sequencers to a few dozen. However, since sequencers can’t include invalid transactions (everything is proven to Ethereum), it is appealing to use a more performant algorithm with only a few active validators at a time, compared to similar potential designs for L1s.
Stake management
Since Tendermint includes slashing, all the stake has to be managed on L2.
Sequencing
Proposers (i.e., sequencers) will be chosen according to their stake. The randomness source and algorithm for selecting the specific sequencer for each slot are out of scope.
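To make "chosen according to their stake" concrete, here is a minimal sketch of stake-weighted selection. It is an illustration only: the randomness source and the actual algorithm are out of scope for this suggestion, so the function name `select_proposer`, the `seed` input, and the use of SHA-256 are all assumptions.

```python
import hashlib
from bisect import bisect_right
from itertools import accumulate


def select_proposer(stakes: dict[str, int], seed: bytes, slot: int) -> str:
    """Pick one sequencer for `slot` with probability proportional to stake.

    Hypothetical helper: `seed` stands in for whatever randomness source
    the protocol ends up using.
    """
    # Sort names so every node derives the same ordering.
    names = sorted(stakes)
    cumulative = list(accumulate(stakes[n] for n in names))
    total = cumulative[-1]
    if total <= 0:
        raise ValueError("total stake must be positive")
    # Derive a pseudo-random ticket in [0, total) from the seed and the slot.
    digest = hashlib.sha256(seed + slot.to_bytes(8, "big")).digest()
    ticket = int.from_bytes(digest, "big") % total
    # The sequencer whose stake interval contains the ticket is selected.
    return names[bisect_right(cumulative, ticket)]


stakes = {"alice": 50, "bob": 30, "carol": 20}
print(select_proposer(stakes, b"example-seed", slot=7))
```

A sequencer with zero stake owns an empty interval and is never selected. The modulo step introduces a negligible bias, which a real design may want to avoid.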
Each proposer will be chosen for a short period (around 2-10 seconds) and is expected to behave as described by the Tendermint protocol. If you are not familiar with Tendermint, reading its whitepaper is recommended. If you only need a refresher, here is the Tendermint flow in one diagram:
[Figure: Tendermint flow diagram]
To align honest and rational behaviors, we add the following slashing rules to Tendermint (similar to what the Cosmos SDK does):
- Preventing double voting: sequencers that sign more than one block in the same round will be slashed.
- Preventing idle sequencers: the last round in which each sequencer signed will become part of the StarkNet state. Sequencers that are idle for too long will be slashed.
- Preventing inconsistency caused by not respecting “Tendermint locks”: sequencers that violate their lock will be slashed.
Once a block of StarkNet is committed, it is final and cannot be reverted unless >⅓ of the stake of the system is slashed.
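The double-voting and idleness rules from the list above can be sketched as simple checks over signed votes. This is a simplified illustration: the `Vote` record, the function names, and the `max_idle_rounds` parameter are hypothetical, and the sketch ignores the distinction between Tendermint vote types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    # Hypothetical, simplified vote record.
    signer: str
    height: int
    round: int
    block_hash: str


def find_double_voters(votes: list[Vote]) -> set[str]:
    """Return signers that signed two different blocks in the same round."""
    first_seen: dict[tuple[str, int, int], str] = {}
    offenders: set[str] = set()
    for v in votes:
        key = (v.signer, v.height, v.round)
        # Remember the first block signed; any different block is an offence.
        if first_seen.setdefault(key, v.block_hash) != v.block_hash:
            offenders.add(v.signer)
    return offenders


def find_idle_sequencers(
    last_signed_round: dict[str, int], current_round: int, max_idle_rounds: int
) -> set[str]:
    """Return sequencers whose last signature is older than the allowed gap."""
    return {
        signer
        for signer, last in last_signed_round.items()
        if current_round - last > max_idle_rounds
    }


votes = [
    Vote("alice", 10, 0, "0xaa"),
    Vote("alice", 10, 0, "0xbb"),  # conflicting signature in the same round
    Vote("bob", 10, 0, "0xaa"),
]
print(find_double_voters(votes))
print(find_idle_sequencers({"alice": 95, "bob": 20}, current_round=100, max_idle_rounds=50))
```

Re-sending the same signature is not treated as an offence here; only two different block hashes for the same signer, height, and round are.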
Proving
To couple incentives between the sequencer and prover in the best way possible, we want to create a protocol in which any sequencer might be requested to become a prover. Such a property has several advantages:
- It saves us the need to split the fees between sequencer and prover. Participants earn from being sequencers, and being a prover can be viewed as a “community service”.
- It ensures that sequencers never sign blindly without verifying, since there is a real chance the protocol will request them to prove the block.
However, such a design also has several notable disadvantages:
- It means that every sequencer needs to be able to run a prover. This might, de facto, reduce decentralization.
- To “keep the element of surprise,” the protocol’s latency and the number of blocks proved each time would be suboptimal.
To be concrete:
- L1 keeps track of the round number of the last block it expects to prove and the current round number on StarkNet (initially, they are both zero).
- After at least MAX_ROUNDS rounds have elapsed on StarkNet, fresh randomness on L1 determines:
  - How many rounds, between 1 and MAX_ROUNDS, will be proven. We denote this number by R.
  - Who, among the quorum that signed in the R-th round, will be the prover.
Crucially, observe that while participating in Tendermint, the sequencers know neither which block will be the last one to be proven next nor how the prover will be drawn from this block’s signers. This means that it’s impossible to be a sequencer without being prepared to act as a prover.
The consensus transactions (signatures of the signing nodes, proof that the weight for each block is sufficient, slashing transactions) are proved alongside the “applicative transactions” as part of the proof. Also, consensus on the block that comes after the last-to-be-proven block is proved as part of the proof. This guarantees that no sequencer “messes” with the signatures on the last block.
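As an illustration of the two draws described above, the sketch below derives R and the prover from a single random seed. The value of `MAX_ROUNDS`, the SHA-256 derivation, and uniform sampling among the signers are all assumptions; the suggestion does not specify whether the prover draw is uniform or stake-weighted.

```python
import hashlib

MAX_ROUNDS = 100  # hypothetical value; the suggestion does not fix it


def _draw(seed: bytes, label: bytes, modulus: int) -> int:
    """Derive an integer in [0, modulus) from the seed, separated by label."""
    digest = hashlib.sha256(label + b"|" + seed).digest()
    return int.from_bytes(digest, "big") % modulus


def sample_round_count(seed: bytes, max_rounds: int = MAX_ROUNDS) -> int:
    """Draw R, the number of rounds to prove, from {1, ..., max_rounds}."""
    return 1 + _draw(seed, b"rounds", max_rounds)


def sample_prover(seed: bytes, signers: list[str]) -> str:
    """Draw the prover among the signers (uniformly, as an assumption)."""
    if not signers:
        raise ValueError("the signing quorum must not be empty")
    ordered = sorted(set(signers))  # canonical order, independent of input order
    return ordered[_draw(seed, b"prover", len(ordered))]


seed = b"fresh-l1-randomness"  # placeholder for the L1 randomness
R = sample_round_count(seed)
print(R, sample_prover(seed, ["alice", "bob", "carol"]))
```

Because the seed only becomes available after the rounds have taken place, no signer can tell in advance whether a given block will be the last one proven or whether they will be drawn.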
In addition, the identity of the eligible prover (i.e., the sampling algorithm given the signatures on the height-X block) is also part of the proof.
The selected prover has a limited time to generate a proof: T_1 hours after the previous proof has been submitted. If the selected prover fails to produce a proof (either because it is malicious or due to an honest failure such as a power outage), anyone can submit a proof during T_1 additional hours. Should this happen, the protocol slashes the original prover and rewards the proof submitter.
As a rule of thumb, it is safe to assume that a proof will arrive by the end of the second window, since a missing proof would cause all the sequencers to lose some stake. This edge case is treated in detail below.
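The two proving windows can be sketched as a small decision function. This is one possible reading of the rules above, not a specification: it assumes that only the selected prover may submit during the first window, and the names `ProofOutcome` and `handle_proof_submission` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProofOutcome:
    accepted: bool
    slash_selected_prover: bool
    reward_to: Optional[str]


def handle_proof_submission(
    submitter: str,
    selected_prover: str,
    hours_since_previous_proof: float,
    t1_hours: float,
) -> ProofOutcome:
    """Decide what happens to a proof submitted at a given time."""
    if hours_since_previous_proof <= t1_hours:
        # First window: reserved (by assumption) for the selected prover.
        accepted = submitter == selected_prover
        return ProofOutcome(accepted, False, None)
    if hours_since_previous_proof <= 2 * t1_hours:
        # Second window: anyone may submit; the original prover is slashed
        # and the submitter is rewarded.
        return ProofOutcome(True, True, submitter)
    # Both windows elapsed: left to the inactivity handling.
    return ProofOutcome(False, False, None)


print(handle_proof_submission("alice", "alice", 3, t1_hours=10))
print(handle_proof_submission("bob", "alice", 15, t1_hours=10))
```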
Note:
- There is an edge case hiding here: not all Tendermint “rounds” end with consensus on a new block, and if there was no agreement in a round, there is nothing to prove. Therefore, to make the proven quota well-defined, the last block included in the proof is actually the first block with a round number greater than R.
- Unlike the previous suggestion, there is no “commitment” mechanism. Instead, L1 assumes a certain L2_blocks/L1_blocks rate, and consensus on the new state is attested in the proof itself.
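The rule "the first block with a round number greater than R" can be illustrated with a short lookup. The sketch assumes a single increasing round counter across blocks and treats `target_round` as the absolute round number derived from R; both the function name and this representation are hypothetical.

```python
from bisect import bisect_right
from typing import Optional


def last_block_to_prove(committed_rounds: list[int], target_round: int) -> Optional[int]:
    """Return the index of the first block committed in a round > target_round.

    `committed_rounds[i]` is the round in which block i was committed, in
    ascending order. Rounds that ended without consensus simply do not appear.
    Returns None if no such block has been committed yet.
    """
    i = bisect_right(committed_rounds, target_round)
    return i if i < len(committed_rounds) else None


# Rounds 3, 5 and 6 ended without a block, so they are missing from the list.
rounds = [1, 2, 4, 7, 8]
print(last_block_to_prove(rounds, target_round=5))
```

In this example round 5 produced no block, so the proof extends to the block committed in round 7.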
When a proof doesn’t arrive - Handling inactivity in the network
Inactivity happens when L1 expects a new proof to be submitted, yet no such proof arrives. On L2, inactivity might occur due to two possible scenarios:
- A large portion of the stake went offline at once, leaving the L2 state unable to progress. (Note that it must happen at once; the slashing rules mitigate a gradual disappearance of stakers.)
- The stakers agreed on an invalid block that cannot be proven
We need to develop a mechanism that will allow the state to advance (and will reduce the stake of dishonest parties while not hurting the stake of honest parties).
This is challenging, as L1 is unaware of the reason behind the delay and has no information on who should be punished.
The suggestion is to allow this to happen through:
- Allowing state updates to occur even when less than ⅔ of the stake agree on a valid state. These unique “state updates” need to happen only when L1 “gave up” on continuing with ⅔ of the stake, and the stake threshold required to progress has to decrease slowly.
- Slashing stakers that didn’t participate in the new state update. This is done so the stake of the sequencers that didn’t participate in the chain will decline fast, and we will return fast to the stable state.
- Slashing, a bit, even the active stakers, to incentivize them to submit a proof as fast as they can (and not, for example, censor another major staker and wait for the next time window).
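The three ingredients above can be sketched together as follows. Every number here is a placeholder: the decay step, the floor of the threshold, and both slashing rates are hypothetical parameters chosen only to show the shape of the mechanism.

```python
from fractions import Fraction


def required_stake_fraction(
    missed_windows: int,
    step: Fraction = Fraction(1, 100),
    floor: Fraction = Fraction(1, 3),
) -> Fraction:
    """Threshold starts at 2/3 and decreases slowly per missed proof window."""
    return max(floor, Fraction(2, 3) - missed_windows * step)


def can_update_state(stakes: dict[str, int], participants: set[str], missed_windows: int) -> bool:
    """Check whether the participating stake exceeds the current threshold."""
    total = sum(stakes.values())
    signed = sum(stake for name, stake in stakes.items() if name in participants)
    return Fraction(signed, total) > required_stake_fraction(missed_windows)


def apply_inactivity_slashing(
    stakes: dict[str, int],
    participants: set[str],
    absent_rate: Fraction = Fraction(1, 10),
    active_rate: Fraction = Fraction(1, 100),
) -> dict[str, int]:
    """Slash absent stakers heavily and active stakers slightly."""
    return {
        name: stake - int(stake * (active_rate if name in participants else absent_rate))
        for name, stake in stakes.items()
    }


stakes = {"alice": 1000, "bob": 1000, "carol": 1000}
active = {"alice", "bob"}
print(can_update_state(stakes, active, missed_windows=0))
print(can_update_state(stakes, active, missed_windows=1))
print(apply_inactivity_slashing(stakes, active))
```

With exactly ⅔ of the stake active, the update is rejected under the normal threshold and accepted once the threshold has decreased by one step. After slashing, the absent staker's share of the total stake shrinks, which is what moves the system back toward the stable state.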