Simple Decentralized Protocol Proposal

This post aims to describe a simple complete candidate for a decentralized Starknet protocol. It should be viewed as a proposal for “phase 1” of a gradual evolution that may introduce more sophisticated mechanisms and flows. More specifically, this proposal aims for maximal simplicity under the assumption that more than ⅔ of stake is honest: they follow the protocol without modification.

The proposal is structured in five parts:

  1. High level overview
  2. Time in the protocol
  3. The basic processes
  4. Recovery mechanism
  5. Summary

Note that this post does not aim to provide specifications. There are many small design choices and optimizations, which result in a large variety of different low-level descriptions. The goal is to provide a bird’s-eye view of the protocol that will encourage the community to cooperate on it, whether by discussion or by creation!

For readers who want some background, here is a link to the first of a series of seven posts on decentralization which documents the thought process leading to this proposal.

High level overview

We divide the Starknet protocol into four layers: leader/proposer elections, consensus, proving, and L1 state updates. In this section we give a very high level overview of these four layers.

  1. L1 staking and leader elections. Starknet will use a leader-based consensus protocol, with proof-of-stake for Sybil resistance. The role of the first layer is to allow people to join and leave the set of stakers, and to allow all L1 full nodes to determine the present stake distribution and upcoming leader schedule. This previous forum post on leader elections gives a good account of the proposal. In this post we elaborate on how exactly slots are defined.

  2. L2 consensus. We want to decentralize the block proposal layer from a single proposer to a network of proposers. To ensure this network reaches agreement on the chain, we need a consensus protocol. Starknet will use a variant of Tendermint, a leader-based PoS consensus protocol. For more information, the Tendermint paper is a very lucid source. There is also this forum post on Tendermint for Starknet which discusses some potential modifications for our context.

  3. Proving. Decentralizing the block proposal layer is necessary but insufficient – we must also decentralize the proving layer to mitigate liveness attacks. Our proposal is to solve this problem with a chained proof protocol, which we sketch now. In a nutshell, the idea (motivated by Mina) is to require every block proposal to also include a proof of a previous block. That way, liveness of the chain is coupled to liveness of the proving layer. The question is: what is the proving job of each proposer? Here’s the answer. First, divide all of time into a sequence of numbered slots (we will define them in the next section). Next, classify the slots into k strands by the remainder of the slot number modulo k. Here’s a picture of 12 slots in three strands. Recall that the leader schedule associates a unique leader with each slot, but a slot may also end without reaching consensus on a new block. Hence slots are potentially empty, like the eighth slot in the picture. Now the proving rule:

    1. The first proposer in each strand only proposes a block without proving anything.

    2. The second proposer in each strand must propose a block alongside a proof of the previous block in its strand.

    3. Subsequent proposers must propose a block, alongside a proof of:

      1. The previous block in its strand,
      2. The verification of the proof appended to that block.

    Note that this job designation recovers from empty slots, as illustrated here. The strands are recursive; in the case of three strands, the statement proven at slot 10 is the state transition of block 7, alongside the verification of the proof in block 7, which itself proves the block at slot 4 alongside the verification of the proof in block 4, and so on. Hence, verifying just one proof from each strand suffices to prove a prefix of the entire ledger. (A code sketch of this rule appears right after this list.) For more information, we recommend reading the post on chained proof protocols.

  4. L1 state updates. So far we’ve had leaders collectively agree on a chain of blocks that include proofs. To reach L1 security, we must perform an L1 state update. This procedure requires two things:

    1. Convince the verifier of the proof’s validity,
    2. Convince the Starknet L1 contracts that the state transition is backed by consensus.

    The chained proof protocol we have outlined so far, however, does not produce a single “continuous” proof of a sequence of Starknet blocks which is eligible for submission to L1. To this end, we must have a merging process that “braids” the strands and performs the necessary bookkeeping to actually compose the individual state transitions. For instance, given state transitions A→B, B→C, C→D, we’d like to merge them into a single state transition A→D without any reference to intermediate steps on L1 (a code sketch of this merging step appears at the end of this overview). This merging process is time-consuming, and we propose to handle it by giving the stakers long turns for performing it. These turns will be given as windows of sufficiently many L1 blocks, and will serve both as time for merging the strands and as time for submitting the relevant transactions to L1.
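
To make the proving rule concrete, here is a minimal Python sketch. It is purely illustrative: the function names, the zero-based slot numbering, and the representation of empty slots as a set are our own assumptions, not part of any specification.

```python
from typing import Optional

K = 3  # number of strands, as in the three-strand picture above

def strand_of(slot: int, k: int = K) -> int:
    """A slot belongs to the strand given by its number modulo k."""
    return slot % k

def previous_slot_in_strand(slot: int, filled: set, k: int = K) -> Optional[int]:
    """Latest earlier slot in the same strand that actually produced a block;
    empty slots (rounds that ended without consensus) are skipped."""
    s = slot - k
    while s >= 0:
        if s in filled:
            return s
        s -= k
    return None  # rule 1: the first proposer in a strand proves nothing

def proving_job(slot: int, filled: set, k: int = K) -> Optional[int]:
    """The proposer at `slot` must prove the block at the returned slot,
    together with the verification of the proof appended to that block
    (rule 3; under rule 2 the previous block carries no proof to verify).
    Returns None when there is nothing to prove."""
    return previous_slot_in_strand(slot, filled, k)

# Example: 12 slots in three strands, with slot 8 empty (no consensus reached).
filled = set(range(12)) - {8}
assert proving_job(10, filled) == 7   # slot 10 proves block 7 ...
assert proving_job(7, filled) == 4    # ... which itself proved block 4, etc.
assert proving_job(11, filled) == 5   # slot 11 skips the empty slot 8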

That’s it! We’ve gone from joining the staker set, to computing the leader schedule, to consensus, to proving, to L1 state updates.
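
Before moving on, here is a minimal sketch of the merging (“braiding”) step from item 4 above. The `Transition` type and `merge` function are illustrative stand-ins of our own: the real process recursively merges STARK proofs, not plain records, but the composition rule is the same.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Transition:
    pre_state: str    # state commitment before the span of blocks, e.g. "A"
    post_state: str   # state commitment after the span of blocks, e.g. "B"

def merge(left: Transition, right: Transition) -> Transition:
    """Compose two adjacent transitions; in the real protocol this step is a
    recursive proof that verifies the two input proofs."""
    assert left.post_state == right.pre_state, "transitions must be adjacent"
    return Transition(left.pre_state, right.post_state)

def merge_all(transitions: list) -> Transition:
    """Fold A->B, B->C, C->D, ... into a single A->D transition."""
    result = transitions[0]
    for t in transitions[1:]:
        result = merge(result, t)
    return result

steps = [Transition("A", "B"), Transition("B", "C"), Transition("C", "D")]
assert merge_all(steps) == Transition("A", "D")  # no intermediate states on L1
```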

Time in the protocol

In this section, we make precise the notions of time that were mentioned only vaguely above.

Slots

Since Starknet uses a chained proof protocol, it’s best to provide each block proposer with enough time, before its turn, to compute the proof required of it. To this end, we want a notion of time that is predictable, easy for sequencers to agree on, persists between blocks, and does not rely on subtle assumptions about clock synchronization. This measurement of time will serve as the foundation for the block proposer schedule (also referred to as leader elections). The chained proof post points out that the unpredictability of block numbers prevents them from serving as a good foundation in this respect. We need something else.

Fortunately, the Tendermint protocol supplies the goods. As mentioned in the post on Tendermint for Starknet, the Tendermint protocol has two intrinsic concepts pertaining to time, namely height and round. The height of a block equals its block number. Within each height there are rounds. After a round without consensus on a new block, the round is incremented. As soon as a block is appended to the chain, the height is incremented and the round resets to zero. Each new round (whether incremented or reset) is the turn of the next proposer. Thus rounds are the foundation for the proposer schedule in Tendermint. A small issue remains: rounds reset when new blocks are appended, so they do not persist. This is easily rectified: maintain an additional counter, called the slot number, which counts rounds without resetting.

In Tendermint for Starknet, slots serve as the foundation for the proposer schedule.

Remark. We could have avoided the new term “slot” by modifying the round number to persist between blocks. However, it seems reasonable to leave the canonical Tendermint terminology undisturbed, and just introduce a name for what we need.
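
For concreteness, here is a toy sketch of how the three counters move. The class name and the zero-based starting values are our own, purely illustrative.

```python
class TendermintClock:
    """Illustrative sketch: height and round behave as in Tendermint,
    while the slot counts rounds without ever resetting."""

    def __init__(self) -> None:
        self.height = 0  # block number currently being decided
        self.round = 0   # resets to zero whenever a block is appended
        self.slot = 0    # counts rounds without resetting

    def round_failed(self) -> None:
        """No consensus this round: the next round is the next proposer's turn."""
        self.round += 1
        self.slot += 1

    def block_decided(self) -> None:
        """A block was appended: height increments, round resets, slot keeps going."""
        self.height += 1
        self.round = 0
        self.slot += 1

clock = TendermintClock()
clock.block_decided()  # slot 0 produced a block; now height 1, round 0, slot 1
clock.round_failed()   # slot 1 ended empty;     now height 1, round 1, slot 2
clock.block_decided()  # slot 2 produced a block; now height 2, round 0, slot 3
assert (clock.height, clock.round, clock.slot) == (2, 0, 3)
```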

Epochs

Epochs are the granularity at which Starknet staking operations take effect. They are used for proposer scheduling, as explained in the leader elections post.

The basic processes

Monitor

The monitor process tracks the L1 state and the Starknet stake distribution. It also calculates the proposer schedule, associating a proposer with every slot. The monitor makes this information available to the other processes.
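
The actual election mechanism is the one described in the leader elections post; the following is only a plausible illustration of the monitor’s scheduling job. The function name, the sha256-based draw, and the stake figures are all our own assumptions.

```python
import hashlib
from bisect import bisect_right
from itertools import accumulate

def proposer_schedule(stakes: dict, epoch: int, slots: range) -> dict:
    """Assign a proposer to every slot of an epoch via a deterministic,
    stake-weighted pseudo-random draw seeded by (epoch, slot)."""
    stakers = sorted(stakes)                        # deterministic staker order
    cumulative = list(accumulate(stakes[s] for s in stakers))
    total = cumulative[-1]
    schedule = {}
    for slot in slots:
        seed = hashlib.sha256(f"{epoch}:{slot}".encode()).digest()
        draw = int.from_bytes(seed, "big") % total  # weight by stake
        schedule[slot] = stakers[bisect_right(cumulative, draw)]
    return schedule

# Hypothetical stake distribution; alice should win roughly 60% of slots.
stakes = {"alice": 60, "bob": 30, "carol": 10}
schedule = proposer_schedule(stakes, epoch=7, slots=range(12))
assert all(schedule[s] in stakes for s in range(12))
```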

Voter

The voter executes the Tendermint consensus protocol (participating in P2P, validating, and voting). Here’s a rough overview of the process.

  1. Constantly monitor L1:

    1. Calculate proposer schedule for each epoch as soon as possible.
    2. Track L1 state of Starknet.
  2. Constantly maintain log of L2 P2P messages.

  3. Upon receiving a new block:

    1. Validate proposer of slot according to proposer schedule.

    2. Broadcast further (e.g. by gossip).

    3. Validate block:

      1. Verify proof therein.
      2. Execute transactions.
      3. Verify execution matches state-diffs.
  4. Vote according to the Tendermint state machine.

In practice there may be subtle optimizations to avoid unnecessary work, e.g. some flows may avoid validating a block if there are too many votes that conflict with it.
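
Here is a hedged sketch of the voter’s handler for steps 3–4 above. Every helper is a hypothetical stand-in for a real component (P2P gossip, the STARK verifier, the execution engine, the Tendermint state machine).

```python
from dataclasses import dataclass

@dataclass
class Block:
    proposer: str
    transactions: list
    proof: bytes
    expected_state: dict  # stand-in for the block's state-diffs

def gossip(block: Block) -> None:
    """Stand-in for P2P broadcast (step 3.2)."""

def verify_proof(proof: bytes) -> bool:
    """Stand-in for the STARK verifier (step 3.3.1)."""
    return proof == b"valid"

def execute(state: dict, txs: list) -> dict:
    """Stand-in for transaction execution (step 3.3.2)."""
    return {**state, "nonce": state.get("nonce", 0) + len(txs)}

def on_new_block(block: Block, slot: int, schedule: dict, state: dict) -> str:
    if block.proposer != schedule[slot]:   # 3.1: wrong proposer for this slot
        return "NIL"
    gossip(block)                          # 3.2: broadcast further
    if not verify_proof(block.proof):      # 3.3.1: verify the proof therein
        return "NIL"
    new_state = execute(state, block.transactions)
    if new_state != block.expected_state:  # 3.3.3: execution must match diffs
        return "NIL"
    return "PREVOTE"                       # 4: hand off to the Tendermint machine

block = Block("alice", ["tx"], b"valid", {"nonce": 1})
assert on_new_block(block, slot=0, schedule={0: "alice"}, state={}) == "PREVOTE"
```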

Proposer

The proposer process calls the prover at the earliest time at which the required proof can be computed, and then builds its block when its turn arrives. Here’s a rough overview of the process.

  1. As soon as the previous slot in my strand ends, call the prover with the following job: prove the block of transactions in the latest non-empty slot in my strand, alongside the execution of the verifier program on the proof appended to that block (which attests to the preceding non-empty slot in the strand).

  2. When I receive the last transaction from the previous proposer’s stream (before consensus):

    1. Begin building a block with the most profitable transactions from the mempool.

    2. Upon consensus:

      1. If the previous proposer’s stream was finalized, continue building the block.
      2. Otherwise, return the rejected block’s transactions to the mempool and rebuild a block.
  3. When my block-building window ends, stop building and stream the remaining transaction(s).
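
Here is a hedged sketch of the block-building loop (steps 2–3 above); the proving call of step 1 is omitted. The fee-keyed heap anticipates a fee market where profitability per transaction is known; all names and values are hypothetical.

```python
import heapq
from time import monotonic

def build_block(mempool: list, rejected_txs: list, deadline: float) -> list:
    """Greedily take the most profitable transactions until the block-building
    window ends. Transactions are (fee, payload) pairs; the heap is keyed on
    -fee so the highest-fee transaction pops first."""
    heap = [(-fee, payload) for fee, payload in mempool + rejected_txs]
    heapq.heapify(heap)                     # 2.2.2: rejected txs rejoin the pool
    block = []
    while heap and monotonic() < deadline:  # 3: stop when the window ends
        neg_fee, payload = heapq.heappop(heap)
        block.append((-neg_fee, payload))   # 2.1: most profitable first
    return block

# Usage with a toy mempool and a half-second window (all values hypothetical).
mempool = [(5, "tx-a"), (9, "tx-b"), (1, "tx-c")]
block = build_block(mempool, rejected_txs=[], deadline=monotonic() + 0.5)
assert [fee for fee, _ in block] == [9, 5, 1]
```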

Prover

The prover computes the proofs required by the protocol. In this post we consider it a black box.

Updater

The updater process calls the prover to recursively merge strands, and then initiates and performs L1 state updates.

  1. As my window of N L1 blocks begins, I use the prover to recursively merge all strands to obtain a proof of a state transition from the current state of Starknet on L1.

  2. At n L1 blocks (roughly n/5 minutes, since L1 blocks arrive about every 12 seconds) before the end of my window:

    1. Wrap up the merging process.
    2. Initiate L1 state update by posting the proof and associated L1 data.
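
For concreteness, here is a hedged sketch of the updater’s scheduling decision. The window lengths, names, and callback style are all hypothetical; only the split of the window into a merging phase and a submission margin comes from the text above.

```python
N_WINDOW = 300  # N: my whole update window, in L1 blocks (hypothetical value)
N_MARGIN = 60   # n: blocks reserved at the end for the L1 submission

def updater_tick(window_start: int, current_l1_block: int,
                 merge_step, submit_step) -> None:
    """One scheduling decision: merge strands for most of the window, then
    spend the last n blocks (about n/5 minutes) posting the proof to L1."""
    elapsed = current_l1_block - window_start
    if elapsed < N_WINDOW - N_MARGIN:
        merge_step()   # 1: keep recursively merging strands into one proof
    elif elapsed < N_WINDOW:
        submit_step()  # 2: wrap up and post the proof plus associated L1 data

# Simulate my window, one tick every 50 L1 blocks.
events = []
for block in range(0, N_WINDOW, 50):
    updater_tick(0, block, lambda: events.append("merge"),
                 lambda: events.append("submit"))
assert events == ["merge"] * 5 + ["submit"]
```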

Recovery mechanism

Even under the honest majority assumption, bugs in either the sequencer or the prover may result in consensus on a state transition which cannot be proven. For this reason, we will add a recovery fork-choice rule: if there has not been an L1 state update for n L1 blocks, then honest nodes must fork away from the existing unproven L2 continuation of the L1 state.
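
A minimal sketch of this fork-choice rule, assuming a concrete (hypothetical) value for n and string stand-ins for chain heads:

```python
HEAD_LAG_LIMIT = 2000  # n: max L1 blocks without a state update (hypothetical)

def choose_head(last_update_l1_block: int, current_l1_block: int,
                l2_head: str, last_proven_state: str) -> str:
    """Recovery fork choice: if L1 has gone n blocks with no Starknet state
    update, abandon the unproven L2 continuation and fall back to the last
    state proven on L1."""
    if current_l1_block - last_update_l1_block >= HEAD_LAG_LIMIT:
        return last_proven_state  # fork away from the unprovable continuation
    return l2_head                # otherwise keep following L2 consensus

assert choose_head(100, 500, "l2-tip", "l1-state") == "l2-tip"
assert choose_head(100, 2100, "l2-tip", "l1-state") == "l1-state"
```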

Summary

This post outlined a complete proposal for a decentralized Starknet protocol. It seems simple enough, and the battle-testedness of Tendermint inspires hope in the protocol’s liveness and security properties. We’d love to hear your feedback to learn of potential concerns, caveats, recommendations, and improvement proposals!

This is actually massive and a positive move in the right direction.

We would end up paying heavily for the staked capital.

Short talk from Jon: https://www.youtube.com/watch?v=toPd1vgHjVE
Blog: https://dba.mirror.xyz/UTPfxWe65dYrUu_RJX-5VkAJypFRyw3AZh6m0dRXYZk

I mostly agree with Jon’s insights and I think everyone who will be deciding on this in Starknet should be aware of the analysis.

Thank you for posting all of this in public. I’ve really enjoyed going through all seven parts plus the other posts. It’s exciting to watch the decentralization of Starknet be developed in real time with the community’s input/feedback.

Question - Are users provided with a preliminary confirmation immediately after the Tendermint voting concludes, followed by a final confirmation once the merged strand is posted to L1? If so, how long is the expected time interval between these two confirmations, and how is transaction security ensured during this period?

@chaskinonchain Howdy! Happy you enjoyed reading!

  1. Yes, users will be provided with a preliminary confirmation upon Tendermint consensus, referred to as L2 finality. It can be observed by any Starknet node that follows the network, without requiring any execution of transactions.

  2. We are aiming toward a block time of no more than 10 seconds at first, so L2 finality will have similar latency. For the L1 state updates, it’s more difficult to say, since latency in the decentralized setting will likely fluctuate with L1 gas prices. For now, the L1 state update latency is several hours. I expect it to remain at an hour or more for at least a year or two.

  3. Great question. There are three main security threats:

    1. L2 consensus equivocates: users will see L2 finality on one chain, but a conflicting chain will reach L1 finality in its stead. To mitigate this danger, there will be a permissionless L1 slashing contract that will allow anyone to submit evidence of equivocation and severely slash the offending parties. (Note equivocation is always committed on purpose and therefore constitutes an unambiguous offense.)
    2. There is a liveness failure among the stakers in charge of performing L1 state updates. This concern is mitigated by allowing anyone, at any time, to perform an L1 state update provided they supply the required STARK proof. Hence, if you see/compute a proof of a chain with L2 finality, you can ensure its L1 finality. (The parties “in charge” are those who receive protocol-level compensation, but anyone can do it if they are so inclined.) Since chained proofs ensure the blocks contain proofs of past blocks, ideally not much proving computation will be needed to “cover” for idle provers.
    3. There’s a bug somewhere in the flow that prevents the chain with L2 finality from being promoted to L1 finality. Bugs are an inherent problem in every flow of every system, and there’s no elegant way around them. We are still thinking about how to tackle this sort of problem. The first candidate is delegation to social consensus for “cancelling L2 finality” and switching to a different Starknet chain in which the bug is either resolved or not present. The second candidate is some sort of governance mechanism for dealing with bugs, which may even have the ability to modify the L1 core contract. (The L1 contracts are outside the scope of L2 social consensus, so a critical bug in the L1 core contract necessitates some L1 modifications.)

To summarize, L2 finality will have lower latency (seconds vs. hours, for the foreseeable future) and lower security than L1 finality. It will be backed by a slashing mechanism meant only for equivocation, and by a permissionless L1 state update mechanism to ensure liveness.

Does this answer your questions?

@tkstanczak Hey Tomasz! Just to follow-up, we’ve read up on proof-of-governance and will start a round of discussions on it both internally and externally.

Thank you, that answers my questions. I appreciate the depth of your response. On a related note, I’m interested in learning more about the other slashing conditions. Is there a specific resource available that elaborates on this?

I appreciate the depth of your questions! Regarding slashing - no particular source comes to mind. We’d like to minimize slashing because it may scare off operators. For this reason we prefer to slash only unambiguously intentional offenses, which, ironically, all amount to equivocation, i.e. ambiguous behavior. Happy to hear your take on this!

I don’t see ordering mentioned anywhere outside of “building a block with the most profitable transactions from the mempool”?

It would be important to take this into account, especially as currently ordering differs a lot in different blockchains, from Ethereum’s dynamic gas to Solana’s FIFO.

From what I understand, most ORUs currently also operate on FIFO, as do many Tendermint-based chains. How will Starknet approach this?

Hi @Jommi! I agree the question of ordering txs inside a block is important. For now, Starknet operates on FIFO ordering. However, Starknet will introduce a fee market in an upcoming version. In particular, this fee market will reflect (in a naive way, disregarding MEV) the profitability of each transaction. Thus (again, disregarding MEV), the sequencer will sort its mempool by profitability and construct its blocks accordingly.

Does this answer the question?

Thanks for the reply king!

I think I understand your point - when describing such a complex system at an early stage we can’t go into the specifics of every single detail, so looking into ordering and MEV right now might be a bit early.

However, we should make sure to still keep it in mind, as we can see now in Ethereum how much pain it has caused and how diligently it is being combatted.