Starknet Decentralized Protocol I - Introduction

A few months ago we shared a preliminary version of our thoughts about the decentralization of Starknet. Since then we have absorbed feedback, consulted with experts, and expanded our thoughts to previously “blackboxed” areas such as leader election. The time is ripe for another round of feedback.

Our thought process about Starknet’s decentralized protocol will be presented in a series of posts.

  1. Overview and tight requirements that reflect the tradeoffs in question.
  2. A concrete suggestion for leader elections.
  3. Some trade-offs involving consensus.
  4. Thoughts and questions revolving around proofs in the protocol.
  5. Checkpoints for fast finality.
  6. The buffer problem.
  7. Chained proof protocols; braiding.

We highly recommend starting by reading this post for an outline of the architecture, and the problems specific to Starknet.

The (ongoing) planning for Starknet’s decentralization has been challenging and we have received valuable help. Thanks to Bartek Kiepuszewski and Joachim Neu for illuminating discussions on consensus and high level architecture.

Actors in the protocol

In a nutshell, Starknet’s decentralized scheme will use PoS Sybil resistance and feature two sets of actors: block proposers (often called sequencers) and provers who produce proofs of many blocks at a time. At a high level, the protocol answers three questions about each set:

  1. Who are the members?
  2. How are tasks assigned to members?
  3. What actions comprise each task?

In this section we will discuss the first two questions.

Choosing block proposers

The fundamental obstruction to consensus is honest forking – when honest agents diverge from an agreed-upon state. Protocols vary widely in their treatment of honest forks. For example, bitcoin allows them and is content with arguing they are resolved with high probability as time passes. On the other hand, Tendermint prevents any honest forks while operating within its security assumptions. A sound consensus protocol must resolve each possibility of honest forking, and providing such an analysis is challenging but necessary.

The root of honest forks is conflicting information among agents. Such conflicts are caused either by a faulty network or by malicious agents. Some consensus protocols reduce conflict possibilities by limiting the set of agents who can propose blocks at each time slot. Specifically, they assign a leader to each slot in a publicly verifiable way, allowing honest agents to disregard block proposals by non-leaders.

In some protocols, honest agents can disagree on the leader for a given slot, resulting in honest forks. This happens in bitcoin: honest agents A, B may receive conflicting children of the same block from distinct authors, and accept them if the PoW is valid. On the other hand, Tendermint requires consensus on the upcoming leadership schedule. Whether or not a protocol reaches consensus on future leaders is a worthwhile distinction, because a global leadership schedule simplifies protocol analysis: conflicting block proposals in a given slot are caused either by a faulty network or by a malicious leader.

For the above reason we have opted for a leader-based consensus protocol with global scheduling (the network agrees on the schedule in advance). Our prioritized requirements for leader election are detailed below, after the requirements for the consensus protocol.

Having decided on PoS, Starknet’s block proposers will be stakers, and their tasks for each slot will be assigned by the leader election protocol.

Remark. Recent leader-based BFT protocols (Bullshark and Fin) use DAGs to increase throughput by decoupling leadership from data propagation. Specifically, they allow all stakers to propose batches of transactions which are then ordered by the leader to form blocks (composed of batches). The symmetry of these protocols with respect to batchers also makes the DAG live during asynchrony and inactivity. We are considering some of these ideas for the Starknet protocol. In this setting, one may envision a decoupling of batchers and sequencers, in analogy to the decoupling of sequencers and provers. This would introduce another set of actors into the protocol. In our view, the gap in resources required for batching and sequencing is relatively small and does not warrant explicit separation of these roles in the protocol. Thus, a DAG-based Starknet will still consist of just “sequencers” and “provers”, and does not warrant a separate high-level treatment.

Choosing provers

The Starknet L1 state update mechanism requires a STARK proof (of a sequence of L2 blocks) verified by the L1 cairo-verifier contract. Such proofs will likely involve thousands, or even hundreds of thousands of transactions. Consequently, the provers who compute them will consume considerable resources.

While the use of leaders is motivated by fork mitigation, we see no analogous motivation that singles out an uncontested procedure for choosing provers. Instead, we face a multitude of trade-offs and open questions. We briefly mention those directly related to the aforementioned high-level questions.

  • Relation between the sets of sequencers and provers – do we assume an inclusion in some direction, or maybe even equality?
  • Assigning provers – do we use a variant of leader elections or another mechanism?
  • Division of proofs among the provers – does the protocol associate a single proving task to each state update, or does it explicitly assign composite recursive proving tasks?
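
To make the last question more tangible, here is a minimal sketch (purely illustrative, not a protocol proposal) of what explicitly assigning composite recursive proving tasks could look like: a range of L2 blocks is split into fixed-size leaf jobs, and proofs are then merged pairwise until a single proof covers the whole state update. All names (ProvingJob, leaf_jobs, merge_layers) and the fixed job size are hypothetical.

```python
# Illustrative only: a hypothetical division of a block range into leaf proving
# jobs that are then merged pairwise into a single recursive proof. None of the
# names below are part of the Starknet protocol.
from dataclasses import dataclass


@dataclass
class ProvingJob:
    first_block: int  # first L2 block covered by the job (inclusive)
    last_block: int   # last L2 block covered by the job (inclusive)


def leaf_jobs(first_block: int, last_block: int, blocks_per_job: int) -> list[ProvingJob]:
    """Split a block range into fixed-size leaf proving jobs."""
    jobs, start = [], first_block
    while start <= last_block:
        end = min(start + blocks_per_job - 1, last_block)
        jobs.append(ProvingJob(start, end))
        start = end + 1
    return jobs


def merge_layers(jobs: list[ProvingJob]) -> list[list[ProvingJob]]:
    """Merge jobs pairwise, layer by layer, until one job covers the whole range."""
    layers = [jobs]
    while len(layers[-1]) > 1:
        prev, merged = layers[-1], []
        for i in range(0, len(prev), 2):
            pair = prev[i:i + 2]
            merged.append(ProvingJob(pair[0].first_block, pair[-1].last_block))
        layers.append(merged)
    return layers


# Example: 1000 blocks with 128 blocks per leaf -> 8 leaves, then 4, 2, 1 merges.
for depth, layer in enumerate(merge_layers(leaf_jobs(0, 999, 128))):
    print(depth, [(j.first_block, j.last_block) for j in layer])
```

Under such a scheme, the protocol would still have to decide who proves each leaf and each merge, which is exactly the assignment question above.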

We are thinking hard about this part of the protocol. A detailed account of the various open questions and trade-offs will be given in the third post of the series.

Three not-so-black boxes

To tackle the decentralization problem we went for a divide-and-conquer approach by tentatively dividing the protocol into three black boxes, and investigating each separately. These components are:

  1. Leader-based consensus protocol.
  2. Leader election protocol.
  3. Proof protocol.

The above decomposition was very helpful to us, but there is inevitable coupling. For example, a proof of consensus obviously depends on the consensus protocol, and changes drastically between longest chain and BFT fork choice rules.

This series of posts will start from these not-so-black boxes and gradually dive into their dependencies. We consider requirements under assumptions about the ambient network, which are detailed in the first appendix.

Tight requirements for consensus

We began the process by trying to distill some important properties. Those pertaining to safety and liveness warrant a brief introduction, given in the second appendix. For emphasis, let us record the key point: we hope to use the assumption of L1 as an always available public board to circumvent the classical impossibility results on standalone consensus protocols.

As a rollup, our primary goal is to reduce user costs by scaling Ethereum. Specifically, Starknet will be inexpensive in the sense of decreasing L1 gas consumption 100x.

Starknet’s consensus will be strongly permissionless: participation will not require L2 consensus. Note this property is not satisfied e.g by Ethereum: new validators are recorded in the ledger, which requires consensus; hence a minority of validators can censor new participation by preventing a supermajority from forming.

There is little point in a permissionless protocol that can’t handle participation loads. Hence the consensus protocol must be sufficiently scalable in the sense of working well with 50 simultaneously active sequencers.

L1 state updates will require sequencer consensus, which must therefore be L1-verifiable in the sense that an L1 full node can verify proofs of Starknet consensus.

Starknet must have final L1 state updates in the sense of being safe under network partitions and inactivity, modulo L1 reorgs.

Starknet consensus must be strongly accountable in the sense that safety and liveness violations are punishable by slashing for any fraction of the participants (including malicious majority of stake).

Furthermore, a one-time malicious majority must not damage Starknet forever. Instead, the protocol should spring to life soon after the malicious majority vanishes.

Starknet will have fast indication of inclusion, namely an L2 block including a tx appears within 10 secs of its submission.

The first practical barrier to act on txs is their validity. For mid-high-value txs one waits for some economic backing, but for low-value txs validity can be decoupled from finality. Some dual-ledger finality gadgets involve a process of sanitization that prevents inclusion of conflicting txs in the finalized ledger. In this case, validity of a tx is undetermined even upon inclusion in a block. Moreover, despite consensus on a block, sanitization may remove a particular tx from it in order to avoid conflicts with previous txs in the finalized ledger. We want users to know their txs cannot be omitted from a block after inclusion. Specifically, Starknet will have atomic consensus on blocks, meaning consensus on a block implies consensus on all constituent transactions.

For mid-value txs, Starknet will have fast accountable L2 finality, where “fast” means a latency of several minutes.

There is not much use for liveness during network partitions due to obvious safety hazards. However, liveness under inactivity is still nice to have, e.g in the case of 40% inactive stake. For an L1 protocol, honest inactivity is indistinguishable from a network partition and therefore conflicts with safety. In our case, safety comes from L1, so it will be nice for the consensus protocol to be adaptive in the sense of liveness under dynamic participation.

Lastly, it will be nice to use tried-and-proven implementations. Ideally, the protocol will be battle-tested, meaning there exist implementations by systems operating for at least one year with at least 100M$ TVL.

Note these requirements are much more concise than those presented a few months back. Even more notably, they are not conflicting!

Prioritized requirements for leader election

In order to maintain a pure PoS Sybil resistance, the leader election protocol must not be biased toward strong machines, but rather determined by stake and randomness only. Note we definitely want to incentivize sequencers to have strong machines (for reduced latency), and this will be achieved by a fee mechanism that encourages the inclusion of more txs per block. Specifically, leader elections should be fair in the sense of being determined by stake, with no benefit from other resources.
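
As a toy illustration of fairness (this is not the Starknet mechanism), the sketch below derives each slot’s leader deterministically from a public random seed and the stake table, so that the probability of leading a slot is proportional to stake and independent of machine power. The sha256-based draw and the in-memory stake table are assumptions made for the example.

```python
# A minimal sketch of stake-weighted leader election from a public seed.
# The hash construction and the staker registry are illustrative assumptions.
import hashlib


def slot_leader(slot: int, seed: bytes, stakes: dict[str, int]) -> str:
    """Pick the leader of `slot` with probability proportional to stake."""
    total = sum(stakes.values())
    # Derive a pseudo-random draw in [0, total) from the public seed and the slot.
    digest = hashlib.sha256(seed + slot.to_bytes(8, "big")).digest()
    draw = int.from_bytes(digest, "big") % total
    # Walk the deterministically ordered stake table until the draw is covered.
    for staker, stake in sorted(stakes.items()):
        if draw < stake:
            return staker
        draw -= stake
    raise AssertionError("unreachable: draw is always below the total stake")


# Anyone holding the seed and the stake table can recompute the entire schedule.
stakes = {"alice": 60, "bob": 30, "carol": 10}
print([slot_leader(s, b"epoch-seed", stakes) for s in range(5)])
```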

Users should be able to enforce consensus in real time. Hence leader elections must be locally verifiable, i.e an L1+L2 full node can verify all leadership claims in real time.

In the same spirit, the Starknet protocol must eventually assure mere observers (non-executors) that it has been followed. To this end, leader elections should be efficiently provable. By this we mean correct block authorship (at each slot, the author of the block is the designated leader) is efficiently STARK provable.

A good decentralized consensus protocol is resilient to faulty/malicious behavior. In such a protocol, no single entity can have a dominant role for extended periods. In particular, leadership must be brief. On one hand, this is not a property of the black box labeled “leader elections” but rather a constraint on its use for consensus. On the other hand, we must have brief leadership where nobody leads continuously for too long (i.e more than 10-15 secs).

Starknet’s proofs enable an L1 node to trustlessly synchronize on the last proven Starknet state (we disregard data availability). This streamlines the bootstrap procedure to become an L2 full node by eliminating the need for historical data. However, verifying consensus on a certain block may require historical knowledge of past stake distribution or actions. Longer dependence means a cumbersome bootstrap procedure due to a longer synchronization period. We’d like to mitigate this burden for consensus in general and for leader elections in particular by having them memoryless, meaning any dependence of elections on historical data (L1 and L2) goes back at most several hours.

We have discarded some admittedly wonderful properties because they increase the likelihood of honest forks. To us, finality is more important. Still, we think they warrant the elaboration below.

In leader based consensus protocols, the role of a leader includes at least a choice of order on some set of data (usually a set of transactions), and often even a choice of the data itself (when the leader constructs the block it proposes). This choice has value and can be sold; it is often perceived as undesirable manipulation over consensus and we’d like to minimize its influence. Furthermore, this choice becomes more valuable if the leadership schedule is known in advance, because people can plan profitable manipulations in advance. Perhaps such creative schemes are best thwarted at the protocol level?

If information in a distributed system is available for only a short period, then different agents are likely to have conflicting information. Still, short-lived information has advantages too. We start with a reasonable-sounding property.

Globally unpredictable – a future leader can be calculated only shortly before it acts.

A stronger, locally unpredictable version is also interesting, where even one’s own future election results cannot be predicted.

One can go even further with secret elections, where others’ results can never be calculated; the election outcome is decided only upon all candidates revealing their result.
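
For concreteness, here is a toy sketch of a secret election in the spirit of VRF-based lotteries; it is not a proposal, and the keyed hash below merely stands in for a real verifiable random function (VRF), which would let a revealed ticket be publicly checked. Each candidate privately evaluates a lottery ticket nobody else can compute, and the outcome is known only once winners reveal themselves. Note that such a lottery may yield zero or several winners per slot, which is exactly the honest-fork risk mentioned above.

```python
# Illustrative sketch of secret self-election. A real protocol would use a VRF
# so the revealed ticket is publicly verifiable; here it is stubbed by HMAC.
import hashlib
import hmac


def ticket(secret_key: bytes, seed: bytes, slot: int) -> float:
    """Private lottery ticket in [0, 1), computable only by the key holder."""
    mac = hmac.new(secret_key, seed + slot.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(mac, "big") / 2**256


def is_elected(secret_key: bytes, seed: bytes, slot: int, my_stake: int, total_stake: int) -> bool:
    """Win with probability proportional to stake; others cannot predict the result."""
    return ticket(secret_key, seed, slot) < my_stake / total_stake


# Each staker checks its own eligibility locally and reveals only upon winning.
print(is_elected(b"alice-secret", b"epoch-seed", slot=7, my_stake=30, total_stake=100))
```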

Prioritized requirements for proof protocol

In Starknet, an L1 state transition requires an L1-verified proof to obtain L1-security. Since proofs are such a crucial component of the protocol, any decentralization requirement applies to the proof protocol. To this end we want a permissionless protocol. While a consensus protocol requires Sybil resistance for DDoS protection, a proof protocol has no analogous requirement because proofs cannot conflict with each other. Although PoS is tempting for strong accountability (as in the consensus protocol), it is not assumed. Instead, we formulate a weaker version of the requirement that does not specify the resources underlying prover elections.

Strongly permissionless – participation does not require L2 consensus.

L1 state-updates must require sequencer consensus, so the proof protocol must in particular enforce proofs-of-consensus.

Proof of consensus – A proof submitted without proving sequencer consensus is rejected.

A permissionless protocol does little for decentralization if the most powerful machines monopolize proving. We therefore want it to be computationally accessible by enabling and incentivizing participation of machines of variable computational power.

High-value txs warrant the full economic backing of L1, for which the users may wait. We are fine with hourly L1 finality, so that latency between a proof job and its completion is incentivized to be at most a few hours.

While STARK proofs cannot be forged, malicious inactivity is still possible (recall the dynamic participation model allows for honest inactivity). The protocol should be resilient against such inactivity attacks and/or disincentivize them. For the former: it would be nice to have an adaptive protocol which is live under L2 network partitions and inactivity.

Summary

In this introductory post we have outlined some thoughts and requirements for a decentralized Starknet. The next post will dive into these thoughts in more detail, focusing on open questions pertaining to consensus and the behavior of provers in the protocol. The last post will describe a concrete suggestion for a leader election protocol.

Appendices

Appendix I – Network behavior

We work under the partially synchronous model of communication and take interest in dynamic participation. The definitions are recorded below for completeness.

The partially synchronous model of communication provides a convenient middle ground between synchrony and asynchrony. It does this by handling network faults of unknown but bounded duration.

Partial synchrony – the protocol admits a known delay bound ∆ and an unknown global stabilization time (GST) which satisfy the following conditions (for every protocol execution):

  1. Asynchronous phase – a message sent before GST is received by the recipient by GST+∆.
  2. Synchronous phase – a message sent after GST is received by the recipient within the delay bound ∆.
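
The two clauses can be restated as a single rule: a message sent at time t is guaranteed to be delivered by max(t, GST) + ∆. The snippet below is just this restatement in code, under no assumptions beyond the definition itself.

```python
# Toy restatement of the partial synchrony guarantee: delivery by max(t, GST) + DELTA.

def delivery_deadline(send_time: float, gst: float, delta: float) -> float:
    """Latest time by which the recipient is guaranteed to receive the message."""
    if send_time < gst:           # asynchronous phase: delivery by GST + DELTA
        return gst + delta
    return send_time + delta      # synchronous phase: delivery within DELTA


# With GST = 100 and DELTA = 2: a message sent at t = 10 may take until t = 102,
# while a message sent at t = 150 must arrive by t = 152.
print(delivery_deadline(10, gst=100, delta=2), delivery_deadline(150, gst=100, delta=2))
```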

Classical BFT protocols involving signature collection require constant participation. Hence it is often assumed that honest agents are always active. In such settings, the only cause of apparent inactivity by an honest agent is actual asynchrony. This assumption may be unrealistic for large scale protocols. Longest chain protocols need not require constant participation; indeed it is possible for an honest bitcoin miner to sleep for some time and then resume activity. We don’t want honest participants to vanish for extended periods, but prefer to incentivize their activity instead of assuming it.

Dynamic participation – the protocol admits an unknown global awake time (GAT), beyond which every honest agent is awake.

We consider the underlying L1 network as a graph whose vertices are the L1 nodes and whose edges signify communication between nodes. It is a fundamental insight that from the perspective of a self-contained protocol, inactivity is indistinguishable from asynchrony.

Every L2 full node is in particular an L1 full node, because it needs to track stake (maintained on L1) and L1 state updates. Hence we consider the L2 network as an “overlayer” given by a subgraph of the underlying L1 network. This model is meant to emphasize that even during L2 network faults, L2 nodes may still admit L1 connections. Assuming L1 connectivity and liveness, an L2 full node can distinguish between asynchrony and L2 inactivity by observing L1. Specifically, observed L1 inactivity signifies “true” inactivity while observed L1 activity signifies an L2 network partition.
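
The diagnostic in the last sentence amounts to a small decision rule. The sketch below is purely illustrative; how a node decides that L1 “appears active” is left abstract.

```python
# Illustrative classification of an apparent L2 outage by an L2 full node that
# also runs an L1 full node (assuming its own L1 connectivity and L1 liveness).
from enum import Enum


class Diagnosis(Enum):
    L2_PARTITION = "L2 network partition: L1 is live but L2 peers are silent"
    TRUE_INACTIVITY = "true inactivity: the relevant participants are absent on L1 as well"


def diagnose_l2_outage(l1_appears_active: bool) -> Diagnosis:
    """Classify an observed L2 outage by what the node sees on L1."""
    return Diagnosis.L2_PARTITION if l1_appears_active else Diagnosis.TRUE_INACTIVITY


print(diagnose_l2_outage(l1_appears_active=True).value)
```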

Appendix II – A tour of safety and liveness

The two fundamental properties of consensus protocols in the context of blockchain are safety and liveness. Safety is a reformulation of the classical “agreement” axiom of BFT theory. In what follows, we informally define these properties and their specializations to communication and participation models.

Throughout, we assume a known “universe” of participants is specified via public key infrastructure (PKI). Lastly, the honesty threshold of a set of agents is the fraction of honest agents out of all the agents.

Safety – if the set of active participants fulfills a specified honesty threshold, then all active honest nodes agree (i.e there are no honest forks).

Liveness – if the set of active participants fulfills a specified honesty threshold, then the protocol can advance, i.e a malicious minority cannot freeze it indefinitely.

During synchrony and under static participation (every node is active), these are necessary properties for every worthwhile consensus protocol. However, their fulfillment during asynchrony and/or dynamic participation is much more subtle.

Finality – safety under L2 network partitions and inactivity. If the unknown subset of active participants fulfills a specified honesty threshold, then all active honest nodes agree during both synchrony and asynchrony.

Adaptivity (dynamic availability) – liveness under L2 network partitions and inactivity. If the unknown subset of active participants fulfills a specified honesty threshold, then the protocol can advance, i.e a malicious minority cannot freeze it indefinitely. (In our setting, this implies the protocol remains live if some of the stake becomes idle.)

For an L1, inactivity is indistinguishable from asynchrony. Consequently, the CAP theorem implies that adaptivity and finality are mutually exclusive properties. Recent literature on consensus protocols sidesteps this problem with user-level solutions, for instance using dual ledgers with a finality gadget as in PoS Ethereum. By introducing assumptions about the underlying L1, we hope to either resolve the availability/finality dilemma for L2 consensus at the protocol level, or to improve on existing user-dependent solutions.

Safety is crucial, but a protocol in which a safety violation is inconsequential to the violators is vulnerable. The following enhancement of the safety property incentivizes honest behavior.

Accountable safety – for every honest fork, the active honest part of the network contains evidence proving a large subset of nodes acted maliciously. (The punishment mechanism is not a part of the property, and can be social or protocol-level.)

The above argument shows that adaptivity and accountable finality are mutually exclusive properties for an L1. However, we want to employ L1 to get a strong form of accountability which applies even for a malicious majority of the stake.

83 Likes

honest distribution is the key to the viability of this network. I like that

36 Likes

Hi @ilia, great update on the decentralization plans, thanks. I’d like to share the following thoughts:

  • My first impression is that using (a variation of) Tendermint to build consensus at the L2 level is a good basis to achieve most of the properties mentioned in the post.

  • It seems that the main concern with Tendermint is the weaker availability of the sequencer/prover under network partitions. This is a valid concern from a theoretical standpoint.

  • But I’ve learned the hard way that theory and practice are not always the same. I am not aware of actual serious incidents due to network partitions for Tendermint chains (or the Cosmos ecosystem), but this could be my lack of knowledge.

  • Tendermint is to my knowledge the most battle-tested open-source consensus engine in the market; I personally find it difficult to have the same confidence in anything else (PoS-based). This is why at Autonolas we use Tendermint to build decentralised off-chain infrastructure.

All this said, I also see the value of using a PoS consensus mechanism with less intense P2P requirements than PBFT consensus protocols.

Hope you find this useful!

35 Likes

I’ll put my DAPP dev hat on for one sec:
From the perspective of a DAPP dev, I would say that it would be very preferable if successive block leaderships were not predictable.
For DEFI devs as well as game devs, the idea of one entity being able to reliably perform two consecutive transactions across two blocks (the last one of block #1 and the first one of block #2) could catch them off guard.
One example of that happening on Ethereum (It’s partially also an oracle design problem):

Choosing a solution where a leader can foreseeably be the same across two blocks, or where there is enough time for bribes/collusion to occur, is not the end of the world. But in that case devs need to be made aware that this is a possibility and they should keep that fact in mind when designing their apps.

25 Likes

Dear David, thanks for your feedback! I certainly agree that Tendermint warrants respect. The third post in the series will dive into our perceived trade-offs between consensus protocols for L2.

26 Likes

Hey @TotalPizza, thanks for the feedback and the interesting link. Our current thinking is to follow Ethereum and have epochs of 5-10 mins, with a predictability of 2-3 epochs. This is flexible and not decided.

I do not claim this level of predictability is ideal, but I think high unpredictability comes at the cost of possible honest forks. Could you elaborate on your comment with some numbers that seem reasonable from your perspective?

24 Likes

Thinking about it a bit more, I might have overstated how much of an issue this actually is.
I’ll take some more time to look into the realistic impact that reliable cross-block transactions might have.
I’ll talk with some of the game devs on StarkNet and see if I can find some more past DEFI exploits. :)

30 Likes

Great write-up!

Trying to understand the different pieces here. Have a question regarding this bit:

Assuming L1 connectivity and liveness, an L2 full node can distinguish between asynchrony and L2 inactivity by observing L1. Specifically, observed L1 inactivity signifies “true” inactivity while observed L1 inactivity signifies an L2 network partition.

Is it accurate to reframe this as follows?

  • observed L1 inactivity implies L2 inactivity, or “true” inactivity on the L2
  • {observed L1 activity & observed L2 inactivity} implies an L2 network partition

Thanks!

Yep. Thanks for the catch - editing now.