Starknet Decentralized Protocol II - Candidate for Leader Elections

In the previous post we remarked that the Starknet consensus protocol will be turn-based: time will be divided into slots and each slot will have a unique leader. This post describes a suggestion for a leader election protocol modeled on an epoch-based public lottery. The remainder of the consensus protocol is abstracted away and will be treated separately. In particular we do not treat rewards or incentives.

We begin with an overview of the protocol and a brief discussion of the leader_election function. Next we dive into the protocol components, and end with some remarks on randomness.

Overview

  • The protocol is epoch based – time is divided into epochs which are in turn subdivided into slots; there is one Starknet block per slot. A Starknet epoch will be 5-10 minutes in real-time as measured by L1 block timestamps, so L1 divides all of time into roughly equal intervals of this length. A Starknet slot will be at most 10-15 seconds. These numbers are not final and may well end up smaller.
  • To promote agreement, the leader schedule for epoch(i) is determined at the end of epoch(i-2) with epoch(i-1) acting as a synchronization period. This assumes L1 transactions are included in the ledger within 1 epoch.
  • The leader_election function receives an epoch’s randomness and stake distribution as parameters; it takes a slot to a unique slot leader based on the stake and randomness. This function is common knowledge and can be calculated off-chain without sending Ethereum transactions.
  • Randomness comes from L1 and should be renewed once per epoch for unpredictability. The trigger will be part of the decentralized protocol, with a frequency of several minutes. If randomness is not renewed, old randomness will be recycled. We’ll leave this point for last.
  • The leader schedule for epoch(i) only considers the stake distribution at epoch(i-3). This is to ensure that randomness for epoch(i), which is sampled at epoch(i-2), is not known to depositors.
  • An L1 staking contract bookie controls stake and randomness.
    • The bookie has deposit and withdraw functionalities. We will refer to them as actions.
    • There is a withdrawal delay lock_period whose length has not been decided yet. Its purpose is to avoid nothing-at-stake scenarios where stake is withdrawn before malicious activity is detected.
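The epoch offsets above can be summarized in a minimal Python sketch. The names `schedule_inputs`, `RAND_LOOKBACK` and `STAKE_LOOKBACK` are illustrative, not part of the protocol:

```python
# Which epoch's randomness and stake distribution feed epoch i's schedule.
RAND_LOOKBACK = 2   # randomness for epoch i is sampled at the end of epoch i-2
STAKE_LOOKBACK = 3  # stake distribution is frozen as of epoch i-3

def schedule_inputs(i: int) -> tuple[int, int]:
    """Return the (randomness epoch, stake epoch) pair used to compute
    the leader schedule of epoch i."""
    return i - RAND_LOOKBACK, i - STAKE_LOOKBACK
```

For example, the schedule of epoch 10 is computed from the randomness sampled at epoch 8 and the stake distribution as of epoch 7.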

A state update of the L1 core contract ensures via proof that L2 block authorship follows the leader schedule.

The leader_election function

As remarked above, the leader_election function receives an epoch’s randomness and a stake distribution as parameters; it takes a slot to a unique slot leader based on the stake and randomness.

We went with a simple leader election function: flatten each user’s stake to an interval, concatenate the intervals, and randomly choose a point. The leader is the “owner” of this point. This description requires us to specify exactly how we order stakers. We propose the lexicographic order on addresses.
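A minimal Python sketch of this scheme, assuming the chosen point is derived by hashing the randomness together with the slot number (the exact derivation is not specified in this post; SHA-256 here is a placeholder):

```python
import hashlib

def leader_election(slot: int, randomness: str, stakes: dict[str, int]) -> str:
    """Pick the slot leader: order stakers lexicographically by address,
    lay their stakes end to end on a line, and choose a point
    pseudo-randomly from (randomness, slot). The "owner" of the interval
    containing the point is the leader."""
    stakers = sorted(stakes)  # lexicographic order on addresses
    total = sum(stakes[addr] for addr in stakers)
    digest = hashlib.sha256(f"{randomness}:{slot}".encode()).digest()
    point = int.from_bytes(digest, "big") % total
    acc = 0
    for addr in stakers:
        acc += stakes[addr]
        if point < acc:
            return addr
```

Each staker's probability of owning the chosen point is proportional to their stake, so leadership frequency tracks the stake distribution.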

bookie contract

The L1 bookie is Starknet’s staking contract. Its primary goal is to facilitate staking, namely entering and leaving the participant list of Starknet’s leader election protocol. Our design aims to make the leader schedule easy to calculate and efficient to prove. The bookie contract has four roles:

  1. Provide deposit/withdraw functionality for stakers.
  2. Store current stake distribution and pending actions for calculating current and upcoming leader elections.
  3. Store randomness (current and historical) for calculating/verifying current and historical leader elections.
  4. Store some historical data to cheapen the verification of historical leader elections. Specifically, the contract stores a list of actions for each epoch.

The bookie fulfills its first two roles by storing at each time the current stake distribution, alongside some functions which facilitate deposits and withdrawals. Withdrawals involve a “lock period” to avoid nothing-at-stake scenarios.

The third role of the bookie is fulfilled through a mapping rand(epoch)⟶randomness which takes an L2 epoch to its associated randomness.

The final role of the bookie is fulfilled through a mapping actions(epoch)⟶action_list which takes an L2 epoch to the list of actions performed during it, e.g deposits and withdrawals. Actions facilitate easy computation of past stake distributions with less storage and without the need to keep full historical stake data. The verification mechanism is detailed below.
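To illustrate, here is a hypothetical sketch of reconstructing a past stake distribution by replaying per-epoch action lists. The action encoding and the assumption that an action takes effect in the epoch after it was recorded are simplifications for illustration:

```python
def stake_at_epoch(initial_stakes: dict[str, int],
                   actions_by_epoch: dict[int, list[tuple[str, str, int]]],
                   target_epoch: int) -> dict[str, int]:
    """Replay the bookie's per-epoch action lists on top of an initial
    stake distribution to recover the distribution at target_epoch.
    Actions are (kind, address, amount) with kind in {"deposit", "withdraw"}."""
    stakes = dict(initial_stakes)
    for epoch in sorted(e for e in actions_by_epoch if e < target_epoch):
        for kind, addr, amount in actions_by_epoch[epoch]:
            if kind == "deposit":
                stakes[addr] = stakes.get(addr, 0) + amount
            elif kind == "withdraw":
                stakes[addr] -= amount
    return {addr: s for addr, s in stakes.items() if s > 0}
```

Storing only action lists (the diffs) rather than a full distribution per epoch is what keeps the historical storage small.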

Verifying leader elections in an L1 state update

This section describes how the L1 bookie contract is used alongside proofs to verify past leader elections as part of Starknet’s L1 state update in the core contract.

Before diving in, let’s give a rough outline of current Starknet state updates. We want to verify two things:

  1. The submitted proof is valid, i.e the state transition in question is valid.
  2. The initial state that was input to the off-chain proof computation actually coincides with the state of Starknet recorded in the L1 core contract.

Since proof verification is stateless, the second item is what ties the proof to the current state of Starknet. Note the proof computation is off-chain, so the initial state supplied for the proof can be arbitrary. We compare it to the actual state commitment on L1 to avoid such arbitrary state updates.

Our focus is on Starknet state updates which account for the leader schedule. This introduces an additional complication: the randomness and stake distribution are stored on L1, so this part of the Starknet state lives on L1 and changes independently of the L1 state update transactions. Proofs of leader elections must account for intermediate changes in this part of the state in the same way they account for messages from L1. In summary, the verification of leader elections in an L1 state update checks two statements:

  1. The proof should prove that each block author in the sequence of blocks is correct according to the leader_election function. Specifically, it should prove that each block author is the evaluation of the leader_election at the corresponding slot using the randomness and stake distribution described in the state (starting from the initial state, and accounting for possible changes on L1).
  2. The randomness and stake distribution in the proof’s initial state and intermediate states (!) are consistent with the information on L1. That is, the inputs of the leader_election are verified against the bookie contract.

Before elaborating, let us describe a trick to reduce calldata (and consequently costs) for the second item above. To illustrate, imagine a proof of a single block. In the naive approach, the proof is submitted to L1 alongside the resulting state (following the block’s execution). The trick is to submit the proof alongside only state-diffs, i.e the changes in the state caused by the block’s execution. This avoids the redundancy of specifying unchanged parts of the state, but at the cost of “spreading out” the state over L1 history. In our context, the trick amounts to sending the proof alongside calldata which describes only the intermediate L1 staking actions.

Now we can formalize the above discussion. A proof of leader elections for epochs n+1, …, n+k is computed with respect to inputs from different places:

  1. Inputs from L1:
  • Initial state commitment (in the L1 core contract)
  • For each epoch n+1 ≤ i ≤ n+k, the action list actions(i-3)
  • For each epoch n+1 ≤ i ≤ n+k, the randomness rand(i-2)
  2. Inputs from L2:
  • The initial stake state corresponding to the state commitment in core
  • A sequence of Starknet blocks for the epochs in question (alongside each block’s author)

The L1 verification of the proof (which is stateless) ensures that block authors are indeed output by the leader_election function. To ensure the correctness of the inputs (from L1) to the leader_election function, we must compare them to information on L1. To this end, the proof is appended with calldata in the form of:

  • A mapping rand(epoch)⟶randomness
  • A mapping actions(epoch)⟶action_list

Upon state update, the calldata is compared with the information stored in the bookie.
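A sketch of that comparison (the function name and map encodings are illustrative; on L1 this check would be performed by the core contract against the bookie's storage):

```python
def calldata_consistent(bookie_rand: dict[int, str],
                        bookie_actions: dict[int, list],
                        calldata_rand: dict[int, str],
                        calldata_actions: dict[int, list]) -> bool:
    """Check that the rand/actions maps supplied as calldata agree with
    the values stored in the bookie for every epoch they cover."""
    return (all(bookie_rand.get(e) == r for e, r in calldata_rand.items())
            and all(bookie_actions.get(e) == a
                    for e, a in calldata_actions.items()))
```

If any epoch's randomness or action list in the calldata disagrees with the bookie, the state update is rejected, since the proof's leader_election inputs would then be unverified.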

Having verified the inputs to the proven leader_election computations, we are now convinced of the validity of the state transition with regard to the leader schedule.

In summary, we have described how a Starknet L1 state update transaction verifies leader elections using reduced calldata from the state update transaction.

Randomness

As far as we’re concerned, Ethereum’s randomness is of sufficient quality for Starknet. Consequently we will not pursue enhancements such as VDF and/or VRF constructions.

The bookie stores randomness in the aforementioned mapping rand(epoch)⟶randomness. Updating the randomness requires someone to initiate a transaction. Part of the decentralized protocol will be used for such updates. This raises the question of what happens if nobody sends a randomness update transaction in a certain epoch. The simplest answer is to always use the most recent randomness. In this case, the leadership schedule can be predicted just from actions involving the stake. We may slightly refine the randomness R, so that the actual value input to the leader_election function is a derivative of the latest randomness, e.g by saying the derivative for epoch(i) is R(i)=H(R,i). In this case, the schedule will continue to change even in case of a static stake distribution.
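A sketch of this refinement, assuming H is instantiated with SHA-256 (the actual hash function is unspecified here):

```python
import hashlib

def derived_randomness(latest_rand: str, epoch: int) -> str:
    """Per-epoch derivative R(i) = H(R, i) of the latest on-chain
    randomness, so the leader schedule keeps changing across epochs
    even if no fresh randomness update lands on L1."""
    return hashlib.sha256(f"{latest_rand}:{epoch}".encode()).hexdigest()
```

Note this only varies the schedule; it does not restore unpredictability, since anyone who knows the stale R can compute R(i) for all future i.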

Summary

This post described a concrete suggestion for an epoch-based public lottery protocol for leader elections in Starknet. The leader schedule can be derived from L1 data, namely randomness and staking actions stored in the permissionless L1 bookie contract.

We look forward to reading your feedback on any aspect of leader elections!


Hey!

Will there be any mechanism which deals with absent leaders? I.e. the situation where the chosen leader is not online. It seems to me that things become messy in those situations.


Can you elaborate on this decision a bit more in terms of rationale and/or trade-offs considered?


@3alpha howdy! In this suggestion there is no backup mechanism in case of an idle leader. Instead, an idle leader will incur a latency of 1 slot on the protocol, and everything continues. Could you elaborate on what becomes messy?

@apriori as a validity rollup, StarkNet’s eventual security is derived from Ethereum. Our line of thinking is “Ethereum randomness is used for Ethereum leader elections, and that’s good enough for StarkNet leader elections”. Obviously this makes life much simpler since there’s no randomness construction to design or implement. I’d be happy to hear arguments in favor of enhanced randomness!


Ah, ok, I misunderstood. I had the impression that a single leader is responsible for the whole epoch, not just one slot. Reading it again, it is clear now. In that case, it is fine. The whole backup mechanism is what becomes messy, but this seems like a much better solution.


Hi! I’ve read the above a couple of times and I’m not certain about some parts. I am hoping you can shed some light on the following:

  1. What randomness are you referring to? I’m assuming it’s the randao_reveal?
  2. Can you describe this in more detail? Who’s responsible for the randomness updates in the bookie? Who verifies the randomness, and how?

    If my assumption about the use of randao_reveal from above is correct, and given that L1 and L2 epochs are out of sync, can the party that’s updating the bookie choose any L1 block for the randomness as long as the L1 block is within some bounds (not quite sure what those should be)?

    If so, this gives them quite a strong position in deciding the leader schedule, since the other input into the leader_election function (stakes in epoch(i-3)) is already determined.

  3. One other thing I’m not certain about is how the state is settled to L1. Is it the responsibility of the last leader in an epoch? Something else? What if they fail?


Hello and thanks for your penetrating questions!

  1. Yes, it is randao_reveal.

  2. This (very welcome) question is unaddressed here because it depends on the rest of the protocol. For example, in the presence of an L1 checkpoint mechanism with a frequency of several minutes, it would be natural to piggyback randomness updates. In our mind, the “granularity” at the disposal of the randomness updater is at the level of L1 epochs and not blocks. Personally I have not thought this out all the way, so please press on. :slight_smile:

  3. The L1 state updates will be coupled to proofs and will therefore depend on a proof schedule. In particular, L2 epochs may pass without any L1 state updates. As to the question of who initiates L1 state updates – this too will revolve around scheduled leaders; the schedule may simply be different from the sequencer schedule (e.g the turns/slots themselves will likely be longer). Regarding failure, I think this is orthogonal to leader elections – you are asking how StarkNet will deal with stalled state updates. This is a major question about the entirety of the protocol which will be addressed in the upcoming post, and also in later concrete suggestions for the protocol.

Does this answer your questions?


Thank you for your prompt answer. Yeah, I know my questions are a bit outside of the immediate leader election problem, but I got here by thinking about how it all works together :slight_smile:

  1. So you would use the randao_reveal of the last block in the L1 epoch? That would solve the issue of the bookie updater being able to choose from the set of randomness values.
  2. This implies block producers and sequencers are different entities. I thought they were the same :thinking: Can you please clarify what’s the relationship between the two?
  1. Yep.

  2. Sequencers are block producers. I am merely saying that the schedule itself may be different, e.g instead of Alice having a 10 second slot followed by Bob’s 10 second slot, Bob will start with a 5 minute slot followed by Alice’s 5 minute slot. Incidentally, it is also plausible for provers to be a distinct set of stakers from sequencers; let us address this in the next post. :slight_smile:


It’s not so clear to me how slots and epochs relate; there are 2 questions here:

  1. An epoch is 5-10 minutes; what about the time slot – 12s like Ethereum?
  2. leader_election chooses the leader, so does the protocol still need 2 rounds of voting like ordinary BFT?

Dear @F.F, thanks for reading!

  1. The time slots for leadership will likely be similar to Ethereum, around 10 seconds. Note this does not actually imply the same blocktime. For instance, leaders could have 10 second slots with 1 sec blocktimes, so each leader would be able to propose a sequence of ten blocks.

  2. A secure BFT protocol will likely need two rounds of voting. (Other options like longest-chain protocols are secure without any rounds of voting.)

Thanks for your answer.
So as far as I understand:

  1. An epoch is 5-10 minutes and a slot is around 10 seconds; is the leader selected for the whole epoch i?
  2. Whether BFT or a longest-chain protocol will be used has not been decided?

Dear @F.F

  1. The leader is not selected for the entire epoch, but per slot. The point I was making is that there could be several blocks during a slot.

  2. We will likely go for a BFT protocol.

Thanks, ilia
I got you – the leader is selected per slot and may propose several blocks there. And the leaders in epoch i are predictable once leader_election runs in epoch i-2, so block production will be faster because the order of block producers is known in advance, just like Solana.

Hey!

I have a few questions, largely related to L2 epochs and L1 epochs being out of sync.

Q1: When is map[rand(epoch)] updated?

Q2: map[rand(epoch)] = Beacon RANDAO value for which L1 block?

Q3: Can stakers deposit/withdraw once per L2 epoch? In that case the stake distribution will be fixed across L2 epoch i-3 (so it’s clear which values to take). Or can stakers deposit/withdraw once per L1 block? In that case, which stake distribution do we take when calculating the leader_election function?

Q4: Who is responsible for executing the leader_election function?

Thanks :slight_smile:

Hi again,

I don’t quite understand how the leader election verification proof is carried out.

Also, can’t an off-chain prover provide calldata (specifically the maps) that matches what the bookie has, but use completely different maps in the off-chain proof?

Thanks.

Hey @0xRyan, your questions are spot on!

The real answer to your point about synchronization is that we’re considering redefining the leader election from a purely slot-based mechanism to a height-based mechanism. This will remove ambiguity about the stake distribution at a particular L2 height. We will publish a more detailed account in a few weeks. At any rate, I’ll try to answer your questions.

  1. In the simplest proposal, the map will be updated at every L1 state update. More complicated proposals feature protocols that interact with L1 more frequently, and we can couple the randomness update to these interactions.

  2. This is a good question we’re not yet decided on. Per the post, it is the first L1 block in each L2 epoch. However, there is a risk of ambiguity due to unsynchronized local clocks. This returns us to my remark above about defining things in terms of L2 height.

  3. Staking actions are possible at any L1 block, but their effects are aggregated in epochs. For instance, if you deposit stake twice within epoch(k), then both deposits will come into effect together, at the beginning of a later epoch. In summary, the stake distribution is constant throughout an L2 epoch because the effects of individual staking operations are jointly interpreted at the beginning of later L2 epochs.

  4. Anyone who wants to discover the leader schedule should execute the leader election function from their local view of L1. This means at least the L2 sequencers, but likely also many L2 nodes of varying kinds (e.g wallets, and other low-latency applications).

  5. I am not sure I understand the question. Are you asking “how can one prove that a particular L2 block was created by its designated leader”? The answer is “by verifying consensus”. Specifically, the slot number will be part of each block’s header, so L2 consensus on a block affirms the correctness of its leader. I suspect you may have been asking something else, so please correct me!

  6. Perhaps I don’t understand the question, but just to clarify: among other things, the proof sent to L1 also proves the asserted calldata is consistent with the proven state transitions. Does this answer your question?