Starknet Decentralized Protocol VII - Chained Proof Protocols & Braiding

Until now, our posts (1,2,3,4,5,6) have largely focused on describing protocol requirements and design problems. We are now fast approaching solutions. In this post we outline two concrete decentralized proof production protocols. Both are interesting candidates for Starknet. The next post will sketch a complete decentralized protocol.

Readers interested in the derivation of the protocol from more naive attempts will find a brief appendix recounting some of ours.

We proceed in three parts:

  1. High-level outline of chained proof protocols, built around the idea that valid blocks must contain proofs of previous parts of the ledger.

  2. A concrete chained proof protocol without recursion: strands

  3. Incorporating recursion

    1. Recursive strands
    2. Braiding

Outline

In this section we roughly outline a class of decentralized proof production protocols that we refer to as chained proof protocols. We postpone more detailed discussion to later sections and content ourselves with hand-waving for now.

Before proceeding, let us first divide a decentralized Starknet protocol into three main components:

  1. L2 consensus and leader elections
  2. Proof production (§3 of proofs in the protocol)
  3. L1 state update (§4 of proofs in the protocol)

For now, treat each component as a black box.

Chained proof protocols are distinguished by a strong coupling between consensus and proof production. This coupling is orthogonal to the L1 state update protocol, which we defer to the end of the post. Disregarding recursion (for the moment) facilitates a simple definition of chained proof protocols: every valid block must prove a portion of the ledger that is specified by the protocol. For example, block n proves block n-k, with k an offset parameter. However, the many benefits of recursion (§1) warrant refining the protocol to incorporate it and reap the fruit: every valid block must prove a portion of the forest of jobs (including both “leaf jobs” and recursive “merge jobs”). For example, block n proves block n-k alongside the merge of blocks n-2k-1, n-2k. See this illustration, where black arrows represent the block chain and colored arrows represent proofs.
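
To make the non-recursive definition concrete, here is a minimal sketch of the validity condition. The names (`block.number`, `block.proofs`, `proves`) are hypothetical stand-ins for the real data structures, not the actual Starknet rule:

```python
# Minimal sketch of the non-recursive chaining condition, assuming an offset
# parameter k, a hypothetical `proves(proof, block)` predicate, and blocks
# exposing `number` and `proofs`.

def is_valid_chained_block(block, chain, k):
    """A block at position n is valid only if it carries a proof of position n - k."""
    n = block.number
    if n < k:
        return True                      # nothing old enough to prove yet
    target = chain[n - k]                # the portion of the ledger this block must prove
    return any(proves(p, target) for p in block.proofs)
```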

Note that chaining “merely” imposes additional conditions for a block to be valid, and is compatible with essentially any consensus protocol. The resulting architecture strongly parallels dual ledger designs for consensus, with a feedback loop of a proven ledger into the consensus protocol.

The offset parameter is chosen to give sequencers enough time to compute/purchase proofs in advance. Specifically, the goal is to have sequencers purchase/compute proofs outside the critical path of consensus. Observe the implicit assumption that the leader schedule is predictable at least k slots in advance.

Candidate chaining protocol

In this section we derive a concrete chained proof protocol without recursion; incorporating recursion is deferred to the next section.

A workable proposal: strands

A naive approach requires slot n to include a proof of slot n-k. The trouble is that each proving job is designated to only one slot: if slot n is missing from the chain then the job of proving slot n-k will remain a hole forever, with no recovery. See this illustration, where the black arrows represent the L2 blockchain and the colored horizontal arrows represent proofs.

We refine the naive slot attempt by increasing the potential provers of slot n-k from the single slot n to the sequence of slots n, n+k, n+2k, \dots, henceforth referred to as a strand. Specifically, we require that a valid (block in) slot n contain a proof of the first unproven slot of its strand in the chain it continues. For example, if slot n continues slot n-k then it must contain a proof of it. If slot n does not continue slots n-k, n-2k but does continue slot n-3k, it must contain a proof of it. See this illustration, where the colored arrows represent proofs. The empty square represents an empty slot, and we see the recovery process which “skips” the empty slot and proves its predecessor in the red strand.

Let us rephrase the above requirement in a more flexible formulation that will be useful in the next section.

  • Say the naive job of slot n is to prove slot n-k.
  • The (block in) slot n must include the naive job of the smallest slot from its strand (the sequence \dots, n-3k, n-2k, n-k, n) that is missing from the chain it continues (a sketch of this rule follows below).

For example, if slot n does not continue slots n-k, n-2k but does continue n-3k then it must contain the naive job of slot n-2k.
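
As a sanity check on the rule, here is a small sketch that computes which naive job a given slot must include, given its view of the chain it continues. The `chain` mapping and the function name are hypothetical, purely for illustration:

```python
# Sketch of the strand rule, assuming `chain` maps slot number -> block
# (None for an empty slot) in the chain that slot n continues.
# The naive job of slot m is "prove slot m - k".

def required_naive_job(n: int, chain: dict, k: int) -> int:
    """Return the slot whose naive job the block in slot n must carry out."""
    m = n
    # Walk down the strand n, n-k, n-2k, ... while the predecessor in the
    # strand is missing from the chain being continued.
    while m - k >= 0 and chain.get(m - k) is None:
        m -= k
    return m        # the block in slot n must prove slot m - k

# Example from the text: if slots n-k and n-2k are empty but n-3k is present,
# the function returns n-2k, i.e. slot n carries a proof of slot n-3k.
```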

Division and designation of labor are both determined dynamically: each sequencer determines its job based on its view of the ledger.

Potential bummers

There are some issues with the above proposal:

  • Clear vector of attack: in order to prevent slot n from being proven, the adversary needs to control/censor only the slots from one strand: the sequence n+k, n+2k, \dots, with the next target always known several minutes in advance.
  • Latency: an empty slot n adds a latency of k slots for the inclusion of a proof of the block in slot n-k.

We don’t think the attack vector above is very significant. Nonetheless, it can be mitigated by hiding the recovery slots from the attacker. Specifically, we can use a VRF to implement a lottery to determine which slots are responsible for which recovery.
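
For illustration only, such a lottery could look roughly as follows. A real deployment would use a proper VRF whose output can be published and verified; here a keyed hash merely stands in to show the shape of the idea, and all names and the threshold are hypothetical:

```python
import hashlib

# Illustrative sketch only: a slot privately learns whether it is responsible
# for recovering a missed proof. In practice the keyed hash below would be a
# VRF evaluation, so the winner can later prove it was indeed selected.

def wins_recovery_lottery(secret_key: bytes, missed_slot: int, my_slot: int) -> bool:
    msg = f"recover:{missed_slot}:{my_slot}".encode()
    out = int.from_bytes(hashlib.sha256(secret_key + msg).digest(), "big")
    return out < 2**253        # roughly a 1-in-8 chance per slot, as an example
```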

Incorporating recursion

To achieve L1 security, Starknet must perform L1 state updates. This process consumes expensive L1 gas. This is where recursion comes in: it greatly increases the economic efficiency of L1 state updates by “compressing” proofs. Moreover, recursion is an incremental process and therefore compatible with decentralization. For these reasons, we wish to incorporate recursion into the chained proof protocol. In this section we explore two methods of doing so. First, we present a simple modification of the strand protocol above that partially incorporates recursion – recursive strands. We culminate with the braiding protocol, which fully incorporates recursion.

With that prelude behind us, let’s start from first principles. Recall that recursion is handled by a forest of trees of recursive jobs. Binary trees (in which every merge node has two children) have a pleasant property in this respect: the number of leaves equals one plus the number of internal nodes. Consequently, apart from the very first sequencer, we have a natural job assignment in the happy flow of two proving jobs per sequencer: one leaf and one merge.

In the happy flow of a single chain without empty slots, the natural job assignment above works well. In the unhappy flow, where failures (e.g. empty slots) are possible, the challenge is to recover from missing proofs. This problem of “filling holes” boils down to the dependencies of recursive trees (whatever the structure): while leaf jobs are independent of each other, merge jobs require completion of previous jobs.

First step: recursive strands

In the strand protocol above, the offset parameter k gives rise to k disjoint proof strands, each composed of the slot numbers with a particular remainder \mod k. For remainder zero we get the sequence 0, k, 2k, \dots, while for remainder k-1 we get k-1, 2k-1, 3k-1, \dots instead. Within each strand, a block contains a proof of its most recent ancestor in that strand. Our goal is to merge these strands in a robust way that recovers from empty slots without overburdening individual provers.

The first step is to simplify merging by having each proof strand take care of its internal merges. To achieve this, we make every proof strand into a recursive strand: the (block in) slot n must include a proof of the execution of its parent in its strand (slots \dots, n-2k, n-k) alongside the verification of the proof contained in that slot. Now, verifying the proof in slot n verifies all previous proofs in the associated recursive strand. The picture remains essentially the same.

Each of the recursive strands is robust: if slot n is empty then the leader of slot n+k will in turn prove slot n-k alongside the verification of the proof it contains. Thus we are concurrently building k robust recursive strands.
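
A rough sketch of the statement proven at each slot of a recursive strand. The helpers `prove_execution`, `prove_verification` and `merge` are hypothetical stand-ins for the actual proving and recursion machinery:

```python
# Sketch of one recursive strand step. `chain` maps slot -> block (None if
# empty) and `block.proof` is the recursive proof that block carries.
# Genesis handling is omitted for brevity.

def strand_parent(n: int, chain: dict, k: int) -> int:
    """Most recent non-empty slot among n-k, n-2k, ... in the continued chain."""
    m = n - k
    while m >= 0 and chain.get(m) is None:
        m -= k
    return m

def recursive_strand_proof(n: int, chain: dict, k: int):
    parent = strand_parent(n, chain, k)
    pi_exec = prove_execution(chain[parent])             # transactions of the parent slot
    pi_prev = prove_verification(chain[parent].proof)    # folds in all earlier strand proofs
    return merge(pi_exec, pi_prev)                       # one proof now covers the whole strand
```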

With k robust recursive strands in hand, we may delegate the merging of strands to a separate protocol role, a strand merger, for the sake of simplifying the block production process. This role lives on top of consensus, and involves merging strands into a single proof that covers a prefix of the entire ledger, useful for an L1 state update. Alternatively, we may choose to merge strands at the level of block production; this path is pursued in the next subsection. Let us emphasize that failure of a strand merger does not create a backlog, since the strands continually “update” themselves using recursion.
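
The strand merger’s task can be sketched as folding the freshest proof of each strand into one proof. The aggregator `merge_proofs` is hypothetical, and this is only one possible shape for the role:

```python
# Sketch of the strand merger role: fold the latest proof of each of the k
# recursive strands into a single proof usable for an L1 state update.

def merge_all_strands(latest_strand_proofs: list):
    merged = latest_strand_proofs[0]
    for p in latest_strand_proofs[1:]:
        merged = merge_proofs(merged, p)   # recursively verifies both inputs
    return merged                          # covers a ledger prefix ending at the oldest strand tip
```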

Note the division of labor above is somewhat similar to a forest of k growing unbalanced binary trees, with each tree describing a strand associated to a remainder \mod k. The designation of labor involves a round robin: visualize the trees arranged in a circle (the cyclic group \mathbb Z/k\mathbb Z), with each slot adding a leaf and the previous merge to its tree.

Second step: braiding the strands

To merge the recursive strands we require some strands to also prove the verification of at least one other strand. To illustrate the idea, consider the simple case of two strands, composed of the even and odd slots. Suppose we require an even slot to (additionally) prove the verification of the most recent slot of the odd strand which it continues. In total, the proof would consist of three parts: proving execution of the transactions of the previous block in the same strand, proving verification of the previous block in the same strand, and proving verification of a past block in the other strand. Together, both verifications attest to the validity of the entire history from genesis until the end of the earlier proof.

In the above example, the even strand was distinguished in that only its proofs involved verification of other strands. Failures in the distinguished strand would halt the merging of different strands. A more robust option is to have every strand try to merge itself with another. Think of this process as braiding the recursive strands into a single rope. In the simple example of two strands, the even slots would try to merge with the odd strand and vice versa.
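
In code-sketch form, a slot in the symmetric two-strand example would carry a three-part proof. This reuses the hypothetical helpers from the recursive strand sketch above, plus a hypothetical `latest_continued_slot_of_other_strand`:

```python
# Sketch of braiding with two strands (k = 2): on top of the recursive strand
# step, the slot also verifies the most recent proof of the other strand that
# it continues. Helper names are hypothetical.

def braided_proof_two_strands(n: int, chain: dict):
    parent = strand_parent(n, chain, k=2)                 # previous block of the same strand
    other = latest_continued_slot_of_other_strand(n, chain)
    return merge(
        prove_execution(chain[parent]),                   # 1. execute the parent block
        prove_verification(chain[parent].proof),          # 2. verify the same strand's history
        prove_verification(chain[other].proof),           # 3. verify the other strand's history
    )
```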

A larger k means more strands are involved. In this case, one option is to require each slot to simultaneously merge with several strands. Another option, which involves fewer proving jobs, is for each strand to alternate its merge attempts between the remaining strands. Let us give the example of three strands. Help yourself to this illustration. Each row is a strand. The black arrows represent the blockchain while the colored arrows are proofs. Focus on a single color, say, red. The red arrows represent the proofs required of the middle strand, and indeed the diagonal red arrows alternate between the other two strands. The same goes for every color.

Observe that slots 11, 12 contain proofs which respectively validate history from genesis until slots 3, 4. To see how things change in the presence of an empty slot, you are invited to stare at this diagram.

The latency of merging all the strands corresponds to the distance between the slot number of a proof and the slot number which ends the interval proven from genesis. We therefore want to reduce this distance, which is the natural distance in the graph (with each edge having unit weight). In the above example, the distance for slots 11, 12 is 8. More generally, the distance in the latter braiding proposal is \mathrm O(k) slots, whence the latency is \mathrm O(k^2) slots. We will outline an algorithm that reduces the distance to \mathrm O(\log_2 k), thereby reducing the latency to \mathrm O(k\log_2 k). The key idea is to have exponentially increasing increments. Specifically, strands will uniformly try to merge with their neighbor to the left, then with the strand two neighbors down, then four neighbors down, and so on until k, and then repeat. Seeing as a picture is worth a thousand words, let us seduce you with some pictures: four strands, five strands, and seven strands. Let’s flesh things out for five strands: in this picture we are looking at the prefix of the ledger that is proven at the bottom right slot labeled \otimes. The nodes labeled with a sequence of arrows are the most recent nodes in each strand that node \otimes will merge with. The sequence of arrows is meant to be read from right to left and specifies the path to them from \otimes.
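
One way to realize the exponentially increasing increments (an assumption on our part, not a rule spelled out above) is to have each slot derive its merge target from its position along its own strand:

```python
import math

# Sketch of the exponential merge schedule: offsets 1, 2, 4, ... (staying below k),
# repeated cyclically. This is one possible concretization, not the definitive rule.

def merge_target_strand(n: int, k: int) -> int:
    """Index of the strand that slot n's proof should additionally verify."""
    my_strand = n % k
    step = n // k                                    # position of slot n along its own strand
    num_offsets = max(1, math.ceil(math.log2(k)))    # offsets 1, 2, 4, ..., then repeat
    offset = 2 ** (step % num_offsets)
    return (my_strand - offset) % k                  # "neighbor to the left", wrapping around

# With k = 5, each strand cycles through offsets 1, 2, 4, 1, 2, 4, ...
```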

With braiding, there is no need for a "strand merger" as in the previous subsection. Since no separate merging computation is needed, this potentially allows for a robust L1 state update protocol, either with brief turns or perhaps even an open race. Of course, there is one additional verification to prove at each slot (which may warrant a somewhat larger k), alongside some more redundancy. Certainly the latter illustration is more crowded with arrows than this one.

Summary

We began this post with an outline of chained proof protocols. We then used slots to give a workable proposal that neglects recursion. Finally, we proposed two methods of incorporating recursion: first by maintaining recursive strands and merging them on demand, and second by braiding the strands as part of block production. What are your thoughts on each proposal?

Appendix: naive attempts

In this appendix we describe some naive attempts encountered during our thought process, along with their issues.

Naive attempt at chaining

  • Our first naive attempt requires block n in the chain to include a proof of block n-k, as in this illustration (linked above). The problem with this proposal is that block numbers depend on the activity of previous sequencers and are therefore unpredictable: if you are the leader of slot n and your predecessor was idle (did not produce a block) then your block number is <n and you could not have anticipated this. Unpredictability of this kind is problematic since it would prevent sequencers from knowing in advance which proofs to compute/purchase.

Naive attempts at incorporating recursion

  • Unbalanced binary tree. Refine the naive jobs of slot n to proving slot n-k and the merge of the proof in slot n-k-1 with the largest (in terms of slots) merge proof available (see this illustration). Thus each slot does two things: it proves a leaf, and then merges the previous leaf into a single root which proves a prefix of the ledger from genesis. Even in the happy flow, a big problem with this protocol is the lack of time to work on a job. Specifically, the merge job of slot n requires the completion of the merge job of slot n-1; consequently there is only 1 slot for computation of the merge proof instead of the desired k slots. Lack of time may be resolved by merging with the largest available merge-proof that is at least k slots old, but this causes asynchrony issues: the tree will bifurcate many times, resulting in several trees that are never merged.
  • Forest of balanced binary trees. To mitigate the above problem, we want every two jobs within an interval of k slots to be independent. To this end we can work in intervals of k slots (more precisely, with the next power of two), resulting in a queue of balanced binary trees with k leaves. This division of labor highly resembles Mina’s scan state. The length of the queue is chosen to provide sufficiently many jobs to ensure the desired independence. For simplicity, suppose the trees have just two levels. In this case, the naive jobs of slot n are to prove slot n-k and: if n is even, merge slots n-2k, n-2k-1; if n is odd, do nothing (see the sketch after this list). In the happy flow this works well, with enough time to compute. However, empty slots cause backlog accumulation by postponing merges, which subsequently postpone descendant merges. To illustrate this, suppose you are the leader of slot n and that slot n-k-1 was idle.
    • If slot n-1 was also idle then there’s no available proof of slot n-2k-1.
    • If slot n-1 supplied a proof of n-2k-1 there isn’t enough time for n to merge.

In both cases your job of merging n-2k,n-2k-1 must be delayed until a proof of n-2k-1 is provided.
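
For concreteness, the two-level naive assignment above amounts to something like the following sketch (illustration of the rule only; the function name is ours):

```python
# Sketch of the naive job assignment for the two-level balanced-tree variant:
# every slot proves a leaf, and even slots additionally merge a pair of
# earlier sibling leaves.

def naive_jobs_two_level(n: int, k: int):
    jobs = [("prove", n - k)]                              # leaf job: prove slot n - k
    if n % 2 == 0:
        jobs.append(("merge", (n - 2 * k, n - 2 * k - 1))) # merge two sibling leaves
    return jobs                                            # odd slots carry only the leaf job
```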

We’d be happy to hear about ideas to improve/fix these naive suggestions or to pursue entirely different paths! The braiding protocol presented above can be seen as a refinement of the first naive proposal of a single unbalanced tree.


Chained proof-of-a-protocol (LSP) and weaving are two important components of the StarkNet decentralized protocol. Their role is to improve the scalability and security of StarkNet and enable more developers to participate in the StarkNet ecosystem.

Chained Proof Protocol (LSP) is a proof system based on blockchain technology that helps StarkNet achieve efficient, scalable, and secure transaction verification. The basic idea of this protocol is to form a chain structure of all transactions, with each block containing the cryptographic hash of the previous block. This way, it is possible to verify that the transaction is legitimate by checking the hash value, avoiding problems such as duplicate transactions and double spending.

Weaving refers to combining multiple proof chains to form a larger chain structure, enabling more efficient, scalable, and secure transaction verification. Weaving technology can combine multiple verification nodes to jointly verify the legitimacy of transactions and ensure the consistency and reliability of transaction records. In this way, the scalability and security of StarkNet can be greatly improved, thereby attracting more developers to join the StarkNet development ecosystem.

In summary, the chained proof-of-a-protocol and weaving are the two core components of the StarkNet decentralized protocol, which are used to improve the efficiency and security of transaction verification, thereby driving the development of the StarkNet ecosystem.


Thanks for the update @ilia. It is great to learn details about your current thinking on (recursive) proof building for Starknet.

AFAICT the approach proposed above for computing proofs to verify L1 updates, which enshrines in the L1 the consensus achieved at the L2 level, requires either a single (parallelisable) prover or coordination among a set of disparate provers. Could you comment on this?

Also, in case coordination amongst provers may be needed, do you foresee the need of a specialised coordination protocol between them?

It would also be great if you could comment on your current thinking on how to incentivise/reward provers.

Keep up the excellent work!


Dear @dagacha thank you for your comments! I do not follow what you mean by coordination. Are you referring to the braiding proposal or the recursive strands? Regarding incentives, I’m on it.


Both braiding and strands. If there is a pool of provers, and they may build the proof collaboratively, how does each prover individually know which strand or braid they should work on? How does a collection of provers agree that a proof for a L1 state update is complete? How do they avoid wasting proving resources on sub-proofs that are being computed by others? Of course these coordination problems I see go away if provers work in isolation.

Hope this clarifies my question @ilia


In all of these chained proof protocols, the leader of slot n must include proofs of previous parts of the ledger. The designation of labor is specified in each proposal, e.g. the leader of slot n includes a proof of the transactions in slot n-k. This coupling of proof production with block production coordinates everything. L1 state updates require further action and therefore a separate protocol. There are several possible avenues here (e.g. slots, races), all mostly orthogonal to proof production.
