In continuation of post I (introduction), post II (leader elections), and post III (consensus), this post outlines our thoughts and questions surrounding proofs in the decentralized Starknet protocol. The exposition is divided into five parts:
- Proof fundamentals
- Goals of prover decentralization
- Proof production (division and designation of labor)
- L1 state updates
- Pricing and transaction fees
Proof fundamentals
Given a sudoku puzzle, verifying a solution is easier than solving from scratch. If our goal is to convince people of the statement “this puzzle has been solved”, we can save a lot of computation by having one person compute a solution and then propagate it for others to verify. In this strategy, each computation of a solution becomes a one-time event which does not require replication by society. In a similar vein, Starknet scales Ethereum by replacing heavy L1 computation with lighter (hence cheaper!) L1 verification using STARK proofs computed off-chain. In this section, we outline high-level information about these proofs which directs protocol design.
Complexity: proof vs verification
The STARK proof complexity of an execution of a program in n steps is n\log n while verification complexity is \operatorname{polylog}n .
- It is slightly more computationally efficient to prove k small jobs of length n than a single large job of length kn , because k\times n\log n \leq knĂ—\log kn .
- It is much more computationally efficient to verify (a proof) of a large job of length kn than of k small jobs of length n , because \operatorname{polylog} kn\ll kĂ—\operatorname{polylog}n .
The benefit from the proof inequality is logarithmic and therefore minor. In contrast, the benefit from the verification inequality is substantial: it reduces a linear dependence on the number of jobs to a logarithmic dependence.
Recursion
By a recursive proof we mean a proof of verifications of other proofs. Efficient recursion allows us to reap the benefits of logarithmic verification without the drawback of computing enormous proofs. A recursive Cairo verifier has been in production for several months. Recursive proofs can be described by a forest of trees whose leaves are applicative jobs and whose internal nodes are recursive merge jobs.
- Compression – verification complexity of the root is polylogarithmic in the sum of the leaves. Consequently, submitting a root proof of a tree with many leaves makes L1 verification more economically efficient than verifying many leaves.
- Concurrency – a queue of jobs can be divided into concurrent queues of smaller jobs which are handled separately and later merged.
- Less waiting – no need to wait for the last job in a train to begin proving.
- Accessibility – the absence of huge proofs allows smaller machines to contribute to the recursion tree.
More details on benefits of recursion may be found in this lucid post by StarkWare’s head of core engineering.
Goals of prover decentralization
Having outlined the fundamentals of proofs, we ask ourselves: What is the goal of decentralizing the proving layer on Starknet? By design, performing an L1 state update to a state S will require evidence of consensus on S to prevent hijacking the L1 rollup contract. Since proofs are sound, provers will therefore be restricted to proving the ledger output by consensus. Hence, the only malicious prover behavior is a liveness attack by veto on consensus. The primary goal of decentralizing the prover layer is to ensure (user confidence in) its liveness.
We observe two orthogonal protocol properties that facilitate decentralization:
- Permissionless (censorship resistant) – prover participation cannot be censored by stakers
- Accessible – the entry barrier for participation is low enough to prevent a corporate cartel
If the entry barrier involves computational capabilities (latency, throughput) or economic efficiency (proof price), then there is a trade-off between accessibility and performance. For example, PoW is more accessible than racing for the fastest proof, since the latter will be completely dominated by specialized corporations (more on this comparison later).
An exclusive protocol is susceptible to liveness failures ordered by some executive decree, e.g an ill-disposed CEO of the dominant proving conglomerate. In such a scenario, sufficiently profitable permissionless protocols will eventually incentivize competing corporations and recover. However, eventual recovery raises an obvious question: what is the expected recovery period? As long as the proof protocol stalls, users cannot perform L1/L2 interaction nor finalize high-valued transactions. Even if provers are highly performant, delayed recovery periods may deter many users.
A theoretical method to cut extended recovery periods is gradual opening: while proofs stall, the entry barrier is lowered until liveness is recovered. In practice, this opening may not improve on the eventual recovery scenario outlined above. True improvement requires people and small companies to maintain proving infrastructure which can be used as a fallback. Unfortunately, we don’t see how to incentivize such maintenance if it is unprofitable in the happy flow, where the dominant provers are live.
What is a good sweet spot in the trade-off between accessibility and performance of the proof protocol? How can it be achieved in practice?
Proof production
With motivation to decentralize the proving layer we turn our focus to the proof production production process, which is necessary for L1 state updates. We analyze the production process through its stages:
- Division of labor – what are the jobs?
- Designation of labor – who performs each job?
While designation of labor is coupled with decentralization, division of labor may be treated in the centralized setting. This observation suggests a centralized starting point. Before diving in, we’d like to acknowledge the Mina protocol whose division of labor and designation of labor are used in production.
Division of labor
Following the above, let us begin with a centralized proving layer operated by a single entity. The prover must handle the stream of blocks output by consensus (likely recorded as L1 checkpoints).
The prover will determine jobs and their scheduling to optimize for some parameter (compression, latency, price) constrained by its computational limitations and the protocol itself. The optimization favored by the prover will depend on incentives. For example, low proving rewards will cause the prover to spin up cheaper servers to optimize for costs at the expense of latency and/or throughput.
Here’s a representative division of labor problem. Assume a consensus throughput of 100K transfers/sec. Further assume the prover can spin up at most 256 proving machines, each at a fixed hourly rate with a throughput of 1K transfers/sec. Lastly, assume a root proof encompassing 100M leaf steps is profitable for an L1 state update. Given a maximal hourly budget, describe which leaf size and tree structure minimizes latency for a profitable root proof. It will be natural to decide on division of labor for particular performance parameters, and only once we have decided what we wish to optimize.
Designation of labor
A decentralized proving layer requires the protocol to designate which provers are eligible for the rewards of a particular job. We dissect the designation process through a series of questions.
- Which qualities are sought by the designation process?
- Competition or turns (or a combination)?
- Which designation process satisfies the above criteria?
Example qualities are proving speed, cost efficiency, and amount of stake. A competition for proving speed takes the form of a race, for cost efficiency - the form of an auction, and for stake - the form of a vote.
Competition has two main advantages: it discovers the best candidate and also incentivizes improvement. Disadvantages depend on the distribution of winners:
- The “best always wins” model can lead to centralization by disincentivizing participation. Specifically, the concern is that A is ε “better” than B, but always wins. This is especially risky when participation requires effort that will likely go unrewarded, e.g in a deterministic computation race.
- The “fair share” model (winning chance roughly equals relative performance) is more compatible with decentralization (e.g PoW) but introduces redundancy which increases operation costs. Specifically, if A is ε “better” than B then both have roughly equal winning chances, then they both expend effort to participate.
How to best enjoy the benefits of competition without risking centralization?
In contrast to competition, turn-based monopolies resolve incentive problems well, but may not discover the best candidate for each job or incentivize improvement. A glaring question is: what are the turns based on? Stake-based turns allow stakers to choose provers. This does not seem problematic in principle, but the “classical” role of stake is to provide Sybil resistance only for the consensus layer. What are some other interesting ways of realizing turn-based monopolies? How to incentivize improvement in such a model?
A turn-based model can also be competitive, e.g a turn-based auction. For example, Ethereum’s designation of labor for sequencers is essentially a turn-based auction, with the winner being the first sequencer willing to sell block space to a transaction in exchange for its fee.
L1 state updates
The last part of the Starknet protocol is the L1 state update, which finally achieves L1 security by L1 proof verification. To incentivize provers to actually converge (i.e merge instead of only proving leaves), it seems natural to issue prover rewards only upon L1 state updates. Furthermore, L1 logic can regulate the entire recursive tree associated with the submitted root proof. For example L1 logic may check prover identities, issue individual rewards according to proof size, and reject overpriced proofs.
A key distinction between proof production and L1 state updates is the non-computational nature of the latter. Indeed L1 state updaters merely observe the forest of proofs output by the proof production process and follow their own protocol to perform L1 state updates by “copy-pasting” proofs to L1. The absence of computation precludes wasted computation in case of failed L1 state updates. Furthermore, it renders computational capacity irrelevant and consequently averts centralization around strong machines. These benefits suggest a “best always wins” competitive model for L1 state updates. Specifically, an open race where only the first person to perform an L1 state update receives a reward.
Unfortunately, in our current L1 state update mechanism the submission of proofs for L1 verification also includes state-diffs that require a lot of expensive calldata. Since reverted transactions still pay for all of their data, an open race would inflict non-negligible losses on the losers, disincentivizing participation. Is there a clever way to have an open race protocol for L1 state updates in which the losers’ losses are negligible? One idea is to use a commit-reveal scheme where the first committer deposits collateral in exchange for a brief monopoly. To defend against extended DoS, we may exponentially increase the required collateral.
An open race for L1 state updates can refuse proofs unless submitted by the designated prover as described above. However, it is also interesting to only restrict issuance of rewards to designated provers, without rejecting the proofs themselves. This would allow incentivized parties to perform L1 state updates at personal expense using their own proofs.Note we can always fall back to a stake-based leader schedule for L1 state updates. In this case we simply lose the benefit of updating at the rate of the first incentivized entity.
Pricing and transaction fees
This section discusses the following pair of questions.
- Division of funds – how are transaction fees divided between sequencers and provers?
- Fee calculation – how do users calculate transaction fees without severely under/overpaying?
Starknet transactions inflict distinct computational burdens on the sequencer and prover layers. Since users send their transactions directly to sequencers, it is natural to facilitate free trade via sequencer transaction fees. Specifically, the sequencer sells block space to users in exchange for transaction fees.
The situation is different for provers, whose jobs are not determined by direct interaction with users but rather restricted to the ledger output by consensus. Hence requiring users to pay a separate proving fee for each transaction feels somewhat contrived. It seems more natural to abstract the division of funds between sequencer and prover away from the user. To this end, the user may specify a single transaction fee for the combined computation inflicted on the protocol, with division of funds occurring at a later stage outside user experience.
This approach is taken by the Mina protocol: transaction fees are paid to sequencers who subsequently use them to purchase proofs. The protocol is not involved in pricing and incentivizes sequencers to purchase proofs by merely coupling block production with proof production. In this architecture, transaction fees become correlated to the proving complexity of previous transactions and not themselves. It is possible to uphold correlation between a transaction’s fee and its proof complexity by moving proof production into the critical path by requiring each sequencer to append to its block a proof of its execution. This architecture further supports a flexible division of labor, while designation of labor is via “best always wins” competition for cost efficiency. It is also possible to move to a “fair share” competition by introducing some non-determinism into proof generation by forgoing completeness. This idea is explored by this paper and serves as the foundation for consensus in the Aleo blockchain.
An alternative approach to division of funds is to have the users separately specify fees for their transactions’ marginal execution and proof complexities.
Both approaches to division of funds face the fee calculation problem, which exists regardless of proofs in the protocol. Ethereum resolves this problem for sequencing using the EIP1559 method of protocol-level price discovery. Since Ethereum block time is several seconds, the responsive base fee algorithm discovers price changes quickly, promoting a pleasant user experience. If proofs are frequently propagated as part of the protocol, an EIP1559-like method will supply responsive posted-prices regardless of the underlying designation of labor. For example, the turn-based auctions for sequencing can be replaced with open auctions, with a collateral deposit required for every bid. The extent to which resulting prices reflect the market depends on accessibility of the protocol. If proofs are infrequently propagated, an EIP1559-like algorithm will be insufficiently responsive and it may be better to record auctions on L1 for quick censorship-resistant price discovery.
Summary
The abundant literature on consensus and the proliferation of blockchains are a great guide for designing the sequencing layer of the protocol. For proofs, less is established. We have presented our thoughts and questions about a decentralized proving layer for Starknet and hope to receive insightful feedback. A future post will discuss how to use checkpoints for a high level of fast finality, and subsequent posts will outline concrete protocol suggestions. See you there!