How might StarkNet provide high-throughput, high-integrity data availability? (validium/volition)

The first step might be a simple DAC. What might come after?

12 Likes

We have discussed several DA solutions in the past. I’ll briefly describe a few of them to give a flavor of what DA could look like on StarkNet.

  • Permissionless DA Providers: In this approach, every contract specifies its “DA service”. Every change to this contract’s off-chain storage must be accompanied by a signature from that service (enforced at the OS level). The improvement over the naive DAC is that anyone can serve as a DA service for any contract. There are some issues to note here:

    • The state commitment must be designed in a way that a change in one contract’s off-chain storage does not hurt other unrelated contracts. One way to achieve this would be to always publish the off-chain storage root for every contract (just the root).
    • It might not be enough to know my own contract’s storage, e.g. my balances are usually kept in separate ERC20 contracts. If the data of those contracts is lost, it clearly affects other entities as well. One way to mitigate this could be to add an “allowed DA services” field to a transaction, so that the transaction may only interact with contracts whose data provider appears in that list of trusted providers.
    • What happens if the data provider does not provide a timely signature? This could be a potential DoS vector on the system. One way to mitigate this could be to require every DA service to lock some funds. If the signature is not provided, the sequencer just falls back to the naive approach, i.e. sends the diff as calldata to L1, and covers the expenses from the data provider’s deposit. One might want to let the sequencer recover only part of the expenses from the deposit, so that it is unprofitable for a sequencer to omit a given signature in an attempt to hurt the data provider.
  • Global stake: A weighted signature from holders of at least a fraction P of some token must be provided in order to produce a valid proof, e.g. if there are 100 T tokens and P=0.2, and Alice and Bob hold 15 and 10 tokens respectively, then a signature from Alice and Bob would suffice (25 ≥ 20). The semantics of signing is that you agree that the data for the next state is available. Note that this is not punishable, and there’s a risk of token holders carelessly signing on a dishonest next state (dishonest in the sense that the data for it was not made available to the network). I’m ignoring further considerations here, e.g. the possible locking of the voters’ tokens for some duration.

  • Relying on the consensus of a different L1: Instead of posting the data to Ethereum, we could post it elsewhere, e.g. on an L1 which focuses on data availability, like Celestia. The challenges of this approach are twofold. First, you need some way to reach consensus on data availability, which is far from straightforward: we need to be able to agree that all the data up to state S is available to the network, and unfortunately there is no concise data availability argument. The second challenge is to bridge this consensus to Ethereum. The motivation here is to be able to verify this in our StarkNet contract, that is, whenever a valid proof is seen for the transition S → S’, we also expect to receive some witness/proof of the fact that our data availability layer has reached consensus that the data for S’ is available.
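To make the global-stake rule above concrete, here is a minimal sketch of the threshold check only. Signature verification is left out, and the names `stake_threshold_met`, `balances` and `signers` are hypothetical, not part of any StarkNet API. The threshold is passed as an exact fraction to avoid floating-point surprises right at the boundary.

```python
from fractions import Fraction


def stake_threshold_met(balances, signers, total_supply, threshold):
    """Return True if the distinct signers hold at least `threshold` of the supply.

    balances:     mapping holder -> token balance (hypothetical snapshot)
    signers:      holders whose signatures are assumed to be already verified
    total_supply: total number of tokens
    threshold:    required fraction P, given as a Fraction
    """
    # Count each signer once, even if their signature appears several times.
    signed_stake = sum(balances.get(holder, 0) for holder in set(signers))
    return Fraction(signed_stake, total_supply) >= threshold


# The example from the post: 100 tokens, P = 0.2, Alice holds 15, Bob holds 10.
balances = {"alice": 15, "bob": 10, "carol": 75}
P = Fraction(1, 5)
alice_and_bob_suffice = stake_threshold_met(balances, ["alice", "bob"], 100, P)
alice_alone_suffices = stake_threshold_met(balances, ["alice"], 100, P)
```

Here `alice_and_bob_suffice` is true (25 of 100 tokens signed) while `alice_alone_suffices` is false (15 of 100).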

Note that the question we’re handling here is orthogonal to the technical design of Volition, namely, how one would define off-chain/on-chain storage inside a contract. That design can be available before a DA solution, i.e. anyone will be able to define (cheaper) storage which does not reach L1. However, until some guarantees along the lines of the above are provided, using this storage is completely at the user’s own risk. While we’re the only sequencer, we can obviously store it for less than the L1 calldata fee; in the multi-sequencer phase, however, using such trusted storage is prone to attacks, as anyone can update your storage and refuse to share the new values.

17 Likes

Global stake: A weighted signature from holders of at least a fraction P of some token must be provided in order to produce a valid proof, e.g. if there are 100 T tokens and P=0.2, and Alice and Bob hold 15 and 10 tokens respectively, then a signature from Alice and Bob would suffice (25 ≥ 20). The semantics of signing is that you agree that the data for the next state is available. Note that this is not punishable, and there’s a risk of token holders carelessly signing on a dishonest next state (dishonest in the sense that the data for it was not made available to the network). I’m ignoring further considerations here, e.g. the possible locking of the voters’ tokens for some duration.

To prevent lazy signers, we could require DA participants to prove that they know all of the information in the state delta for the batch:

  • Require locking a set number of tokens to be considered a DA validator. For a validator V, let V_privkey be the private key capable of withdrawing these locked tokens.
  • For a given batch, randomly select a committee of m validators.
  • Construct a Merkle tree of the state updates (key/value pairs).
  • Require a quorum of DA validators to submit a ZK proof that they (a) know their V_privkey, and (b) know a series of state updates U1, U2, … which, when merklized, result in the correct root value.
    • Requiring knowledge of V_privkey prevents DA validators from outsourcing their work (whoever they outsourced to could steal their funds.)
    • Only reward the DA validators (with fees, etc) if the batch succeeds, so they have an incentive to work with honest sequencers
    • If a sequencer fails to submit all the DA validator proofs in time, penalize the sequencer for likely having refused to send validators the data they would’ve needed to generate the proofs.
  • Verify these proofs inside of the STARK for each batch (would probably require recursive proofs)
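As a rough illustration of the "Merkle tree of state updates" step, the sketch below computes a root over key/value pairs. It is only an assumption about how such a tree could be laid out: SHA-256 stands in for whatever hash the real system would use, the leaf encoding is made up for this example, and the names `leaf_hash` and `merkle_root` are hypothetical. The ZK proof of knowledge itself is not modeled.

```python
import hashlib
from typing import Mapping


def _sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(key: bytes, value: bytes) -> bytes:
    # The 0x00 prefix separates leaves from internal nodes; the length prefix
    # on the key makes the key/value boundary unambiguous.
    return _sha256(b"\x00" + len(key).to_bytes(4, "big") + key + value)


def merkle_root(updates: Mapping[bytes, bytes]) -> bytes:
    """Root of a binary Merkle tree over the updates, sorted by key."""
    if not updates:
        return _sha256(b"")
    level = [leaf_hash(k, v) for k, v in sorted(updates.items())]
    while len(level) > 1:
        # Hash adjacent pairs; an unpaired last node is carried up unchanged.
        parents = [
            _sha256(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level) - 1, 2)
        ]
        if len(level) % 2 == 1:
            parents.append(level[-1])
        level = parents
    return level[0]


# A validator who really holds U1, U2, ... can recompute the committed root.
state_updates = {b"balance/alice": b"15", b"balance/bob": b"10", b"nonce/alice": b"7"}
root = merkle_root(state_updates)
```

Sorting by key makes the root independent of the order in which a validator received the updates, so every honest validator holding the same data arrives at the same value.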

This moves us to a 1-of-m trust model, where you need to trust at least one of the randomly-selected DA participants who is participating in a given block.
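One way the random committee selection could be made reproducible is to rank validators by a hash of a shared seed and their identifier and take the first m. This is a sketch under assumptions the post does not specify: the source of the seed (it would have to be unpredictable and hard to bias), the identifier format, and the function name `select_committee` are all hypothetical, and every validator is treated as equally weighted because each locks the same set number of tokens.

```python
import hashlib


def select_committee(validators, seed: bytes, m: int):
    """Pick m distinct validators, deterministically, from a shared seed."""
    unique = set(validators)
    if m > len(unique):
        raise ValueError("committee size exceeds number of validators")
    # Rank validators by H(seed || id); anyone with the seed can recompute this.
    ranked = sorted(
        unique,
        key=lambda v: hashlib.sha256(seed + v.encode("utf-8")).digest(),
    )
    return ranked[:m]


validators = [f"validator-{i}" for i in range(1000)]
committee = select_committee(validators, seed=b"batch-42-seed", m=5)
```

Because the ranking depends only on public inputs, the sequencer and every validator can agree on who is in the committee for a batch without further communication.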

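At a committee size of m=500, an attacker who controls a fraction p=0.9 of all locked tokens would fill the entire committee, and so be able to DoS the system, with a per-batch probability of p^m ≈ 1.3e-23 (treating the m selections as independent). Given hourly batches, the system could run for 100 years (about 876,000 batches) with a cumulative DoS probability of roughly 1.2e-17, comfortably below .0000000000000001.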
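The arithmetic behind this estimate can be checked with a few lines. The sketch assumes each committee seat is filled independently, with probability p of going to the attacker, and that batches are independent of each other; the function names are hypothetical.

```python
import math


def full_capture_probability(p: float, m: int) -> float:
    """Probability that all m independently chosen seats go to the attacker."""
    return p ** m


def cumulative_probability(per_batch: float, batches: int) -> float:
    """1 - (1 - q)^n, computed with log1p/expm1 so tiny q is not rounded away."""
    return -math.expm1(batches * math.log1p(-per_batch))


p, m = 0.9, 500
batches = 100 * 365 * 24  # hourly batches for 100 years (leap days ignored)

per_batch = full_capture_probability(p, m)
over_100_years = cumulative_probability(per_batch, batches)

print(f"per batch:      {per_batch:.2e}")
print(f"over 100 years: {over_100_years:.2e}")
```

Under these assumptions the per-batch value is on the order of 1e-23 and the 100-year total on the order of 1e-17.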


(A simpler solution might be to not even attempt to provide super-high-integrity DA, and instead rely on a centralized StarkWare DAC until Ethereum data shards are released and calldata becomes 100-1000x cheaper.)

13 Likes