Starknet: Multilang & MultiVM

Disclaimer: These are just some thoughts of mine. They currently have no implications on the Starknet roadmap.

This post outlines some thoughts around the following topics:

  1. Why should we support multiple smart contract languages?

  2. What core stack(s) should we choose for Starknet?

  3. How can we implement them, and how painful is it to add things?

  4. What migrations are necessary and how should we go about them?

When should you care? Nothing here is actionable in the upcoming months, so take your time.

Stack of a zk-blockchain

Here are the key terms. You can skip to the diagram below for some visuals (ignore the red arrows).

  1. High level language (HLL) not coupled to blockchain

  2. Smart contract languages (SCL) which extend high level languages with blockchain-related notions of state, syscalls, etc.

  3. Blockchain VM (bcVM) which defines a blockchain-oriented instruction set architecture (ISA) that provides resource metering, state, syscalls, and safety.

  4. zkVM which defines a proving-oriented ISA.

In Ethereum there are just two levels to the core stack: contract languages (Solidity, Vyper) and the blockchain VM (EVM). Ethereum is currently not a zk-blockchain as its protocol involves no proofs of EVM execution.

Rollups have another level in their stack: a zkVM, i.e. a provable (and proof-friendly) arithmetization of a VM.

  • EVM-based rollups have a zkEVM, and compile EVM bytecode to their zk instruction set.

  • Starknet has an entirely different stack, with Cairo as the SCL, Sierra as the bcVM, and Cairo VM as the zkVM.
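The layering above can be written down as a small data structure. This is only an illustrative sketch: the `CoreStack` type and the `is_zk_stack` helper are hypothetical names introduced here, and the entries simply restate the Ethereum and Starknet stacks as described in this post.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CoreStack:
    """One core stack: SCL(s) on top, bcVM in the middle, optional zkVM below."""
    name: str
    scls: Tuple[str, ...]   # smart contract languages offered to developers
    bcvm: str               # blockchain VM (metering, state, syscalls, safety)
    zkvm: Optional[str]     # proving-oriented VM; None if execution is not proven


# Entries restate the stacks as described in the post.
ETHEREUM = CoreStack("Ethereum", ("Solidity", "Vyper"), "EVM", None)
STARKNET = CoreStack("Starknet", ("Cairo",), "Sierra", "Cairo VM")


def is_zk_stack(stack: CoreStack) -> bool:
    """A stack counts as a zk stack when it has a zkVM layer."""
    return stack.zkvm is not None
```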

Why? Building for adoption

Our goal is to bring adoption to Starknet. The consumer-facing part of our stack is the SCL(s), “consumed” by SC developers. All other things equal, devs obviously prefer languages that are familiar and/or have a good reputation/adoption. Blockchains with exotic SCLs must fight back with an efficient stack (causing lower transaction fees & latency) and other features.

Historically, both the Ethereum and Starknet stacks developed bottom-up; the SCLs are both exotic.

  • EVM was invented to facilitate general computation on blockchain. Solidity was subsequently invented to provide good SC devX.

  • Cairo VM was invented to avoid manual AIR-writing per business logic. Then CairoZero was invented as a low-level language for writing exchange-oriented business logic. Two problems arose later: how to charge for reverted Starknet transactions (red-green) and poor devX, which motivated Sierra and high-level Cairo respectively. Note the direction was bottom-up: the SCL was invented to fit the existing VM.

To improve adoption, we can attract devs by adding SCLs to Starknet. Some examples:

  1. Solidity, highly adopted in blockchain as the main SCL in the Ethereum ecosystem

  2. Other exotic SCLs inspired by Rust, e.g. Move and Sway.

  3. SCLs based on HLLs that are adopted in web2

    1. Existing ones include Solana’s flavor of Rust, Arbitrum Stylus, CosmWasm

    2. Perhaps we’ll want to define our own SC extensions for Rust, TypeScript, Python, etc.

What core stack(s)?

Attractive SCLs carry a lot of weight in this choice, and so do the other features of the core stack.

Candidate directions

M31 Cairo

  • Pro: Easiest way toward lightning-fast client-side proving. Bonus: HLL Cairo as the best (fastest & friendliest) provable language.

  • Con: Other HLLs & SCLs accessible only through extra compilation/interpretation layers.

LLVM-friendly

  • Pro: May attract Web2 devs: HLLs that compile to LLVM are supported, and can be extended to SCLs.

  • Con: LLVM forgets types, so type-safety is unclear. No off-the-shelf bcVM.

  • Remarks: Athena bcVM WIP.

WASM-friendly

  • Con: SpaceMesh advised against, due to ever-growing instruction set.

Solidity-friendly

  • Pro: Helps unlock Solidity devs.

  • Con: Doesn’t provide tooling, which is coupled to EVM and Ethereum RPC.

Move-friendly

  • Pro: Benefit from a bcVM built by experts. Bootstrap Aptos & Sui ecosystem.

  • Con: Not LLVM-friendly.

  • Remarks: Typed Starknet.

Priorities?

Depends on the goal. IMO here are the most interesting goals as of Oct 13, 2024 and some paths toward them:

  1. :zap: Lightning-fast client-side proving & verification on Starknet

    1. M31 Cairo

    2. Rusty SCL + Rust-friendly stack to wrap Rust verifier in a contract

  2. Attract Web2 devs

    1. Pursue an LLVM-friendly stack (bcVM unclear)

    2. Compile HLLs to MoveVM and write a Move zkVM?

    3. Compile HLLs to Sierra (WIP by Reilabs)

  3. Attract Solidity devs

Do we need to choose just one direction for years?

No. See next section.

How? MultiVM!

⚠️ What follows is just a sketch of an idea; apologies if imprecise, unclear, inaccurate, etc.

There are three flavors of solutions, ranked from most to least efficient:

  1. Native Multi-VM – prove AIRs for more instructions

  2. Compile everything to one core stack (example: LLVM→Sierra WIP by Reilabs)

  3. Write interpreters in Cairo (example: RISC-V interpreter WIP by MassaLabs; Kakarot :carrot:)

Conceptually, option #1 is “downwards” while option #2 also involves horizontal steps across stacks.

Some advantages of #1 over #2:

  1. Adding a new stack is easier:

    1. Option #1. Nothing is needed across stacks, but we do need AIRs (which we’re best at). We’ll also benefit from tooling that will be naturally developed as part of other stacks.

    2. Option #2. Compilers across stacks can be unnatural (e.g. EVM→CASM), both in principle and tooling-wise.

  2. In option #2, moving to a new distinguished stack sucks. Choice between two evils:

    1. Throw away the compilers to the previous distinguished stack and develop new stuff

    2. Only develop a compiler from the old to the new distinguished stack, but incur inefficiency of two layers of compilation from the other stacks.

We’ll focus on option #1: native multi-VM. Below we outline two approaches. The modular one manages several separate core stacks that operate on different (but possibly intersecting) parts of the Starknet state. The monolithic one is founded on a big monolithic ISA.

  • From the product PoV, the modular approach has the slight benefit that we can use the separate stacks for other products/purposes. I don’t see additional differentiators atm.

  • Engineering PoV – deferring to engineering. Two observations:

    • The monolithic approach incurs some complexity cost due to many constraints being used in the same proof (very large AIR).

    • The modular approach has a more complicated flow using applicative recursion.

Modular – multiple operating systems

The state of Starknet is partitioned per-VM. Adding another stack means appending new parts to the Starknet state that are compatible with the zkVM that will prove them. For example, adding the M31-Cairo stack means a new part of the state whose commitment hash may be Blake/M31-Pedersen, and not the current mix of Pedersens and Poseidons. The old state can be unaffected.
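A toy model of a per-VM partitioned state may help make this concrete. Everything here is an assumption made for illustration: `Partition`, `PartitionedState` and the stack identifiers are hypothetical names, SHA-256 stands in for the current Pedersen/Poseidon mix, BLAKE2s stands in for a Blake-based commitment, and a flat hash chain replaces the real commitment tree.

```python
import hashlib
from typing import Callable, Dict

HashFn = Callable[[bytes], bytes]


def legacy_hash(data: bytes) -> bytes:
    # Stand-in for the current Pedersen/Poseidon commitments (assumption).
    return hashlib.sha256(data).digest()


def blake_hash(data: bytes) -> bytes:
    # Stand-in for a Blake-based commitment of a new stack (assumption).
    return hashlib.blake2s(data).digest()


class Partition:
    """The part of the state owned by one stack, with its own hash function."""

    def __init__(self, hash_fn: HashFn) -> None:
        self.hash_fn = hash_fn
        self.storage: Dict[str, int] = {}

    def commitment(self) -> bytes:
        # Flat hash chain over sorted entries; a real design would use a tree.
        acc = b""
        for key in sorted(self.storage):
            entry = key.encode() + self.storage[key].to_bytes(32, "big")
            acc = self.hash_fn(acc + entry)
        return acc


class PartitionedState:
    """Global state as a set of independent per-stack partitions."""

    def __init__(self) -> None:
        self.partitions: Dict[str, Partition] = {}

    def add_stack(self, stack_id: str, hash_fn: HashFn) -> None:
        if stack_id in self.partitions:
            raise ValueError(f"stack {stack_id!r} already exists")
        self.partitions[stack_id] = Partition(hash_fn)

    def commitments(self) -> Dict[str, bytes]:
        return {sid: p.commitment() for sid, p in self.partitions.items()}


# Usage: adding a new stack leaves the old partition's commitment untouched.
state = PartitionedState()
state.add_stack("cairo-252", legacy_hash)
state.partitions["cairo-252"].storage["balance"] = 7
before = state.commitments()["cairo-252"]

state.add_stack("cairo-m31", blake_hash)
state.partitions["cairo-m31"].storage["counter"] = 1
after = state.commitments()["cairo-252"]
```

In this sketch `before == after`, which mirrors the claim that the old state can be unaffected when a new stack is appended.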

Each part of the state is managed by its own operating system. OSₖ is a program that compiles to assembly for zkVMₖ – just as the current OS compiles to CASM.

So far we can handle disjoint stacks, but we really want composability. To this end we need to define several things:

  1. How does a transaction specify entry points to contracts from multiple stacks?

  2. How can contracts communicate across stacks? Suppose a function f in contract A wants to call a function g from contract B…

  3. How can a contract change to a class from a different stack?

  4. How is inter-stack communication proven? OSₘ manages stack m but it doesn’t “understand” n. How to prove a transaction that goes back and forth between the stacks?

We’ll cover each in turn.

  1. Either transactions will specify entry points in “heterogeneous”, stack-specific formats, or we’ll have some uniform unstructured identifiers.

  2. The SCLs will refer to contract interfaces in “universal” unstructured terms. Each SCL→bcVM compiler will require a utility to specialize the universal description to its zkVM.

    • @ilyalesokhin suggests using the ABI, and also thinks associated utilities will be straightforward to implement.

  3. The replace_class_hash syscall will be aware of all stacks. :warning: Perhaps this is relevant for more syscalls. (Is it cleaner to move this logic into the state manager?)

  4. Hints.

    1. During the execution phase, every context-switch will be marked. During the proving phase, each OSₘ will:

      1. Receive hints about intermediate states.

      2. Externalize intermediate state data as public outputs.

    2. The hints will be jointly verified by a “state manager” program in an applicative recursion proof. The state manager is written in whatever HLL engineering prefers (no SCL is needed).

    3. :warning: There is inevitable overhead in translating between the different state representations for each stack.

      1. Universal translation logic can sit in the state manager.

      2. Each OS can translate to some “universal” unstructured encoding, so the state manager only deals with book-keeping.

  • This design draws from the syscall handler and also from book-keeping that arises in proof-chains.
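The book-keeping role of the state manager can be sketched as follows. This is a loose illustration under several assumptions: each context switch yields one `SegmentOutput` (a hypothetical name) that the responsible OS externalizes as public output, intermediate states are already in some “universal” encoding (plain strings here), and the proofs themselves are not modeled at all. The function only checks that the intermediate states chain together.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SegmentOutput:
    """Public output of one OS for one execution segment (hypothetical shape)."""
    stack_id: str       # which OS / stack executed this segment
    state_before: str   # universal encoding of the state on entry
    state_after: str    # universal encoding of the state on exit


def check_transaction(initial: str, final: str,
                      segments: List[SegmentOutput]) -> bool:
    """Check that the segments chain from the initial to the final state.

    This is book-keeping only; in the design sketched above, every segment
    would also carry a proof verified in the applicative recursion step.
    """
    current = initial
    for seg in segments:
        if seg.state_before != current:
            return False  # a hint about an intermediate state does not match
        current = seg.state_after
    return current == final


# Usage: a transaction that goes m -> n -> m across two stacks.
trace = [
    SegmentOutput("m", "s0", "s1"),
    SegmentOutput("n", "s1", "s2"),
    SegmentOutput("m", "s2", "s3"),
]
ok = check_transaction("s0", "s3", trace)

# A trace with a mismatched intermediate state is rejected.
broken = [trace[0], SegmentOutput("n", "sX", "s2"), trace[2]]
rejected = not check_transaction("s0", "s3", broken)
```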

Monolithic – large instruction set

Bundle everything into a big ISA. TBD whether it also makes sense to bundle everything into a big bcVM. Compiler work seems more complicated to me, but leaving such discussions to engineering.

Migrations

This section has some proposed procedures for migrating Starknet (classes, state, OS) from its current core stack to another one. Everything is very naive; apologies if also stupid. :see_no_evil:

In the multi-VM approach, I think we can avoid migrations entirely:

  1. No large-scale migrations are necessary.

  2. To avoid breaking changes when adding a new stack, we can move the state to e.g. a forest of several trees indexed by some sequence numbers. This is similar to DA trees in the Volition design.
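One way to picture the forest idea is a list of tree roots indexed by sequence number, where adding a stack appends a new tree and leaves the existing roots as they are. The `StateForest` class below is a hypothetical sketch: the roots are opaque byte strings, BLAKE2s is an arbitrary stand-in hash, and the forest commitment is a simple hash chain rather than anything taken from the Volition design.

```python
import hashlib
from typing import List


def h(data: bytes) -> bytes:
    # Arbitrary stand-in hash for the forest commitment (assumption).
    return hashlib.blake2s(data).digest()


class StateForest:
    """A forest of state trees, indexed by sequence number."""

    def __init__(self) -> None:
        self.tree_roots: List[bytes] = []

    def add_tree(self, root: bytes) -> int:
        """Append the tree of a new stack and return its sequence number."""
        self.tree_roots.append(root)
        return len(self.tree_roots) - 1

    def update_tree(self, seq: int, new_root: bytes) -> None:
        """Replace the root of one tree; other trees are not touched."""
        self.tree_roots[seq] = new_root

    def forest_commitment(self) -> bytes:
        """Commit to all (sequence number, root) pairs in order."""
        acc = b""
        for seq, root in enumerate(self.tree_roots):
            acc = h(acc + seq.to_bytes(4, "big") + root)
        return acc


# Usage: the existing tree keeps its root when a new stack is added.
forest = StateForest()
seq_old = forest.add_tree(h(b"current Starknet state"))
root_old = forest.tree_roots[seq_old]
commit_before = forest.forest_commitment()

seq_new = forest.add_tree(h(b"state of a newly added stack"))
commit_after = forest.forest_commitment()
```

Here only the top-level commitment changes; `forest.tree_roots[seq_old]` is still `root_old`, so nothing inside the old tree has to be recomputed.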

State. One-time proof of conversion: we prove the conversion from the current state representation to the new one. For example, to move from the present 252-bit Cairo + mixed Pedersen-Poseidon commitment to M31 Cairo with e.g. Blake, we’ll prove type conversions and also compatibility of commitments using the old hash and the new.
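As a rough illustration of what such a conversion statement could assert, the sketch below checks three things: the old state matches the old commitment, the new state matches the new commitment, and every new value is a faithful re-encoding of the old one. All specifics are assumptions chosen for the example: values are treated as plain integers below 2^252 (ignoring the actual Cairo prime), each value is split into nine 28-bit limbs so that every limb fits below the M31 modulus 2^31 - 1, SHA-256 stands in for the old hash and BLAKE2s for the new one, and both commitments are flat hash chains. Running these checks in Python is of course not a proof; it only spells out the relation a proof would attest to.

```python
import hashlib
from typing import Dict, List

M31 = 2**31 - 1   # Mersenne-31 modulus
LIMB_BITS = 28    # assumed limb width, chosen so that limbs are below M31
NUM_LIMBS = 9     # 9 * 28 = 252 bits


def to_limbs(value: int) -> List[int]:
    """Split a 252-bit value into little-endian 28-bit limbs."""
    if not 0 <= value < 2**252:
        raise ValueError("value does not fit in 252 bits")
    mask = (1 << LIMB_BITS) - 1
    return [(value >> (LIMB_BITS * i)) & mask for i in range(NUM_LIMBS)]


def from_limbs(limbs: List[int]) -> int:
    """Recombine little-endian limbs into one integer."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))


def old_commitment(state: Dict[str, int]) -> bytes:
    # SHA-256 stands in for the Pedersen/Poseidon commitment (assumption).
    acc = b""
    for key in sorted(state):
        acc = hashlib.sha256(
            acc + key.encode() + state[key].to_bytes(32, "big")).digest()
    return acc


def new_commitment(state: Dict[str, List[int]]) -> bytes:
    # BLAKE2s stands in for the Blake-based commitment (assumption).
    acc = b""
    for key in sorted(state):
        limb_bytes = b"".join(l.to_bytes(4, "big") for l in state[key])
        acc = hashlib.blake2s(acc + key.encode() + limb_bytes).digest()
    return acc


def check_conversion(old_state: Dict[str, int], old_root: bytes,
                     new_state: Dict[str, List[int]], new_root: bytes) -> bool:
    """The relation a one-time conversion proof would attest to."""
    if old_commitment(old_state) != old_root:
        return False
    if new_commitment(new_state) != new_root:
        return False
    if old_state.keys() != new_state.keys():
        return False
    return all(
        all(0 <= limb < M31 for limb in new_state[key])
        and from_limbs(new_state[key]) == old_state[key]
        for key in old_state
    )


# Usage: convert a tiny state and check the relation.
old = {"a": 2**251 + 5, "b": 0}
new = {key: to_limbs(value) for key, value in old.items()}
valid = check_conversion(old, old_commitment(old), new, new_commitment(new))
```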

I’m really in favor of:

“option #1: native multi-VM. Below we outline two approaches. The modular one manages several separate core stacks that operate on different (but possibly intersecting) parts of the Starknet state”

I think it’s the best way to foster an ecosystem of AIR/ISA builders, each with different purposes. The monolithic approach, on the other hand, would most probably limit the number of contributors and consequently the total number of ISAs built.

Although it is not ZK, I think Arbitrum Stylus might be a good example of a native/modular multi-VM.
I wonder how they handle the multi-OS approach. I will read a bit more on it.