Special thanks to Fede, Danno Ferrin and Tim Beiko for feedback and review.
Ethereum aims to be the world ledger: the platform that stores civilization's assets and records, the base layer for finance, governance, high-value data authentication, and more. This requires two things: scalability and resilience.
The Fusaka hard fork aims to increase the amount of data space available to L2 data by 10x, and the current proposed 2026 roadmap includes a similarly large increase for the L1. Meanwhile, the Merge upgraded Ethereum to proof of stake, Ethereum's client diversity has improved rapidly, work on ZK verifiability and quantum resistance is progressing, and applications are getting more and more robust.
The goal of this post is to shine a light on an aspect of resilience (and ultimately scalability) that is just as important, and easy to undervalue: the protocol's simplicity.
One of the best things about Bitcoin is how beautifully simple the protocol is:
There is a chain, which is made up of a series of blocks. Each block is connected to the previous block by a hash. Each block's validity is verified by proof of work, which means... checking that the first few bytes of its hash are zeroes. Each block contains transactions. Transactions spend coins that were either created through the mining process, or outputted by previous transactions. And that's pretty much it. Even a smart high school student is capable of fully wrapping their head around and understanding the Bitcoin protocol. A programmer is capable of writing a client as a hobby project.
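To make the "hobby project" point concrete, here is a minimal Python sketch of the structure described above: blocks linked by hash, with validity checked by leading zero bytes. It is a toy under stated assumptions, not Bitcoin's actual rules: real Bitcoin compares the hash against an adjustable target and also validates the transactions themselves. The names `Block`, `mine`, `is_valid_chain` and the one-byte difficulty are hypothetical choices for illustration.

```python
import hashlib
from dataclasses import dataclass

ZERO_BYTES = 1  # toy difficulty (assumption): hash must start with this many zero bytes


@dataclass(frozen=True)
class Block:
    prev_hash: bytes      # hash of the previous block (32 zero bytes for the first block)
    transactions: tuple   # opaque transaction strings; not validated in this sketch
    nonce: int            # value the miner varies to find a valid hash

    def block_hash(self) -> bytes:
        payload = (self.prev_hash
                   + repr(self.transactions).encode()
                   + self.nonce.to_bytes(8, "little"))
        # Double SHA-256, as Bitcoin uses for block hashes
        return hashlib.sha256(hashlib.sha256(payload).digest()).digest()


def has_valid_pow(block: Block) -> bool:
    """Proof of work check: the first few bytes of the hash are zeroes."""
    return block.block_hash()[:ZERO_BYTES] == bytes(ZERO_BYTES)


def mine(prev_hash: bytes, transactions) -> Block:
    """Try nonces until the block hash passes the proof of work check."""
    nonce = 0
    while True:
        block = Block(prev_hash, tuple(transactions), nonce)
        if has_valid_pow(block):
            return block
        nonce += 1


def is_valid_chain(chain) -> bool:
    """Each block must point at the previous block's hash and carry valid proof of work."""
    prev = bytes(32)
    for block in chain:
        if block.prev_hash != prev or not has_valid_pow(block):
            return False
        prev = block.block_hash()
    return True


genesis = mine(bytes(32), ["coinbase -> alice: 50"])
second = mine(genesis.block_hash(), ["alice -> bob: 10"])
chain = [genesis, second]
forged_genesis = mine(bytes(32), ["coinbase -> mallory: 50"])
```

Swapping in `forged_genesis` breaks the hash link from `second`, so `is_valid_chain` rejects the altered chain even though each block individually has valid proof of work.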
Keeping the protocol simple brings a number of benefits that are key to Bitcoin or Ethereum being a credibly neutral and globally trusted base layer:
- It makes the protocol simpler to reason about, increasing the number of people who understand and can participate in protocol research, development and governance. It reduces the risk that the protocol gets dominated by a technocratic class that has a high barrier to entry.
- It greatly decreases the cost of creating new infrastructure that interfaces with the protocol (eg. new clients, new provers, new logging and other developer tools).
- It reduces long-term protocol maintenance costs.
- It reduces the risk of catastrophic bugs, both in the specification itself and in the implementation. It also makes it easier to verify that there are no such bugs.
- It reduces the social attack surface: there are fewer moving parts, and so fewer places to guard against special interests.
Historically, Ethereum has often not done this (sometimes because of my own decisions), and this has contributed to much of our excessive development expenditure, all kinds of security risk, and insularity of R&D culture, often in pursuit of benefits that have proven illusory. This post will describe how Ethereum 5 years from now can become close to as simple as Bitcoin.
Simplifying the consensus layer
Simulation of 3-slot finality in 3sf-mini
The new consensus layer effort (historically called the "beam chain") aims to use all of our learnings in consensus theory, ZK-SNARK development, staking economics and other fields over the last ten years to create a long-term optimal consensus layer for Ethereum. This consensus layer is well-positioned to be much simpler than the status quo beacon chain. Particularly:
- The 3-slot finality redesign removes the concept of separate slots and epochs, committee shuffling, and many other parts of the protocol specification that are related to efficiently handling these mechanisms (as well as other details, eg. sync committees). A basic implementation of 3-slot finality can be made in about 200 lines of code. Unlike Gasper, 3-slot finality also has near-optimal security properties.
- The reduced number of active validators at a time means that it becomes safer to use simpler implementations of the fork choice rule.
- STARK-based aggregation protocols mean that anyone can be an aggregator, and we do not have to worry about trusting aggregators, over-paying for repeated bitfields, etc. The complexity of the aggregation cryptography itself is significant, but it is at least highly encapsulated complexity, which has much lower systemic risk toward the protocol.
- The above two factors also likely enable a simpler and more robust p2p architecture.
- We have an opportunity to rethink how validator entry, exit, withdrawal, key transition, inactivity leak and other related mechanisms work, and simplify them - both in the sense of reducing line-of-code (LoC) count, and in the sense of creating more legible guarantees of eg. what the weak subjectivity period is.
The nice thing about the consensus layer is that it is relatively disconnected from EVM execution, which means that there is a relatively wide latitude to continue to make these types of improvements. The harder challenge is how to do the same on the execution layer.
Simplifying the execution layer
The EVM is increasingly growing in complexity, and much of that complexity has proven unnecessary (in many cases my own fault): a 256-bit virtual machine that over-optimized for highly specific forms of cryptography that are today becoming less and less relevant, and precompiles that over-optimized for single use cases that are barely being used.
Attempting to address these present-day realities piecemeal will not work. It took a huge amount of effort to (only partially!) remove the SELFDESTRUCT opcode, for a relatively small gain. The recent EOF debate shows the challenges of doing the same thing to the VM.
As an alternative, I recently proposed a more radical approach: instead of making medium-sized (but still disruptive) changes to the EVM for the sake of a 1.5x gain, perform a transition to a new and much better and simpler VM for the sake of a 100x gain. Like the Merge, we have fewer points of disruptive change, but we make each one much more meaningful. Specifically, I suggested we replace the EVM with either RISC-V or another VM that Ethereum ZK-provers will be written in. This gives us:
- A radical improvement in efficiency, because (within provers) smart contract execution will run directly, without the need for interpreter overhead. Data from Succinct shows a potential 100x+ performance improvement in many cases.
- A radical improvement in simplicity: the RISC-V spec is absurdly simple compared to the EVM. Alternatives (eg. Cairo) are similarly simple.
- All the benefits that motivated EOF (eg. code sections, more static analysis friendliness, larger code size limits).
- More options for developers: Solidity and Vyper can add backends to compile to new VMs. At the same time, if we choose RISC-V, then developers who write in more mainstream languages will be able to port their code over to the VM.
- Removal of the need for most precompiles, perhaps with the exception of highly-optimized elliptic curve operations (though those too will go away once quantum computers hit).
The main downside of this approach is that unlike EOF, which is ready today, with a new VM it will take a relatively longer amount of time for these benefits to reach developers. We can mitigate this by also adding some limited but high-value EVM improvements (eg. contract code size limit increase, DUP/SWAP17-32) that could be implemented in the short term.
This gives us a much simpler VM. The main challenge is: what do we do with the existing EVM?
A backwards compatibility strategy for VM transition
The biggest challenge with meaningfully simplifying (or even improving without complexifying) any part of the EVM is how to balance accomplishing the desired goals with preserving backwards compatibility for existing applications.
The first thing that is important to understand is: there isn't one single way to delineate what is the "Ethereum codebase" (even within a single client).
The goal is to minimize the green area: the logic that nodes have to run in order to participate in Ethereum consensus: computing the current state, proving, verifying, FOCIL, "vanilla" block building.
The orange area cannot be decreased: if an execution layer feature (whether a VM, a precompile, or another mechanism) is either removed from the protocol spec, or its functionality is altered, clients that care about processing historical blocks will have to keep it - but, importantly, new clients (or ZK-EVMs, or formal provers) can simply ignore the orange area entirely.
The new category is the yellow area: code that is very valuable for understanding and interpreting the chain today, or for optimal block building, but is not part of consensus. One example that exists today is Etherscan (and some block builders') support for ERC-4337 user operations. If we replace some large Ethereum feature (eg. EOAs, including their support for all kinds of old transaction types) with an onchain RISC-V implementation, then consensus code would be considerably simplified, but specialized nodes would likely continue using their exact same code to interpret them.
Importantly, the orange and yellow areas are encapsulated complexity: anyone looking to understand the protocol can skip them, implementations of Ethereum are free to skip them, and any bugs in those areas do not pose consensus risks. This means that code complexity in the orange and yellow areas has far fewer downsides than code complexity in the green area.
The idea of moving code from the green area to the yellow area is similar in spirit to how Apple ensures long-term backwards compatibility through translation layers like Rosetta.
I propose, inspired by recent writings from the Ipsilon team, the following process for a VM change (using EVM to RISC-V as an example, but it could also be used for eg. EVM to Cairo, or even RISC-V to something even better):
1. We require any new precompiles to be written with a canonical onchain RISC-V implementation. This gets the ecosystem warmed up and started working with RISC-V as a VM.
2. We introduce RISC-V as an option for developers to write contracts alongside the EVM. The protocol natively supports both RISC-V and EVM, and contracts written in one or the other can freely interact with each other.
3. We replace all precompiles, except elliptic curve operations and KECCAK (as these require truly optimal speed), with RISC-V implementations. That is, we do a hardfork that removes the precompile and simultaneously changes the code of that address (DAO-fork-style) from being empty to being a RISC-V implementation. The RISC-V VM is so simple that this is a net simplification even if we stop here.
4. We implement an EVM interpreter in RISC-V (this is happening anyway, because of ZK-provers) and push it onchain as a smart contract. Several years after the initial release, existing EVM contracts switch to being processed by being run through that interpreter.
Once step 4 is done, many "implementations of the EVM" will remain and be used for optimized block building, developer tooling and chain analysis purposes, but they will no longer need to be part of the critical consensus spec. Ethereum consensus would "natively" understand only RISC-V.
Simplifying by sharing protocol components
The third and most easily underrated way to reduce total protocol complexity is to share one standard across different parts of the stack as much as possible. There is typically very little or no benefit to using different protocols to do the same thing in different places, but such patterns appear anyway, largely because different parts of protocol roadmapping don't talk to each other. Here are a few specific examples of places where we can simplify Ethereum by ensuring that components are maximally shared across the stack.
One single shared erasure code
We need an erasure code in three places:
- Data availability sampling - clients verifying that a block has been published
- Faster P2P broadcasting - nodes being able to accept a block after receiving n/2 of n pieces, creating an optimal balance between latency reduction and redundancy
- Distributed history storage - each piece of Ethereum's history being stored in many chunks, such that (i) the chunks can be independently verified, and (ii) n/2 chunks in each group can recover the remaining n/2 chunks, greatly reducing the risk that any single chunk gets lost
If we use the same erasure code (whether Reed-Solomon, random linear codes, or otherwise) across the three use cases, we get some important advantages:
- Minimize total lines of code
- Increase efficiency because if individual nodes have to download individual pieces of a block (but not the whole block) for one of the use cases, that data can be used for the other use case
- Ensure verifiability: the chunks in all three contexts can be verified against the root
If different erasure codes are used, they should at least be compatible erasure codes: for example, a Reed-Solomon code horizontally and a random linear code vertically for DAS chunks, where the two codes operate over the same field.
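The "n/2 of n" recovery property mentioned above can be illustrated with a toy Reed-Solomon-style code. This is only a sketch under stated assumptions: it treats k data values as evaluations of a polynomial of degree below k over a small prime field, extends them to n = 2k evaluations, and recovers everything from any k surviving chunks by Lagrange interpolation. The modulus `P` and the function names are hypothetical, and real deployments use far larger fields, FFT-based algorithms, and commitments so that each chunk can be verified.

```python
P = 2**31 - 1  # small prime modulus, chosen only for illustration


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through `points`, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data):
    """Extend k field elements to n = 2k chunks; chunk i is the polynomial evaluated at x = i."""
    k = len(data)
    points = list(enumerate(data))
    return list(data) + [lagrange_eval(points, x) for x in range(k, 2 * k)]


def recover(known, k):
    """Rebuild all 2k chunks from any k known (index, value) pairs."""
    if len(known) < k:
        raise ValueError("need at least k chunks to recover")
    points = known[:k]
    return [lagrange_eval(points, x) for x in range(2 * k)]


data = [3, 1, 4, 1]                                   # k = 4 original values
chunks = encode(data)                                 # n = 8 chunks
survivors = [(i, chunks[i]) for i in (1, 4, 6, 7)]    # any 4 of the 8 chunks
restored = recover(survivors, k=4)
```

Because the same `recover` routine works regardless of which half of the chunks survives, one implementation could in principle serve sampling, broadcast, and history storage alike.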
One single shared serialization format
Ethereum's serialization format is today arguably only semi-enshrined, because the data can be re-serialized and broadcast in any format. The only exception is signature hashes for transactions, as there a canonical format is required for hashing. In the future, however, the degree of enshrinement of serialization formats will increase further, for two reasons:
- With full account abstraction (EIP-7701), the full transaction contents will be visible to the VM
- As gas limits go higher, the execution block data will need to be put into blobs
When this happens, we have an opportunity to harmonize serialization across the three layers of Ethereum that currently need it: (i) execution layer, (ii) consensus layer, (iii) smart contract calling ABI.
I propose we use SSZ. SSZ is:
- Easy to decode, including inside smart contracts (because of its 4-byte-based design and smaller number of edge cases)
- Already widely in use in the consensus layer
- Highly similar to the existing ABI, making tooling relatively easy to adapt
There are efforts to migrate more fully to SSZ already; we should keep these efforts in mind when planning future upgrades, and build on them.
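To show why decoding SSZ is simple, here is a minimal Python sketch of how an SSZ container with one variable-length field is laid out: fixed-size fields are written inline as little-endian integers, and the variable-length field is replaced in the fixed part by a 4-byte offset pointing at where its bytes begin. The container itself (`nonce`, `payload`, `gas_limit`) is a hypothetical example, not an actual Ethereum transaction type, and the sketch covers only serialization, not Merkleization.

```python
FIXED_LEN = 8 + 4 + 4  # uint64 nonce + 4-byte offset for payload + uint32 gas_limit


def encode_toy_tx(nonce: int, payload: bytes, gas_limit: int) -> bytes:
    """Serialize a hypothetical container {nonce: uint64, payload: bytes, gas_limit: uint32}."""
    fixed = (
        nonce.to_bytes(8, "little")
        + FIXED_LEN.to_bytes(4, "little")   # offset of payload from the start of the container
        + gas_limit.to_bytes(4, "little")
    )
    return fixed + payload                  # variable-length part follows the fixed part


def decode_toy_tx(data: bytes):
    """Inverse of encode_toy_tx; rejects encodings with a non-canonical offset."""
    if len(data) < FIXED_LEN:
        raise ValueError("encoding too short")
    nonce = int.from_bytes(data[0:8], "little")
    offset = int.from_bytes(data[8:12], "little")
    gas_limit = int.from_bytes(data[12:16], "little")
    if offset != FIXED_LEN:
        # With a single variable-length field, its data must start right after the fixed part
        raise ValueError("invalid offset")
    return nonce, data[offset:], gas_limit


encoded = encode_toy_tx(1, b"\xab\xcd", 21000)
decoded = decode_toy_tx(encoded)
```

The decoder needs only fixed-position slices and one offset check, which is the kind of logic that stays cheap to reproduce inside a smart contract.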
One single shared tree
Once we migrate from EVM to RISC-V (or an alternative minimal VM), the hexary Merkle Patricia tree will become by far the largest bottleneck to proving block execution, even in the average case. Migrating to a binary tree based on a much more optimal hash function will greatly improve prover efficiency, in addition to reducing data costs for light clients and other use cases.
When we do this, we should also use the same tree structure for the consensus layer. This ensures that all of Ethereum, consensus and execution alike, can be accessed and interpreted using the same code.
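As a rough illustration of how little code a binary tree needs, here is a minimal Python sketch of a binary Merkle root, a membership proof, and its verification. It is a sketch under stated assumptions rather than the proposed design: SHA-256 stands in for whichever hash function is eventually chosen, the leaf count is assumed to be a power of two, and all function names are hypothetical.

```python
import hashlib


def hash_pair(left: bytes, right: bytes) -> bytes:
    # SHA-256 is a stand-in here; the actual hash function is an open choice
    return hashlib.sha256(left + right).digest()


def next_layer(layer):
    """Hash adjacent pairs to produce the layer above."""
    return [hash_pair(layer[i], layer[i + 1]) for i in range(0, len(layer), 2)]


def merkle_root(leaves):
    n = len(leaves)
    assert n > 0 and (n & (n - 1)) == 0, "sketch assumes a power-of-two leaf count"
    layer = list(leaves)
    while len(layer) > 1:
        layer = next_layer(layer)
    return layer[0]


def merkle_proof(leaves, index):
    """Collect one sibling hash per level on the path from leaf `index` to the root."""
    layer, proof = list(leaves), []
    while len(layer) > 1:
        proof.append(layer[index ^ 1])  # sibling sits at the index with the lowest bit flipped
        layer = next_layer(layer)
        index //= 2
    return proof


def verify_proof(root, leaf, index, proof):
    """Recompute the root from a leaf and its sibling path."""
    node = leaf
    for sibling in proof:
        node = hash_pair(node, sibling) if index % 2 == 0 else hash_pair(sibling, node)
        index //= 2
    return node == root


leaves = [hashlib.sha256(bytes([i])).digest() for i in range(8)]
root = merkle_root(leaves)
proof = merkle_proof(leaves, 5)
```

A proof carries one sibling per level, so for 8 leaves it holds 3 hashes. The same three functions could serve both consensus-layer and execution-layer data if the tree structure is shared.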
From here to there
Simplicity is in many ways similar to decentralization. Both are upstream of a goal of resilience. Explicitly valuing simplicity requires some cultural change. The benefits are often illegible, and the cost of extra effort and turning away some shiny features is felt immediately. However, as time goes on, the benefits become more and more evident - and Bitcoin itself is an excellent example.
I propose that we follow the lead of tinygrad, and have an explicit max line-of-code target for the long-term Ethereum specification, with the goal of making Ethereum consensus-critical code close to as simple as Bitcoin. Code that has to do with processing Ethereum's historical rules will continue to exist, but it should stay outside of consensus-critical code paths. Alongside this, we should have a general ethos of choosing the simpler option where possible, favoring encapsulated complexity over systemic complexity, and making design choices that provide clearly legible properties and guarantees.