An interactive investigation into how Patricia compression
eliminates 97% of state trie nodes — and why leaf nodes,
not extension nodes, do almost all the work.
Every account balance, every smart contract variable, every nonce — all of it lives in a single data structure called the Modified Merkle Patricia Trie. Understanding why that structure was chosen requires understanding the problem it solves.
Ethereum needs a key-value store with three non-negotiable properties: cryptographic commitment (a single root hash proves the entire state), efficient proofs (verify any value without downloading everything), and deterministic structure (every node with the same state gets the same root hash, regardless of insertion order).
A trie (from "retrieval") gives you determinism for free: the structure is entirely determined by the keys, not by the order you inserted them. A Merkle overlay gives you cryptographic commitment and proofs. But a naive combination of the two — a Merkle hex trie — is catastrophically wasteful at Ethereum's scale.
To see why, we need to understand how keys work in Ethereum's state trie.
Every Ethereum address is 20 bytes. But the trie doesn't use the raw address as the key — it uses keccak256(address), a 32-byte hash. This is 64 hexadecimal nibbles, and the trie uses each nibble as one step in the path from root to leaf.
At each depth, a branch node has 16 possible children — one per hex digit (0–f). This makes the trie a 16-ary tree with a maximum depth of 64.
The total keyspace is 1664 ≈ 1077 possible paths. As of early 2026, Ethereum has roughly 383 million unique addresses. That means the trie is occupied at a density of approximately:
This occupancy is almost incomprehensibly low. For perspective, there are roughly 1080 atoms in the observable universe. The Ethereum state trie is sparser than if each atom were a slot and only one were filled.
This extreme sparsity is the entire story. Everything that follows is a consequence of putting 383 million keys into a 1077-slot structure.
At every depth d, we can calculate exactly how many of the 16d possible prefixes actually contain at least one key. This is a classical occupancy problem: throw 383M balls into 16d bins.
The expected number of occupied prefixes at depth d is:
This creates a sharp phase transition. The crossover depth is log16(383M) ≈ 7.13. Below this depth, almost every possible prefix is occupied. Above it, almost every key is alone.
Look at the cliff between depth 7 and depth 9. At depth 6, there are 16.8 million prefixes and all of them contain keys — 100% occupancy. At depth 8, there are 4.3 billion possible prefixes but only 366 million are occupied — 8.5%. By depth 10, occupancy drops to 0.03%.
Past depth 8, the occupied count flatlines at ~383M — one per key. Every additional depth level has exactly 383 million branch nodes, each with exactly one child. They're not branching. They're not storing useful structure. They're just passing through to the next level.
To appreciate what Patricia compression does, we need to see the trie without it. In a plain Merkle hex trie, every occupied prefix at every depth is a branch node — even if it has only one child.
Those 21.5 billion single-child nodes would each require a database entry — typically a 17-element RLP-encoded array with 16 child slots and 1 value slot. Even with most slots empty, the overhead is enormous: roughly 1.17 terabytes just for the state trie's structural nodes, before you store any actual account data.
Every state lookup — reading your account balance, checking a contract's storage — would require traversing 64 database lookups, one per depth. Every Merkle proof for a light client would need 64 levels of sibling hashes: approximately 30 KB per proof.
This is what Patricia compression eliminates.
Ethereum's MPT has three node types. Only one — the branch node — exists in a plain hex trie. The other two are Patricia inventions: leaf nodes and extension nodes. Both eliminate single-child branch nodes, but they operate in different zones and at vastly different scales.
The hex-prefix encoding (HP) is how Ethereum distinguishes leaf nodes from extension nodes — they share the same 2-element structure, but the first nibble of the encoded path includes a flag: 0x2/0x3 for leaves (terminating), 0x0/0x1 for extensions (non-terminating). In geth's source code, both are implemented as the same shortNode struct — the distinction is whether Val is account data (leaf) or a hash of another node (extension).
When a key becomes the only key in its subtree — which happens for every key somewhere around depth 7–9 — the entire remaining tail of single-child branch nodes is unnecessary. A leaf node stores all the remaining nibbles in its encoded path and the account data in its value. One node replaces an entire chain.
Consider a key that becomes unique at depth 8. In a plain trie, it still needs branch nodes at depths 8, 9, 10, … 63 — that's 56 single-child branch nodes just to reach the leaf. In a Patricia trie, a single leaf node stores those 56 nibbles in its encoded path. One node replaces 56.
For 383M keys that become unique at an average depth of ~8, this eliminates roughly:
Extension nodes handle a different (and much rarer) case: when multiple keys share a common prefix between two branch points. In the transition zone around depths 7–8, you sometimes get short runs of single-child branch nodes between two places where keys actually diverge. An extension node compresses these shared-prefix runs.
But extension nodes are comparatively rare because the dense zone (depths 0–6) has branch nodes at nearly every level already, and in the sparse zone (depths 9+), keys are alone — so they become leaves, not extensions. Extension compression accounts for roughly ~2.3% of eliminated nodes.
The Ethereum Yellow Paper (Appendix D) defines the trie via a recursive structural composition function c(J, i), where J is a set of key-value pairs and i is the current nibble position. There are three cases:
The critical insight is in Case 2. When exactly one key-value pair remains at depth i, the function unconditionally produces a leaf node — not an extension node. The remaining nibbles I₀[i..] (everything from position i to the end) become the leaf's encoded path.
This means: once a key is alone in its subtree, its entire tail — however many nibbles remain — collapses into a single leaf. In a 64-nibble-deep trie where keys become unique around depth 8, that's ~56 nibbles stored in one node instead of 56 separate branch nodes.
Extension nodes (Case 3, prefix branch) can only appear when |J| > 1 — when multiple keys share a prefix. This is why they're rare in the sparse zone: by definition, if you're in the sparse zone, you have one key per subtree, and Case 2 (leaf) fires immediately.
trie/trie.go, the insertion function creates a shortNode{key, valueNode(value)} when it encounters an empty (nil) slot — always a leaf. The same shortNode struct is used for extensions, but with a different Val type: a hashNode or nested trie node instead of a valueNode. The hex-prefix encoding in trie/encoding.go handles the terminator flag that distinguishes them at the encoding level.
Let's trace through the complete calculation depth by depth. The table below shows every zone of the trie, what the plain hex trie contains, and what Patricia compression reduces it to.
| Depth | 16d possible | Occupied | Avg children | Multi-child? | Patricia node type |
|---|
| Zone | Depths | Plain branch nodes | Patricia branch | Single-child eliminated | Replaced by |
|---|---|---|---|---|---|
| Dense | 0–6 | ~17.9M | ~17.9M | ~0 | — |
| Transition | 7–8 | ~570M | ~128M | ~443M | Mix of extensions + leaves |
| Sparse | 9–63 | ~21.07B | ~1.1M | ~21.06B | Leaf suffixes (97.7%) |
| Total | 0–63 | ~21.65B | ~147M | ~21.50B |
Adding leaf nodes (383M) and extension nodes (~18M) to the 147M Patricia branch nodes gives a total of roughly 548 million nodes — a 39.5× reduction from the plain trie's 21.65 billion + 383M leaves.
Toggle between the two structures to see the impact on every metric that matters: node count, storage, proof size, and database I/O.
The proof size reduction — from ~30 KB to ~4.7 KB — is particularly consequential. Light clients verify state by requesting Merkle proofs from full nodes. At 30 KB per proof, verifying even a few hundred accounts would be impractical over constrained connections. At 4.7 KB, light client viability becomes real.
The I/O reduction — from 64 to ~9 random disk reads — directly impacts transaction throughput. Every SLOAD and SSTORE EVM opcode must traverse the trie path. Reducing that path length by 7× is a 7× reduction in the I/O bottleneck that makes state access Ethereum's primary performance constraint.
Patricia compression is enormously effective at eliminating structural waste, but the Merkle Patricia Trie still has fundamental problems that Ethereum is actively trying to solve.
State growth is unbounded. Every new account and every new storage slot adds nodes to the trie that persist indefinitely. There's no garbage collection — even abandoned contracts keep their storage trie alive because any node could be referenced by a historical state root.
The trie is hash-linked. Each node is stored in the database by its keccak256 hash. This means every traversal is a random-access disk read — the worst case for modern storage hardware. SSDs can handle hundreds of thousands of random reads per second, but with ~9 lookups per state access and millions of state accesses per block, this is still a major bottleneck.
Merkle proofs are still too large for statelessness. Even at ~4.7 KB per account proof, proving an entire block's state accesses requires megabytes of witness data. This is why Ethereum is planning to migrate to Verkle tries — which use polynomial commitments (KZG) instead of hash-based Merkle proofs, reducing proof sizes by 5–10× further.
Patricia compression was the right answer for 2014. It reduced a 1.17 TB structural nightmare to an 80 GB tractable problem. But the trie is now being asked to do things — stateless validation, real-time sync, light client proofs at scale — that push beyond what any hash-linked tree structure can deliver. The next chapter is Verkle.
The claim stands. Ethereum's state trie, without Patricia compression, would contain approximately 21.65 billion branch nodes, of which 21.50 billion are single-child pass-throughs that store no useful branching information.
Patricia compression eliminates them through two mechanisms. Leaf suffix storage accounts for ~97.7% of the savings — when a key is alone in its subtree, a single leaf node stores all remaining nibbles, replacing the entire tail of single-child branches. Extension nodes handle the remaining ~2.3%, compressing shared prefixes between branch points in the narrow transition zone.
The critique that "leaf node compression ≠ branch node compression" correctly identifies that these are different mechanisms, but incorrectly implies the model conflates them. The model counts eliminated branch nodes — and both leaf suffixes and extension paths eliminate branch nodes. The question is only which Patricia construct absorbs them. The answer is overwhelmingly leaf nodes.
The formal spec confirms it. The Yellow Paper's c(J, i) function produces a leaf when |J| = 1 and an extension only when |J| > 1 with a shared prefix. In the sparse zone (depths 9+), every subtree has exactly one key, so |J| = 1, and Case 2 fires unconditionally. Leaf nodes. Every time.