21 Billion Useless Nodes — Why Ethereum Uses Patricia Compression

§1 — The Setup

Ethereum stores everything in tries

Every account balance, every smart contract variable, every nonce — all of it lives in a single data structure called the Modified Merkle Patricia Trie. Understanding why that structure was chosen requires understanding the problem it solves.

Ethereum needs a key-value store with three non-negotiable properties: cryptographic commitment (a single root hash proves the entire state), efficient proofs (verify any value without downloading everything), and deterministic structure (every node with the same state gets the same root hash, regardless of insertion order).

A trie (from "retrieval") gives you determinism for free: the structure is entirely determined by the keys, not by the order you inserted them. A Merkle overlay gives you cryptographic commitment and proofs. But a naive combination of the two — a Merkle hex trie — is catastrophically wasteful at Ethereum's scale.

To see why, we need to understand how keys work in Ethereum's state trie.

§2 — The Keyspace

64 nibbles into a 10⁷⁷ void

Every Ethereum address is 20 bytes. But the trie doesn't use the raw address as the key — it uses keccak256(address), a 32-byte hash. This is 64 hexadecimal nibbles, and the trie uses each nibble as one step in the path from root to leaf.

At each depth, a branch node has 16 possible children — one per hex digit (0–f). This makes the trie a 16-ary tree with a maximum depth of 64.

Why hash the address?

Two reasons. First, it prevents malicious actors from crafting addresses that create pathologically deep or unbalanced tries (a DoS vector). Second, it uniformly distributes keys across the trie, which is exactly the property we'll exploit to model the compression savings.

The total keyspace is 16⁶⁴ ≈ 10⁷⁷ possible paths. As of early 2026, Ethereum has roughly 383 million unique addresses. That means the trie is occupied at a density of approximately:

Keys

383M

unique addresses

Keyspace

10⁷⁷

possible paths

Occupancy

10^-69

keys / keyspace

This occupancy is almost incomprehensibly low. For perspective, there are roughly 10⁸⁰ atoms in the observable universe. The Ethereum state trie is sparser than if each atom were a slot and only one were filled.

This extreme sparsity is the entire story. Everything that follows is a consequence of putting 383 million keys into a 10⁷⁷-slot structure.

§3 — The Sparsity Cliff

Where the trie goes from dense to desolate

At every depth d, we can calculate exactly how many of the 16^d possible prefixes actually contain at least one key. This is a classical occupancy problem: throw 383M balls into 16^d bins.

The expected number of occupied prefixes at depth d is:

Occupancy Formula

E[occupied at depth d] = 16^d × (1 − e^{−N/16^d})
where N = 383,000,000

This creates a sharp phase transition. The crossover depth is log₁₆(383M) ≈ 7.13. Below this depth, almost every possible prefix is occupied. Above it, almost every key is alone.

Hover bars for details

Look at the cliff between depth 7 and depth 9. At depth 6, there are 16.8 million prefixes and all of them contain keys — 100% occupancy. At depth 8, there are 4.3 billion possible prefixes but only 366 million are occupied — 8.5%. By depth 10, occupancy drops to 0.03%.

Past depth 8, the occupied count flatlines at ~383M — one per key. Every additional depth level has exactly 383 million branch nodes, each with exactly one child. They're not branching. They're not storing useful structure. They're just passing through to the next level.

The core problem

In a plain hex trie, depths 9 through 63 contribute 55 × 383M ≈ 21 billion single-child branch nodes. Each one allocates 16 child slots, 15 of which are empty. This is pure structural waste — and it's 97.3% of all nodes in the trie.

§4 — The Hypothetical

What Ethereum would look like without Patricia compression

To appreciate what Patricia compression does, we need to see the trie without it. In a plain Merkle hex trie, every occupied prefix at every depth is a branch node — even if it has only one child.

Hover to explore depth-by-depth

Total Branch Nodes

21.65B

depths 0–63

Single-Child Nodes

21.50B

97.3% of total

Multi-Child Nodes

147M

actual branching points

Estimated Storage

~1.17 TB

state trie alone

Those 21.5 billion single-child nodes would each require a database entry — typically a 17-element RLP-encoded array with 16 child slots and 1 value slot. Even with most slots empty, the overhead is enormous: roughly 1.17 terabytes just for the state trie's structural nodes, before you store any actual account data.

Every state lookup — reading your account balance, checking a contract's storage — would require traversing 64 database lookups, one per depth. Every Merkle proof for a light client would need 64 levels of sibling hashes: approximately 30 KB per proof.

This is what Patricia compression eliminates.

§5 — The Mechanism

Two compressions, one goal

Ethereum's MPT has three node types. Only one — the branch node — exists in a plain hex trie. The other two are Patricia inventions: leaf nodes and extension nodes. Both eliminate single-child branch nodes, but they operate in different zones and at vastly different scales.

BRANCH NODE

17-element array:
[v₀, v₁, …, v₁₅, value]
16 child slots + 1 value

Exists where ≥2 keys diverge

LEAF NODE

2-element array:
[HP(remaining, true), value]
Key suffix + account data

Stores all remaining nibbles
when key is unique in subtree

EXTENSION NODE

2-element array:
[HP(shared, false), next]
Shared prefix + hash of next node

Compresses shared prefix
between branch points

The hex-prefix encoding (HP) is how Ethereum distinguishes leaf nodes from extension nodes — they share the same 2-element structure, but the first nibble of the encoded path includes a flag: 0x2/0x3 for leaves (terminating), 0x0/0x1 for extensions (non-terminating). In geth's source code, both are implemented as the same shortNode struct — the distinction is whether Val is account data (leaf) or a hash of another node (extension).

Leaf compression: the dominant mechanism

When a key becomes the only key in its subtree — which happens for every key somewhere around depth 7–9 — the entire remaining tail of single-child branch nodes is unnecessary. A leaf node stores all the remaining nibbles in its encoded path and the account data in its value. One node replaces an entire chain.

PLAIN TRIE

PATRICIA

Consider a key that becomes unique at depth 8. In a plain trie, it still needs branch nodes at depths 8, 9, 10, … 63 — that's 56 single-child branch nodes just to reach the leaf. In a Patricia trie, a single leaf node stores those 56 nibbles in its encoded path. One node replaces 56.

For 383M keys that become unique at an average depth of ~8, this eliminates roughly:

Leaf Compression Savings

383M keys × ~56 remaining depths ≈ 21.0 billion nodes eliminated
Each key's tail of single-child branches → 1 leaf node

Extension compression: the minor mechanism

Extension nodes handle a different (and much rarer) case: when multiple keys share a common prefix between two branch points. In the transition zone around depths 7–8, you sometimes get short runs of single-child branch nodes between two places where keys actually diverge. An extension node compresses these shared-prefix runs.

But extension nodes are comparatively rare because the dense zone (depths 0–6) has branch nodes at nearly every level already, and in the sparse zone (depths 9+), keys are alone — so they become leaves, not extensions. Extension compression accounts for roughly ~2.3% of eliminated nodes.

The critique addressed

The objection "leaf node compression ≠ branch node compression" is taxonomically correct — leaf nodes and extension nodes are different Patricia constructs. But it's analytically irrelevant. Both mechanisms eliminate the same thing: chains of single-child branch nodes that would exist in a plain hex trie. The 21-billion figure counts those eliminated branch nodes, regardless of whether they were replaced by a leaf or an extension. And the answer is: ~97.7% were replaced by leaf suffixes.

§6 — The Formal Proof

What the Yellow Paper actually specifies

The Ethereum Yellow Paper (Appendix D) defines the trie via a recursive structural composition function c(J, i), where J is a set of key-value pairs and i is the current nibble position. There are three cases:

Yellow Paper — Appendix D: Modified Merkle Patricia Tree

Case 1: J = ∅
→ produces () — empty node, no keys remain

Case 2: |J| = 1
→ produces RLP( ( HP(I₀[i..], true), I₁ ) )
A leaf node. The remaining nibbles from position i onward are hex-prefix encoded with the terminator flag TRUE. The value is the account data.

Case 3: |J| > 1
Check for common prefix of length > 0:
  If prefix exists: → produces RLP( ( HP(prefix, false), c(J, i+len) ) )
  An extension node. The shared prefix is encoded with terminator FALSE. Points to the next node.

  If no prefix: → produces branch node [c(J₀,i+1), …, c(J₁₅,i+1), v]
  A branch node. Keys are partitioned by their next nibble into up to 16 subsets.

The critical insight is in Case 2. When exactly one key-value pair remains at depth i, the function unconditionally produces a leaf node — not an extension node. The remaining nibbles I₀[i..] (everything from position i to the end) become the leaf's encoded path.

This means: once a key is alone in its subtree, its entire tail — however many nibbles remain — collapses into a single leaf. In a 64-nibble-deep trie where keys become unique around depth 8, that's ~56 nibbles stored in one node instead of 56 separate branch nodes.

Extension nodes (Case 3, prefix branch) can only appear when |J| > 1 — when multiple keys share a prefix. This is why they're rare in the sparse zone: by definition, if you're in the sparse zone, you have one key per subtree, and Case 2 (leaf) fires immediately.

Implementation confirmation

In geth's trie/trie.go, the insertion function creates a shortNode{key, valueNode(value)} when it encounters an empty (nil) slot — always a leaf. The same shortNode struct is used for extensions, but with a different Val type: a hashNode or nested trie node instead of a valueNode. The hex-prefix encoding in trie/encoding.go handles the terminator flag that distinguishes them at the encoding level.

§7 — The Full Accounting

Every node, every depth, both structures

Let's trace through the complete calculation depth by depth. The table below shows every zone of the trie, what the plain hex trie contains, and what Patricia compression reduces it to.

Depth	16^d possible	Occupied	Avg children	Multi-child?	Patricia node type

The accounting by zone

Zone	Depths	Plain branch nodes	Patricia branch	Single-child eliminated	Replaced by
Dense	0–6	~17.9M	~17.9M	~0	—
Transition	7–8	~570M	~128M	~443M	Mix of extensions + leaves
Sparse	9–63	~21.07B	~1.1M	~21.06B	Leaf suffixes (97.7%)
Total	0–63	~21.65B	~147M	~21.50B

Adding leaf nodes (383M) and extension nodes (~18M) to the 147M Patricia branch nodes gives a total of roughly 548 million nodes — a 39.5× reduction from the plain trie's 21.65 billion + 383M leaves.

§8 — The Comparison

Plain trie vs. Patricia trie, head to head

Toggle between the two structures to see the impact on every metric that matters: node count, storage, proof size, and database I/O.

PLAIN TRIE

PATRICIA TRIE

Total Nodes

22.0B

branch + leaf

Branch Nodes

21.65B

97.3% single-child

Est. Storage

~1.17 TB

node data only

Proof Size

~30 KB

64 levels × 15 siblings

DB Lookups

per state access

The proof size reduction — from ~30 KB to ~4.7 KB — is particularly consequential. Light clients verify state by requesting Merkle proofs from full nodes. At 30 KB per proof, verifying even a few hundred accounts would be impractical over constrained connections. At 4.7 KB, light client viability becomes real.

The I/O reduction — from 64 to ~9 random disk reads — directly impacts transaction throughput. Every SLOAD and SSTORE EVM opcode must traverse the trie path. Reducing that path length by 7× is a 7× reduction in the I/O bottleneck that makes state access Ethereum's primary performance constraint.

§9 — The Limits

What Patricia compression can't fix

Patricia compression is enormously effective at eliminating structural waste, but the Merkle Patricia Trie still has fundamental problems that Ethereum is actively trying to solve.

State growth is unbounded. Every new account and every new storage slot adds nodes to the trie that persist indefinitely. There's no garbage collection — even abandoned contracts keep their storage trie alive because any node could be referenced by a historical state root.

The trie is hash-linked. Each node is stored in the database by its keccak256 hash. This means every traversal is a random-access disk read — the worst case for modern storage hardware. SSDs can handle hundreds of thousands of random reads per second, but with ~9 lookups per state access and millions of state accesses per block, this is still a major bottleneck.

Merkle proofs are still too large for statelessness. Even at ~4.7 KB per account proof, proving an entire block's state accesses requires megabytes of witness data. This is why Ethereum is planning to migrate to Verkle tries — which use polynomial commitments (KZG) instead of hash-based Merkle proofs, reducing proof sizes by 5–10× further.

The hexary regret

Ethereum's design documentation notes that using 16-ary branching was "suboptimal" — a binary trie with batch storage could achieve similar lookup performance with smaller proofs. But changing the trie structure is enormously difficult once the chain is live, so this is being addressed through the Verkle migration rather than a hex-to-binary change.

Patricia compression was the right answer for 2014. It reduced a 1.17 TB structural nightmare to an 80 GB tractable problem. But the trie is now being asked to do things — stateless validation, real-time sync, light client proofs at scale — that push beyond what any hash-linked tree structure can deliver. The next chapter is Verkle.

§10 — The Bottom Line

21.5 billion nodes, eliminated by leaf suffixes

The claim stands. Ethereum's state trie, without Patricia compression, would contain approximately 21.65 billion branch nodes, of which 21.50 billion are single-child pass-throughs that store no useful branching information.

Patricia compression eliminates them through two mechanisms. Leaf suffix storage accounts for ~97.7% of the savings — when a key is alone in its subtree, a single leaf node stores all remaining nibbles, replacing the entire tail of single-child branches. Extension nodes handle the remaining ~2.3%, compressing shared prefixes between branch points in the narrow transition zone.

The critique that "leaf node compression ≠ branch node compression" correctly identifies that these are different mechanisms, but incorrectly implies the model conflates them. The model counts eliminated branch nodes — and both leaf suffixes and extension paths eliminate branch nodes. The question is only which Patricia construct absorbs them. The answer is overwhelmingly leaf nodes.

The formal spec confirms it. The Yellow Paper's c(J, i) function produces a leaf when |J| = 1 and an extension only when |J| > 1 with a shared prefix. In the sparse zone (depths 9+), every subtree has exactly one key, so |J| = 1, and Case 2 fires unconditionally. Leaf nodes. Every time.

39.5×

FEWER NODES WITH PATRICIA COMPRESSION

22.0B → 548M · 1.17 TB → ~80 GB · 30 KB proofs → 4.7 KB · 64 lookups → 9