Ethereum processes vast amounts of data every second. To keep the network efficient and accessible, this data must be organized in standardized and memory-optimized formats. To achieve this, Ethereum uses specialized data structures that allow anyone to run a node, even on modest consumer-grade hardware.

This blog explores the core data structures that power Ethereum’s execution and consensus layers: Patricia Merkle Tries, which provide cryptographically secure storage of key-value pairs, Recursive Length Prefix (RLP), a compact and consistent data serialization method, and Simple Serialize (SSZ), the consensus layer’s preferred serialization format designed to work seamlessly with merklelization.

Understanding these foundations is key to seeing how Ethereum achieves security, scalability, and decentralization.

In this blog, we will focus specifically on Merkle Patricia Tries.

Merkle Patricia Trie

Ethereum works like a state machine. Every transaction updates the global state, which contains account balances, contract storage, and code. This state is stored as key–value pairs, and Ethereum needs a data structure that is both efficient and secure. For this, it uses the Modified Merkle Patricia Trie (MPT).

The MPT combines two ideas: tries and Merkle trees. A trie is a tree-like structure for looking up keys by following a path of characters (or, in Ethereum’s case, hexadecimal nibbles). A Merkle tree is a hash-based structure that allows anyone to verify data without needing the whole dataset.

By combining them, Ethereum gets a database that is efficient for lookups and updates, while also cryptographically secure against tampering.

The key strength of the MPT is that it produces a single root hash that represents the entire state of Ethereum. If even one balance changes, this root hash changes. This means two nodes can quickly agree on whether they have the same state by just comparing root hashes. It also enables Merkle proofs, where someone can prove a piece of data exists in the state by only providing the path and hashes, rather than the whole database.

Radix Tries (The Foundation)

The foundation of MPT is a radix trie. In a radix trie, each key is split into symbols (like hex digits), and we walk down the tree one symbol at a time. Each node can have multiple children, and the value is stored at the end of the path.

For example, suppose the key "dog" is stored. First, we convert it into hex (64 6f 67). Then we start at the root and follow the path nibble by nibble: 6 → 4 → 6 → 15 → 6 → 7. At the end of this path, we find the value associated with "dog".

Updating works similarly: you walk down the path, copy the nodes that change, update the value, and then recompute the hashes on the way back up. Deleting means following the path, removing the value, and cleaning up any nodes that become unnecessary.

While this works, it can be very inefficient. Ethereum addresses are 32 bytes, which is 64 hex characters. A plain radix trie would require 64 steps just to look up one key, with many empty branches along the way. This is where the Patricia optimization comes in.

Adding Hashing: Merkle Radix Tries

Ethereum doesn’t just store values in the trie—it also hashes them. Every node is stored in the database under a key equal to the hash of its contents, specifically keccak256(rlp(value)). This makes the structure content-addressable: the same node always has the same hash, and if the content changes, the hash changes.

This hashing turns the trie into a Merkle Radix Tree, where the root hash acts as a fingerprint of the entire structure. If you know the root, you can verify that any piece of data belongs in the trie by walking down the path and checking the hashes. If someone tries to fake data, the hashes won’t line up.

Think of it like a library catalog: the root hash is like a digital seal on the entire catalog. If even one entry is wrong, the seal won’t match.

The Patricia Optimization

To fix the inefficiency of radix tries, Ethereum adds Patricia optimizations. Instead of storing one nibble per node, paths are compressed. There are three types of nodes in MPT:

  1. Branch nodes, which have 16 child slots (one for each hex nibble) plus a value slot.
  2. Leaf nodes, which store the end of a path and the value.
  3. Extension nodes, which store a sequence of nibbles leading to the next node.

This way, instead of having 64 steps for a 64-nibble path, we might only have a handful of steps, since long single-child paths are collapsed into extension nodes.

Compact Encoding of Paths

When Ethereum stores paths in the Merkle Patricia Trie, it does not save them nibble by nibble (too large). Instead, it uses a compact encoding scheme that packs extra information into the first nibble.

This first nibble tells us two things:

  1. Node type: Is it a leaf node or an extension node?

  2. Path length: Is the remaining path odd or even in length?

    The rules are:

Hex

Bits

Node type

Path length

0

0000

Extension

Even

1

0001

Extension

Odd

2

0010

Leaf (end)

Even

3

0011

Leaf (end)

Odd

For even remaining path length (0 or 2), another 0 "padding" nibble will always follow.

    def compact_encode(hexarray):
        term = 1 if hexarray[-1] == 16 else 0
        if term:
            hexarray = hexarray[:-1]
        oddlen = len(hexarray) % 2
        flags = 2 * term + oddlen
        if oddlen:
            hexarray = [flags] + hexarray
        else:
            hexarray = [flags] + [0] + hexarray
        # hexarray now has an even length whose first nibble is the flags.
        o = ""
        for i in range(0, len(hexarray), 2):
            o += chr(16 * hexarray[i] + hexarray[i + 1])
        return o

Example:

  1. Input:
> hexarray = [0xf, 0x1, 0xc, 0xb, 0x8, 0x10] or [f, 1, c, b, 8, 10]
  '3f 1c b8'

Notice the last nibble is 0x10(decimal 16).
- That’s theterminator marker (leaf) in Ethereum’s compact encoding.

Walk through code:

term = 1 if hexarray[-1] == 16 else 0

Last element = 16 → so term = 1.
if term:
    hexarray = hexarray[:-1]
    
Strip the terminator →
hexarray = [f, 1, c, b, 8] (5 nibbles).
oddlen = len(hexarray) % 2

Length = 5 → oddlen = 1
flags = 2 * term + oddlen

2 * 1 + 1 = 3 → so flags = 3.
if oddlen:
    hexarray = [flags] + hexarray

Since oddlen = 1, prepend 3 →
hexarray = [3, f, 1, c, b, 8].

Now, Pack Into Bytes

The loop does:
o += chr(16 * hexarray[i] + hexarray[i + 1])

Pairs:

(3, f) → 0x3f
(1, c) → 0x1c
(b, 8) → 0xb8
Output: 
'3f 1c b8'

  1. Input:

    > hexarray = [0x1, 0x2, 0x3, 0x4, 0x5, ...] or [1, 2, 3, 4, 5, ...]
      '11 23 45'
    

Walk through code:

term = 1 if hexarray[-1] == 16 else 0

Last element = 5 → so term = 0.
if term:
    hexarray = hexarray[:-1]
    
will not execute as term = 0
oddlen = len(hexarray) % 2

Length = 5 → oddlen = 1
flags = 2 * term + oddlen

2 * 0 + 1 = 1 → so flags = 1.
if oddlen:
    hexarray = [flags] + hexarray

Since oddlen = 1, prepend 1 →
hexarray = [1, 1, 2, 3, 4, 5].

Now, Pack Into Bytes

The loop does:
o += chr(16 * hexarray[i] + hexarray[i + 1])

Pairs:

(1, 1) → 0x11
(2, 3) → 0x23
(4, 5) → 0x45
Output: 
'11 23 45'

Here is the extended code for getting a node in Merkle Patricia Trie.

    def get_helper(node_hash, path):
        if path == []:
            return node_hash
        if node_hash == "":
            return ""
        curnode = rlp.decode(node_hash if len(node_hash) < 32 else db.get(node_hash))
        if len(curnode) == 2:
            (k2, v2) = curnode
            k2 = compact_decode(k2)
            if k2 == path[: len(k2)]:
                return get(v2, path[len(k2) :])
            else:
                return ""
        elif len(curnode) == 17:
            return get_helper(curnode[path[0]], path[1:])

    def get(node_hash, path):
        path2 = []
        for i in range(len(path)):
            path2.push(int(ord(path[i]) / 16))
            path2.push(ord(path[i]) % 16)
        path2.push(16)
        return get_helper(node_hash, path2)

Example Trie

Let’s walk through an example of how a Merkle Patricia Trie (MPT) is built from key-value pairs.

We’ll use five accounts with these key-value pairs:

Account

Key (address/hash, shortened)

Value (simplified state)

1

0x616b6c64

0x01

2

0x616b6c65

0x02

3

0x616b6c78

0x03

4

0x616b6d31

0x04

5

0x31323334

0x05

In Ethereum, the keys are normally 32-byte Keccak-256 hashes of addresses. Here, we’re using shorter keys so the example is easy to follow.

Step 1: Insert the first key-value pair.

Start with an empty MPT, and add the first key-value pair: (0x616b6c64, 0x01). Since it is just one key, it results in a trie of only one leaf node. To create that node, proceed in the following way:

This should output this hex: c78520616b6c6401.

This should output:

4e2d0fbe6726eac15c5ecf49a4e1f947aa50e0531f4f3e98b8e1577ba52e1783

The resulting MPT should look like this:

[Key, Value] -> [“0x616b6c64”, “0x01”]
RLP Encoding -> 0xc78520616b6c6401
Hash of RLP Encoding -> 4e2d0fbe6726eac15c5ecf49a4e1f947aa50e0531f4f3e98b8e1577ba52e1783

Step 2: Insert the second key-value pair.

[“0x616b6c65”, “0x02”]. Since this key shares the first 7 digits with the previous one, we proceed in the following way:

Step 3: Insert the third key-value pair.

[“0x616b6c78”, “0x03”], Since this key shares the first 6 digits with the previous two keys, we proceed in the following way:

  1. Add the last two key-value pairs, continuing in this way, following the same steps as we did for the previous keys. When you're done, you should have the following MPT:

Verify

When a verifier gets the StateRoot and the proof path for a key, they must check if the proof is correct. Unlike a normal Merkle Tree, where the check starts at the leaf and goes up to the root, in a Merkle Patricia Trie, the process goes from the top down.

The verifier begins at the StateRoot and then moves step by step through the nodes in the proof path. At each step, they decode the node and make sure that the hash stored inside matches the next node in the path. This continues until the verifier reaches the final leaf node. If this leaf contains both the remaining part of the key and the correct value, then the proof is valid.

Let's say the verifier receives the proof of the above for the key-value (0x616b6d31, 0x04). Then, he has to follow these steps:

  1. Hash the first element of the path and check that it matches the given StateRoot. Indeed:

    keccak(bytes.fromhex(
        "f851808080a0a26b2ac124718443aeed68fa0309225d0c8dd9dbee45685909e92fb594e1a4638080a02ccd118c9470c051689543b233ab109ad74d2fb4f57eb429c4d43294d6ae686780808080808080808080"
    )).hex() ==
        "13ea549e268b5aa80e9752c6be0770cffba34d2b1aa1f858cb90f3f13ac3bed7"
    

  2. Decode the first RLP element of the path and verify that in the index, 6 has the hash of the second path element. Indeed:

    rlp.decode(bytes.fromhex(
        "f851808080a0a26b2ac124718443aeed68fa0309225d0c8dd9dbee45685909e92fb594e1a4638080a02ccd118c9470c051689543b233ab109ad74d2fb4f57eb429c4d43294d6ae686780808080808080808080"
    ))[6].hex() ==
        "2ccd118c9470c051689543b233ab109ad74d2fb4f57eb429c4d43294d6ae6867"  
    

  3. Decode the second RLP element of the path. You will see an array with two items. The first item is 0x1616b6. Because it starts with 1, this means it is an extension node. Now, check that the rest of the digits match the key we are trying to prove. Then, confirm that the second item in the array is the hash of the next element in the path.

  4. Decode the third element of the path. This will be a branch node. Look inside it, and make sure that at index d, it holds the hash of the following element in the path.

     Merkle Patricia Trie

  5. Decode the last element of the path. This will be an array with two items. The first item is 0x2031. Since it begins with 20, this means it is a leaf node. Verify that this first item contains the remaining key digits 31 and that the second item is the value 0x04.

Tries in Ethereum

In Ethereum’s execution layer, all the merkle tries are based on a data structure called the Merkle Patricia Trie (MPT). From every block header, you can actually see three important roots — each representing a different trie:

Let’s break them down one by one.

State Trie

There is one global state trie, and it gets updated every time a block is processed.

An Ethereum account is stored as an array of four items:

[ nonce, balance, storageRoot, codeHash ]

Here’s the interesting part: the storageRoot itself is the root of another trie — the Storage Trie.

Storage Trie

The storage trie is where all contract data lives. Each account (that is, a contract) has its own storage trie.

If you want to retrieve a contract’s stored value, you need:

For example, to fetch data from slot 0 of address 0xc0ffee254729296a45a3885639AC7E10F9d54979, you can use the JSON-RPC method eth_getStorageAt:

curl -X POST --data '{
  "jsonrpc":"2.0",
  "method":"eth_getStorageAt",
  "params": [
    "0xc0ffee254729296a45a3885639AC7E10F9d54979", 
    "0x0", 
    "latest"
  ],
  "id":1
}' localhost:8545

This might return something like:

{"jsonrpc":"2.0","id":1,"result":"0x00000000000000000000000000000000000000000000000000000000000004d2"}

For other slots, you can’t just pass the number directly. Instead, Ethereum calculates the slot’s position as:

keccak256( padded(address) + padded(slotIndex) )

For example, for slot 1, at address 0x391694e7e0b0cce554cb130d723a9d27458f9298, the key is:

keccak256("000...391694e7...f9298" + "000...0001")

In a Geth console, you can calculate it with:

var key = "000000000000000000000000391694e7e0b0cce554cb130d723a9d27458f9298" +
          "0000000000000000000000000000000000000000000000000000000000000001"
web3.sha3(key, { "encoding": "hex" })

This gives the hashed position, which you then pass into eth_getStorageAt to fetch the actual stored value.

If an account is not a contract, its storageRoot is empty.

Transactions Trie

Every block also has its own transactions trie. This stores all the transactions of that block as (key, value) pairs.

This allows Ethereum to support both old and new transaction formats

Receipts Trie

Finally, every block also has a receipts trie. Like the transactions trie, the path is also rlp(transactionIndex).

Receipts can be of two kinds:

  1. LegacyReceipt → stored as rlp([status, cumulativeGasUsed, logsBloom, logs]).

  2. Receipt (typed) → stored as TransactionType | ReceiptPayload.

Again, this supports both old and new transaction formats, thanks to EIP-2718.