Ethereum's Compact Merkle Trie (Part I)

An exploration of what Clojure can offer the EVM.

This is the third post in a series on implementing the Ethereum Virtual Machine in Clojure. Relevant portions of the code are reproduced inline, with the full source available at nervous-systems/sputter on Github.

Introduction

The EVM persists transactions, receipts and account state in data structures best described as compact Merkle tries. My goal below is to describe and implement the underlying data structure as clearly and succinctly as possible.

While I originally intended to cover the implementation end-to-end in a single post, it became unwieldy. I’ve split the persistence/crypto into a follow-up article — Ethereum’s Compact Merkle Trie (Part II: Merkling). The below is focused exclusively on an in-memory, immutable trie with the correct compaction behaviour.

We’re primarily interested in the data structure itself, and won’t spend much time discussing the details of how it’s applied.

Terminology

Ethereum call their data structure a Modified Merkle-Patricia Tree. As modified is vague, and the term Patricia tree is commonly used in reference to at least two subtly different things, I’ll prefer compact Merkle trie.


Trie and tree are used interchangeably in this article.

What’s a Trie?

Tries are an intuitive means of encoding associative data in a tree structure. Unlike, say, a hash table, a trie structurally represents its keys, enabling the efficient retrieval of all suffixes, given a partial key.

On the right, you’ll hopefully see abstract depictions of two character-oriented tries containing word counts for the string the cat casts the cats. The compact trie (bottom) contains fewer nodes, as it decomposes keys only at points of difference.

Of course, the keys need not be textual — we could just as well store routing tables by branching at each bit of an IP address, or almost anything else. Even when keys are essentially textual, we may choose to decompose them at a level more granular than characters (bits, hex nibbles, etc.).

Ethereum’s Compact Trie

The specific data structure used by Ethereum decomposes keys into hex nibbles1 — half bytes — rather than characters:

(-> "hello"
    nibble/->nibbles ;; => [6 8 6 5 6 12 6 12 6 15]
    nibble/->bytes
    String.) ;; => "hello"

While the compact trie diagram in the previous section looked fairly intuitive, in practice, we’ll benefit from unambiguous rules about where compactions can occur, and how to navigate through them. Accordingly, Ethereum’s trie requires three distinct node types:

  • Branch — inserted at points where at least two keys diverge, each branch has a child slot for every possible nibble (0-15). Branches optionally contain a value, in the event a key terminates at the branch point.
  • Extension — holds a single, non-empty key (1 or more nibbles) and a single child. An extension can be thought of as a compaction of layers of redundant branch nodes.
  • Leaf — Similar to extensions, with the exception that leaves reference a value, rather than a child node, and may have empty keys.

Before jumping into the code, let’s look at a representation of a trie in which the above three node types are used to store the mappings hellohello and helphelp.

;; (->nibbles "hello") = [6 8 6 5 6 12 6 12 6 15]
;; (->nibbles "help")  = [6 8 6 5 6 12 7 0]

#Extension{:key   [6 8 6 5 6 12]
           :child #Branch{6 #Leaf{:key   [12 6 15]
                                  :value "hello"}
                          7 #Leaf{:key   [0]
                                  :value "help"}}}

The root is an extension node holding the common prefix of the two keys — [6 8 6 5 6 12] (or hel). Its child, a branch node, represents the divergence of the keys after the shared run — the first differing nibble being 6 for hello and 7 for help. Predictably enough, both leaves contain the unique suffix of their respective keys.

1 While implementations are free to decompose keys however they want, the canonical node represenation (from which cryptographic hashes are computed) includes the key’s nibbles — we’ll stick with nibbles for now, so as not to deviate from other implementations without good reason.

Node

Let’s look at the central abstraction we’ll build around:

(ns sputter.state.trie.node)

(defprotocol Node
  (terminates? [node k]
    "Does this node represent a successful termination of a search
     for the given key remainder?")
  (-value [node]
    "The value of this node, if any.")
  (attach [node k v]
    "Wrap the supplied value in the appropriate node type, and
     attach it to this node, at the given key.
     Return a new Node.")
  (reattach [node k child]
    "Replace the child node at `k` with `child`.
     Return a new Node.")
  (descend [node candidate-k]
    "Locate the immediate child pointed to by the given key.
     Return vector of `[child remaining-k]` or `nil`."))

(defn value
  "Return the value of the given node, or nil.
   If a key is supplied, return the value when the node matches."
  ([node] (-value node))
  ([node k]
   (when (terminates? node k)
     (-value node))))

We want an interface broad enough to accommodate implementations in which children may not be immediately available (e.g. they’re in some compact representation, or have to be fetched remotely) — to that end, there are no recursive operations on Node.

The protocol implies a traversal pattern in which we drop nibbles off the front of the search key at each successful descent — the remaining-k returned by descend (i.e. the candidate-k argument, stripped of its matching prefix) is intended as the input to the child’s descend call.

Searching

Given a root and some sequence of nibbles k, we can easily locate the associated value via the above protocol:

(loop [node root, k k]
  (if-let [[child rest-k] (node/descend node k)]
    (recur child rest-k)
    (node/value node k)))

We retrieve the value at the point we can no longer descend (e.g. we’re at a leaf, or an extension with a key that doesn’t match) — node/value, per the earlier definition, ensures we return the value only if the key remainder matches1.

1 If nil were a valid node value held by our terminal node, we’d be unable to differentiate between failed (k doesn’t match, in node/value) and successful (k matches, value is nil) searches without calling node/terminates? explicitly. In this instance, we don’t mind.

Updating/Inserting

To update the node at some key (or the closest node, in the absence of a perfect match), we define a simple depth-first recursive function, walk-path. In addition to repeatedly descending in a similar manner to the above loop, it uses reattach to replace each child on the key path with the result of some function.

(defn- walk-path* [node k outer inner]
  (if-let [[child rest-k] (node/descend node k)]
    (outer (node/reattach node k (inner child rest-k)) k)
    (outer node k #{:terminal?})))

(defn walk-path [f node k]
  (walk-path* node k f (partial walk-path f)))

Inserting a value v at key k beneath the root:

(walk-path
 (fn [node rest-k & [opts]]
   (cond-> node (:terminal? opts) (node/attach rest-k v)))
 root
 k)

The result would be a new root Node.

We call attach only on the closest node (or the exact match) when descending along the key we plan to insert. The alternative, which may seem more natural, is doing the tree-recursion inside Node/attach, and calling it on the root. Keeping it external buys us some generality, and may prove helpful once we’re lazily fetching nodes.

Trie

Before getting into the individual Node implementations, let’s look at Trie and a record which provides it:

(ns sputter.state.trie
 (:require [sputter.state.trie.node :as node]
           [sputter.state.trie.util :as util]
           [sputter.util.nibbles    :as nibbles]))

(defprotocol Trie
  (-insert [trie k v]
   "Return a new Trie containing the supplied mapping.")
  (-search [trie k]
   "Descend beneath the root, consuming `k`.
    Return a seq holding a `[node rest-k]` pair for each level."))

(defn- attach* [v node k & [opts]]
  (cond-> node (:terminal? opts) (node/attach k v)))

(defrecord KVTrie [root]
  Trie
  (-insert [this k v]
    (let [root (util/walk-path (partial attach* v) root k)]
      (assoc this :root root)))
  (-search [this k]
    (->> (iterate #(some->> % (apply node/descend)) [root k])
         (take-while identity))))

(defn insert [t k v]
  (-insert t (nibble/->nibbles k) v))

(defn search
  "Retrieve the value associated w/ the given key, or `nil`."
  [t k]
  (let [path (-search t (nibble/->nibbles k))]
    (apply node/value (last path))))

Trie/-search is defined as a lazy sequence of [node rest-k] pairs, as the intermediate results may be useful for later operations.

Node Implementations

Finally, let’s implement the three node types, back in sputter.state.trie.node. First, branches:

(declare ->Leaf ->Extension)

(defrecord Branch [branches value]
  Node
  (terminates? [this k] (empty? k))
  (-value      [this]   value)
  (attach [this [nibble & rest-k] v]
    (if nibble
      (assoc-in this [:branches (int nibble)] (->Leaf rest-k v))
      (assoc    this :value v)))
  (reattach [this [nibble] child]
    (assoc-in this [:branches (int nibble)] child))
  (descend [this [nibble & rest-k]]
    (when nibble
      [(branches (int nibble)) rest-k])))

(defn ->Branch
  ([]         (->Branch {}))
  ([branches] (->Branch branches nil))
  ([branches value]
   (Branch. branches value)))

Recall that each child of a branch is addressed by a single nibble — the branches field above is either a map of nibblechild, or a 16 element vector with children at their nibble’s offset.

As noted above, attach must be called on the closest node to the target key — it follows that if attach is invoked on a branch, either the key has been exhausted, or the slot into which we place the Leaf is empty — otherwise we’d have been able to descend further into the tree.

Opacity

Implementing the three node types as distinct data structures may seem a little excessive. After experimenting with different approaches — and reading through others — I think the above scheme (host-level polymorphic dispatch) is by far the easiest to read.


As long as we avoid making assumptions about the concrete implementation, it’d be trivial to implement Node in terms of some other data structure (e.g. for performance reasons) without affecting the rest of the system.

(defrecord Leaf [key value]
  Node
  (terminates? [_ candidate-k] (= candidate-k key))
  (-value      [_]             value)
  (attach [this input-k v]
    (if (= input-k key)
      (assoc this :value v)
      (let [[pre key-suf input-suf] (split-prefix key input-k)]
        (-> (->Branch)
            (attach input-suf v)
            (attach key-suf   value)
            (cond->> pre (->Extension pre))))))
  (descend [_ _] nil))

As we can’t descend beneath a Leaf, we’ve no need to implement reattach.

Attaching values, on the other hand, is more involved than for branches — the leaf requires replacement by a node to which children may be attached. Ignoring the trivial case (where the keys match perfectly), we extract any shared prefix, and strip it from the beginning of key and input-k. The two differing key suffixes are attached to an empty branch, conditionally wrapped in an extension if there was a common prefix.

split-prefix is a trivial sequence function where:

(split-prefix '(a a b) '(a c d)) => [(a) (a b) (c d)].

(defrecord Extension [key child]
  Node
  (terminates? [_ _] false)
  (-value      [_]   nil)
  (attach [this input-k v]
    (let [[pre key-suf input-suf] (split-prefix key input-k)
          me                      (if (< 1 (count key-suf))
                                    (assoc this :key (rest key-suf))
                                    child)]
      (-> (->Branch)
          (attach   input-suf v)
          (reattach key-suf   me)
          (cond->> pre (->Extension pre)))))
  (reattach [this _ child] (assoc this :child child))
  (descend [this candidate-k]
    (let [[prefix tail] (split-at (count key) candidate-k)]
      (when (= prefix key)
        [child tail])))

When attaching, it’s clear upfront that input-k and key don’t match exactly — if they did, the preceding descend would have moved through the extension node.

As with Leaf, the input value is attached to an empty branch node, to which we also reattach a node — either an abbreviated extension (assoc this :key (rest key-suf)) — or the extension’s child directly, if the key suffix is empty beyond the single nibble consumed by Branch/reattach.

For convenience, we also provide Node for nil:

(extend-type nil
  Node
  (terminates? [_ _]   false)
  (-value      [_]     nil)
  (attach      [_ k v] (->Leaf k v))
  (descend     [_ _]   nil))

Trying It Out

;; hello = [6 8 6 5 6 12 6 12 6 15]
;; help  = [6 8 6 5 6 12 7 0]

(-> (KVTrie. nil)
    (insert "hello" "hello")
;; =>
#KVTrie{:root #Leaf{:key   [6 8 6 5 6 12 6 12 6 15]
                    :value "hello"}}
    (insert "help" "help")
;; =>
#KVTrie{
  :root #Extension{:key   [6 8 6 5 6 12]
                   :child #Branch{
                     :branches {6 #Leaf{:key   [12 6 15]
                                        :value "hello"}
                                7 #Leaf{:key   [0]
                                        :value "help"}}
                     :value    nil}}}
     (search "help"))
;; => "help"

Great. But will it merkle?