Ethereum's Compact Merkle Trie (Part II)

An exploration of what Clojure can offer the EVM.

This is the fourth post in a series on implementing the Ethereum Virtual Machine in Clojure. Relevant portions of the code are reproduced inline, with the full source available at nervous-systems/sputter on Github.

Introduction

By the end of the previous post, we’d implemented a referentially transparent in-memory trie which correctly compacted nodes. The fundamental abstractions were the Trie and Node protocols.

Our implementation requires a couple of additional features, if we’re to call it a Merkle trie:

  • A means of reconstructing a tree with reference to the cryptographic hash of some root node (i.e. from a root hash).
  • Generation of Merkle proofs.

It’d also be a good time to add support for persisting tries in a key-value store. While neither of the above requirements specifically entails disk persistence, it may help us reason about how the structure is used in practice.

What’s a Merkle Tree?

A merkle tree is any tree in which nodes are addressed by cryptographic hash, with each node’s hash computed across the hashes of its descendents.

(The coloured labels are for visual clarity. The hash expressions beneath them — in terms of some hash function H — are better thought of as the node labels).

Imagine our access to the binary merkle tree above is being intermediated by some untrusted agent, of whom we request the value corresponding to path b.f (“val4”).

Assuming we know nothing trustworthy about the contents of the tree — only its root hash and the key or path we requested — the agent can convince us of the provenance of the value with a Merkle proof: a representation of the nodes on the path to the returned value. In the context of b.f from the above binary tree, that might look like:

[[Ha Hb] [He Hf] "val4"]

Where each inner vector is a node represented as [left-child-hash, right-child-hash]. To verify, we check that:

  • H(Ha + Hb) matches our trusted root hash
  • H(He + Hf) matches the hash at index 1 in the previous node serialization1 (Hb)
  • H("val4") matches Hf.

We’re now confident we either chose a terrible hash function, or that the value exists the claimed position in the tree.

1 Assume that the client understands how to determine the branch selected at each level, for a given key.

Uses

An intuitive use for Merkle trees can be found in file distribution networks. In the peer-to-peer case, a tracker — a trusted central authority — stores the root hashes of Merkle trees constructed from each file’s contents. Supplied with the trusted hash for some file, a client solicits blocks of content from untrusted peers, each accompanied by a Merkle proof of their position beneath the root1.

Ethereum relies on Merkle trees to ensure data consistency: block headers incorporate the root hashes of various tries used to store bookkeeping information. Further, full nodes can Merkle-prove any values retrieved at the behest of light clients, who possess only partial knowledge of the state. A client may request, e.g. the balance of account 0x180b92ec... and verify the accompanying proof against the relevant root hash from the block headers.

1 I believe Gnutella does something like this. While very different projects, IPFS & Docker also distribute files with Merkle trees.

Ethereum’s Merkle Trie

In order to hash nodes (per the Merkle tree explanation), we require a canonical node representation — Ethereum’s is vectors of byte strings, encoded in RLP (Recursive Length Prefix) — a simple type-length-value format. When encoding nodes containing children, the child’s identity is:

len(rlp(child)) < 32 ? rlp(child) : Keccak256(rlp(child))

— we use whichever is more compact, the child’s RLP or its hash.

For the purposes of this article, we don’t really care about the details of RLP, just that we’ve a reversible encoding:

(-> (->Leaf (->nibbles "k") "v")
    trie.rlp/->rlp            ;; => <-60, -126, 32, 107, 118>
    util.rlp/decode           ;; => [<-126, 32, 107> <118>]
    trie.rlp/vector->node)    ;; => #Leaf{:key ...}

Our strategy is similar to several other implementations: we’ll persist nodes in a key value store (pluggable, with support for LevelDB), using node hashes as keys, and RLP-encoded nodes as values1, 2. We may store several independent tries within a single key store — a trie is reconstituted with reference to both the disk location of the store and a particular root hash within it.

1 I think this is an awkward scheme. The persisted values are opaque to the K/V store, which — aside from ensuring inefficient/cumbersome descent — means it’s impossible to safely delete anything without implementing persistent reference counts on top of the K/V store. I think Ethereum’s use of inter-trie references (e.g. some hash stored as value in trie A is the root of trie B) further complicates things.
2 Why store the nodes as RLP? No particular reason — given we’re using an unstructured key store, we need some serialization for them. We need to generate RLP in order to calculate hashes, so we may as well use it as the disk format.

Implementation

How do we determine which nodes are to be retained in memory (i.e. as realized Node implementations, hanging off the trie’s root), and when do we write changes to disk?

One straightforward approach’d be to only retain dirty nodes (in addition to the root), and offer an explicit Trie/commit operation responsible for atomically flushing to disk any retained nodes — a downside being that we’ll hit the store whenever a Trie/-search involves clean nodes1. On the upside, we can implement it with very few changes to our in-memory trie.

Before writing any code, let’s explore what the tree’ll to look like at different stages, with occasional reference to this storage protocol:

(ns sputter.state.kv)

(defprotocol KeyValueStore
  (retrieve [store k]
    "Return the value associated with the given key.")
  (insert [store k v]
    "Return a new store, with an additional mapping.")
  (insert-batch [store k->v]
    "Return a new store, with additional mappings."))
1 We could easily slightly alter this in the future for better read performance — by using a separate read cache, or an explicity :dirty? flag for in-memory nodes.

Example

Node Recap

Keys are represented as sequences of hex nibbles. Branch nodes hold at most one child per nibble (0-15), with an optional value. Leaf and extension nodes are compactions of redundant layers of branch nodes. Leaves point to values, extensions point to child nodes.

Starting from a trie with a nil root, we insert two mappings, hellohello and helphelp. The entirety of the tree now resides in memory, with a root node like so:

#Extension{:key   [6 8 6 5 6 12] ;; "hel"
           :child #Branch{6 #Leaf{:key   [12 6 15]
                                  :value "hello"}
                          7 #Leaf{:key   [0]
                                  :value "help"}}}

;; (->nibbles "hello") = [6 8 6 5 6 12 6 12 6 15]
;; (->nibbles "help")  = [6 8 6 5 6 12 7 0]

On calling Trie/commit, we expect all of the in-memory nodes to be persisted through KeyValueStore, resulting in a shallow, memory-efficient representation of the tree that we can expand via the store as needed. The root node after the commit:

#Extension{:key   [6 8 6 5 6 12]
           :child <bytes -96, 13, 114...>}

Where <bytes -96, 13, 114...> is an RLP representation of the branch node from the previous snippet. In this case, the branch exceeds our threshold for inline representation — decoding the bytes yields the string d726c196...: a hash we can retrieve from the store.

(-> (kv/retrieve store "d726c196...")
    rlp/decode
    trie.rlp/vector->node)
;; =>
#Branch{:branches {6 <bytes -55, -126, 60...>
                   7 <bytes -58, 48, -124, 104, 101, 108, 112>}}

The branch’s children are similarly encoded. Decoding the child at position 7 — an inline node, rather than a hash — yields a vector, from which we can easily construct a node:

(-> (rlp/decode <bytes -58, 48, -124, 104, 101, 108, 112>)
    trie.rlp/vector->node)
;; =>
#Leaf{:key [0] :value "help"}

Trie/commit

At a high level, Trie/commit wants to:

  • Visit all realized Nodes depth-first, flattening them into RLP
  • Insert into the key store a hashrlp mapping for each visited node.
  • Return a “shallow” representation of the root node in which all children have been flattened.

We’re going to need a means of visiting all of a node’s children, without breaking the Node abstraction. In the previous instalment, we implemented the utility function walk-path in terms of Node/descend & Node/reattach — let’s go with a generic Node/walk function:

(ns sputter.state.trie.node ...)

(defprotocol Node
 ...
  (walk [node outer inner]
    "Apply `inner` to all children & `outer` to this.
     Return the resulting Node."))

(extend-protocol Node
  Branch
  (walk [this outer inner]
    (outer (update this :branches (partial util/map-vals inner))))
  Leaf
  (walk [this outer _]
    (outer this))
  Extension
  (walk [this outer inner]
    (outer (update this :child inner)))
  nil
  (walk [_ _ _] nil))

Using the above to recursively transform nodes beneath root (via some function f) would look like:

(defn transform [f root]
  (node/walk root f (partial transform f)))

We can use this pattern to implement sputter.state.trie.util/flatten, which’ll do most of the lifting for Trie/commit.

To avoid clutter, the RLP codec functionality isn’t reproduced:

(def ^:private hash-bytes 32)

(defn inline-node? [rlp]
  (< (count rlp) hash-bytes))

(defn- walk-realized [f node]
  (cond-> node
    (node/node? node) (node/walk f (partial walk-realized f))))

(defn- flatten-one [writes node]
  (let [rlp (trie.rlp/->rlp node)]
    (if (inline-node? rlp)
      rlp
      (let [hash-k (util/sha3-bytes rlp)]
        (swap! writes assoc hash-k rlp)
        (rlp/encode hash-k)))))

(defn flatten [root]
  (let [writes (atom {})]
    (node/walk
     root
     (fn finish [root]
       (let [rlp    (trie.rlp/->rlp root)
             hash-k (util/sha3-bytes rlp)]
         [(swap! writes assoc hash-k rlp) hash-k root]))
     (partial walk-realized (partial flatten-one writes)))))

flatten-one is called depth-first on all realized descendents of the root (i.e. every descendent which passes the node/node? check in walk-realized). Each node is replaced in its parent by flatten-one‘s return value — either an RLP-encoded hash (for which a corresponding entry exists in writes), or an RLP-encoded inline node.

finish, inside flatten, returns a vector containing the final value of writes, the hash of the root node, and new root node with flattened/unrealized children.

Given the above, Trie/commit itself is trivial:

(defrecord KVTrie [root store]
  Trie
  ...
  (commit [this]
    (let [[writes hash root] (util/flatten root)]
      (assoc this :root  root
                  :store (kv/insert-batch store writes)
                  :hash  (util/bytes->hex hash))

You’ll recall Trie/-search is responsible for returning a lazy sequence of [node remaining-key] vectors, one for each level descended while searching for the given key.

Our previous, in-memory implementation of Trie/-search was something like:

(->> (iterate #(some->> % (apply node/descend)) [root k])
     (take-while identity))

Now that it has to do a little more, let’s lift it into a utility which uses lazy-seq directly:

(defn- search-below [store node k]
  (lazy-seq
   (cons [node k]
         (when-let [[child rest-k] (node/descend node k)]
           (search-below store (realize child store) rest-k)))))

At each step, child is one of three things:

  • A realized Node.
  • An inline node, represented as an RLP-encoded byte array
  • The hash of a node in store, represented as an RLP-encoded byte array.

realize is a trivial function responsible for taking any of those inputs and returning a Node. Given search-below, Trie/-search just provides the root as the starting point:

(defrecord KVTrie [root store]
  Trie
  ...
  (-search [this k]
    (search-below store root k)))

And realize, if you’re curious:

(defn- hash->vector [hash store]
  (-> (kv/retrieve store hash)
      rlp/decode))

(defn realize [child store]
  (if (node/node? child)
    child
    (some-> (rlp/decode child)
            (cond-> (not (rlp/vector? child)) (hash->vector store))
            trie.rlp/vector->node)))

rlp/decode will return nil (and short-circuit the some->) if passed the RLP encoding for a null node.

Trie/-insert

Previously, insertion was defined in terms of a utility function walk-path, which visits nodes along some key:

(defn- walk-path* [node k outer inner]
  (if-let [[child rest-k] (node/descend node k)]
    (outer (node/reattach node k (inner child rest-k)) k)
    (outer node k #{:terminal?})))

(defn walk-path [f node k]
  (walk-path* node k f (partial walk-path f)))

We add a store parameter (KeyValueStore) and insert a realize call:

(defn walk-path [f store node k]
  (walk-path* (realize node store) k f (partial walk-path f store)))

And, finally, we modify the definition of Trie/-insert to pass the extra parameter:

(defn- attach* [v node k & [opts]]
  (cond-> node (:terminal? opts) (node/attach k v)))

(defrecord KVTrie [root store]
  Trie
  ...
  (-insert [this k v]
    (let [root (util/walk-path (partial attach* v) store root k)]
      (assoc this :root root))))

Bonus Round: Merkle Proofs

Ethereum’s representation of a proof is a vector of the RLP-encoded nodes visited when retrieving a given key — inline nodes are not included at the top-level, as they’ll be nested within their parent’s RLP.

As Merkle proofs may be used to demonstrate the absence of a key, we don’t care whether we terminated at an exact match, or the closest node.

prove (trie, key[rlp ...]) retrieves from a trie the realized nodes along the key’s path, and encodes each in RLP, returning the result as a vector:

(defn prove [trie k]
  (let [rlp (map (comp trie.rlp/->rlp first) (-search trie k))]
    (into [(first rlp)]
      (filter (complement util/inline-node?) (rest rlp)))))

Note that prove would fail in trie.rlp/->rlp if called on a trie with uncommitted writes, as it doesn’t recursively flatten the nodes received from its search. I think this is basically reasonable behaviour.

(defn verify [hash k [head & rest-proof]]
  (if-not (and head (Arrays/equals hash (sha3-bytes head)))
    ::invalid
    (let [node          (util/bytes->node head)
          [next-hash v] (descend-inline node k)]
      (if next-hash
        (recur next-hash v rest-proof)
        (if rest-proof
          ::invalid
          v)))))

verify takes a trusted root hash, the key for which we requested a proof, and a proof vector — returning either ::invalid or the value associated with the key, if any.

descend-inline takes a top-level node from the proof vector and passes through any inline children before returning the next node’s hash (& remaining key), or the final value:

(defn- descend-inline [node k]
  (if-let [[child rest-k] (node/descend node k)]
    (if (util/inline-node? child)
      (recur (util/bytes->node child) rest-k)
      [(rlp/decode child) rest-k])
    [nil (node/value node k)]))

You made it! The full source is available at nervous-systems/sputter on Github.