Bacoor Logo.png

Data structure in Ethereum. Episode 2: Radix trie and Merkle trie.

 

 

In episode 1 và 1+, we got to know some encoding/decoding algorithms which are used for constructing Ethererum data. So now we are going to move on to the organization of data in Ethereum. This time we will introduce you two kinds of trie – Radix trie and Merkle trie. Actually, Ethereum doesn’t use them the way they were originally, but mixes them then created a new trie that more optimized, named Patricia trie.

 

You can consider this episode as a step stone to understand about Patricia trie later.

 

Trie

Trie is a word or a terminology that represents digital tree in science computer. Sometime, we can see that ‘tree’ is used, it’s ok because of the same meaning.

 

In others word, trie is an ordered data structure that is used to store a dynamic set or associative array which is formed to key-value where the keys are usually strings.

 

 

 

We can get familiar with some terminators of trie by image above, further, set of root, internal node, leaf will be called node in common.

 

Dataset

We will use this data sample for all example.

 

https://gist.githubusercontent.com/sontuphan/9e8ee0c4966b18d9062edbfb1e246e47/raw/ac17d776907aa8bd3f0fd6c3fa2dbf5fb9f53b19/dataset.js

 

 

Radix trie

Radix trie is used to optimize for searching.

In radix trie, the key in dataset will be the path to reach the value.

 

 

Basic radix trie

 

As you can see, every path of trie represents a character is ASCII and it is used for searching value.

For example, we are looking for the value of key which is dodo. Just start from the root, try to look for the d path first and keep descenting whole the path. The final result is the red line and green node with value 4.

 

 

 

Radix trie

 

However, the branch of houseand houses key was degraded, too many internal node with null value. In order to reach value of houses , we have to decent down the path so many times. It causes wasted space.

So, we can be improved by combining the degraded path. Now, a path is not represented by a single character, but a string.

We get a improvement on radix trie as the image beside.

To reach houses node, we just need to descent twice and it seems to be good for searching.

 

Merkle trie

Merkle trie is used to authenticate data

In merkle trie, data of it will be used to create a deterministic cryptographic hash that help to authenticate data.

To get the details, we move to an example:

 

 

 

Merkle trie

 

Whole data will be stored at leafs. Value of parent of those leafs will be equal to Hash(valueOfChild1, valueOfChild2, …) .

 

 

Deterministic cryptographic hash

 

If we try to change value of 4th node to 44. So the parents on the path to root from 4th node will be totally change, H2 → H’2 | H5 → H’5 | Root → NewRoot .

 

Thus, if we hold value of root, we can verify the consistency of data by rebuilding the trie to get root and compare it with our root. Practically, it is impossible to fake data without changing value of root.

 

Conclusion 

By getting through these two kinds of trie, we had provided enough tools to understand the final matter: Patricia trie.

 

In the next episode, we will show you how Ethereum combines between Merkle trie and Radix trie, as well as some improvements to create Patricia trie with more optimized things and also keeping main attributes which are optimization of searching and deterministic cryptography.

 

References

Radix tree Wiki

Merkle tree Wiki

Ethereum/ Wiki

 

 

Share on Facebook
Share on Twitter
Please reload

Please reload

Copyright © 2019 Bacoor Inc. All rights reserved.