
Data structure in Ethereum. Episode 4: Practicing.

Image source: pxhere.com

 

In this episode, our aim is to help you see clearly how Ethereum data is organized, through practice. By working through an example, we hope you can get the hang of the data structures in Ethereum.

 

For convenience, you can also use the repository below (develop branch):

https://github.com/sontuphan/Trie/tree/develop

 

Some specs:

  1. Full node: Geth

  2. Network: Ropsten testnet

  3. Programming language: Javascript/NodeJS

  4. Subject of the study: state trie (stateRoot)

 

What we have to do

 

Geth

The link below explains how to set up geth as a full node on your computer.

https://github.com/ethereum/go-ethereum/wiki/geth

 

To start geth in full sync mode on the Ropsten testnet with RPC enabled, you can use this command:

geth --testnet --datadir "~/Library/Ethereum/ropsten" --rpc --rpcapi "eth,net,personal,web3" --rpcaddr "0.0.0.0" --rpccorsdomain "*" --ws --wsapi "eth,net,personal,web3" --wsorigins "0.0.0.0"

Remember to change the path after the --datadir flag to one that exists on your machine.

Because we run geth in full sync mode, it needs time to sync the entire blockchain (this can take quite long, around 3 days in our case). When the logs look like this, the sync is most likely done.

Full sync

 

Web3 — Testing geth

Refer to this link for web3:

https://github.com/ethereum/wiki/wiki/JavaScript-API

 

First of all, we need to create a NodeJS project and then install the web3 package.

https://gist.githubusercontent.com/sontuphan/0a93559f77b74f7ddbe0248609691f87/raw/16343771927f09d379c40ddf2dd94df986df2cea/geth.js

 

 

Try running the getStateRoot function with a blockNumber close to the newest block number on Ropsten. Avoid the very newest block, since your node may not have synced it yet. My choice is 2596315; the latest block will have moved on by the time you read this article, so be careful.

 

My result of running getStateRoot(2596315):

stateRoot: 0x1a63facb2a82966504a643f7c6cce28ddb47ea056b02009975c665bdada64c81

At this point, we can be sure that our full node works properly.
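
The gist linked above holds the actual code. As a rough sketch only, a getStateRoot helper could look like the following. It assumes an already-connected web3 instance; the endpoint URL in the comment and the callback-style signature are assumptions for illustration, not taken from the gist.

```javascript
// Sketch only: reads the stateRoot field of a block through web3.
// `web3` is expected to be an already-connected instance, for example:
//   const Web3 = require('web3');
//   const web3 = new Web3(new Web3.providers.HttpProvider('http://localhost:8545'));
// (the URL above is an assumed default for a local geth RPC endpoint)
function getStateRoot(web3, blockNumber, callback) {
  // The callback form of getBlock is accepted by both web3 0.x and 1.x.
  web3.eth.getBlock(blockNumber, (err, block) => {
    if (err) return callback(err);
    if (!block) {
      // A null block usually means the node has not synced that height yet.
      return callback(new Error('Block ' + blockNumber + ' not found'));
    }
    callback(null, block.stateRoot);
  });
}
```

Calling getStateRoot(web3, 2596315, (err, root) => console.log(root)) against a synced Ropsten node should then print the stateRoot of that block.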

 

levelDB

https://github.com/google/leveldb

One thing to be careful about with levelDB is that it allows only one process to open the database at a time. So, for the next steps, we need to stop geth once the full sync is done.

To create a connection from NodeJS, we will use two packages: levelup and leveldown. Please install levelup and leveldown; the path module ships with NodeJS.

https://github.com/Level/levelup

Create a connection:

https://gist.githubusercontent.com/sontuphan/8ae71dbd048df4fe0911bf31f597e02f/raw/614ac207d8554239e789961b3dd8fc1703a86ade/leveldb.js

 

 Create a connection to levelDB

 

Here, we connect to levelDB using a path that points to the chaindata folder (this path depends on the --datadir you passed when starting geth). We then make the connection global so that it can be reused later.
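
For readers who cannot open the gist, a minimal sketch of such a connection might look like this. It assumes levelup 2.x or later together with leveldown, and that geth stores its database under geth/chaindata inside the data directory; both helper names are hypothetical.

```javascript
const path = require('path');

// Assumption: geth keeps its LevelDB files in <datadir>/geth/chaindata.
function chainDataPath(datadir) {
  return path.resolve(datadir, 'geth', 'chaindata');
}

// Opens the database (geth must be stopped first).
// levelup and leveldown are loaded lazily, so chainDataPath can be
// used on its own without the packages installed.
function openChainData(datadir) {
  const levelup = require('levelup');
  const leveldown = require('leveldown');
  return levelup(leveldown(chainDataPath(datadir)));
}

// Example (path is a placeholder, adjust it to your own --datadir):
//   global.db = openChainData('/path/to/ropsten');
```

Note that path.resolve does not expand "~", so pass an absolute path.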

 

Get into database

In the web3 part, we got the stateRoot of block number 2596315. Because that value came from the node itself through web3, we can treat it as the reference result.

Now, we warm up by reading the stateRoot from the block header for a specific block number directly from the database, and then we compare it to the result from the web3 part.

Please install the ethereumjs-block module first; we need it to parse block data.

 

 

Warm-up steps

 

Source code:

https://gist.github.com/anonymous/b8eb698e5befe7cf89ff9038f7d0eec1#file-stateroot-js

 

** For the utils library, please take a look at our repo for the source code. The path is ./libs/utils.

First, we left-pad the block number 2596315 with zeros so that its hex representation is 16 characters (8 bytes) long. Note that everything we do here is in hex.

hexBlockNumber = 00 00 00 00 00 27 9d db

In geth, this key uses the character h as prefix and n as suffix (0x68 and 0x6e in ASCII).

prefix = 68
suffix = 6e

And then, we concatenate all of them in sequence.

keyString = prefix + hexBlockNumber + suffix = 68 00 00 00 00 00 27 9d db 6e
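
The key construction above can be sketched in NodeJS as follows. The function names are hypothetical. The headerKey helper is an assumption based on geth's database schema (prefix h, then the block number, then the block hash) and is not spelled out in this article.

```javascript
// 'h' (0x68) and 'n' (0x6e) are the prefix and suffix described above.
const PREFIX = Buffer.from('h');
const SUFFIX = Buffer.from('n');

// Encodes a block number as 8 big-endian bytes (16 hex characters).
function encodeBlockNumber(blockNumber) {
  const hex = blockNumber.toString(16).padStart(16, '0');
  return Buffer.from(hex, 'hex');
}

// Key whose value is the hash of the canonical block at that height.
function canonicalHashKey(blockNumber) {
  return Buffer.concat([PREFIX, encodeBlockNumber(blockNumber), SUFFIX]);
}

// Assumed from geth's schema: the key of the RLP-encoded header is
// prefix + block number + block hash (blockHash is a 32-byte Buffer).
function headerKey(blockNumber, blockHash) {
  return Buffer.concat([PREFIX, encodeBlockNumber(blockNumber), blockHash]);
}

console.log(canonicalHashKey(2596315).toString('hex'));
// prints 680000000000279ddb6e
```

With an open database handle, the first key would be used to look up the block hash, and the second key to fetch the header that ethereumjs-block can then parse.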

Here is the result:

 

Warm-up results

 

As we can see, the final result is the same as the result in the web3 part.

 

Get deeper

We will use the merkle-patricia-tree and rlp modules, so let’s install them.

Now we create a trie library that takes an address and parses all the info saved for it in the state trie.

https://gist.githubusercontent.com/sontuphan/7f8c9dbe90c75120b5150b65e6d99330/raw/3b54ab258c3bed5a6691541c2bca20fec1488a2b/trie.js

 

 

 

Focusing on the getInfoByAddress function: we use merkle-patricia-tree to create a trie from the given root, then we get the data of an address from this trie. Remember that all data was RLP-encoded before being saved, so to read it we need to decode it.
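
As a rough illustration of that lookup (the real code is in the gist), the sketch below assumes merkle-patricia-tree 2.x with its callback API and an open levelup handle passed in as db. The hexToBuffer helper and the exact signature are hypothetical.

```javascript
// Converts a 0x-prefixed (or bare) hex string to a Buffer.
function hexToBuffer(hex) {
  return Buffer.from(hex.replace(/^0x/, ''), 'hex');
}

// Sketch, assuming merkle-patricia-tree 2.x and the rlp package.
// The secure trie hashes each key with keccak256 before the lookup,
// which is how addresses are keyed in the state trie.
function getInfoByAddress(db, stateRoot, address, callback) {
  const SecureTrie = require('merkle-patricia-tree/secure');
  const rlp = require('rlp');
  const trie = new SecureTrie(db, hexToBuffer(stateRoot));
  trie.get(hexToBuffer(address), (err, raw) => {
    if (err) return callback(err);
    if (!raw) return callback(null, null); // no account stored at this address
    callback(null, rlp.decode(raw)); // decoded list of Buffers
  });
}
```

Later major versions of merkle-patricia-tree changed their exports and moved to promises, so this sketch would need adjusting there.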

This is the complete example:

https://gist.github.com/sontuphan/59b30e183f20c99562415f8b42693577#file-accountparser-js

The result:

 The final result

 

The address data contains 4 fields. In order, they are nonce, balance, storageRoot and codeHash.
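
To make the four fields easier to read, the decoded RLP list can be mapped to a plain object. This is a minimal sketch with a hypothetical parseAccount helper; it assumes the input is an array of four Buffers in the order given above.

```javascript
// Interprets a big-endian Buffer as an unsigned integer.
// RLP encodes zero as an empty byte string, hence the length check.
function bufferToBigInt(buf) {
  return buf.length === 0 ? 0n : BigInt('0x' + buf.toString('hex'));
}

// Maps [nonce, balance, storageRoot, codeHash] to named fields.
function parseAccount(fields) {
  if (!Array.isArray(fields) || fields.length !== 4) {
    throw new Error('Expected 4 account fields');
  }
  const [nonce, balance, storageRoot, codeHash] = fields;
  return {
    nonce: bufferToBigInt(nonce),
    balance: bufferToBigInt(balance), // in wei
    storageRoot: '0x' + storageRoot.toString('hex'),
    codeHash: '0x' + codeHash.toString('hex'),
  };
}
```

The balance is kept as a BigInt because wei amounts easily exceed the range that JavaScript numbers represent exactly.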

 

Conclusion

This is the end of our series about Ethereum’s data structure. We hope we were able to give you some quick ideas about how this vast system works. In the near future, we’ll do our best to introduce more useful subjects. Please look forward to it!

 

References

LevelDB in Geth, key and values
Exploring Ethereum’s state trie with Node.js

 


Copyright © 2019 Bacoor Inc. All rights reserved.