good morning!!!!

Skip to content
Snippets Groups Projects
Forked from an inaccessible project.
Code owners
db_walkthrough.MD 22.03 KiB

This document attempts to explain how Turbo-Geth organises its persistent data in its database and how this organisation is different from go-ethereum, the project from which it is derived. We start from a very simple genesis block and then apply one block containing an ETH transfer. For each step, we show visualisations produced by the code available in Turbo-Geth and code added to a fork of go-ethereum.

Genesis in Turbo-Geth

For the genesis block, we generate 3 different private keys and construct Ethereum addresses from them. Then, we endow one of the accounts with 9 ETH, and two others with 0.2 and 0.3 ETH, respectively. This is how the initial state trie looks:

genesis_state

In this and other illustrations, the colored boxes correspond to hexadecimal digits (a.k.a. nibbles), with values 0..f. Here is the palette:

hex_palette

First thing to note about the illustration of the initial state trie is that the leaves correspond to our accounts with their ETH endowments. Account nonces, in our case all 0s, are also shown. If you count the number of coloured boxes, from top to bottom, up to any of the account leaves you will get 64. Since each nibble occupies half a byte, that makes each "key" in the state trie 32 bytes long. However, account addresses are only 20 bytes long. The reason we get 32 and not 20 is that all the keys (in our case account addresses) are processed by the Keccak256 hash function (which has 32 byte output) before they are inserted into the trie. If we wanted to see what the corresponding account addresses were, we will have to look into the database. Here is what Turbo-Geth would persist after generating such a genesis block:

genesis_db

The Turbo-Geth database is a key-value store organised in tables (also commonly referred to as "buckets"). This is the list of the main tables in the database:

  • Headers
  • Block Bodies
  • Header Numbers
  • Receipts
  • PlainState
  • History Of Accounts
  • Change Sets
  • HashedState
  • IntermediateTrieHashes
  • Tx Senders

Table "Headers"

This table stores information about block headers. For each block there are three types of block header records with the following (key, value) formats:

  • FULL HEADER: contains the complete block header with all its information parameters
    • key : 8-byte big-endian block number + 32-byte block hash
    • value : the RLP-encoded block header structured as defined in the Ethereum Yellow Paper:
      • parent hash: 32-byte hash of the parent block
      • uncles hash: 32-byte hash of the uncle blocks
      • coinbase: 20-byte address beneficiary of mining reward
      • state root: 32-byte root hash of the state trie
      • transactions root: 32-byte root hash of the trie structure made up of the block transactions
      • receipts root: 32-byte root hash of the trie structure made up of the transaction receipts
      • logs bloom: 256-byte Bloom filter composed from indexable information in each log entry from the transaction receipts
      • difficulty: 8-byte scalar value corresponding to the diffculty level of this block
      • block number: 8-byte scalar value equal to the number of ancestor blocks in the chain (aka block heigth)
      • gas limit: 8-byte scalar value equal to the current limit of gas expenditure per block
      • gas used: 8-byte scalar value equal to the total gas used in transactions in this block
      • timestamp: 8-byte scalar value equal to the Unix epoch timestamp at the inception of this block
      • extra data: up to 32-byte array of additional relevant data for this block (at least for Eth-hash consensus engine, for Clique could be more)
      • mix_hash: 64-byte hash proving, combined with nonce, the computation amount spent in block mining
      • nonce: 8-byte value proving, combined with mix hash, the computation amount spent in block mining
  • TOTAL DIFFICULTY HEADER: contains the total mining difficulty (TD) of the chain ending in such specific block
    • key : 8-byte big-endian block number + 32-byte block hash + 0x74 suffix (ASCII code for t character)
    • value : variable-length RLP-encoded total difficulty value: the cumulative difficulty value from first block to this one
  • CANONICAL HEADER: contains the block hash
    • key: 8-byte big-endian block number + 0x6E suffix (ASCII code for n character)
    • value: 32-byte block hash

as shown in the following picture for the block 0 (from top to bottom the three types of records, key on the left and value on the right):

genesis_db_headers

You can check that for the genesis block

  • block number 0 is encoded as 0x0000000000000000 (see 8-byte zeros start of each key)
  • block hash is 0x61eb...aba1
  • logs bloom is 256-byte zeros (see the 8 white rows of all 32-byte 0s)
  • total difficulty is 0x80, which is RLP encoding of value 0

Table "Block Bodies"

genesis_db_block_bodies

The keys in this table are concatenations of 8-byte encoding of the block number and 32-byte block hash. The values are RLP-encoded list of 2 structures:

  1. List of transactions
  2. List of ommers (a.k.a uncles)

In the case of the genesis block, all of these lists are empty (RLP encodings 0xC0), and the prefix 0xC3 means in RLP "some number of sub-structures with the total length of 3 bytes".

Next three tables, "Last Header", "Last Fast", and "Last Block", always contain just one record each, and their keys are always the same, the ASCII-encodings of the strings LastHeader, LastFast, and LastBlock, respectively. The values record the block/header hash of the last header, receipt or block chains that the node has managed to sync from its peers in the network. The value in "Last Fast" bucket is not really used at the moment, because turbo-geth does not support Fast Sync.

Table "Header Numbers"

Is a mapping of 32-byte header/block hashes to the corresponding block numbers (encoded in 8 bytes):

genesis_db_header_numbers

Table "Receipts"

Records the list of transaction receipts for each block:

genesis_db_receipts

The 8 bytes of the key (or 16 nibbles, equaling to 0s here) encode the block number, which is 0 for the Genesis block. The value is the CBOR-encoded list of receipts. In our case, there were no transactions in the Genesis block, therefore, we have CBOR encoding of an empty list, 0xf6.

Table "PlainState"

genesis_db_accounts

Store together Accounts and Storage.

Accounts: key="account address", value="current state of each account". Storage: key="account address+incarnation+sorage hash", value="current value of storage".

The accounts encoded by function account.go:EncodeForStorage() so that the first byte of a field is its length and bytes afterward the field itself. They all start with a fieldset of 0x02, for each bit set into the fieldSet a field is present. In this case 0x02 in binary is 10 meaning that only the second field is set (the balance). the order in the fieldset is the following:

  • 1st bit: Nonce
  • 2nd bit: Balance
  • 3rd bit: Incarnation
  • 4th bit: Code Hash

Therefore, immediately after the fieldset we have the length in byte of the balance: 0x08 (8 bytes). the following 8 bytes (aka. 16 nibbles) contains the hexadecimal value of the balance that in the first record would be: 0x7ce66c50e2840000.

$ python
Python 2.7.15
Type "help", "copyright", "credits" or "license" for more information.
>>> 0x7ce66c50e2840000
9000000000000000000

Which is 9 followed by 18 zeros, which is 9 ETH (1 ETH = 10^18 wei).

The third bit of the field set represent the Incarnation. The Incarnation is a turbo-geth specific attribute, which is used to make removal and revival of contract accounts (now possible with CREATE2 since Constantinopole) efficient with turbo-geth's database layout. For now it will suffice to say that all non-contract accounts will have incarnation 0 (not set), and all contract accounts will start their existence with incarnation 1.

Contract accounts may also contain a contract code hash and storage root. These two pieces of information would make the record in the "Accounts" bucket contain 5 instead of 3 fields.

Table "History Of Accounts"

genesis_db_history_of_accounts

key="account addresses + incarnation", value="encoded array of block numbers, where account changed/created".

The history of accounts records how the accounts changed at each block. But, instead of recording, at each change, the value that the accounts had after the change, it records what value the accounts had before the change. That explains the empty values here -- it records the fact that these three accounts in questions did not exist prior to the block 0.

Table "Change Sets"

Records the history of changes in accounts and contract storage. But, unlike in "History of Accounts" and "History of Storage", where keys are derived from accounts' addresses (key="address hash + block number"), in the "Change Sets" table, keys are derived from the block numbers:

genesis_db_change_sets

In the cases of our Genesis block, the key is composed from the encoding of the block number (0x20), and the ASCII-code of hAT (meaning history of Acounts Trie). The "Change Set" table records changes that happen to accounts and contract storage slots at every block. It is important to note that the values recorded in the "Changes Set" table are not the values the accounts (or storage slots) had AFTER the change, it records what value the accounts (or storage slots) had BEFORE the change. That explains the empty values here - it records the fact that these three accounts in question did not exist prior to the block 0. The encoding of the values in the records is tailored for fast access and binary search. It has 5 parts:

  1. Number of keys-value pairs, encoded as a 4-byte (32-bit) number. In this example, it is 0x00000003, which means there are 3 key-value pairs
  2. Size of each key, also encoded as a 32-bit number. All keys are the same size, which makes it possible to access them without deserialisation. In this example, it is 0x00000020, which 32, meaning that all keys are 32 bytes long.
  3. Keys themselves. In our examples, these are the coloured boxes before the streak of white 0s. Keys are sorted lexicographically. This, together with the keys being the same size, allows binary search without deserialisation, as well as linear-time merge of multiple changesets.
  4. Value offsets. These offsets mark the beginning of the next, 5th part as offset 0. First value has offset 0. In our example, all values are empty strings, therefore we see 3 zero offsets (24 white boxes with zeros in them).
  5. Values themselves. In our example, they are empty, so this 5th part is not present.