Commit 5b62588a authored by canepat, committed by GitHub

Update description of table Headers (#1258)

* Update description of table Headers

* Complete the table Headers description
parent dd21d07a
Database Walkthrough
====================
This document attempts to explain how Turbo-Geth organises its persistent data in its database and how this organisation is
different from go-ethereum, the project from which it is derived. We start from a very simple genesis block and then
apply one block containing an ETH transfer. For each step, we show visualisations produced by the code available in
Turbo-Geth and code added to a fork of go-ethereum.
Genesis in Turbo-Geth
---------------------
......
This is how the initial state trie looks:
![genesis_state](state_0.png)
In this and other illustrations, the colored boxes correspond to hexadecimal digits (a.k.a. nibbles), with values 0..f.
Here is the palette:
![hex_palette](hex_palette.png)
The first thing to note about the illustration of the initial state trie is that the leaves correspond to our accounts with their
ETH endowments. Account nonces, in our case all 0s, are also shown. If you count the number of coloured boxes, from top
to bottom, up to any of the account leaves you will get 64. Since each nibble occupies half a byte, that makes each "key"
in the state trie 32 bytes long. However, account addresses are only 20 bytes long. The reason we get 32 and not 20 is
that all the keys (in our case account addresses) are processed by the `Keccak256` hash function (which has a 32-byte
output) before they are inserted into the trie. If we want to see what the corresponding account addresses are, we
have to look into the database. Here is what Turbo-Geth would persist after generating such a genesis block:
![genesis_db](changes_0.png)
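
The nibble count mentioned above can be checked with a few lines of code. The following is a minimal sketch, not Turbo-Geth code: `to_nibbles` is a hypothetical helper, and the 32-byte and 20-byte inputs are placeholders with the right lengths rather than a real `Keccak256` output or a real account address.

```python
def to_nibbles(key: bytes) -> list:
    """Split each byte into its high and low 4-bit halves (nibbles)."""
    nibbles = []
    for byte in key:
        nibbles.append(byte >> 4)    # high nibble
        nibbles.append(byte & 0x0F)  # low nibble
    return nibbles


# Placeholder standing in for a 32-byte Keccak256 hash of an address.
# It is NOT a real hash, only a value with the right length.
hashed_key = bytes(range(32))

# Placeholder standing in for a raw 20-byte account address.
address = bytes(20)

print(len(to_nibbles(hashed_key)))  # 64 nibbles for a 32-byte hashed key
print(len(to_nibbles(address)))     # 40 nibbles for an unhashed 20-byte address
```

Every nibble produced this way is a value in the range 0..f, which is what each coloured box in the illustrations represents.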
The Turbo-Geth database is a key-value store organised in tables (also commonly referred to as "buckets"). This is the list
of the __main__ tables in the database:
- __Headers__
- __Block Bodies__
- __Header Numbers__
- __Receipts__
- __PlainState__
- __History Of Accounts__
- __Change Sets__
- __HashedState__
- __IntermediateTrieHashes__
- __Tx Senders__
Table "Headers"
---------------
This table stores information about block headers. For each block there are three types of block header records with the following (key, value) formats:
* __FULL HEADER__: contains the complete block header with all its fields
  - _key_ : 8-byte big-endian block number + 32-byte block hash
  - _value_ : the RLP-encoded block header, structured as defined in the [Ethereum Yellow Paper](https://github.com/ethereum/yellowpaper):
    * parent hash: 32-byte hash of the parent block
    * uncles hash: 32-byte hash of the uncle blocks
    * coinbase: 20-byte address of the beneficiary of the mining reward
    * state root: 32-byte root hash of the state trie
    * transactions root: 32-byte root hash of the trie structure made up of the block transactions
    * receipts root: 32-byte root hash of the trie structure made up of the transaction receipts
    * logs bloom: 256-byte Bloom filter composed from indexable information in each log entry from the transaction receipts
    * difficulty: 8-byte scalar value corresponding to the difficulty level of this block
    * block number: 8-byte scalar value equal to the number of ancestor blocks in the chain (a.k.a. block height)
    * gas limit: 8-byte scalar value equal to the current limit of gas expenditure per block
    * gas used: 8-byte scalar value equal to the total gas used in transactions in this block
    * timestamp: 8-byte scalar value equal to the Unix epoch timestamp at the inception of this block
    * extra data: byte array of additional relevant data for this block (at most 32 bytes for the Ethash consensus engine; for Clique it can be longer)
    * mix hash: 32-byte hash proving, combined with the nonce, the amount of computation spent in mining the block
    * nonce: 8-byte value proving, combined with the mix hash, the amount of computation spent in mining the block
* __TOTAL DIFFICULTY HEADER__: contains the total mining difficulty (TD) of the chain ending in this specific block
  - _key_ : 8-byte big-endian block number + 32-byte block hash + `0x74` suffix (ASCII code for the `t` character)
  - _value_ : variable-length RLP-encoded total difficulty value: the cumulative difficulty from the first block to this one
* __CANONICAL HEADER__: contains the block hash
  - _key_ : 8-byte big-endian block number + `0x6E` suffix (ASCII code for the `n` character)
  - _value_ : 32-byte block hash

The following picture shows the three types of records for block 0 (from top to bottom; key on the left, value on the right):
![genesis_db_headers](changes_0_Headers.png)
The keys for the first two records start with the 8-byte encoding of the block number (0), followed by
the block hash (or header hash, which is the same thing). The second record also has the suffix `0x74`,
which is the ASCII code for `t`. The records of the first type store the actual headers in their values.
The records of the second type store the total mining difficulty (TD) of the chain ending in that specific header.
In our case it is `0x80`, which is the RLP encoding of 0.
The records of the third type have keys composed of the 8-byte encoding of the block number (0 here) and the
suffix `0x6E`, which is the ASCII code for `n`. These records allow for the lookup of a header/block hash, given its block
number. They are also called "canonical header" records, because there might be multiple headers for a given
block number, and only one of them is deemed "canonical" at a time.
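
The three key layouts described above can be sketched as follows. This is an illustration only, not code from Turbo-Geth: the function names are hypothetical, and the all-zero block hash is a placeholder rather than the actual genesis hash.

```python
import struct

HASH_LENGTH = 32  # block hashes are 32 bytes long


def encode_block_number(number: int) -> bytes:
    """Encode a block number as 8 bytes, big-endian."""
    return struct.pack(">Q", number)


def header_key(number: int, block_hash: bytes) -> bytes:
    """Key of a full header record: block number + block hash."""
    if len(block_hash) != HASH_LENGTH:
        raise ValueError("block hash must be 32 bytes")
    return encode_block_number(number) + block_hash


def total_difficulty_key(number: int, block_hash: bytes) -> bytes:
    """Key of a total difficulty record: header key + 't' (0x74)."""
    return header_key(number, block_hash) + b"t"


def canonical_key(number: int) -> bytes:
    """Key of a canonical header record: block number + 'n' (0x6E)."""
    return encode_block_number(number) + b"n"


def classify_key(key: bytes) -> str:
    """Tell the three record types apart by key length and suffix."""
    if len(key) == 8 + HASH_LENGTH:
        return "full header"
    if len(key) == 8 + HASH_LENGTH + 1 and key[-1:] == b"t":
        return "total difficulty"
    if len(key) == 8 + 1 and key[-1:] == b"n":
        return "canonical"
    raise ValueError("not a recognised Headers key")


# Placeholder hash with the right length (NOT the real genesis hash).
placeholder_hash = bytes(HASH_LENGTH)

print(canonical_key(0).hex())                                 # 00000000000000006e
print(len(header_key(0, placeholder_hash)))                   # 40
print(classify_key(total_difficulty_key(0, placeholder_hash)))  # total difficulty
```

Because the block number comes first and is big-endian, records sort by block height in a byte-ordered key-value store, and the canonical record for a given height can be found without knowing the block hash.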
You can check that for the genesis block:
* block number 0 is encoded as `0x0000000000000000` (see the eight zero bytes at the start of each key)
* block hash is `0x61eb...aba1`
* logs bloom is 256 zero bytes (see the 8 white rows of 32 zero bytes each)
* total difficulty is `0x80`, which is the RLP encoding of the value 0
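
To see why the value 0 is stored as `0x80`, here is a minimal sketch of RLP encoding restricted to non-negative integers. It follows the general RLP rules for byte strings. `rlp_encode_uint` is a hypothetical helper written for this walkthrough, not a function from Turbo-Geth or go-ethereum.

```python
def rlp_encode_uint(value: int) -> bytes:
    """RLP-encode a non-negative integer as a big-endian byte string."""
    if value < 0:
        raise ValueError("RLP scalars must be non-negative")
    # Integers are serialised big-endian with no leading zero bytes,
    # so zero becomes the empty byte string.
    payload = value.to_bytes((value.bit_length() + 7) // 8, "big")
    if len(payload) == 1 and payload[0] < 0x80:
        # A single byte below 0x80 is its own encoding.
        return payload
    if len(payload) <= 55:
        # Short string: one prefix byte 0x80 + length, then the payload.
        return bytes([0x80 + len(payload)]) + payload
    # Long string: prefix 0xB7 + length-of-length, then length, then payload.
    length = len(payload)
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([0xB7 + len(length_bytes)]) + length_bytes + payload


print(rlp_encode_uint(0).hex())     # 80
print(rlp_encode_uint(127).hex())   # 7f
print(rlp_encode_uint(128).hex())   # 8180
print(rlp_encode_uint(1024).hex())  # 820400
```

The zero case falls out of the short-string rule: the payload is empty, so the encoding is just the prefix byte `0x80` with a length of 0.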
Bucket "Block Bodies"
---------------------
......