Deciphering of Bytom’s Data Structure
Author: Guo Guanghua, developer of Bytom, worked at ARRIS Group for years, former Java developer at Crypape Inc. One of the developers of CITA-Inter-Enterprise Trust Automation and smart digital draft system ZhongChao Smart Card Research Institute. Contributed code to Parity (Ethereum client), Blog: Gguoss.github.com
Many technical features have been introduced when designing the data structure of Bytom, like patricia tree，utxo， bvm， account model，protobuf，sql，memcache etc. The article explains why and how these features are implemented in Bytom.
Why use PAT Tree?
1. PAT tree allows fast lookup, a feature of Radix Tree(https://en.wikipedia.org/wiki/Radix_tree)
2. PAT tree allows fast data verification, a feature of Merkle Tree
In a distributed system, consistency and effectiveness are critical. By using PAT tree in Bytom system, data can be quickly verified and consistency of each state machine is quickly proved. The quick traceability of data allows Bytom to finds and test the validity of the data in each snapshot.
How is PAT Tree implemented in Bytom?
The PAT tree of Ethereum is based in 16 forks and in two layers. The first layer manages all accounts and the second layer managed the storage content of each account.
Ethereum PAT Tree
Difference between Bytom’s and Ethereum’s PAT tree
1. PAT tree of Bytom is 2-fork Radix Tree.
2. PAT tree of Bytom is to manage unspent outputs.
Bytom PAT Tree
Why use UTXO?
UTXO comes from Bitcoin and defines the circulation of Bitcoin on the network. The face value of bitcoin always remains the same since its initial distribution. The data structure of Bitcoin centers around the token instead of people who uses them. UTXO makes it easy to regulate and track record of assets, which is exactly what Bytom is designed for.
How UTXO is implemented in Bytom?
Compared with the UTXO structure of Bitcoin, 3 more fields are added to the Bytom’s model:
1. Assetid. This field is used to uniquely identify various assets.
2. Accountid, which facilitates the indexing and management of various accounts. Compared to Bitcoin, Bytom introduces the account model, which will be described later.
3. Program, accounts with which allows hosting of program written in Ivy language. The program will be executed in transaction by the Turing complete BVM.
BVM is activated in the conversion process of the state machine, which is execute (transaction) process.
Why use BVM?
The script language of Bitcoin is not Turing Complete and stack with very limited functions. It is difficult to achieve some of the more complex functions, such as verify_spv (cross-chain anchoring verification function, like btc_relay), or a simple function like multi_lock function (Encryption by M people, decryption by N set of private keys, 0 These functions could be realized by program written in Solidity language in EVM in Ethereum, but EVM is too complicated for Bytom that is only interested in the asset management. So Bytom adopts the CVM developed by [Chain] (https://chain.com/) company’s [Ivy] (https://chain.com/docs/1.2/ivy-playground/docs) , which allows high-level programming language to build customized extension.
How to use BVM?
When user sends a transaction, he can write his own program. Once transaction is confirmed, BVM will execute the code. Because BVM is Turing complete virtual machine, a pricing feed equivalent to gas * gasprice in Ethereum)mechanism is needed to solve the downtime problem.
Why use account model?
It is easy to manage relevant data in Account Model as it’s user-based and very straightforward. Also the execution based on account model is easy to perform on BVM. Moreover, the asset model we introduced is easy to regulate and query assets.
How to implement account model on Bytom?
The account model in Bytom is also divided into two categories: asset accounts and individual accounts, which are unlike the individual accounts and the contract accounts in Ethereum.
1. Asset Account
- assetid is globally unique asset identification id.
- alias is the alias of the asset, which is easy to remember, such as (gold, silver).
- vmversion is for the dynamic transit of soft fork.
- program is the program that needs to be executed when the asset is published.
- initialblockhash refers to the block height at which the asset is registered.
- signer manages public and private key pairs to sign with the private key of the asset, and only the person who owns the private key of the asset can publish the asset.
-definition Note of the asset
2 Personal account:
- accountid globally unique account id
-alias account name
- signer, private key pairs, to send the transaction.
- * utxos index of all unspent transaction under the account, so that the assets under the account can be quickly managed.
- program refers to the program that can be inserted when broadcasting transactions.
The physical structure of UTXO physical is stored in memcache. The ogical structure of UTXO is managed with a binary PAT tree.
Personal account, Releated UTXO could be quickly indexed through AccountId. Asset accounts can manage relevant utxo quickly according to AssetId.
Figure above is an uml diagram describing the main data structure of Bytom.
Bytom uses the PAT tree to organize utxo as a world state tree.
The account model is divided into two categories, asset accounts and individual accounts, which can be indexed to manage their associated utxo.
UTXO pool will use the memory database like memcache, relational database will be chosen for storage. Data will be sequenced by protobuf.
Each account can look up its own UTXO from the world state tree when doing transaction from account, and write their asset program as TxInput.
When transaction is packed into block, validator node will instantiate the BVM and execute the program in all TxInput of that transaction.