Multibranch: Core File System introduction

edit: this content has been added to the project wiki: https://github.com/ucoin-io/ucoin/wiki/uCoin-File-System


Maybe some of you tried the master branch of the uCoin code lately and noticed that the current version is 0.12, which hasn’t been released yet. This version is special because it is the first one to include the multibranch feature. This feature also comes with a brand new data management layer, which I would like to describe in some detail for potential contributors, or just for anyone curious about the technical side of the project :smile: .

The multibranch

What is multibranch exactly? Well, the nodes of the network often don’t agree on the current state of the currency, because the network is P2P in essence and each node is free to have its own point of view. This can lead to situations where one node considers a member as « gone » while another node says « he still is a member », for example.

Such a situation is called a fork of the blockchain, because our 2 nodes share the same history of the blockchain, and as such agree on most of the data, but diverge on the last few blocks.

Multibranch is what allows a node to handle multiple states of the currency for a period of time. After some time, the node will finally select one of the states (one of the forks) and forget the other. In most cases, the node will simply select the history that is the most shared across the network to follow the emerging consensus.
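To make the "most shared history" idea concrete, here is a minimal sketch. It is not uCoin's actual selection rule: the function name `most_shared_head`, the peer names and the fork labels are all made up for illustration, and it assumes each peer simply reports the head of the fork it follows.

```python
from collections import Counter
from typing import Dict, Optional


def most_shared_head(peer_heads: Dict[str, str]) -> Optional[str]:
    """Return the fork head reported by the largest number of peers.

    `peer_heads` maps a (hypothetical) peer identifier to the head of the
    fork that peer currently follows. Returns None when no peer is known.
    """
    if not peer_heads:
        return None
    counts = Counter(peer_heads.values())
    head, _ = counts.most_common(1)[0]
    return head


# Two peers follow fork_a, one follows fork_b.
peer_heads = {"peer1": "fork_a", "peer2": "fork_a", "peer3": "fork_b"}
chosen = most_shared_head(peer_heads)
```

A real node would probably also weigh things like fork length or how long a fork has been around; this only illustrates the "follow the majority" case.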

uCoin database

As you may know, the uCoin database is now just plain files. More precisely, JSON files. No relational database (SQLite, MySQL, …) nor NoSQL database (MongoDB, CouchDB, …) here. I made this choice because uCoin is rather simple in its functional aspects (the protocol only manages 4 entities: Identities, Certifications, Transactions and Memberships). We don’t need third-party software for this part, and this is an important point if we want uCoin to be easy to release on a wide range of platforms such as Linux, Mac, Windows, and ARM platforms like the Raspberry Pi.

So we have a simple file system database.

The Core File System

So what’s the deal here? Handling branches (forks) means being able to have contextual data as opposed to absolute data, because you won’t handle a transaction the same way if you are on B5_b (blue) or B5_a (green) in the diagram above.

Our main issue here is to contextualize data. Believe it or not, this is not an easy task. It is like adding another dimension to your data. Dealing with space is rather simple, but space-time is another world.

I wanted to keep my filesystem database. A possible solution, an easy one, was to simply make hot copies of the database for each fork that occurred. Concretely, if 2 forks occurred, we would have almost 3 times the same database (with only the few deltas of the diverging blocks making them different). But that would consume a lot of disk space!

So I finally opted for the simplest alternative I had in mind, one that would not be that hard to implement: an inheritance model. Just like Object Oriented Programming (OOP) allows us to redefine methods and data of our objects, I wanted to have databases that inherit from each other. Here, I wanted to have 2 objects (B5_a, B5_b) that inherit from the B4 database.
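The inheritance idea can be sketched in a few lines. This is only an illustration under my own assumptions, not the actual CFS library: the `Core` class, the `member_alice` key and the one-JSON-file-per-key layout are hypothetical. Each core owns a folder, writes only go to that folder, and reads fall back to the parent when the file is missing.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


class Core:
    """One layer of data in its own folder, optionally inheriting from a parent."""

    def __init__(self, root: Path, name: str, parent: Optional["Core"] = None):
        self.dir = root / name
        self.dir.mkdir(parents=True, exist_ok=True)
        self.parent = parent

    def write(self, key: str, value) -> None:
        # Writes only ever touch this core's own folder (hypothetical layout:
        # one JSON file per key).
        (self.dir / f"{key}.json").write_text(json.dumps(value))

    def read(self, key: str):
        # Look in this core first, then walk up the inheritance chain.
        core: Optional[Core] = self
        while core is not None:
            path = core.dir / f"{key}.json"
            if path.exists():
                return json.loads(path.read_text())
            core = core.parent
        return None


root = Path(tempfile.mkdtemp())
b4 = Core(root, "B4")
b5_a = Core(root, "B5_a", parent=b4)
b5_b = Core(root, "B5_b", parent=b4)

b4.write("member_alice", {"member": True})
b5_a.write("member_alice", {"member": False})  # on fork "a", alice is gone

result_a = b5_a.read("member_alice")  # redefined in B5_a
result_b = b5_b.read("member_alice")  # inherited from B4
```

The two forks see different answers for the same key, while B4 stores the shared data only once. Only the deltas live in B5_a and B5_b, which is what avoids the full hot copies.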

I’ve called this file system « Core File System » because of its similarity with a CPU: a CPU has cores it delegates computation to, according to its load. uCoin does the same: it delegates data handling to its cores according to the data it has to handle.

Here is a schema summarizing it. I hope it will help you to understand the idea!

In this schema, I represented the 3 blocks of the first diagram (B3, B4, B5_a) as folders. So we have folders B3, B4 and B5_a. When uCoin receives a block B6, it will check whether B6 is based upon B5_a or B5_b (this is super easy to do since each block refers to its previous block by a number + a fingerprint).

The schema represents the case where block B6 is based upon B5_a, and thus uCoin needs to retrieve data from context B5_a to check whether the rules of the uCoin protocol are respected by B6 and decide whether B6 can be added to the blockchain (the chain of B5_a, …, B3) or not.
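Finding which fork an incoming block extends could look like the sketch below. The `Block` fields, the `find_parent` helper and the placeholder fingerprints ("AAA", "BBB", …) are assumptions made for the example, not the real block format: the only thing taken from the description above is that a block points to its predecessor by number + fingerprint.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Block:
    number: int
    fingerprint: str           # fingerprint of this block (placeholder strings here)
    previous_fingerprint: str  # fingerprint of the block it is based upon


def find_parent(block: Block, heads: Dict[Tuple[int, str], str]) -> Optional[str]:
    """Return the name of the core the block is based upon, or None if unknown.

    `heads` maps (number, fingerprint) of each known fork head to the name
    of the folder/core holding its context.
    """
    return heads.get((block.number - 1, block.previous_fingerprint))


heads = {(5, "AAA"): "B5_a", (5, "BBB"): "B5_b"}

b6 = Block(number=6, fingerprint="CCC", previous_fingerprint="AAA")
parent = find_parent(b6, heads)  # B6 builds on B5_a

stray = Block(number=6, fingerprint="DDD", previous_fingerprint="ZZZ")
no_parent = find_parent(stray, heads)  # matches no known head
```

Once the parent is known, the protocol rules for B6 are checked against that context only, so data from the other fork never leaks into the check.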

Source code

I implemented this algorithm in a simple file. Please have a look at the commit below to see the technical details.

The commit consists of 2 files:

  • a unit test file, where the full schema example above is tested
  • the CFS library

https://github.com/ucoin-io/ucoin/commit/58fc45ac25fc90d2b76e952e903d6ae155667a4b
