OK, so this is a big topic . A little disclaimer:
I would be very grateful if you guys interested in uCoin protocol, tell me what you think about it.
Thanks!
We used to talk a lot about Sybil attack & distance rule with a lot of controverse & passion. This leads me today to tell you I want to radically change this rule and enumerate the reasons why below, along with the new form I want to give it and I could call statistical distance rule.
Just to avoid you reading the whole post to know what is the new rule I propose, here it is:
Statistical distance rule is the rule that considers a member to be at a max distance
k
from any other member if it is certified by enough unique members in its first s steps, with s [1;k] and k being the maximum diameter of the web-of-trust.
Or differently said:
If an identity has enough unique certifiers in its first s steps, then we can consider the member will be at a distance k from almost all the members of the WoT.
So, with k = 5 and s = 3 distance rule, the member will be accepted in a WoT of diameter k = 5 if he has at least u unique certifiers in its s = 3 first steps, u value being given by a formula.
So the whole WoT won’t be checked to accept the member. That’s why I call it statistical distance rule: because the distance is not hardly checked, but deduced from statistics.
Note: we use indifferently k to refer to the maximum WoT size and to the maximum distance between 2 parts of the WoT, or two parts in an hypothetic WoT. The reason is: the definition of the diameter is that any part (any member, node or whatever synonym we use to designate a vertex) a has a maximum distance k with any other part of the WoT.
If you want to dive into the details, please read the following paragraphs.
The current distance rule
How it works
Currently, the distance rule is processed the following way:
- when an identity joins or renews, it is checked against the WoT rule
- the identity must match the distance rule with each sentry of the WoT, that is all the members with at least
sigWoT
valid certifications to them - for each sentry:
- try to find a path from the sentry to the identity by using at most
stepMax
steps - if no path exist, the identity does not match the distance rule
- try to find a path from the sentry to the identity by using at most
The computation problem
The whole problem in this algorithm is scalability. Indeed, finding a path is a hard problem, and in the worst case (when no path actually exists) we have to go through almost all the WoT.
While this isn’t a big problem for 1.000 members, 100.000 members with 10 certifications each makes a lot of computation. Imagine: we might have to test all the links of 5 steps of a 100k members WoT for each sentry and for each newcomer/renewer, each time we receive a block with identities in it.
Given the fact we would like uCoin to be run by low machine peers, this might not be an acceptable algorithm. I thought it could work, but the more I dig in the more pessimist I am. Maybe it exists some trick that could make it easy to compute, but I havn’t found it yet.
Also, as I often say, I am not building an Airbus nor a Boeing. I just want something that flies. If I can use a cheaper solution (both in computing & time development cost), I will certainly prefer it. Today, I am looking at the better trade-off to have something that works for 100.000 members (if more, that would be a bonus), yet I don’t want to delay again and again the launch of a beta currency.
That’s why I propose you statistical distance rule.
The possible attacks
Not only the current rule is a hard computation problem, it is also vulnerable to an attack shared by @Arcurus.
To avoid it, we must alter the distance rule to something like:
The distance rule have to apply to at least 90% of the sentries
instead of 100% of them. Even with this, we are not fully protected against the attack (I let you dig in the above link for more details).
The new rule
Abstract
Considering what we want is to roughly have a maximum diameter k for the WoT, because we understood that:
- it may exists individuals at an infinite distance (if they do not certify anyone)
- we can’t have the rule applied continuously (the distance rule is applied against the WoT, so the WoT cannot be tested against itself, we have to consider only “externalities” against it: the newcomers/renewers)
We understand that it can be acceptable to have a rule that statistically leads to a WoT of max diameter k.
The perfect tree of k diameter
First of all, let’s define what we call a perfect tree of k diameter.
Imagine a tree where each member appears only once, for a newcomer. A perfect tree of degree d = 2 would be able to define this newcomer by 2 previous members, themselves being certified by 2 previous members, and so on until step k. So for a diameter k = 3 we have exactly:
$$N = 2^0 + 2^1 + 2^2 + 2^3 = 1 + 2+ 4 + 8 = 15$$ members.
Generalizing with degree d
Exact value
We cannot generalize this rule by an inequality with any value of d (this is a sum of successive powers of a same basis d, which apparently is an unsolved problem). Instead, we can use a table of value telling us what is the link between d, k and perfect tree N using the formula:
$$(a)$$ $$N = \sum\limits_{i=0}^k d^i$$
d k = 1 k = 2 k = 3 k = 4 k = 5
2 3 7 15 31 63
3 4 13 40 121 364
4 5 21 85 341 1 365
5 6 31 156 781 3 906
6 7 43 259 1 555 9 331
7 8 57 400 2 801 19 608
8 9 73 585 4 681 37 449
9 10 91 820 7 381 66 430
10 11 111 1 111 11 111 111 111
11 12 133 1 464 16 105 177 156
12 13 157 1 885 22 621 271 453
13 14 183 2 380 30 941 402 234
14 15 211 2 955 41 371 579 195
15 16 241 3 616 54 241 813 616
16 17 273 4 369 69 905 1 118 481
Which gives us the following graph:
Approximation
For the purpose of uCoin, we could use an approximation of (a) which would be:
$$(b)$$
$$ N = d^k $$
Which gives us the following table:
d k = 1 k = 2 k = 3 k = 4 k = 5
2 2 4 8 16 32
3 3 9 27 81 243
4 4 16 64 256 1 024
5 5 25 125 625 3 125
6 6 36 216 1 296 7 776
7 7 49 343 2 401 16 807
8 8 64 512 4 096 32 768
9 9 81 729 6 561 59 049
10 10 100 1 000 10 000 100 000
11 11 121 1 331 14 641 161 051
12 12 144 1 728 20 736 248 832
13 13 169 2 197 28 561 371 293
14 14 196 2 744 38 416 537 824
15 15 225 3 375 50 625 759 375
16 16 256 4 096 65 536 1 048 576
We can see the distribution is amlost the same as with (a) formula. Here is the difference:
d k = 1 k = 2 k = 3 k = 4 k = 5
2 -33,33% -42,86% -46,67% -48,39% -49,21%
3 -25,00% -30,77% -32,50% -33,06% -33,24%
4 -20,00% -23,81% -24,71% -24,93% -24,98%
5 -16,67% -19,35% -19,87% -19,97% -19,99%
6 -14,29% -16,28% -16,60% -16,66% -16,66%
7 -12,50% -14,04% -14,25% -14,28% -14,28%
8 -11,11% -12,33% -12,48% -12,50% -12,50%
9 -10,00% -10,99% -11,10% -11,11% -11,11%
10 -9,09% -9,91% -9,99% -10,00% -10,00%
11 -8,33% -9,02% -9,08% -9,09% -9,09%
12 -7,69% -8,28% -8,33% -8,33% -8,33%
13 -7,14% -7,65% -7,69% -7,69% -7,69%
14 -6,67% -7,11% -7,14% -7,14% -7,14%
15 -6,25% -6,64% -6,66% -6,67% -6,67%
16 -5,88% -6,23% -6,25% -6,25% -6,25%
So we see that, the more the degrees, the less the difference there is between (a) and (b). Also, the distance k does not affect much the result but on the early stages of the WoT where d \leq 5.
However the usage of this formula (b) allows us to determine d_{min} from any N:
(c)
$$ d_{min} = \exp{(\frac{\ln{(N)}}{k})} $$
Which gives the following values:
This graph actually shows how much N is reachable given each member has d_{min} certifications to him in the WoT of max diameter k.
Notable results for the perfect WoT (perfect tree of the WoT):
- requires d = 16 certifications per member to reach N = 1.118.481 in a WoT of k = 5 diameter
- requires d = 32 certifications per member to reach N = 1.118.481 in a WoT of k = 4 diameter
Of course, uCoin won’t deal with a perfect WoT. We will discuss this point later in this post.
The acceptable heuristic
My proposal is actually very simple:
When a newcomer/renewer comes in, if its first steps (k - 1 steps) lead to at least u = d_{min}^{k - 1} unique certifiers, then we agree that statistically he will be at a distance k from any other member of the WoT.
Did not get it? I just mean that if by checking the certifications, direct or indirect, at 1,2,3 and 4 steps from you we have enough unique certifiers, then we could accept to say you are at distance 5 from any other member.
Generalizing with s steps
We can generalize this rule by replacing k - 1 by s:
When a newcomer/renewer comes in, if its first steps (s steps) lead to at least u = d_{min}^s unique certifiers, then we agree that statistically he will be at a distance k from any other member of the WoT.
Of course, the lower the value of s, the easier it is for eventual cheaters to create fake accounts.
Some values
Here is a graph summing up values of u according to N (N is the size of the WoT when the individual comes in or renew) and k the max diameter of the WoT:
and the same graphe with a closer view:
- The growing curves are quantitative values of u
- The shrinking curves are relative values of u (relative to N)
- 2 curves have exactly the same values:
- %Cumulated 3 steps u with k = 5
- %Cumulated 4 steps u with k = 5
My wish for a WoT with k = 5 would be to have either:
-
s = 4 so we check only the first 4 steps of any newcomer:
\rightarrow means to find ~100.000 unique certifiers in a WoT of ~1.000.000 members (red curve) -
s = 3 so we check only the first 3 steps of any newcomer:
\rightarrow means to find ~4.200 unique certifiers in a WoT of ~1.000.000 members (blue curve)
I would prefer s = 3 for performance reasons.
The real WoT is not a perfect tree
Of course, the real, in-production web-of-trust won’t be a perfect tree.
As a rule of thumb, we could imagine an almost perfect tree to be contained in a tree of double size of the perfect tree.
This is pure speculation, I don’t know what value would be correct. But I need to go on, so I propose 2 because obviously it needs to be \gt 1, and 2 means that statistically one of two certifications we made can reach a k steps individual. This does not shock me, even if I really prefer to have real numbers to verify this assertion.
If we take the rule of double-size, this means members will require, statistically, the double quantity of certifications to reach maximized N.
In a WoT(k = 5), this means people will require to have at least d_{min} \times 2 = 16 \times 2 = 32 certification stock.
Link with other parameters
According to above result, if we also consider the following rules:
- certification expiry: each cert expires after 1 year
- certification replay: each cert can be replayed (from A to B) 5 years after expiry date
… a community aiming at reaching N = 1.000.000 individuals permanent size with k=5 must have constantly 32 cert by individual/year, so each individual must have 32 \times (5 + 1) = 192 directly known individuals over 5 + 1 = 6 years, which is rougly 3 individuals a month.
This results seems acceptable to me.
We finally have to consider the certification delay rule, which says “an individual can make a certification every X days”.
Given above values, X have to be \lt 365,25 / 32 = 11,41 days.
If we want some margin with certification stock, and put it to 60 for example, we will have to adapt certification delay rule value accordingly: X \lt 365,25 / 60 = 6 days.
Remarkable cases
s = k
This means any member must be directly connected to any member. This is a very strong rule, and will lead to blocking cases if a member refuses to certify individuals.
Even worse, an individual who do not certify anyone will have the power of death over the WoT as long as he stays to be a member.
That’s why, in reality, it is wisheable to have s \lt k.
s = k = 1
This is pure trust distance rule: every member directly signs each other member.
Consequences
Implementing such a rule would lead to:
- Removal of
sigWoT
parameter - Add s parameter
- Enventually add a p factor, to say the threshold to reach u_{threshold} = p \times u. So we can say “each member has to reach at least p times the theoretical value of unique certifiers”.
So on the implementation side, the rule is rather simple. Finding the unique certifiers are a one shot SQL query, but the higher s and N the more complex is the query in terms of execution.
Now I am waiting for your comments.
Thanks for reading.
Please find LibreOffice Calc file from which I’ve made the graphs: Statisitcal_distance.ods (93.2 KB)
More links: