I recently had fun with Neo4j which is a graph database.
It allows to represent the data as the graph and it seems to me that it could be usefull as Wot is basically a graph.
Data Import
First of all, I imported the data from the duniterβs sqlite database. (test_net currency)
The script is pretty simple and can be found here : http://hastebin.com/duxekagoxi.py
Data Model
My data model is basic :
Nodes are identities with three properties : block_number, pseudo, pub_key
Relationships are certifications with a single propety : block_number
More complex data models can be used. We could add currency flows for example.
Letβs have fun
Display the Wot as with certifications emitted before the block 2000.
MATCH (a)-[r]->(b)
WHERE r.block < 2000
RETURN a,b,r
Display the shortest path between a specific nodes and all other nodes
MATCH p=ShortestPath(((person {pseudo:βgreyzliiβ}) -[*]-> (f2f)))
RETURN DISTINCT person.pseudo, f2f.pseudo, length(p), [ x in nodes(p)[1β¦] | x.pseudo ]
ORDER BY length(p) DESC
This sytem could be used for a recommendation system (friend of friend).
For example, if a and b signed you and a and b are both signed by c, you probably know c.
If you sign c, you will shorten paths.
MATCH p=( (n {pseudo : βgreyzliiβ}) <-[*2]- (f2f) )
WHERE NOT (n) β (f2f) AND n <> f2f
RETURN f2f.pseudo, count(p), collect([ x in nodes(p)[1β¦-1] | x.pseudo])
ORDER BY count(p) DESC
Give details to a potential member when he is outdistanced.
We can display overdistanced sentries from a specific identity
MATCH (i) β (sentry)
WITH sentry, count(i) as count_i
WHERE count_i >= 10
MATCH p=ShortestPath((me {pseudo: βGaluelβ}) -[*]-> (sentry))
WITH sentry, length(p) as length_p
WHERE length_p > 2
RETURN sentry.pseudo, length_p
Frauders have to create sybil identities and sign it with their Β« real Β» accounts.
They probably sign more newcomers (identities which is not member yet) than others members.
First, we can know who signs newcomers the most.
Ex : For each member, check the early signers (first three ones), then display signers who appears the most.
MATCH (n) -[cert]-> (m)
OPTIONAL MATCH (p) -[previous_cert]-> (m)
WHERE NOT previous_cert.block >= cert.block AND n <> p
WITH n, m, count(previous_cert) as count_pcert
WHERE count_pcert < 3
RETURN n.pseudo, count(m) as count_m, COLLECT(m.pseudo)
ORDER BY count_m DESC
LIMIT 10
We can also try to find collaborative frauders.
Ex : For each member, check the early signers (first three ones), group them, and find how many times they sign a newcomer together.
MATCH (i) -[r]-> (n{pseudo : "cgeek"})
WITH i, r, n
ORDER BY r.block
WITH n, collect(r.block)[2] as certif
SET n.thirdsig = certif
RETURN n.pseudo, certif
MATCH (a) -[cert_a]-> (m) <-[cert_b]- (b), (c) -[cert_c]-> (m)
WHERE id(a) > id(b) > id(c) AND cert_a.block <= m.thirdsig AND cert_b.block <= m.thirdsig AND cert_c.block <= m.thirdsig
RETURN a.pseudo, b.pseudo, c.pseudo, count(m) as count_m, collect(m.pseudo)
ORDER BY count_m DESC
LIMIT 10
Looks like tierce, cgeek, Thatoo, myself and yourself, greyzlii, are damn Sybil attackers⦠We have to watch these members for the future!
Very nice work indeed!
I was wondering how we can distinguish the founders members helping first newcomers to join, from cheating members. It is quite difficult, if a founder get greedy and start cheating already at start.
So I started to play around with this. I especially implemented the F2F request since we are having trouble to expand the WoT.
This is a proof of concept, I was thinking about implementing standard requests under an API that could be added to the nodes.
For example, with the F2F example : http://gtest.duniter.inso.ovh/neo4j/f2f/inso
We can see that I should certify vincentux or moul to reduce the WoT diameter. They are suggested to me because they certified JF/Stanlog or cgeek, who certified me.
What do you think ? Should we put this project in main duniter tree, and integrate this module cleanly to the code of duniter ? (I have no idea how to do so )
The code is here : https://github.com/Insoleet/duniter-neo4j
It can be usefull to give recommandations thru calculations but not the kernel. At the moment we miss information yet in the clients :
Who is sentry or not
What is the distance from 1 member to 90% of sentries ?
Where is the max distance to 90% ?
Where is the formula of that max distance ?
The whole Wot diameter (max distance of all members tio 90% of sentries)
Generally members must have those crucial informations because they are part of the wot code
What is the distance from 1 member to 90% of sentries ?
This computation cannot be done by the clients and needs a specific API like this one
Where is the max distance to 90% ?
What do you mean by βwhereβ ? Who are the members on the border of the wot ?
Where is the formula of that max distance ?
What do you mean by βformulaβ ?
The whole Wot diameter (max distance of all members tio 90% of sentries)
This computation cannot be done by the clients and needs a specific API too
Generally speaking, since the modularization of Duniter, it seems that we could implement such an API which users who would like to provide could install on their node. But I didnβt understand yet how to add/remove module to an installation of Duniter.
Since this module is depending on a graph database engine, it cannot be deployed to desktop servers.
What about to create a new API, named βWOT_GRAPH_APIβ ?
Could you:
Add your project into duniter repo, ( as subproject βduniter-neo4jβ ?)
Add an endpoint to your Duniter node. So we could display this extended API into network view.
After that, we will be able to use it in Cesium+, to suggest friend. But maybe we will need to add features, such as sort (first suggestion should be link that could reduce the WoT size").