How does SAFE solve the Byzantine generals and double spending problems?

318 views
Skip to first unread message

kenne...@gmail.com

unread,
Apr 21, 2014, 9:30:30 AM4/21/14
to maidsafe-d...@googlegroups.com
Are there any examples and code regarding the Byzantine generals and double spending solutions that are being used in SAFE? How does SAFE solve this problem?

David Irvine

unread,
Apr 21, 2014, 9:38:40 AM4/21/14
to maidsafe-d...@googlegroups.com

On Mon, Apr 21, 2014 at 2:30 PM, <kenne...@gmail.com> wrote:
Are there any examples and code regarding the Byzantine generals and double spending solutions that are being used in SAFE? How does SAFE solve this problem?

It's a very big picture view. The network cryptographically secures identities (see routing). These identities (see passport) must perform certain actions that are monitored by nearby nodes (see vaults). With this in place (if we ignore mutating and signed data, as that's easier). Then you can take an action on the network and know it is controlled. So store data is one action. Store data under certain rules is another, the rules are (it's defined by the types (see common))
1: Immutable data (self validating)
2: Mutable Data (i.e. directories), signed to validate actions
3: Mutable transferrable data (mutable signed by old owner for new owner)
4: Mutable non transferrable data (record an action)

These solve the 2 generals problem as the villagers have to behave properly with these messages, if that makes sense. The network routes around bad nodes or misbehaving nodes.

In terms of double spend then if you imagine type 3 above. The network atomically transfers ownership and this happens on the network (away from the people, I know that sounds weird) and this transfer can be queried to the network to confirm who owns that data element. This is the requirement for atomicity and ensures a double spend cannot happen. We are debating whether double spends events (they are very detectable and ignored) should actually invalidate a coin (for instance) so hack attempts actually end up losing money.  

Hope that helps, it's a large area to discuss for sure and I cannot do it justice here, we do need technical authors for sure. The vault docs with routing will help a lot in seeing how this is implemented. 



--

David Irvine
maidsafe.net 
twitter: @metaquestions

David Irvine

unread,
Apr 21, 2014, 9:42:02 AM4/21/14
to maidsafe-d...@googlegroups.com
I should say the atomicity part of the network is by far the toughest part of this. It's many systems together, accumulators, sync and actions. You can see that in the vault library.

kenne...@gmail.com

unread,
Apr 21, 2014, 9:47:50 AM4/21/14
to maidsafe-d...@googlegroups.com
Okay, that has started to clear it up for me. Thank you for responding so quickly.

I am going to reread the documentation, but my take away is that a large portion, if not all, of the transaction security comes from not being able to easily position nodes in the network and there being a limited number of nodes that any one node is directly connected to. Is that a correct interpretation?

David Irvine

unread,
Apr 21, 2014, 10:04:27 AM4/21/14
to maidsafe-d...@googlegroups.com

On Mon, Apr 21, 2014 at 2:47 PM, <kenne...@gmail.com> wrote:
I am going to reread the documentation, but my take away is that a large portion, if not all, of the transaction security comes from not being able to easily position nodes in the network and there being a limited number of nodes that any one node is directly connected to. Is that a correct interpretation?

Yes this is a key element. We initially (some designs past) used signature checks at each step, it is very safe but significantly slower (as you need to get public keys etc.). This mechanism allows us to use close groups of consensus. To join a group requires crypto security, but when there then the group members monitor each other and their other persona members monitor them, with the group connected close able to put nodes off etc. To make it more difficult there is a ranking mechanism in place which we have not made great use of yet, but this can be used to weight decisions. If course all these mechanisms must be careful off sleeping agents (pretend to be good for a while nodes). We use the network growth and churn as combatants against that. 

It is churn that adds an extra element of security in a weird way, churn forces close group changes making attacks even harder ( if you are careful about placement). When we run tests we use no churn to get the worst case though (and that is even a debate). It is a bit of a mind bender, but allows some very cool things to happen, such as network agreement but using only a subset of the whole network. It is a truly fascinating and scary part of the thinking. We need to always look out for side vectors and the strangest of attacks. This is the mechanism to implement blockchain type technology at scale I believe (although not necessarily a chain). I also think there are a huge number of projects can benefit and this is why getting SAFE in the hands of everyone and letting others feel part of it is very important. There are some really good mechanisms in the vaults, as well as the client parts, which are a bit revolutionary really (log into a decentralised network). To good to be owned not have ego attached to them. 

I hope you jump right in and help out, good questions and a great approach to finding answers. 

David Irvine

unread,
Apr 21, 2014, 10:08:44 AM4/21/14
to maidsafe-d...@googlegroups.com
We also need to document chained consensus groups, basically trust no group, but force each group to deterministically require another group to agree. With chains of three we have found we cannot attack it. Groups of two seem to be very secure as well at scale. It requires some math to prove these tests, but the tests themselves are a good mechanism to test the hypothesis.  All help welcome with the math, it is pretty deep probability stuff for sure. (sorry for flippant math remarks, I am an Engineer and cannot help it :-)) Tests are have 100,000 nodes and create several hundreds of millions of attack nodes to find close group consensus. We will publish those for sure, they take weeks to run, but worth it.

kenne...@gmail.com

unread,
Apr 21, 2014, 10:51:40 AM4/21/14
to maidsafe-d...@googlegroups.com
Thank you for the explanation. I have a fair bit to digest, but it all comes together very well.

I think there are many applications for a network like this, and the test data will be great to look over. I would love to help out and already have a lot of application ideas bubbling up in my head. I am very interested in the project and will definitely keep going over everything and asking more questions as I have them.

dominicwi...@gmail.com

unread,
May 27, 2014, 3:53:36 PM5/27/14
to maidsafe-d...@googlegroups.com
Hi David, really intrigued and impressed with MaidSafe.

Your solution for Byzantine Fault Tolerance (BFT) has piqued my interest because I've been working with it recently.

BFT is a tricky area! Here is a quote from Leslie Lamport in his 1982 paper “The Byzantine Generals Problem” where he cautions against nonrigorous reasoning saying “we know of no area in computer science or mathematics in which informal reasoning is more likely to lead to errors…”.

As I understand, you organize nodes into consensus groups. Then for a consensus group to "finalize" a decision, it must find some other number of consensus groups that agree?

My questions are:-

1/ Presumably for any decision, the network chooses the consensus groups that must agree with each other, otherwise a compromised group could simply choose another compromised group?

2/ If some network algorithm chooses the groups that must agree on a decision deterministically, what stops an attacker doing a search of decisions until he finds one that would be decided by a chain of compromised groups?

3/ In relation to forgoing point, if an attacker wants to compromise a particular decision, what mechanisms prevent him using a Sybil attack (i.e. introducing or removing nodes) to modify the consensus group memberships such that the network algorithm will choose a chain of compromised groups?

4/ Within a consensus group, how is consensus reached. Specifically, how to groups deal with Byzantine members, which may tell one good member they vote for X, and another good member they vote for Y i.e. is such malicious behavior detectable via signed messages or something, and how to nodes confirm to each other that this hasn't happened?

5/ What is the maximum proportion of bad nodes your algorithm can tolerate e.g. traditional BFT usually requires less than a third to be bad?

Big thanks.

David Irvine

unread,
May 27, 2014, 4:20:22 PM5/27/14
to maidsafe-d...@googlegroups.com
Hi and welcome.

On Tue, May 27, 2014 at 8:53 PM, <dominicwi...@gmail.com> wrote:
1/ Presumably for any decision, the network chooses the consensus groups that must agree with each other, otherwise a compromised group could simply choose another compromised group?
The groups are all deterministic based on action and message. 

2/ If some network algorithm chooses the groups that must agree on a decision deterministically, what stops an attacker doing a search of decisions until he finds one that would be decided by a chain of compromised groups?
The chain should be too long for this to be feasible.  

3/ In relation to forgoing point, if an attacker wants to compromise a particular decision, what mechanisms prevent him using a Sybil attack (i.e. introducing or removing nodes) to modify the consensus group memberships such that the network algorithm will choose a chain of compromised groups?
Covered by the message and action determinism.  

4/ Within a consensus group, how is consensus reached. Specifically, how to groups deal with Byzantine members, which may tell one good member they vote for X, and another good member they vote for Y i.e. is such malicious behavior detectable via signed messages or something, and how to nodes confirm to each other that this hasn't happened?
This is a mix of signed messages and accurate detection of xor distance (authority to act distance). AFAIK no dht has ever been able to achieve arriate distance detection at scale.  If a group goes bad (Cannot reach or reach wrong answer) it's detected by the surrounding groups in the chain. This is also how the send ac works in routing, ask for the next group +1 to confirm the next group acted properly.  

5/ What is the maximum proportion of bad nodes your algorithm can tolerate e.g. traditional BFT usually requires less than a third to be bad?
We are finalising the math, but seems well over the usual 50% attack on distributed system. 

HTH


I really really need to document this and have it in a place to point at, many people ask the same question all over the Internet it seems :-)



--

David Irvine

dominicwi...@gmail.com

unread,
May 27, 2014, 7:24:37 PM5/27/14
to maidsafe-d...@googlegroups.com
Thanks very interesting. So as I understand:

> 2/ If some network algorithm chooses the groups that must agree on a decision deterministically, what stops an attacker doing a search of decisions until he finds one that would be decided by a chain of compromised groups?

>> The chain should be too long for this to be feasible.  

Ok as I understand chains are required to be long enough they cannot contain only compromised groups.

But, if we consider that DHTs have infinite size and therefore can contain very large numbers of compromised groups as a small proportion, and that some limit on chain size is required for performance reasons, presumably there has to be some possibility that a chain might be created completely from compromised groups, even if this is vanishing unlikely by random selection?

> 3/ In relation to forgoing point, if an attacker wants to compromise a particular decision, what mechanisms prevent him using a Sybil attack (i.e. introducing or removing nodes) to modify the consensus group memberships such that the network algorithm will choose a chain of compromised groups?

>> Covered by the message and action determinism.  

The *determinism* is my interest. If there is determinism, surely an attacker can experiment with message and action combinations until he finds that some super rare combination that causes a bad chain to be produced.

Surely this process can be repeated over and over if it costs him nothing to apply the network's deterministic algorithm, such that he can perform any number of successful attacks?

> 4/ Within a consensus group, how is consensus reached. Specifically, how to groups deal with Byzantine members, which may tell one good member they vote for X, and another good member they vote for Y i.e. is such malicious behavior detectable via signed messages or something, and how to nodes confirm to each other that this hasn't happened?

>> This is a mix of signed messages and accurate detection of xor distance (authority to act distance). AFAIK no dht has ever been able to achieve arriate distance detection at scale.  If a group goes bad (Cannot reach or reach wrong answer) it's detected by the surrounding groups in the chain. This is also how the send ac works in routing, ask for the next group +1 to confirm the next group acted properly.  

I think I sort of understand what you are doing, but would love to see how it works within the framework of formal Byzantine Fault Tolerance analysis. The danger is that although the system works in every obvious case, there are edge cases where good nodes think end up different things because of a Byzantine actor sending different versions of messages out.

The classical way of dealing with this is to have every node tell every other node what was sent to it. Different versions of messages, and thus the bad actors, can be identified and excluded from consensus but this requires a lot of message passing (n2).

I'm guess you guys are aiming for some kind of Monte Carlo system where there is some small probability of failure on individual iterations, but the system converges on consensus?

> 5/ What is the maximum proportion of bad nodes your algorithm can tolerate e.g. traditional BFT usually requires less than a third to be bad?
>
> We are finalising the math, but seems well over the usual 50% attack on distributed system. 

Theoretically speaking, traditional BFT science would say this was only possible if you were checking for different versions of messages and excluding any message made by the bad nodes. For example, Lamport's paper shows that agreement is possible with only n-2 good nodes in this case. However fiendish problems are still possible. Here, if the bad nodes can partition the two good nodes, or the a single messages they send to each other get lost by the network, then the bad nodes can clearly tell the two good nodes different stories resulting in it breaking.

> I really really need to document this and have it in a place to point at, many people ask the same question all over the Internet it seems :-)

It would be nice to see :-) because BFT is tricky area. As soon as you publish how it works then the community can get to work finding any issues.

dominicwi...@gmail.com

unread,
May 27, 2014, 7:26:49 PM5/27/14
to maidsafe-d...@googlegroups.com, dominicwi...@gmail.com
Typo - Lamport showed agreement was possible using signed messages with n-2 bad nodes!! ;-) much better..

David Irvine

unread,
May 27, 2014, 7:43:14 PM5/27/14
to maidsafe-d...@googlegroups.com

On Wed, May 28, 2014 at 12:24 AM, <dominicwi...@gmail.com> wrote:
I'm guess you guys are aiming for some kind of Monte Carlo system where there is some small probability of failure on individual iterations, but the system converges on consensus?

Yes in a fashion. The groups are groups of 4 themselves. Each node for different messages is connected to different nodes (in our tests so far from a selection of 15 nodes in a 'radius' of the closest nodes in the routing matrix (8*8). So this means certain messages and actions will be known to form within a known group of 4 (ignoring churn, which helps us).

I mean routing layer does the DHT part really. In routing the group size is actually 8. Each of these 8 states their close group to form an 8*8 matrix. If any node in the matrix notes churn then it informs the matrix nodes. This is possibly 64, but in testing it's closer to 15. This gets better (smaller number) as the population increases. 

An attacker could determine a message path and keep trying different messages and actions to get groups he owns to do the attack on that message/action. It's unlikely though to effect normal actors who will be sending message/actions at random. So potentially in this scenario the attacker could poison his own message/action.

The bad nodes would have to react to the other messages close by properly or the surrounding nodes would spot that attack as they synchronise messages and actions. This is possible but pretty difficult as the attack is narrowed significantly in the xor space around the bad nodes.
 
I feel the groups of consensus nodes around an action that then deterministically choose the next group is pretty tough to break (always looking though, so comments are welcome). The requester will know where to ask for the answer of the task, so say sending a coin to another user, the sender can request the coin form the network after X seconds to check the chain completed. This is a little bit further though and not possible with network wide computation, that requires additional checks. Don't want to go down that rabbit hole right now  :-)  

You are right though when we get a breath and get this documented it will help folk poke all around to look for weakness which is something to look forward to. There is a chap doing analysis on this for us at the moment, hopefully that will maybe produce a paper and save me from doing it. I can then get the testnet out :-) 

All good though and great comments, glad you made them. 
Reply all
Reply to author
Forward
0 new messages