Thanks very interesting. So as I understand:
> 2/ If some network algorithm chooses the groups that must agree on a decision deterministically, what stops an attacker doing a search of decisions until he finds one that would be decided by a chain of compromised groups?
>> The chain should be too long for this to be feasible.
Ok as I understand chains are required to be long enough they cannot contain only compromised groups.
But, if we consider that DHTs have infinite size and therefore can contain very large numbers of compromised groups as a small proportion, and that some limit on chain size is required for performance reasons, presumably there has to be some possibility that a chain might be created completely from compromised groups, even if this is vanishing unlikely by random selection?
> 3/ In relation to forgoing point, if an attacker wants to compromise a particular decision, what mechanisms prevent him using a Sybil attack (i.e. introducing or removing nodes) to modify the consensus group memberships such that the network algorithm will choose a chain of compromised groups?
>> Covered by the message and action determinism.
The *determinism* is my interest. If there is determinism, surely an attacker can experiment with message and action combinations until he finds that some super rare combination that causes a bad chain to be produced.
Surely this process can be repeated over and over if it costs him nothing to apply the network's deterministic algorithm, such that he can perform any number of successful attacks?
> 4/ Within a consensus group, how is consensus reached. Specifically, how to groups deal with Byzantine members, which may tell one good member they vote for X, and another good member they vote for Y i.e. is such malicious behavior detectable via signed messages or something, and how to nodes confirm to each other that this hasn't happened?
>> This is a mix of signed messages and accurate detection of xor distance (authority to act distance). AFAIK no dht has ever been able to achieve arriate distance detection at scale. If a group goes bad (Cannot reach or reach wrong answer) it's detected by the surrounding groups in the chain. This is also how the send ac works in routing, ask for the next group +1 to confirm the next group acted properly.
I think I sort of understand what you are doing, but would love to see how it works within the framework of formal Byzantine Fault Tolerance analysis. The danger is that although the system works in every obvious case, there are edge cases where good nodes think end up different things because of a Byzantine actor sending different versions of messages out.
The classical way of dealing with this is to have every node tell every other node what was sent to it. Different versions of messages, and thus the bad actors, can be identified and excluded from consensus but this requires a lot of message passing (n2).
I'm guess you guys are aiming for some kind of Monte Carlo system where there is some small probability of failure on individual iterations, but the system converges on consensus?
> 5/ What is the maximum proportion of bad nodes your algorithm can tolerate e.g. traditional BFT usually requires less than a third to be bad?
>
> We are finalising the math, but seems well over the usual 50% attack on distributed system.
Theoretically speaking, traditional BFT science would say this was only possible if you were checking for different versions of messages and excluding any message made by the bad nodes. For example, Lamport's paper shows that agreement is possible with only n-2 good nodes in this case. However fiendish problems are still possible. Here, if the bad nodes can partition the two good nodes, or the a single messages they send to each other get lost by the network, then the bad nodes can clearly tell the two good nodes different stories resulting in it breaking.
> I really really need to document this and have it in a place to point at, many people ask the same question all over the Internet it seems :-)
It would be nice to see :-) because BFT is tricky area. As soon as you publish how it works then the community can get to work finding any issues.