How to do when majority of nodes fail but the network is ok, maybe because of something else

shanc...@gmail.com

unread,

Jan 2, 2018, 7:46:47 AM1/2/18

to raft-dev

hello，I am learning the raft recently，

I have read the "Log replication - what happens when the leader cannot replicate to majority of nodes" and other topics, but don't find the answer.

I have the same problem with "Log replication - what happens when the leader cannot replicate to majority of nodes" but a little different.

if the network is ok, but when the leader broadcast the commit, majority of nodes fail maybe because of

full disk or full memory or something else, how should I do?

if i response the client failure, but the leader and some node success，and when the client read it, it have the value,

if the server doesn't response the client，the client will timeout,

it's ok, becase timeout mean success or failure, but I think it's not good.

Cheng Fu

unread,

Jan 15, 2018, 2:01:06 AM1/15/18

to raft-dev

Hi,

Good question.

In this particular case, the system hangs, it does not proceed to next step.

Yes, this is not good, but more tolerable than promising client something and later on figure out it should have been revoked. (Give client a response while the system is not consences, a.k.a. correctness)

The raft algorithm explicitly assumes if you expect the system to be live, you should make sure you have majority of nodes up,

In practice, if you think yourself being a admin of a cluster of N machines. All machine failure occurs one by another in time axis, and you get noticed when each machine fails, by the time the failure node reach N/2 machines, you probably already recovered earliest failed node, this keep the total running, functional node count above N/2.

The risk of all node failed at almost the same time is extremely low, you should be OK with that.

In another word, if you care about this risk, you should invest more on hardware, reduce the failure rate of single node, and make it bearable,.

To give you a brief intro to some knowledge background you may want to study:

Under the theory of FLP,

You cannot build a distributed system in real life with all these 3 properties,

termination, agreement, and validity

Paxos/Raft system designers decided to sacrifice termination and retain other 2 properties.

If you are interested learn more on this topic, I encourage you to read these 2 papers.

FLP

http://the-paper-trail.org/blog/a-brief-tour-of-flp-impossibility/

Paxos paper

http://the-paper-trail.org/blog/consensus-protocols-paxos/

I quoted form it:

"Rather than sacrifice correctness, which some variants (including the one I described) of 3PC do, Paxos sacrifices liveness, i.e. guaranteed termination"

Please let me know if this makes sense to you :P

Thanks.

Cheng

yang chen

unread,

Jan 15, 2018, 2:14:40 AM1/15/18

to raft...@googlegroups.com

Thank you for your replying, I am reading the document you mentioned, I will reply further when I finish.

--
You received this message because you are subscribed to a topic in the Google Groups "raft-dev" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/raft-dev/6Zq9HzwChVs/unsubscribe.
To unsubscribe from this group and all its topics, send an email to raft-dev+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

yang chen

unread,

Jan 15, 2018, 3:04:57 AM1/15/18

to raft...@googlegroups.com

I have another question if the Follower's last log different with the leader's prevLogTerm and prevIndex, the follower will response false and the leader will decrease the nextIndex, but if the follower fails with other reason, it will response false too, how can leader distinguish the situation. the raft doesn't describe.

Cheng Fu

unread,

Jan 15, 2018, 3:42:18 AM1/15/18

to raft...@googlegroups.com

On Jan 15, 2018, at 12:04 AM, yang chen <shanc...@gmail.com> wrote:

I have another question if the Follower's last log different with the leader's prevLogTerm and prevIndex, the follower will response false and the leader will decrease the nextIndex, but if the follower fails with other reason,

Can you pinpoint what other reasons do you mean?

PS: If you watch closely, follower’s current term is returned along with success.

I guess leader distinguished by looking at followers term and its own term .

Cheng

it will response false too, how can leader distinguish the situation. the raft doesn't describe.

<image.png>

To unsubscribe from this group and all its topics, send an email to raft-dev+u...@googlegroups.com.

shanc...@gmail.com

unread,

Jan 15, 2018, 3:52:00 AM1/15/18

to raft-dev

for example, full disk or full memory, but not the majority of nodes all fail with the reason, just one node, the leader how to distinguish the different situation.

在 2018年1月15日星期一 UTC+8下午4:42:18，Cheng Fu写道：

To unsubscribe from this group and all its topics, send an email to raft-dev+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Cheng Fu

unread,

Jan 15, 2018, 3:56:47 AM1/15/18

to raft...@googlegroups.com

If a node fails, it does NOT respond to leader, on leader side, the TCP request should times out, saying something like fails to connect.

This is different than a false is sent from follower to leader. In which case, you get a false response.

Does this make sense to you?

Cheng

yang chen

unread,

Jan 15, 2018, 4:08:02 AM1/15/18

to raft...@googlegroups.com

but in some implementations of the raft, it will response false, for example, go-raft(https://github.com/goraft/raft). I also think should response the leader because the follower fails, it really fails.

To unsubscribe from this group and all its topics, send an email to raft-dev+unsubscribe@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to a topic in the Google Groups "raft-dev" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/raft-dev/6Zq9HzwChVs/unsubscribe.

To unsubscribe from this group and all its topics, send an email to raft-dev+unsubscribe@googlegroups.com.

Cheng Fu

unread,

Jan 15, 2018, 4:35:54 AM1/15/18

to raft...@googlegroups.com

I think you may want to spoke to the person whole implemented the goraft to see why they did it.

Please let me know if you find out they are right, we are wrong.

:P

Cheng

yang chen

unread,

Jan 15, 2018, 4:38:28 AM1/15/18

to raft...@googlegroups.com

OK. Thank you.

Reply all

Reply to author

Forward