Adding a guarantee that there is a single leader at any moment in time


Filipp Ozinov

Nov 17, 2016, 7:19:32 AM
to raft-dev
Hi. Is there any way to add a guarantee that there will be a single leader at any moment in time? For example, the leader could record the append-entries response time for each follower, and if it does not have a majority of followers whose last response time is later than (currentTime - raftMinTimeout), it falls back to the follower state. But I'm not sure whether this is safe. What do you think?
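
To make the proposed check concrete, here is a minimal sketch in Python. The names `has_recent_majority`, `leader_should_step_down`, `last_ack`, and `min_election_timeout` are hypothetical and not taken from any Raft implementation. It assumes the leader counts itself toward the majority and that all timestamps come from the leader's own clock; neither detail is specified in the question.

```python
from typing import Dict


def has_recent_majority(last_ack: Dict[str, float], now: float,
                        min_election_timeout: float, cluster_size: int) -> bool:
    """Return True if the leader plus recently responding followers form a majority.

    last_ack maps follower id -> time of its last append-entries response,
    measured on the leader's clock (an assumption of this sketch).
    """
    cutoff = now - min_election_timeout
    recent = sum(1 for t in last_ack.values() if t > cutoff)
    # Assumption: the leader counts itself as one member of the majority.
    return recent + 1 > cluster_size // 2


def leader_should_step_down(last_ack: Dict[str, float], now: float,
                            min_election_timeout: float, cluster_size: int) -> bool:
    """The proposed rule: fall back to follower when no recent majority exists."""
    return not has_recent_majority(last_ack, now, min_election_timeout, cluster_size)


if __name__ == "__main__":
    acks = {"b": 9.5, "c": 9.8, "d": 3.0, "e": 2.0}
    print(leader_should_step_down(acks, now=10.0,
                                  min_election_timeout=1.0, cluster_size=5))
```

In the usage example, two of four followers responded within the last second, so together with the leader they form a majority of five and the leader stays in place.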

Oren Eini (Ayende Rahien)

Nov 17, 2016, 8:07:45 AM
to raft...@googlegroups.com
There is no real way to do so, but in general, a leader should step down if it doesn't get confirmation from a majority within the election timeout.


Filipp Ozinov

Nov 17, 2016, 3:13:11 PM
to raft-dev
Why not? Could you please describe a case in which the suggested scheme would not work?


Oren Eini (Ayende Rahien)

Nov 17, 2016, 3:30:06 PM
to raft...@googlegroups.com
Sorry, I misread it, you are correct.
That is what I meant by stepping down.

However, note that you can't prevent two leaders from existing at the same time; you can only reduce the likelihood and duration of that situation.

If the election timeout is 10 seconds and a node decides to time out after not getting a heartbeat for 5 seconds, it can take over the cluster while the previous leader is still waiting for its timeout.


Henrik Ingo

Nov 18, 2016, 6:02:25 AM
to raft...@googlegroups.com
For practical purposes you can prevent this quite efficiently by
adding a delay between the old primary stepping down and the new
primary calling for election.

E.g. if a primary doesn't get heartbeats for 5 seconds, it steps down
as primary. However, any node will only call for election 10 seconds
after the last heartbeat.

The cost of this is that you will now have 10-5=5 second pauses when
there's no primary. Note also that this method isn't robust in a
theoretical sense, because in theory we would also have to protect
against surprising clock skews / jumps.
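
The timing argument above can be sketched as a small calculation. This is a simplified model, not part of any Raft implementation: the function names and the `max_clock_error` parameter are hypothetical, both timers are assumed to start at the same last heartbeat, and message delay is ignored.

```python
def safety_margin(step_down_after: float, election_after: float,
                  max_clock_error: float = 0.0) -> float:
    """Seconds between the old primary stepping down and the earliest election.

    A positive value is the window with no primary; zero or negative means
    an election may start while the old primary still believes it leads.
    max_clock_error is an assumed bound on how far the two nodes' timers
    can disagree (skew or jumps) over the interval.
    """
    return election_after - step_down_after - max_clock_error


def overlap_possible(step_down_after: float, election_after: float,
                     max_clock_error: float = 0.0) -> bool:
    """True if two primaries could coexist under this simplified model."""
    return safety_margin(step_down_after, election_after, max_clock_error) <= 0


if __name__ == "__main__":
    # Step down after 5 s, elections only after 10 s: a 5 s gap, no overlap.
    print(safety_margin(5, 10), overlap_possible(5, 10))
    # Reversed settings (leader waits 10 s, follower only 5 s): overlap.
    print(overlap_possible(10, 5))
    # A clock error larger than the gap removes the protection.
    print(overlap_possible(5, 10, max_clock_error=6))
```

The last case is the theoretical weakness mentioned above: the protection only holds while clock error stays smaller than the configured gap.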

There are also more robust methods to achieve what you want. The basic idea
is to require some kind of majority acknowledgement for each read as well
as each write. In this case, there could be 2 primaries for a brief
overlapping moment in time, but the application can only successfully
read from the new one. The cost of this is of course higher latency
for reads.
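
A minimal sketch of the majority-acknowledged read, under stated assumptions: `quorum_read` is a hypothetical name, each peer is modelled as a callable that returns True when the asking node's term is still current, and the asking node counts its own vote. A real system would do this over RPC with timeouts.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def quorum_read(value: T, my_term: int,
                peers: Iterable[Callable[[int], bool]],
                cluster_size: int) -> T:
    """Return value only if a majority (including this node) confirms my_term.

    Each peer callable stands in for a round trip asking "is my_term still
    the current term?". Unreachable peers simply do not count.
    """
    acks = 1  # assumption: the node serving the read counts itself
    for confirm in peers:
        try:
            if confirm(my_term):
                acks += 1
        except ConnectionError:
            continue  # treat an unreachable peer as no acknowledgement
    if acks > cluster_size // 2:
        return value
    raise RuntimeError("not confirmed by a majority; read refused")


if __name__ == "__main__":
    # Four peers that have already moved on to term 2.
    peers = [lambda term: term >= 2 for _ in range(4)]
    print(quorum_read("x", my_term=2, peers=peers, cluster_size=5))
    try:
        quorum_read("x", my_term=1, peers=peers, cluster_size=5)
    except RuntimeError as err:
        print(err)
```

The stale primary in term 1 gets no acknowledgements and refuses the read, while the primary in term 2 succeeds, at the cost of one round trip per read.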

henrik

Filipp Ozinov

Nov 18, 2016, 7:14:22 AM
to raft-dev, henri...@avoinelama.fi
Thanks all, it seems this really is an impossible task. For example, suppose some node loses its connection to the other nodes. It becomes a candidate and starts leader elections repeatedly while the connection is down. When the connection comes back, that node can become the new leader very quickly. And if all nodes wait for append-entries, then no node can become leader.
