Question about several implementations of leader election?


Terry Tan

Sep 21, 2017, 4:06:30 AM
to raft-dev
Hi raft-dev,

For leader election, I have found two solutions so far. The first is pre-vote, as implemented by Copycat. If a server loses its connection to the leader while the leader can still reach a majority, the follower runs a pre-vote (the poll phase in Copycat). It will mostly be rejected, because its term or log is behind the leader's, and its term does not increase in this phase.

In the second solution, when the follower issues a vote request, it increments its term first and then sends the RPC. The server that receives the request checks whether its lease has expired (the leader's heartbeats renew the lease); if the lease has not expired, it simply rejects the request.

My question: even though the request is rejected, the follower will still increment its term again and retry endlessly, so its term will keep increasing monotonically. Is my understanding correct, or is there some mechanism that prevents the term from increasing?
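
To make the difference concrete, here is a minimal Python sketch of the sender side of both approaches while the rest of the cluster can still reach the leader. The function name `term_after_timeouts` and its parameters are hypothetical and not taken from Copycat or any other implementation; the sketch assumes every request is refused for as long as the follower is cut off.

```python
def term_after_timeouts(start_term: int, timeouts: int, use_pre_vote: bool,
                        peers_would_grant: bool = False) -> int:
    """Return the isolated follower's term after `timeouts` election timeouts.

    Hypothetical model: `peers_would_grant` is False while the peers still
    hear from the leader, so every vote or pre-vote request is refused.
    """
    term = start_term
    for _ in range(timeouts):
        if use_pre_vote:
            # Pre-vote: ask with term + 1, but adopt it only if a majority agrees.
            if peers_would_grant:
                term += 1
        else:
            # Vote-first: the term is incremented before the RPC is sent,
            # so a rejection cannot undo the increment.
            term += 1
    return term


print(term_after_timeouts(5, 10, use_pre_vote=False))  # 15
print(term_after_timeouts(5, 10, use_pre_vote=True))   # 5
```

Under these assumptions the vote-first follower gains one term per timeout, while the pre-vote follower stays at its original term until its peers would actually vote for it.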

Oren Eini (Ayende Rahien)

Sep 21, 2017, 4:35:21 AM
to raft...@googlegroups.com
The prevote is the generally accepted one.
In general, if you see a term number higher than yours, you should immediately assume that you are out of date and accept the other node's term & vote for them.
The prevote avoids that because the request clearly states "I'm not sure that I need to update my vote".
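
A rough Python sketch of how a peer might answer a pre-vote. The names (`Peer`, `handle_pre_vote`, `heard_from_leader_recently`) are made up for illustration, and the leader-contact check and the term comparison are assumptions about how this is commonly done, not something stated in this thread. What it is meant to show is that answering never changes the peer's own term or vote.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    """Receiver-side state (hypothetical field names)."""
    current_term: int
    last_log_term: int
    last_log_index: int
    heard_from_leader_recently: bool


def handle_pre_vote(peer: Peer, proposed_term: int,
                    cand_last_log_term: int, cand_last_log_index: int) -> bool:
    """Answer "would you vote for me?" without mutating `peer`."""
    if peer.heard_from_leader_recently:
        # Assumption: a peer that still has a live leader refuses outright.
        return False
    if proposed_term < peer.current_term:
        # The sender is asking about a term the peer has already passed.
        return False
    # Usual up-to-date rule: compare last log term first, then last log index.
    candidate_log = (cand_last_log_term, cand_last_log_index)
    own_log = (peer.last_log_term, peer.last_log_index)
    return candidate_log >= own_log
```

Because the handler only reads state, a partitioned server can send pre-votes as often as it likes without forcing anyone, including itself, onto a higher term.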



Terry Tan

Sep 21, 2017, 5:22:00 AM
to raft-dev
Hi Ayende,

Very happy to see you again. I understand solution one, but in solution two, will the term number keep increasing monotonically? How do we avoid that in solution two?


Oren Eini (Ayende Rahien)

Sep 21, 2017, 5:43:32 AM
to raft...@googlegroups.com
You don't, as far as I understand, which is why people usually go for solution 1.

Terry Tan

Sep 21, 2017, 7:00:13 AM
to raft-dev

What about the second one? The Raft paper's section on cluster membership changes says:

"The third issue is that removed servers (those not in Cnew) can disrupt the cluster. These servers will not receive heartbeats, so they will time out and start new elections. They will then send RequestVote RPCs with new term numbers, and this will cause the current leader to revert to follower state. A new leader will eventually be elected, but the removed servers will time out again and the process will repeat, resulting in poor availability. To prevent this problem, servers disregard RequestVote RPCs when they believe a current leader exists. Specifically, if a server receives a RequestVote RPC within the minimum election timeout of hearing from a current leader, it does not update its term or grant its vote. This does not affect normal elections, where each server waits at least a minimum election timeout before starting an election. However, it helps avoid disruptions from removed servers: if a leader is able to get heartbeats to its cluster, then it will not be deposed by larger term numbers."

Although this is not the normal election case, the same problem still exists. How do we solve it?
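
The rule quoted from the paper can be sketched on the receiving side roughly as follows, in Python. `Server`, `handle_request_vote`, and the 150 ms value of `MIN_ELECTION_TIMEOUT` are assumptions made for illustration, and the log up-to-date check of a real RequestVote handler is left out to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Optional

MIN_ELECTION_TIMEOUT = 0.150  # seconds; assumed value for illustration


@dataclass
class Server:
    """Receiver-side state (hypothetical field names)."""
    current_term: int
    voted_for: Optional[str] = None
    last_leader_contact: float = float("-inf")  # time of last heartbeat


def handle_request_vote(server: Server, now: float,
                        candidate_id: str, candidate_term: int) -> bool:
    """Return True if the vote is granted (log comparison omitted)."""
    if now - server.last_leader_contact < MIN_ELECTION_TIMEOUT:
        # A current leader is believed to exist: disregard the request,
        # leaving both the term and the vote untouched.
        return False
    if candidate_term < server.current_term:
        return False
    if candidate_term > server.current_term:
        # Normal Raft behaviour once no leader is believed to exist.
        server.current_term = candidate_term
        server.voted_for = None
    if server.voted_for in (None, candidate_id):
        server.voted_for = candidate_id
        return True
    return False
```

Note that this check only protects the receivers. Nothing in it stops the sender from incrementing its own term on every timeout, which is the growth asked about above.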



Oren Eini (Ayende Rahien)

Sep 22, 2017, 3:39:18 AM
to raft...@googlegroups.com
I don't like this because it puts us in territory that requires two election terms if the leader fails, and it can lead to large jumps in terms.
Imagine that the leader goes down, and one of the followers times out first and tries to get itself elected. It can't until the election timeout has passed on all the other nodes, and by that time you likely have more candidates, which means there is competition for the leadership.
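
A toy Python model of that scenario, under loudly stated assumptions: the leader's last heartbeat reached everyone at time 0, each peer refuses votes until its own election timeout fires (at which point it becomes a candidate itself), and the first follower retries at a fixed interval. `first_candidate_attempts` and all the numbers are hypothetical; real timeouts are randomized and real candidates interact in ways this ignores.

```python
from typing import List, Tuple


def first_candidate_attempts(own_timeout_ms: int, peer_timeouts_ms: List[int],
                             retry_ms: int) -> Tuple[int, int]:
    """Return (attempts, rivals) for the first follower to time out.

    attempts: elections it starts, one term increment each, up to and
              including the first one a majority could possibly grant.
    rivals:   peers whose own timeout has fired by then.
    """
    if retry_ms <= 0 or not peer_timeouts_ms:
        raise ValueError("need a positive retry interval and at least one peer")
    # Cluster = the listed peers + this follower + the failed leader.
    cluster_size = len(peer_timeouts_ms) + 2
    votes_needed = cluster_size // 2 + 1
    attempts = 0
    now = own_timeout_ms
    while True:
        attempts += 1
        # Peers whose timer has fired no longer believe a leader exists.
        expired = [t for t in peer_timeouts_ms if t <= now]
        if 1 + len(expired) >= votes_needed:  # own vote + willing peers
            return attempts, len(expired)
        now += retry_ms


# Five-node cluster: failed leader, this follower (150 ms), three peers.
print(first_candidate_attempts(150, [180, 220, 290], retry_ms=150))  # (2, 3)
```

In this model the first follower burns one term on an election that cannot succeed, and by its second attempt all three peers have timed out as well, which matches the two-term cost and the extra competition described above.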

Terry Tan

Sep 25, 2017, 2:20:12 AM
to raft-dev
Hi Ayende,

If one follower times out, it starts the election process, and the other followers reject its vote request until they themselves have gone a period of time without receiving a heartbeat from the leader. Once another server times out, it starts an election just like the first follower, but the timeouts are random, so there may or may not be competition. But as you said, the follower will see large jumps in its term, and that is the part I find inelegant. I don't know why they chose this kind of solution, or maybe my understanding is not correct?

