How to avoid frequent leader elections in Raft


樊冰心

Oct 25, 2013, 5:53:55 AM
to raft...@googlegroups.com
For some reason, a follower keeps hitting the heartbeat timeout, which causes frequent leader elections. How can I solve this?

Diego Ongaro

Oct 25, 2013, 3:46:51 PM
to 樊冰心, raft...@googlegroups.com
Hi,
If your followers always time out and start a new election, you should
check to see that they are receiving heartbeats. Maybe add log
messages every time the leader sends heartbeats and every time the
followers receive them to check on this. Also, make sure that the
followers reset their election timers (to reasonable values) when they
receive heartbeats.
Hope this helps,
Diego
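The advice above (log every heartbeat and reset the election timer when one arrives) could look roughly like the following on the follower side. This is only a sketch: the `ElectionTimer` class, its method names, the injected `now` clock, and the 150-300 ms timeout range are assumptions made for illustration and are not taken from any particular Raft implementation.

```python
import logging
import random

log = logging.getLogger("raft.sketch")


class ElectionTimer:
    """Hypothetical follower-side election timer driven by an injected clock (seconds)."""

    def __init__(self, min_timeout=0.150, max_timeout=0.300, now=0.0, rng=None):
        self.min_timeout = min_timeout
        self.max_timeout = max_timeout
        self.rng = rng or random.Random()
        self.deadline = 0.0
        self.reset(now)

    def reset(self, now):
        # Pick a fresh randomized timeout on every reset.
        timeout = self.rng.uniform(self.min_timeout, self.max_timeout)
        self.deadline = now + timeout
        return timeout

    def on_heartbeat(self, now, leader_id, term):
        # Reset the timer and log the event, so that gaps between
        # heartbeats become visible in the follower's log.
        timeout = self.reset(now)
        log.info("heartbeat from %s (term %d) at t=%.3f, next timeout in %.3fs",
                 leader_id, term, now, timeout)

    def expired(self, now):
        # True once the follower should start an election.
        return now >= self.deadline


timer = ElectionTimer(now=0.0)
timer.on_heartbeat(now=0.05, leader_id="S1", term=3)
print(timer.expired(now=0.10))
```

If the leader logs the send time of each heartbeat in the same way, comparing the two logs should show whether heartbeats are being lost, delayed, or received without the timer being reset.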


聂安

Oct 28, 2013, 2:37:41 AM
to raft...@googlegroups.com, 樊冰心
What if this peer is on a bad network while the others are fine?
The bad server will repeatedly start leader elections and depose the leader,
making the whole cluster unavailable.
How can this situation be avoided? Would some kind of lease work?

On Saturday, October 26, 2013 at 3:46:51 AM UTC+8, Diego Ongaro wrote:

Diego Ongaro

Oct 28, 2013, 3:56:56 AM
to 聂安, raft...@googlegroups.com, 樊冰心
Yes, it's possible that a server that cannot receive heartbeats from
the leader but is able to request votes from other servers would make
the whole cluster unavailable. This sort of gets into Byzantine
territory. If you're worried about this problem, you'd want to
distinguish whether this server/its network is faulty or whether the
leader is faulty. One way to try to do that is for the server to send out
a few pings before starting a new election. If it can confirm
bidirectional communication with a majority of the cluster, for
example, then it's probably right to suspect the leader is faulty.
-Diego
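The ping check suggested above might be sketched as follows. The function name `should_start_election` and the `ping` callable are hypothetical, and counting the server itself toward the majority is an assumption of this sketch, not something stated in the thread.

```python
def should_start_election(cluster, self_id, ping):
    """Return True if this server confirmed round trips with a majority.

    cluster: list of all server ids, including self_id.
    ping:    hypothetical callable(peer_id) -> bool that returns True only
             if a request/response round trip with that peer succeeded.
    """
    majority = len(cluster) // 2 + 1
    reachable = 1  # assumption: the server counts itself
    for peer in cluster:
        if peer == self_id:
            continue
        if ping(peer):
            reachable += 1
    return reachable >= majority


# Example: server "E" in a five-server cluster can reach only "B" and "C".
cluster = ["A", "B", "C", "D", "E"]
can_reach = {"B", "C"}
print(should_start_election(cluster, "E", lambda peer: peer in can_reach))
```

A server that fails this check would keep waiting rather than increment its term, so it would not disturb a leader that the rest of the cluster can still reach.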

聂安

Oct 28, 2013, 5:09:15 AM
to raft...@googlegroups.com, 聂安, 樊冰心
This raises two more questions.
1. If this 'bad peer' confirms bidirectional communication with a majority
    of the cluster excluding the leader, what should I do next?
2. Does a longer election timeout help, or does it make things worse? How does a
    long election timeout harm our cluster?
Sorry for my poor English. Thank you!


On Monday, October 28, 2013 at 3:56:56 PM UTC+8, Diego Ongaro wrote:

Diego Ongaro

Oct 28, 2013, 5:13:55 AM
to 聂安, raft...@googlegroups.com, 樊冰心
My replies are below.

On Mon, Oct 28, 2013 at 2:09 AM, 聂安 <niean...@gmail.com> wrote:
> This raises two more questions.
> 1. If this 'bad peer' confirms bidirectional communication with a majority
> of the cluster excluding the leader, what should I do next?

I guess in this case it could start an election. (Though you'd still
run into problems if two servers could each talk to a majority of the
cluster but couldn't talk to each other.)

> 2. Does a longer election timeout help, or does it make things worse? How does a
> long election timeout harm our cluster?

A longer timeout would give those heartbeat messages more time to get
through on the faulty network link, but longer election timeouts
increase downtime in case the leader actually does fail.
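To make the trade-off described above concrete, here is some rough arithmetic. The function `timeout_tradeoff` and the 50 ms heartbeat interval are assumptions for illustration. The model is idealized: it ignores network latency and assumes the leader sends heartbeats at an exactly fixed interval.

```python
def timeout_tradeoff(election_timeout_ms, heartbeat_interval_ms):
    """Approximate both sides of the election-timeout trade-off.

    Returns a pair:
      - how many heartbeats the leader sends during one election timeout
        (roughly, all of them must be lost or late before the follower
        times out), and
      - the (shortest, longest) delay between a leader crash and the
        follower starting an election, given that its last heartbeat
        arrived at most one heartbeat interval before the crash.
    """
    heartbeats_per_timeout = election_timeout_ms // heartbeat_interval_ms
    detection_delay_ms = (election_timeout_ms - heartbeat_interval_ms,
                          election_timeout_ms)
    return heartbeats_per_timeout, detection_delay_ms


for timeout_ms in (300, 1000, 5000):
    chances, delay = timeout_tradeoff(timeout_ms, heartbeat_interval_ms=50)
    print(timeout_ms, chances, delay)
```

Under these assumptions, raising the timeout from 300 ms to 5000 ms gives a lossy link many more chances to deliver a heartbeat, but it also means the cluster stays leaderless for several seconds after a real leader failure.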

聂安

Oct 28, 2013, 6:20:53 AM
to raft...@googlegroups.com, 聂安, 樊冰心
So my concern now is that a single possibly bad peer can make the whole cluster unavailable.
Thank you very much!

On Monday, October 28, 2013 at 5:13:55 PM UTC+8, Diego Ongaro wrote:

张亮

Jan 9, 2015, 3:22:41 AM
to raft...@googlegroups.com, fanbin...@gmail.com

This seems to be an old post, but I have a similar concern.

The paper says that a removed server may send RequestVote RPCs with new term numbers and cause the current leader to step down.
To prevent this, "if a server receives a RequestVote RPC within the minimum election timeout of hearing from a current
leader, it does not update its term or grant its vote."

Yet suppose that, for some reason, a normal follower named F1 times out and starts an election with a new term number.
It will fail because its log is stale.
It will then start another election with a higher term number and fail again.
Because its term number is greater than the leader's, it will reject the leader's AppendEntries requests.
So follower F1 will repeat elections forever and can never become a normal part of the cluster again.

To avoid this, why not let F1 step down? That is:
when its RequestVote returns false and it finds that there is already a leader with a newer log, F1 sets its term to the leader's term and changes to follower state.

Am I misunderstanding something in Raft? I think I need some help here.
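For reference, the rule quoted from the paper in this post could be expressed as a check that a server runs before processing a RequestVote RPC. This is a minimal sketch: the function name `should_consider_vote_request` and its parameters are hypothetical, and times are plain floats in seconds.

```python
def should_consider_vote_request(now, last_leader_contact, min_election_timeout):
    """Return False if a RequestVote should be ignored entirely.

    Per the quoted rule, a server that heard from a current leader less than
    the minimum election timeout ago neither updates its term nor grants
    its vote.

    last_leader_contact: time the server last heard from a current leader,
                         or None if it has not heard from one.
    """
    if last_leader_contact is None:
        return True
    return now - last_leader_contact >= min_election_timeout


# A follower that heard from the leader 50 ms ago ignores the request.
print(should_consider_vote_request(now=1.00, last_leader_contact=0.95,
                                   min_election_timeout=0.150))
```

Only when this check returns True would the server go on to the usual term comparison and log up-to-date check for the request.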

On Saturday, October 26, 2013 at 3:46:51 AM UTC+8, Diego Ongaro wrote:

Yicheng Qin

Jan 10, 2015, 3:57:08 PM
to 张亮, raft...@googlegroups.com, fanbin...@gmail.com
On Fri, Jan 9, 2015 at 12:22 AM, 张亮 <sparkli...@gmail.com> wrote:

> This seems to be an old post, but I have a similar concern.
>
> The paper says that a removed server may send RequestVote RPCs with new term numbers and cause the current leader to step down.
> To prevent this, "if a server receives a RequestVote RPC within the minimum election timeout of hearing from a current
> leader, it does not update its term or grant its vote."
>
> Yet suppose that, for some reason, a normal follower named F1 times out and starts an election with a new term number.
> It will fail because its log is stale.
> It will then start another election with a higher term number and fail again.
> Because its term number is greater than the leader's, it will reject the leader's AppendEntries requests.
> So follower F1 will repeat elections forever and can never become a normal part of the cluster again.
>
> To avoid this, why not let F1 step down? That is:
> when its RequestVote returns false and it finds that there is already a leader with a newer log, F1 sets its term to the leader's term and changes to follower state.

If a Raft server receives a message with a higher term, it actually does what you describe:
```
Current terms are exchanged whenever servers communicate; if one server’s current term is smaller than the other’s, then it updates its current term to the larger value. If a candidate or leader discovers that its term is out of date, it immediately reverts to follower state.
``` 
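The term rule quoted above can be sketched as a single method that a server applies to the term carried by every RPC request or response. The `Server` class and the `observe_term` name are hypothetical. Clearing `voted_for` when the term advances reflects the usual reading that a vote belongs to one term.

```python
FOLLOWER, CANDIDATE, LEADER = "follower", "candidate", "leader"


class Server:
    """Hypothetical holder of the per-server state the term rule touches."""

    def __init__(self, current_term=0, role=FOLLOWER):
        self.current_term = current_term
        self.role = role
        self.voted_for = None

    def observe_term(self, remote_term):
        """Apply the term rule; return True if the local term advanced."""
        if remote_term > self.current_term:
            self.current_term = remote_term
            self.voted_for = None  # no vote has been cast in the new term
            self.role = FOLLOWER   # a candidate or leader steps down
            return True
        return False


leader = Server(current_term=3, role=LEADER)
leader.observe_term(5)  # e.g. the term carried in a reply from F1
print(leader.role, leader.current_term)
```

In the scenario from the question, F1's rejection of AppendEntries carries F1's higher term, so the leader would apply this rule and step down. The cluster would then hold a new election at the higher term, and F1 would not stay isolated forever.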

> Am I misunderstanding something in Raft? I think I need some help here.

