How can we sync logs to a follower that has fallen far behind?


Quốc Khánh Bùi

Aug 23, 2023, 10:07:57 AM
to raft-dev
Here is a scenario: we have three nodes, named 1, 2, and 3.

node 1: the current leader; it has received and committed a lot of data.
node 2: a follower whose log is entirely in sync with the leader.
node 3: completely isolated for quite a long time; it cannot reach the others and vice versa. It has missed a lot of data and will take a long time to catch up.

Then suddenly, node 1 dies and node 3 comes back up.
Now the election is between node 2 and node 3. Node 2 definitely becomes the next leader, because its log is more up-to-date than node 3's.

After becoming the next leader, node 2's job is to transfer logs to node 3 using AppendEntries requests. But the syncing time is very long, much longer than the election timeout, which means node 2 will need to step down as leader before it can transfer all the logs to node 3. As the paper puts it:

Thus, a leader in Raft steps down if an election timeout elapses without a successful round of heartbeats to a majority of its cluster; this allows clients to retry their requests with another server.
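The step-down rule quoted above can be sketched as a check-quorum test: the leader tracks when it last heard an ack from each node and steps down if a majority has gone quiet for an election timeout. This is a minimal illustrative sketch, not code from any particular implementation; the names (Leader, record_ack, should_step_down) are my own assumptions.

```python
# Illustrative check-quorum sketch of the step-down rule discussed above.
# All names here are hypothetical, not from a real Raft library.
import time

class Leader:
    def __init__(self, node_ids, election_timeout):
        self.node_ids = node_ids              # includes the leader itself
        self.election_timeout = election_timeout
        # Treat startup as a fresh ack from everyone.
        self.last_ack = {n: time.monotonic() for n in node_ids}

    def record_ack(self, node_id):
        # Any successful AppendEntries response counts as a heartbeat ack.
        self.last_ack[node_id] = time.monotonic()

    def should_step_down(self, now=None):
        # Step down if fewer than a majority of nodes have acked within
        # one election timeout.
        now = time.monotonic() if now is None else now
        fresh = sum(1 for t in self.last_ack.values()
                    if now - t < self.election_timeout)
        return fresh <= len(self.node_ids) // 2
```

The key point for the scenario in this thread is record_ack: if catch-up AppendEntries responses feed into it, a leader that is actively syncing a laggard never trips the step-down check.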

So the behavior is: node 2 becomes the leader but can't finish syncing logs to node 3, so node 2 steps down. Then they hold another election, node 2 becomes leader again, and again fails to finish syncing logs to node 3. This behavior repeats indefinitely.

My question is: how can I sync logs from the leader to the follower while maintaining the availability of the cluster?
Please give me some advice. Thank you all!

Oren Eini (Ayende Rahien)

Aug 27, 2023, 4:33:35 AM
You _are_ getting heartbeats, however.
Node 3 is getting its log synced using AppendEntries, no? That is what those are for.

Note that in this scenario, the cluster cannot actually *do* anything, since it cannot commit until the log is fully synced.

But it will make progress in the sync.

A separate scenario is if you need to do a snapshot install, because it has been a _long_ time since the last sync and the log was trimmed.

In our implementation, we consider the node to be sending heartbeats while it is accepting & installing the snapshot.
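That last point can be sketched as follows: while streaming snapshot chunks, every chunk ack refreshes the same liveness record that heartbeat acks would, so an installing follower still counts toward the leader's quorum. This is a hypothetical sketch of the idea, assuming a leader object with a last_ack map; install_snapshot and send_chunk are invented names, not any library's API.

```python
# Sketch: snapshot-install traffic doubles as heartbeats.
# All names (install_snapshot, send_chunk, last_ack) are illustrative.
import time

def install_snapshot(leader, follower_id, snapshot_chunks, send_chunk):
    """Stream snapshot chunks to a follower.

    send_chunk(follower_id, offset, chunk) -> True when the follower acks.
    Every ack refreshes the follower's liveness timestamp, so the leader's
    check-quorum logic keeps counting it as alive during the install.
    """
    for offset, chunk in enumerate(snapshot_chunks):
        if not send_chunk(follower_id, offset, chunk):
            return False                       # transfer interrupted
        leader.last_ack[follower_id] = time.monotonic()  # liveness refreshed
    return True
```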


Quốc Khánh Bùi

Aug 27, 2023, 5:20:22 AM
to raft-dev
Hi Oren Eini, thank you for your answer.
I don't quite understand it yet, so I just want to confirm: do you mean I should implement log compaction?

Oren Eini (Ayende Rahien)

Aug 27, 2023, 5:26:16 AM
Log compaction is a separate step.
Let's assume that you don't _have_ log compaction, and you have a *large* gap between nodes 2 and 3.

Node 2 is going to send AppendEntries requests (with 50 entries each) to node 3.
Node 3 will accept and ack each one.

Each ack is a vote of confidence in node 2, and won't make it lose its leadership.
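The catch-up described above can be sketched as a loop that ships the backlog in fixed-size AppendEntries batches, advancing the follower's next index on every ack. A batch size of 50 matches the example in this reply; the function and parameter names (catch_up, send_append_entries, next_index) are assumptions for illustration, and the conflict back-off is the standard decrement-and-retry from the Raft paper, shown in its simplest form.

```python
# Illustrative catch-up loop for a lagging follower; names are hypothetical.
BATCH_SIZE = 50  # entries per AppendEntries request, as in the example above

def catch_up(leader_log, next_index, send_append_entries):
    """Replicate leader_log[next_index:] to a follower in batches.

    send_append_entries(prev_index, entries) -> True when the follower
    acks the batch. Returns the follower's final next_index. Each ack
    would also refresh the leader's heartbeat bookkeeping, so a long
    catch-up does not cost the leader its leadership.
    """
    while next_index < len(leader_log):
        batch = leader_log[next_index : next_index + BATCH_SIZE]
        if send_append_entries(next_index - 1, batch):
            next_index += len(batch)             # ack: batch appended
        else:
            next_index = max(0, next_index - 1)  # conflict: back off, retry
    return next_index
```

With a backlog of N entries the follower converges in about N / 50 round trips, and every one of those round trips is evidence of node 2's leadership rather than a threat to it.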
