How can we sync logs to a follower that has been left too far behind?


Quốc Khánh Bùi

Aug 23, 2023, 10:07:57 AM
to raft-dev
Here is a scenario. We have three nodes named 1, 2, and 3.

Node 1: the current leader; it has received and committed a lot of data.
Node 2: a follower whose data is entirely in sync with the leader.
Node 3: completely isolated; for quite a long time it has been unable to reach the others, and they have been unable to reach it. It has missed a lot of data and will take a long time to sync.

Then, suddenly, node 1 dies and node 3 comes back up.
Now the election is between node 2 and node 3. Node 2 definitely becomes the next leader because its log is more up-to-date than node 3's.
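
For reference, the "more up-to-date" comparison can be sketched as below. This is a minimal illustration of the rule from the Raft paper (compare the term of the last entry first, then the last index); the `LogPosition` type and the example numbers are hypothetical and not taken from any particular implementation.

```python
from dataclasses import dataclass


@dataclass
class LogPosition:
    last_term: int   # term of the last entry in the log
    last_index: int  # index of the last entry in the log


def candidate_is_up_to_date(candidate: LogPosition, voter: LogPosition) -> bool:
    """Return True if the voter may grant its vote to the candidate."""
    if candidate.last_term != voter.last_term:
        # The log whose last entry has the later term wins.
        return candidate.last_term > voter.last_term
    # Same last term: the longer (or equal) log wins.
    return candidate.last_index >= voter.last_index


# Hypothetical positions: node 2 is in sync, node 3 was isolated for a long time.
node2 = LogPosition(last_term=5, last_index=10_000)
node3 = LogPosition(last_term=2, last_index=40)

node3_votes_for_node2 = candidate_is_up_to_date(node2, node3)
node2_votes_for_node3 = candidate_is_up_to_date(node3, node2)
```

Under these assumed numbers node 3 can vote for node 2, but node 2 will never vote for node 3, so only node 2 can win.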

After becoming the leader, node 2's job is to transfer logs to node 3 using AppendEntries requests, but the sync takes a very long time, much longer than the election timeout. That means node 2 will need to step down as leader before it can transfer all the logs to node 3. From the paper:

Thus, a leader in Raft steps down if an election timeout elapses without a successful round of heartbeats to a majority of its cluster. This allows clients to retry their requests with another server.

So the behavior is that node 2 becomes the leader but can't sync the logs to node 3, and then node 2 steps down. Then they hold the election again, node 2 becomes leader again, and it again fails to sync the logs to node 3. This behavior repeats indefinitely.

My question is: how can I sync logs from the leader to the follower while maintaining the availability of the cluster?
Please give me some advice. Thank you all!

Oren Eini (Ayende Rahien)

Aug 27, 2023, 4:33:35 AM
to raft...@googlegroups.com
You _are_ getting heartbeats, however.
Node 3 is being synced using AppendEntries, no? That is what they are about.

Note that in this scenario, the cluster cannot actually *do* anything, since it cannot commit until the log is fully synced.

But it will make progress in the sync.
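
To illustrate why nothing new can commit while the sync is still running, here is a rough sketch of how a leader might derive the highest index stored on a majority. The function name and the numbers are hypothetical, and the sketch leaves out Raft's extra rule that a leader only commits entries from its own term by counting replicas.

```python
def quorum_match_index(match_indexes: list[int]) -> int:
    """Highest log index stored on a majority of the cluster.

    match_indexes holds one value per cluster member, leader included.
    """
    quorum = len(match_indexes) // 2 + 1
    # After sorting in descending order, the value at position quorum - 1
    # is present on at least `quorum` members.
    return sorted(match_indexes, reverse=True)[quorum - 1]


# Order: node 2 (leader), node 3 (lagging), node 1 (dead, no acks).
during_sync = quorum_match_index([10_000, 40, 0])
after_sync = quorum_match_index([10_001, 10_001, 0])
```

With node 1 dead, the majority is node 2 plus node 3, so the majority index tracks node 3's position and only reaches the leader's new entries once node 3 has caught up.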

A separate scenario is if you need to do a snapshot install because it has been a _long_ time since the sync, and the log was trimmed. 

In our impl, we consider the node to be sending heartbeats while it is accepting & installing the snapshot.
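
One possible reading of that approach, as a leader-side sketch: a follower that is busy installing a snapshot is counted as live when the leader checks for a heartbeat majority. The names `FollowerState`, `counts_as_live`, and the timeout value are assumptions made for illustration; this is not code from the implementation mentioned above.

```python
from dataclasses import dataclass

ELECTION_TIMEOUT = 1.0  # seconds; hypothetical value


@dataclass
class FollowerState:
    last_ack_at: float               # time of the last AppendEntries ack
    installing_snapshot: bool = False


def counts_as_live(follower: FollowerState, now: float) -> bool:
    """Leader-side check used when counting a heartbeat majority."""
    if follower.installing_snapshot:
        # Assumption: a follower accepting a snapshot is treated as if it
        # were heartbeating, even if its last ack is old.
        return True
    return now - follower.last_ack_at <= ELECTION_TIMEOUT


busy = FollowerState(last_ack_at=0.0, installing_snapshot=True)
silent = FollowerState(last_ack_at=0.0)
recent = FollowerState(last_ack_at=29.5)
```

A real implementation would probably also want to notice a stalled transfer, for example by requiring periodic progress on snapshot chunks, which this sketch does not attempt.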


Quốc Khánh Bùi

Aug 27, 2023, 5:20:22 AM
to raft-dev
Hi Oren Eini, Thank you for your answer.
I don't quite understand your answer yet, so I just want to confirm. I should implement log compaction, is that what you mean?

Oren Eini (Ayende Rahien)

Aug 27, 2023, 5:26:16 AM
to raft...@googlegroups.com
Log compaction is a separate step.
Let's assume that you don't _have_ log compaction, and you have a *large* gap between node 2 & 3.

Node 2 is going to send AppendEntries (with 50 entries each) to node 3.
Node 3 will accept and ack each one.

Each ack is a vote of confidence in node 2, so node 2 won't lose its leadership.
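
A toy simulation of that point is sketched below. It assumes a hypothetical 1-second election timeout and 50 ms per request/ack round trip, and it treats every ack from node 3 as contact with a majority (leader plus node 3 in a three-node cluster). The `Leader` class and `catch_up` function are made up for this sketch.

```python
from dataclasses import dataclass

ELECTION_TIMEOUT = 1.0  # seconds; hypothetical
BATCH_SIZE = 50         # entries per AppendEntries, as in the post above
ROUND_TRIP = 0.05       # seconds per request/ack; hypothetical


@dataclass
class Leader:
    last_index: int                  # last entry in the leader's log
    peer_match: int                  # highest index known to be on node 3
    last_quorum_contact: float = 0.0
    stepped_down: bool = False

    def tick(self, now: float) -> None:
        # Step down only if no majority contact within the election timeout.
        if now - self.last_quorum_contact > ELECTION_TIMEOUT:
            self.stepped_down = True

    def on_append_ack(self, now: float, match_index: int) -> None:
        # A successful AppendEntries ack doubles as a heartbeat response.
        self.peer_match = match_index
        self.last_quorum_contact = now


def catch_up(leader: Leader) -> tuple[int, float]:
    """Replicate in batches until node 3 is in sync or the leader steps down."""
    now, rounds = 0.0, 0
    while leader.peer_match < leader.last_index:
        now += ROUND_TRIP
        leader.tick(now)
        if leader.stepped_down:
            break
        next_match = min(leader.peer_match + BATCH_SIZE, leader.last_index)
        leader.on_append_ack(now, next_match)
        rounds += 1
    return rounds, now


leader = Leader(last_index=10_000, peer_match=0)
rounds, elapsed = catch_up(leader)
```

Under these assumptions the full sync takes 200 round trips and far longer than one election timeout, yet the leader never steps down, because the gap between consecutive acks stays well below the timeout.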
