Here is a scenario: we have three nodes named 1, 2, and 3.
Node 1: the current leader. It has received and committed a lot of data.
Node 2: a follower whose data is entirely in sync with the leader.
Node 3: completely isolated for quite a long time. It can't reach the other nodes and they can't reach it, so it has missed a lot of data and will take a long time to sync.
Then, suddenly, node 1 dies and node 3 comes back online.
Now the election is between node 2 and node 3. Node 2 definitely becomes the next leader because its log is more up-to-date than node 3's.
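To make that vote check concrete, here is a minimal Python sketch of the comparison as I understand it from the paper: compare the term of the last log entry first, then the log length. The names (`LogEntry`, `candidate_is_up_to_date`) and the log sizes are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    term: int
    command: str


def candidate_is_up_to_date(candidate_last_term, candidate_last_index, voter_log):
    """Return True if the voter's log does not rule out voting for the candidate."""
    voter_last_index = len(voter_log)  # 1-based index; 0 means an empty log
    voter_last_term = voter_log[-1].term if voter_log else 0
    if candidate_last_term != voter_last_term:
        # Different last terms: the later term is more up-to-date.
        return candidate_last_term > voter_last_term
    # Same last term: the longer log is more up-to-date.
    return candidate_last_index >= voter_last_index


# Hypothetical logs for the scenario: node 2 has everything, node 3 is far behind.
node2_log = [LogEntry(term=1, command=f"set x={i}") for i in range(1000)]
node3_log = node2_log[:10]

# Node 3 can vote for node 2, but node 2 will not vote for node 3.
node3_votes_for_node2 = candidate_is_up_to_date(1, len(node2_log), node3_log)
node2_votes_for_node3 = candidate_is_up_to_date(1, len(node3_log), node2_log)
```

With these logs node 3 can grant its vote to node 2, while node 2 refuses node 3, so only node 2 can win.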
After becoming the next leader, node 2's job is to transfer its log to node 3 using AppendEntries requests, but the sync takes a very long time, much longer than the election timeout. As I understand it, that means node 2 will have to step down as leader before it can transfer the whole log to node 3, because the paper says:
"Thus, a leader in Raft steps down if an election timeout elapses without a successful round of heartbeats to a majority of its cluster. This allows clients to retry their requests with another server."
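Here is a minimal Python sketch of how I read that step-down rule. `LeaderLease` and its methods are hypothetical names, not taken from any library, and what counts as a "successful" heartbeat is my assumption: here, any AppendEntries response that arrives from a follower is recorded. Times are integer milliseconds.

```python
class LeaderLease:
    """Tracks when each follower last answered the leader (hypothetical helper)."""

    def __init__(self, peers, election_timeout_ms, now_ms):
        self.cluster_size = len(peers) + 1  # the followers plus the leader itself
        self.election_timeout_ms = election_timeout_ms
        # Start the clock at election time so a new leader gets one full timeout.
        self.last_reply_ms = {peer: now_ms for peer in peers}

    def record_reply(self, peer, now_ms):
        # Assumption: every AppendEntries response received from this peer counts.
        self.last_reply_ms[peer] = now_ms

    def should_step_down(self, now_ms):
        reachable = 1  # the leader always counts itself
        for replied_at in self.last_reply_ms.values():
            if now_ms - replied_at < self.election_timeout_ms:
                reachable += 1
        majority = self.cluster_size // 2 + 1
        return reachable < majority


# Node 2 is the leader; node 1 is dead and node 3 is the only other live node.
lease = LeaderLease(peers=[1, 3], election_timeout_ms=300, now_ms=0)
early = lease.should_step_down(now_ms=200)        # still within the first timeout
silent = lease.should_step_down(now_ms=500)       # nobody has answered for 500 ms
lease.record_reply(peer=3, now_ms=400)
after_reply = lease.should_step_down(now_ms=500)  # node 3 answered 100 ms ago
```

In this sketch the leader only steps down if no response at all arrives from node 3 within the election timeout.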
So the behavior is that node 2 becomes the leader, can't finish syncing its log to node 3, and steps down. Then they hold another election, node 2 becomes leader again, and it again fails to sync its log to node 3. This behavior repeats indefinitely.
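The cycle I expect can be written as a tiny Python sketch. It encodes my own reading, which may be wrong: node 3 only answers once the whole transfer is done, and an interrupted transfer starts over after the next election. `simulate_attempts` and the numbers are made up for illustration.

```python
def simulate_attempts(sync_time_ms, election_timeout_ms, max_attempts):
    """Return the outcome of each leadership attempt under the assumptions above."""
    outcomes = []
    for attempt in range(1, max_attempts + 1):
        # Assumption: the leader hears nothing from node 3 until the sync completes.
        if sync_time_ms <= election_timeout_ms:
            outcomes.append((attempt, "synced"))
            break
        outcomes.append((attempt, "stepped down"))
    return outcomes


# A sync that needs 5 s against a 300 ms election timeout never completes.
stuck = simulate_attempts(sync_time_ms=5000, election_timeout_ms=300, max_attempts=3)
# A sync that fits inside the timeout finishes on the first attempt.
fine = simulate_attempts(sync_time_ms=100, election_timeout_ms=300, max_attempts=3)
```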
My question is: how can I sync logs from the leader to the follower while maintaining the availability of the cluster?
Please give me some advice. Thank you all!