Unsafety of leadership transfer


jin deng

Dec 11, 2017, 9:45:41 AM
to raft-dev
Hi all,
    The PhD dissertation version of the Raft paper describes a leadership transfer operation in which, as I read it, the target does NOT need to communicate with a majority of members to become leader. I think this breaks the rule that at most one leader can be elected in a given term.

    Imagine we have 5 nodes and a network partition splits them into two parts: one with 2 nodes and the other with 3. The old leader lives in the minority part and transfers its leadership to the other node in the same part. At the same time, the majority part begins an election and eventually elects a new leader. The two leaders will have the same term and, what's more, the leader lease may not detect this situation because the lease has not yet expired. So we may get stale reads on the old leader.

    Is this a weakness of the leadership transfer mechanism? Maybe we can solve this problem by making sure the election timeout is longer than the leader lease timeout, but that imposes a requirement on clock correctness, which Raft wants to avoid. In practice we know the election timeout is much longer than the lease expiration time, but if that ordering is ever broken by a clock issue, it should not break the correctness of the algorithm; it should only cause a performance problem.

jin deng

Dec 11, 2017, 9:53:01 AM
to raft-dev
Well, I think the leader lease is not very reliable if we treat the clock as an unreliable resource. Is querying a majority of the members before answering the only way to avoid stale reads?
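
To make the question concrete, here is a minimal sketch of the clock-free alternative: before serving a read, the leader runs one heartbeat round and answers only if a majority of the cluster still accepts its term. The function name `confirm_leadership` and its parameters are hypothetical, chosen for illustration; they are not taken from any Raft implementation.

```python
def confirm_leadership(leader_term: int, peer_terms: list, cluster_size: int) -> bool:
    """Return True if a majority of the cluster still accepts leader_term.

    peer_terms holds the current term reported by each peer that answered
    the heartbeat round; unreachable peers are simply absent from the list.
    (Hypothetical helper, illustration only.)
    """
    acks = 1  # the leader counts itself
    for term in peer_terms:
        if term > leader_term:
            # A peer has moved to a newer term: step down, do not serve the read.
            return False
        acks += 1
    return acks > cluster_size // 2


# Leader at term 3 in a 5-node cluster.
can_read_in_minority = confirm_leadership(3, [3], cluster_size=5)     # only 1 peer reachable
can_read_in_majority = confirm_leadership(3, [3, 3], cluster_size=5)  # 2 peers reachable
```

A leader stuck in the 2-node side of the partition cannot pass this check, so it never serves a stale read, at the cost of one network round trip per read (or per batch of reads).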

On Monday, December 11, 2017 at 10:45:41 PM UTC+8, jin deng wrote:

Thông Phạm Văn

Dec 14, 2017, 4:23:39 AM
to raft-dev
I think the leader in the minority part cannot commit any entries. The leader in the majority part can then overwrite any entry that was replicated by the minority leader.

On Monday, December 11, 2017 at 9:45:41 PM UTC+7, jin deng wrote:

Юрий Соколов

Dec 15, 2017, 2:59:55 AM
to raft-dev
The problem jin is talking about: there will be two leaders with the same term, and that is a clear violation of the protocol.

There will be two different versions of log entries with the same index and term: one committed in the majority and another uncommitted in the minority, which is also a violation of the protocol. The next leader could come from the minority; it will find that the former majority members have entries with the same term and index and will not overwrite those entries, so this will lead to split-brain.

Perhaps we are missing something, and this situation is impossible?

Thông Phạm Văn

Dec 15, 2017, 3:04:27 AM
to raft-dev
The node from the minority part cannot win the vote, because its term is smaller than that of any member of the majority part.

On Friday, December 15, 2017 at 2:59:55 PM UTC+7, Юрий Соколов wrote:

Юрий Соколов

Dec 15, 2017, 3:13:59 AM
to raft-dev
But Jin is talking about a leadership transfer from the old leader (which ends up in the minority due to a temporary network split) that happens concurrently with a leader election in the majority.

jordan.h...@gmail.com

Dec 15, 2017, 7:47:09 PM
to raft...@googlegroups.com
This conversation seems to be premised on the idea that a leader transfers its leadership directly to another node, and that the new node is immediately promoted and keeps the same term as the old leader. AIUI the dissertation does not describe it that way. The leader transferring its leadership sends all its entries to another node *and then times out that node* so that it immediately starts a new election, which it is expected to win. Timing out the node to which leadership is being transferred causes the new leader to increment its term, which resolves the duplicate-term issue. Additionally, since it increments its term and has to run an election to become leader, it simply won't get elected if it's partitioned. Most likely, another node will time out and start an election as well, and the node to which the leader intended to transfer leadership may not win.
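
The argument above can be checked with a toy model of the 5-node scenario from the first post. This is only a sketch under simplifying assumptions: the `Node` class and the `timeout_now` function are hypothetical names, the log up-to-date check is omitted, and message passing is replaced by direct method calls.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical, heavily simplified Raft node (illustration only)."""
    name: str
    term: int = 1
    voted_for: dict = field(default_factory=dict)  # term -> candidate name

    def handle_request_vote(self, candidate: str, candidate_term: int) -> bool:
        # Reject candidates from stale terms (log up-to-date check omitted).
        if candidate_term < self.term:
            return False
        self.term = candidate_term
        # Grant at most one vote per term.
        granted_to = self.voted_for.setdefault(candidate_term, candidate)
        return granted_to == candidate


def timeout_now(target: Node, reachable: list, cluster_size: int) -> bool:
    """The target starts an election at once, as if its timer had fired.

    Returns True only if the target collects votes from a majority of the
    whole cluster, counting only the peers it can actually reach.
    """
    target.term += 1  # a new term, never the old leader's term
    target.voted_for[target.term] = target.name
    votes = 1  # the candidate votes for itself
    for peer in reachable:
        if peer.handle_request_vote(target.name, target.term):
            votes += 1
    return votes > cluster_size // 2


# Partition {A, B} | {C, D, E}; the old leader A asks B to take over.
nodes = {name: Node(name) for name in "ABCDE"}
b_won = timeout_now(nodes["B"], [nodes["A"]], cluster_size=5)
# Meanwhile C's election timer fires on the majority side.
c_won = timeout_now(nodes["C"], [nodes["D"], nodes["E"]], cluster_size=5)
```

In this run B and C both campaign in term 2, but B gathers only 2 of 5 votes, so `b_won` is false while `c_won` is true: the transfer target never becomes leader without a majority, and the one-leader-per-term rule holds.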

> On Dec 15, 2017, at 12:13 AM, Юрий Соколов <funny....@gmail.com> wrote:
>
> But Jin is talking about a leadership transfer from the old leader (which ends up in the minority due to a temporary network split) that happens concurrently with a leader election in the majority.

jin deng

Dec 19, 2017, 2:48:46 AM
to raft-dev
Sorry, I had misunderstood: the node does not immediately become leader after the current leader has transferred all its log entries to it. Instead, the node just fires its election timer and starts a regular election, so this situation cannot actually happen.

Thanks, all of you. It seems I had sent this reply as a private message before :-)

On Friday, December 15, 2017 at 3:59:55 PM UTC+8, Юрий Соколов wrote: