Hibernating Rhinos Ltd
Oren Eini l CEO l Mobile: + 972-52-548-6969
Office: +972-4-622-7811 l Fax: +972-153-4-622-7811
--
1) No. If you wait until you reach candidate status, you'll have many more contested elections. What you can do is refuse to grant a vote if you recently received a heartbeat from the leader. That way a single disconnected node can't interrupt operations. Note that this also requires the pre-vote feature; otherwise a node joining the cluster will force an election.
2) That can happen quite easily: if you have two candidates competing for votes, and they fail, then fail again, and so on, you can end up with permanently hung elections.
2. One vote per term. The leader in the [now] older term will quickly discover that it's no longer the leader and step down. The higher term leader sending heartbeats is one mechanism for that to happen. IIRC, the higher term requestVote will cause the older term leader to step down [or never ascend depending on timing].
3. Don't respond at all? When the sender has an up to date or newer log, never send granted=true? Not sure I follow.
When a node receives a RequestVote with a newer term and an older log, the node should
1. update currentTerm
2. reject the vote
3. become a follower
What would break if I made a minor adjustment in this case? If the node is a leader, do steps 1 and 2 and skip 3, i.e. avoid becoming a follower.
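For reference, the three stock steps listed above can be sketched as follows. This is only a minimal illustration, not any particular implementation: the `Node` class, its field names, and `handle_request_vote` are hypothetical names chosen for this example, and the proposed variation (a leader skipping step 3) is deliberately not shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # All names here are hypothetical, chosen for illustration only.
    current_term: int
    role: str  # "leader", "candidate" or "follower"
    last_log_term: int
    last_log_index: int
    voted_for: Optional[str] = None


def candidate_log_is_current(node, cand_last_term, cand_last_index):
    # Raft's "at least as up to date" check: compare last terms first,
    # then last indexes. Tuple comparison does exactly that.
    return (cand_last_term, cand_last_index) >= (
        node.last_log_term,
        node.last_log_index,
    )


def handle_request_vote(node, term, candidate_id, cand_last_term, cand_last_index):
    """Return (current_term, vote_granted) following the stock rules."""
    if term > node.current_term:
        node.current_term = term  # step 1: update currentTerm
        node.voted_for = None
        node.role = "follower"    # step 3: become a follower
    if term < node.current_term:
        return node.current_term, False
    if node.voted_for in (None, candidate_id) and candidate_log_is_current(
        node, cand_last_term, cand_last_index
    ):
        node.voted_for = candidate_id
        return node.current_term, True
    return node.current_term, False  # step 2: reject the vote


# A leader at term 3 receives a RequestVote for term 5 from a candidate
# whose log is older than its own.
leader = Node(current_term=3, role="leader", last_log_term=3, last_log_index=10)
reply = handle_request_vote(leader, 5, "c", cand_last_term=2, cand_last_index=5)
```

In this sketch the old leader ends up as a follower at term 5 without having granted its vote, which is the situation the question is about.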
When a granted=false reply is received, go to follower mode since it's impossible to ever be a leader.
I think you just described the stock pre vote mechanism.
I'm not liking all the tweaking that's going on either. It doesn't take much to break safety.

Stepping back is a good idea. But I wouldn't focus on that particular scenario. I think the focus should be minimal disruption. Several ideas are mentioned in the dissertation and actually recommended for a real deployment. But we're all here implementing them as tweaks.

The stock Raft paper demonstrates a really useful algorithm with the fewest rules. But some of us are finding out that there needs to be a little more for it to be practical. In a way, I feel that Raft suffers from the same problem it points out about Multi-Paxos. Rephrased from the dissertation:

....One reason is that there is no widely agreed-upon algorithm for Minimum Disruptive Raft...sketched possible approaches....but these differ from others...details have not been published.

I think what's needed is another version of the condensed summary that folds in pre-votes and the minimum election timeout and just lays it all out in an "understandable" form. Should pre-vote be a different message or a flag in RequestVote? Are there extra rules for different roles? And so on. I would also throw in the no-op issue, since it's important in practice. E.g. should we always insert a no-op right after an election?
--
You assume that a single node would be split from the network. But what about two nodes being split off in a 5-node cluster?
But you just introduced another state, and for what? What does this get you?
--
I'm bringing some of the "is it ok for a leader to reject pre-vote rpc?" discussion into here, since I couldn't see how the two threads differed.
But a scenario like the one Diego described, where you have a node that can't talk to the leader, but can talk to the rest of the cluster...
Avoiding Disruptions

First off, if you're doing membership changes, servers shouldn't update their terms based on RequestVote requests for a minimum (baseline) election timeout after receiving a heartbeat. I consider that tweak required if you support membership changes. All the other approaches I know about don't quite handle all cases of disruptive servers during and after membership changes. (You can make a best effort to tell a server it's removed from the cluster, but that might fail, and you can't retry forever.)
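A minimal sketch of the tweak described above, assuming a follower that remembers when it last heard a heartbeat. The names `Follower`, `on_heartbeat`, `on_request_vote`, and the `MIN_ELECTION_TIMEOUT` value are all hypothetical, and whether an ignored request is dropped, delayed, or answered with a denial is left open here; this version simply drops it.

```python
from dataclasses import dataclass

# Hypothetical baseline election timeout, in seconds.
MIN_ELECTION_TIMEOUT = 0.150


@dataclass
class Follower:
    current_term: int
    last_heartbeat: float = float("-inf")  # time of the last leader heartbeat


def on_heartbeat(follower, now):
    # Record when we last heard from a current leader.
    follower.last_heartbeat = now


def on_request_vote(follower, term, now):
    """Return the follower's term after processing, or None if ignored."""
    if now - follower.last_heartbeat < MIN_ELECTION_TIMEOUT:
        # Heard from a leader too recently: do not adopt the term and do
        # not grant a vote, so a disruptive server cannot force an election.
        return None
    if term > follower.current_term:
        follower.current_term = term
    # Normal vote handling (log check, votedFor) would continue from here.
    return follower.current_term


f = Follower(current_term=3)
on_heartbeat(f, now=10.0)
ignored = on_request_vote(f, term=7, now=10.05)   # within the timeout
accepted = on_request_vote(f, term=7, now=10.2)   # timeout has elapsed
```

Time is passed in explicitly so the behavior is easy to test; a real server would read a monotonic clock instead.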
--
Where (other than this email) is this "required tweak" discussed? I don't think I've seen it before.
On Wed, Jun 17, 2015 at 1:20 PM, Ben Darnell <b...@bendarnell.com> wrote:
On Wed, Jun 17, 2015 at 3:35 AM, Diego Ongaro <onga...@gmail.com> wrote:

Avoiding Disruptions

First off, if you're doing membership changes, servers shouldn't update their terms based on RequestVote requests for a minimum (baseline) election timeout after receiving a heartbeat. I consider that tweak required if you support membership changes. All the other approaches I know about don't quite handle all cases of disruptive servers during and after membership changes. (You can make a best effort to tell a server it's removed from the cluster, but that might fail, and you can't retry forever.)

I see how this avoids disruptions, but doesn't it cause other problems? Suppose node C is partitioned away and increments its term, so its term is now ahead of the majority A and B. If A and B refuse to increase their terms to match, then C cannot force an election, but its term remains "in the future" so it will not recognize messages from the current leader and remain frozen in its isolated state. How do you bring node C back into the fold? Do A and B somehow tell C to decrease its term?

Relying on asymmetry: the term in the RequestVote request is not adopted, but the term in the AppendEntries response is. The leader (A or B) will send a heartbeat to C, C will reply to that heartbeat with its newer term, and the leader will step down and adopt that newer term.
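The AppendEntries half of that asymmetry can be sketched roughly as below. This is an illustrative assumption about how the two handlers might look, not code from any implementation; `Server`, `on_append_entries`, and `on_append_entries_response` are hypothetical names, and log replication itself is omitted.

```python
from dataclasses import dataclass


@dataclass
class Server:
    # Hypothetical minimal server state for this illustration.
    name: str
    current_term: int
    role: str  # "leader", "candidate" or "follower"


def on_append_entries(server, leader_term):
    """Receiver side of a heartbeat. Returns (term, success)."""
    if leader_term < server.current_term:
        # Stale leader: reject, and report our newer term in the reply.
        return server.current_term, False
    server.current_term = leader_term
    server.role = "follower"
    return server.current_term, True


def on_append_entries_response(leader, reply_term, success):
    """Leader side: a newer term in the reply IS adopted."""
    if reply_term > leader.current_term:
        leader.current_term = reply_term
        leader.role = "follower"  # step down
    # On success the leader would advance its replication state (omitted).


# C was partitioned away and bumped its term to 5; A still leads at term 3.
a = Server("A", current_term=3, role="leader")
c = Server("C", current_term=5, role="candidate")

reply = on_append_entries(c, a.current_term)   # A's heartbeat reaches C
on_append_entries_response(a, *reply)          # A sees the newer term
```

After this exchange A has stepped down at term 5, so the next election runs in a term that C will recognize.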
Where (other than this email) is this "required tweak" discussed? I don't think I've seen it before.

In the last two paragraphs of the cluster membership changes section of the conference paper and Extended Version paper, as well as in the dissertation. We added this fairly late in the game, so if you've mostly read pre-release drafts of the paper, it may not have been present there.
-Diego