Hey Gerry,
Yeah, that doesn't look right. In case the rest of y'all are
wondering, the pseudocode comes from here:
https://github.com/ongardie/raft-pseudocode . There's a very relevant
warning on it: "warning: might not be entirely correct, check with
TLA+ spec". The story behind that pseudocode is that we were exploring
the possibility of using it as a substitute for Figure 2 (the
"cheatsheet") in the Raft paper, but we decided against it.
If you're trying to implement the algorithm and are worried about
correctness, you might want to try the TLA+ spec (latest version at
https://github.com/ongardie/raft.tla ), which I did try to get right.
> 1. So, as discussed elsewhere, does it really make sense to handle plain heartbeats the same way as AppendEntries requests? Or wouldn't it be semantically cleaner to separate them?
As you're aware, we've discussed this before. I'm pretty sure some
implementations do separate heartbeats, and it's not crazy. On the
other hand, sending and processing heartbeats the same as (empty)
AppendEntries is convenient, in my opinion.
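To illustrate why the shared path is convenient, here is a minimal sketch (not taken from the pseudocode repo or any real implementation) in which a heartbeat is simply what the normal request builder produces when there is nothing new to send. The field names follow Figure 2 of the Raft paper; `make_append_entries`, `max_entries`, and the `(term, command)` log layout are assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Entry = Tuple[int, str]  # (term, command); hypothetical log entry layout


@dataclass
class AppendEntries:
    term: int
    leader_id: str
    prev_log_index: int
    prev_log_term: int
    leader_commit: int
    entries: List[Entry] = field(default_factory=list)


def make_append_entries(term: int, leader_id: str, log: List[Entry],
                        next_index: int, commit_index: int,
                        max_entries: int = 64) -> AppendEntries:
    """Build the request for one follower. Log indexes are 1-based,
    so the entry at index i lives in log[i - 1]."""
    prev_index = next_index - 1
    prev_term = log[prev_index - 1][0] if prev_index > 0 else 0
    # If the follower is caught up, this slice is empty and the
    # request doubles as a heartbeat; no separate message type needed.
    entries = log[next_index - 1:next_index - 1 + max_entries]
    return AppendEntries(term, leader_id, prev_index, prev_term,
                         commit_index, entries)
```

With this shape, the follower needs only one handler, and the heartbeat carries the same consistency-check fields as a real append.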
> 2. Is there any other purpose of the heartbeat than resetting the election counter of the peers?
It gets followers the leader's latest commit index. Empty
AppendEntries are also typically used to determine where the leader's
and follower's logs diverge (setting 'nextIndex'). It lets the
follower know who the leader is. It makes sure everyone's cool with
the current term.
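A rough sketch of the follower side, showing where each of those effects happens when an AppendEntries (empty or not) arrives. It follows the receiver rules in Figure 2 of the Raft paper, but the `Follower` class, the `timer_resets` counter standing in for a real election timer, and the `(term, command)` entry layout are assumptions for illustration. It deliberately omits vote bookkeeping and persistence, so please check it against the TLA+ spec rather than trusting it.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Entry = Tuple[int, str]  # (term, command); hypothetical layout


@dataclass
class Follower:
    current_term: int = 0
    leader_id: Optional[str] = None
    commit_index: int = 0
    log: List[Entry] = field(default_factory=list)
    timer_resets: int = 0  # stand-in for resetting a real election timer

    def on_append_entries(self, term: int, leader_id: str,
                          prev_log_index: int, prev_log_term: int,
                          entries: List[Entry],
                          leader_commit: int) -> Tuple[int, bool]:
        # Stale leader: reject; the reply carries our newer term.
        if term < self.current_term:
            return (self.current_term, False)
        # Agree on the term, learn who the leader is, reset the timer.
        self.current_term = term
        self.leader_id = leader_id
        self.timer_resets += 1
        # Consistency check: this is what lets the leader find the
        # point where the logs diverge, even with no entries attached.
        if prev_log_index > len(self.log):
            return (self.current_term, False)
        if prev_log_index > 0 and self.log[prev_log_index - 1][0] != prev_log_term:
            return (self.current_term, False)
        # Append new entries, truncating at the first conflict.
        for i, entry in enumerate(entries):
            idx = prev_log_index + 1 + i  # 1-based index of this entry
            if idx <= len(self.log) and self.log[idx - 1][0] != entry[0]:
                del self.log[idx - 1:]
            if idx > len(self.log):
                self.log.append(entry)
        # Pick up the leader's commit index, never moving backwards.
        last_new = prev_log_index + len(entries)
        self.commit_index = max(self.commit_index,
                                min(leader_commit, last_new))
        return (self.current_term, True)
```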
> 3. Does the peer after having received a heartbeat have to send a response or would it be sufficient to just reset the election timer (My implementation doesn’t use RPC calls. It is just sending messages forth and back!)
The reply is useful in a few cases. If the follower's term is higher,
it gets the deposed leader to step down. If nextIndex is out of place,
the reply helps to adjust it. In some implementations (like LogCabin),
read-only requests are blocked at the leader until a round of
heartbeats completes with a majority of the cluster; without a reply,
you can't know when it completes.
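Here is a minimal sketch of how a leader might use the reply for those three purposes. This is not LogCabin's code; the `Leader` class, `on_reply`, `round_complete`, and the `acked` set are hypothetical names. One assumption to flag: any reply in the leader's current term counts as an acknowledgement for the heartbeat round, even if the log consistency check failed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Leader:
    current_term: int
    peers: List[str]          # the other servers, excluding this leader
    last_log_index: int
    is_leader: bool = True
    next_index: Dict[str, int] = field(default_factory=dict)
    match_index: Dict[str, int] = field(default_factory=dict)
    acked: Set[str] = field(default_factory=set)  # peers heard from this round

    def __post_init__(self) -> None:
        for p in self.peers:
            self.next_index.setdefault(p, self.last_log_index + 1)
            self.match_index.setdefault(p, 0)

    def on_reply(self, peer: str, reply_term: int, success: bool,
                 prev_log_index: int, num_entries: int) -> None:
        # A higher term in the reply deposes this leader.
        if reply_term > self.current_term:
            self.current_term = reply_term
            self.is_leader = False
            return
        # Ignore delayed replies from older terms.
        if not self.is_leader or reply_term < self.current_term:
            return
        self.acked.add(peer)  # assumption: any same-term reply counts
        if success:
            self.match_index[peer] = prev_log_index + num_entries
            self.next_index[peer] = self.match_index[peer] + 1
        else:
            # Back up one entry and retry on the next AppendEntries.
            self.next_index[peer] = max(1, self.next_index[peer] - 1)

    def round_complete(self) -> bool:
        """True once a majority (counting the leader) has answered,
        at which point blocked read-only requests could proceed."""
        return (len(self.acked) + 1) * 2 > len(self.peers) + 1
```

A real implementation would clear `acked` at the start of each heartbeat round and tie each reply to the round it belongs to.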
One more that I can think of: you don't want client requests to block
at a partitioned leader forever. You can have clients enforce a
timeout on their requests, but maybe the leader is in a better
position to know when the client should give up (the leader knows when
it's able to make forward progress; the client doesn't). So in
LogCabin, a leader steps down if it can't get heartbeats out to a
majority of the cluster within an election timeout.
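The step-down check can be sketched roughly like this. Again, this is not LogCabin's actual code; `should_step_down`, the `last_ack_times` map (peer to the time of its most recent heartbeat reply), and the use of plain floats for time are assumptions for the example.

```python
from typing import Dict


def should_step_down(last_ack_times: Dict[str, float], now: float,
                     election_timeout: float) -> bool:
    """Return True if the leader has not heard from a majority of the
    cluster within one election timeout.

    last_ack_times maps each *other* server to the time of its most
    recent heartbeat reply; the leader counts itself as reachable.
    """
    cluster_size = len(last_ack_times) + 1
    reachable = 1 + sum(1 for t in last_ack_times.values()
                        if now - t < election_timeout)
    # No majority reachable: give up leadership so that blocked client
    # requests can fail fast and be retried elsewhere.
    return reachable * 2 <= cluster_size
```

For example, in a three-server cluster with a 0.5 second timeout, `should_step_down({"b": 9.9, "c": 9.0}, 10.0, 0.5)` is false (one recent reply plus the leader is a majority), while `should_step_down({"b": 9.0, "c": 9.0}, 10.0, 0.5)` is true.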
-Diego