I think that option (1) is the same solution I mentioned for supporting clients that want to implement a reliable FIFO buffer for commands which don't have ExactlyOnceRPC semantics.
I am curious whether implementing that option is *sufficient* to eliminate the need for sessions entirely, if one wants to.
A more interesting issue is actually what happens when you restore from an old backup. That is a case where a node has essentially travelled back in time, and the state it had confirmed to other machines has changed.
In practice, this means we need some notion of cluster identity, and we need a way to (a) verify that identity and (b) invalidate that identity if a node "forgets" (or does anything else that violates a Raft assumption).
So the effect of wiping a node's persistent state should be that you have just permanently removed that node from its cluster... later, if/when you start the node back up again, it cannot rejoin its cluster but instead becomes the first member of a brand new cluster.
The simplest way to implement all of this is with unique cluster IDs that are stored in persistent storage along with all the other persistent data, and to have nodes reject messages with the wrong cluster ID.
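Concretely, that check could be a one-line guard at the top of the message handler. Here is a minimal sketch in Go; the Message/Node types and field names are hypothetical, not from any particular Raft implementation:

```go
// Minimal sketch of the cluster-ID check (hypothetical types and
// field names, not from any particular Raft implementation).
package raft

import "log"

// Message is the envelope for all Raft RPCs; the sender stamps it
// with the cluster ID from its own persistent state.
type Message struct {
	ClusterID string
	// ... term, message type, payload, etc.
}

// Node keeps its cluster ID alongside the rest of its persistent
// state (term, vote, log). A node whose state is wiped bootstraps a
// brand new cluster with a freshly generated ID.
type Node struct {
	clusterID string
}

// handleMessage drops anything stamped with a foreign cluster ID, so
// a wiped-and-restarted node can never rejoin (or confuse) its old
// cluster.
func (n *Node) handleMessage(m Message) {
	if m.ClusterID != n.clusterID {
		log.Printf("rejecting message from cluster %q (ours is %q)", m.ClusterID, n.clusterID)
		return
	}
	// ... normal Raft message processing
}
```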
Oren Eini (Ayende Rahien) <aye...@ayende.com> wrote:
> A more interesting issue is actually what happens when you restore from an old backup. That is a case where a node has essentially travelled back in time, and the state it had confirmed to other machines has changed.

Yes, that is pretty scary, and if you literally mean a disk backup, then I would hope you got the full state and not just a partial one. In that case, I think that node should never be able to get elected leader (assuming this happened on only one box), and the leader should re-snapshot it. It could still be bad, though -- if a node forgot its votes, it could potentially vote differently multiple times in the same term, and you could end up with multiple leaders in the same term. The probability of this might be small, since restoring from a backup would usually take a long time. Still, I don't think we can ever safely do a full restore from backup...
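For context, the reason forgotten votes are so dangerous: Raft's safety argument assumes currentTerm and votedFor reach stable storage *before* a vote reply is sent, so a node can never grant two conflicting votes in one term. A rough sketch of that rule (hypothetical names; the log-comparison and persist helpers are stubs for illustration):

```go
// Sketch of the persist-before-reply rule that a backup restore
// silently violates. Types and helpers are hypothetical.
package raft

type RequestVote struct {
	Term         uint64
	CandidateID  string
	LastLogIndex uint64
	LastLogTerm  uint64
}

type VoteReply struct {
	Term    uint64
	Granted bool
}

type Node struct {
	currentTerm uint64
	votedFor    string // "" means no vote cast in the current term
}

func (n *Node) handleRequestVote(req RequestVote) VoteReply {
	if req.Term < n.currentTerm {
		return VoteReply{Term: n.currentTerm, Granted: false}
	}
	if req.Term > n.currentTerm {
		n.currentTerm = req.Term
		n.votedFor = "" // entering a new term: no vote cast yet
	}
	grant := (n.votedFor == "" || n.votedFor == req.CandidateID) &&
		n.candidateLogUpToDate(req.LastLogTerm, req.LastLogIndex)
	if grant {
		n.votedFor = req.CandidateID
	}
	// Safety depends on this write reaching disk *before* the reply
	// goes out. Restoring an old backup rolls votedFor back, so the
	// node may grant a second, conflicting vote in the same term.
	n.persist()
	return VoteReply{Term: n.currentTerm, Granted: grant}
}

// Stubs for illustration only.
func (n *Node) candidateLogUpToDate(lastLogTerm, lastLogIndex uint64) bool { return true }
func (n *Node) persist()                                                   {}
```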
Archie Cobbs <archie...@gmail.com> wrote:
> In practice, this means we need some notion of cluster identity, and we need a way to (a) verify that identity and (b) invalidate that identity if a node "forgets" (or does anything else that violates a Raft assumption).
> So the effect of wiping a node's persistent state should be that you have just permanently removed that node from its cluster... later, if/when you start the node back up again, it cannot rejoin its cluster but instead becomes the first member of a brand new cluster.
> The simplest way to implement all of this is with unique cluster IDs that are stored in persistent storage along with all the other persistent data, and to have nodes reject messages with the wrong cluster ID.

I'm confused by this approach. A GUID per node makes sense to me, but I'm not sure what purpose cluster IDs are serving, unless you literally mean a different database or some type of sharding (etcd raft calls this multi-node), which is clearly good for certain things, but not this. I don't think cluster IDs help with safely adding a node back into the pool for its original cluster after a disk failure (assuming the implication is that all nodes in a cluster share the same cluster ID).
If you wanted to guard against time travel, you would have to add an additional mechanism.
A simple solution would be to exclude the node ID from backups. On a simple crash/stop -> restart, everything works normally. On restore from backup, the node must get a new ID and be added to the cluster as a new member.
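A rough sketch of that startup path (the path and helper name are hypothetical, and this assumes the github.com/google/uuid package for ID generation):

```go
// Sketch of "exclude the node ID from backups": the ID lives in its
// own file outside the backed-up data directory, so a restored node
// comes up without one and must mint a fresh identity.
package raft

import (
	"errors"
	"os"

	"github.com/google/uuid"
)

const nodeIDPath = "/var/lib/raft/node-id" // deliberately NOT in the backup set

// loadOrCreateNodeID returns the persisted node ID after a normal
// crash/restart, or a brand-new one after a restore from backup (the
// operator must then add the node to the cluster as a new member).
func loadOrCreateNodeID() (string, error) {
	b, err := os.ReadFile(nodeIDPath)
	if err == nil {
		return string(b), nil // normal restart: same identity
	}
	if !errors.Is(err, os.ErrNotExist) {
		return "", err
	}
	id := uuid.NewString()
	if err := os.WriteFile(nodeIDPath, []byte(id), 0o600); err != nil {
		return "", err
	}
	return id, nil
}
```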