Is it possible to configure Akka Cluster with auto-downing so that with 100% probability at each moment at most one cluster node thinks that he is the leader (or singleton) even in case of cluster partitions? Some emails/blog posts imply that requiring cluster partition to be majority quorum does the trick?
How to do it with auto-scaling clusters?
Am I correct that if two actors with the same persistence id work concurrently on different cluster partitions the sequence numbers of events will get mixed up?
Does it imply that if Akka Persistence is used with Akka Cluster Sharding then Akka Cluster must guarantee uniqueness of ShardCoordinator?
--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to akka-user+...@googlegroups.com.
To post to this group, send email to akka...@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.
Hi Oleg,

Your understanding of both topics, in the current version of Akka, is correct. More specifically:

> Is it possible to configure Akka Cluster with auto-downing so that with 100% probability at each moment at most one cluster node thinks that he is the leader (or singleton) even in case of cluster partitions? Some emails/blog posts imply that requiring cluster partition to be majority quorum does the trick?

The auto-downing strategy currently provided is a rather naive one: it is purely timeout based. It does not guard against split brains, which is why I would rather not encourage using auto-downing if your cluster needs any kind of "single" entity.

The timer-based auto-downing works well for clusters where you have "many workers, but no master", for example, since a split need not end in a wrong "leader" being elected (there is no leader).

Yes, a quorum would help make downing safer (avoiding split brains); however, we have not implemented it yet (we are aware of the need and the possibility, of course).
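To make the quorum idea concrete, here is a minimal sketch in Python of a majority check. It is not an Akka API: the function name `partition_may_continue` and its parameters are hypothetical, and the sketch assumes every node knows the same total member count.

```python
def partition_may_continue(reachable: int, total_members: int) -> bool:
    """Return True only if this side of a partition holds a strict majority.

    Hypothetical helper for illustration; assumes all nodes agree on
    total_members, which is itself an assumption of this sketch.
    """
    if total_members <= 0 or not 0 <= reachable <= total_members:
        raise ValueError("need total_members > 0 and 0 <= reachable <= total_members")
    return 2 * reachable > total_members


# A 5-node cluster split 3/2: only the 3-node side may keep its singleton.
sides = [3, 2]
survivors = [n for n in sides if partition_may_continue(n, 5)]

# A 4-node cluster split 2/2: neither side has a strict majority.
even_split = [partition_may_continue(2, 4), partition_may_continue(2, 4)]
```

Because two disjoint sides can never both hold a strict majority of the same total, at most one side continues. The cost is that an even split stops both sides, and the check is only as good as the agreement on the total, which is the hard part when the cluster size changes under auto-scaling.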
> Am I correct that if two actors with the same persistence id work concurrently on different cluster partitions the sequence numbers of events will get mixed up?

Correct, currently these sequence numbers are the "source of truth" and mixing them up causes problems during replay.

> Does it imply that if Akka Persistence is used with Akka Cluster Sharding then Akka Cluster must guarantee uniqueness of ShardCoordinator?

In fact, cluster sharding *uses* persistence in order to survive the leader going down, so that the shard allocation information can be restored once the new coordinator boots up.
Yes, currently there is a hard requirement for "only one writer" in akka-persistence. This can be facilitated by either cluster-sharding or cluster-singletons.
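The sequence number problem can be illustrated with a toy model in Python. This is not Akka Persistence: `Journal` and `Writer` are hypothetical classes, and the sketch assumes each writer simply continues from the last sequence number it saw during recovery.

```python
from collections import defaultdict


class Journal:
    """Toy in-memory journal: persistence_id -> list of (sequence_nr, event)."""

    def __init__(self):
        self.events = defaultdict(list)

    def append(self, persistence_id, sequence_nr, event):
        self.events[persistence_id].append((sequence_nr, event))

    def duplicate_sequence_numbers(self, persistence_id):
        """Return the sequence numbers that were written more than once."""
        seen, clashes = set(), set()
        for sequence_nr, _ in self.events[persistence_id]:
            if sequence_nr in seen:
                clashes.add(sequence_nr)
            seen.add(sequence_nr)
        return sorted(clashes)


class Writer:
    """Hypothetical writer that numbers events from its own local counter."""

    def __init__(self, journal, persistence_id, last_sequence_nr):
        self.journal = journal
        self.persistence_id = persistence_id
        self.last_sequence_nr = last_sequence_nr

    def persist(self, event):
        self.last_sequence_nr += 1
        self.journal.append(self.persistence_id, self.last_sequence_nr, event)


journal = Journal()
journal.append("account-1", 1, "opened")
journal.append("account-1", 2, "deposited 10")

# After a split, each side starts its own instance, both recovered up to 2.
side_a = Writer(journal, "account-1", last_sequence_nr=2)
side_b = Writer(journal, "account-1", last_sequence_nr=2)
side_a.persist("deposited 5")
side_b.persist("withdrew 10")
side_a.persist("closed")

clashes = journal.duplicate_sequence_numbers("account-1")
```

Here `clashes` contains sequence number 3, written once by each side, so a replay no longer has one unambiguous history for "account-1".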
Slightly related "future work": we do have some CRDT work by Patrik Nordwall stashed (it's public as akka-data-replication) and will want to move it into Akka main at some point. Those, of course, do not require any kind of leader in the cluster.
Hello!

Thanks for the answers! I have a few more questions/clarifications below, if you don't mind.
On Wednesday, January 14, 2015 at 6:02:33 PM UTC+2, Akka Team wrote:
> > Is it possible to configure Akka Cluster with auto-downing so that with 100% probability at each moment at most one cluster node thinks that he is the leader (or singleton) even in case of cluster partitions? Some emails/blog posts imply that requiring cluster partition to be majority quorum does the trick?
>
> The auto-downing strategy currently provided is a rather naive one: it is purely timeout based. It does not guard against split brains, which is why I would rather not encourage using auto-downing if your cluster needs any kind of "single" entity.
>
> The timer-based auto-downing works well for clusters where you have "many workers, but no master", for example, since a split need not end in a wrong "leader" being elected (there is no leader).

I'd like to better understand what happens to a Leader if it becomes unreachable. Auto-down is impossible because it can only be performed by the Leader, right? Can I execute the Down command on a non-leader member of the cluster when the Leader is unreachable?
> > Am I correct that if two actors with the same persistence id work concurrently on different cluster partitions the sequence numbers of events will get mixed up?
>
> Correct, currently these sequence numbers are the "source of truth" and mixing them up causes problems during replay.
>
> > Does it imply that if Akka Persistence is used with Akka Cluster Sharding then Akka Cluster must guarantee uniqueness of ShardCoordinator?
>
> In fact, cluster sharding *uses* persistence in order to survive the leader going down, so that the shard allocation information can be restored once the new coordinator boots up.
>
> Yes, currently there is a hard requirement for "only one writer" in akka-persistence. This can be facilitated by either cluster-sharding or cluster-singletons.

So Persistence, and consequently Sharding, cannot be used in potential split-brain cluster configurations? Because the ShardCoordinator could potentially run on both partitions simultaneously?
Also, if Persistence stores its state in an eventually consistent replicated journal (e.g. Cassandra), wouldn't there be inconsistencies if the journal cluster itself suffers a network partition?
> Slightly related "future work": We do have some CRDT work by Patrik Nordwall stashed (it's public as akka-data-replication) and will want to move it into Akka main at some point, those of course do not require any kind leaders in the cluster.

As I am sure you know, there is a whole infrastructure for consistent, highly reliable distributed coordination:

Zookeeper's async API at first glance looks like a perfect match for Akka and FSM?

But as outlined below, you discovered that Zookeeper doesn't suit your needs well. Maybe you could share your findings?

> In the process we have implemented and thrown away prototypes using both ZooKeeper, JGroups and Hazelcast.
> None of them solved the problem optimally and each one imposed its own set of unnecessary constraints and drawbacks.
> We strongly believe that a loosely coupled, eventually consistent, fully decentralized and self-coordinating P2P solution
> is the right way to scale an actor system and to make it fully resilient to failure.

Thank you!
Oleg
On Wed, Jan 14, 2015 at 6:58 PM, Oleg Mürk <oleg...@gmail.com> wrote:

> I'd like to better understand what happens to a Leader if it becomes unreachable. Auto-down is impossible because it can only be performed by the Leader, right? Can I execute Down command on non-leader member of cluster when the Leader is unreachable?

Akka Cluster doesn't require a strict leader for managing the cluster membership. The membership data is a Conflict Free Replicated Data Type (CRDT), so if there are conflicting updates they can be merged anyway. The leader is just a role for a node that performs certain actions, and it is alright if several nodes think they have this role.
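As a rough illustration of why conflicting membership updates can always be merged, here is a minimal state-based CRDT sketch in Python. It is not Akka's gossip implementation: the status ordering in `STATUS_ORDER`, the `merge` function, and the node names are assumptions made for this example.

```python
# Assumed lifecycle order for this sketch; on merge, the later status wins.
STATUS_ORDER = ["joining", "up", "leaving", "exiting", "down", "removed"]
RANK = {status: i for i, status in enumerate(STATUS_ORDER)}


def merge(view_a, view_b):
    """Merge two membership views (node -> status) by per-node maximum."""
    merged = dict(view_a)
    for node, status in view_b.items():
        if node not in merged or RANK[status] > RANK[merged[node]]:
            merged[node] = status
    return merged


# Two sides of a partition updated the membership independently.
side_a = {"n1": "up", "n2": "up", "n3": "down"}
side_b = {"n1": "down", "n2": "up", "n3": "up", "n4": "joining"}

merged = merge(side_a, side_b)
# merged == {"n1": "down", "n2": "up", "n3": "down", "n4": "joining"}
```

Taking the per-node maximum makes the merge commutative, associative and idempotent, so both sides converge on the same view regardless of the order in which they exchange state. That holds for membership data only; it says nothing about singletons or persistent actors that ran on both sides in the meantime.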
> So Persistence and consequently Sharding cannot be used in potential split-brain cluster configurations? Because ShardCoordinator could potentially run on both partitions simultaneously?

It is right that there must be only one active instance of a PersistentActor with a given persistenceId, i.e. a single writer to the journal. That is also true for the ShardCoordinator, since it is a PersistentActor. That means that you must handle network partitions carefully and use a proper downing strategy, as discussed earlier. We acknowledge that a packaged solution for improving this has been requested by many users.
<...>
We don't need strong coordination for cluster membership and we have the goal to support large clusters, which I believe would not be possible with Zookeeper. Zookeeper is great, but it is solving a different set of problems.
It's also possible to build coordination services on top of Akka Cluster, as illustrated by Konrad's akka-raft prototype.