Akka cluster - seed node goes down and is not discoverable afterwards

Eugene Dzhurinsky

Jul 2, 2014, 10:46:40 PM
to akka...@googlegroups.com
Hello!

I have a scenario in which only one seed node is configured.

If this node becomes unreachable due to a network problem, it does not rejoin the cluster after the connection is re-established. In other words, the rest of the nodes no longer know about the seed node.

Is there a way to tell the rest of the cluster to "poll" the seed nodes periodically and, if they respond, "rejoin" the cluster with those nodes?

Thanks!

Patrik Nordwall

Jul 3, 2014, 2:08:38 AM
to akka...@googlegroups.com
On Thu, Jul 3, 2014 at 4:46 AM, Eugene Dzhurinsky <jdev...@gmail.com> wrote:
> Hello!
>
> I have a scenario in which only one seed node is configured.
>
> If this node becomes unreachable due to a network problem, it does not rejoin the cluster after the connection is re-established. In other words, the rest of the nodes no longer know about the seed node.

So you had one cluster consisting of all nodes, and then a network split caused it to be split into two separate clusters?
That does not really have anything to do with seed nodes. Seed nodes are only used as initial contact points when joining new nodes.

The split into two clusters must be because of downing of unreachable nodes. Perhaps you use auto-downing?
If you didn't down the nodes, they would have detected each other as unreachable during the network split, but when the network connection is re-established they would detect each other as reachable again, and the nodes would still all be part of the same cluster.
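
As an illustration of the settings involved, here is a minimal configuration sketch. It assumes Akka 2.3.x; the system name `ClusterSystem` and the host names `hostA` and `hostB` are hypothetical. With `auto-down-unreachable-after` left at `off`, unreachable members are not downed automatically, so they can be detected as reachable again once the partition heals.

```hocon
akka {
  actor.provider = "akka.cluster.ClusterActorRefProvider"
  remote.netty.tcp {
    hostname = "hostB"   # hypothetical host name of this node
    port = 2552
  }
  cluster {
    # Initial contact points only; used when this node first joins.
    seed-nodes = ["akka.tcp://ClusterSystem@hostA:2552"]
    # "off" is the default: unreachable members stay in the cluster
    # until they are downed manually or become reachable again.
    auto-down-unreachable-after = off
  }
}
```

Setting `auto-down-unreachable-after` to a duration instead would cause each side of a partition to down the other, which matches the split into two clusters described above.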
 

> Is there a way to tell the rest of the cluster to "poll" the seed nodes periodically and, if they respond, "rejoin" the cluster with those nodes?

An ActorSystem can only be part of one cluster, and only once. Rejoining, or joining two different clusters, is not supported. To join the cluster again, the ActorSystem must be stopped and started again.

/Patrik
 

> Thanks!

--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to akka-user+...@googlegroups.com.
To post to this group, send email to akka...@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.



--

Patrik Nordwall
Typesafe Reactive apps on the JVM
Twitter: @patriknw

Eugene Dzhurinsky

Jul 3, 2014, 4:28:34 PM
to akka...@googlegroups.com
> So you had one cluster consisting of all nodes, and then a network split caused it to be split into two separate clusters?
> That does not really have anything to do with seed nodes. Seed nodes are only used as initial contact points when joining new nodes.

Hi!

Let's assume there are 3 nodes: A, B and C. Node A is the seed node for B and C. So when the cluster starts, B joins A and C joins A, then they gossip, and then B knows about C and vice versa.

Now A becomes unreachable, so both B and C mark it as "quarantined". After A comes back online, it knows about neither B nor C, and neither B nor C will even try to restore the connection to A.

If A has some "actors" to deploy on node B, it will not be able to do so.

The only way to resolve this seems to be to make A, B and C all "seed" nodes, so that if any of them "disconnects", it will rejoin after the connection is re-established (provided the threshold has not been exceeded yet). But I'm not sure what happens if there is a delay in the startup of node B: will it be marked as inaccessible?

Patrik Nordwall

Jul 4, 2014, 3:38:47 AM
to akka...@googlegroups.com
On Thu, Jul 3, 2014 at 10:28 PM, Eugene Dzhurinsky <jdev...@gmail.com> wrote:
>> So you had one cluster consisting of all nodes, and then a network split caused it to be split into two separate clusters?
>> That does not really have anything to do with seed nodes. Seed nodes are only used as initial contact points when joining new nodes.
>
> Hi!
>
> Let's assume there are 3 nodes: A, B and C. Node A is the seed node for B and C. So when the cluster starts, B joins A and C joins A, then they gossip, and then B knows about C and vice versa.

Yes, then my understanding was correct.

> Now A becomes unreachable, so both B and C mark it as "quarantined". After A comes back online, it knows about neither B nor C, and neither B nor C will even try to restore the connection to A.
>
> If A has some "actors" to deploy on node B, it will not be able to do so.
>
> The only way to resolve this seems to be to make A, B and C all "seed" nodes, so that if any of them "disconnects", it will rejoin after the connection is re-established (provided the threshold has not been exceeded yet). But I'm not sure what happens if there is a delay in the startup of node B: will it be marked as inaccessible?

I repeat myself, this has nothing to do with seed nodes. Please re-read my previous reply, and this section: http://doc.akka.io/docs/akka/2.3.4/scala/cluster-usage.html#Joining_to_Seed_Nodes

/Patrik
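
The documentation section linked above also covers configuring more than one seed node. A minimal sketch, assuming Akka 2.3.x, with the system name `ClusterSystem` and the hosts `hostA` and `hostB` being hypothetical:

```hocon
akka.cluster {
  # Contact points for joining. The first entry must be running when the
  # cluster is started for the very first time; after that, any reachable
  # seed node can be used by a new or restarted node to join.
  seed-nodes = [
    "akka.tcp://ClusterSystem@hostA:2552",
    "akka.tcp://ClusterSystem@hostB:2552"
  ]
}
```

Listing more seed nodes only gives joining nodes more contact points. It does not make a member that has been downed and removed rejoin: as stated earlier in the thread, that ActorSystem has to be restarted before it can join again.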
 


Ryadh khsib

Jul 4, 2014, 5:12:09 AM
to akka...@googlegroups.com
Hi Patrik,
Reading this:
> An ActorSystem can only be part of one cluster, and only once. Rejoining, or joining two different clusters, is not supported. To join the cluster again, the ActorSystem must be stopped and started again.
>
> /Patrik

Can you confirm that a new ActorSystem should be created for every cluster?
We are using singleton and router clusters within the same ActorSystem. Would you recommend creating 2 actor systems?

Thanks,
Ryadh

Patrik Nordwall

Jul 4, 2014, 5:40:33 AM
to akka...@googlegroups.com
On Fri, Jul 4, 2014 at 11:12 AM, Ryadh khsib <riadh...@gmail.com> wrote:
> Hi Patrik,
> Reading this:
>> An ActorSystem can only be part of one cluster, and only once. Rejoining, or joining two different clusters, is not supported. To join the cluster again, the ActorSystem must be stopped and started again.
>>
>> /Patrik
>
> Can you confirm that a new ActorSystem should be created for every cluster?

An ActorSystem is created locally, in the JVM of each machine. We often refer to this as a cluster node, or a cluster member.
A Cluster is a set of nodes (actor systems) working together.
 
> We are using singleton and router clusters within the same ActorSystem. Would you recommend creating 2 actor systems?

I don't understand what you mean by "singleton and router clusters", but I assume you mean that you use Cluster Singleton and Cluster Aware Routers. You can use these two features in the same ActorSystem.

/Patrik

Ryadh khsib

Jul 4, 2014, 9:10:35 AM
to akka...@googlegroups.com
I mean that we deployed the following within the same actor system:
- singleton actor: scheduled task
- router: the singleton distributes the work to workers using a router

I presume that to implement a singleton actor, some sort of cluster is needed. The same thing applies to setting up a router.

Am I totally wrong?

Thanks,
Ryadh


--
Ryadh Khsib
Software Engineer

email: ryadh...@gmail.com

Patrik Nordwall

Jul 4, 2014, 9:20:49 AM
to akka...@googlegroups.com
On Fri, Jul 4, 2014 at 3:10 PM, Ryadh khsib <ryadh...@gmail.com> wrote:
> I mean that we deployed the following within the same actor system:
> - singleton actor: scheduled task
> - router: the singleton distributes the work to workers using a router
>
> I presume that to implement a singleton actor, some sort of cluster is needed.

Yes

> The same thing applies to setting up a router.

Well, you can have local routers: http://doc.akka.io/docs/akka/2.3.4/java/routing.html
But for cluster aware routers you need a cluster, that is right.

/Patrik

Ryadh khsib

Jul 4, 2014, 9:24:08 AM
to akka...@googlegroups.com
OK, we are using a cluster aware router to distribute the load.

This means that we have 2 clusters deployed on the same ActorSystem, is that correct?

If that is the case, should we deploy each "cluster" in a separate ActorSystem?

Ryadh

Martynas Mickevičius

Jul 7, 2014, 7:22:01 AM
to akka...@googlegroups.com
Hi Ryadh,

An ActorSystem can only join a cluster once.

Have you seen the documentation on Cluster Usage? http://doc.akka.io/docs/akka/2.3.4/java/cluster-usage.html

The relevant excerpt from that page:

An actor system can only join a cluster once. Additional attempts will be ignored.
When it has successfully joined it must be restarted to be able to join another
cluster or to join the same cluster again. It can use the same host name and port
after the restart, but it must have been removed from the cluster before the join
request is accepted.

So it is not the case that a cluster is deployed on an ActorSystem. Rather, an ActorSystem joins a particular cluster.

Martynas Mickevičius
Typesafe Reactive Apps on the JVM

Ryadh khsib

Jul 7, 2014, 8:56:37 AM
to akka...@googlegroups.com
Thanks Martynas! 

It seems that clusters are a higher-level entity:
Cluster -> ActorSystem -> Actor

AFAIK, clusters are created by ActorSystems when configured to do so, which is probably the source of the confusion (a lower-level entity creating a higher-level entity).

Cheers,
Ryadh

√iktor Ҡlang

Jul 7, 2014, 9:05:09 AM
to Akka User List
A cluster is just a name for cooperating ActorSystems using a specific protocol. They are not reified.
Cheers,

Ryadh khsib

Jul 7, 2014, 9:16:04 AM
to akka...@googlegroups.com
Thanks. Your definition makes perfect sense. Could we say that a Singleton and a ClusterAwareRouter are some sort of 'applications' running on top of the cluster? In that case it should be allowed to have multiple 'applications' running on the same cluster.

Ryadh


Martynas Mickevičius

Jul 8, 2014, 6:05:29 AM
to akka...@googlegroups.com
Hi Ryadh,

Cluster Singleton and Cluster Aware Router can be used together in the same cluster.
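
As a rough illustration of the router half of such a setup, a cluster aware pool router can be declared in the configuration of the same ActorSystem that hosts the singleton. This is only a sketch assuming Akka 2.3.x; the actor path `/scheduler/workerRouter`, the role name `worker`, and the instance counts are hypothetical and would need to match the actual application.

```hocon
akka.actor.deployment {
  # Hypothetical path of the router actor, relative to /user.
  /scheduler/workerRouter {
    router = round-robin-pool
    # Upper bound on the total number of routees in the cluster.
    nr-of-instances = 10
    cluster {
      enabled = on
      max-nr-of-instances-per-node = 2
      allow-local-routees = on
      # Only deploy routees on members that have this role.
      use-role = worker
    }
  }
}
```

The actor at that path would create the router from configuration (for example with `FromConfig`), while the singleton itself is started through the Cluster Singleton support. Both run on the one cluster that the ActorSystem has joined.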