Many problems with akka-cluster-sharding-scala activator


Raman Gupta

Mar 31, 2014, 10:10:44 AM
to akka...@googlegroups.com
I am experimenting with the cluster sharding activator, but am having lots of issues with it. I have tried updating the activator to Akka 2.3.1, but to no avail (and other issues show up, such as the one described here: https://www.assembla.com/spaces/akka/simple_planner#/ticket:3967).

Problems noticed so far:

1) 100% of the time, the activator sends a lot of messages to the ClusterSystem deadLetters on startup of the seed node. Here is one example:

[INFO] [03/31/2014 09:37:00.654] [ClusterSystem-akka.actor.default-dispatcher-2] [akka://ClusterSystem/deadLetters] Message [akka.cluster.InternalClusterAction$InitJoin$] from Actor[akka://ClusterSystem/system/cluster/core/daemon/firstSeedNodeProcess#-438400827] to Actor[akka://ClusterSystem/deadLetters] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[ ... many more akka.cluster.InternalClusterAction$InitJoin$ messages ... ]
[INFO] [03/31/2014 09:37:05.518] [ClusterSystem-akka.actor.default-dispatcher-14] [akka://ClusterSystem/system/cluster/core/daemon/firstSeedNodeProcess] Message [akka.dispatch.sysmsg.Terminate] from Actor[akka://ClusterSystem/system/cluster/core/daemon/firstSeedNodeProcess#-438400827] to Actor[akka://ClusterSystem/system/cluster/core/daemon/firstSeedNodeProcess#-438400827] was not delivered. [6] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[... JOINING and Up message...]
[INFO] [03/31/2014 09:37:06.516] [ClusterSystem-akka.actor.default-dispatcher-16] [akka://ClusterSystem/user/sharding/AuthorListingCoordinator/singleton] Message [akka.contrib.pattern.ShardCoordinator$Internal$Register] from Actor[akka://ClusterSystem/user/sharding/AuthorListing#1471529820] to Actor[akka://ClusterSystem/user/sharding/AuthorListingCoordinator/singleton] was not delivered. [7] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[INFO] [03/31/2014 09:37:06.516] [ClusterSystem-akka.actor.default-dispatcher-16] [akka://ClusterSystem/user/sharding/PostCoordinator/singleton] Message [akka.contrib.pattern.ShardCoordinator$Internal$Register] from Actor[akka://ClusterSystem/user/sharding/Post#589187748] to Actor[akka://ClusterSystem/user/sharding/PostCoordinator/singleton] was not delivered. [8] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.

2) Using the default shared LevelDB journal configuration, sometimes (but not always) when the Bot node is started, the seed node goes nuts:

[INFO] [03/31/2014 09:46:00.768] [ClusterSystem-akka.actor.default-dispatcher-3] [Cluster(akka://ClusterSystem)] Cluster Node [akka.tcp://Cluste...@127.0.0.1:2551] - Leader is moving node [akka.tcp://Cluste...@127.0.0.1:50327] to [Up]
Uncaught error from thread [ClusterSystem-akka.remote.default-remote-dispatcher-24] shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[ClusterSystem]
Uncaught error from thread [ClusterSystem-akka.actor.default-dispatcher-17] shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[ClusterSystem]
Uncaught error from thread [ClusterSystem-akka.actor.default-dispatcher-28] shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[ClusterSystem]
Uncaught error from thread [ClusterSystem-akka.actor.default-dispatcher-29] shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[ClusterSystem]
[...keeps going forever...]
^CJava HotSpot(TM) 64-Bit Server VM warning: Exception java.lang.OutOfMemoryError occurred dispatching signal SIGINT to handler- the VM may need to be forcibly terminated

3) When it is working, the shared LevelDB journal seems to work reasonably well (apart from the single point of failure on the first node). However, when I switch to either of the MongoDB replicated journals in contrib and test various combinations of node failures, things go nuts: DuplicateKeyExceptions (looping infinitely), OutOfMemoryErrors, and other weirdness. I know these are early implementations, but the similarity of the failures across the two different journal implementations makes me think the problems may lie not in the journals but in akka-persistence itself.

4) When restarting the Bot node, there are lots of WARNings about unknown UIDs (the following message keeps repeating for Bots that have been shut down -- i.e. the node never appears to be actually removed from the cluster, even after the entire cluster is restarted):

[WARN] [03/31/2014 10:01:40.280] [ClusterSystem-akka.remote.default-remote-dispatcher-5] [Remoting] Association to [akka.tcp://Cluste...@127.0.0.1:50327] with unknown UID is reported as quarantined, but address cannot be quarantined without knowing the UID, gating instead for 5000 ms.
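For context on problem 3, the journal is selected in the activator's application.conf. A minimal sketch of the default shared-LevelDB setup (keys as in Akka 2.3 persistence; the store directory is illustrative):

```hocon
# Default in the activator: a shared LevelDB journal hosted on the first node
# (hence the single point of failure mentioned in problem 3).
akka.persistence.journal.plugin = "akka.persistence.journal.leveldb-shared"
akka.persistence.journal.leveldb-shared.store {
  native = off
  dir = "target/shared-journal"
}
# Problem 3 shows up after replacing the plugin id above with one of the
# MongoDB contrib journals (the exact plugin id depends on that journal's docs).
```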

Has anyone else done any experimentation with akka-cluster-sharding?

Regards,
Raman

delasoul

Apr 1, 2014, 4:07:35 AM
to akka...@googlegroups.com
Hello,

(I have tested with Akka 2.3.0, akka-persistence-mongo-casbah 0.4-SNAPSHOT and running the BlogApp in different processes)

I can confirm points 1 and 4, but these have no influence on how the application works. The dead-letter and gating messages always appear when starting the first seed node, but everything works fine once the other nodes join the cluster.
I don't see duplicate key exceptions or OOMs, but when I stop the first seed node, after a while the remaining nodes start to fail with a NoSuchElementException on every shard lookup (see the exception log at the end of this post).
As said, this only happens when stopping the first seed node; if I stop the second seed node, or other BlogApps started with port 0, and then restart them in various orders, everything works fine.

hth,

michael

[ERROR] [04/01/2014 09:37:46.131] [ClusterSystem-akka.actor.default-dispatcher-16] [akka://ClusterSystem/user/sharding/AuthorListingCoordinator/singleton] key not found: Actor[akka://ClusterSystem/user/sharding/AuthorListing#-1841768893]
java.util.NoSuchElementException: key not found: Actor[akka://ClusterSystem/user/sharding/AuthorListing#-1841768893]
        at scala.collection.MapLike$class.default(MapLike.scala:228)
        at scala.collection.AbstractMap.default(Map.scala:58)
        at scala.collection.MapLike$class.apply(MapLike.scala:141)
        at scala.collection.AbstractMap.apply(Map.scala:58)
        at akka.contrib.pattern.ShardCoordinator$Internal$State.updated(ClusterSharding.scala:1055)
        at akka.contrib.pattern.ShardCoordinator$$anonfun$receiveRecover$1.applyOrElse(ClusterSharding.scala:1162)
        at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
        at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
        at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
        at akka.persistence.Eventsourced$$anonfun$1.applyOrElse(Eventsourced.scala:196)
        at akka.persistence.Recovery$State$$anonfun$processPersistent$1.apply(Recovery.scala:31)
        at akka.persistence.Recovery$State$$anonfun$processPersistent$1.apply(Recovery.scala:31)
        at akka.persistence.Recovery$State$class.withCurrentPersistent(Recovery.scala:42)
        at akka.persistence.Recovery$$anon$1.withCurrentPersistent(Recovery.scala:105)
        at akka.persistence.Recovery$State$class.processPersistent(Recovery.scala:31)
        at akka.persistence.Recovery$$anon$1.processPersistent(Recovery.scala:105)
        at akka.persistence.Recovery$$anon$1.aroundReceive(Recovery.scala:110)
        at akka.persistence.Recovery$class.aroundReceive(Recovery.scala:242)
        at akka.contrib.pattern.ShardCoordinator.akka$persistence$Eventsourced$$super$aroundReceive(ClusterSharding.scala:1132)
        at akka.persistence.Eventsourced$$anon$1.aroundReceive(Eventsourced.scala:29)
        at akka.persistence.Eventsourced$class.aroundReceive(Eventsourced.scala:172)
        at akka.contrib.pattern.ShardCoordinator.aroundReceive(ClusterSharding.scala:1132)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
        at akka.actor.ActorCell.invoke(ActorCell.scala:487)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
        at akka.dispatch.Mailbox.run(Mailbox.scala:220)
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)


delasoul

Apr 1, 2014, 9:57:18 AM
to akka...@googlegroups.com
Looking at the log and the code, I don't fully understand what's going on:
It looks like when the ShardCoordinator is moved to another node and replayed, the regions Map
of the internal State is empty?
What I also don't understand is why ShardRegionTerminated events for the same regions keep coming over and over again?
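For reference, the exception itself is the standard Scala behaviour of an unguarded Map#apply on a missing key, which is what a replayed-but-empty regions Map would trigger. A generic illustration (not the actual ClusterSharding internals):

```scala
// Generic illustration, not the actual ClusterSharding code:
// Map#apply throws NoSuchElementException for a missing key,
// while Map#get returns an Option and lets the caller handle absence.
val regions = Map("regionA" -> Set("shard1"))

val threw =
  try { regions("regionB"); false }   // apply on a missing key: throws
  catch { case _: NoSuchElementException => true }

val safe = regions.get("regionB")     // get on a missing key: None
```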

[INFO] [04/01/2014 13:01:29.983] [ClusterSystem-akka.actor.default-dispatcher-23] [akka://ClusterSystem/deadLetters] Message [akka.cluster.ClusterHeartbeatSender$Heartbeat] from Actor[akka://ClusterSystem/system/cluster/core/daemon/heartbeatSender#1982459892] to Actor[akka://ClusterSystem/deadLetters] was not delivered. [10] dead letters encountered, no more dead letters will be logged. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[WARN] [04/01/2014 13:01:31.273] [ClusterSystem-akka.actor.default-dispatcher-24] [akka.tcp://Cluste...@127.0.0.1:2552/system/cluster/core/daemon] Cluster Node [akka.tcp://ClusterS...@127.0.0.1:2552] - Marking node(s) as UNREACHABLE [Member(address = akka.tcp://Cluster...@127.0.0.1:2551, status = Up)]
[WARN] [04/01/2014 13:01:33.003] [ClusterSystem-akka.remote.default-remote-dispatcher-5] [akka.tcp://Cluste...@127.0.0.1:2552/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%40127.0.0.1%3A2551-0] Association with remote system [akka.tcp://Cluste...@127.0.0.1:2551] has failed, address is now gated for [5000] ms. Reason is: [Association failed with [akka.tcp://Cluste...@127.0.0.1:2551]].
[WARN] [04/01/2014 13:01:39.994] [ClusterSystem-akka.remote.default-remote-dispatcher-6] [akka.tcp://Cluste...@127.0.0.1:2552/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%40127.0.0.1%3A2551-0] Association with remote system [akka.tcp://Cluste...@127.0.0.1:2551] has failed, address is now gated for [5000] ms. Reason is: [Association failed with [akka.tcp://Cluste...@127.0.0.1:2551]].
[INFO] [04/01/2014 13:01:41.301] [ClusterSystem-akka.actor.default-dispatcher-23] [Cluster(akka://ClusterSystem)] Cluster Node [akka.tcp://Cluste...@127.0.0.1:2552] - Leader is auto-downing unreachable node [akka.tcp://Cluste...@127.0.0.1:2551]
[INFO] [04/01/2014 13:01:41.311] [ClusterSystem-akka.actor.default-dispatcher-23] [Cluster(akka://ClusterSystem)] Cluster Node [akka.tcp://Cluste...@127.0.0.1:2552] - Marking unreachable node [akka.tcp://Cluste...@127.0.0.1:2551] as [Down]
[INFO] [04/01/2014 13:01:42.281] [ClusterSystem-akka.actor.default-dispatcher-25] [Cluster(akka://ClusterSystem)] Cluster Node [akka.tcp://Cluste...@127.0.0.1:2552] - Leader is removing unreachable node [akka.tcp://Cluste...@127.0.0.1:2551]
[INFO] [04/01/2014 13:01:42.301] [ClusterSystem-akka.actor.default-dispatcher-20] [akka.tcp://Cluste...@127.0.0.1:2552/user/sharding/PostCoordinator] Previous oldest removed [akka.tcp://Cluste...@127.0.0.1:2551]
[INFO] [04/01/2014 13:01:42.301] [ClusterSystem-akka.actor.default-dispatcher-27] [akka.tcp://Cluste...@127.0.0.1:2552/user/sharding/AuthorListingViewCoordinator] Previous oldest removed [akka.tcp://Cluste...@127.0.0.1:2551]
[INFO] [04/01/2014 13:01:42.301] [ClusterSystem-akka.actor.default-dispatcher-27] [akka.tcp://Cluste...@127.0.0.1:2552/user/sharding/AuthorListingViewCoordinator] Younger observed OldestChanged: [None -> myself]
[INFO] [04/01/2014 13:01:42.301] [ClusterSystem-akka.actor.default-dispatcher-27] [akka.tcp://Cluste...@127.0.0.1:2552/user/sharding/AuthorListingViewCoordinator] Singleton manager [akka.tcp://Cluste...@127.0.0.1:2552] starting singleton actor
[INFO] [04/01/2014 13:01:42.301] [ClusterSystem-akka.actor.default-dispatcher-27] [akka.tcp://Cluste...@127.0.0.1:2552/user/sharding/AuthorListingViewCoordinator] ClusterSingletonManager state change [Younger -> Oldest]
[INFO] [04/01/2014 13:01:42.301] [ClusterSystem-akka.actor.default-dispatcher-17] [akka.tcp://Cluste...@127.0.0.1:2552/user/sharding/AuthorListingCoordinator] Previous oldest removed [akka.tcp://Cluste...@127.0.0.1:2551]
[INFO] [04/01/2014 13:01:42.311] [ClusterSystem-akka.actor.default-dispatcher-16] [akka.tcp://Cluste...@127.0.0.1:2552/user/sharding/PostCoordinator] Younger observed OldestChanged: [None -> myself]
[INFO] [04/01/2014 13:01:42.311] [ClusterSystem-akka.actor.default-dispatcher-16] [akka.tcp://Cluste...@127.0.0.1:2552/user/sharding/PostCoordinator] Singleton manager [akka.tcp://Cluste...@127.0.0.1:2552] starting singleton actor
[INFO] [04/01/2014 13:01:42.311] [ClusterSystem-akka.actor.default-dispatcher-16] [akka.tcp://Cluste...@127.0.0.1:2552/user/sharding/PostCoordinator] ClusterSingletonManager state change [Younger -> Oldest]
[INFO] [04/01/2014 13:01:42.311] [ClusterSystem-akka.actor.default-dispatcher-21] [akka.tcp://Cluste...@127.0.0.1:2552/user/sharding/AuthorListingCoordinator] Younger observed OldestChanged: [None -> myself]
[INFO] [04/01/2014 13:01:42.311] [ClusterSystem-akka.actor.default-dispatcher-21] [akka.tcp://Cluste...@127.0.0.1:2552/user/sharding/AuthorListingCoordinator] Singleton manager [akka.tcp://Cluste...@127.0.0.1:2552] starting singleton actor
[INFO] [04/01/2014 13:01:42.311] [ClusterSystem-akka.actor.default-dispatcher-21] [akka.tcp://Cluste...@127.0.0.1:2552/user/sharding/AuthorListingCoordinator] ClusterSingletonManager state change [Younger -> Oldest]
[ERROR] [04/01/2014 13:01:43.121] [ClusterSystem-akka.actor.default-dispatcher-28] [akka:/
[ERROR] [04/01/2014 13:01:43.131] [ClusterSystem-akka.actor.default-dispatcher-24] [akka://ClusterSystem/user/sharding/AuthorListingViewCoordinator/singleton] key not found: Actor[akka://ClusterSystem/user/sharding/AuthorListingView#-164535745]
java.util.NoSuchElementException: key not found: Actor[akka://ClusterSystem/user/sharding/AuthorListingView#-164535745]
[ERROR] [04/01/2014 13:01:43.161] [ClusterSystem-akka.actor.default-dispatcher-16] [akka://ClusterSystem/user/sharding/PostCoordinator/singleton] key not found: Actor[akka://ClusterSystem/user/sharding/Post#-585985228]
java.util.NoSuchElementException: key not found: Actor[akka://ClusterSystem/user/sharding/Post#-585985228]
[ERROR] [04/01/2014 13:01:43.211] [ClusterSystem-akka.actor.default-dispatcher-16] [akka://ClusterSystem/user/sharding/AuthorListingViewCoordinator/singleton] key not found: Actor[akka://ClusterSystem/user/sharding/AuthorListingView#-164535745]
java.util.NoSuchElementException: key not found: Actor[akka://ClusterSystem/user/sharding/AuthorListingView#-164535745]
[ERROR] [04/01/2014 13:01:43.231] [ClusterSystem-akka.actor.default-dispatcher-28] [akka:/
[ERROR] [04/01/2014 13:01:43.281] [ClusterSystem-akka.actor.default-dispatcher-26] [akka://ClusterSystem/user/sharding/PostCoordinator/singleton] key not found: Actor[akka://ClusterSystem/user/sharding/Post#-585985228]
java.util.NoSuchElementException: key not found: Actor[akka://ClusterSystem/user/sharding/Post#-585985228]
[ERROR] [04/01/2014 13:01:43.301] [ClusterSystem-akka.actor.default-dispatcher-16] [akka:/
[ERROR] [04/01/2014 13:01:43.301] [ClusterSystem-akka.actor.default-dispatcher-16] [akka://ClusterSystem/user/sharding/AuthorListingViewCoordinator/singleton] key not found: Actor[akka://ClusterSystem/user/sharding/AuthorListingView#-164535745]
java.util.NoSuchElementException: key not found: Actor[akka://ClusterSystem/user/sharding/AuthorListingView#-164535745]
[ERROR] [04/01/2014 13:01:43.361] [ClusterSystem-akka.actor.default-dispatcher-26] [akka://ClusterSystem/user/sharding/AuthorListingViewCoordinator/singleton] key not found: Actor[akka://ClusterSystem/user/sharding/AuthorListingView#-164535745]
java.util.NoSuchElementException: key not found: Actor[akka://ClusterSystem/user/sharding/AuthorListingView#-164535745]
[ERROR] [04/01/2014 13:01:43.371] [ClusterSystem-akka.actor.default-dispatcher-24] [akka:/
[ERROR] [04/01/2014 13:01:43.381] [ClusterSystem-akka.actor.default-dispatcher-16] [akka://ClusterSystem/user/sharding/PostCoordinator/singleton] key not found: Actor[akka://ClusterSystem/user/sharding/Post#-585985228]
java.util.NoSuchElementException: key not found: Actor[akka://ClusterSystem/user/sharding/Post#-585985228]
[ERROR] [04/01/2014 13:01:43.421] [ClusterSystem-akka.actor.default-dispatcher-16] [akka://ClusterSystem/user/sharding/AuthorListingViewCoordinator/singleton] key not found: Actor[akka://ClusterSystem/user/sharding/AuthorListingView#-164535745]
java.util.NoSuchElementException: key not found: Actor[akka://ClusterSystem/user/sharding/AuthorListingView#-164535745]
[ERROR] [04/01/2014 13:01:43.431] [ClusterSystem-akka.actor.default-dispatcher-25] [akka:/
[ERROR] [04/01/2014 13:01:43.461] [ClusterSystem-akka.actor.default-dispatcher-24] [akka://ClusterSystem/user/sharding/PostCoordinator/singleton] key not found: Actor[akka://ClusterSystem/user/sharding/Post#-585985228]
java.util.NoSuchElementException: key not found: Actor[akka://ClusterSystem/user/sharding/Post#-585985228]

Raman Gupta

Apr 1, 2014, 10:23:27 AM
to akka...@googlegroups.com
All right, at least I figured out the OOM problem: the sbt packaged with Fedora 20 does not set the PermGen size, so it uses the JVM default of 64 MB, which is too small for sbt / Akka. That was probably causing a lot of my issues. In case anyone cares, I created:
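(For anyone on the same setup, the workaround is simply to give the sbt JVM a bigger PermGen via SBT_OPTS. The values below are illustrative, and -XX:MaxPermSize only applies up to Java 7, since Java 8 replaced PermGen with Metaspace.)

```shell
# Illustrative values; sbt picks SBT_OPTS up automatically,
# so a subsequent `sbt run` gets the larger PermGen.
export SBT_OPTS="-Xmx1024m -XX:MaxPermSize=256m"
echo "$SBT_OPTS"
```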


That took care of a lot of weirdness! There are still issues, however. Here is another error, found by starting and stopping the 2552 node several times, specifically stopping it immediately after a "New post saved:" message:

Seen on the bot:

[INFO] [04/01/2014 10:20:58.686] [ClusterSystem-akka.actor.default-dispatcher-22] [akka.tcp://Cluste...@127.0.0.1:56185/user/sharding/AuthorListingCoordinator] Member removed [akka.tcp://Cluste...@127.0.0.1:2552]
[ERROR] [04/01/2014 10:21:00.044] [ClusterSystem-akka.actor.default-dispatcher-3] [akka://ClusterSystem/user/sharding/AuthorListing] actor name must not be empty
akka.actor.InvalidActorNameException: actor name must not be empty
        at akka.actor.dungeon.Children$class.checkName(Children.scala:180)
        at akka.actor.dungeon.Children$class.actorOf(Children.scala:38)
        at akka.actor.ActorCell.actorOf(ActorCell.scala:369)
        at akka.contrib.pattern.ShardRegion$$anonfun$6.apply(ClusterSharding.scala:802)
        at akka.contrib.pattern.ShardRegion$$anonfun$6.apply(ClusterSharding.scala:798)
        at scala.Option.getOrElse(Option.scala:120)
        at akka.contrib.pattern.ShardRegion.deliverMessage(ClusterSharding.scala:798)
        at akka.contrib.pattern.ShardRegion$$anonfun$receiveCoordinatorMessage$2.apply(ClusterSharding.scala:694)
        at akka.contrib.pattern.ShardRegion$$anonfun$receiveCoordinatorMessage$2.apply(ClusterSharding.scala:693)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
        at akka.contrib.pattern.ShardRegion.receiveCoordinatorMessage(ClusterSharding.scala:693)
        at akka.contrib.pattern.ShardRegion$$anonfun$receive$3.applyOrElse(ClusterSharding.scala:656)
        at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
        at akka.contrib.pattern.ShardRegion.aroundReceive(ClusterSharding.scala:586)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
        at akka.actor.ActorCell.invoke(ActorCell.scala:487)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
        at akka.dispatch.Mailbox.run(Mailbox.scala:220)
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)


Regards,
Raman

Patrik Nordwall

Apr 2, 2014, 7:47:19 AM
to akka...@googlegroups.com
Hi Raman and Michael,

I distilled this down to two remaining issues:

1. NoSuchElementException at ClusterSharding.scala:1055
That looks like a bug. Please create a ticket with a description of how to reproduce.

2. InvalidActorNameException: actor name must not be empty, at ClusterSharding.scala:802
That means the id is "", which is not meaningful and not supported. We should add a check and handle it in a better way. Ticket, please.

Have I missed anything else?

/Patrik
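Until such a check exists in the library, the empty-id case can be guarded on the application side in the idExtractor. A minimal sketch, where AddPost is a made-up command purely for illustration and only the PartialFunction shape matches the Akka 2.3 contrib API:

```scala
// Shape as in akka.contrib.pattern.ShardRegion (Akka 2.3):
//   IdExtractor = PartialFunction[Any, (EntryId, Msg)], with EntryId = String.
// AddPost is hypothetical, for illustration only.
final case class AddPost(postId: String, content: String)

val idExtractor: PartialFunction[Any, (String, Any)] = {
  // Refuse empty ids up front: an empty entry id otherwise surfaces later
  // as InvalidActorNameException("actor name must not be empty").
  case cmd @ AddPost(postId, _) if postId.nonEmpty => (postId, cmd)
}
```

Messages with an empty id then simply fall outside the partial function instead of reaching the ShardRegion with an unusable entry id.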



--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to akka-user+...@googlegroups.com.
To post to this group, send email to akka...@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.



--

Patrik Nordwall
Typesafe Reactive apps on the JVM
Twitter: @patriknw

delasoul

Apr 2, 2014, 8:32:06 AM
to akka...@googlegroups.com
Hello Patrik,

I have created a ticket:

https://www.assembla.com/spaces/akka/tickets/3974

Please let me know if you need anything else.

michael





Raman Gupta

Apr 2, 2014, 1:27:53 PM
to akka...@googlegroups.com
I created https://www.assembla.com/spaces/akka/tickets/3975 re #2.

There is also one other (relatively minor) issue: the 8-10 dead letters on cluster startup. Do you consider that a bug? If so, I shall create a ticket for that as well.

Regards,
Raman



Patrik Nordwall

Apr 2, 2014, 4:23:35 PM
to akka...@googlegroups.com



On 2 Apr 2014, at 19:27, Raman Gupta <rocke...@gmail.com> wrote:


Thanks a lot. 


There is also the one other (relatively minor) issue of the 8-10 dead letters on cluster startup. Do you consider that a bug? If so, I shall create a ticket for that as well.

Dead letter logging is not a bug. You can turn that off if it's disturbing.
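For reference, those are the two settings the dead-letter log lines themselves name; in application.conf:

```hocon
akka.log-dead-letters = off
akka.log-dead-letters-during-shutdown = off
```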

/Patrik

Patrik Nordwall

Apr 3, 2014, 2:21:25 AM
to akka...@googlegroups.com
On Wed, Apr 2, 2014 at 2:32 PM, delasoul <michael...@gmx.at> wrote:

Thanks, I can't reproduce, but I have an idea of what can be wrong. We can continue the discussion in the ticket.

Patrik Nordwall

Apr 3, 2014, 2:42:50 AM
to akka...@googlegroups.com
That problem is in the sample as I have described in the ticket.
Thanks for trying it out in anger.

/Patrik

Patrik Nordwall

Apr 11, 2014, 11:18:56 AM
to akka...@googlegroups.com
For the record, the issues listed here were fixed in Akka 2.3.2.

Thanks for your help finding and narrowing down the issues.

Cheers,
Patrik