I'm seeing a scenario where a cluster member is killed by Marathon due to out-of-control memory growth because a persistent actor stops processing messages. When Marathon kills the cluster member, the cluster role doesn't recover because the shard coordinators run into problems. My log is full of exceptions. I'm wondering whether the first one is a possible bug, given the "requirement failed" message.
2017-01-13 04:47:25.626 [ERROR] [report-compute.07dbb0f4-d92e-11e6-80f8-0aedbb57963c] [reportCompute] [eplworkerslave12.lhr.manhattan.aspect-cloud.net:31871] [PersistentShardCoordinator] Exception in receiveRecover when replaying event type [akka.cluster.sharding.ShardCoordinator$Internal$ShardRegionRegistered] with sequence number [12] for persistenceId [/sharding/reportCompute.mr427.worktypesCoordinator].
java.lang.IllegalArgumentException: requirement failed: Region Actor[akka.tcp://manhattan@eplworkerslave4.lhr.manhattan.aspect-cloud.net:31708/system/sharding/reportCompute.mr427.worktypes#447621404] already registered: State(Map(),Map(Actor[akka.tcp://manhattan@eplworkerslave4.lhr.manhattan.aspect-cloud.net:31708/system/sharding/reportCompute.mr427.worktypes#447621404] -> Vector()),Set(),Set(),false)
at scala.Predef$.require(Predef.scala:224)
at akka.cluster.sharding.ShardCoordinator$Internal$State.updated(ShardCoordinator.scala:276)
at akka.cluster.sharding.PersistentShardCoordinator$$anonfun$receiveRecover$1.applyOrElse(ShardCoordinator.scala:742)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at akka.persistence.Eventsourced$$anon$3$$anonfun$1.applyOrElse(Eventsourced.scala:481)
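For illustration only, here is a minimal sketch of how a precondition like the one in the trace above can fail during replay. It assumes (not confirmed by anything in this thread) that the journal ends up containing two registration events for the same region. The class `CoordinatorReplaySketch`, its `State`, and the `replay` helper are hypothetical stand-ins, not Akka's actual code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CoordinatorReplaySketch {
    // Hypothetical, simplified stand-in for coordinator state: region -> allocated shards.
    static final class State {
        final Map<String, List<String>> regions = new LinkedHashMap<>();

        // Applies a "region registered" event; rejects a region that is already known,
        // similar in spirit to the require(...) reported in the stack trace.
        void regionRegistered(String region) {
            if (regions.containsKey(region)) {
                throw new IllegalArgumentException(
                    "requirement failed: Region " + region + " already registered");
            }
            regions.put(region, new ArrayList<>());
        }
    }

    // Replays registration events in order.
    // Returns the message of the first failure, or null if replay succeeds.
    static String replay(List<String> registeredRegions) {
        State state = new State();
        try {
            for (String region : registeredRegions) {
                state.regionRegistered(region);
            }
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Assumed scenario: the same region was journaled twice.
        System.out.println(replay(Arrays.asList("regionA", "regionA")));
    }
}
```

In this toy model, a journal with a single registration replays cleanly, while a journal containing the same registration twice aborts replay on the second event.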
Along with that, I have a large number of these messages:
[ReplayFilter] Invalid replayed event [sequenceNr=5, writerUUID=59b217ca-b6f6-4725-aa92-e5930d734daa]. There was already a newer writer whose last replayed event was [sequenceNr=5, writerUUID=c3a77da7-08c4-489c-b158-6c717270713d] for the same persistenceId [/sharding/reportCompute.pens.interactionsCoordinator]. Perhaps, the old writer kept journaling messages after the new writer created, or duplicate persistentId for different entities?
-Richard
[ShardRegion] Trying to register to coordinator at [Some(ActorSelection[Anchor(akka://manhattan/), Path(/system/sharding/reportCompute.pens.worktypesCoordinator/singleton/coordinator)])], but no acknowledgement. Total [100000] buffered messages.
Patrik Nordwall
Akka Tech Lead
Lightbend - Reactive apps on the JVM
Twitter: @patriknw