Runtime issue with Cluster Sharding.


kraythe

Jun 10, 2016, 3:18:43 PM
to Akka User List
Greetings, 

I apologize if this has been asked before, but I am having what I assume is a config problem. When I start a single node I get the following logged errors and my sharded actors don't start. The errors look like this: 

2016-06-10 14:07:01 -0500 - [INFO] - Message [akka.cluster.InternalClusterAction$InitJoin$] from Actor[akka://application/system/cluster/core/daemon/joinSeedNodeProcess-1#-956463865] to Actor[akka://application/deadLetters] was not delivered. [10] dead letters encountered, no more dead letters will be logged. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.

2016-06-10 14:07:01 -0500 - [WARN] - Trying to register to coordinator at [None], but no acknowledgement. Total [368] buffered messages.

2016-06-10 14:07:03 -0500 - [WARN] - Trying to register to coordinator at [None], but no acknowledgement. Total [368] buffered messages.

2016-06-10 14:07:05 -0500 - [WARN] - Trying to register to coordinator at [None], but no acknowledgement. Total [368] buffered messages.

2016-06-10 14:07:06 -0500 - [WARN] - Association with remote system [akka.tcp://ClusterSystem@127.0.0.1:2551] has failed, address is now gated for [5000] ms. Reason: [Association failed with [akka.tcp://ClusterSystem@127.0.0.1:2551]] Caused by: [Connection refused: /127.0.0.1:2551]

As you can see, there seems to be a problem starting sharding. So I checked my config and build.sbt, and it seems to me I have my ducks in a row. Here is the akka.conf file (which is included by application.conf in a Play 2.5 app): 

akka {
  log-dead-letters-during-shutdown = off
  extensions = [
    "com.romix.akka.serialization.kryo.KryoSerializationExtension$",
    "akka.cluster.metrics.ClusterMetricsExtension"
  ]

  actor {
    provider = "akka.cluster.ClusterActorRefProvider"
    serializers {
      java = "akka.serialization.JavaSerializer"
      proto = "akka.remote.serialization.ProtobufSerializer"
      // FIXME define bindings in code for config.
      kryo = "com.romix.akka.serialization.kryo.KryoSerializer"
    }

    # For documentation, see: https://github.com/romix/akka-kryo-serialization
    kryo {
      type = "graph"
      idstrategy = "automatic"
      buffer-size = 4096
      max-buffer-size = -1
      use-manifests = false
      post-serialization-transformations = "off"
      kryo-custom-serializer-init = "distributed.serialization.SerializationConfigUtil"
      implicit-registration-logging = true
      kryo-trace = false
    }

    # default dispatcher used by Play
    default-dispatcher {
      # This will be used if you have set "executor = "fork-join-executor""
      fork-join-executor {
        # Min number of threads to cap factor-based parallelism number to
        parallelism-min = 8

        # The parallelism factor is used to determine thread pool size using the
        # following formula: ceil(available processors * factor). Resulting size
        # is then bounded by the parallelism-min and parallelism-max values.
        parallelism-factor = 4.0

        # Max number of threads to cap factor-based parallelism number to
        parallelism-max = 64

        # Setting to "FIFO" to use queue like peeking mode which "poll" or "LIFO" to use stack
        # like peeking mode which "pop".
        task-peeking-mode = "FIFO"
      }
    }
  }

  # See http://doc.akka.io/docs/akka/snapshot/general/configuration.html#config-akka-remote
  remote {
    log-remote-lifecycle-events = off
    enabled-transports = ["akka.remote.netty.tcp"]
    netty.tcp {
      # This causes the server to select a random available port in local mode and is important for running multiple
      # nodes on the same machine. It is overridden in environments.
      port = 0
    }
  }

  cluster {
    // FIXME can't use static config!
    seed-nodes = [
      "akka.tcp://ClusterSystem@127.0.0.1:2551"
    ]

    metrics {
      enabled = on
      native-library-extract-folder = ${user.dir}/target/native
    }

    # auto downing is NOT safe for production deployments.
    # you may want to use it during development, read more about it in the docs.
    #
    # auto-down-unreachable-after = 10s
  }

  akka {
    persistence {
      journal.plugin = "akka-persistence-sql-async.journal"
      snapshot-store.plugin = "akka-persistence-sql-async.snapshot-store"
    }
  }

  akka-persistence-sql-async {
    journal.class = "akka.persistence.journal.sqlasync.MySQLAsyncWriteJournal"
    snapshot-store.class = "akka.persistence.snapshot.sqlasync.MySQLSnapshotStore"

    user = ${db.default.username}
    password = ${db.default.password}
    url = ${db.default.url}
    max-pool-size = 4
    wait-queue-capacity = 10000

    metadata-table-name = "akka_persistence_metadata"
    journal-table-name = "akka_persistence_journal"
    snapshot-table-name = "akka_persistence_snapshots"
  }
}

And the build.sbt 

libraryDependencies ++= Seq(
  javaCore,
  javaJpa,
  javaJdbc,
  filters,
  cache,
  javaWs,
  // other dependencies omitted
  "com.typesafe.akka" % "akka-cluster_2.11" % "2.4.6",
  "com.typesafe.akka" % "akka-cluster-sharding_2.11" % "2.4.6",
  "com.typesafe.akka" % "akka-cluster-tools_2.11" % "2.4.6",
  "com.typesafe.akka" % "akka-cluster-metrics_2.11" % "2.4.6",
  // Used by akka-persistence
  "com.okumin" %% "akka-persistence-sql-async" % "0.3.1",
  "com.github.mauricio" %% "mysql-async" % "0.2.16"
)

Any help would be appreciated. 

Rafał Siwiec

Jun 11, 2016, 3:24:10 PM
to Akka User List
akka.remote.netty.tcp.port: instead of 0 you should use 2551, so the node actually listens on the port that the seed-node address points at.
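A minimal sketch of that change in HOCON, assuming the actor-system name `ClusterSystem` that appears in the log output above (the name portion of the seed-node address must match your actual system name):

```hocon
akka {
  remote.netty.tcp {
    hostname = "127.0.0.1"
    # Fixed port, so the seed-node entry below can point at this node.
    port = 2551
  }
  cluster {
    # Must match the remoting settings above exactly:
    # protocol, actor-system name, host, and port.
    seed-nodes = ["akka.tcp://ClusterSystem@127.0.0.1:2551"]
  }
}
```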

kraythe

Jun 11, 2016, 5:38:43 PM
to Akka User List
What if I don't want any auto-join at all? I took out the seed nodes and I get the same problem.

Justin du coeur

Jun 12, 2016, 10:57:29 AM
to akka...@googlegroups.com
Well, the problem *looks* to me like the cluster isn't joining up at all. So if you don't use auto-join, then you need to set the join up programmatically. (I don't recall offhand how that works; the seed-node approach is the usual one.)
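For what it's worth, a hedged sketch of the programmatic route in Java (this is a Play/Java app); it assumes Akka 2.4's akka.cluster.Cluster extension, and in Play you would use the injected ActorSystem rather than creating one yourself:

```java
import akka.actor.ActorSystem;
import akka.cluster.Cluster;

public class SelfJoin {
    public static void main(String[] args) {
        // Requires akka.actor.provider = "akka.cluster.ClusterActorRefProvider"
        // and a fixed akka.remote.netty.tcp.port in the config.
        // "application" is Play's default ActorSystem name.
        ActorSystem system = ActorSystem.create("application");
        Cluster cluster = Cluster.get(system);
        // Join this node to itself; no seed-nodes entry is needed in that case.
        cluster.join(cluster.selfAddress());
    }
}
```

With this approach the seed-nodes list can be left empty, which sidesteps hard-coding IPs in config.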

--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to akka-user+...@googlegroups.com.
To post to this group, send email to akka...@googlegroups.com.
Visit this group at https://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.

kraythe

Jun 12, 2016, 12:09:54 PM
to Akka User List
Can you run Cluster Sharding on a single node, or is that not possible? I even tried to use the default LevelDB plugin for journaling, and no dice. My most recent akka config is below. Thanks a bunch for your reply; this one really has me stumped. I have tried three dozen things and read every doc and Google link I can get my hands on three times.  

        # The parallelism factor is used to determine thread pool size using the following formula:
        # ceil(available processors * factor). Resulting size is then bounded by the parallelism-min and
        # parallelism-max values.
        parallelism-factor = 4.0

        # Max number of threads to cap factor-based parallelism number to
        parallelism-max = 64

        # Setting to "FIFO" to use queue like peeking mode which "poll" or "LIFO" to use stack like peeking mode
        # which "pop".
        task-peeking-mode = "FIFO"
      }
    }
  }

  # See http://doc.akka.io/docs/akka/snapshot/general/configuration.html#config-akka-remote
  remote {
    log-remote-lifecycle-events = off
    enabled-transports = ["akka.remote.netty.tcp"]
    netty.tcp {
      # This causes the server to select a random available port in local mode and is important for running multiple
      # nodes on the same machine. It is overridden in environments.
      port = 0
    }
  }

  cluster {
    // FIXME can't use static seed nodes config!
    seed-nodes = []

    metrics {
      enabled = on
      native-library-extract-folder = ${user.dir}/target/native
    }

    # auto downing is NOT safe for production deployments.
    # you may want to use it during development, read more about it in the docs.
    # auto-down-unreachable-after = 10s

    # Settings for the ClusterShardingExtension
    sharding {
      # The extension creates a top level actor with this name in top level system scope,
      # e.g. '/system/sharding'
      guardian-name = sharding

      # Specifies that entities run on cluster nodes with a specific role. If the role is not specified (or empty)
      # all nodes in the cluster are used.
      role = ""

      # When this is set to 'on' the active entity actors will automatically be restarted upon Shard restart, i.e.
      # if the Shard is started on a different ShardRegion due to rebalance or crash.
      remember-entities = off

      # If the coordinator can't store state changes it will be stopped and started again after this duration, with
      # an exponential back-off of up to 5 times this duration.
      coordinator-failure-backoff = 5 s

      # The ShardRegion retries registration and shard location requests to the ShardCoordinator with this interval
      # if it does not reply.
      retry-interval = 2 s

      # Maximum number of messages that are buffered by a ShardRegion actor.
      buffer-size = 100000

      # Timeout of the shard rebalancing process.
      handoff-timeout = 60 s

      # Time given to a region to acknowledge it's hosting a shard.
      shard-start-timeout = 10 s

      # If the shard is remembering entities and can't store state changes, it will be stopped and then started again
      # after this duration. Any messages sent to an affected entity may be lost in this process.
      shard-failure-backoff = 10 s

      # If the shard is remembering entities and an entity stops itself without using passivate, the entity will
      # be restarted after this duration or when the next message for it is received, whichever occurs first.
      entity-restart-backoff = 10 s

      # Rebalance check is performed periodically with this interval.
      rebalance-interval = 10 s

      # Absolute path to the journal plugin configuration entity that is to be used for the internal persistence of
      # ClusterSharding. If not defined the default journal plugin is used. Note that this is not related to
      # persistence used by the entity actors.
      journal-plugin-id = ""

      # Absolute path to the snapshot plugin configuration entity that is to be used for the internal persistence
      # of ClusterSharding. If not defined the default snapshot plugin is used. Note that this is not related to
      # persistence used by the entity actors.
      snapshot-plugin-id = ""

      # Parameter which determines how the coordinator will store its state; valid values are "persistence" or
      # "ddata". The "ddata" mode is experimental, since it depends on the experimental module
      # akka-distributed-data-experimental.
      state-store-mode = "persistence"

      # The shard saves persistent snapshots after this number of persistent events. Snapshots are used to reduce
      # recovery times.
      snapshot-after = 1000

      # Settings for the default shard allocation strategy
      least-shard-allocation-strategy {
        # Threshold of how large the difference between most and least number of allocated shards must be to begin
        # the rebalancing.
        rebalance-threshold = 10

        # The number of ongoing rebalancing processes is limited to this number.
        max-simultaneous-rebalance = 3
      }

      # Timeout of waiting for the initial distributed state (the initial state will be queried again if the
      # timeout is reached); works only for state-store-mode = "ddata"
      waiting-for-state-timeout = 5 s

      # Timeout of waiting for an update of the distributed state (the update will be retried if the timeout is
      # reached); works only for state-store-mode = "ddata"
      updating-state-timeout = 5 s

      # Settings for the coordinator singleton. Same layout as akka.cluster.singleton. The "role" of the singleton
      # configuration is not used. The singleton role will be the same as "akka.cluster.sharding.role".
      coordinator-singleton = ${akka.cluster.singleton}

      # The id of the dispatcher to use for ClusterSharding actors. If not specified the default dispatcher is used.
      # If specified you need to define the settings of the actual dispatcher. The dispatcher for the entity actors
      # is defined by the user-provided Props, i.e. this dispatcher is not used for the entity actors.
      use-dispatcher = ""
    }
  }

  persistence {
    journal {
      # Absolute path to the journal plugin configuration entry used by persistent actor or view by default.
      # Persistent actor or view can override `journalPluginId` method in order to rely on a different journal plugin.
      # plugin = "akka-persistence-sql-async.journal"
      plugin = "akka.persistence.journal.leveldb"
      leveldb-shared.store {
        # DO NOT USE 'native = off' IN PRODUCTION !!!
        native = off
        dir = "target/shared-journal"
      }

      # List of journal plugins to start automatically. Use "" for the default journal plugin.
      auto-start-journals = ["akka.persistence.journal.leveldb"]
    }

    snapshot-store {
      # Absolute path to the snapshot plugin configuration entry used by persistent actor or view by default.
      # Persistent actor or view can override `snapshotPluginId` method in order to rely on a different snapshot plugin.
      # It is not mandatory to specify a snapshot store plugin. If you don't use snapshots you don't have to configure
      # it. Note that Cluster Sharding is using snapshots, so if you use Cluster Sharding you need to define a snapshot
      # store plugin.
      # plugin = "akka-persistence-sql-async.snapshot-store"
      plugin = "akka.persistence.snapshot-store.local"
      local.dir = "target/snapshots"
      # List of snapshot stores to start automatically. Use "" for the default snapshot store.
      auto-start-snapshot-stores = ["akka.persistence.snapshot-store.local"]
    }

    # https://github.com/okumin/akka-persistence-sql-async
    # journal.plugin = "akka-persistence-sql-async.journal"
    # snapshot-store.plugin = "akka-persistence-sql-async.snapshot-store"
  }
}

//akka-persistence-sql-async {
//  journal.class = "akka.persistence.journal.sqlasync.MySQLAsyncWriteJournal"
//  snapshot-store.class = "akka.persistence.snapshot.sqlasync.MySQLSnapshotStore"
//
//  user = ${db.default.username}
//  password = ${db.default.password}
//  url = ${db.default.url}
//  max-pool-size = 4
//  wait-queue-capacity = 10000
//
//  metadata-table-name = "akka_persistence_metadata"
//  journal-table-name = "akka_persistence_journal"
//  snapshot-table-name = "akka_persistence_snapshots"
//}


Thanks for your time.

kraythe

Jun 12, 2016, 12:13:25 PM
to Akka User List
Damn, the previous reply cut off my config. Here it is; I've cut out everything that was commented out. I just want to start ONE node for now. Here is my config, actually... come on Google, you can do this!

akka {
  log-dead-letters-during-shutdown = off
  extensions = [
    "com.romix.akka.serialization.kryo.KryoSerializationExtension$",
    "akka.cluster.metrics.ClusterMetricsExtension"
  ]

  actor {
    provider = "akka.cluster.ClusterActorRefProvider"
    serializers {
      java = "akka.serialization.JavaSerializer"
      proto = "akka.remote.serialization.ProtobufSerializer"
      // FIXME define bindings in code for config.
      kryo = "com.romix.akka.serialization.kryo.KryoSerializer"
    }

    kryo {
      type = "graph"
      idstrategy = "automatic"
      buffer-size = 4096
      max-buffer-size = -1
      use-manifests = false
      post-serialization-transformations = "off"
      kryo-custom-serializer-init = "distributed.serialization.SerializationConfigUtil"
      implicit-registration-logging = true
      kryo-trace = false
    }

    default-dispatcher {
      fork-join-executor {
        parallelism-min = 8
        parallelism-factor = 4.0
        parallelism-max = 64
        task-peeking-mode = "FIFO"
      }
    }
  }

  remote {
    log-remote-lifecycle-events = off
    enabled-transports = ["akka.remote.netty.tcp"]
    netty.tcp {
      port = 0
    }
  }

  cluster {
    // FIXME can't use static seed nodes config!
    seed-nodes = []

    metrics {
      enabled = on
      native-library-extract-folder = ${user.dir}/target/native
    }

    sharding {
      guardian-name = sharding
      role = ""
      remember-entities = off
      coordinator-failure-backoff = 5 s
      retry-interval = 2 s
      buffer-size = 100000
      handoff-timeout = 60 s
      shard-start-timeout = 10 s
      shard-failure-backoff = 10 s
      entity-restart-backoff = 10 s
      rebalance-interval = 10 s
      journal-plugin-id = ""
      snapshot-plugin-id = ""
      state-store-mode = "persistence"
      snapshot-after = 1000
      least-shard-allocation-strategy {
        rebalance-threshold = 10
        max-simultaneous-rebalance = 3
      }
      waiting-for-state-timeout = 5 s
      updating-state-timeout = 5 s
      coordinator-singleton = ${akka.cluster.singleton}
      use-dispatcher = ""
    }
  }

  persistence {
    journal {
      plugin = "akka.persistence.journal.leveldb"
      leveldb-shared.store {
        native = off
        dir = "target/shared-journal"
      }

      auto-start-journals = ["akka.persistence.journal.leveldb"]
    }

    snapshot-store {
      plugin = "akka.persistence.snapshot-store.local"
      local.dir = "target/snapshots"
      auto-start-snapshot-stores = ["akka.persistence.snapshot-store.local"]
    }
  }
}






kraythe

Jun 12, 2016, 12:42:40 PM
to Akka User List
I am wondering if there is some problem because I am running Akka inside Play 2.5 as an integrated environment. It would suck if that's a problem for cluster config, because it would mean I'd have to separate the systems, and that just isn't happening any time soon, if at all. 



Patrik Nordwall

Jun 12, 2016, 1:22:40 PM
to Akka User List
You can use it on a single node, but you must still join that node to itself.

/Patrik

kraythe

Jun 12, 2016, 1:44:22 PM
to Akka User List
There is a joke in there somewhere. :) 

I just have to figure out how to do that. I have not had luck so far.

kraythe

Jun 12, 2016, 2:20:08 PM
to Akka User List
  cluster {
    // FIXME can't use static seed nodes config!
    seed-nodes = [
      "akka.tcp://application@w.x.y.z:2551"
      "akka.tcp://application@localhost:2551"
    ]

So the first line for the seed nodes works, but the second doesn't. The problem is that the IPs of our machines vary depending on the DNS of our VPN. Is there any way around this in config? If I were in Java I could get the address of the machine directly. Once I get it connecting, it starts complaining about leveldb, but that is at least progress. I am wondering where the name "application" came from rather than "ClusterSystem". Any ideas? 
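Getting the machine's address directly from Java, as mentioned above, only needs the standard library. A minimal sketch (the class and method names here are just for illustration):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class LocalAddress {
    // Resolve this machine's address as the JDK sees it; fall back to
    // loopback if the local host name cannot be resolved.
    static String localHostAddress() {
        try {
            return InetAddress.getLocalHost().getHostAddress();
        } catch (UnknownHostException e) {
            return InetAddress.getLoopbackAddress().getHostAddress();
        }
    }

    public static void main(String[] args) {
        // On multi-homed machines (e.g. with a VPN up) this may pick an
        // interface you did not expect, so check it against the log output.
        System.out.println(localHostAddress());
    }
}
```

The resulting string could then be handed to the JVM as a system property override, e.g. -Dakka.remote.netty.tcp.hostname=..., instead of hard-coding an IP in the seed list; property overrides are the standard Typesafe Config mechanism.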


Patrik Nordwall

Jun 12, 2016, 4:11:44 PM
to Akka User List
application is the default ActorSystem name in Play.

You can join programmatically:

val cluster = Cluster(system)
cluster.join(cluster.selfAddress)

/Patrik

kraythe

Jun 12, 2016, 4:21:38 PM
to Akka User List
For the next person with this problem: the resolution is that even if you run with only one node, you have to either configure seed nodes or manually/programmatically join the cluster for the sharding system to work. My error was not specifying the host name exactly in my remoting settings and matching it in the seed-node list. Also, the actor system name must match exactly; in my case Play Framework 2.5 used the name "application". Here are the relevant parts of the fixed config. 

akka {
  actor {
    provider = "akka.cluster.ClusterActorRefProvider"
  }

  # See http://doc.akka.io/docs/akka/snapshot/general/configuration.html#config-akka-remote
  remote {
    log-remote-lifecycle-events = off
    enabled-transports = ["akka.remote.netty.tcp"]
    netty.tcp {
      hostname = 127.0.0.1
      port = 2551
    }
  }

  cluster {
    # Configuration of seed nodes. Note that this must match the akka.remote.netty settings EXACTLY for the node
    # to join with or it won't work correctly.
    seed-nodes = [
      "akka.tcp://application@127.0.0.1:2551"
      "akka.tcp://application@127.0.0.1:2552"
    ]
  }
}


Ryan Tanner

Jun 12, 2016, 7:21:13 PM
to Akka User List
The mention of a VPN has me tingling. Are you trying to join a cluster across a WAN?

kraythe

Jun 13, 2016, 3:45:29 AM
to Akka User List
No, it was only local. But my VPN changes my IP all the time.