More feedback on Akka Cluster (2.1.0-RC1)

224 views
Skip to first unread message

Evan Chan

unread,
Oct 25, 2012, 7:15:34 PM10/25/12
to akka...@googlegroups.com
I know Akka Cluster is still marked as experimental, but I wanted to give more detailed feedback on some of the testing today, as well as post some questions.   

  1. ClusterAwareRouter / lookup routees by name:    This seems like the most stable config so far.   Removing and adding nodes back seems to lead to the correct rebalancing every time.
  2. It's really too bad that you can only look up one routee per node.   Can that routee be itself a local router?   Otherwise we would have to be bound to one mailbox for all incoming requests per node, and that won't scale.
  3. How hard would it be to change the code so that routees-path could take a wildcard?   The idea is that all the actors discovered in the selection would be added to the router config. 
  4. ClusterAwareRouter / deploy routees:   I don't understand the intention of this.  
    • If only one master deploys the actors, then the router is only usable from that one master.
    • If multiple masters all deploy the actors, then you end up with (# masters) * (max-nr-of-instances-per-node) * (# nodes) actors, which is probably not what you want, plus each master's  router will only know about the routees it created, and not the other created ones.
    • If you remove one node, and then add it back, the actors are not redeployed.
    • Ultimately what I want is to have (n-instances-per-node), with routers on all nodes being able to talk to all (the same) instances.
  5. You can get an exception easily if you have each node start up a router which attempts to deploy to the other cluster members.    Not sure if this is really supported scenario.   Even with one master, if you remove one of the nodes (like in StatsSampleOneMaster), you may get an exception like this:
    • scala.MatchError: Actor[akka://ClusterSystem/remote/Cluste...@127.0.0.1:2551/user/statsService/workerRouter/c6] (of class akka.actor.LocalActorRef)
    • at akka.remote.RemoteSystemDaemon$$anonfun$$bang$3.apply(RemoteDaemon.scala:105)
    • at akka.remote.RemoteSystemDaemon$$anonfun$$bang$3.apply(RemoteDaemon.scala:105)
    • at akka.actor.VirtualPathContainer.foreachChild(ActorRef.scala:535)
    • at akka.remote.RemoteSystemDaemon.$bang(RemoteDaemon.scala:105)
    • at akka.event.EventStream.publish(EventStream.scala:40)

I would really love to use the Cluster aware router stuff, because I want to be able to use different underlying routers, like RoundRobin, Random, and ConsistentHash.   However, it seems that today only the lookup-one-routee-per-node method really works.

thanks,
Evan

Patrik Nordwall

unread,
Oct 26, 2012, 6:45:56 AM10/26/12
to akka...@googlegroups.com
On Fri, Oct 26, 2012 at 1:15 AM, Evan Chan <e...@ooyala.com> wrote:
I know Akka Cluster is still marked as experimental, but I wanted to give more detailed feedback on some of the testing today, as well as post some questions.   

  1. ClusterAwareRouter / lookup routees by name:    This seems like the most stable config so far.   Removing and adding nodes back seems to lead to the correct rebalancing every time.
Good. Yes that setup makes sense in a lot of cases. 
  1. It's really too bad that you can only look up one routee per node.   Can that routee be itself a local router?   Otherwise we would have to be bound to one mailbox for all incoming requests per node, and that won't scale.
The routee can create and delegate to it's own children to increase parallelism. Yes, it can be a router. 
  1. How hard would it be to change the code so that routees-path could take a wildcard?   The idea is that all the actors discovered in the selection would be added to the router config. 
We will start as is, to keep it simple, and recommend child worker-workers if one worker per node is a bottleneck. I think that's already in the documentation.
  1. ClusterAwareRouter / deploy routees:   I don't understand the intention of this.  
    • If only one master deploys the actors, then the router is only usable from that one master.
Correct. It's only usable via that one master. In the sample there is the Facade that delegates to the master.
Note that this master-worker scenario is needed for some types of jobs, but certainly not for all, and then lookup of existing routees makes  more sense. We have already discussed that it would be useful if routers in the cluster could be used without sending via the "head" router, and we have an idea for that, but it's in the future.
    • If multiple masters all deploy the actors, then you end up with (# masters) * (max-nr-of-instances-per-node) * (# nodes) actors, which is probably not what you want, plus each master's  router will only know about the routees it created, and not the other created ones.
Yes, that is probably not what you want, and that's why I used the single-master scenario in the sample.
    • If you remove one node, and then add it back, the actors are not redeployed.
I can't reproduce that. Works fine for me. If you can write  down steps of how to reproduce a ticket is much welcome. 
    • Ultimately what I want is to have (n-instances-per-node), with routers on all nodes being able to talk to all (the same) instances.
The the lookup of routees is the way to go. 
  1. You can get an exception easily if you have each node start up a router which attempts to deploy to the other cluster members.    Not sure if this is really supported scenario.   Even with one master, if you remove one of the nodes (like in StatsSampleOneMaster), you may get an exception like this:
    • scala.MatchError: Actor[akka://ClusterSystem/remote/Cluste...@127.0.0.1:2551/user/statsService/workerRouter/c6] (of class akka.actor.LocalActorRef)
    • at akka.remote.RemoteSystemDaemon$$anonfun$$bang$3.apply(RemoteDaemon.scala:105)
    • at akka.remote.RemoteSystemDaemon$$anonfun$$bang$3.apply(RemoteDaemon.scala:105)
    • at akka.actor.VirtualPathContainer.foreachChild(ActorRef.scala:535)
    • at akka.remote.RemoteSystemDaemon.$bang(RemoteDaemon.scala:105)
    • at akka.event.EventStream.publish(EventStream.scala:40)


Yes, I can also produce it. That should be fixed. Please open a ticket: http://doc.akka.io/docs/akka/current/project/issue-tracking.html
 
I would really love to use the Cluster aware router stuff, because I want to be able to use different underlying routers, like RoundRobin, Random, and ConsistentHash.   However, it seems that today only the lookup-one-routee-per-node method really works.

Thanks for your feedback. Looking forward to more of it.
/Patrik
 

thanks,
Evan

--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: http://akka.io/faq/
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka User List" group.
To post to this group, send email to akka...@googlegroups.com.
To unsubscribe from this group, send email to akka-user+...@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user?hl=en.
 
 



--

Patrik Nordwall
Typesafe The software stack for applications that scale
Twitter: @patriknw


Patrik Nordwall

unread,
Oct 26, 2012, 11:06:25 AM10/26/12
to akka...@googlegroups.com
I created the ticket: https://www.assembla.com/spaces/akka/tickets/2660
Akka Team
Typesafe - The software stack for applications that scale
Blog: letitcrash.com
Twitter: @akkateam

√iktor Ҡlang

unread,
Oct 26, 2012, 11:14:54 AM10/26/12
to akka...@googlegroups.com
Hi Evan,

thanks for the feedback! It's exactly this kind of feedback we want!

On Fri, Oct 26, 2012 at 1:15 AM, Evan Chan <e...@ooyala.com> wrote:
I know Akka Cluster is still marked as experimental, but I wanted to give more detailed feedback on some of the testing today, as well as post some questions.   

  1. ClusterAwareRouter / lookup routees by name:    This seems like the most stable config so far.   Removing and adding nodes back seems to lead to the correct rebalancing every time.
  2. It's really too bad that you can only look up one routee per node.   Can that routee be itself a local router?   Otherwise we would have to be bound to one mailbox for all incoming requests per node, and that won't scale.

Technically I have a hard time believing that going across the network to a single node will have more throughput than simply reading messages off a mailbox on the other side. i.e. that single actor endpoint won't be the bottleneck, the network will.
 
  1. How hard would it be to change the code so that routees-path could take a wildcard?   The idea is that all the actors discovered in the selection would be added to the router config. 
The question is "when". It's all asynchronous.

 
  1. ClusterAwareRouter / deploy routees:   I don't understand the intention of this.  
    • If only one master deploys the actors, then the router is only usable from that one master.
    • If multiple masters all deploy the actors, then you end up with (# masters) * (max-nr-of-instances-per-node) * (# nodes) actors, which is probably not what you want, plus each master's  router will only know about the routees it created, and not the other created ones.
    • If you remove one node, and then add it back, the actors are not redeployed.
    • Ultimately what I want is to have (n-instances-per-node), with routers on all nodes being able to talk to all (the same) instances.
  2. You can get an exception easily if you have each node start up a router which attempts to deploy to the other cluster members.    Not sure if this is really supported scenario.   Even with one master, if you remove one of the nodes (like in StatsSampleOneMaster), you may get an exception like this:
    • scala.MatchError: Actor[akka://ClusterSystem/remote/Cluste...@127.0.0.1:2551/user/statsService/workerRouter/c6] (of class akka.actor.LocalActorRef)
    • at akka.remote.RemoteSystemDaemon$$anonfun$$bang$3.apply(RemoteDaemon.scala:105)
    • at akka.remote.RemoteSystemDaemon$$anonfun$$bang$3.apply(RemoteDaemon.scala:105)
    • at akka.actor.VirtualPathContainer.foreachChild(ActorRef.scala:535)
    • at akka.remote.RemoteSystemDaemon.$bang(RemoteDaemon.scala:105)
    • at akka.event.EventStream.publish(EventStream.scala:40)
I think this might tie into a start-up phase where the cluster will await deployment until X number of nodes are up and have joined.

 

I would really love to use the Cluster aware router stuff, because I want to be able to use different underlying routers, like RoundRobin, Random, and ConsistentHash.   However, it seems that today only the lookup-one-routee-per-node method really works.

I'm not sure I follow, what do you mean?

Cheers,
 

thanks,
Evan

--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: http://akka.io/faq/
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka User List" group.
To post to this group, send email to akka...@googlegroups.com.
To unsubscribe from this group, send email to akka-user+...@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user?hl=en.
 
 



--
Viktor Klang

Akka Tech Lead
Typesafe - The software stack for applications that scale

Twitter: @viktorklang

Evan Chan

unread,
Oct 26, 2012, 2:49:53 PM10/26/12
to akka...@googlegroups.com
Hi Patrik,


On Friday, October 26, 2012 3:45:59 AM UTC-7, Patrik Nordwall wrote:


On Fri, Oct 26, 2012 at 1:15 AM, Evan Chan <e...@ooyala.com> wrote:
I know Akka Cluster is still marked as experimental, but I wanted to give more detailed feedback on some of the testing today, as well as post some questions.   

  1. ClusterAwareRouter / lookup routees by name:    This seems like the most stable config so far.   Removing and adding nodes back seems to lead to the correct rebalancing every time.
Good. Yes that setup makes sense in a lot of cases. 
  1. It's really too bad that you can only look up one routee per node.   Can that routee be itself a local router?   Otherwise we would have to be bound to one mailbox for all incoming requests per node, and that won't scale.
The routee can create and delegate to it's own children to increase parallelism. Yes, it can be a router. 
There is a more fundamental problem than scalability, actually.    Suppose I want to use a ConsistentHashingRouter.  The state of the hashing is kept in the master router itself.  If I have a local router on each node, would it be able to sub-hash on the message?     IE, the flow becomes:

Master node:  ConsistentHashingRouter:
    - based on the hashing key, it routes the message to the remote routee

Remote node:  remote routee is itself a consistent hashing router
   - Does this router see the hashing envelope?
    - It needs to do sub-hashing somehow

I suppose this pattern would work (if the hashing key is passed to the remote routee.... ), but it would be clumsy.


With round-robin and random, the hierarchy of routers would be less of a problem.

-Evan

Patrik Nordwall

unread,
Oct 27, 2012, 3:09:36 AM10/27/12
to akka...@googlegroups.com
On Fri, Oct 26, 2012 at 8:49 PM, Evan Chan <e...@ooyala.com> wrote:
Hi Patrik,

On Friday, October 26, 2012 3:45:59 AM UTC-7, Patrik Nordwall wrote:


On Fri, Oct 26, 2012 at 1:15 AM, Evan Chan <e...@ooyala.com> wrote:
I know Akka Cluster is still marked as experimental, but I wanted to give more detailed feedback on some of the testing today, as well as post some questions.   

  1. ClusterAwareRouter / lookup routees by name:    This seems like the most stable config so far.   Removing and adding nodes back seems to lead to the correct rebalancing every time.
Good. Yes that setup makes sense in a lot of cases. 
  1. It's really too bad that you can only look up one routee per node.   Can that routee be itself a local router?   Otherwise we would have to be bound to one mailbox for all incoming requests per node, and that won't scale.
The routee can create and delegate to it's own children to increase parallelism. Yes, it can be a router. 
There is a more fundamental problem than scalability, actually.    Suppose I want to use a ConsistentHashingRouter.  The state of the hashing is kept in the master router itself.  If I have a local router on each node, would it be able to sub-hash on the message?     IE, the flow becomes:

Master node:  ConsistentHashingRouter:
    - based on the hashing key, it routes the message to the remote routee

Remote node:  remote routee is itself a consistent hashing router
   - Does this router see the hashing envelope?
    - It needs to do sub-hashing somehow

The envelope approach doesn't work, but there are two more ways to define the consistent hash key. Have you considered if you can use any of them?

I would also like to mention that the routers are not the silver bullet that fits every problem. It's easy and often very appropriate to use an ordinary actor and implement the "routing" yourself in the actor. Round robin and random are trivial one-liners, and for consistent cashing you can use the akka.routing.ConsistentHash utility. To manage the routee references you can subscribe to cluster events as is done in StatsSampleClient.

Cheers,
Patrik

Evan Chan

unread,
Oct 27, 2012, 12:12:46 PM10/27/12
to akka...@googlegroups.com
Patrick & co.:

Response inlined.  By the way it's quite a privilege to have discussions with a community so well versed in distributed and concurrency problems.

On Sat, Oct 27, 2012 at 12:09 AM, Patrik Nordwall <patrik....@gmail.com> wrote:
There is a more fundamental problem than scalability, actually.    Suppose I want to use a ConsistentHashingRouter.  The state of the hashing is kept in the master router itself.  If I have a local router on each node, would it be able to sub-hash on the message?     IE, the flow becomes:

Master node:  ConsistentHashingRouter:
    - based on the hashing key, it routes the message to the remote routee

Remote node:  remote routee is itself a consistent hashing router
   - Does this router see the hashing envelope?
    - It needs to do sub-hashing somehow

The envelope approach doesn't work, but there are two more ways to define the consistent hash key. Have you considered if you can use any of them?

I would also like to mention that the routers are not the silver bullet that fits every problem. It's easy and often very appropriate to use an ordinary actor and implement the "routing" yourself in the actor. Round robin and random are trivial one-liners, and for consistent cashing you can use the akka.routing.ConsistentHash utility. To manage the routee references you can subscribe to cluster events as is done in StatsSampleClient.

Here is the description of routers from the Akka docs:
"Routers behave like single actors, but they should also not hinder scalability".

Sure I can embed the routing logic in each end point actor, but not only would this violate DRY, but any kind of routing that relies on state (round-robin, or dynamic resizing) would be really difficult to coordinate and not practical.   This is especially true for a cluster wide router.

Routers on a single node (JVM) satisfy this property and are extremely appealing for that reason.   Currently the cluster router does not satisfy the description above.  They may behave like a single actor, but if you pass the cluster aware router ActorRef around the cluster, the routing logic would still be executed back on the node where the cluster router is defined, I believe (would love to be wrong above this), which will hinder scalability.

This is how I would implement a truly transparent, distributed cluster router, which does satisfy the router credo (feedback more than welcome):
- Distribute the cluster router code to different nodes  (if you're using the built in one, done already)
- Let there be a "master" cluster router node which is responsible for deploying routees and handling dynamic resizing
- All of the other copies of the cluster router are slaves and get notified by the master any time the routee list changes
- Local actors use the node-local copy of the router, and routing logic always happens concurrently, on the same node, on the same thread, does not hinder scalability

I'm going to implement some version of this.  This feels different enough from he current cluster router that it probably should be a new thing, like DistributedClusterRouter.   

-Evan
 
Cheers,
Patrik
 

--
--
Evan Chan
Senior Software Engineer | 
e...@ooyala.com | (650) 996-4600
www.ooyala.com | blog | @ooyala

Patrik Nordwall

unread,
Oct 27, 2012, 1:41:42 PM10/27/12
to akka...@googlegroups.com


On Oct 27, 2012, at 18:12, Evan Chan <e...@ooyala.com> wrote:

Patrick & co.:

Response inlined.  By the way it's quite a privilege to have discussions with a community so well versed in distributed and concurrency problems.

On Sat, Oct 27, 2012 at 12:09 AM, Patrik Nordwall <patrik....@gmail.com> wrote:
There is a more fundamental problem than scalability, actually.    Suppose I want to use a ConsistentHashingRouter.  The state of the hashing is kept in the master router itself.  If I have a local router on each node, would it be able to sub-hash on the message?     IE, the flow becomes:

Master node:  ConsistentHashingRouter:
    - based on the hashing key, it routes the message to the remote routee

Remote node:  remote routee is itself a consistent hashing router
   - Does this router see the hashing envelope?
    - It needs to do sub-hashing somehow

The envelope approach doesn't work, but there are two more ways to define the consistent hash key. Have you considered if you can use any of them?

I would also like to mention that the routers are not the silver bullet that fits every problem. It's easy and often very appropriate to use an ordinary actor and implement the "routing" yourself in the actor. Round robin and random are trivial one-liners, and for consistent cashing you can use the akka.routing.ConsistentHash utility. To manage the routee references you can subscribe to cluster events as is done in StatsSampleClient.

Here is the description of routers from the Akka docs:
"Routers behave like single actors, but they should also not hinder scalability".

Sure I can embed the routing logic in each end point actor, but not only would this violate DRY, but any kind of routing that relies on state (round-robin, or dynamic resizing) would be really difficult to coordinate and not practical.   This is especially true for a cluster wide router.

Routers on a single node (JVM) satisfy this property and are extremely appealing for that reason.   Currently the cluster router does not satisfy the description above.  They may behave like a single actor, but if you pass the cluster aware router ActorRef around the cluster, the routing logic would still be executed back on the node where the cluster router is defined, I believe (would love to be wrong above this), which will hinder scalability.

This is how I would implement a truly transparent, distributed cluster router, which does satisfy the router credo (feedback more than welcome):
- Distribute the cluster router code to different nodes  (if you're using the built in one, done already)
- Let there be a "master" cluster router node which is responsible for deploying routees and handling dynamic resizing
- All of the other copies of the cluster router are slaves and get notified by the master any time the routee list changes
- Local actors use the node-local copy of the router, and routing logic always happens concurrently, on the same node, on the same thread, does not hinder scalability

I agree that we should have this in future akka cluster. Exactly which order to implement next steps is a bit unclear. I think actor tree partitioning must be done first, so that this fits in nicely with that (e.g. migration of actors across nodes).


I'm going to implement some version of this.  This feels different enough from he current cluster router that it probably should be a new thing, like DistributedClusterRouter.   

Great, let us know if you need any assistance, or review.
/Patrik


-Evan
 
Cheers,
Patrik
 

--
--
Evan Chan
Senior Software Engineer | 
e...@ooyala.com | (650) 996-4600
www.ooyala.com | blog | @ooyala

--

Evan Chan

unread,
Oct 29, 2012, 5:16:48 PM10/29/12
to akka...@googlegroups.com
Patrik,


On Saturday, October 27, 2012 10:41:49 AM UTC-7, Patrik Nordwall wrote:



I'm going to implement some version of this.  This feels different enough from he current cluster router that it probably should be a new thing, like DistributedClusterRouter.   

Great, let us know if you need any assistance, or review.
/Patrik


What would help is if we could subscribe to the ClusterRouterConfig and its underlying Actor, anytime it decides to add new routees due to a node joining, or similar events.    

I was initially thinking of creating a modified underlying cluster router actor, which sends out notifications to slave routers anytime the routee set changes.      However, the ability to subscribe to cluster router events would be more flexible.    Also, since the underlying cluster router actor code is private[akka], I'd probably have to copy and paste.  :-p

-Evan 

Patrik Nordwall

unread,
Oct 30, 2012, 4:27:17 AM10/30/12
to akka...@googlegroups.com
On Mon, Oct 29, 2012 at 10:16 PM, Evan Chan <e...@ooyala.com> wrote:
Patrik,


On Saturday, October 27, 2012 10:41:49 AM UTC-7, Patrik Nordwall wrote:



I'm going to implement some version of this.  This feels different enough from he current cluster router that it probably should be a new thing, like DistributedClusterRouter.   

Great, let us know if you need any assistance, or review.
/Patrik


What would help is if we could subscribe to the ClusterRouterConfig and its underlying Actor, anytime it decides to add new routees due to a node joining, or similar events.    

I was initially thinking of creating a modified underlying cluster router actor, which sends out notifications to slave routers anytime the routee set changes.      However, the ability to subscribe to cluster router events would be more flexible.    Also, since the underlying cluster router actor code is private[akka], I'd probably have to copy and paste.  :-p


I'm afraid we can't add that because we might want to solve this in a different way when we have the full picture of actor tree partitioning.

So right now you need to hook into existing code, by copy paste or placing your code in akka package. Sorry, I hope you can solve your problem anyway. I suggest that you listen to the cluster events yourself, then you are in full control.

/Patrik

Evan Chan

unread,
Oct 30, 2012, 1:40:02 PM10/30/12
to akka...@googlegroups.com
Patrik,

What are the best practices to follow now so that we can lessen the migration work when the actor tree partitioning stuff is ready?

In the meantime I'll listen for cluster events myself.  It does mean re-implementing some of the stuff that's already in the cluster router stuff though, which is too bad.

Also, I understand from the docs that creating actors locally on a node, and making a remote call into a service that creates actors locally, is preferable to remote deploy of an actor.  Is that right?  It's probably faster too.

-Evan

Patrik Nordwall

unread,
Oct 30, 2012, 3:46:32 PM10/30/12
to akka...@googlegroups.com
On Tue, Oct 30, 2012 at 6:40 PM, Evan Chan <e...@ooyala.com> wrote:
Patrik,

What are the best practices to follow now so that we can lessen the migration work when the actor tree partitioning stuff is ready? 

In the meantime I'll listen for cluster events myself.  It does mean re-implementing some of the stuff that's already in the cluster router stuff though, which is too bad.

Yes, that sounds good, since you need this *now*.
 

Also, I understand from the docs that creating actors locally on a node, and making a remote call into a service that creates actors locally, is preferable to remote deploy of an actor.  Is that right?  It's probably faster too.

You mean this section:

Caveat: Remote deployment ties both systems together in a tight fashion, where it may become impossible to shut down one system after the other has become unreachable. This is due to a missing feature—which will be part of the clustering support—that hooks up network failure detection with DeathWatch. If you want to avoid this strong coupling, do not remote-deploy but send Props to a remotely looked-up actor and have that create a child, returning the resulting actor reference.

That is solved by akka cluster in 2.1.0-RC1, but only for cluster members.

Evan Chan

unread,
Nov 1, 2012, 12:40:35 PM11/1/12
to akka...@googlegroups.com
Patrick,

How about adding a ticket to Assembla to track the issue / feature request of a distributed cluster router?  

-Evan

Evan Chan

unread,
Nov 1, 2012, 12:47:21 PM11/1/12
to akka...@googlegroups.com

Patrik Nordwall

unread,
Nov 1, 2012, 1:17:19 PM11/1/12
to akka...@googlegroups.com
Thanks a lot! This is exactly the type of feedback we need. Keep it coming. 

By the way, did you see the Cluster Death Watch post at letitcrash.com?
Related to something else we discussed the other day.

/Patrik

1 nov 2012 kl. 17:47 skrev Evan Chan <e...@ooyala.com>:

Reply all
Reply to author
Forward
0 new messages