CPU-usage and a lot of actors with cluster-systems


Flo B.

Nov 17, 2015, 4:10:37 PM
to Akka User List
Hello everyone,

I have a question about CPU usage with a large number of actors. I'm currently experimenting with a cluster of 15 machines; each machine has 2x Intel(R) Xeon(R) CPU E5-2450 v2 @ 2.50GHz (8 cores, 16 threads per CPU).
One of the machines serves as the measurement client.

For example, I run 4 machines in a cluster configuration that together host 20,000 actors. I send a number of messages to each actor at once and measure the time between sending and receiving on the measurement client. I found that I can only send a certain number of messages to the actors: more messages lead to flow control warnings, and the client does not get all answers in time.
If I use 12 machines, the cluster can handle more messages, which makes sense.

What surprised me is the CPU usage in the 4-machine configuration. The CPU usage per machine is only around 1000% (out of a possible 3200%, as reported by the Linux top command).

I use the following dispatcher configuration. Every other configuration I tried (e.g. a bigger core-pool-size or a bigger/smaller throughput) gave worse timing results and higher CPU usage.

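myDispatcher {
    type = Dispatcher
    executor = "thread-pool-executor"
    thread-pool-executor {
        # pool size = available processors * factor, bounded by min and max
        core-pool-size-min = 16
        core-pool-size-factor = 2.0
        core-pool-size-max = 32
    }
    # maximum number of messages an actor processes before its thread moves on to the next actor
    throughput = 60
}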
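
As a rough check of what these settings amount to, the sketch below works through the arithmetic in plain Java. It assumes that the core pool size is computed as ceil(processors * factor) and then bounded by the min and max values, and that the JVM sees all 32 hardware threads; the class and method names are made up for illustration.

```java
// Sketch: pool size and CPU utilization implied by the numbers in the post.
final class PoolSizing {
    // Assumed sizing rule: ceil(processors * factor), bounded by [min, max].
    static int scaledPoolSize(int processors, double factor, int min, int max) {
        int scaled = (int) Math.ceil(processors * factor);
        return Math.min(Math.max(scaled, min), max);
    }

    // Converts a top-style reading (100% per hardware thread) into a
    // percentage of the whole machine.
    static double utilizationPercent(double topPercent, int hardwareThreads) {
        return topPercent / (hardwareThreads * 100.0) * 100.0;
    }

    public static void main(String[] args) {
        int hardwareThreads = 2 * 16; // 2 CPUs x 16 threads, from the post
        System.out.println("pool size: " + scaledPoolSize(hardwareThreads, 2.0, 16, 32));
        System.out.println("machine utilization %: " + utilizationPercent(1000.0, hardwareThreads));
    }
}
```

Under these assumptions the factor of 2.0 would ask for 64 threads, so the max of 32 is the value that actually applies, and a reading of 1000% corresponds to roughly a third of the machine.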

My explanation for this phenomenon is that each CPU can only handle a certain number of actors at any one time (or within a small range), and that a given number of actors does not need the full power of a core/thread. Could this be the explanation?

Best Regards

Flo





Jim Hazen

Nov 17, 2015, 6:35:37 PM
to Akka User List
Try using the fork-join-executor instead of the thread-pool-executor for multiplexing large numbers of non-blocking tasks across your CPUs.

Jim Hazen

Nov 17, 2015, 6:39:46 PM
to Akka User List
Also, the default akka-remoting utilizing Java serialization is dog slow.  There are other threads that discuss swapping out the serializer with much faster ones.  So if you're doing a lot of cluster sharding, your throughput may be bottlenecked on the remote inter-node IO.

Flo B.

Nov 18, 2015, 10:17:48 AM
to Akka User List


On Wednesday, November 18, 2015 at 00:39:46 UTC+1, Jim Hazen wrote:
> Also, the default akka-remoting utilizing Java serialization is dog slow.  There are other threads that discuss swapping out the serializer with much faster ones.  So if you're doing a lot of cluster sharding, your throughput may be bottlenecked on the remote inter-node IO.

Thanks Jim for your response,

I already use https://github.com/romix/akka-kryo-serialization for serialization. I tried the fork-join-executor, but it did not improve anything (some settings were even worse). I also tried different parameters for the executor, but the settings above still seem best for my approach.

Could it be that the logic inside the actors is too big, so that the parallelism is limited?

Best regards

Flo

Jim Hazen

Nov 18, 2015, 12:30:13 PM
to Akka User List
That's possible. Your actor won't be able to get more work until it completes its receive. You could:

A: look into actor pool/routers. This will give you more receive blocks to work with, increasing concurrency.
B: look into spending less time within your receive block, maybe by delegating the real work to a dispatched Future. If you go this route you'll have the ability to use a thread-pool-dispatcher for the longer/blocking work and your fork-join-dispatcher for your very fast receive executions.

I tend to end up with option B. With my Spray services, using the dispatch directive takes the real work off the HTTP dispatching thread. With actors that are mostly IO, since Spray IO is async, you get essentially the same thing. So when I find I have dense, CPU-intensive code, I offload that work into another dispatcher and let the actor drive work into this executor as quickly as it can.

At the very least, option B should help you drive up cpu utilization.

Flo B.

Nov 18, 2015, 2:12:23 PM
to Akka User List
Thanks again for your interesting points, Jim.

I already have a separate dispatcher for the actors and one for the system.

jimhaz...@gmail.com

Nov 18, 2015, 2:36:19 PM
to akka...@googlegroups.com
Be sure you aren't just separating your actors onto their own collective executor, but are also putting the work each actor does onto an executor. The goal is to drive as much work as possible into these executors, and to free up akka's executors to simply drive this work into those other executors as quickly as possible.

At a high level this would look like a receive block that only contains code wrapped within a Future. This puts the load on the child executor and frees up akka to drive this executor as hard as it can. 

If you don't do this you're just moving your single threaded actor handling from one dispatcher to the next, but aren't increasing internal actor concurrency. 
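
The idea can be sketched without Akka, using only the Java standard library. In the sketch below, `receive` stands in for an actor's receive block and does nothing except hand the message to a separate work pool; `heavyWork`, the pool size, and all class and method names are hypothetical placeholders rather than anything from the original setup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class OffloadSketch {
    // Hypothetical stand-in for the CPU-heavy part of handling one message.
    static int heavyWork(int message) {
        int acc = message;
        for (int i = 0; i < 100_000; i++) {
            acc = acc * 31 + i;
        }
        return acc;
    }

    // Plays the role of a receive block: it only submits the work and returns
    // immediately, so the calling thread is free to take the next message.
    static CompletableFuture<Integer> receive(int message, ExecutorService workPool) {
        return CompletableFuture.supplyAsync(() -> heavyWork(message), workPool);
    }

    // Drains a mailbox on the calling thread, then waits for all results.
    static List<Integer> processAll(List<Integer> mailbox, ExecutorService workPool) {
        List<CompletableFuture<Integer>> inFlight = new ArrayList<>();
        for (int message : mailbox) {
            inFlight.add(receive(message, workPool));
        }
        List<Integer> results = new ArrayList<>();
        for (CompletableFuture<Integer> future : inFlight) {
            results.add(future.join());
        }
        return results;
    }

    public static void main(String[] args) {
        ExecutorService workPool = Executors.newFixedThreadPool(4);
        try {
            System.out.println(processAll(Arrays.asList(1, 2, 3, 4), workPool));
        } finally {
            workPool.shutdown();
        }
    }
}
```

Note that the submitted tasks may run and finish in any order; only the collection of results here follows submission order. That matters if messages to one actor must be handled strictly one after another.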

-Jim


Patrik Nordwall

Nov 18, 2015, 3:01:02 PM
to akka...@googlegroups.com
What kind of work is performed by the actors?

If you already have 20000 actors I don't think the advice about delegating to futures will help.

/Patrik


Flo B.

Nov 18, 2015, 4:58:02 PM
to Akka User List
Thank you both for your answers,

@Jim: so your solution would be to use a lot of dispatchers/executors to resolve the problem. Doesn't this lead to a lot of overhead and make the application more complex than it should be?

@Patrik: Hello Patrik, each actor has a geo-location condition to test. I send a message to test whether the location inside the message matches the actor's location, and each actor answers if it does.

Does anyone have experience with the akka network configuration? I left everything at the defaults (remoting). Could the TCP interface be the bottleneck?

Best regards and thanks both for your help,

Flo

Jim Hazen

Nov 19, 2015, 2:45:32 PM
to Akka User List
I'm not saying use a lot of dispatchers.  I'm saying that you should delegate to maybe 1 more dispatcher for your heavy work to unblock your actor's dispatching thread (and definitely another for blocking IO) allowing it to put more of your Actor's mailbox entries to work concurrently.  In one of your posts you were afraid that actors "doing too much" could be causing things to slow down.  If your actors are blocking on IO or some other lengthy task, a single actor won't be able to process message 2 until the receive block from message 1 has completed.  By delegating heavy work to a different "heavy work" dispatcher, your Actor's dispatcher can start to process more requests from the mailbox.  This allows more heavy/blocking tasks to be inflight.  It doesn't make long tasks take less time, but hopefully will allow you to maximize your remaining resources either queuing up blocking work or processing the results as they return.

To Patrik's point.  20000 actors is already a lot of concurrency.  Unless you broadcast 5 messages to each while the first message waits for 10s on a remote Geo call.  Then you've spent some CPU queuing 100k requests, started the blocking IO for 20k of those and then wait for 10s.  This is an exaggeration, but the idea here is that the time you spend waiting for IO drives down your overall CPU utilization.  If you're looking to maximize CPU utilization you want all of your 100k requests in flight, saturating your IO pipe, and then processing results as they return.  After a window of delay you should be processing results at the max speed of your network, which ought to keep your CPU busier than it is now.
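
A back-of-the-envelope model makes the effect of waiting on IO concrete. The sketch below is plain Java and rests on invented numbers: 1 ms of CPU time and 9 ms of IO wait per message are purely hypothetical, and the 5,000 in-flight requests assume the 20,000 actors are spread evenly over 4 machines. The class and method names are illustrative.

```java
final class UtilizationSketch {
    // Fraction of wall-clock time one sequential task spends on the CPU.
    static double busyFraction(double cpuMillis, double waitMillis) {
        return cpuMillis / (cpuMillis + waitMillis);
    }

    // Rough upper bound on machine-wide CPU utilization (0.0 to 1.0) when
    // `concurrentTasks` such tasks share `hardwareThreads` hardware threads.
    static double utilization(int concurrentTasks, int hardwareThreads,
                              double cpuMillis, double waitMillis) {
        double demand = concurrentTasks * busyFraction(cpuMillis, waitMillis);
        return Math.min(1.0, demand / hardwareThreads);
    }

    public static void main(String[] args) {
        int hardwareThreads = 32; // per machine, from the original post
        double cpuMillis = 1.0;   // hypothetical CPU time per message
        double waitMillis = 9.0;  // hypothetical IO wait per message

        // Blocking inside receive: at most one task per dispatcher thread.
        System.out.println(utilization(32, hardwareThreads, cpuMillis, waitMillis));
        // Non-blocking IO: thousands of requests can be in flight at once.
        System.out.println(utilization(5000, hardwareThreads, cpuMillis, waitMillis));
    }
}
```

In this model, a pool of 32 threads that each block for nine tenths of the time cannot push the machine much past a tenth of its capacity, while keeping thousands of requests in flight leaves the CPU as the limit.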

Flo B.

Nov 19, 2015, 3:51:45 PM
to Akka User List
Thanks Hazen for your explanation and your comment.

I hope I understood everything correctly! :) You want to split the work inside an actor and use, e.g., two executors: one for the actor itself and one for the heavy work (IO...). But it is important for me that the work for message 1 is done before message 2 is handled. And isn't one of the benefits of the actor model and akka itself that I have the guarantee that message 1 is done before message 2?

And wouldn't that lead to using two actors for the task, one for the handling and one for the work?

Best regards

Flo

Heiko Seeberger

Nov 19, 2015, 6:07:55 PM
to akka...@googlegroups.com
I'd rather use pool routers to get parallelism (more work done simultaneously given the hardware resources) and delegate blocking work to "tagged" actors that use a dispatcher properly configured to deal with blocking (many threads).

Heiko

--

Heiko Seeberger
Twitter: @hseeberger


Jim Hazen

Nov 20, 2015, 3:17:34 AM
to Akka User List


> But! It is important for me that the work of message 1 is done before message 2 is handled. And isn't one of the benefits of the actor model and akka itself that I have the guarantee that message 1 is done before message 2?

Yes.  And you might be in trouble there.  In which case, at least on a per-actor basis, you wouldn't be able to use actor pooling or my Future dispatching solution.  You have essentially 20k sequential processors.  If having 20k concurrent actors doesn't provide enough concurrency to keep your system busy, I'm not sure what else you can do.  Optimize the IO and the sequential time as best you can.
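
The "20k sequential processors" picture can be illustrated in plain Java: each actor handles its own messages strictly in order, while different actors run in parallel on a shared pool. This is a minimal sketch of that ordering guarantee, not of how Akka implements mailboxes; every name in it is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class SequentialPerActor {
    private final ExecutorService pool;
    // Last scheduled task per actor; the next message is chained behind it.
    private final Map<Integer, CompletableFuture<Void>> tails = new HashMap<>();

    SequentialPerActor(ExecutorService pool) {
        this.pool = pool;
    }

    // Schedules `handler` to run after every earlier message of the same actor.
    // Handlers are assumed not to throw; after a failure, later messages for
    // that actor would be skipped.
    synchronized CompletableFuture<Void> tell(int actorId, Runnable handler) {
        CompletableFuture<Void> tail =
            tails.getOrDefault(actorId, CompletableFuture.completedFuture(null));
        CompletableFuture<Void> next = tail.thenRunAsync(handler, pool);
        tails.put(actorId, next);
        return next;
    }

    // Sends `messages` numbered messages to each of `actors` actors and
    // returns, per actor, the order in which the messages were handled.
    static Map<Integer, List<Integer>> run(int actors, int messages, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            SequentialPerActor system = new SequentialPerActor(pool);
            Map<Integer, List<Integer>> handled = new HashMap<>();
            List<CompletableFuture<Void>> all = new ArrayList<>();
            for (int a = 0; a < actors; a++) {
                handled.put(a, Collections.synchronizedList(new ArrayList<Integer>()));
            }
            for (int m = 0; m < messages; m++) {
                for (int a = 0; a < actors; a++) {
                    List<Integer> log = handled.get(a);
                    int message = m;
                    all.add(system.tell(a, () -> log.add(message)));
                }
            }
            CompletableFuture.allOf(all.toArray(new CompletableFuture[0])).join();
            return handled;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(4, 5, 8));
    }
}
```

With this structure the available parallelism is capped by the number of actors that have a message ready at the same moment, which is why time spent inside a single handler, and any waiting it does, directly limits throughput.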

Carsten Saathoff

Nov 20, 2015, 3:23:28 AM
to Akka User List
Hi,

I think you (and probably everyone else here) need more data to figure out why your system is behaving as it is. 

Above, you write that the configuration with four machines exhibits the behaviour you describe. Does that mean that with 12 machines the CPUs are fully utilized? What about a single machine? Can you fully utilize a single machine?

I would add some sort of monitoring to your system and try to figure out how a single actor system behaves. If you are able to fully utilize a single machine, find out what happens if you use more than one.

In general, understanding a concurrent and distributed system is hard, so IMHO data is your best friend. Therefore, measure what's going on. And of course there is a lot of information missing: Are you doing any blocking work or any IO? How large is the data you have to send around in the cluster? You also write that you have to ensure ordering of messages, and I guess that will limit parallelism as well.
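
One simple way to start measuring is to record a latency per message and look at percentiles rather than a single average. The sketch below is plain Java with a nearest-rank percentile; the timed operation is only a placeholder for "send a message and wait for the reply", and the names are made up for illustration.

```java
import java.util.Arrays;

final class LatencyStats {
    // Nearest-rank percentile of a non-empty sample array; p is in (0, 100].
    static long percentile(long[] samples, double p) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        long[] latenciesNanos = new long[1000];
        for (int i = 0; i < latenciesNanos.length; i++) {
            long sentAt = System.nanoTime();
            // Hypothetical placeholder for sending a message and awaiting the reply.
            Math.sqrt(i);
            latenciesNanos[i] = System.nanoTime() - sentAt;
        }
        System.out.println("p50 ns: " + percentile(latenciesNanos, 50));
        System.out.println("p99 ns: " + percentile(latenciesNanos, 99));
    }
}
```

Comparing these numbers for one machine, then two, then four, would show whether latency grows with the cluster size or with the load per machine.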

best

Carsten