Re: [akka-user] Akka not using all available CPU resources

√iktor Ҡlang

Mar 27, 2013, 12:54:23 PM
to Akka User List
Hi Alexander!

Long answer: http://letitcrash.com/post/20397701710/50-million-messages-per-second-on-a-single-machine
Short answer: use the fork-join-executor

Does that help?
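
For reference, a minimal sketch of what that switch could look like in application.conf. It reuses the values from the configuration quoted below and only changes the `executor` line. The pool size is the number of available processors times `parallelism-factor`, bounded by `parallelism-min` and `parallelism-max`, so with these values it is pinned at 64. Treat it as an illustration under those assumptions, not a tuned setup.

```hocon
akka {
  actor {
    default-dispatcher {
      # Select the fork-join pool instead of the thread-pool executor
      executor = "fork-join-executor"
      fork-join-executor {
        parallelism-min = 64
        parallelism-factor = 1.0
        parallelism-max = 64
      }
      throughput = 100
    }
  }
}
```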

Cheers,


On Wed, Mar 27, 2013 at 5:45 PM, Alexander waite <alix...@gmail.com> wrote:
I've written an akka application where I instantiate 15,000+ actors and have them do some processing and then send messages to each other. Everything runs fine on my laptop and takes 120 seconds to complete and by checking "top" I can see that it's maxing out all cores on my machine. Awesome. Now, I also have access to a 64 core BSD box. When I take the same code and run it on that box, it runs faster (about 44 seconds average) but then when I check "top" during running, CPU usage is only up at 1800% instead of the theoretical 6400%. 
I'm pretty sure that the application has enough work to do to saturate the CPU so I don't understand why it isn't. I've tried forcing all of my values for threads in the default dispatcher to 64 but this has changed absolutely nothing.

my application.conf:

akka {
        //log-config-on-start = on
        actor {
                default-dispatcher {

                        mailbox-type = "akka.dispatch.UnboundedDequeBasedMailbox"
                        executor = "thread-pool-executor"
                        fork-join-executor{
                                parallelism-factor = 1.0
                                parallelism-min = 64
                                parallelism-max = 64
                        }

                        thread-pool-executor {
                                core-pool-size-min = 64
                                core-pool-size-max = 64
                                max-pool-size-min = 64
                                max-pool-size-max = 64
                                task-queue-size = -1
                                core-pool-size-factor = 1.0
                                max-pool-size-factor = 1.0
                        }

                        throughput=100
                }
        }
}

Both executors are in this application.conf because I have tried each of them to see if it makes any difference; it didn't.
Any ideas? Is my mental model of how akka deals with threads wrong or do I have a different problem?

--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: http://akka.io/faq/
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to akka-user+...@googlegroups.com.
To post to this group, send email to akka...@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
 
 



--
Viktor Klang
Director of Engineering

Twitter: @viktorklang

Akka Team

Mar 27, 2013, 1:05:21 PM
to akka...@googlegroups.com
Hi Alexander!

> I used the fork-join-executor originally and then moved to thread-pool for a
> "just in case" scenario. I have the same issue with both executors.

Have you profiled it? It is not easy to see why your cores are idle
without some data. You might want to consider using Typesafe Console
to pinpoint possible bottlenecks.

-Endre

--
Akka Team
Typesafe - The software stack for applications that scale
Blog: letitcrash.com
Twitter: @akkateam

√iktor Ҡlang

Mar 27, 2013, 1:06:39 PM
to Akka User List
On Wed, Mar 27, 2013 at 5:57 PM, Alexander waite <alix...@gmail.com> wrote:
I used the fork-join-executor originally and then moved to thread-pool for a "just in case" scenario. I have the same issue with both executors.

And you don't have any locks or waiting in the code you call inside your actor? Then I suspect the `mailbox-type = "akka.dispatch.UnboundedDequeBasedMailbox"` setting.
If you must have the Deque because of Stash, you might want to try ConcurrentLinkedDeque (which we cannot put into Akka as long as it has to work with Java 6, which is why it uses LinkedBlockingDeque):

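package your.package // placeholder: `package` is a reserved word in Scala, so substitute a real package name

import java.util.concurrent.ConcurrentLinkedDeque // requires Java 7 or later
import com.typesafe.config.Config
import akka.actor.{ ActorSystem, ActorRef }
import akka.dispatch.{ MailboxType, DequeBasedMessageQueue, UnboundedDequeBasedMessageQueueSemantics, MessageQueue, Envelope }

// Unbounded deque-based mailbox backed by a non-blocking ConcurrentLinkedDeque
case class UnboundedDequeBasedMailboxForJava7() extends MailboxType {
  // Constructor used when the mailbox is instantiated from configuration
  def this(settings: ActorSystem.Settings, config: Config) = this()
  final override def create(owner: Option[ActorRef], system: Option[ActorSystem]): MessageQueue =
    new ConcurrentLinkedDeque[Envelope]() with DequeBasedMessageQueue with UnboundedDequeBasedMessageQueueSemantics { final val queue = this }
}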


And then in your config you put: mailbox-type = "your.package.UnboundedDequeBasedMailboxForJava7"
 
What effect does that give?

Cheers,

Akka Team

Mar 27, 2013, 1:14:06 PM
to Akka User List
> And you don't have any locks or waiting in the code you call inside your
> actor? Then I suspect the `mailbox-type =
> "akka.dispatch.UnboundedDequeBasedMailbox"`

Ah, smart observation, Viktor!

-Endre

√iktor Ҡlang

Mar 27, 2013, 1:38:06 PM
to Akka User List
So you're not calling into anything that is synchronized or internally uses locks?

Perhaps you're unstashing very frequently and then restash a lot of messages, leading to inefficiencies?

Cheers,


On Wed, Mar 27, 2013 at 6:32 PM, Alexander waite <alix...@gmail.com> wrote:
I have added the concurrent mailbox and have seen some improvement: the system now sits stable at around 2400% total CPU usage. I don't have any locks in my code and I did need the deque for the stash. I was looking at using the Typesafe Console earlier for profiling the system, but please correct me if I'm wrong: is it not a commercial product? I don't currently have a research budget for such things.

√iktor Ҡlang

Mar 27, 2013, 1:40:00 PM
to Akka User List
What does your actor hierarchy look like?

√iktor Ҡlang

Mar 27, 2013, 2:02:23 PM
to Akka User List



On Wed, Mar 27, 2013 at 6:47 PM, Alexander waite <alix...@gmail.com> wrote:
I only have one actor that uses the stash and even then it only stashes and unstashes during initialisation and then it never stashes again.
Could my problem be that I only have one actor using stash yet I'm forcing everything to use this deque-based mailbox?

Yes, try to give that actor its own pinned dispatcher and the new mailbox.
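
A minimal sketch of that setup in application.conf, assuming the custom mailbox class from earlier in the thread. The dispatcher name `stash-pinned-dispatcher` is made up for illustration, and the default dispatcher simply drops its `mailbox-type` override so every other actor goes back to the default mailbox.

```hocon
# Hypothetical name; used only by the one actor that needs Stash
stash-pinned-dispatcher {
  type = PinnedDispatcher
  executor = "thread-pool-executor"
  mailbox-type = "your.package.UnboundedDequeBasedMailboxForJava7"
}

akka.actor.default-dispatcher {
  executor = "fork-join-executor"
  # no mailbox-type here, so the workers use the default unbounded mailbox
}
```

The stashing actor would then be created with something like `Props[StashingActor].withDispatcher("stash-pinned-dispatcher")`, where `StashingActor` stands in for your own class.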
 
As for the hierarchy, I have a simple MVC where view is a single actor, controller is a single actor and model is a single actor that spawns hundreds of thousands to several million worker actors. To ease the bottleneck of all the worker actors talking to a single model actor, the model actor also spawns some actors which aggregate messages from a specific group of workers and then pass the aggregation to the model actor. I can attempt a crude MS Paint diagram if you wish.

To me it sounds like you might want to avoid having millions of children to a single actor. Especially if that actor (the parent) is also supposed to do meaningful work (imagine how much time for real work you'd have if you had a million kids on your hands!)
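
One way to read that suggestion, as a rough sketch only: let the intermediate actors be the parents of the workers, so the model supervises a handful of groups rather than every worker. The class and message names below (`Worker`, `Group`, `Model`, `Tick`, `Tock`) are hypothetical, loosely named after the protocol described in this thread. The sketch assumes Scala 2 with classic Akka actors, at most one Tick in flight at a time, and at least one worker per group.

```scala
import akka.actor.{ Actor, ActorRef, Props }

// Hypothetical messages for illustration
case object Tick
case object Tock

// Leaf actor: does its share of the work, then reports completion to its parent
class Worker extends Actor {
  def receive = {
    case Tick => context.parent ! Tock
  }
}

// Intermediate parent: owns a bounded group of workers and aggregates their Tocks
class Group(workers: Int) extends Actor {
  private val children: Vector[ActorRef] =
    Vector.tabulate(workers)(i => context.actorOf(Props(new Worker), "worker-" + i))
  private var pending = 0

  def receive = {
    case Tick =>
      pending = children.size
      children.foreach(_ ! Tick)
    case Tock =>
      pending -= 1
      if (pending == 0) context.parent ! Tock // one aggregated Tock per group
  }
}

// Top-level model: supervises only `groups` children instead of every worker
class Model(groups: Int, workersPerGroup: Int) extends Actor {
  private val children: Vector[ActorRef] = {
    val n = workersPerGroup // local copy so the Props closure does not capture `this`
    Vector.tabulate(groups)(i => context.actorOf(Props(new Group(n)), "group-" + i))
  }
  private var pending = 0
  private var requester: Option[ActorRef] = None

  def receive = {
    case Tick =>
      requester = Some(sender) // remember who asked, e.g. the controller
      pending = children.size
      children.foreach(_ ! Tick)
    case Tock =>
      pending -= 1
      if (pending == 0) requester.foreach(_ ! Tock)
  }
}
```

With, say, 64 groups, each parent mailbox only ever sees its own group's Tocks, and creating and supervising the workers is spread across 64 actors instead of one.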

Cheers,

√iktor Ҡlang

Mar 27, 2013, 2:11:24 PM
to Akka User List



On Wed, Mar 27, 2013 at 7:09 PM, Alexander waite <alix...@gmail.com> wrote:
The "model" actor really doesn't do anything "meaningful". It receives a "Tick" command from the controller to indicate it's time to do work, and then multiplexes it to all the workers via the intermediate nodes I mentioned to reduce the bottleneck. Once all workers have returned a "tock" message indicating completion, the aggregated tock messages are passed back up to that single model actor and it forwards the completion message back to the controller.

I'll put together a pinned dispatcher for each of my major components, change default dispatcher to something more "worker-friendly" and get back to you.

I only suggested doing so for that _one_ actor that needed to use Stash. If you have multiple actors that use Stash, they shouldn't use a pinned dispatcher (as they won't be able to share threads then); they can share the same thread pool, but make sure it has the right mailbox.
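
For the multiple-actor case, a sketch of a shared (non-pinned) dispatcher that still carries the deque-based mailbox. The name `stash-dispatcher` and the parallelism values are assumptions for illustration, not recommendations.

```hocon
# Hypothetical name; shared by all actors that use Stash
stash-dispatcher {
  type = Dispatcher
  executor = "fork-join-executor"
  fork-join-executor {
    parallelism-min = 2
    parallelism-max = 8
  }
  mailbox-type = "your.package.UnboundedDequeBasedMailboxForJava7"
}
```

Each stashing actor opts in through `withDispatcher("stash-dispatcher")` on its Props, and all of them draw threads from this one pool.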

Roland Kuhn

Mar 27, 2013, 5:14:16 PM
to akka...@googlegroups.com, akka...@googlegroups.com
Hi Alexander,

how many of those "tick" messages are active at the same time? If it is just one then there will be a time period where everyone waits for the model to talk to the controller, which will just use 1 CPU core.
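
A back-of-the-envelope way to see how much such a period could matter, under a deliberately crude assumption: the run alternates between phases that keep exactly one core busy and phases that keep all 64 busy. If a share s of the wall-clock time is spent in the one-core phases, the average load is s * 1 + (1 - s) * 64 cores, which can be solved for s. This is an illustrative model, not a measurement of the application.

```scala
object UtilizationEstimate {
  // Solve  observed = s * 1 + (1 - s) * cores  for s, the share of
  // wall-clock time spent in single-core phases.
  def serialShare(cores: Double, observedCores: Double): Double =
    (cores - observedCores) / (cores - 1.0)

  def main(args: Array[String]): Unit = {
    // 1800% in top corresponds to roughly 18 busy cores on the 64-core box
    val share = serialShare(cores = 64.0, observedCores = 18.0)
    println(s"single-core share of wall-clock time: ${math.round(share * 100)}%")
  }
}
```

Under this model, 1800% on 64 cores corresponds to about 73% of the wall-clock time in single-core phases, and the 2400% reported after the mailbox change corresponds to about 63%.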

Another thought: how are the cores grouped physically and how many aggregator actors do you use? Have you tried the various NUMA settings for the JVM?


Regards,

Dr. Roland Kuhn
Akka Tech Lead
Typesafe – The software stack for applications that scale.
twitter: @rolandkuhn


G J

Mar 30, 2013, 1:50:59 PM
to akka...@googlegroups.com
This is tangential to the thread, but I'm curious...

The 'let it crash' link mentions benchmarks using:

Processor: 48 core AMD Opteron (4 dual-socket with 6 core AMD® Opteron™ 6172 2.1 GHz)

It repeatedly appears to me - maybe by coincidence - that anyone with a little bit of brains is using AMD processors. Yet when you look at CPU performance comparisons on:

these processors suck big time - including the AMD 6172.

Is it the case that cpubenchmark.net has their pockets lined with Intel cash?

What gives, if anything?

Best Regards.
---------------------------------------------------

Akka Team

Apr 6, 2013, 6:55:53 AM
to Akka User List
Hi G J,

I don’t know the details since I was not part of the procurement process, but in order to measure software scalability we just need lots of cores and don’t care that much whether those are the fastest available or not. A “complicated” NUMA setup also helps identifying memory latency effects (which in this case led to fixing scalability issues in the early implementation of jsr166y’s ForkJoinPool).

Regards,

Roland