CPU spikes using fork-join-executor


Akara Sucharitakul

Apr 14, 2015, 12:28:45 AM4/14/15
to akka...@googlegroups.com
Scala 2.11.6, Akka 2.3.9, Spray 1.3.2.

When testing a particular workload, we ran into CPU spikes. The average CPU load is under 5%, but about 6-7 minutes into the run, CPU starts spiking to near 100% for several seconds at a time. This repeats every 6-7 minutes. We can't yet correlate it with any particular GC activity. We tested against Java 7u45 and also tried Java 8, without much change.

Spray is configured on the default-dispatcher (Spray default).

Capturing thread dumps while the spikes are happening reveals 9 threads spinning on the following:

"default-akka.actor.default-dispatcher-11" prio=10 tid=0x0000000001b2b800 nid=0x3752 runnable [0x00007f1191d40000]
   java.lang.Thread.State: RUNNABLE
at akka.io.DirectByteBufferPool.takeBufferFromPool(DirectByteBufferPool.scala:41)
at akka.io.DirectByteBufferPool.acquire(DirectByteBufferPool.scala:31)
at akka.io.TcpConnection.doRead(TcpConnection.scala:231)
at akka.io.TcpConnection$$anonfun$connected$1.applyOrElse(TcpConnection.scala:87)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at akka.io.TcpConnection.aroundReceive(TcpConnection.scala:28)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
at akka.dispatch.Mailbox.run(Mailbox.scala:221)
at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

2 threads show this spinning:

"default-akka.actor.default-dispatcher-8" prio=10 tid=0x0000000001d6f000 nid=0x374f runnable [0x00007f11928d2000]
   java.lang.Thread.State: RUNNABLE
at akka.io.DirectByteBufferPool.offerBufferToPool(DirectByteBufferPool.scala:66)
at akka.io.DirectByteBufferPool.release(DirectByteBufferPool.scala:34)
at akka.io.TcpConnection.doRead(TcpConnection.scala:245)
at akka.io.TcpConnection$$anonfun$connected$1.applyOrElse(TcpConnection.scala:87)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at akka.io.TcpConnection.aroundReceive(TcpConnection.scala:28)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
at akka.dispatch.Mailbox.run(Mailbox.scala:221)
at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

1 thread is on I/O:

"default-akka.actor.default-dispatcher-4" prio=10 tid=0x0000000001be2800 nid=0x36ff runnable [0x00007f11924b0000]
   java.lang.Thread.State: RUNNABLE
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:295)
- locked <0x00000007d3b7e940> (a java.lang.Object)
at akka.io.TcpConnection.innerRead$1(TcpConnection.scala:216)
at akka.io.TcpConnection.doRead(TcpConnection.scala:232)
at akka.io.TcpConnection$$anonfun$connected$1.applyOrElse(TcpConnection.scala:87)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at akka.io.TcpConnection.aroundReceive(TcpConnection.scala:28)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
at akka.dispatch.Mailbox.run(Mailbox.scala:221)
at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

All in all, during this time no application logic is being executed on the default dispatcher at all (it has 12 threads).

For reference, here's the takeBufferFromPool implementation:

@tailrec
private final def takeBufferFromPool(): ByteBuffer =
  if (locked.compareAndSet(false, true)) {
    val buffer =
      try if (buffersInPool > 0) {
        buffersInPool -= 1
        pool(buffersInPool)
      } else null
      finally locked.set(false)

    // allocate new and clear outside the lock
    if (buffer == null)
      allocate(defaultBufferSize)
    else {
      buffer.clear()
      buffer
    }
  } else takeBufferFromPool() // spin while locked

It appears some thread is holding locked (an AtomicBoolean) at true, causing the other threads to spin, but I cannot find the offending code.

Now to the set of questions:
  1. Using thread-pool-executor instead of fork-join-executor prevents the spikes from happening. Has anybody run into similar issues with fork-join-executor? Any known bug in the Scala ForkJoinPool?
  2. I can't find a way in the Akka docs to use the Java 8 or Java 7 ForkJoinPool directly. Is there a way I can test it out?
  3. Any other thoughts?
Thanks in advance!
-Akara


Martynas Mickevičius

Apr 16, 2015, 8:03:48 AM4/16/15
to akka...@googlegroups.com
Hi,

Are you running this application on Windows? There is a known issue where high CPU usage can occur when using TCP with Akka I/O on Windows.

--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to akka-user+...@googlegroups.com.
To post to this group, send email to akka...@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.



--
Martynas Mickevičius
Typesafe – Reactive Apps on the JVM

Akara Sucharitakul

Apr 16, 2015, 11:20:52 PM4/16/15
to akka...@googlegroups.com
Nope, this is actually Ubuntu 10.10 on a 4-vCPU VM (likely ESX, but I can't tell for sure). A couple more pieces of information:
  • With parallelism-factor lowered to 1.0, further experiments no longer show the spikes.
  • Trying to profile the spikes with JMC causes them not to happen; the sampling somehow changes the timing characteristics.
  • Initial experiments with the ForkJoinExecutor switched to Java 8 show similar spikes. Since this was a quick implementation of an ExecutorServiceConfigurator to get the Java 8 ForkJoinPool working, there is still some refinement to do before we can draw conclusions.
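For reference, such an ExecutorServiceConfigurator for the JDK 8 pool might look roughly like the sketch below. This is a hypothetical illustration, not our actual implementation; the class name, package, and pool sizing are made up, and it is untested:

```scala
import java.util.concurrent.{ ExecutorService, ThreadFactory }
import com.typesafe.config.Config
import akka.dispatch.{ DispatcherPrerequisites, ExecutorServiceConfigurator, ExecutorServiceFactory }

// Hypothetical sketch only: plugs the JDK 8 java.util.concurrent.ForkJoinPool
// into an Akka 2.3 dispatcher in place of the bundled Scala fork-join pool.
class JdkForkJoinConfigurator(config: Config, prerequisites: DispatcherPrerequisites)
    extends ExecutorServiceConfigurator(config, prerequisites) {

  override def createExecutorServiceFactory(
      id: String, threadFactory: ThreadFactory): ExecutorServiceFactory =
    new ExecutorServiceFactory {
      def createExecutorService: ExecutorService =
        new java.util.concurrent.ForkJoinPool(Runtime.getRuntime.availableProcessors)
    }
}

// Referenced from the dispatcher config, e.g.:
//   my-dispatcher {
//     type = Dispatcher
//     executor = "com.example.JdkForkJoinConfigurator"
//   }
```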
I'll keep updating this thread as we find out more. Expect it to be a long process, though.

Thanks!
-Akara

Endre Varga

Apr 17, 2015, 2:19:23 AM4/17/15
to akka...@googlegroups.com
Hi Akara,

The ForkJoinPool is a high-throughput executor pool and has therefore been designed with the assumption that threads will be busy most of the time. If you see the scan() method popping up as the biggest CPU consumer, it means your threads are mostly idle. This is quite likely, since you mentioned that reducing the parallelism makes the "problem" go away.

Since the FJP assumes a high-throughput scenario, it employs a work-stealing strategy that relies on spinning (i.e. a kind of polling). Spinning is the most efficient approach if you know you will succeed quickly enough (this is the high-throughput assumption: the pool expects to grab a new task almost immediately). However, if there is not enough work, this spinning becomes visible as a CPU spike.

What you can do is one of these:
 - Just accept that this is how it is and not worry
 - Reduce the parallelism (i.e. number of threads in the pool)
 - Use a ThreadPoolExecutor based dispatcher
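The last two options correspond to dispatcher configuration along these lines (values are illustrative, not recommendations):

```hocon
# Illustrative values only -- tune to your own hardware.
akka.actor.default-dispatcher {
  # Option: shrink the fork-join pool so fewer threads sit idle and spin
  fork-join-executor {
    parallelism-min = 2
    parallelism-factor = 1.0
    parallelism-max = 8
  }

  # Or switch executors entirely (and drop the block above):
  # executor = "thread-pool-executor"
  # thread-pool-executor {
  #   core-pool-size-min = 2
  #   core-pool-size-factor = 2.0
  #   core-pool-size-max = 8
  # }
}
```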

-Endre

Roland Kuhn

Apr 17, 2015, 2:20:55 AM4/17/15
to akka-user
Hi Akara,

thanks for all the detailed information, I am 95% certain that I know what happens—and I’d call it a bug. To the casual observer the code in DirectByteBufferPool looks like a CAS loop, but it actually is an open-coded primitive spinlock. If you have more threads competing for this lock than you have CPUs it might well happen that the kernel decides to schedule out a thread that was holding this lock, and since it is not a normal, honest lock, the kernel has no idea why the other threads now all are spinning like mad. It probably takes some server-type hardware and a bit of patience and good monitoring to actually notice, which is why you are the first person to report it :-)
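As a side-by-side illustration of the distinction Roland draws, consider this sketch in plain Scala (not Akka code; the names are made up for illustration):

```scala
import java.util.concurrent.atomic.{ AtomicBoolean, AtomicReference }
import scala.annotation.tailrec

object CasVsSpinlock {
  // A genuine CAS loop only retries because some *other* thread completed
  // its update first, so the system as a whole always makes progress
  // (lock-free), even if any one thread is scheduled out mid-operation.
  @tailrec
  def push[A](head: AtomicReference[List[A]], x: A): Unit = {
    val old = head.get
    if (!head.compareAndSet(old, x :: old)) push(head, x) // retry after someone else's success
  }

  // The DirectByteBufferPool pattern instead CASes on a flag and retries
  // while the flag is *held*. If the holder is scheduled out, every waiter
  // burns CPU and nobody makes progress: a spinlock, not a lock-free loop.
  private val locked = new AtomicBoolean(false)

  @tailrec
  def withPoolLock[T](body: () => T): T =
    if (locked.compareAndSet(false, true)) try body() finally locked.set(false)
    else withPoolLock(body) // spins for as long as the holder stays descheduled
}
```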

I created a ticket to track this.

Regards,

Roland


Dr. Roland Kuhn
Akka Tech Lead
Typesafe – Reactive apps on the JVM.
twitter: @rolandkuhn


Endre Varga

Apr 17, 2015, 2:23:11 AM4/17/15
to akka...@googlegroups.com
Yeah, what I said about FJP holds, but in this case it is not relevant. Thanks for investigating!

-Endre

Viktor Klang

Apr 17, 2015, 4:32:08 AM4/17/15
to Akka User List
I'd bet on this explanation too.

Having looked at the code, I think it should be possible to switch to an implementation without spinlocking.
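One hypothetical shape for such a replacement, sketched here for illustration only and not vetted against Akka's actual requirements, is to back the pool with a lock-free queue so a descheduled thread can never stall the others:

```scala
import java.nio.ByteBuffer
import java.util.concurrent.ConcurrentLinkedQueue
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical sketch: ConcurrentLinkedQueue's internal CAS loops are
// lock-free, so a thread scheduled out mid-operation cannot make the
// others spin. Sizing and the bounding strategy are illustrative only.
class LockFreeBufferPool(bufferSize: Int, maxPooled: Int) {
  private val pool   = new ConcurrentLinkedQueue[ByteBuffer]()
  private val pooled = new AtomicInteger(0) // CLQ.size is O(n), so track it separately

  def acquire(): ByteBuffer = {
    val b = pool.poll() // lock-free dequeue
    if (b ne null) { pooled.decrementAndGet(); b.clear(); b }
    else ByteBuffer.allocateDirect(bufferSize) // pool empty: allocate fresh
  }

  def release(buf: ByteBuffer): Unit =
    if (pooled.incrementAndGet() <= maxPooled) pool.offer(buf)
    else pooled.decrementAndGet() // over the cap: drop the buffer, let GC reclaim it
}
```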

On Fri, Apr 17, 2015 at 8:20 AM, Roland Kuhn <goo...@rkuhn.info> wrote:



--
Cheers,

Akara Sucharitakul

Apr 19, 2015, 1:37:53 PM4/19/15
to akka...@googlegroups.com
Thank you very much, everybody. Although we don't have the right tools to gather concrete evidence for this explanation, I agree it is the most likely cause. And thank you for filing the issue.

The fact that we're on a VM with double-layer scheduling (one layer in the hypervisor, one in the guest OS) adds complexity and could cause a kind of oscillation in which a thread is kept from running for longer than usual. That would explain the periodic, rather regular spikes we observed, and why others may not have seen them. It could also be specific to the particular hypervisor version we happen to have.

Good that this is now cleared up and we know how to deal with it.

Thank you very much once again.
-Akara