OOME when using CometActor


Atsuhiko Yamanaka

Sep 16, 2009, 6:38:58 AM
to lif...@googlegroups.com
Hi,

I have encountered an OutOfMemoryError while developing a Liftweb app.
It seems to be related to CometActor (or scala.actors.Actor).
At the time of the OOME, I found 1,000,000 instances of
scala.actors.FJTaskRunner$VolatileTaskRef
in the heap using VisualVM.

I have tried to narrow down the problem and found that the
following simple clock app has a similar problem:
http://gist.github.com/187924

If you are interested in this problem, please try it.

$ git clone git://gist.github.com/187924.git gist-187924
$ cd gist-187924
$ mvn jetty:run

Then, visit http://127.0.0.1:8080/

I will attach a screenshot from VisualVM, which displays
the contents of the heap after two hours.
In this shot, you will find 490,000 FJTaskRunner$VolatileTaskRef
instances in the heap. While monitoring with VisualVM, I found
that the number of threads increased monotonically,
even with access from only one browser.

Does scala.actors.Actor still have a resource leak bug?

I'm using Liftweb 1.1-SNAPSHOT, Scala 2.7.5
and JDK 1.6.0_16 on GNU/Linux.


Sincerely,
--
Atsuhiko Yamanaka
JCraft,Inc.
1-14-20 HONCHO AOBA-KU,
SENDAI, MIYAGI 980-0014 Japan.
Tel +81-22-723-2150
+1-415-578-3454
Skype callto://jcraft/

OOME2.JPG

David Pollak

Sep 16, 2009, 8:30:54 AM
to lif...@googlegroups.com
On Wed, Sep 16, 2009 at 3:38 AM, Atsuhiko Yamanaka <atsuhiko...@gmail.com> wrote:
Does scala.actors.Actor still have a resource leak bug?

Apparently so.  Philipp Haller told me that the default in 2.7.5/JVM 1.5 implementations was to use the java.util.concurrent.Executor for scheduling... I guess not (teaches me to not just trust, but verify).

I'll force Scala Actors to use java.util.concurrent.Executor so we avoid the FJ library in Scala.

I would change CometActors to use Lift's Actor library, but that would be a breaking change to Lift, although we'd definitely get a more reliable Actor implementation.

--
Lift, the simply functional web framework http://liftweb.net
Beginning Scala http://www.apress.com/book/view/1430219890
Follow me: http://twitter.com/dpp
Git some: http://github.com/dpp

Derek Williams

Sep 16, 2009, 8:42:55 AM
to lif...@googlegroups.com
I believe I ran into this before; to fix it, I came up with this:

package scala.actors

object ActorTimerKiller {
  def kill {
    Actor.timer.cancel
  }
}

and then in my Boot.scala:

LiftRules.unloadHooks.append(() => {
  ActorTimerKiller.kill
})

I also have Actor.clearSelf in my unloadHooks, but I can't remember if that was for this issue or another, since I use Actors in my code, not just CometActors.

Oh wait, rereading your post, it looks like this is something that doesn't involve redeploying during development, which was where I was running into this problem. This post probably won't help you then... but I'll post it anyway.

--
Derek Williams

Erik Engbrecht

Sep 16, 2009, 9:33:14 AM
to lif...@googlegroups.com
The large number of VolatileTaskRefs is a consequence of your thread pool growth.  Each worker thread maintains an array of VolatileTaskRef objects.  The VolatileTaskRef objects are reused rather than allocated for each task, so they will not be GC'd as long as the worker thread is alive.  You can tell that they are being properly cleared: size-wise, if they weren't being cleared, you would see what they point to dominating heap usage.

I'd suggest setting the maximum thread pool size to something reasonable for the number of processors and available memory you have.  If you don't, the scheduler will happily spawn up to 255 threads.
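
As a minimal sketch of that suggestion: in Scala 2.7.x the pool limits are read from system properties by scala.actors.FJTaskScheduler2, so the cap can be set very early in Boot.boot, before the first actor starts (the value 50 here is just an example):

```scala
// Cap the actor scheduler's thread pool. In Scala 2.7.x the property is
// read by scala.actors.FJTaskScheduler2 when the scheduler starts, so
// this must run before the first actor is created (e.g. at the top of
// Boot.boot). The value 50 is only an example; size it for your machine.
System.setProperty("actors.maxPoolSize", "50")

// Sanity check that the property took effect.
println(System.getProperty("actors.maxPoolSize"))
```

The same property can of course also be passed on the JVM command line with -Dactors.maxPoolSize=50.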
--
http://erikengbrecht.blogspot.com/

Atsuhiko Yamanaka

Sep 16, 2009, 10:25:11 AM
to lif...@googlegroups.com
Hi,

On Wed, Sep 16, 2009 at 10:33 PM, Erik Engbrecht
<erik.en...@gmail.com> wrote:
> I'd suggest setting the maximum thread pool size to something reasonable for
> the number of processors and available memory you have.  If you don't, the
> scheduler will happily spawn up to 255 threads.

Thank you for your suggestion.

Do you mean the system properties

actors.corePoolSize
actors.maxPoolSize
actors.timeFreq

referred to in scala.actors.FJTaskScheduler2?

Erik Engbrecht

Sep 16, 2009, 10:36:57 AM
to lif...@googlegroups.com
Yes, particularly maxPoolSize.
--
http://erikengbrecht.blogspot.com/

Atsuhiko Yamanaka

Sep 16, 2009, 10:50:28 AM
to lif...@googlegroups.com
Hi,

On Wed, Sep 16, 2009 at 11:36 PM, Erik Engbrecht
<erik.en...@gmail.com> wrote:
> Yes, particularly maxPoolSize.

Thank you for the prompt reply.
I'll try a lower value, for example 50, for maxPoolSize before going to bed.
I look forward to a good result in the morning.

Atsuhiko Yamanaka

Sep 16, 2009, 10:15:18 PM
to lif...@googlegroups.com
Hi,

On Wed, Sep 16, 2009 at 11:50 PM, Atsuhiko Yamanaka
<atsuhiko...@gmail.com> wrote:
> I'll try lower value, for example, 50 for maxPoolSize before going to the bed.
> I'll look forward to the good result in the next morning.

I got a good result.

With maxPoolSize set to 50, the OOME has not appeared.
With that setting, 50 FJTaskRunner instances were allocated and
204,800 FJTaskRunner$VolatileTaskRef instances were in the heap.

I guess that 204,800 corresponds to 50*4096 [1]. According to VisualVM, those
204,800 instances cost 2,457,600 (== 50*4096*12) bytes.

So, for the default maxPoolSize (255), we should be able to prevent this
kind of OOME by adding an extra 12,533,760 bytes of heap for the
VolatileTaskRef instances.
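
The arithmetic above can be checked with a quick sketch (assuming, per the FJTaskRunner source in [1], a 4096-slot VolatileTaskRef array per runner thread, and the roughly 12 bytes per instance reported by VisualVM):

```scala
// Estimated heap cost of the pre-allocated VolatileTaskRef arrays.
// Assumptions: 4096 slots per FJTaskRunner thread (see [1]) and about
// 12 bytes per VolatileTaskRef instance, as reported by VisualVM.
val slotsPerRunner = 4096
val bytesPerRef = 12

def refOverheadBytes(maxPoolSize: Int): Long =
  maxPoolSize.toLong * slotsPerRunner * bytesPerRef

println(refOverheadBytes(50))  // 2457600 bytes, matching the observation
println(refOverheadBytes(255)) // 12533760 bytes for the default of 255
```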

As for changing the implementation of CometActor, discussed in the other
thread, I think the current implementation may be enough for Scala 2.7.x
at least, if we can share the above knowledge.

Anyway, thank you for your help. Now my Lift app is sustainable.
It is a desktop image sharing service and depends heavily on CometActor.
If you are interested, it has been running experimentally at
http://lift.jcraft.com/dstream/scala@tohoku2 .

[1] http://lampsvn.epfl.ch/trac/scala/browser/scala/tags/R_2_7_6_final/src/actors/scala/actors/FJTaskRunner.java#L244
