New Lift Actor code

David Pollak

May 23, 2009, 1:19:28 AM

to liftweb, scala-i...@listes.epfl.ch, esme...@incubator.apache.org

Folks,

It is not lightly that I've made the decision to write an alternative Actor library and move the Lift code base from the Scala Actors to Lift Actors (working name). I want to spend a little time talking about the steps that led to the decision as well as the impact that it will have on Lift code.

Since November, I've been chasing a series of memory leaks in the Actor library. Philipp Haller from EPFL has been responsive in addressing the individual memory leaks, but the issue seems to be one of whack-a-mole... each time one memory leak is fixed, another one appears. Further, the existing Actor architecture does not lend itself to the kind of Actor usage cycle that we find in Lift apps. Specifically:

- Lift creates/destroys an Actor for each Comet request. This rapid creation/destruction of Actors caused memory back-ups, and the existing Actor code seems to be oriented to long-running Actors rather than Actors with Object-length lifespans.
- The FJ libraries used for Actor scheduling have problems on multi-core machines and are also a source of memory retention issues.
- Replacing the FJ libraries with a scheduler based on java.util.concurrent exposes race/deadlock conditions related to the fact that some parts of the Actor processing (e.g., testing mailbox items against partial functions) happen while the Actor itself is synchronized.
- The Actors require external threads to function and it's not possible to create external threads in the Google App Engine (making Actor-based functionality, including CometActors, non-functioning in GAE apps).
- Actors are fragile when exceptions are thrown.
- Actors have running and not running states (as compared with objects, which can always respond to message sends). In practice, managing the running and not running states is as hard as managing memory in C.
- There are hidden actors associated with each thread which display the above fragility and state management issues.
- And as a practical matter, I've got a couple of applications that are going into production over the next few weeks and cannot wait for the various fixes to make it into Scala 2.8, and the hacks and work-arounds that I've made to the 2.7.4 Actor libraries have become too complex for my comfort.

I have written a simple Actor class that is focused on message sending and processing of messages asynchronously. This means there's a single operation that you can perform on Actors, the message send operation. Actors can be specialized (they only accept messages of a certain type). In order to receive a response from an Actor, you can pass in a Future as part of the message and that Future may be satisfied asynchronously. This means that a sender of a message need not be an Actor and that the Actor recipient of a message cannot determine the sender of a message. Actors have two bits of internal state: a mailbox and a flag indicating that the Actor is currently processing messages in its mailbox. The amount of synchronization of Actors is minimal (on inserting messages into the mailbox, on removing messages from the mailbox, and on changing state to/from "processing messages").

An Actor instance must provide a messageHandler method which returns a PartialFunction that is used to pattern match against the messages in the mailbox. The instance may also provide an optional exception handler that is called if an Exception is thrown during the handling of a message.

The Actor is guaranteed to only be processing one message at a time and the Actor is guaranteed not to be in a monitor (synchronized) during the processing of messages. An Actor is guaranteed to maintain the order of the messages in its mailbox. However, messages that do not currently match the messageHandler will be retained, in the order that they were received, in case the messageHandler changes and they can then be processed.
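
The design described above can be illustrated with a minimal sketch. This is not the Lift Actor code: SketchActor and all of its members are hypothetical names, and the details (a ListBuffer mailbox, a snapshot taken before matching) are assumptions chosen only to show one way of keeping pattern matching and message handling outside the monitor.

```scala
import java.util.concurrent.Executor
import scala.collection.mutable.ListBuffer

// Illustrative only: SketchActor and its members are hypothetical names,
// not the Lift Actor API.
abstract class SketchActor[T](executor: Executor) {
  // The two bits of internal state, both guarded by this actor's monitor.
  private val mailbox = ListBuffer.empty[T]
  private var processing = false

  // Matched against mailbox items; re-read on every scan, so it may change.
  protected def messageHandler: PartialFunction[T, Unit]

  // Optional hook for exceptions thrown while handling a message.
  protected def exceptionHandler: PartialFunction[Throwable, Unit] =
    PartialFunction.empty

  // The single operation: enqueue the message and return immediately.
  def !(msg: T): Unit = {
    val startLoop = synchronized {
      mailbox += msg
      val idle = !processing
      processing = true
      idle
    }
    if (startLoop) executor.execute(() => processMailbox())
  }

  private def processMailbox(): Unit = {
    var running = true
    while (running) {
      val snapshot = synchronized { mailbox.toList }
      val handler = messageHandler
      // Matching happens outside the monitor.
      val idx = snapshot.indexWhere(m => handler.isDefinedAt(m))
      if (idx >= 0) {
        // Senders only append and only this loop removes, so idx is still valid.
        val msg = synchronized { mailbox.remove(idx) }
        // The handler also runs outside the monitor.
        try handler(msg)
        catch {
          case e: Exception =>
            if (exceptionHandler.isDefinedAt(e)) exceptionHandler(e)
        }
      } else {
        // Nothing matches: go idle unless a new message arrived meanwhile.
        running = synchronized {
          if (mailbox.size == snapshot.size) { processing = false; false }
          else true
        }
      }
    }
  }
}
```

Passing an Executor that runs the task on the calling thread gives a rough approximation of piggy-backing on the sender's thread. Because the send only holds the monitor while appending to the mailbox, a handler that sends a message back to its own actor simply enqueues it for the loop that is already running.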

The Lift Actors will, by default, use the java.util.concurrent library for thread pooling, although I have worked out a mechanism for thread-piggy-backing such that if the Actors are running in GAE, they need not use any additional thread (this will enable Lift's comet support in GAE.) There will also be a scheduler (much like the existing ActorPing) which will send a message to an Actor at some time in the future (and on GAE, this scheduler, the Pinger, will not require a separate thread.)
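
As a rough illustration of the scheduler idea for the default (non-GAE) case, here is a minimal sketch built on java.util.concurrent. PingerSketch and its schedule signature are hypothetical and are not the Lift Pinger API; the thread-free GAE variant mentioned above is not shown.

```scala
import java.util.concurrent.{Executors, ScheduledFuture, TimeUnit}

// Illustrative only: PingerSketch is a hypothetical name, not the Lift Pinger.
object PingerSketch {
  // One timer thread is enough, since each task only performs a message send.
  private val timer = Executors.newSingleThreadScheduledExecutor()

  // Deliver `msg` through `send` after `delayMillis` milliseconds.
  // `send` would typically be an actor's message send operation.
  def schedule[T](send: T => Unit, msg: T, delayMillis: Long): ScheduledFuture[_] =
    timer.schedule(
      new Runnable { def run(): Unit = send(msg) },
      delayMillis,
      TimeUnit.MILLISECONDS
    )

  // Stop the timer thread so the JVM can exit.
  def shutdown(): Unit = timer.shutdown()
}
```

The returned ScheduledFuture can be used to cancel a pending ping before it fires.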

The changes that you will have to make to your applications are minimal. Actors will no longer have start(), exit(), or link() methods. Actors will always process messages in their mailbox and will be removed from the system by the JVM's garbage collector. Calls to !? will be replaced by calls to ! with a Future as a parameter to the message. Calls to ActorPing will be replaced by calls to Pinger.
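
To make the replacement of !? concrete, here is a minimal sketch of a request/reply exchange in which the reply slot travels inside the message. It uses scala.concurrent.Promise from later Scala versions as a stand-in for the Future described above; CounterSketch, Inc and GetCount are hypothetical names, and a single-thread executor stands in for the actor's mailbox processing.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, Promise}
import scala.concurrent.duration._

// Hypothetical message types: the reply slot is part of the request.
sealed trait CounterMsg
case object Inc extends CounterMsg
final case class GetCount(reply: Promise[Int]) extends CounterMsg

// Stand-in for an actor: the only operation is an asynchronous send.
class CounterSketch {
  // A single worker thread preserves send order and handles one message at a time.
  private val worker = Executors.newSingleThreadExecutor()
  private var count = 0 // only touched on the worker thread

  def !(msg: CounterMsg): Unit = worker.execute(new Runnable {
    def run(): Unit = msg match {
      case Inc             => count += 1
      case GetCount(reply) => reply.success(count); ()
    }
  })

  def shutdown(): Unit = worker.shutdown()
}

// The caller is a plain thread, not an actor. It decides whether and how
// long to wait for the answer, and the recipient never learns who sent it.
def askCount(counter: CounterSketch): Int = {
  val reply = Promise[Int]()
  counter ! GetCount(reply)
  Await.result(reply.future, 2.seconds)
}
```

The blocking wait in askCount is only there to mirror the old !? behaviour; the same Promise could equally be completed and consumed asynchronously.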

You can continue to mix Scala's Actors and Lift's Actors in an application, although Lift's work-arounds to the Scala Actor memory retention issues and scheduling issues will not be turned on by default (they will still be available in the Lift codebase if you're using Scala 2.7.4 and need the work-arounds.)

I am happy to share the Lift Actor code with EPFL and if it makes it into the Scala distribution as SimpleActors or something similar, I'm totally cool with that. I'm not interested in owning or maintaining an Actor library. I am, however, dedicated to making sure that Lift apps can run in production for months (or even years) without retaining memory or having other problems that can impact the stability of applications.

I will have a branch committed up on GitHub tomorrow with Lift ported to the new Actor library.

If you have any questions, please let me know.

Thanks,

David

PS -- ESME people, I'll roll these changes into ESME next week

--
Lift, the simply functional web framework http://liftweb.net
Beginning Scala http://www.apress.com/book/view/1430219890
Follow me: http://twitter.com/dpp
Git some: http://github.com/dpp

Timothy Perrett

May 23, 2009, 7:37:22 AM

to Lift

David, this is extremely interesting.

Given the points you outlined, it makes perfect sense to move from
scala.actors - however, if come the 2.8 release EPFL fix the actors
library so that it then becomes acceptable to use within Lift again,
would you want to move back to it? IMO, and as you said in your mail,
you (or indeed we) have no interest in maintaining our own actors
implementation and it seems like it would be most optimal to use the
EPFL implementation when it becomes appropriate to.

Cheers, Tim


David Pollak

May 23, 2009, 10:29:36 AM

to lif...@googlegroups.com

On Sat, May 23, 2009 at 4:37 AM, Timothy Perrett <tim...@getintheloop.eu> wrote:

> David, this is extremely interesting.
>
> Given the points you outlined, it makes perfect sense to move from
> scala.actors - however, if come the 2.8 release EPFL fix the actors
> library so that it then becomes acceptable to use within Lift again,
> would you want to move back to it? IMO, and as you said in your mail,
> you (or indeed we) have no interest in maintaining our own actors
> implementation and it seems like it would be most optimal to use the
> EPFL implementation when it becomes appropriate to.

Sure. I would prefer to build stuff on top of standard tools and libraries. Having two different Actor implementations could cause confusion. With that being said, I also expect that if we sit on top of a standard library, there is a mechanism for ensuring that systemic problems (in this case the memory retention issues) are addressed in a holistic and timely manner.

Thanks,

David


Martin Ellis

May 23, 2009, 10:39:31 AM

to lif...@googlegroups.com

On Sat, May 23, 2009 at 6:19 AM, David Pollak
<feeder.of...@gmail.com> wrote:
> I am happy to share the Lift Actor code with EPFL and if it makes it into
> the Scala distribution as SimpleActors or something similar, I'm totally
> cool with that. I'm not interested in owning or maintaining an Actor
> library. I am however, dedicated to making sure that Lift apps can run in
> production for months (or even years) without retaining memory or having
> other problems that can impact the stability of applications.

The cool thing about this is that it provides solid evidence that Scala -
as a language - does satisfy the aim of being a scalable language.

I'm referring to the fact that Scala actors are not part of the core language.
They're just a library that can be replaced with a different library, which can
also provide the 'feel' of native language support for objects of that type.
It's such a fundamental part of the language design that Programming in
Scala talks about it in Chapter 1, Section 1.

It's timely that you sent the email so soon after the link to the Guy Steele
"Growing a Language" OOPSLA presentation (of which I am still in awe)
went around on twitter.
http://video.google.com/videoplay?docid=-8860158196198824415

I guess this demonstrates that Scala provides the features for growth that
Steele says are needed for languages to be successful in the long term,
and that he would have liked Java to have. Awesome.

Nice, clear explanation, by the way. Should avoid any NIH allegations on
the diggs and reddits of the world ;o)

Martin

David Pollak

May 23, 2009, 11:58:33 AM

to lif...@googlegroups.com, scala-i...@listes.epfl.ch, esme...@incubator.apache.org

On Sat, May 23, 2009 at 7:39 AM, Martin Ellis <elli...@gmail.com> wrote:

> The cool thing about this is that it provides solid evidence that Scala -
> as a language - does satisfy the aim of being a scalable language.

Yes, this is absolutely right. It also points up what I missed in my original posting... the amazing value of the Scala Actors which include:

- First, and most important to Lift, a conceptual framework for doing concurrency. Without the Actor model, Lift would not have such a rich model for building interactive applications.
- A design that keeps true to the Erlang Actor model in that it supports linking, run states, and other things that make an OTP style library possible. (Hey Jonas, where's that OTP library?)
- A design that has evolved from simply supporting send/wait-for-response (!?) to send and immediately receive Future and other cool features.
- Blocking until Futures are satisfied without consuming a thread if the Future was within a react-based Actor.
- An implementation that worked well in JDK 1.4. Many of the current memory and scheduling issues are a result of the fact that Scala's Actors worked on JDK 1.4, back when 1.4 was the target for the Scala distribution.

Scala is a language that supports multiple Actor libraries, just as it supports multiple collections libraries. There are no built-in collections classes in Scala. All collections are implemented at the library level. And just as there were defects in some of the Scala collections classes that David MacIver fixed, there are existing defects in the Actor libraries. Just as there are specialized Map() collections that are appearing for Scala that maximize performance for particular data types and/or key distributions, we are creating a specialized Actor library that's optimized for the kind of use that we see in Lift and web apps in general.

This is a testament to Scala's flexibility and to the foresight of including such a powerful concurrency library, Actors, as part of the distribution. But for those two things, Lift would not be nearly as cool as it is.

So, please do not read this thread as a repudiation of the Scala Actor library, please read it as an expansion of what is possible within Scala.

Thanks,

David


Jonas Bonér

May 23, 2009, 4:20:53 PM

to lif...@googlegroups.com, scala-i...@listes.epfl.ch, esme...@incubator.apache.org

> First, and most important to Lift, a conceptual framework for doing
> concurrency. Without the Actor model, Lift would not have such a rich model
> for building interactive applications.
> A design that keeps true to the Erlang Actor model in that it supports
> linking, run states, and other things that make an OTP style library
> possible. (Hey Jonas, where's that OTP library?)

Here is the repo:
http://github.com/jboner/scala-otp/tree/master

Or do you mean that not much has happened there for a while?
I certainly plan to expand it quite a lot, and I even have some code
that could make its way into it eventually.

--
Jonas Bonér

twitter: @jboner
blog: http://jonasboner.com
work: http://crisp.se
work: http://scalablesolutions.se
code: http://github.com/jboner

David Pollak

May 26, 2009, 12:22:10 PM

to Philipp Haller, liftweb, scala-i...@listes.epfl.ch

On Mon, May 25, 2009 at 9:52 AM, Philipp Haller <philipp...@epfl.ch> wrote:

Hi all,

I have been looking at scala.actors to see how far we are from meeting
Lift's requirements, and what LiftActor provides that scala.actors
don't. I split my reply into two mails for better modularity. In the
next installment you can read about how (something like) LiftActor could
be integrated into scala.actors.

To do this, let me first address some of David's points. Disclaimer: I
don't want to argue that scala.actors is perfect. There are some
problems and we are fixing them as we speak. (Kudos to Erik, Rich and
Mirco!)

> * Lift creates/destroys an Actor for each Comet request. This rapid
> creation/destruction of Actors caused memory back-ups, and the
> existing Actor code seems to be oriented to long running Actors
> rather than Actors with Object-length lifespans.

scala.actors (Actors in the following) are designed to support these
short object-length lifespans. Indeed, creating an Actor is very cheap.
If the user chooses to shut down the underlying thread pool manually,
destruction of an Actor amounts to garbage-collecting it. Destruction
only involves (a little) more if the library should shut down the thread
pool automatically, or Actors are linked together.

In practice, this has been a serious memory retention issue. I've had to write a job that scavenges exited Actors from the ActorGC pool as well as a custom scheduler to work around these issues. In theory this may be true, but in practice, it's not. In practice, one must manually exit an Actor or the Actor is retained.

> * The FJ libraries used for Actor scheduling have problems on
> multi-core machines and are also a source of memory retention issues.

We should have replaced the old (pre-JDK7) FJ framework earlier. Note
that Actors can override the scheduler used to execute them:

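    import java.util.concurrent.Executors
    // Actor and SchedulerAdapter are part of the scala.actors library.

    object MyExecutorScheduler extends SchedulerAdapter {
      val pool = Executors.newCachedThreadPool() // for example
      // Run each scheduled block on the java.util.concurrent pool.
      def execute(block: => Unit) =
        pool.execute(new Runnable {
          def run() { block }
        })
    }

    trait MyActor extends Actor {
      // Replace the default scheduler with the executor-backed one.
      override def scheduler = MyExecutorScheduler
    }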

In 2.8, we intend to use a scheduler based on j.u.c.ThreadPoolExecutor
as a default.

We've replaced the default scheduler with one that is substantially similar to the above scheduler. It has cured a lot of problems.

> * Replacing the FJ libraries with a scheduler based on
> java.util.concurrent exposes race/deadlock conditions related to
> the fact that some parts of the Actor processing (e.g., testing
> mailbox items against partial functions while the Actor itself is
> synchronized)

If I understand correctly this is the subject of ticket #2009:

https://lampsvn.epfl.ch/trac/scala/ticket/2009

Here, the reasoning is indeed as Erik suggests, namely that in the send
method
- the check for a matching message is done on the sender's thread while
the receiver's lock is held,
- the check is only done if the receiver is guaranteed to wait for a
message (therefore, it does not touch any local state), and
- multiple senders are serialized using the lock of the receiver.

Indeed, the guarantee that we want to provide is that Actors only
execute on one thread at a time. So, if there is a problem, I suppose it
must be somewhere else.

There must be two guarantees: (1) that the Actor will only execute on one thread at once and (2) that the Actor will not be in a monitor during execution. If the Actor is in a monitor during execution (as is the case with send and as was reported in ticket #2009), there is a potential deadlock.

> * The Actors require external threads to function and it's not
> possible to create external threads in the Google App Engine
> (making Actor-based functionality including CometActors
> non-functioning in GAE apps)

As I mentioned before, Actors can be made to use schedulers that do not
create external threads. So, in principle this should make it possible
to run Actors on GAE.

Except that send() is in a monitor, so if you piggy-back message processing on send, you've got a deadlock. Further, if A sends a message to B which sends a message to A, how is that processed on a single thread?

> * Actors are fragile when exceptions are thrown

Here, I believe we can work out a good solution. You have already given
some valuable input, David, with your LiftActor class. So, either we
extend the programming model in that direction, or we refine
trapExit/link etc. and make them more robust.

> * Actors have running and not running states (as compared with
> objects which can always respond to message sends). In practice,
> managing the running and not running states is as hard as managing
> memory in C.

Actually, Actors only have to be started to execute the code in act()
before the first `react`. Upon hitting the first `react`, Actor and
LiftActor behave essentially the same with respect to running states.
And, again, you don't have to terminate your Actors if you are OK with
shutting down the thread pool yourself.

But there is a single thread pool for the entire app (or at least all the Lift parts of it) so the thread pool is retained for the whole time the app is running. Managing memory by thread pools is different and far less robust than using the JVM's GC mechanism.

> * There are hidden actors associated with each thread which display
> the above fragility and state management issues

Basically, what we have is that once a non-actor thread calls an
actor-based operation, an ActorProxy instance is created (which
basically holds the mailbox) that is stored in a ThreadLocal (see the
Actor object). It is used to enable a normal thread to receive messages.
Also, it is created when a non-actor thread sends a message to another
actor, so that the receiver always has access to the sender of a message.
It should always be possible to write an application so that none of
these ActorProxy instances are created. However, even if they are
created I currently don't see how they could result in memory retention
problems.

The problem that I've seen in the past is that if one of the ActorProxy instances gets into a non-running state, then the thread is dead to any !? operations. This makes every !? operation fragile.

I hope this clarifies some (important) points that David raises. The
bottom line is that our goals are not so far apart. I believe with some
tuning effort scala.actors could meet the requirements of Lift.