It is not lightly that I've made the decision to write an alternative Actor library and move the Lift code base from the Scala Actors to Lift Actors (working name). I want to spend a little time talking about the steps that led to the decision as well as the impact that it will have on Lift code.
Since November, I've been chasing a series of memory leaks in the Actor library. Philipp Haller from EPFL has been responsive in addressing the individual memory leaks, but the issue seems to be one of whack-a-mole... each time one memory leak is fixed, another one appears. Further, the existing Actor architecture does not lend itself to the kind of Actor usage cycle that we find in Lift apps. Specifically:
- Lift creates/destroys an Actor for each Comet request. This rapid creation/destruction of Actors caused memory back-ups, and the existing Actor code seems to be oriented to long running Actors rather than Actors with Object-length lifespans.
- The FJ libraries used for Actor scheduling have problems on multi-core machines and are also a source of memory retention issues.
- Replacing the FJ libraries with a scheduler based on java.util.concurrent exposes race/deadlock conditions related to the fact that some parts of the Actor processing run while the Actor itself is synchronized (e.g., testing mailbox items against partial functions).
- The Actors require external threads to function and it's not possible to create external threads in the Google App Engine (making Actor-based functionality, including CometActors, non-functioning in GAE apps).
- Actors are fragile when exceptions are thrown.
- Actors have running and not running states (as compared with objects which can always respond to message sends). In practice, managing the running and not running states is as hard as managing memory in C.
- There are hidden actors associated with each thread which display the above fragility and state management issues.
- And as a practical matter, I've got a couple of applications that are going into production over the next few weeks and cannot wait for the various fixes to make it into Scala 2.8, and the hacks and work-arounds that I've done to the 2.7.4 Actor libraries became too complex for my comfort.
I have written a simple Actor class that is focused on message sending and processing of messages asynchronously. This means there's a single operation that you can perform on Actors: the message send operation. Actors can be specialized (they only accept messages of a certain type). In order to receive a response from an Actor, you can pass in a Future as part of the message and that Future may be satisfied asynchronously. This means that a sender of a message need not be an Actor and that the Actor recipient of a message cannot determine the sender of a message. Actors have two bits of internal state: a mailbox and a flag indicating that the Actor is currently processing messages in its mailbox. The amount of synchronization of Actors is minimal (on inserting messages into the mailbox, on removing messages from the mailbox, and on changing state to/from "processing messages".)
An Actor instance must provide a messageHandler method which returns a PartialFunction that is used to pattern match against the messages in the mailbox. The instance may also provide an optional exception handler that is called if an Exception is thrown during the handling of a message.
The Actor is guaranteed to only be processing one message at a time and the Actor is guaranteed not to be in a monitor (synchronized) during the processing of messages. An Actor is guaranteed to maintain the order of the messages in its mailbox; however, messages that do not currently match the messageHandler will be retained, in the order that they were received, in case the messageHandler changes and they can then be processed.
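The design described above can be sketched roughly as follows. This is an illustration only, not the actual Lift Actor code: the names SketchActor, pool and drain are made up for the example, and the shared daemon thread pool is an assumption. It shows the two pieces of state (mailbox and "processing" flag), locking only around mailbox and flag changes, the handler running outside any monitor, and unmatched messages staying in the mailbox in arrival order.

```scala
import java.util.concurrent.{ExecutorService, Executors, ThreadFactory}
import scala.collection.mutable.ListBuffer
import scala.util.control.NonFatal

object SketchActor {
  // Shared pool of daemon threads (an assumption; any Executor would do).
  val pool: ExecutorService = Executors.newCachedThreadPool(new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r)
      t.setDaemon(true)
      t
    }
  })
}

// Hypothetical trait; T is the message type the actor is specialized to.
trait SketchActor[T] {
  private val mailbox = ListBuffer.empty[T] // guarded by `this`
  private var processing = false            // guarded by `this`

  /** Re-read before every message, so the handler may change over time. */
  protected def messageHandler: PartialFunction[T, Unit]

  /** Optional hook, called when the handler throws a non-fatal exception. */
  protected def exceptionHandler: PartialFunction[Throwable, Unit] =
    PartialFunction.empty

  /** The only operation: asynchronous message send. */
  def !(msg: T): Unit = {
    val startDrain = synchronized {
      mailbox += msg
      val wasIdle = !processing
      processing = true
      wasIdle
    }
    if (startDrain)
      SketchActor.pool.execute(new Runnable { def run(): Unit = drain() })
  }

  // At most one drain loop runs per actor, so only one message is
  // processed at a time. Matching and handling happen outside the lock.
  private def drain(): Unit = {
    var running = true
    while (running) {
      val handler = messageHandler
      val snapshot = synchronized { mailbox.toList }
      // Senders only append, so snapshot indexes stay valid in the mailbox.
      val idx = snapshot.indexWhere(handler.isDefinedAt)
      if (idx >= 0) {
        val msg = synchronized { mailbox.remove(idx) }
        try handler(msg)
        catch {
          case NonFatal(e) =>
            if (exceptionHandler.isDefinedAt(e)) exceptionHandler(e)
        }
      } else {
        // Nothing matches: go idle unless a new message arrived meanwhile.
        val idle = synchronized {
          val unchanged = mailbox.size == snapshot.size
          if (unchanged) processing = false
          unchanged
        }
        running = !idle
      }
    }
  }
}
```

In this sketch a retained message is only re-examined on the next send after the handler changes, which keeps the locking small at the cost of some latency.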
The Lift Actors will, by default, use the java.util.concurrent library for thread pooling, although I have worked out a mechanism for thread-piggy-backing such that if the Actors are running in GAE, they need not use any additional thread (this will enable Lift's comet support in GAE.) There will also be a scheduler (much like the existing ActorPing) which will send a message to an Actor at some time in the future (and on GAE, this scheduler, the Pinger, will not require a separate thread.)
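To make the scheduler idea concrete, here is a minimal sketch of a "send this message later" helper. It is not the Pinger API: the names Sendable, SketchPinger and schedule are hypothetical, and it uses a dedicated timer thread from java.util.concurrent, so it illustrates only the default (non-GAE) case, not the thread-free variant mentioned above.

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, ThreadFactory, TimeUnit}

// Hypothetical: anything that supports the message send operation.
trait Sendable[T] {
  def !(msg: T): Unit
}

object SketchPinger {
  // One daemon timer thread shared by all scheduled sends (an assumption).
  private val timer: ScheduledExecutorService =
    Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
      def newThread(r: Runnable): Thread = {
        val t = new Thread(r, "sketch-pinger")
        t.setDaemon(true)
        t
      }
    })

  /** Sends `msg` to `to` once, roughly `delayMs` milliseconds from now. */
  def schedule[T](to: Sendable[T], msg: T, delayMs: Long): Unit = {
    timer.schedule(
      new Runnable { def run(): Unit = to ! msg },
      delayMs,
      TimeUnit.MILLISECONDS
    )
    ()
  }
}
```

Because the timer thread only performs the send, the recipient still processes the message on its own schedule.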
The changes that you will have to make to your applications are minimal. Actors will no longer have start(), exit(), or link() methods. Actors will always process messages in their mailbox and will be removed from the system by the JVM's garbage collector. Calls to !? will be replaced by calls to ! with a Future as a parameter to the message. Calls to ActorPing will be replaced by calls to Pinger.
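As an illustration of replacing a !? call with a plain send that carries a Future, consider the sketch below. It makes several assumptions: ReplyFuture is a made-up write-once future (not the Lift Future class), Add is a made-up message, and a single-thread executor stands in for the actor so that the example is self-contained.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}

// Hypothetical write-once future; the first call to satisfy wins.
final class ReplyFuture[T] {
  private val done = new CountDownLatch(1)
  @volatile private var value: Option[T] = None

  def satisfy(v: T): Unit = synchronized {
    if (value.isEmpty) {
      value = Some(v)
      done.countDown()
    }
  }

  /** Blocks for up to timeoutMs; returns None if no reply arrived in time. */
  def get(timeoutMs: Long): Option[T] =
    if (done.await(timeoutMs, TimeUnit.MILLISECONDS)) value else None
}

// The message carries the future, so the recipient never needs to know
// who the sender is, and the sender need not be an actor.
final case class Add(a: Int, b: Int, reply: ReplyFuture[Int])

object FutureReplyDemo {
  // Stand-in for an actor's messageHandler.
  val handler: PartialFunction[Any, Unit] = {
    case Add(a, b, reply) => reply.satisfy(a + b)
  }

  def ask(a: Int, b: Int): Option[Int] = {
    val worker = Executors.newSingleThreadExecutor()
    try {
      val reply = new ReplyFuture[Int]
      val msg = Add(a, b, reply)
      // Plays the role of `actor ! msg`: asynchronous, returns immediately.
      worker.execute(new Runnable { def run(): Unit = handler(msg) })
      reply.get(1000L)
    } finally worker.shutdown()
  }
}
```

The caller decides whether and how long to wait on the future, which is the main difference from the blocking !? call.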
You can continue to mix Scala's Actors and Lift's Actors in an application, although Lift's work-arounds to the Scala Actor memory retention issues and scheduling issues will not be turned on by default (they will still be available in the Lift codebase if you're using Scala 2.7.4 and need the work-arounds.)
I am happy to share the Lift Actor code with EPFL and if it makes it into the Scala distribution as SimpleActors or something similar, I'm totally cool with that. I'm not interested in owning or maintaining an Actor library. I am, however, dedicated to making sure that Lift apps can run in production for months (or even years) without retaining memory or having other problems that can impact the stability of applications.
I will have a branch committed up on GitHub tomorrow with Lift ported to the new Actor library.
If you have any questions, please let me know.
Thanks,
David
PS -- ESME people, I'll roll these changes into ESME next week
Given the points you outlined this makes perfect sense to move from
scala.actors - however, if come the 2.8 release EPFL fix the actors
library so that it then becomes acceptable to use within lift again,
would you want to move back to it? IMO, and as you said in your mail,
you (or indeed we) have no interest in maintaining our own actors
implementation and it seems like it would be most optimal to use the
EPFL implementation when it becomes appropriate to.
Cheers, Tim
On Sat, May 23, 2009 at 4:37 AM, Timothy Perrett <timo...@getintheloop.eu> wrote:
> David, this is extremely interesting.
> Given the points you outlined this makes perfect sense to move from
> scala.actors - however, if come the 2.8 release EPFL fix the actors
> library so that it then becomes acceptable to use within lift again,
> would you want to move back to it? IMO, and as you said in your mail,
> you (or indeed we) have no interest in maintaining our own actors
> implementation and it seems like it would be most optimal to use the
> EPFL implementation when it becomes appropriate to.
Sure. I would prefer to build stuff on top of standard tools and
libraries. Having two different Actor implementations could cause
confusion. With that being said, I also expect that if we sit on top of a
standard library, that there is a mechanism for ensuring that systemic
problems (in this case the memory retention issues) are addressed in a
holistic and timely manner.
On Sat, May 23, 2009 at 6:19 AM, David Pollak <feeder.of.the.be...@gmail.com> wrote:
> I am happy to share the Lift Actor code with EPFL and if it makes it into
> the Scala distribution as SimpleActors or something similar, I'm totally
> cool with that. I'm not interested in owning or maintaining an Actor
> library. I am, however, dedicated to making sure that Lift apps can run in
> production for months (or even years) without retaining memory or having
> other problems that can impact the stability of applications.
The cool thing about this is that it provides solid evidence that Scala - as a language - does satisfy the aim of being a scalable language.
I'm referring to the fact that Scala actors are not part of the core language. They're just a library that can be replaced with a different library, which can also provide the 'feel' of native language support for objects of that type. It's such a fundamental part of the language design that Programming in Scala talks about it in Chapter 1, Section 1.
It's timely that you sent the email so soon after the link to the Guy Steele "Growing a Language" OOPSLA presentation (of which I am still in awe) went around on twitter. http://video.google.com/videoplay?docid=-8860158196198824415
I guess this demonstrates that Scala provides the features for growth that Steele says are needed for languages to be successful in the long term, and that he would have liked Java to have. Awesome.
Nice, clear explanation, by the way. Should avoid any NIH allegations on the diggs and reddits of the world ;o)
On Sat, May 23, 2009 at 7:39 AM, Martin Ellis <ellis....@gmail.com> wrote:
> On Sat, May 23, 2009 at 6:19 AM, David Pollak
> <feeder.of.the.be...@gmail.com> wrote:
> > I am happy to share the Lift Actor code with EPFL and if it makes it into
> > the Scala distribution as SimpleActors or something similar, I'm totally
> > cool with that. I'm not interested in owning or maintaining an Actor
> > library. I am, however, dedicated to making sure that Lift apps can run in
> > production for months (or even years) without retaining memory or having
> > other problems that can impact the stability of applications.
> The cool thing about this is that it provides solid evidence that Scala -
> as a language - does satisfy the aim of being a scalable language.
Yes, this is absolutely right. It also points up what I missed in my
original posting... the amazing value of the Scala Actors which include:
- First, and most important to Lift, a conceptual framework for doing
concurrency. Without the Actor model, Lift would not have such a rich model
for building interactive applications.
- A design that keeps true to the Erlang Actor model in that it supports
linking, run states, and other things that make an OTP style library
possible. (Hey Jonas, where's that OTP library?)
- A design that has evolved from simply supporting send/wait-for-response
(!?) to send and immediately receive Future and other cool features.
- Blocking until Futures are satisfied without consuming a thread if the
Future was within a react-based Actor.
- An implementation that worked well in JDK 1.4. Many of the current
memory and scheduling issues are a result of the fact that Scala's Actors
worked on JDK 1.4, back when 1.4 was the target for the Scala distribution.
Scala is a language that supports multiple Actor libraries, just as it
supports multiple collections libraries. There are no built-in collections
classes in Scala. All collections are implemented at the library level.
And just as there were defects in some of the Scala collections classes that
David MacIver fixed, there are existing defects in the Actor libraries.
Just as there are specialized Map() collections that are appearing for Scala
that maximize performance for particular data types and/or key
distributions, we are creating a specialized Actor library that's optimized
for the kind of use that we see in Lift and web apps in general.
This is a testament to Scala's flexibility and to the foresight of including
such a powerful concurrency library, Actors, as part of the distribution.
But for those two things, Lift would not be nearly as cool as it is.
So, please do not read this thread as a repudiation of the Scala Actor
library, please read it as an expansion of what is possible within Scala.
> I'm referring to the fact that Scala actors are not part of the core
> language. They're just a library that can be replaced with a different
> library, which can also provide the 'feel' of native language support for
> objects of that type.
> It's such a fundamental part of the language design that Programming in
> Scala talks about it in Chapter 1, Section 1.
> It's timely that you sent the email so soon after the link to the Guy
> Steele "Growing a Language" OOPSLA presentation (of which I am still in
> awe) went around on twitter.
> http://video.google.com/videoplay?docid=-8860158196198824415
> I guess this demonstrates that Scala provides the features for growth that
> Steele says are needed for languages to be successful in the long term,
> and that he would have liked Java to have. Awesome.
> Nice, clear explanation, by the way. Should avoid any NIH allegations on
> the diggs and reddits of the world ;o)
> - First, and most important to Lift, a conceptual framework for doing
>   concurrency. Without the Actor model, Lift would not have such a rich
>   model for building interactive applications.
> - A design that keeps true to the Erlang Actor model in that it supports
>   linking, run states, and other things that make an OTP style library
>   possible. (Hey Jonas, where's that OTP library?)
Or do you mean that not much has happened there for a while? I certainly plan to expand it quite a lot, and even have some code that could make its way into it eventually.
> I have been looking at scala.actors to see how far we are from meeting
> Lift's requirements, and what LiftActor provides that scala.actors
> don't. I split my reply into two mails for better modularity. In the
> next installment you can read about how (something like) LiftActor could
> be integrated into scala.actors.
> To do this, let me first address some of David's points. Disclaimer: I
> don't want to argue that scala.actors is perfect. There are some
> problems and we are fixing them as we speak. (Kudos to Erik, Rich and
> Mirco!)
> > * Lift creates/destroys an Actor for each Comet request. This rapid
> >   creation/destruction of Actors caused memory back-ups, and the
> >   existing Actor code seems to be oriented to long running Actors
> >   rather than Actors with Object-length lifespans.
> scala.actors (Actors in the following) are designed to support these
> short object-length lifespans. Indeed, creating an Actor is very cheap.
> If the user chooses to shut down the underlying thread pool manually,
> destruction of an Actor amounts to garbage-collecting it. Destruction
> only involves (little) more if the library should shut down the thread
> pool automatically, or Actors are linked together.
In practice, this has been a serious memory retention issue. I've had to write a job that scavenges exited Actors from the ActorGC pool as well as a custom scheduler to work around these issues. In theory this may be true, but in practice, it's not. In practice, one must manually exit an Actor or the Actor is retained.
> > * Replacing the FJ libraries with a scheduler based on
> >   java.util.concurrent exposes race/deadlock conditions related to
> >   the fact that some parts of the Actor processing (e.g., testing
> >   mailbox items against partial functions while the Actor itself is
> >   synchronized)
> If I understand correctly this is the subject of ticket #2009:
> Here, the reasoning is indeed as Erik suggests, namely that in the send
> method
> - the check for a matching message is done on the sender's thread while
>   the receiver's lock is held,
> - the check is only done if the receiver is guaranteed to wait for a
>   message (therefore, it does not touch any local state), and
> - multiple senders are serialized using the lock of the receiver.
> Indeed, the guarantee that we want to provide is that Actors only
> execute on one thread at a time. So, if there is a problem, I suppose it
> must be somewhere else.
There must be two guarantees: (1) that the Actor will only execute on one thread at a time and (2) that the Actor will not be in a monitor during execution. If the Actor is in a monitor during execution (as is the case with send, and as was reported in ticket #2009), there is a potential deadlock.
> > * The Actors require external threads to function and it's not
> >   possible to create external threads in the Google App Engine
> >   (making Actor-based functionality including CometActors
> >   non-functioning in GAE apps)
> As I mentioned before, Actors can be made to use schedulers that do not
> create external threads. So, in principle this should make it possible
> to run Actors on GAE.
Except that send() is in a monitor, so if you piggy-back message processing on send, you've got a deadlock. Further, if A sends a message to B which sends a message to A, how is that processed on a single thread?
> > * Actors are fragile when exceptions are thrown
> Here, I believe we can work out a good solution. You have already given
> some valuable input, David, with your LiftActor class. So, either we
> extend the programming model in that direction, or we refine
> trapExit/link etc. and make them more robust.
> > * Actors have running and not running states (as compared with
> >   objects which can always respond to message sends). In practice,
> >   managing the running and not running states is as hard as managing
> >   memory in C.
> Actually, Actors only have to be started to execute the code in act()
> before the first `react`. Upon hitting the first `react`, Actor and
> LiftActor behave essentially the same with respect to running states.
> And, again, you don't have to terminate your Actors if you are OK with
> shutting down the thread pool yourself.
But there is a single thread pool for the entire app (or at least all the Lift parts of it) so the thread pool is retained for the whole time the app is running. Managing memory by thread pools is different and far less robust than using the JVM's GC mechanism.
> > * There are hidden actors associated with each thread which display
> >   the above fragility and state management issues
> Basically, what we have is that once a non-actor thread calls an
> actor-based operation, an ActorProxy instance is created (which
> basically holds the mailbox) that is stored in a ThreadLocal (see the
> Actor object). It is used to enable a normal thread to receive messages.
> Also, it is created when a non-actor thread sends a message to another
> actor, so that the receiver always has access to the sender of a message.
> It should always be possible to write an application so that none of
> these ActorProxy instances are created. However, even if they are
> created I currently don't see how they could result in memory retention
> problems.
The problem that I've seen in the past is that if one of the ActorProxy instances gets into a non-running state, then the thread is dead to any !? operations. This makes every !? operation fragile.
> I hope this clarifies some (important) points that David raises. The
> bottom line is that our goals are not so far apart. I believe with some
> tuning effort scala.actors could meet the requirements of Lift.