In JRuby, the issue has been forced upon us by the inclusion of
somewhat localized coroutines for external enumeration in Ruby 1.8.7
and general coroutines in the form of fibers in 1.9. The two cases are
slightly different.
For external enumerating coroutines, we can have options that simulate
coroutine state. For example, the Enumerator for a core Array class
could know how to "next" through the array elements without using a
coroutine to call "each" with a block. But for other cases that may
have arbitrarily-complex iteration logic, we need a full-on coroutine
to hop in and out of that logic. And for these cases we must spin up a
thread to do the iteration.
For fibers, the logic is almost always going to be arbitrarily
complex, so they'll be a fiber in every case. We'll use pooling tricks
to reduce the overhead of spinning up fibers, but we can't avoid using
threads to implement them.
Functionally, spinning up threads where needed and ping-ponging state
back and forth is not a serious functional challenge. The challenge is
in managing the *lifecycle* of those threads.
The use of threads for simulating coroutines has a few immediate problems:
1. Threads are more expensive to spin up
2. Thread counts may be limited by the host platform
3. Threads root objects, for GC purposes
4. Threads will not GC until terminated
The latter two issues have been keeping me up nights. I'm looking for
suggestions.
Because Threads can root objects, and because we may want a fiber or
enumerator to GC before they've reached a terminal state (finished
enumerating/fiber exits), two major complexities arrive.
* The fiber/enum object can hold a hard reference to the thread, but
must define a finalizer so that when they are ready for collection
they can terminate the coroutine thread
* The thread must have a hard reference to the fiber/enum object, so
as not to interfere with it coming eligible for collection
Both of these stem from the realization that an incomplete
enumeration/fiber/generator can ultimately hold state forever, if the
thread doing that enumeration can't be killed once it would be
eligible for GC.
If it were possible to create "unrooting" threads, or perhaps threads
that only reference but do not root objects in their stacks, this all
become much easier...but I don't think there's a way currently to
create such threads, is there?
Are there other ways to implement this I've missed? Of course I know
about the approach Rife and Scala take for continuations in very
localized conditions, but they can't work across aribitrary libraries
that have not been similarly manipulated.
I'm now getting more desperate to see coroutines added to *any*
production-class JVM, even if I have to sign a waiver and pay a pound
of flesh to get at it.
- Charlie
> Ok, here's a classic challenge for those of us on the JVM: coroutines.
In order to know what you need, you need to say what *kind* of
coroutines these are. The Lua people have three binary features that
a given language's coroutines may or may not have:
1) Are coroutines first-class objects that can be passed around as
arguments or results and stored in data structures, or are they only
creatable and runnable using special syntax?
2) Can a coroutine resume any other coroutine (full coroutines,
symmetric coroutines), or only the coroutine that invoked it
(semi-coroutines, asymmetric coroutines, semi-symmetric coroutines)?
3) Can only the coroutine itself yield (shallow), or can any
subroutine invoked by the coroutine yield on its behalf (deep)?
Here's their paper on how it's done in Lua, which has first-class deep
semi-coroutines: http://www.inf.puc-rio.br/~roberto/docs/corosblp.pdf
.
--
GMail doesn't have rotating .sigs, but you can see mine at
http://www.ccil.org/~cowan/signatures
> Ok, here's a classic challenge for those of us on the JVM: coroutines.
Yes indeed.
FTR, here are some notes from our lively talk on this subject at the
JVM Language Summit:
http://wiki.jvmlangsummit.com/MotionsToContinue
> For external enumerating coroutines, we can have options that simulate
> coroutine state. For example, the Enumerator for a core Array class
> could know how to "next" through the array elements without using a
> coroutine to call "each" with a block. But for other cases that may
> have arbitrarily-complex iteration logic, we need a full-on coroutine
> to hop in and out of that logic. And for these cases we must spin up a
> thread to do the iteration.
Brian Goetz just pointed out Marcin Rzeznicki's project to me on this
subject. Check out this lovely samefringe code:
http://code.google.com/p/coroutines/source/browse/trunk/src/mr/go/coroutines/tests/classes/Tree.java
(RIFE probably offers the same thing; do they have a similarly elegant
samefringe demo?)
I think of such approaches "bytecode shredding". The idea is to
transform bytecodes to CPS after compilation. The problem is that it
is hard to know which bytecode to do this to; you don't want to abuse
the JVM by doing it to the bytecodes. I think integrating the
shredding into the JVM would be most efficient, since it would make
the transformation lazy and transparent.
> For fibers, the logic is almost always going to be arbitrarily
> complex, so they'll be a fiber in every case. We'll use pooling tricks
> to reduce the overhead of spinning up fibers, but we can't avoid using
> threads to implement them.
>
> Functionally, spinning up threads where needed and ping-ponging state
> back and forth is not a serious functional challenge. The challenge is
> in managing the *lifecycle* of those threads.
>
> The use of threads for simulating coroutines has a few immediate
> problems:
>
> 1. Threads are more expensive to spin up
> 2. Thread counts may be limited by the host platform
> 3. Threads root objects, for GC purposes
> 4. Threads will not GC until terminated
>
> The latter two issues have been keeping me up nights. I'm looking for
> suggestions.
A thread T1 could (perhaps) be GC-ed if it could be proven that T1 is
blocked, and that all references to T1 are unreachable, and (lastly)
that T1 could be unblocked only by some operation on a reachable
reference to T1. I doubt if any JVM has such logic in it, and I
suspect that Java's threading APIs might offer resistance to proving
the last statement. But maybe not.
Even if all that were to become workable, it would still be
unsatisfactory to be creating lots of fibers speculatively, and
letting the GC clean them up. The problem would be, not that the GC
couldn't clean them up, but that each one would be a short-lived
megabyte-scale object, even though, in the source code, it might
appear to have only a few words of interesting state.
> If it were possible to create "unrooting" threads, or perhaps threads
> that only reference but do not root objects in their stacks, this all
> become much easier...but I don't think there's a way currently to
> create such threads, is there?
No. A live thread is assumed to be capable of future side effects on
the whole system, so cannot be pruned or killed. (Except, maybe, as
T1 above.)
> Are there other ways to implement this I've missed? Of course I know
> about the approach Rife and Scala take for continuations in very
> localized conditions, but they can't work across aribitrary libraries
> that have not been similarly manipulated.
Yep; that's a subset of the "siloing" problem Neal and Josh were
insistent about at the Summit. But, how serious is this problem (in
your case)?
> I'm now getting more desperate to see coroutines added to *any*
> production-class JVM, even if I have to sign a waiver and pay a pound
> of flesh to get at it.
Send chocolate to Lukas Stadler and Marcin Rzeznicki!
We need to experiment more with this stuff. I am confident the
community can find graceful trade-offs to scale good demos like
Tree.samefringe into industrial-strength APIs. But it will require
experimentation with the corner cases. When the design settles out,
it will (presumably) be attractive for JVM vendors to optimize such
things.
On Sep 25, 2009, at 11:47 PM, John Cowan wrote:
> In order to know what you need, you need to say what *kind* of
> coroutines these are.
At the Summit we noted these degrees of freedom relevant to the JVM:
> • full or scoped
> • restartability: one-shot or repeated restart
> • reflectability: opaque, read-only, serializable (forgeable!)
> • mobility: thread-bound, JVM-bound, mobile
> The Lua people have three binary features that
> a given language's coroutines may or may not have:
>
> 1) Are coroutines first-class objects that can be passed around as
> arguments or results and stored in data structures, or are they only
> creatable and runnable using special syntax?
Any JVM-based coroutine is likely to be first-class. That in turn
leads to further questions of whether it is mobile and/or reflectable
in any way.
> 2) Can a coroutine resume any other coroutine (full coroutines,
> symmetric coroutines), or only the coroutine that invoked it
> (semi-coroutines, asymmetric coroutines, semi-symmetric coroutines)?
That's a good question. Does anyone know what the C# answer to that
is (for yielding iterators)?
> 3) Can only the coroutine itself yield (shallow), or can any
> subroutine invoked by the coroutine yield on its behalf (deep)?
To do samefringe (and IMO be useful) you need deep coroutines. And:
If coroutine C1 is going to call coroutine C2, which then will yield a
value back to C1's original caller, must C1 be compiled specially,
perhaps to call C2 via a special calling sequence? Or (this is
Charlie's siloing problem, I think) can C1 (and its coder and its
compiler) be ignorant of any coroutining, looking just like normal
Java code that happens to call C2?
When we finish figuring out the trade-offs, my guess is that we will
need C1 to be "knowing", and somehow to advertise to the JVM
(statically) that it is calling a coroutine C2.
> Here's their paper on how it's done in Lua, which has first-class deep
> semi-coroutines: http://www.inf.puc-rio.br/~roberto/docs/corosblp.pdf
> .
Lovely; thanks.
-- John
> 2) Can a coroutine resume any other coroutine (full coroutines,That's a good question. Does anyone know what the C# answer to that
> symmetric coroutines), or only the coroutine that invoked it
> (semi-coroutines, asymmetric coroutines, semi-symmetric coroutines)?
is (for yielding iterators)?