I'm looking at implementing a language that would be best represented
using CPS. Now, I can't really find any references to full CPS
implementations on the JVM anywhere. I'm looking at using the technique
Kawa is planning for compiled code (see the last part at
http://www.gnu.org/software/kawa/internals/complications.html). Has
anyone else done this? It seems one of the natural side effects of
something like this would be a need to have almost all primitives
implemented in the language itself, since writing this kind of code by
hand is a bit of a pain.
Second, has anyone tried, or is it even possible to do real CPS in
interpreted style on the JVM? It would seem as that would require at
least TCO to work.
I guess my implementation won't have any interpretation if that's the
case. =)
Cheers
--
Ola Bini (http://ola-bini.blogspot.com)
JRuby Core Developer
Developer, ThoughtWorks Studios (http://studios.thoughtworks.com)
Practical JRuby on Rails (http://apress.com/book/view/9781590598818)
"Yields falsehood when quined" yields falsehood when quined.
> I'm looking at implementing a language that would be best represented
> using CPS. Now, I can't really find any references to full CPS
> implementations on the JVM anywhere. I'm looking at using the technique
> Kawa is planning for compiled code (see the last part at
> http://www.gnu.org/software/kawa/internals/complications.html). Has
> anyone else done this?
If you haven't already, I recommend reading Henry Baker's classic
paper "Cheney on the MTA"
<http://home.pipeline.com/~hbaker1/CheneyMTA.html>, which is about
compiling full Scheme to the C "virtual machine", which is close
enough to the JVM as far as control issues are concerned.
Essentially, you compile all functions into CPS, and then each
function calls its successor using an ordinary C function call, never
returning from it. When the C stack gets too full[*], all the data
structures are evicted to the GCed heap, and all the control frames
are discarded by a longjmp back to the beginning. This provides
*amortized* TCO automatically; that is, although stack is pushed, it
gets recycled rapidly -- in essence, the C stack doubles as the first
generation of the Scheme heap.
[*]The size of the C stack is measured by subtracting a pointer to a
local variable from a global pointer to a variable kept in the main
procedure; on the JVM, you'd presumably keep a global count of
procedures invoked as a rough measure.
> It seems one of the natural side effects of
> something like this would be a need to have almost all primitives
> implemented in the language itself, since writing this kind of code by
> hand is a bit of a pain.
It's no problem in such a system to call ordinary C functions,
provided they don't call back into Scheme or consume too much stack.
The Chicken Scheme implementation
<http://www.call-with-current-continuation.org> is an actual
implementation of this idea for Scheme-to-C (I'm involved in the
community in my copious spare time). I'd look there next. The
community is helpful and friendly, and may well have good ideas about
how to adapt Cheney-style operations to the JVM.
--
GMail doesn't have rotating .sigs, but you can see mine at
http://www.ccil.org/~cowan/signatures
Yes, I've taken a look at that paper, and various other implementations,
including Lisp In Small Pieces, which does all these tricks. The problem
is that they're all using things that require you to manipulate the
stack in some way. setjmp and longjmp can of course be duplicated in a
manner by using exceptions, but the rest of the stack manipulation
doesn't seem possible. Otherwise, that model would probably work fine.
Kelly
On Dec 29, 2007 9:05 AM, Ola Bini <ola....@gmail.com> wrote:
>
--
Like the fella says, in Italy for 30 years under the Borgias they had
warfare, terror, murder, and bloodshed, but they produced
Michelangelo, Leonardo da Vinci, and the Renaissance. In Switzerland
they had brotherly love - they had 500 years of democracy and peace,
and what did that produce? The cuckoo clock.
-- Orson Welles
> The problem
> is that they're all using things that require you to manipulate the
> stack in some way. setjmp and longjmp can of course be duplicated in a
> manner by using exceptions,
Correct.
> but the rest of the stack manipulation doesn't seem possible.
No other tricks are required! Since only pointers and primitives,
never objects, are allocated on the Java stack, no object eviction is
needed, so you just keep on calling and never returning (hence the
allusion to "MTA Charlie") and finally throw the exception to kill the
stack, which is now 100% garbage.
(If there can be plain Java code mixed into the call stack, there is
some danger that someone will do a "catch(Throwable e)" or
"catch(RuntimeException e)" and screw the mechanism.)
Kelly Nawrocke writes:
> I had some papers on A-Normal form and trampolining on the JVM
ANF and CPS are of course equivalent, so it's hardly surprising that
what you can do with one, you can do with the other.
> They found that
> returning to the trampoline loop every N frames (~40 was optimal at
> the time) was the most efficient.
This is essentially what I described in my last post -- throw the
exception every N calls. Of course the size of N depends on how much
stack you allocate to your JVM at startup, which depends on how you
are using whatever physical memory you have. After some experiments
with trying to set the stack size experimentally at system
configuration time, current versions of Chicken just set an arbitrary
C-stack size (128 KB, I think) which can be overridden either at
system configuration time or at run time.
The exception should override the fillInStackTrace method, which is
what costs the most time in throwing an exception.
> The problem
> is that they're all using things that require you to manipulate the
> stack in some way. setjmp and longjmp can of course be duplicated in a
> manner by using exceptions,
Correct.
> but the rest of the stack manipulation doesn't seem possible.
No other tricks are required! Since only pointers and primitives,
Kelly Nawrocke writes:
--
http://citeseer.ist.psu.edu/schinz01tail.html
This gets into tail-call elimination using exceptions. If you are
going to use the java stack then this will be necessary.
http://www.cs.brown.edu/~sk/Publications/Papers/Published/pcmkf-cont-from-gen-stack-insp/
Section 4.2 gets into implementing delimited continuations on msil
(not using any special instructions). There is another paper that got
more into Administrative Normal Form and using exceptions to capture
the current continuation but I can't find it anywhere.
http://citeseer.ist.psu.edu/flanagan-essence.html
Paper on compiling with continuations. Defines ANF and there is some
scheme code at the end for performing the transformation.
The last two have a lot of proofs etc... and are a little dense but
the example code is useful.
Kelly
--
My apologies to lurkers or non-participants for the resend.
http://www.ccs.neu.edu/scheme/pubs/stackhack4.html
This one is a C# implementation of the ideas found in "Continuations
from Generalized Stack Inspection". The implementation doesn't access
the stack directly and use exception handling to perform the
capturing. Useful since it shows what the generated code would look
like for the fibonacci and takeuchi functions.
Kelly
--