Optimizations for Objects

Dan Sugalski

unread,

Mar 17, 2004, 11:53:16 AM3/17/04

to perl6-i...@perl.org

Okay, as I see it there are two big things that we can do to speed
objects up. (Well, besides speeding up the creation of continuation
PMCs, which I am, at the moment, sorely tempted to put in a special
pool for fast allocation) They are:

1) A method cache. Which we need anyway, so this isn't any surprise

2) Pre-figuring the delegated vtable functions.

#1 is a pretty straightforward thing, I'll detail my scheme in a bit.

#2 is where I think we can get some wins, though without a
notification system it's a bit dodgy. (A fixable one, though)

What I'm considering for a vtable method-specific cache is pretty
straightforward. When we build a class we allocate a new vtable for
it, though the vtable's essentially just a clone of the delegate
vtable. Make a call and we do a method lookup and then call it.

That's all fine and good, and the generic method cache will help
here. However... we can do better. What I'm thinking of is caching
the actual found method PMC pointer in the class somewhere (hanging
off the vtable or in the class' attributes or something) such that we
don't actually have to *do* any method lookups. With the method PMC
directly cached, we just call the darned thing and are done with it.
No lookups at all, something definitely faster than actually looking
anything up.

At the moment I'm considering two storage schemes for the PMC
pointers (one hanging off the vtable itself, another in the class
attributes) but I'm open to suggestions here.
--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk

Zellyn Hunter

unread,

Mar 17, 2004, 12:49:49 PM3/17/04

to perl6-i...@perl.org

Might be worth looking at Smalltalk papers too - they've been doing objects
forever. If I remember correctly, some smalltalks have an interesting form
of caching: they call the last thing the call resolved to, and redispatch as
necessary. So rather than looking things up before every call, you make the
same call as last time, and the called routine does:

a) are we supposed to be here? (presumed to be cheaper)
b) if not, redispatch and re-cache (presumed to be more expensive)
c) main method body

I'm not sure if that's apropos, but I thought it was quite clever when I first
read about it. Searching the web for [smalltalk cache] pulls up some
interesting info and discussions.

Zellyn

Leopold Toetsch

unread,

Mar 17, 2004, 12:46:50 PM3/17/04

to Dan Sugalski, perl6-i...@perl.org

Dan Sugalski <d...@sidhe.org> wrote:
> Okay, as I see it there are two big things that we can do to speed
> objects up. (Well, besides speeding up the creation of continuation
> PMCs, which I am, at the moment, sorely tempted to put in a special
> pool for fast allocation)

I though about that already. Returncontinuations created via the *cc
opcode variants are created in the opcode and used exactly once just for
returning from the sub. So I'd do for these a per-interpreter
freelist, where they get put back after invokeing the return
contination.

> 1) A method cache. Which we need anyway, so this isn't any surprise

> 2) Pre-figuring the delegated vtable functions.

> At the moment I'm considering two storage schemes for the PMC

> pointers (one hanging off the vtable itself, another in the class
> attributes) but I'm open to suggestions here.

Or: after the 1st delegate lookup create a JITed stub that is in
pseudo-code:

enter
call saveregs
S0 = "meth"
P2 = obj
P5 = value # arguments
call runops
call restoreregs
# handle return val
leave
ret

This function's address replaces the vtable slot of the old one and is
now called directly as the delegate. When the register assignments are
done interpreter-relative, this stub should be reusable by different
threads too.

> --
> Dan

> --------------------------------------"it's like this"-------------------
> Dan Sugalski even samurai
> d...@sidhe.org have teddy bears and even
> teddy bears get drunk

leo

Dan Sugalski

unread,

Mar 17, 2004, 1:01:08 PM3/17/04

to l...@toetsch.at, perl6-i...@perl.org

At 6:46 PM +0100 3/17/04, Leopold Toetsch wrote:
>Dan Sugalski <d...@sidhe.org> wrote:
>> Okay, as I see it there are two big things that we can do to speed
>> objects up. (Well, besides speeding up the creation of continuation
>> PMCs, which I am, at the moment, sorely tempted to put in a special
>> pool for fast allocation)
>
>I though about that already. Returncontinuations created via the *cc
>opcode variants are created in the opcode and used exactly once just for
>returning from the sub. So I'd do for these a per-interpreter
>freelist, where they get put back after invokeing the return
>contination.

Unless something inside the call's taken a continuation, in which
case they're still valid and may be reused. Can't do that,
unfortunately. :(

I was thinking of a special PMC arena for continuations that already
had the full data chunk tacked on the end of the struct and the
various PMC pointers-to-data filled in.

> > 1) A method cache. Which we need anyway, so this isn't any surprise
>
>> 2) Pre-figuring the delegated vtable functions.
>
>> At the moment I'm considering two storage schemes for the PMC
>> pointers (one hanging off the vtable itself, another in the class
>> attributes) but I'm open to suggestions here.
>
>Or: after the 1st delegate lookup create a JITed stub

Which is swell, except for that pesky can't-guarantee-a-JIT thing... :)

Sterling Hughes

unread,

Mar 17, 2004, 12:06:35 PM3/17/04

to Dan Sugalski, perl6-i...@perl.org

>
> That's all fine and good, and the generic method cache will help here.
> However... we can do better. What I'm thinking of is caching the
> actual found method PMC pointer in the class somewhere (hanging off
> the vtable or in the class' attributes or something) such that we
> don't actually have to *do* any method lookups. With the method PMC
> directly cached, we just call the darned thing and are done with it.
> No lookups at all, something definitely faster than actually looking
> anything up.
>

How do you handle the threaded case with two private objects and
runtime method overloading?

-Sterling

Leopold Toetsch

unread,

Mar 17, 2004, 4:11:45 PM3/17/04

to Dan Sugalski, perl6-i...@perl.org

Dan Sugalski <d...@sidhe.org> wrote:
> At 6:46 PM +0100 3/17/04, Leopold Toetsch wrote:
>>
>>I though about that already. Returncontinuations created via the *cc
>>opcode variants are created in the opcode and used exactly once just for
>>returning from the sub. So I'd do for these a per-interpreter
>>freelist, where they get put back after invokeing the return
>>contination.

> Unless something inside the call's taken a continuation, in which
> case they're still valid and may be reused. Can't do that,
> unfortunately. :(

Could you please provide a code snippet that shows this reusing.

Thanks,
leo

Larry Wall

unread,

Mar 17, 2004, 5:38:40 PM3/17/04

to perl6-i...@perl.org

On Wed, Mar 17, 2004 at 12:06:35PM -0500, Sterling Hughes wrote:
: >
: >That's all fine and good, and the generic method cache will help here.

I suspect Dan is primarily talking about public, single dispatch in
a (more-or-less) static, global class hierarchy. Private dispatch
shouldn't be trying to use the public dispatcher in any case, in
my opinion, and Perl 6 will enforce that by making you call private
methods differently from public methods, to keep private things in a
completely separate namespace. (Whether Parrot will have to confuse
the private and public dispatchers to support other languages is
another matter, of course... But we'd prefer not to violate Liskov
substitutability just because you add a private method.)

Larry

Mitchell N Charity

unread,

Mar 17, 2004, 6:32:54 PM3/17/04

to perl6-i...@perl.org

It's always nice to have numbers when planning optimizations.

So I started fiddling with oofib.imc.

It seemed like we were spending a lot of time copying strings.
So I defined the method name strings once, in main,
$S3 = "fibB"
and replaced calls like
self.fibB(n2)
with
self.$S3(n2)
Optimized and unoptimized reported run times went down 10%.
That's 1/5 of the way to perl! :)

I would like to see at least two other tests:

(1) doing fetch_method in main, and using the fetched methods,
rather than repeatedly looking them up.

(2) creating an explicit return continuation in fib, and using
it for both recursive calls, cutting continuation creation
in half.

The first will illustrate the maximum savings we can expect here
from a perfect, costless, method lookup cache.

The second will illustrate how much time is being spent in return
continuation pmc creation and associated garbage collection.

What else might be interesting?

I began work on (1), but got bogged down.
Perhaps a better imc writer than I could knock it off?

I just don't have that warm fuzzy feeling that we know where our time
is going yet.

Mitchell

Mitchell N Charity

unread,

Mar 18, 2004, 4:10:59 PM3/18/04

to perl6-i...@perl.org

It seemed nontrivial to reduce the number return continuation pmc's
used in oofib.imc. So I instead added an _extra_, unused one, to the
two already used in each fib method.

Parrot ran 30% slower. (Optimized, unoptimized, and jit)

Which suggests return continuation pmc driven memory management costs
(gc and allocation) are currently a major, perhaps even dominant,
factor in method invocation speed.

Mitchell

Dan Sugalski

unread,

Mar 18, 2004, 4:04:50 PM3/18/04

to mcha...@vendian.org, perl6-i...@perl.org

Yow. Okay, thanks. That means it's time to dive into the think tank
and see what we can do about that.

Leopold Toetsch

unread,

Mar 18, 2004, 4:38:50 PM3/18/04

to Dan Sugalski, perl6-i...@perl.org

Dan Sugalski <d...@sidhe.org> wrote:
> At 4:10 PM -0500 3/18/04, Mitchell N Charity wrote:
>>It seemed nontrivial to reduce the number return continuation pmc's
>>used in oofib.imc. So I instead added an _extra_, unused one, to the
>>two already used in each fib method.
>>
>>Parrot ran 30% slower. (Optimized, unoptimized, and jit)
>>
>>Which suggests return continuation pmc driven memory management costs
>>(gc and allocation) are currently a major, perhaps even dominant,
>>factor in method invocation speed.

> Yow. Okay, thanks. That means it's time to dive into the think tank
> and see what we can do about that.

Which brings up again my warnocked question: How can return
continuations get reused? Please provide some PASM code that enables my
brain to follow your argument that we cant't simply cache them.

leo

Mitchell N Charity

unread,

Mar 18, 2004, 5:32:30 PM3/18/04

to perl6-i...@perl.org

>Which suggests return continuation pmc driven memory management costs
>(gc and allocation) are currently a major, perhaps even dominant,
>factor in method invocation speed.

Yow. Okay, thanks. That means it's time to dive into the think tank
and see what we can do about that.

Nothing like finally thinking of caveats _after_ hitting send.

oofib as a program is generating atypically(?) little garbage.
So much of the garbage is continuation pmcs, which thus determine the
frequency of costly gc sweeps. In a dirtier, more typical environment,
where perhaps other factors are determining the frequency of gc sweeps,
the gc cost would look more like the pmc's _incremental_ cost of
collection. Maybe. Which would be less "yow". How much less isn't
clear. Though given that allocating these pmc's seems to be itself say
1/3 of method call cost without considering gc, they would retain at
least that level of yow.

Mitchell
(btw, regards ICU progress... awesome!!:)

Dan Sugalski

unread,

Mar 18, 2004, 5:31:44 PM3/18/04

to l...@toetsch.at, perl6-i...@perl.org

At 10:38 PM +0100 3/18/04, Leopold Toetsch wrote:
>Dan Sugalski <d...@sidhe.org> wrote:
>> At 4:10 PM -0500 3/18/04, Mitchell N Charity wrote:
>>>It seemed nontrivial to reduce the number return continuation pmc's
>>>used in oofib.imc. So I instead added an _extra_, unused one, to the
>>>two already used in each fib method.
>>>
>>>Parrot ran 30% slower. (Optimized, unoptimized, and jit)
>>>
>>>Which suggests return continuation pmc driven memory management costs
>>>(gc and allocation) are currently a major, perhaps even dominant,
>>>factor in method invocation speed.
>
>> Yow. Okay, thanks. That means it's time to dive into the think tank
>> and see what we can do about that.
>
>Which brings up again my warnocked question: How can return
>continuations get reused?

Works like this. (No pasm, but it should be obvious)

sub a:
build return continuation X
call sub b
do processing
return

sub b:
do some processing
take continuation Y
store Y in global variable Z
exit

main:
call sub a
grab continuation Y from variable Z
invoke continuation Y

If a continuation's taken from within a sub somewhere, return
continuations may (probably will) be used multiple times, once for
the original return then once for each time the continuation is
invoked.

Jens Rieks

unread,

Mar 18, 2004, 5:48:52 PM3/18/04

to l...@toetsch.at, perl6-i...@perl.org

Hi,

See attachment.

> leo
jens

Jens Rieks

unread,

Mar 18, 2004, 5:52:50 PM3/18/04

to perl6-i...@perl.org

oops, I renamed the wrong file...
This is the correct example.

jens

reuse.pasm

Mitchell N Charity

unread,

Mar 18, 2004, 6:42:07 PM3/18/04

to perl6-i...@perl.org

Which brings up again my warnocked question: How can return
continuations get reused? Please provide some PASM code that enables my
brain to follow your argument that we cant't simply cache them.

Well, I can't do PASM code, but let me give it try. ;)

Single vs multiple use. It should be possible to give the same
continuation to multiple people, and have them variously invoke it,
perhaps repeatedly, whenever someone wants to. Though all those
invocations would come to the same "place", so if you wished to
distinguish just who was invoking you, you would have to cooperatively
arrange for that yourself (or use something like caller()).

"Reuse" implies something stronger. That the continuation becomes
somehow invalid, and thus its dead husk can be used to host a new
continuation.

As continuations can be stashed anyplace, and invoked at any later
time, the only way to know folks are done with it is wait until it
gets garbage collected. So one might say create a special memory
pool, used just for continuations, which might perhaps allow for easier
construction.

Forcing a continuation to become invalid, to say throw an exception if
it is later invoked, would seem to require something like creating a
derivative "revocable continuation", and uid in normal continuations.
So one gives people the revocable continuation, and if/when it's
underlying continuation is "reused", it's uid is changed, which the
derivative checks when deciding whether to invoke it, or to die.

Some memory systems allow one to forcibly declare an object dead, and
subsequent use triggers an exception. But you don't get to reuse the
object. Unless there is an extra level of indirection between object
ids and their memory, which we don't have.

It's not clear to me what "return" continuations are intended to be.
Up-calls? This would not avoid the "can it be reused now?" question,
as one might stash one's return continuation in say a timer, to be
invoked again on every third tuesday.

One might try to contractually enforce a "one use only" rule, so if
the continuation is ever invoked, you're promised it won't be used
again. But that would have to be binding on the entire call tree
beneath you. And on anyone who might reflectively snoop on the call
tree. :) Otherwise anyone could stash _their_ continuation, and by
later reinvoking it, reinvoke yours. Though you might generate
"firewall" one-use code objects to insulate yourself, but... blech.
However, language compilers are of course free to use their own
calling conventions when dealing only with themselves, where they can
make any kind of strange guarantee they wish. They just wouldnt be
able to call out of these specialized zones, without significant
border guarding.

So I don't immediately see how a continuation can be "reused" in a
now-it-will-do-something-different sense.

Then again, there _is_ this "return continuation" thing in the code.
I just cant quite figure out what it is... ;)

Mitchell

Leopold Toetsch

unread,

Mar 19, 2004, 2:57:28 AM3/19/04

to Dan Sugalski, perl6-i...@perl.org

Dan Sugalski <d...@sidhe.org> wrote:
> At 10:38 PM +0100 3/18/04, Leopold Toetsch wrote:
>>
>>Which brings up again my warnocked question: How can return
>>continuations get reused?

> Works like this. (No pasm, but it should be obvious)

Ok. First (and that applies to Jens example too), I'd like to outline
continuation vs return continuation and its usage:

1) First there was only a Continuation PMC. Performance with CPS sucked.
So I did invent a RetContinuation PMC Performance sucked less. The
difference between these two is: the former has COW copied stacks inside
its context, the latter has only pointers of the stacks in its context.

2) A return continuation is automatically created by

invokecc
callmethodcc

It's totally hidden when using PIR function or method call syntax

_func(a, b)
obj."meth"(args)

These internally create above opcodes and create a new return
continuation on each sub or method invocation

3) Returning from the sub or method in PIR syntax again totally hides
the return continuation

.pcc_begin_return
.return result
.pcc_end_return
.end

or just

.sub _foo
.end # automatic return sequence inserted.

4) From PIR level there is no official way to even reference this return
continuation - its invisible.

5) *But*

> If a continuation's taken from within a sub somewhere, return
> continuations may (probably will) be used multiple times, once for
> the original return then once for each time the continuation is
> invoked.

6) Yes, that's true. So the questios is: Can we invent some HLL PIR
syntax that clearly indicates: this subroutine will return multiple
times through the same return continuation. We have already a similar
construct for Coroutines:

.pcc_begin_yield
.result # optional
.pcc_end_yield

This is invokeing the coroutine and returns to the caller.

What's the usage of Continuations from HLLs point of view? Can we get
some hints, what is intended?

I'd like to have, if possible a clear indication: that's a plain
function or method call and this is not. I think the possible speedup is
worth the effort.

leo

Piers Cawley

unread,

Mar 19, 2004, 6:21:43 AM3/19/04

to l...@toetsch.at, Dan Sugalski, perl6-i...@perl.org

Leopold Toetsch <l...@toetsch.at> writes:

I think that requires solving the halting problem.

Brent 'Dax' Royal-Gordon

unread,

Mar 19, 2004, 7:25:14 AM3/19/04

to Piers Cawley, l...@toetsch.at, Dan Sugalski, perl6-i...@perl.org

I may have a possible solution. Disclaimer: I do not really understand
continuations, so I may be completely wrong.

This idea involves two assumptions:

1. Most Parrot code will not use continuations except for returning.
(There will be a significant efficiency loss otherwise.)

2. The most expensive part of constructing a continuation is COWing
the stack, etc.--simply constructing the data structure is cheap.

3. A return continuation may be safely reused if:
a. It has not been stored externally.
b. None of the return continuations for subroutines it has called
(recursively speaking) have been stored externally.
c. No other continuations have been constructed.

Here's how it works:

A ReturnContinuation is a lightweight version of a normal Continuation;
it contains the information needed to create a Continuation, but is not
really a continuation itself. Once allocated, a ReturnContinuation is
never[1] truly deallocated; if it dies, it's put on a free list, and
will be reused if it's needed.

Before a ReturnContinuation may be stored externally (and thus have the
possibility of being invoked later), it has to be transformed into a
normal Continuation. This will probably require compilers (or IMCC) to
insert an extra instruction, but I don't think that'd be too difficult.
(A naive way to do it would be to simply emit a second/different
instruction whenever P1 is retrieved. It might be possible to make IMCC
do this, or even to write this into the 'set' op.)

If a program either creates a normal Continuation or transforms a
ReturnContinuation into a normal one, Parrot will automatically turn all
currently-live ReturnContinuations into normal Continuations. (How is
an unknown, admittedly--perhaps we'll keep a list of currently-live
ReturnContinuations, too.) This means that your return continuation may
in fact be a real Continuation instead of a ReturnContinuation.

If a ReturnContinuation is invoked without being stored externally (and
thus transformed into a real Continuation), it is immediately considered
dead, and returns to the free list.

The effect is that the heavy work of constructing real Continuations is
deferred until we have a good reason to believe they'll be needed. For
programs that never use continuations except for old-fashioned
returning, a lot of work is avoided, because ReturnContinuations are
faster to construct; for programs that do use continuations, a little
extra work is done, because it would have been faster to construct real
Continuations in the first place. I suspect this would usually be a
win, although it might be wise to give the language compiler a way to
say "I know this program will use continuations, so just construct them
that way in the first place."

Example:

A calls B, creates ReturnContinuation rA1.
B calls C, creates ReturnContinuation rB1.
C invokes ReturnContinuation rB1 to return to B.
ReturnContinuation rB1 returns to the free list.
B calls D, creates ReturnContinuation rB2 (ex-rB1?).
D stores rB2 into Z:
rB2 becomes a real Continuation.
rA1 becomes a real Continuation.
D invokes Continuation rB2 to return.
Continuation rB2 does *not* return to the free list.
B calls E, creates ReturnContinuation rB3.
E invokes ReturnContinuation rB3 to return.
ReturnContinuation rB3 returns to the free list.
B invokes Continuation rA1 to return.
Continuation rA1 does *not* return to the free list.
A calls F, creates ReturnContinuation rA2 (ex-rB3?).
F creates and stores a new Continuation into Y:
rA2 becomes a real Continuation.
F invokes rA2 to return.
Continuation rA1 does *not* return to the free list.
A invokes Z--control jumps to the next statement in B after the call to D.

[1] Conceptually, that is. Realistically, it might make sense to do so
if you have a few thousand spare return continuations floating around.

--
Brent "Dax" Royal-Gordon <br...@brentdax.com>
Perl and Parrot hacker

Oceania has always been at war with Eastasia.

Ibotty

unread,

Mar 19, 2004, 9:39:32 AM3/19/04

to perl6-i...@perl.org

> This idea involves two assumptions:

there are three kinds of people.
those that can count and the others ;)

~ibotty

Larry Wall

unread,

Mar 19, 2004, 3:44:42 PM3/19/04

to perl6-i...@perl.org

On Fri, Mar 19, 2004 at 08:57:28AM +0100, Leopold Toetsch wrote:
: What's the usage of Continuations from HLLs point of view? Can we get

: some hints, what is intended?

From the standpoint of Perl 6, I hope to hide continuations far, far
away in a galaxy long ago. No wait, wrong movie...

We can certainly make it the default that a routine is not going to
do anything fancy with continuations unless it is explicitly declared
to allow it. As to what that declaration should be, I have no idea.
Probably just a trait that says, "is continuationalizableish" or
some such.

: I'd like to have, if possible a clear indication: that's a plain

: function or method call and this is not. I think the possible speedup is
: worth the effort.

I have no problem with "plain" being the default. I suppose you could
declare it explicitly if you like:

sub foo () is plain {...}

Then obviously the other kind would be:

sub bar () is peanut {...}

Hmm, actually, "smooth" and "nutty" might be more accurate.

Larry

James Mastros

unread,

Mar 19, 2004, 4:32:21 PM3/19/04

to perl6-i...@perl.org, Larry Wall

Larry Wall wrote:
> We can certainly make it the default that a routine is not going to
> do anything fancy with continuations unless it is explicitly declared
> to allow it. As to what that declaration should be, I have no idea.
> Probably just a trait that says, "is continuationalizableish" or
> some such.

"is mindwarping"? "does the_time_warp_again"?

The problem, in any case, is similar to "throws" in Java: if any of your
children are marked, so shall you be, unto the Nth generation. (Or, at
least, this is how I understand it.)

-=- James Mastros

Leopold Toetsch

unread,

Mar 19, 2004, 5:06:06 PM3/19/04

to Larry Wall, perl6-i...@perl.org

There is of course no need to declare the default, which will be very
likely beyond 99%. I won't discuss P6 syntax here, which could lead to

sub bar() will reuse_P1 { ... }

Piers' statement was related to solving the halting problem, which isn't
really positive. I don't throw or catch pies, but CPS returns are
currently the major performance loss Parrto has.

> Larry

leo

Piers Cawley

unread,

Mar 20, 2004, 4:22:04 AM3/20/04

to l...@toetsch.at, Larry Wall, perl6-i...@perl.org

Leopold Toetsch <l...@toetsch.at> writes:

I argue that we have the problems we do (incorrect behaviour of
continuations, horrible allocation performance) because we chose the
wrong optimization in the first place. The stack optimizations that are
in place make sense when you don't have continuations, but once you do,
the cost of allocating a continuation and maintaining all that COW
complexity becomes prohibitive. I strongly advocate rejigging the
stacks so that one stack frame = 1 stacked thing + 1 link to the next
thing in the chain. No need for COW, no need for memcpy when allocating
continuations, no worrying complexity to deal with while you're trying
to get the behaviour right. Oh, an no need for RetContinuations either.

Leopold Toetsch

unread,

Mar 20, 2004, 7:10:58 AM3/20/04

to Piers Cawley, perl6-i...@perl.org

Piers Cawley <pdca...@bofh.org.uk> wrote:

> I argue that we have the problems we do (incorrect behaviour of
> continuations, horrible allocation performance) because we chose the
> wrong optimization in the first place. The stack optimizations that are
> in place make sense when you don't have continuations, but once you do,
> the cost of allocating a continuation and maintaining all that COW
> complexity becomes prohibitive. I strongly advocate rejigging the
> stacks so that one stack frame = 1 stacked thing + 1 link to the next
> thing in the chain. No need for COW, no need for memcpy when allocating
> continuations, no worrying complexity to deal with while you're trying
> to get the behaviour right. Oh, an no need for RetContinuations either.

The current problem of e.g. the oofib benchmark isn't related to that.
Actually it doesn't COW copy stacks. A RetContinuation just does a
memcpy of the context structure, that's it.

Creating RetContinuations is only one factor, the major is the missing
method cache. C<callmethodcc> takes almost 3 times the time of creating a
RetContinuation.

leo

Dan Sugalski

unread,

Mar 20, 2004, 11:18:08 AM3/20/04

to perl6-i...@perl.org

At 12:44 PM -0800 3/19/04, Larry Wall wrote:
>On Fri, Mar 19, 2004 at 08:57:28AM +0100, Leopold Toetsch wrote:
>: What's the usage of Continuations from HLLs point of view? Can we get
>: some hints, what is intended?
>
>From the standpoint of Perl 6, I hope to hide continuations far, far
>away in a galaxy long ago. No wait, wrong movie...

Which is swell, but Perl 6 is only one of the languages we care
about. Ruby does, and while it's not exactly one of the primary
targets, scheme and lisp are heavily laced with them.

>We can certainly make it the default that a routine is not going to
>do anything fancy with continuations unless it is explicitly declared
>to allow it.

Can't do that. There's no way you can ever really know that, since
any function or method you call might take one. Forbidding it is
going to be problematic as well, since then we hit performance issues
in guaranteeing that.

Larry Wall

unread,

Mar 20, 2004, 11:31:40 AM3/20/04

to perl6-i...@perl.org

On Sat, Mar 20, 2004 at 11:18:08AM -0500, Dan Sugalski wrote:
: At 12:44 PM -0800 3/19/04, Larry Wall wrote:
: >On Fri, Mar 19, 2004 at 08:57:28AM +0100, Leopold Toetsch wrote:
: >: What's the usage of Continuations from HLLs point of view? Can we get
: >: some hints, what is intended?
: >
: >From the standpoint of Perl 6, I hope to hide continuations far, far
: >away in a galaxy long ago. No wait, wrong movie...
:
: Which is swell, but Perl 6 is only one of the languages we care
: about. Ruby does, and while it's not exactly one of the primary
: targets, scheme and lisp are heavily laced with them.

I understand that, which is why I said "From the standpoint of Perl 6".

: >We can certainly make it the default that a routine is not going to

: >do anything fancy with continuations unless it is explicitly declared
: >to allow it.
:
: Can't do that. There's no way you can ever really know that, since
: any function or method you call might take one. Forbidding it is
: going to be problematic as well, since then we hit performance issues
: in guaranteeing that.

Well, Leo asked for hints, and I basically said Perl has no problem
sending them. If Parrot has a problem receiving them, that's another
matter. :-)

Larry

Melvin Smith

unread,

Mar 20, 2004, 11:53:06 AM3/20/04

to l...@toetsch.at, Dan Sugalski, perl6-i...@perl.org

At 08:57 AM 3/19/2004 +0100, Leopold Toetsch wrote:
>Dan Sugalski <d...@sidhe.org> wrote:
> > At 10:38 PM +0100 3/18/04, Leopold Toetsch wrote:
> >>
> >>Which brings up again my warnocked question: How can return
> >>continuations get reused?
>
> > Works like this. (No pasm, but it should be obvious)

I was aware of Leo's RetContinuation because our initial
implementation using a full Continuation was deemed a little
heavy.

As long as the RetContinuation honors Copy On Write, there
should be no problem unless the RetContinuation is stored
and referenced "outside" the original scope (or escapes the
closure of the sub call or full continuation).

The problem could be that some code is not honoring COW
and it is not copying things before using.

-Melvin

Leopold Toetsch

unread,

Mar 20, 2004, 12:32:11 PM3/20/04

to perl6-i...@perl.org

Larry Wall wrote:

> Well, Leo asked for hints, and I basically said Perl has no problem
> sending them. If Parrot has a problem receiving them, that's another
> matter. :-)

When there are now hits from languages like Ruby or Smalltalk, then one
single flag will do it: If ever a real Continuation is created, this
optimization will be stopped. That's hint enough.
If this flag isn't set, RetContinuations will be recycled immediately on
an extra free list (and allocated from that).

I've running here a method cache now (passes already all but 1 test).
Speedup (unoptimized build) is aroud one third.

> Larry

leo

Gerald E Butler

unread,

Mar 20, 2004, 11:33:10 AM3/20/04

to perl6-i...@perl.org

Hello all,

I've been investigating the possibility of creating a MACHINE
DESCRIPTION (aka BACK-END) for GCC to target PARROT. My thinking is
this: If a satisfactory GCC back-end targeting PARROT is created -and-
PARROT is efficient enough (which from reading the documentation thus
far produced seems like it is an inevitable conclusion) then GCC could
compile itself to PARROT byte-code giving PARROT (and the whole open
source community) a PARROT self-host compiler which compiles multiple
languages (C#, C++, C, Pascal, Objective-C, Java, etc, etc, etc) to the
PARROT runtime.

Is a complete non-starter, or is this something which has
possibilities? Please give your expert opinions.

James Mastros

unread,

Mar 21, 2004, 4:49:30 AM3/21/04

to Gerald E Butler, perl6-i...@perl.org

Gerald E Butler wrote:
> I've been investigating the possibility of creating a MACHINE
> DESCRIPTION (aka BACK-END) for GCC to target PARROT. My thinking is
> this: If a satisfactory GCC back-end targeting PARROT is created -and-
> PARROT is efficient enough (which from reading the documentation thus
> far produced seems like it is an inevitable conclusion) then GCC could
> compile itself to PARROT byte-code giving PARROT (and the whole open
> source community) a PARROT self-host compiler which compiles multiple
> languages (C#, C++, C, Pascal, Objective-C, Java, etc, etc, etc) to the
> PARROT runtime.

At first glance, this looks like a good thing to do. After all, Parrot
was made a register machine, instead of a stack machine, to be able to
leverage the existing art with reference to real machines, which are
register machines, not stack machines.

So what's the problem?

In a word: Memory. GCC expects the machine to be more or less like a
real machine. This means, in no small part, that it has a large amount
of memory, accessible by getting a pointer to it, which is an integer.
Now, it would be possible to emulate that with an appropriately-crafted
PMC.

So is it possible, after you write that PMC?

Well... in theory, if you do this, you can write a gcc backend targeting
parrot. It shouldn't even be /that/ difficult. (More difficult then
porting gcc to a more traditional arch, less difficult then writing your
own C, etc, compiler.)

Unfortunately, since you've created that "memory" PMC, and stuffed much
of your data in it, you no longer have assembler that resembles, even
vaguely, that of the rest of the parrot world. Heck, most languages
will end up putting most user-visible things in PMC registers, and
you'll only ever use one. You'll never use string registers. You've
got an implicit PMC argument to every function, and one that probably
wants highlander (singleton) semantics as well.

You can write C, C#, C++, Pascal, Objective-C, Java, etc, but it won't
interoperlate with other Parrot code well (without stubs), if at all
(the stubs may be quite difficult).

Is it worthwhile anyway?

Quite possibly.

Is it easy?

No.

Will it do everything you want it to do?

No.

-=- James Mastros

Gerald E Butler

unread,

Mar 21, 2004, 10:32:04 AM3/21/04

to James Mastros, perl6-i...@perl.org

Ouch! I believe I now understand the complication. (Please forgive my
ignorance if I say something really stupid in the next few paragraphs,
I'm very new to world of compilers/vm's and am probably attempting to do
something way over my head here...)

As I've been reading GCCINT (GCC Internals) especially in reference to
RTL and creation of patterns to match particular RTL isns, I've noticed
that RTL is expressed as a series of fairly low-level, fundamental,
machine-based operations (e.g. Add 2 Ints, Branch on Condition, Set
Condition Codes based on, etc).

I kept thinking to myself, well, even though many of RTL's ops that
need mapped are much lower-level than many PARROT ops, I *should* be
able to map them all satisfactorily. Unfortunately, I completely missed
the point of memory addressing. The problem, if I understand you
correctly, is that when RTL says, "Hey Add these 2 Ints for me..", I
have no way of knowing what those Ints represent. They could be a
Checkbook balance and a Deposit -or- they could be an address of the
Checkbook balance and the address of the Deposit. That address, in turn,
needs to be accessible by something like, "Load the Integer at this
address into the first (Integer) Register...." ---- Hmmmmm, that makes
me think....what if an instruction like:

"Store value X from Register Y at address pointed to by Register Z"

were implemented as storing Reg Y in an Expandable Array (or perhaps
Hash) where the value in Z represents the Index/Key into the Array. So,
in other words, an address related action is really a reference into and
Array (Hmmm.....that just sounds like a less efficient way to do it than
use a PMC pointing to a big hunk of memory a you suggested, No?)

OK, so the problem is that GCC's RTL representation is not rich enough
in semantics for the back-end to map what it needs to do PARROT in a
true PARROIISH fashion, correct? Could this *reasonably* (and I really
use this term LOOSELY) be solved by creating a richer RTL and changing
the Parse Tree->RTL stage to emit the richer semantic RTL rather than
the low-level RTL? Or, would it be more advisable to not use GCC at all
and simply focus on creation of direct-to-parrot compilers for various
languages?

Thanks for Your Input,

Gerry Butler

Mitchell N Charity

unread,

Mar 21, 2004, 11:39:39 AM3/21/04

to perl6-i...@perl.org

Having already argued against it, here is an argument that reusing
RetContinuations is acceptable.

Parrot is not a side-effect free language. So _every_ continuation
goes out with a social contract.

Something like "Use this continuation once, great, but use it twice,
and things are _so_ undefined, and the user _really_ won't like it",
seems a perfectly valid, even normal, caller contract.

Eg, I add 2+2, write the result to an open file handle, then _close_
the file handle, and putter off. Sure, add has the power to return
more than once. So the file handle doesn't get gc'ed. But it has
mutated into something which goes boom when written to. And if/when
bad things happen because add returns twice, well, we tell the user
"it's add's fault! Or someone add called! It's not _our_ fault.".

Given that, it seems reasonable, for instance, to use the same normal
Continuation for two adjacent method calls, and wrap demultiplexing
logic around the label where the continuation lands. "Is this the
first time the continuation has returned, then save the return value,
and do the second method. If this is the second time, we're done,
putter onward". If this > 2nd time, go boom.

And if we can do _that_, well, why bother with the demultiplexing?
Just mutate the Continuation, telling it now to return to label2
rather than label1.

And RetContinuation segfaulting when used twice becomes acceptable
behavior under this model. Sigh.

So, if you buy all this, it seems
(1) RetContinuation is a valid optimization on Continuation.
It is defended against multiple invocation by social contract.
A mechanism which of course makes it hard to find bugs.
(2) RetContinuation pmc's _can_ be reused.
Providing even more bug cover.
(3) One might amortize the construction cost of a Continuation by
using it for a couple of calls, perhaps making it competitive
with RetContinuation. Also bug cover.

Caveats. It is not clear to me how much stuff one can actually do
before the state a Continuation captured ceases to be acceptable. If
it is even if it is possible to set up the second method call. So (3)
may not work.

If one does reuse RetConts, one really wants an easy way to globally
turn this off, and instead have RetConts croak if reused.

I nominate this optimization for the "Winter 2004 tweak most likely to
make parrot unstable and undebuggable" award. ;) But perhaps if we can
live with it, we'll be running fast enough to set this optimization
focus aside, and get back to working on making parrot actually usable.
With real use, we'll have real loads to do optimization against.

Blech.
Mitchell

Leopold Toetsch

unread,

Mar 21, 2004, 1:03:29 PM3/21/04

to mcha...@vendian.org, perl6-i...@perl.org

Mitchell N Charity <mcha...@vendian.org> wrote:

> And RetContinuation segfaulting when used twice becomes acceptable
> behavior under this model. Sigh.

No. Segfaults aren't accetable. I've already outlined a scheme to
disable this optimization. It's currently not done but its simple: If
ever a Continuation is created, the RetContinuation recycling will be
turned off, in the absence of better hints that they are sill usable
safely.

If there are other possible paths that a RetContinuation can be reused,
the optimization will be turned off. If that's not detectable, then the
optimization can only be turned on by user demand.

> I nominate this optimization for the "Winter 2004 tweak most likely to
> make parrot unstable and undebuggable" award. ;)

I've clearly indicated that it isn't finished yet. Its not integrated,
its easy to turn it off and so on. But if we ever want some kind of
function call performance (not only method calls - remember) then we
need this or something very similar.

> Blech.
> Mitchell

leo

Matt Fowles

unread,

Mar 21, 2004, 1:31:10 PM3/21/04

to Piers Cawley, Dan Sugalski, perl6-i...@perl.org

All~

Piers Cawley wrote:
> I argue that we have the problems we do (incorrect behaviour of
> continuations, horrible allocation performance) because we chose the
> wrong optimization in the first place. The stack optimizations that are
> in place make sense when you don't have continuations, but once you do,
> the cost of allocating a continuation and maintaining all that COW
> complexity becomes prohibitive. I strongly advocate rejigging the
> stacks so that one stack frame = 1 stacked thing + 1 link to the next
> thing in the chain. No need for COW, no need for memcpy when allocating
> continuations, no worrying complexity to deal with while you're trying
> to get the behaviour right. Oh, an no need for RetContinuations either.
>

I feel like Piers has asked this several times and never gotten an
answer about why it is (or is not) a good idea. I agree with him that
it is a good idea, but I am far from an authoritative source.

Also, Piers didn't you implement this once already? If could you update
it to the current system without too much trouble? It would be nice to
have numbers on this approach.

Matt

Leopold Toetsch

unread,

Mar 21, 2004, 3:30:08 PM3/21/04

to Piers Cawley, perl6-i...@perl.org

Piers Cawley <pdca...@bofh.org.uk> wrote:

> ... I strongly advocate rejigging the

> stacks so that one stack frame = 1 stacked thing + 1 link to the next
> thing in the chain.

Let's do things in correct order. First was method cache. 2nd the
debatable return continuation recycling. Both accummulated, sum up to
300% speedup. Next is - that's true - stack code or better register
preservation costs.

Please note that the current stack code is in, to make it working again.
It was broken a long time. Now its by far less broken. All improvements
are welcome.

> ... No need for COW, no need for memcpy when allocating

> continuations, no worrying complexity to deal with while you're trying
> to get the behaviour right.

Implementations of better schemes are much appreciated.

> Oh, an no need for RetContinuations either.

You seem to be mixing up different issues with that statement. Using
plain Continuation PMCs for returning just from subroutines was dead
slow, w or w/o COWed stacks. And ...

Implementations of better schemes are much appreciated.

leo

Piers Cawley

unread,

Mar 21, 2004, 4:03:54 PM3/21/04

to l...@toetsch.at, perl6-i...@perl.org

Leopold Toetsch <l...@toetsch.at> writes:

But when a Continuation is simply a collection of pointers to the tops
of the various stacks (and I really do think it should include P1 in
that list...) will it really be that slow? I'm surprised.

Piers Cawley

unread,

Mar 21, 2004, 4:00:30 PM3/21/04

to Matt_...@softhome.net, Dan Sugalski, perl6-i...@perl.org

Matt Fowles <Matt_...@softhome.net> writes:

Nope, never implemented it. I have the C skills of a teeny tiny kitten.

Luke Palmer

unread,

Mar 21, 2004, 5:53:11 PM3/21/04

to Piers Cawley, l...@toetsch.at, perl6-i...@perl.org

Piers Cawley writes:
> > You seem to be mixing up different issues with that statement. Using
> > plain Continuation PMCs for returning just from subroutines was dead
> > slow, w or w/o COWed stacks.
>
> But when a Continuation is simply a collection of pointers to the tops
> of the various stacks (and I really do think it should include P1 in
> that list...) will it really be that slow? I'm surprised.

I implemented this in the register stacks, but I didn't spend any time
on optimization (and who knows, I may have even been marking the stacks
COW unnecessarily). Continuation creation, call, and return had great
performance. But the problem was that pushtop, etc. were much slower
than with the current scheme.

I would love to see RetContinuation leave, honestly. One of the
greatest things about CPS is that you can grab anybody's continuation
that they passed you and store it somewhere. RetContinuation is just a
computed goto.

But more than I want to see RetContinuation leave, I want to see sub
calls fast.

Maybe when I get some time again I'll implement single frame linked
stacks for a benchmark.

Luke

Piers Cawley

unread,

Mar 21, 2004, 7:52:06 PM3/21/04

to Luke Palmer, l...@toetsch.at, perl6-i...@perl.org

Luke Palmer <lu...@luqui.org> writes:

> Piers Cawley writes:
>> > You seem to be mixing up different issues with that statement. Using
>> > plain Continuation PMCs for returning just from subroutines was dead
>> > slow, w or w/o COWed stacks.
>>
>> But when a Continuation is simply a collection of pointers to the tops
>> of the various stacks (and I really do think it should include P1 in
>> that list...) will it really be that slow? I'm surprised.
>
> I implemented this in the register stacks, but I didn't spend any time
> on optimization (and who knows, I may have even been marking the stacks
> COW unnecessarily). Continuation creation, call, and return had great
> performance. But the problem was that pushtop, etc. were much slower
> than with the current scheme.

The thing is, 'pushtop' is almost certainly the wrong thing to do. You
should only push the registers you care about onto the register
stacks. I still find it odd that pushing 16 register onto the stack is
quicker than pushing 3 (for appropriate values of 3)

The catch is that, whilst I know which registers *I* care about, I don't
know which registers IMCC cares about. So maybe IMCC's 'newcont' should
expand to:

save 'imcc_magic_register1'
save 'imcc_magic_register2'
target = newsub .Continuation, dest
restore 'imcc_magic_register2'
restore 'imcc_magic_register1'

Notice how making a continuation looks remarkably like making a function
call.

If the destination of the continuation is within the current
compilation unit (which it probably should be, or things get *very*
weird) then, potentially, IMCC knows what registers the continuation
target cares about and can automagically save the current .locals as
well.

> I would love to see RetContinuation leave, honestly. One of the
> greatest things about CPS is that you can grab anybody's continuation
> that they passed you and store it somewhere. RetContinuation is just a
> computed goto.

Would it be possible to arrange things so that

$P0 = new .Continuation
$P0 = P1 # The current RetContinuation

makes $P0 into a full continuation equivalent to the RetContinuation?

Leopold Toetsch

unread,

Mar 22, 2004, 1:57:18 AM3/22/04

to Piers Cawley, perl6-i...@perl.org

Piers Cawley <pdca...@bofh.org.uk> wrote:

> The thing is, 'pushtop' is almost certainly the wrong thing to do. You
> should only push the registers you care about onto the register
> stacks.

Yes:

$ time parrot -j oofib.imc
fib(28) = 317811 3.050051s

real 0m3.077s

$ time parrot -j -Oc oofib.imc
fib(28) = 317811 2.234049s

real 0m2.262s

C<savetop> becomes C<pushtopp> if only P-registers are used. Saving only
3 or 5 registers isn't possible yet. We have no opcodes for that.

> The catch is that, whilst I know which registers *I* care about, I don't
> know which registers IMCC cares about. So maybe IMCC's 'newcont' should
> expand to:

> save 'imcc_magic_register1'
> save 'imcc_magic_register2'
> target = newsub .Continuation, dest
> restore 'imcc_magic_register2'
> restore 'imcc_magic_register1'

> Notice how making a continuation looks remarkably like making a function
> call.

The usage of above Continuation would be to pass it on into a subroutine
and eventually return through it to an alternate position? If yes, then
the Continuation should be created inside the call sequence to avoid
duplicate register saving and restoring code:

res = subroutine(args, NEWCONT)

But that's only one side of the business. The code at the return label
C<dest>) has to restore the same registers too.

If the target of the Continuation isn't the function return, it has to
look somethin like this:

goto around_cont_ret
cont_ret_target:
restoretop # or whatever was saved
around_cont_ret:

> If the destination of the continuation is within the current
> compilation unit (which it probably should be, or things get *very*
> weird) then, potentially, IMCC knows what registers the continuation
> target cares about and can automagically save the current .locals as
> well.

Yes.

> Would it be possible to arrange things so that

> $P0 = new .Continuation
> $P0 = P1 # The current RetContinuation

> makes $P0 into a full continuation equivalent to the RetContinuation?

Sure. It depends on what part of the context should be copied into the
Continuation:

get_addr Idest, P1
set_addr $P0, Idest # assign dest - implemented

or

assign $P0, P1 # vtable->set_pmc (N/Y)

which could do whatever is appropriate.

($P0 = P1 just aliases the two and isn't usable for assignment)

leo

Leopold Toetsch

unread,

Mar 22, 2004, 2:20:04 AM3/22/04

to Piers Cawley, perl6-i...@perl.org

Piers Cawley <pdca...@bofh.org.uk> wrote:
> Leopold Toetsch <l...@toetsch.at> writes:

>> You seem to be mixing up different issues with that statement. Using
>> plain Continuation PMCs for returning just from subroutines was dead
>> slow, w or w/o COWed stacks.

> But when a Continuation is simply a collection of pointers to the tops
> of the various stacks (and I really do think it should include P1 in
> that list...) will it really be that slow? I'm surprised.

You are missing object creation overhead. P1 and P2 are in the saved
P-register frame.

And finally: I really don't know, if a Continuation needs just pointers
to the stacks (and to which) or (COWed) copies (to which).

*Will please somebody describe all the semantics of Continuation usage.*

leo

Piers Cawley

unread,

Mar 22, 2004, 4:03:33 AM3/22/04

to l...@toetsch.at, perl6-i...@perl.org

Leopold Toetsch <l...@toetsch.at> writes:

> Piers Cawley <pdca...@bofh.org.uk> wrote:
>
>> The thing is, 'pushtop' is almost certainly the wrong thing to do. You
>> should only push the registers you care about onto the register
>> stacks.
>
> Yes:
>
> $ time parrot -j oofib.imc
> fib(28) = 317811 3.050051s
>
> real 0m3.077s
>
> $ time parrot -j -Oc oofib.imc
> fib(28) = 317811 2.234049s
>
> real 0m2.262s
>
> C<savetop> becomes C<pushtopp> if only P-registers are used. Saving only
> 3 or 5 registers isn't possible yet. We have no opcodes for that.

save Pn
save Pm
...
restore Pm
restore Pn

Admittedly we don't have opcodes for storing multiple registers at a
time, and having them would be a useful optimization...

>> The catch is that, whilst I know which registers *I* care about, I don't
>> know which registers IMCC cares about. So maybe IMCC's 'newcont' should
>> expand to:
>
>> save 'imcc_magic_register1'
>> save 'imcc_magic_register2'
>> target = newsub .Continuation, dest
>> restore 'imcc_magic_register2'
>> restore 'imcc_magic_register1'
>
>> Notice how making a continuation looks remarkably like making a function
>> call.
>
> The usage of above Continuation would be to pass it on into a subroutine
> and eventually return through it to an alternate position?

Have you looked at the parrotunit code that I (finally) posted? That
creates a continuation and calls a method to set it as an object
attribute. If you could fix whatever I'm doing wrong there I'd be
obscenely grateful btw.

Incidentally, the latest rewrite has the continuation created and saved
on the object using:

savetop
P5 = new Continuation
set_addr P5, catch0
I0 = 1
I1 = 0
I2 = 0
I3 = 1
I4 = 0
S0 = "setFailureHandler"
callmethodcc
restoretop

...
finally:
thisObject."tearDown"() # thisObject is self in a high register...
.pcc_begin_return
.pcc_end_return

catch0:
restoretop
...
branch finally
.end

and the continuation is invoked using 'handler("a string") rather than
simply invoking it. Catch is, 'finally' gets jumped to twice.

Other usages include stuffing the continuation into a
global.

> If yes, then the Continuation should be created inside the call
> sequence to avoid duplicate register saving and restoring code:
>
> res = subroutine(args, NEWCONT)

> But that's only one side of the business. The code at the return label
> C<dest>) has to restore the same registers too.
>
> If the target of the Continuation isn't the function return, it has to
> look somethin like this:
>
> goto around_cont_ret
> cont_ret_target:
> restoretop # or whatever was saved
> around_cont_ret:

Oh yes, I'm fine with that. The problem I have at the moment is that I
don't know *what* to save and restore. It appears that IMCC uses at
least one register in P16-32 to save a copy of P1 so that it'll be
caught by a savetop and, in the cases where I was saving individual
registers, creating the continuation, and restoring the registers, I
was failing to save the current continuation because I didn't know
where it was (this is why I want P1 to be invariant over function
calls/continuation creation). Presumably, because IMCC knows that
cont_ret is a continuation target, it can create the appropriate
real_cont_ret and add the appropriate stack manipulation code in there?
This would be really cool.

>> If the destination of the continuation is within the current
>> compilation unit (which it probably should be, or things get *very*
>> weird) then, potentially, IMCC knows what registers the continuation
>> target cares about and can automagically save the current .locals as
>> well.
>
> Yes.
>
>> Would it be possible to arrange things so that
>
>> $P0 = new .Continuation
>> $P0 = P1 # The current RetContinuation
>
>> makes $P0 into a full continuation equivalent to the RetContinuation?
>
> Sure. It depends on what part of the context should be copied into the
> Continuation:
>
> get_addr Idest, P1
> set_addr $P0, Idest # assign dest - implemented
>
> or
>
> assign $P0, P1 # vtable->set_pmc (N/Y)

Assign would be good. I can't really think of an occasion when you'd
want to copy anything less than the full context held by the
continuation.

>
> which could do whatever is appropriate.
>
> ($P0 = P1 just aliases the two and isn't usable for assignment)

D'oh.

Piers Cawley

unread,

Mar 22, 2004, 4:58:44 AM3/22/04

to l...@toetsch.at, perl6-i...@perl.org

Leopold Toetsch <l...@toetsch.at> writes:

> Piers Cawley <pdca...@bofh.org.uk> wrote:
>> Leopold Toetsch <l...@toetsch.at> writes:
>
>>> You seem to be mixing up different issues with that statement. Using
>>> plain Continuation PMCs for returning just from subroutines was dead
>>> slow, w or w/o COWed stacks.
>
>> But when a Continuation is simply a collection of pointers to the tops
>> of the various stacks (and I really do think it should include P1 in
>> that list...) will it really be that slow? I'm surprised.
>
> You are missing object creation overhead. P1 and P2 are in the saved
> P-register frame.
>
> And finally: I really don't know, if a Continuation needs just pointers
> to the stacks (and to which) or (COWed) copies (to which).

If a 'single object per stack frame' approah, the continuation only
needs pointers to the various stacktops. In fact all that anything will
need for dealing with such stacks is a pointer to their current
stacktop.

> *Will please somebody describe all the semantics of Continuation usage.*

A Continuation is a closure over the call chain, and the various stacks
in the context. Someone proposed in email that, actually, a
continuation should close over everything but the parameter
registers. Consider the following:

savetop
$P0 = newcont target
store_global "theCont", $P0
target:
restoretop
here:
...

When the continuation is invoked using, say, "cont("Result1", 2, "bibble")
the registers and user stacks should look exactly the same as if you
had just returned from a normal function call that looked like:

a_function_call()
here:

ie: P1 and P2 would be untouched, as would the high registers, and the
various parameter/return registers would be populated with the returned
values.

Given the way that the stack stuff works, I wonder if there's a case
for rejigging the calling conventions so that the control registers
(current continuation, self, (methodName?)) are contiguous with the
user registers. If we made P15 the current object, P14 the current
continuation, and S15 the methodname, then savetop could include them
efficiently without IMCC having to do 'set P20 = P1' at the start of
every sub that makes a function call.

If IMCC were then to always allocate 'user' register in ascending order
from 16, presumably it'd be possible to introduce a new op:

saverangep 14, 20
saverangei 16, 18
...

Along with associated

restorerangep <start>, <count>

(or should that be

restorerangep <start>, <end>

Thoughts?

Leopold Toetsch

unread,

Mar 22, 2004, 5:02:44 AM3/22/04

to Piers Cawley, perl6-i...@perl.org

Piers Cawley <pdca...@bofh.org.uk> wrote:
> Leopold Toetsch <l...@toetsch.at> writes:

>> C<savetop> becomes C<pushtopp> if only P-registers are used. Saving only
>> 3 or 5 registers isn't possible yet. We have no opcodes for that.

> save Pn
> save Pm

Well, these are already used for pushing items onto the user stack. It
could be

pushtopp 3 # save P16..P18
savetop 4 # save 4 regs from all I16 ..., S16, N16, P16 ... P19

and so on.

>>> Notice how making a continuation looks remarkably like making a function
>>> call.
>>
>> The usage of above Continuation would be to pass it on into a subroutine
>> and eventually return through it to an alternate position?

> Have you looked at the parrotunit code that I (finally) posted?

Yes (you must have missed a f'up with a small patch in it).

> ... That

> creates a continuation and calls a method to set it as an object
> attribute. If you could fix whatever I'm doing wrong there I'd be
> obscenely grateful btw.

Please mail me your current code.

> Incidentally, the latest rewrite has the continuation created and saved
> on the object using:

Ok, code is fine.

> catch0:
> restoretop
> ...
> branch finally
> .end

Should work.

> and the continuation is invoked using 'handler("a string") rather than
> simply invoking it. Catch is, 'finally' gets jumped to twice.

Probably registers are still messed up.

> Other usages include stuffing the continuation into a
> global.

[ PIR syntax for Continuations ]

> Oh yes, I'm fine with that. The problem I have at the moment is that I
> don't know *what* to save and restore. It appears that IMCC uses at
> least one register in P16-32 to save a copy of P1 so that it'll be
> caught by a savetop and, in the cases where I was saving individual
> registers, creating the continuation, and restoring the registers, I
> was failing to save the current continuation because I didn't know
> where it was (this is why I want P1 to be invariant over function
> calls/continuation creation).

P1 (and self) are saved in some registers in the range P16..P31 and
preserved by pushtopp (or savetop). On function return P16..P31 are
restored. The return continuation is put into P1 (or invoked) only
in the function return sequence:

.pcc_begin_return
.pcc_end_return

translates normally to:

set P1, Px
set I1, 1
...
set I5, 0
invoke P1

While the return continuation is preserved, you don't know where. Do you
need access to the current return continuation?

> ... Presumably, because IMCC knows that

> cont_ret is a continuation target, it can create the appropriate
> real_cont_ret and add the appropriate stack manipulation code in there?
> This would be really cool.

The code for creating the continuation and the return must be in sync.
When you pass the continuation on into a subroutine and want to return
either normally or through the continuation, we need something like:

$P0 = newcont dest_label FOR _invoker
_invoker($P0)
...
dest_label:
...

Creating correct code from that is a bit ugly, because the continuation
is created, before the actual call sequence is generated. So a bit more
verbose:

.pcc_begin prototyped
.arg_newcont dest_label
.pcc_call _invoker
.pcc_end
...

.CONT_TARGET dest_label:

That's still complicated but doable. That would need searching the
current unit for subroutine calls that have a C<.arg_newcont> argument,
compare the labels and create finally the very same register
restore opcode(s) that the function call has.

OTOH storing a continuation inside a global can't prepare any code for
the continuation return, because nothing about the continuation's usage
is known.

>> assign $P0, P1 # vtable->set_pmc (N/Y)

> Assign would be good. I can't really think of an occasion when you'd
> want to copy anything less than the full context held by the
> continuation.

What about:

.sym pmc cont
cont = newcont dest
...
updatecc cont # assign current P1's context to cont

With

assign cont, P1

we need another syntax extension to actually get the real P1 register:

assign cont, current_P1

or P1 is always restored immediately after function calls (like
currently P2).

I think, C<assign> and restoring P1 immediately would be the most useful
combination.

leo

Piers Cawley

unread,

Mar 22, 2004, 6:45:57 AM3/22/04

to l...@toetsch.at, Dan Sugalski, perl6-i...@perl.org

Leopold Toetsch <l...@toetsch.at> writes:

> Dan Sugalski <d...@sidhe.org> wrote:
>> Okay, as I see it there are two big things that we can do to speed
>> objects up. (Well, besides speeding up the creation of continuation
>> PMCs, which I am, at the moment, sorely tempted to put in a special
>> pool for fast allocation)
>
> I though about that already. Returncontinuations created via the *cc
> opcode variants are created in the opcode and used exactly once just for
> returning from the sub. So I'd do for these a per-interpreter
> freelist, where they get put back after invokeing the return
> contination.

How hard would it be to stick all continuations onto a 'weak'
continuation stack (not seen by DOD) then, during DOD, mark the freed
continuations (or the live ones). After DOD do the following

# Assume cpool_head = top of stack
# cpool_last = last continuation in stack
# end_of_chain = guard object, is never free

return if cpool_head.isfree
last = cpool_head
this = cpool_head.next
while !this.is_free {
last = this
this = this.next
}
last.next = end_of_chain
cpool_last.next = cpool_head
cpool_head = this

When you come to allocate a continuation, you know that, if the
head of the continuation list isn't free, there are no free
continuations on the list, so you allocate a new one and push it onto
the list.

If the head of the list is free, grab it, mark it as used, rerun the
above algorithm, populate your continuation and continue on your merry
way.

If we use a single value per stack frame approach to the stack we can
recycle stack frames in the same way.

(I know, patches welcome, but my C sucks)

--
Piers

Leopold Toetsch

unread,

Mar 22, 2004, 5:50:11 AM3/22/04

to Piers Cawley, perl6-i...@perl.org

Piers Cawley <pdca...@bofh.org.uk> wrote:
> Leopold Toetsch <l...@toetsch.at> writes:

>> And finally: I really don't know, if a Continuation needs just pointers
>> to the stacks (and to which) or (COWed) copies (to which).

> If a 'single object per stack frame' approah, the continuation only
> needs pointers to the various stacktops. In fact all that anything will
> need for dealing with such stacks is a pointer to their current
> stacktop.

That could work fine for register frames, which are big anyway. OTOH
control, user, and pad stacks items are single pointers. Accessing single
items as frames isn't really optimal.

>> *Will please somebody describe all the semantics of Continuation usage.*

> When the continuation is invoked using, say, "cont("Result1", 2, "bibble")

> the registers and user stacks should look exactly the same as if you
> had just returned from a normal function call that looked like:

> a_function_call()
> here:

That's clear. The problem is actually, how to create the continuation,
so that everything looks like a proper function call. The call
sequence of a .pcc_call maps to the creation of a continuation:

.pcc_begin
.arg x
.arg y
.create_call conti, label
.pcc_end

The return is usually a different location:

.pcc_begin
label:
.ret_from conti
.result r1
.result r2
.pcc_end

This OTOH doesn't really cover the usage of a continuation that is
passed on to a function or method call as an argument.

> Given the way that the stack stuff works, I wonder if there's a case
> for rejigging the calling conventions so that the control registers
> (current continuation, self, (methodName?)) are contiguous with the
> user registers. If we made P15 the current object, P14 the current
> continuation, and S15 the methodname, then savetop could include them
> efficiently without IMCC having to do 'set P20 = P1' at the start of
> every sub that makes a function call.

Interesting idea. OTOH these register moves are really fast. C<set_p_p>
is two CPU instructions only in the (i386) JIT core.

> If IMCC were then to always allocate 'user' register in ascending order
> from 16, presumably it'd be possible to introduce a new op:

> saverangep 14, 20

pushp from, to

could replace all variants we now have:

pushp # 0..31
pushtopp # 16..31
pushbottomp # 0..15

That would need variable sized register frame chunks. But looking at
current memory costs of C<pushtopp> and friends, it could be a good
optimization.

leo

Leopold Toetsch

unread,

Mar 22, 2004, 6:58:59 AM3/22/04

to Piers Cawley, perl6-i...@perl.org

Piers Cawley <pdca...@bofh.org.uk> wrote:

> How hard would it be to stick all continuations onto a 'weak'
> continuation stack (not seen by DOD) then, during DOD, mark the freed
> continuations (or the live ones).

The current code is much simpler. s.

void add_to_retc_free_list(Parrot_Interp, PMC*);
PMC *get_retc_from_free_list(Parrot_Interp);
void mark_object_cache(Parrot_Interp);

in objects.c

leo

Dan Sugalski

unread,

Mar 22, 2004, 7:56:08 AM3/22/04

to Gerald E Butler, perl6-i...@perl.org

At 11:33 AM -0500 3/20/04, Gerald E Butler wrote:
>Hello all,
>
> I've been investigating the possibility of creating a MACHINE
>DESCRIPTION (aka BACK-END) for GCC to target PARROT. My thinking is
>this: If a satisfactory GCC back-end targeting PARROT is created -and-
>PARROT is efficient enough (which from reading the documentation thus
>far produced seems like it is an inevitable conclusion) then GCC could
>compile itself to PARROT byte-code giving PARROT (and the whole open
>source community) a PARROT self-host compiler which compiles multiple
>languages (C#, C++, C, Pascal, Objective-C, Java, etc, etc, etc) to the
>PARROT runtime.
>
> Is a complete non-starter, or is this something which has
>possibilities? Please give your expert opinions.

While I think you may run into some issues (James pointed some out
already) I also think it's worth looking at in some more depth
anyway. It may be problematic for some languages, but GCC compiles a
lot and I don't see any reason why we shouldn't be OK with, say,
FORTRAN or java.

Leopold Toetsch

unread,

Mar 26, 2004, 10:34:58 AM3/26/04

to Dan Sugalski, perl6-i...@perl.org

Dan Sugalski <d...@sidhe.org> wrote:
> At 6:46 PM +0100 3/17/04, Leopold Toetsch wrote:

>>Or: after the 1st delegate lookup create a JITed stub

> Which is swell, except for that pesky can't-guarantee-a-JIT thing... :)

I've running that now for the C<__init> call. In the absence of
C<__init> the vtable function is replaced by C<ret>. When C<__init> is
present a JITed function stub gets called that calls the PASM w/o method
lookup.

Speedup for missing C<__init> is 100%[1], with C<__init> around 10% [2]

It doesn't work for superclasses' C<__init> yet though.

Is it more "swell" or "pesky"?

[1] new Px, Iclass in a loop
[2] oo2.pasm bench

leo

Dan Sugalski

unread,

Mar 26, 2004, 12:21:31 PM3/26/04

to l...@toetsch.at, perl6-i...@perl.org

Depends on whether it requires the JIT or not. :)

FWIW, I figure the way to do this fast is to have two extra slots in
the class object, one for constructors and one for destructors. The
class can fill 'em in at construct time, and the object
constructor/destructor stuff can just walk through the list. Saves us
the hassle of doing all the lookups at runtime.

Leopold Toetsch

unread,

Mar 29, 2004, 11:18:33 AM3/29/04

to Piers Cawley, perl6-i...@perl.org

Piers Cawley wrote:

> Leopold Toetsch <l...@toetsch.at> writes:
>
> Out of interest, why do we have distinct register and user stacks?

User stack entries are typed and you save and restore one item per
operation. The register backing stores first took 32 registers of one
kind (now 16 or 32). To get values around a pop on register stack, you
need both:

save P5
popp
restore P5 # all but P5 restored

> But I have the feeling I'm thinking IMCC is rather more sophisticated
> than it is in real life. From the point of view of a programmer, the
> important thing is that invoking a continuation should return the upper
> and control registers (but not the argument registers) to the state
> they were in when the continuation was made. How the continuation is
> subsequently stored/passed is completely irrelevant to this.

That's clear. The problem is the register stack restore sequence at the
continuation's destination label. This sequence must match the store
operations at the point, where the continuation is constructed.

leo

Piers Cawley

unread,

Mar 29, 2004, 6:39:59 AM3/29/04

to l...@toetsch.at, perl6-i...@perl.org

Leopold Toetsch <l...@toetsch.at> writes:
> Piers Cawley <pdca...@bofh.org.uk> wrote:
>> Leopold Toetsch <l...@toetsch.at> writes:
>
>>> C<savetop> becomes C<pushtopp> if only P-registers are used. Saving only
>>> 3 or 5 registers isn't possible yet. We have no opcodes for that.
>
>> save Pn
>> save Pm
>
> Well, these are already used for pushing items onto the user stack. It
> could be
>
> pushtopp 3 # save P16..P18
> savetop 4 # save 4 regs from all I16 ..., S16, N16, P16 ... P19
>
> and so on.

Out of interest, why do we have distinct register and user stacks?

[...]

>> ... Presumably, because IMCC knows that
>> cont_ret is a continuation target, it can create the appropriate
>> real_cont_ret and add the appropriate stack manipulation code in there?
>> This would be really cool.
>
> The code for creating the continuation and the return must be in sync.
> When you pass the continuation on into a subroutine and want to return
> either normally or through the continuation, we need something like:
>
> $P0 = newcont dest_label FOR _invoker
> _invoker($P0)
> ...
> dest_label:
> ...

But the function the continuation gets passed to is completely
irrelevant. When you make a continuation you want to save exactly the
same state as you'd save if you were making a function call at the same
point.

Say you had code like

...
$P0 = "Something"
$P1 = "Something else"
$P2 = "Some other thing"
.newcont $P3, dest_label
...
do_return:
.pcc_begin_return
.pcc_end_return

dest_label:
print $P0
print $P2
branch do_return

Then it'd be cool if IMCC could look ahead to see that when (if) the
continuation is invoked, the only registers that get used are $P0 and
$P2 and emit something like:

...
$P0 = "Something"
$P1 = "Something else"
$P2 = "Some other thing"
save P1
save P2
save $P0
save $P2
$P3 = newcont Continuation, dest_label
restore $P2
restore $P0
restore P2
restore P1
...
do_return:
.pcc_begin_return
.pcc_end_return

dest_label:
restore $P2
restore $P0
restore P2
restore P1
print $P0
print $P2
branch do_return

But I have the feeling I'm thinking IMCC is rather more sophisticated
than it is in real life. From the point of view of a programmer, the
important thing is that invoking a continuation should return the upper
and control registers (but not the argument registers) to the state
they were in when the continuation was made. How the continuation is
subsequently stored/passed is completely irrelevant to this.

> Creating correct code from that is a bit ugly, because the continuation

> is created, before the actual call sequence is generated. So a bit more
> verbose:
>
> .pcc_begin prototyped
> .arg_newcont dest_label
> .pcc_call _invoker
> .pcc_end
> ...
>
> .CONT_TARGET dest_label:
>
> That's still complicated but doable. That would need searching the
> current unit for subroutine calls that have a C<.arg_newcont> argument,
> compare the labels and create finally the very same register
> restore opcode(s) that the function call has.
>
> OTOH storing a continuation inside a global can't prepare any code for
> the continuation return, because nothing about the continuation's usage
> is known.

This is irrelevant.

>>> assign $P0, P1 # vtable->set_pmc (N/Y)
>
>> Assign would be good. I can't really think of an occasion when you'd
>> want to copy anything less than the full context held by the
>> continuation.
>
> What about:
>
> .sym pmc cont
> cont = newcont dest
> ...
> updatecc cont # assign current P1's context to cont
>
> With
>
> assign cont, P1
>
> we need another syntax extension to actually get the real P1 register:
>
> assign cont, current_P1
>
> or P1 is always restored immediately after function calls (like
> currently P2).
>
> I think, C<assign> and restoring P1 immediately would be the most useful
> combination.

Absolutely.

Vishal Vatsa

unread,

Apr 29, 2004, 11:24:00 AM4/29/04

to perl6-i...@perl.org

hi Guys
I am working on this problem as my masters research project.

Details to follow soon

--
Vishal Vatsa
Dept. of Computer Sc.
NUI Maynooth

Vishal Vatsa

unread,

Apr 29, 2004, 11:41:45 AM4/29/04

to perl6-i...@perl.org, Gerald E Butler

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi

I am research student in Ireland and I have been working on this problem for a
while now, I have spent a couple of months getting to grips with gcc
internals, (by the time I finish this project I might not have any hair left
on my head ;o).

> Hello all,
>
> I've been investigating the possibility of creating a MACHINE
> DESCRIPTION (aka BACK-END) for GCC to target PARROT. My thinking is
> this: If a satisfactory GCC back-end targeting PARROT is created -and-
> PARROT is efficient enough (which from reading the documentation thus
> far produced seems like it is an inevitable conclusion) then GCC could
> compile itself to PARROT byte-code giving PARROT (and the whole open
> source community) a PARROT self-host compiler which compiles multiple
> languages (C#, C++, C, Pascal, Objective-C, Java, etc, etc, etc) to the
> PARROT runtime.
>
> Is a complete non-starter, or is this something which has
> possibilities? Please give your expert opinions.

I have managed to create a token backend for parrot so far, nothing really
great, (ie does not work at the moment). I hope to have a prototype ready for
YAPC::Eu::Belfast in september.

I have had my head in the guts of gcc for a while now, so have not been
looking at the list.

I am aware of the PMC issues, but I am going full steam at the moment,
hopefully I will not turn in to a train wreck :o)

My main aim is to create the parrot backend, and then have some fun with
parrot and jvm.

In the next week or two I will put up my research page ( what I have done and
what I am doing) on http://www.parrot.cs.may.ie

and I will monitor this thread with great interest.

Thats all for now.

- --

Vishal Vatsa
Dept. of Computer Sc.
NUI Maynooth

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQFAkSJCJF0YUd7wiaIRArpYAJ0R4Mf84JRxdjfww3JJwJc2pWfTUgCbBszR
iwmDBv1UxqYAbpciVlIqXo4=
=m0lK
-----END PGP SIGNATURE-----