0) The register allocator isn't to blame, that looks really fine.
We have two kinds of problems one with exceptions and one with
continuations. As exceptions are continuations, we could also define it
as the same problem. Anyway, the usage of exceptions is a bit different.
Catch blocks are local labels only AFAIK in HLL. Parrot allows currently
a Sub too, but that might be bogus.
1) Exceptions (see t/op/gc_14.imc)
We have a typical usage showing the problem like:
n = x
... # code that can reuse register of n
newsub eh, .Exception_Handler, catch
set_eh eh
# code that migth through
n = y
clear_eh
catch:
print n
Now the register allocator just doesn't know that there exists a control
flow graph edge from anywhere in the try block to the catch label. The
register of variable n from the first statement is therefore not
preserved, as it's reassigned in the try block. So the catch block can
get the wrong value (neither x nor y) of n.
A simble solution could look like:
n = x
new eh, .Exception_Handler
set_eh eh, catch
# code that might through
n = y
clear_eh
catch:
with the changed opcode C<set_eh (in PMC, labelconst INT).
This would suffice to make a real branch target out of the catch label
and the register allocator should do the right thing. You can test that
by inserting "unless $P0, catch" after "set_eh" in the mentioned test.
(there is some code in imcc that specially treats newsub, but the newsub
ocpode isn't the branch - basically everything below C<set_eh> may
branch to the catch label)
(Did I already mention that implict behavior like using a register or
branching is bad)
2) Continuations (t/op/gc_13.imc [Hi Piers])
Again we have a hidden branch done by a Continuation, or better a loop.
The control flow of the main program is basically represented by this
conventional code fragment:
arr1=[...]; arr2=[...]
loop1: x = choose(arr1)
loop2: y = choose(arr2)
...
failed = fail()
goto (loop1, loop2)[failed]
except that the gotos are performed by backtracking via the
continuations. So we don't have these loop labels and the continuation
continues at the next opcode after the invocation of choose() and not at
the shown position above.
So again, the register allocator doesn't see that there is a branch, the
variable's arr2 register is clobbered, in this case by the fail closure.
As we never know, if a subroutine captures the return continuation and
creates such loops like above, we have in the absence of other syntax
actually no chance to hold any variable in a register as far as I can
see now. We'd have just to force using lexicals for all vars, except for
leaf subroutines (that don't call other subs) or subs that just call one
sub (they can't create loops).
Another idea is to create edges from all 1..n function calls in a sub to
the previos 0..n-1 calls, so that basically all possible loops done via
possible continuations are represented in the CFG.
Comments welcome,
leo
That analysis looks correct to me--any time you have a function call,
the subsequent op might be a branch target, if there is a subsequent
function call.
> We'd have just to force using lexicals for all vars
Having variable-size register sets would solve this, since you could
have fixed assignments of variables to registers, and not have to
suffer the overhead of moving data between registers and lexical pads
over-and-over. Well, it doesn't really "solve" it--just makes it
workable.
JEff
I like the idea of mandating lexicals vars. This would also eliminate
the need for spilling (I think), as the register allocator would only
need to refetch the lexical rather than save it off somewhere to be
restored later.
I get the feeling that this is equivalent to requiring exception
handlers to be a locally defined closure, which is another way we
could go about this.
Matt
--
"Computer Science is merely the post-Turing Decline of Formal Systems Theory."
-???
> I like the idea of mandating lexicals vars. This would also eliminate
> the need for spilling (I think), as the register allocator would only
> need to refetch the lexical rather than save it off somewhere to be
> restored later.
There are two issues: yes with refetch - no with store (probably).
Lexicals and globals are references, so:
add Plex, Px, Py
stores already x+y in the variable lex. Well, that's a compiler issue
and a reason to keep the current "destination exists" semantics of
opcodes. (This usage doesn't cope with the generation of new values,
though and morping "Undef" isn't a good solution)
> I get the feeling that this is equivalent to requiring exception
> handlers to be a locally defined closure, which is another way we
> could go about this.
Yes. That solves it. OTOH going all along with lexicals could be pretty
inefficient.
> Matt
leo
On Sat, 13 Nov 2004 14:08:12 -0800, Jeff Clites <jcl...@mac.com> wrote:
> That's oversimplifying a bit, but I feel like those are the core issues
> (stemming from the observation of Leo's that continuations in effect
> give all variables a lifetime that extends to their entire block, in
> most cases).
This does not give all variables extended lifetimes. It only gives
variables that are used in the exception handler such a lifetime.
Thus temporaries used in calculations can be safely overwritten.
Perhaps we should try adding the control flow arcs to the basic block
analysis and see how the allocator handles it then...
> On Sat, 13 Nov 2004 14:08:12 -0800, Jeff Clites <jcl...@mac.com>
> wrote:
>> That's oversimplifying a bit, but I feel like those are the core
>> issues
>> (stemming from the observation of Leo's that continuations in effect
>> give all variables a lifetime that extends to their entire block, in
>> most cases).
>
> This does not give all variables extended lifetimes. It only gives
> variables that are used in the exception handler such a lifetime.
> Thus temporaries used in calculations can be safely overwritten.
> Perhaps we should try adding the control flow arcs to the basic block
> analysis and see how the allocator handles it then...
Not all variables, but due to Leo's case (2), it should be all
variables which are referenced after the first function call out of a
subroutine, if there's a later function call; for instance, consider:
...
foo();
a = 10;
b = 12;
... use a and b
bar();
... never use a or b again
c = 1;
d = 2;
... use c and d
baz();
... never use c or d again
zoo();
Normally, the lifetime of a and b would end at the call to bar, and
that of c and d would end at the call to baz, but due to continuations,
the call to zoo might resume at the op right after the call to foo
(since the continuation created when calling foo may have been
captured), so a,b,c,d have to persist at least until the last function
call is made.
We could teach the allocator about these possibilities as you
mentioned, and that would give us correct program behavior, but we know
a priori that we'll have much higher register pressure that you might
expect, so a different overall strategy might work better.
JEff
On Sat, 13 Nov 2004 17:35:02 -0800, Jeff Clites <jcl...@mac.com> wrote:
> Not all variables, but due to Leo's case (2), it should be all
> variables which are referenced after the first function call out of a
> subroutine, if there's a later function call; for instance, consider:
>
> ...
> foo();
> a = 10;
> b = 12;
> ... use a and b
> bar();
> ... never use a or b again
> c = 1;
> d = 2;
> ... use c and d
> baz();
> ... never use c or d again
> zoo();
>
> Normally, the lifetime of a and b would end at the call to bar, and
> that of c and d would end at the call to baz, but due to continuations,
> the call to zoo might resume at the op right after the call to foo
> (since the continuation created when calling foo may have been
> captured), so a,b,c,d have to persist at least until the last function
> call is made.
Yes, but in the case of the continuation resuming after foo, the
continuation should restore the frame to the point where it was taken.
Thus all of the registers will be exactly as they were when the
continuation was taken (i.e. in the correct place). Thus, it does not
matter if we reuse a register later in the function, the continuation
will restore the proper context for itself.
Exceptions handlers, on the other hand, are a different story, because
anything that is used in the catch block must be kept in the correct
place through out the entire try block. However, the allocator will
figure this out for itself if we provide it the branch information.
Another reasonable option is to mandate that the exception handler
starts without any preconception about the register contents and
fetches everything as needed. This means that only lexicals can be
used in the handler, but it is a slightly softer requirement then only
lexicals everywhere.
> Yes, but in the case of the continuation resuming after foo, the
> continuation should restore the frame to the point where it was taken.
> Thus all of the registers will be exactly as they were when the
> continuation was taken (i.e. in the correct place).
Yes, but Jeff's example wasn't really reflecting the problem.
The case I've shown looks like:
.local pmc arr1, arr2
arr1=[1,3,5]
arr2=[1,5,9]
x = choose(arr1)
y = choose(arr2)
# arr2 never used beyond
$P0 = ...
At the last line the register allocator happily reuses the register that
arr2 had for $P0. That's totally legal in the absence of continuations.
So it doesn't suffice that the register frame is restored and that
variables are in the same place.
The effect of the continuation is the creation of a loop in the CFG.
Life time of variables and thus register allocation is different within
loops.
> Exceptions handlers, on the other hand, are a different story, because
> anything that is used in the catch block must be kept in the correct
> place through out the entire try block.
No they aren't really different. The presence of an exception handler
(and the code for installing such a handler) is more visible in PIR.
That's the only difference.
But again up to the above continuation example. The scheme source of the
relevant parts is this:
(define (choose . all-choices)
(let ((old-fail fail))
(call-with-current-continuation
(lambda (continuation)
...
(let ((x (choose 1 3 5))
(y (choose 1 5 9)))
In that source it's obvious that the continuations of choose are
captured in the local closures created by the lambda. So it's probably
just a lack of the compiler (and a lack of PIR syntax) to express the
relevant information that the call to choose has the possible
side-effect of being resumed just after the created "invokecc" opcode.
So from a HLL point of view that's all visible and clear.
<cite author="Piers">And, damnit, making a full continuation isn't
something a programmer should do lightly.
</cite>
And of course an HLL compiler can't and doesn't emit some code that
captures a continuation silently and w/o any reason.
So what to do:
1) Extending register frame size ad infinitum and never reuse a Parrot
register will definitely blow caches.
2) Generating an edge for every call to every previous calls will blow
the CFG and cause huge pressure on the register allocator. A lot of
spilling will be the result.
3) Using lexicals all over is slower (but HLL compilers will very likely
emit code that does exactly that anyway). So the problem may not be a
real problem anyway. We just know that an optimizer can't remove the
refetch of lexicals in most of the subroutines.
4) Having an explicit syntax construct "(call-with-current-continuation
" that expresses verbatim, what's going on, like e.g. with a reserved
word placed as a label:
RESUMEABLE: x = choose(arr1)
leo
We don't really have that much of a problem. What we have is just
something more simple -- the target of a continuation marks the start
of a basic block. That means that we have to assume everything we
don't get handed back from the function's dirty and should be
refetched.
Or, alternately, if we declare that the top half of the register set
is preserved on function call and return we can assume that the PMCs
and values in there are acceptable for use, though any that map to
lexicals or globals ought to be refetched, since we have the
possibility that the names have been rebound.
I'm perfectly fine in declaring that this is *only* legitimate in
mainline code, and that code generators don't have to deal with the
possibility that vtable or MMD function code has rebound names.
--
Dan
--------------------------------------it's like this-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk
We're dealing with languages that pretty much mandate inefficiency in
many ways. Since, for example, it's completely reasonable (well,
likely at least) for a called sub to rebind lexicals in its parent,
refetching's pretty much required.
> Since, for example, it's completely reasonable (well, likely at least)
> for a called sub to rebind lexicals in its parent
What does that mean, exactly? It seems like that directly contradicts
the meaning of "lexical". For instance, see Larry's comments from "Re:
Why lexical pads" at September 25, 2004 10:01:42 PM PDT (the first of
the 3 messages from him on that day).
JEff
I tend to agree that this is not such a problem. Basically, Parrot
has various control flow possibilities that were not previously
considered. (1) An exception can happen in a try block, so CFG arcs
need to be added from statements within the try block to the catch.
(2) Continuations, which I don't pretend to understand, can also flow
in previously unconsidered directions. Those arcs need to be added to
the CFG.
Alternately, for exceptions, it may be possible to circumvent that
process, and just add symbolic interfenence to all vars in the try
block with all vars in the catch block.
For continuations, however, it seems like those really are control
flow arcs that need to be added. If analysis can trim a few of them,
then that would be great, but if people are using continuations, maybe
they shouldn't expect high performance, and extra register spilling
may be another side effect.
~Bill
> Matt Fowles <uber...@gmail.com> wrote:
>
>> Yes, but in the case of the continuation resuming after foo, the
>> continuation should restore the frame to the point where it was taken.
>> Thus all of the registers will be exactly as they were when the
>> continuation was taken (i.e. in the correct place).
>
> Yes, but Jeff's example wasn't really reflecting the problem.
How come? (Not quibbling--just afraid I'm missing something.) It seems
that even this function body shows the problem:
a = 1
foo()
print a
b = 10
return b
It would seem (w/o continuations) that b should be able to re-use a's
register, but if it does then we'll print 10 instead of 1 "the second
time".
> So what to do:
>
> 1) Extending register frame size ad infinitum and never reuse a Parrot
> register will definitely blow caches.
>
> 2) Generating an edge for every call to every previous calls will blow
> the CFG and cause huge pressure on the register allocator. A lot of
> spilling will be the result.
>
> 3) Using lexicals all over is slower (but HLL compilers will very
> likely
> emit code that does exactly that anyway). So the problem may not be a
> real problem anyway. We just know that an optimizer can't remove the
> refetch of lexicals in most of the subroutines.
It seems that, in term of cache locality, the same problem is there
with more registers v. spilling v. lexicals. That is, if you have 100
local variables whose lifetimes overlap (due to continuations), then
you need 100 distinct memory locations to (repeatedly) access.
> 4) Having an explicit syntax construct "(call-with-current-continuation
> " that expresses verbatim, what's going on, like e.g. with a reserved
> word placed as a label:
>
> RESUMEABLE: x = choose(arr1)
I don't think that really helps either: If you have such a call, then
all the frames higher up the stack also can "return multiple times", so
they have the behavior, even w/o the label.
On the other hand, this Ruby code really bugs me (note: "$" variables
in Ruby are globals):
% cat continuation5.ruby
def strange
callcc {|continuation| $saved = continuation}
end
def outer
a = 0
strange()
a = a + 1
print "a = ", a, "\n"
a = "hello"
print "a = ", a, "\n"
end
outer()
$saved.call
% ruby continuation5.ruby
a = 1
a = hello
continuation5.ruby:8:in `+': failed to convert Fixnum into String
(TypeError)
from continuation5.ruby:8:in `outer'
from continuation5.ruby:14
What bugs me is that "outer" gets an error when the continuation is
invoked, because "the second time" strange() returns, "a" is a string
and so you can't add 1 to it. But looking at the definition of "outer",
you'd expect that you could never get such an error. (Without the line
setting "a" to "hello", you get an infinite loop, printing increasing
integers.)
JEff
It can. You can't reuse the same PMC (if you're using PMCs), but you
can reuse the register.
It all comes down to how the code is generated. I've done this in a
project of mine, and it takes a little thought, but it works. When you
take a continuation, you have to saveall before you take it, and
restoreall at the point where the continuation is to resume.
This is the trick I used (modulo brain code rot--I haven't written PIR
in a really long time):
saveall
$P0 = new Continuation
set_addr $P0, RESUME
save $P0
restoreall
restore $P0
# ... $P0 is your continuation
RESUME:
restoreall
# ...
On the other hand, this may be completely irrelavent, since I haven't
been following the discussion.
Luke
> We don't really have that much of a problem. What we have is just
> something more simple -- the target of a continuation marks the start
> of a basic block.
It is of course a new basic block. But setting the CFG correctly would
need to create loops, that is an edge from all subcalls 1..n to previous
subcalls 0..n-1. That could create big increases in CFG size.
> ... That means that we have to assume everything we
> don't get handed back from the function's dirty and should be
> refetched.
I'm fine with refetching lexicals, as - you say it - the may got
rebound. But what about more static languages?
> Or, alternately, if we declare that the top half of the register set
> is preserved on function call and return
These are preserved anyway - as well as the lower half. Please forget
the upper/lower half notion coming from old savetop times. The failing
gc_13 test is not failing because the register isn't preserved. It's
failing because there is no indication that this register isn't
reassignable due to the loop-effect of the continuation.
[ refetching lexicals/globals ]
> I'm perfectly fine in declaring that this is *only* legitimate in
> mainline code, and that code generators don't have to deal with the
> possibility that vtable or MMD function code has rebound names.
Yes, as said I'm fine too with that. Perl/Python will do that anyway.
But what about other languages? Are we forcing these to be as dynamic as
the primary HLL targets?
leo
>> We don't really have that much of a problem. What we have is just
>> something more simple -- the target of a continuation marks the start
>> of a basic block. That means that we have to assume everything we
>> don't get handed back from the function's dirty and should be
>> refetched.
> I tend to agree that this is not such a problem. Basically, Parrot
> has various control flow possibilities that were not previously
> considered. (1) An exception can happen in a try block, so CFG arcs
> need to be added from statements within the try block to the catch.
I've already proposed a simple solution for the exception case:
new eh, Exception_Handler
set_eh eh, catch_label
# try block
catch_label:
# catch block
The try block is starting exactly at this point, where the exception
handler gets pushed onto the contol stack. By connecting the C<set_eh>
opcode to the catch block with an edge in the CFG, we could easily
prevent the reuse of registers located in front of the try block.
> (2) Continuations, which I don't pretend to understand, can also flow
> in previously unconsidered directions. Those arcs need to be added to
> the CFG.
As said: there are no invisible continuations taken. From an HLL point
of view, it's very clear what is happening.
> For continuations, however, it seems like those really are control
> flow arcs that need to be added. If analysis can trim a few of them,
> then that would be great, but if people are using continuations, maybe
> they shouldn't expect high performance, and extra register spilling
> may be another side effect.
If you insert automatically these CFG edges you can't trim them down,
there is no such analysis that could find points, where it isn't
necessary. So we have the performance degradation for *all* subs,
because w/o any compiler hints any subroutine invocation could act as a
loop.
Again - we'd get such loops:
sub1() <---+ <---------+
... | |
sub2() ----+ <-+ |
... | |
sub3() -----------+-----+
The backward branches are going to the opcode after the invoke. You'd
have to connect sub 1..n to sub 0..n-1. This are 81 loops for calling 10
subroutines in a row. This kills for sure all efforts the register
allocator might take, we'll end up with spilling a lot.
> ~Bill
leo
>> a = 1
>> foo()
>> print a
>> b = 10
>> return b
>>
>> It would seem (w/o continuations) that b should be able to re-use a's
>> register, but if it does then we'll print 10 instead of 1 "the second
>> time".
> It can. You can't reuse the same PMC (if you're using PMCs), but you
> can reuse the register.
No. With the presence of a continuation the "print a" can be executed
twice. If now C<b = 10> reuses a's register, it'll break.
> It all comes down to how the code is generated. I've done this in a
> project of mine, and it takes a little thought, but it works. When you
> take a continuation, you have to saveall before you take it, and
> restoreall at the point where the continuation is to resume.
Please forget C<saveall>. This was the way to go before we had register
frames.
> This is the trick I used (modulo brain code rot--I haven't written PIR
> in a really long time):
> saveall
> $P0 = new Continuation
> set_addr $P0, RESUME
> save $P0
> restoreall
> restore $P0
Sure. That's what we formerly used to do. Two memcpys á 640 bytes + 2
stack operations. The memcpys were killing performance.
This isn't needed anymore. A continuation does restore the register
frame. In your code above you emit all possible code to do the rigth
thing. I'm proposing a much simpler syntax:
RESUMABLE: foo()
# code here might be executed more then once
> Luke
leo
>> Yes, but Jeff's example wasn't really reflecting the problem.
> How come?
Your code didn't reuse C<a> after the call.
> ... It seems
> that even this function body shows the problem:
Yes, that one is reflecting it.
> a = 1
> foo()
> print a
> b = 10
> return b
> It would seem (w/o continuations) that b should be able to re-use a's
> register, but if it does then we'll print 10 instead of 1 "the second
> time".
Yep, if there is another call that uses the captured continuation of the
foo() call and continues at "print a".
> It seems that, in term of cache locality, the same problem is there
> with more registers v. spilling v. lexicals.
Not quite. We can't extend just one register set, we'd do that for all 4
kinds. You can not easily have 64 PMC registers and just 10 INTVALs.
A lexical access is just an array lookup (at least if the compiler uses
the indexed addressing of lexicals).
> ... That is, if you have 100
> local variables whose lifetimes overlap (due to continuations), then
> you need 100 distinct memory locations to (repeatedly) access.
Sure. If the program is complex you need the storage anyway.
>> 4) Having an explicit syntax construct "(call-with-current-continuation
>> " that expresses verbatim, what's going on, like e.g. with a reserved
>> word placed as a label:
>>
>> RESUMEABLE: x = choose(arr1)
> I don't think that really helps either: If you have such a call, then
> all the frames higher up the stack also can "return multiple times", so
> they have the behavior, even w/o the label.
The RESUMABLE label is of course at the invocation that might
resume, somehwere up the call chain. Again: the HLL compiler (or the
programmer) knows the effect of the continuation. Just the PIR code is
too dumb, i.e. is lacking this information.
> On the other hand, this Ruby code really bugs me (note: "$" variables
> in Ruby are globals):
> ... , you get an infinite loop, printing increasing
> integers.)
Sure. With callcc you are calling the function again and again.
> JEff
leo
I think I may know a way around this problem. You will have to bear
with me for a moment as I am not entirely used to speaking in terms of
continuations (I find a similar difficulty in speaking about Math, but
at least I have a commonly accepted lexicon for that ;-).
The view from 10,000 feet. Full continuations do not operate on their
context in place, but operate on a copy of it when invoked.
The nitty gritty, consider (as provided by Jeff):
a = 1
foo()
print a
b = 10
return b
in the case where foo does not construct a full continuation an donly
uses the return continuation, no extra copying is done and everything
works. In the case where foo takes a full continuation and puts it in
a global "evil", when foo upgrades its return continuation to a full
one (which happens automatically if foo or one of its children creates
a full continuation of its own), all of the continuations down the
tree will be marked as full. When a full continuation is invoked it
*copies* its contenxt into place (thus it can be invoked multiple
times and it will always have its original context). This means that
invoking full continuations will have a speed hit associated with it
(although that is to be expected), creatings full continuations has a
smaller speed hit of marking the chain (also reasonably expected), but
invoking and creating return continuations would still be cheap.
Hopefully that made sense to someone other than me,
> ... When a full continuation is invoked it
> *copies* its contenxt into place (thus it can be invoked multiple
> times and it will always have its original context).
If you mean by "context" C<struct Parrot_Context> then yes, this is what
continuations are doing. But then nothing is solved.
If you mean context + registers then there is another problem: returning
via continuation would restore everything, including e.g. a loop-counter
that keeps track of how often you loop via the continuation.
A continuation isn't allowed to do that, you couldn't make any progress
in that subroutine, when all register contents is restored.
leo
Okay, for this one if it's optional then we might as well not do it,
since it'll work so rarely all it'll be is a massive source of bug
reports. "Why did the sub that has no idea that I'm messing with its
lexical bindings only spottily and irregularly notice that I rebound
them?" I just don't want to go there.
Compilers can assume lexicals aren't going to get rebound outside of
their view. They're still going to have to deal with global rebinding.
> Jeff Clites <jcl...@mac.com> wrote:
>> On Nov 14, 2004, at 3:03 AM, Leopold Toetsch wrote:
>
>>> Yes, but Jeff's example wasn't really reflecting the problem.
>
>> How come?
>
> Your code didn't reuse C<a> after the call.
Oops.
>> It seems that, in term of cache locality, the same problem is there
>> with more registers v. spilling v. lexicals.
>
> Not quite. We can't extend just one register set, we'd do that for all
> 4
> kinds. You can not easily have 64 PMC registers and just 10 INTVALs.
> A lexical access is just an array lookup (at least if the compiler uses
> the indexed addressing of lexicals).
Ah. What I don't like, though, is that in the in the lexical case you
have things sitting in an array, from which you need to move things
back-and-forth to registers to work on them. In the "more registers"
case, they're sitting in a quite similar array, but you can work on
them directly there. Adding 2 such INTVALs is one op, instead of 4 (2
loads, 1 add, 1 store). Since the problem exists for all register types
(unless HLLs only use PMCs, which is possible), then we either need 4
lexical arrays for maximum locality (so that they can be sized
independently), or we need to be able to size the register frames
independently for the 4 types. (I realize that currently lexicals must
be PMCs, but it seems we have the same issue with other reg. types.)
>> ... That is, if you have 100
>> local variables whose lifetimes overlap (due to continuations), then
>> you need 100 distinct memory locations to (repeatedly) access.
>
> Sure. If the program is complex you need the storage anyway.
But I think the real problem is that it's only due to CPS that the
lifetimes are overlapping so much--that's what's biting us. (By
expanding my example code, you could come up with simple code which
uses 100 locals be would only need 1 register w/o continuations, and
needs 100 "spots" with them.) It's just a pretty unfortunate price
we're paying, if the feature is not extensively used. Now we're just
figuring out how to survive it.
>>> 4) Having an explicit syntax construct
>>> "(call-with-current-continuation
>>> " that expresses verbatim, what's going on, like e.g. with a reserved
>>> word placed as a label:
>>>
>>> RESUMEABLE: x = choose(arr1)
>
>> I don't think that really helps either: If you have such a call, then
>> all the frames higher up the stack also can "return multiple times",
>> so
>> they have the behavior, even w/o the label.
>
> The RESUMABLE label is of course at the invocation that might
> resume, somehwere up the call chain. Again: the HLL compiler (or the
> programmer) knows the effect of the continuation. Just the PIR code is
> too dumb, i.e. is lacking this information.
Picture this call stack:
main --> A --> B --> C --> D --> E
The call from D to E uses the "RESUMEABLE" label, and E stores the
resulting continuation in a global, and everything up to main returns.
Then, main invokes the continuation, and we find ourselves back in D.
Now, C, B, and A will return "again", without any compile-time hint.
>> On the other hand, this Ruby code really bugs me (note: "$" variables
>> in Ruby are globals):
>
>> ... , you get an infinite loop, printing increasing
>> integers.)
>
> Sure. With callcc you are calling the function again and again.
I know--the infinite loop was the "desired" behavior (just mentioned it
so that people wouldn't have to run it). What bugs me is that you can
get that error, though looking at the code it should be impossible. The
author of outer() might have no clue that could happen, so it's not
really his bug, and the person using a continuation needs really
detailed knowledge of everything in the call stack, to know if it will
work. I guess that's "just how it is", but it seems to mean that
continuations have limited usefulness in languages with side-effects,
except for very local usage (breaking out of a loop and such).
JEff
> Picture this call stack:
> main --> A --> B --> C --> D --> E
> The call from D to E uses the "RESUMEABLE" label, and E stores the
> resulting continuation in a global, and everything up to main returns.
> Then, main invokes the continuation, and we find ourselves back in D.
> Now, C, B, and A will return "again", without any compile-time hint.
That's fine. The return is an expected operation. I don't think that's
the problem. The error in gc_13 is a call path like:
choose() -> try (with_cc) -> fail() ->
|
(choose again) <- + <--+
|
choose() -> try (with_cc) -> fail() -> |
| |
(choose again) <- + |
|
fail() --------------------------------------+
The problem now is not per se the path in main from the two choose()
calls down to fail is executed more then once (as it's the case with
multiple returns). The problem is the loop in main. By denoting the loop
from the call to fail() to the first choose() with some kind of syntax,
the register allocator does the right thing.
> JEff
leo
On Mon, 15 Nov 2004 20:30:18 +0100, Leopold Toetsch <l...@toetsch.at> wrote:
> Matt Fowles wrote:
>
>
> > Leo~
> >
> > On Mon, 15 Nov 2004 17:36:20 +0100, Leopold Toetsch <l...@toetsch.at> wrote:
> >
> >>Matt Fowles wrote:
> >>
> >>
> >>>I do mean context + registers and it can do that. If you keep your
> >>>loop-counter in a PMC, then it will be incremented
> >>
> >>Ok, but having suddenly such a difference between value and reference
> >>types would really cause weird behavior.
> >
> >
> > I disagree. This is exactly the sort of distinction that has always
> > been annoying novice programmers with most every language.
>
> Yes of course. But continue that sentence above: weird... depending on
> the presence of Continuation, which, by creating a loop from your code,
> start preserving your scalar data types.
I am sorry, but I don't understand what you are trying to say here.
Would you mind rewording it for me?
> I didn't mention the runtime impact yet. You said already that it's
> costy. This would make continuations unusable for backtracking in the
> rules/rx engine.
The runtime inpact would be comparable to the speed of our previous
calls where memory had to be copied off and restored (accept that we
only need one of the copies instead of two).
I am not saying that this is necessarily the correct solution, but it
is a solution that would impose minimal breakage and would not put
heavy constraints on the register allocator.
One thing that worries me is the need to copy the entire callstack
worth of return continuations into full ones when a continuation is
created/promoted. This could be optimized for the regex case by
taking an initial continuation and thereafter using its copied stack
to initialize any newly taken continuations.
Which gives me an evil idea. We could allow bytecode to specify that
it wanted to start taking full continuations everywhere, but that
these would never be used below it on the callstack. Thus the regex
engine could do this and not suffer too much efficiency loss for
taking new continuations for every backtrack point.
Cheers,
Michael
[ continuations should restore registers ]
> I am sorry, but I don't understand what you are trying to say here.
> Would you mind rewording it for me?
Imagine a simple loop:
i = 0
lp:
foo()
inc i
if i < 10 goto lp
Looking at the loop counter a programmer or optimizer could easily
decide that using an I-Reg for it instead of a P-Reg is fine. Now comes
your proposed implementation for continuations: they copy the register
frame on creation and restore it on invocation. Besides of the big cost
of the memcpy's this simple loop above suddenly stops working, depending
on the implementation of foo() - which might be outside your control.
BTW in an early stage we had exactly this behavior of continuations.
This was abandoned. The subject was IIRC something like "Should
continuations close over registers". The answer was clearly "no".
leo
There is one thing I am not sure about here. The loop will work
correctly for each seperate invocation of the appropriate
continuation. The first time you call foo i is 0. The second time i
is 1. If foo ever invokes the full continuations that it captured at
one of these points, then i will go back to whatever it was when that
continuation was captured. All of this seems like reasonable behavior
to me. In the general case our optimizer will not be able to downgrad
i from a P to an I register anyway, as foo could mess with $CALLER::i
or whatever. Thus, I am not sure that I by your argument.
> On Tue, 16 Nov 2004 09:23:24 +0100, Leopold Toetsch <l...@toetsch.at> wrote:
>> i = 0
>> lp:
>> foo()
>> inc i
>> if i < 10 goto lp
> There is one thing I am not sure about here. The loop will work
> correctly for each seperate invocation of the appropriate
> continuation.
No. Imagine, foo() is not a simple function anymore. Someone e.g. Jens
Rieks[1], discovers that the algoritm is simpler implemented with
continuations. So inside foo() the return continuation of foo() is
copyied, stored elsewhere, passed along to another function, and that
one now suddenly returns via this continuation to your loop. If this
invocation of the continuation would restore registers suddenly the loop
will become an infinite one, as C<i> is always restored to zero.
[1] Have a look at runtime/parrot/library/Streams/Sub.imc
leo
I disagree with that analysis. Let us consider the actual effect of
such an implementation.
First iteration
i = 0;
foo(); #at this point a continuation created capturing i=0, promoted
by Jens and stuff happens
#eventually it is invoked, restoring i=0
i++; #i = 1
foo(); #at this point a NEW return continuation is created capturing
i=1; promoted by Jens...
#eventually it is invoked, restoring i=1
i++; #i = 2
...
Thus every single invocation of foo will have an i one greater than
the last. If foo's algorithm had an error and did not use the new
return continuation to recreate its internal continuation each time,
then you would be correct. But that would be a bug in the
implementation of foo.
As the following code
#set up for foo
foo()
#set other stuff for foo
foo()
would be an infinite loop alway returning immediately after the first
invocation of foo.
I looked at Sub.imc and think it would work because write always
creates a new Continuation for each invocation of write.
That would work if there is a one to one representation of the invoation
of foo() an it's continuation. But no one guarantees that.
By repeatedly invocing the continuation you alway get to the opcode
after invoke, and registers would be restored to some earlier state.
> ... If foo's algorithm had an error and did not use the new
> return continuation to recreate its internal continuation each time,
> then you would be correct. But that would be a bug in the
> implementation of foo.
Why? If foo's implementation is changed internally to double it's work
per call, it could indicate that progress by returning twice through the
same continuation.
E.g.
unless done
(done, result) = foo()
s .= result
leo
You would, however, in this case be incorrect.
The loop variable must have a backing store outside of the register
set. The contents of registers must be assumed to be unreliable when
a continuation is continued. If we declare that they are put back
into the state they were when the continuation was taken, which is
reasonable though not required, the values of non-reference type
registers (ints and floats) will be reset. The rference type
registers (strings and PMCs) will also be reset so the pointers to
the string/pmc structs will be what they were when the continuation
was taken, but as they are references the referred-to thing may have
changed and the changed value will be seen.
But consider even this simple function:
sub B
{
a = 1
foo()
print a
b = 2
return b
}
If something called by foo() captures a continuation, and something
invokes it after B() returns, then there's a hidden branch, in effect,
from the return to the print, isn't there? The register allocator could
decide to use the same register for a and b, but then the "second"
return from foo() would print 2 instead of 1, which is wrong. And the
author of B(), of course, may have no idea such a thing would happen,
so wouldn't be able to supply any syntax to tell the compiler.
I'm just trying to come up with a simpler example, since it seems to me
that there's a problem any time a function returns, and the last thing
that executed in that frame wasn't a call to that function. (There's a
lot going on in the gc_13 example, and I think some of it is
distracting from the main point.)
But a RESUMABLE label seems like the information that's needed by the
compiler. But on the other hand in an example like the above, the
function B() may not be written to expect foo() to be resumed. So, what
should happen at runtime, if the label is absent? We could declare
undefined behavior, but that would mean that for defined behavior,
you'd need the RESUMABLE label all the way up the stack, and Ruby and
Scheme don't have that syntactic constraint. With Scheme, it's only
clear from the syntax what's going on locally--but you can invoke a
continuation far from any call/cc, if that continuation was stored away
into a variable.
JEff
> You would, however, in this case be incorrect.
>
> The loop variable must have a backing store outside of the register
> set. The contents of registers must be assumed to be unreliable when
> a continuation is continued. If we declare that they are put back
> into the state they were when the continuation was taken, which is
> reasonable though not required, the values of non-reference type
> registers (ints and floats) will be reset. The rference type
> registers (strings and PMCs) will also be reset so the pointers to
> the string/pmc structs will be what they were when the continuation
> was taken, but as they are references the referred-to thing may have
> changed and the changed value will be seen.
I am having trouble in that I agree with what you are saying, but I am
coming to a different conclusion. I think that foo would create a new
continuation (from it return continuation) each time it is called, and
thus things below it on the call stack would be unaffected by its own
internal continuation tricks (assuming for the moment that registers
are put back into place by continuations).
Since both you and Leo are arguing against me here, it seems like that
I am wrong, but I would like to figure out exactly why I am wrong so
that I can correct my train of thought in the future.
Thanks,
> Since both you and Leo are arguing against me here, it seems like that
> I am wrong, but I would like to figure out exactly why I am wrong so
> that I can correct my train of thought in the future.
Here's a real example you can play with, if you have Ruby installed:
% cat continuation6.ruby
def strange
callcc {|continuation| $saved = continuation}
end
def outer
a = 0
strange()
a = a + 1
print "a = ", a, "\n"
end
# these two lines are "main"
outer()
$saved.call
% ruby continuation6.ruby
a = 1
a = 2
a = 3
a = 4
a = 5
a = 6
a = 7
a = 8
a = 9
a = 10
...infinite loop, by design
What happens when the program runs is that outer() is called (only
once) which creates a continuation (inside of strange()), increments a,
prints and returns. The next thing that happens is that the
continuation is invoked. Control jumps to the location in strange()
right after the callcc line, then that return and we are at the line in
outer() where 'a' is incremented. So 'a' increments from the last value
it had in that frame (since we are magically back again inside of the
"same" single invocation of outer()), then 'a' is printed and outer()
returns again (note: outer only called once, returned twice so far),
and then we call the continuation again, and start the loop over.
We only ever create one continuation in this example, since we only
ever call strange() once. The continuation preserves the frame (the
mapping from logical variables to their values), but not the values of
those variables at the time the continuation was created. In effect, I
think the continuation is arranging to preserve the state of the
variables as they were when code in the frame was last executed, rather
than at the time the continuation was created.
The behavior you were describing is what I had thought would happen,
but then I realized I wasn't sure, so I confirmed that it wasn't. The
above is the behavior of Ruby, and I believe Scheme works the same way.
What you described would be useful for backtracking (jumping back not
only to a previous location in a computation, but also its previous
state), but it's not what these languages seem to do.
JEff
On Tue, 16 Nov 2004 18:32:07 +0100, Leopold Toetsch <l...@toetsch.at> wrote:
> Matt Fowles wrote:
>
>
> > I disagree with that analysis. Let us consider the actual effect of
> > such an implementation.
> >
> > First iteration
> >
> > i = 0;
> > foo(); #at this point a continuation created capturing i=0, promoted
> > by Jens and stuff happens
> > #eventually it is invoked, restoring i=0
> > i++; #i = 1
> > foo(); #at this point a NEW return continuation is created capturing
>
> That would work if there is a one to one representation of the invoation
> of foo() an it's continuation. But no one guarantees that.
I suppose that what I am arguing is that anyone who does not maintain
such a one-to-one representation (at least from the perspective of
code calling foo()); full well deserves what they get. They are
restoring the execution to an earlier state by invoking an old
continuation. If the earlier state called them again the first time,
they should probably expect the earlier state to call them again the
second time. Unless they have some specific knowledge that the
earlier state will change it behavior (because things in the heap have
changed), there should be no expectation for it to.
This is one of the fundamental properties of continuations, but it
does throw people. And it's why register contents have to be thrown
away when a continuation is invoked, since the registers have values,
and continuations don't preserve values.
On Tue, 16 Nov 2004 13:41:25 -0500, Dan Sugalski <d...@sidhe.org> wrote:
> At 10:32 AM -0800 11/16/04, Jeff Clites wrote:
> >The continuation preserves the frame (the mapping from logical
> >variables to their values), but not the values of those variables at
> >the time the continuation was created.
>
> This is one of the fundamental properties of continuations, but it
> does throw people. And it's why register contents have to be thrown
> away when a continuation is invoked, since the registers have values,
> and continuations don't preserve values.
I think right here we have the crux of my failure to understand. I
was/am under the impression that the continuation will restore the
register frame to exactly as it was when the continuation was taken.
Thus those registers which are values (I,N) will continue to have the
value they had when the continuation was taken, while those registers
which are pointers/references (S, P) will still point to the same
place, but that data may have changed. Is this correct?
> sub B
> {
> a = 1
> foo()
> print a
> b = 2
> return b
> }
> If something called by foo() captures a continuation, and something
> invokes it after B() returns, then there's a hidden branch, in effect,
> from the return to the print, isn't there?
Yes. That's right and would cause problems. Again this is creating a
loop as you say from the return to the print statement.
OTOH looking at the scheme example, you can create such continuation
loops just for nested closures. All other usage would be like a "goto"
statement into the middle of some totally unrelated subroutine, which is
only solvable by going "the all gets refetched" road.
> But a RESUMABLE label seems like the information that's needed by the
> compiler. But on the other hand in an example like the above, the
> function B() may not be written to expect foo() to be resumed.
Yes. Again, the HLL language, that is creating the code has a clear
indication, what's going on. PIR code currently hasn't.
> ... With Scheme, it's only
> clear from the syntax what's going on locally--but you can invoke a
> continuation far from any call/cc, if that continuation was stored away
> into a variable.
So all closures inside that call/cc have to be emitted in such a way that
they can cope with it. It's IMHO nothing we can solve, except for
providing some syntax construct that clearly indicates the possible loop
for the CFG.
> JEff
leo
No. The registers are just about the only thing that *isn't* restored.
Continuations put the environment back. This includes things like the
lexical pad stack, the namespace stack, the stack itself, any
security credentials... basically everything that describes the
environment. *Data*, on the other hand, is *not* restored. Data stays
as it is.
Registers are a special case of data, and they're just declared crud
by fiat, since otherwise things get nasty and unpredictable.
> No. The registers are just about the only thing that *isn't* restored.
>
> Continuations put the environment back. This includes things like the
> lexical pad stack, the namespace stack, the stack itself, any
> security credentials... basically everything that describes the
> environment. *Data*, on the other hand, is *not* restored. Data stays
> as it is.
>
> Registers are a special case of data, and they're just declared crud
> by fiat, since otherwise things get nasty and unpredictable.
Then I am not sure what you mean by "The return continuation PMC type,
used to create return continuations used for call/return style
programming, guarantees that registers 16-31 will be set such that the
contents of those registers are identical to the content of the
registers when the return continuation was I<created>."
I read that as saying that registers will be restored by
continuations. If that is not what it is intended to mean, could you
clarify for me.
Thanks,
Return continuations are special, basically. There are a number of
specialized continuation forms, and this is one of 'em. Which makes
things a bit more confusing but, well, there you go.
> Return continuations are special, basically. There are a number of
> specialized continuation forms, and this is one of 'em. Which makes
> things a bit more confusing but, well, there you go.
It seems to me that it would simpilify much of the code, and reduce
the number of special cases if we extended that to all continuations
instead of just return ones. This would allow the register allocator
to re-use registers as it chose without having to refetch everything
from backing store (which is rather problematic for non-PMC
registers).
This does mean that if an N register wants to have its value change
across continuations it needs to have a backing store somewhere, but
even without this change things need to be fetched from backing store
as the register allocator might clobber them. So this does not seem
like a burden in that case, and it does seem like a win for the
allocator.
We could, but it would be wrong. Hell, it's arguably wrong for return
continuations to do so, and it wouldn't be unreasonable to argue that
I and N register contents are guaranteed crud and required refetching.
I'm not particularly concerned with pressure on the register
allocator, honestly -- it's a pleasant add-on, and one we will
continue to do, but it's not strictly necessary. We deal with that
after we get things correct.
On Tue, 16 Nov 2004 16:24:06 -0500, Dan Sugalski <d...@sidhe.org> wrote:
> We could, but it would be wrong. Hell, it's arguably wrong for return
> continuations to do so, and it wouldn't be unreasonable to argue that
> I and N register contents are guaranteed crud and required refetching.
>
> I'm not particularly concerned with pressure on the register
> allocator, honestly -- it's a pleasant add-on, and one we will
> continue to do, but it's not strictly necessary. We deal with that
> after we get things correct.
I can accept this, but I would like to make sure that I understand all
of the represcussions of it. Thus you can consider all of the
following questions (even though they will be phrased as statements).
1) After a full continuation is taken all of the registers must be
considered invalid.
2) After a return continuation is taken, the registers can be trusted.
3) If someone takes a full continuation, all return continuations
down the callstack must be promoted.
4) After a function call, some magic needs to happen so that the code
knows whether it came back to itself via a return continuation and can
trust its registers, or it came back via a full continuation and
cannot trust them.
Corrections welcome,
> ... Thus you can consider all of the
> following questions (even though they will be phrased as statements).
> 1) After a full continuation is taken all of the registers must be
> considered invalid.
Calling a subroutine allocates a new register frame, that subs register
frame pointer in the context points to these fresh registers.
A continuation restores the context it captured, i.e. at the place,
where it was created. This is true for all continuations. Inside the
context there is a *pointer* to a register frame, which is therefore
restored too.
The effect of taking a continuation is therefore to restore registers to
that state where the continuation was created. Due to calling conventions
a part of the registers is volatile (used during a call or as return
results), while the other part is non-volatile.
Until here there is no difference between return or full continuation.
The effect of a full continuation can be to create a loop, where the
known control flow doesn't show a loop. Without further syntax to denote
such loops 1) is true. This register invalidation happens, if a
preserved register was e.g. only used once after the call and then that
register got reassigned, which is allowed for a linear control flow but
not inside a loop.
This has per se nothing to do with a continuation. If you got an opcode
that does *silently* a "goto again_label" the CFG doesn't cope with the
loop, because it isn't there and things start breaking. The effect of a
full continuation *is* to create such loops.
> 2) After a return continuation is taken, the registers can be trusted.
Yes, according to usage in pdd03.
> 3) If someone takes a full continuation, all return continuations
> down the callstack must be promoted.
If one *creates* a full continuation ...
> 4) After a function call, some magic needs to happen so that the code
> knows whether it came back to itself via a return continuation and can
> trust its registers, or it came back via a full continuation and
> cannot trust them.
No. It's too late for magic. Either the CFG is known at compile time or
refetching in the presence of full continuations is mandatory. For both
the code must reflect the facts.
> Corrections welcome,
> Matt
leo
Thanks for the clarification.
Matt