I would like to address the problem of how continuations have affected
the register allocator. This problem came to the public eye, when two
tests that the new register allocator failed. Leo's analysis was
essentially that the allocator was fine, but that our handling of
continuations was incomplete, as far control flow and regiester
allocation went.
First, I think that it is important to acknowledge what is actually
agreed upon. Exceptions are handled as a special case of
continuations, and as such, this is a subproblem of the general
continuations problem. Leo's proposed solution, below seems to have
been accepted as workable, correct, and fast enough.
> Quoth l...@toetsch.at:
> 1a) for exceptions:
>
> set_eh handler, catch_label
>
> This is just a small adaption of the sequence of installing an
> exception handler.
> It depends a bit, if exception handlers are inline or nested
> closures or both.
Second, is the larger, *unresolved* problem of continuations in
general. Bear with me, as I am just barely beginning to understand
continuations myself. Comments on my observations are more than
welcome. Especially on my mistakes.
Continuations are a kind of "goto". When you invoke a continuation,
it moves the program counter to where the continuation was taken, and
restore most state (but not registers). This scheme has provided a
wealth of power, with acceptable performance for a variety of
circumstances. One neglected consideration, when implementing this
strategy, is the effect on the control flow graph.
> sub1() <---+ <---------+
> ... | |
> sub2() ----+ <-+ |
> ... | |
> sub3() -----------+-----+
In the continuations enhanced control flow graph (the control flow
graph (cfg), after one considers the effects of continuations), it is
possible for any subroutine to jump back to the exit point of any
previous subroutine. Leo has drawn the picture a few times, as above,
showing the complex web of links from any sub to any sub. It is the
control flow graph which provides the register allocator with the
information it nees to determines which variables interfere. They
interfere if they could be active at the same time. If they
interfere, they must be assigned different registers.
Now, consider a slightly different topic: According to recent
decisions, the lower half of the registers (0-15) for all types, will
not survive a subroutine call. This has created what we can call
volatile, and non-volatile symbols (variables). Non-volatile symbols
do not cross subroutine calls, and can therefore be put in the lower
register half. Volatile symbols cross sub calls, and cannot be put in
the lower half. (Think the "volatile" keyword from C, which means
"don't mess with this value", well it sort of means that.)
Here's the funny thing -- in the new continuations enhanced flowgraph,
every volatile symbol in a subroutine will interfere with every other
volatile symbol. Here's an informal proof. Remember all those arcs
that Leo drew? For instance, if v1 crosses sub1() and v2 crosses
sub2(), there is an arc from sub2() back to sub1() [wolog, sub1 is
first]. So v1 is live when v2 is live, and the two symbols interfere.
All those arcs extend the life range of each symbol to that extent.
Therefore, all volatile symbols in the sub will interfere.
What this means is that if your subroutine has more than 16 volatile
symbols of a given kind, there will be spilling.
Here's a list of proposals I have seen to deal with that issue:
1) Let the above stand (I think I was the only one who said that, and
that was before I had really thought about it).
2) Let the above stand, but allow for a pragma to turn off the arcs.
Thus, I can create a range within the sub, that turns off all those
pesky continuation enhanced arcs within that range. HLL compiler
writers will be able to set this pragma, knowing that none of it's
subs use continuations. (Larry Wall's suggestion.)
3) Specifically a label for each subroutine that *might* set a
continuation. Others have said that continuations should not be the
common case, if a continuation could be called, then labelling it as
RESUMABLE, isn't such a bad idea. (Leo's idea.)
3a) The natural extension of that is to also have label, like
"INVOKEABLE", for subs that might invoke a continuation.
4) "Fetch all from lexicals/globals after a function call." (Another
of Leo's.)
4a) Use lexicals instead of registers. (Not sure who mentioned it, but
it seems like just spilling everything, using a slightly different
spilling scheme.)
4b) Conditionally fetch ... (Ben's suggestion is one that deserves
some further thought. Leo doesn't think it could work, and he
understands the problem better than I, but his answer is not a
difinitive "impossible", so that gives some hope.)
5) Let the symbols interfere. The chances are that the program will
run correctly anyway. Let the register allocator do it's thing, and
every once in a while, parrot will behave incorrectly -- but only if
one is audacious enough to use continuations. So let the bugs fly!!
5a) Write the allocator to minimize the number of bugs generated.
COMMENTS:
* (2,3) If you think about it, suggestions 2 and 3 are essentially
equivalent. They both seem like good ideas. Disadvantages: forcing
compiler writer to have to think about when continuations will be
taken. Advantages: minimize spilling, but retain correctness.
* Suggestion 1 has the advantage of correctness, and not forcing
anyone to think about when continuations are invoked. The
disadvantage will only be evident when there are more than 16 volatile
symbols in subroutine. It's not clear how often that will be, though
it is fairly reasonable to assume that this will be a significant
performance hit in a significant number of cases. Experimentation
might be interesting. I also have some ideas to minimize the impact,
by snipping ranges of these variables, similar to 4.
* Suggestion 4 is really just another kind of spilling [1]. Spilling
that is done by copying old values of otherwise dead variables, so
that if they are resurrected by a continuation, they can be restored.
This might not be too bad a performance hit.
Advantages: correctness and compiler writers don't have to think about
it. Disadvantages: 4b is unknown if possible or not, but is possible,
would be an interesting path to explore. There is likely a
performance hit though, which Larry and others have indicated is not
really acceptable in the general case. This suggestion probably bears
further examination.
* I sincerely hope that no one finds suggestion 5 acceptable. A lack
of correctness is really just not acceptable. Among other really bad
things, this could make parrot vulnerable to security threats, and
unpredictable failure, when continuatinos are used. Btw, suggestion
5a is currently implemented. The old register allocator is so bad,
that there is little interference with different variables. That is
why it passes those tests that the new allocator fails. I have
considered writing a version of the allocator to be even less
efficient. This approaches suggestion 1.
--
Since this issue, is to a great extent, blocking my ability to work on
the register alloctor, I would like to see it resolved. Pending that,
I'd like to at least see agreement on the problem.
Thanks for your perserverence in reading this,
Bill
[1] Spilling is currently done by designating register P31 as a
PerlArray, and letting each spilled symbol occupy a position in the
array. These values are accessed by copying back and forth from/to
temporary registers.
On Mon, 22 Nov 2004 11:49:59 -0800, Bill Coffman <bill.c...@gmail.com> wrote:
> > sub1() <---+ <---------+
> > ... | |
> > sub2() ----+ <-+ |
> > ... | |
> > sub3() -----------+-----+
>
> In the continuations enhanced control flow graph (the control flow
> graph (cfg), after one considers the effects of continuations), it is
> possible for any subroutine to jump back to the exit point of any
> previous subroutine. Leo has drawn the picture a few times, as above,
> showing the complex web of links from any sub to any sub. It is the
> control flow graph which provides the register allocator with the
> information it nees to determines which variables interfere. They
> interfere if they could be active at the same time. If they
> interfere, they must be assigned different registers.
It is also possible for functions to jump forward to the return
continuation of a function called later on (this requires that
function to be called, store it continuation somewhere, and then jump
back to a function before the one in question, but it can happen.)
Matt
--
"Computer Science is merely the post-Turing Decline of Formal Systems Theory."
-???
> ...volatile, and non-volatile symbols (variables). Non-volatile symbols
> do not cross subroutine calls, and can therefore be put in the lower
> register half.
I've used the terms volatiles and non-volatiles in the reversed sense,
i.e. from the POV of an architecture ABI.
- volatile registers are *not* preserved around a function call
- non-volatile registers are preserved around a call: i.e. R15..R31
leo
> It is also possible for functions to jump forward to the return
> continuation of a function called later on
Yes. But I can't see a problem with that. The opcode after an invoke
opcode is already the begin of a new basic block. The forward branch
would just be an additional edge in the CFG with no additional information.
The (for the register allocator invisible) creation of loops is the PITA.
> Matt
leo
Or am I missing something fundamental?
We don't have a problem WRT register preservation, the problem arises
due to register re-using.
> ... Exactly *where* that
> return happens, and whether it happens more than once, is completely
> irrelevant from the point of view of the caller.
The return can only happen, where the normal function call would have
returned, but anyway.
> ... ISTM that the register
> allocator should work on the principle that anything it didn't save
> before it made the call will be toast afterwards.
Yes. But - please remember your example "Fun with nondeterministic searches".
Here's the relevant piece of code from main:
arr1=[1,3,5]
arr2=[1,5,9]
x = choose(arr1)
y = choose(arr2)
$P0 = find_lex "fail"
$P0()
You know, both "choose" calls capture the continuation and backtrack via
"fail" (basically). But the register allocator isn't aware of that. The
control flow graph (CFG) is linear top down, with new basic blocks
starting after each function call. "arr2" is obviously used around a
call and allocated in the preserved (non-volatile) register area. This
works fine.
Now the register allocator assigns a register to "$P0". It finds the
register that "arr2" had usable, because in a linear CFG, there's no way
that "arr2" might be used again. So that register is considered being
available. Now if $P0 happens to get the register that "arr2" had,
backtracking through the call to "fail()" obviously fails, as "arr2" is
now the Closure PMC. And that was exactly the case.
> Or am I missing something fundamental?
I don't know ;) It's basically a problem of the interpretation of the term
"caller saves". The first call to "choose" preserves "arr2", but then -
seen from the CFG - the register allocator has no clue that "arr2" is
needed again.
leo
Ah! [a light goes on over Piers's head].
>
>> Or am I missing something fundamental?
>
> I don't know ;)
I was. Hmm... bugger. So, unless we make the register allocator solve
the halting problem, the rule becomes "If you're playing silly beggars
with continuations and you're expecting to get at something in a
'surprising' way, stuff it in a lexical or we guarantee that you will be
anally violated by an enraged waterbuffalo that's just sick to death of
non-determinism"?
>> We don't have a problem WRT register preservation, the problem arises
>> due to register re-using.
> Ah! [a light goes on over Piers's head].
>>> Or am I missing something fundamental?
>> I don't know ;)
> I was. Hmm... bugger. So, unless we make the register allocator solve
> the halting problem, the rule becomes "If you're playing silly beggars
> with continuations and you're expecting to get at something in a
> 'surprising' way, stuff it in a lexical or we guarantee that you will be
> anally violated by an enraged waterbuffalo that's just sick to death of
> non-determinism"?
This would make quite a fine explanation in the docs, except that's a
bit unclear about "stuff *it*". The waterbuffalo is concerned of
preserved *temporary* variables too.
leo
I just thought of a heuristic that might help with register
preservation:
A variable/register should be preserved over a function call if either of the
following is true:
1. The variable is referred to again (lexically) after the function has
returned.
2. The variable is used as the argument of a function call within the
current compilation unit.
Condition 2 is something of a bugger if you have big compilation units,
but register allocation is always going to be a pain when there are big
compilation units around.
> I just thought of a heuristic that might help with register
> preservation:
>
> A variable/register should be preserved over a function call if either
> of the
> following is true:
>
> 1. The variable is referred to again (lexically) after the function has
> returned.
> 2. The variable is used as the argument of a function call within the
> current compilation unit.
That doesn't solve it, though you'd think it would. Here's the
counter-example:
x = 1
foo()
print x
y = 2
return y
You'd think that x and y could use the same memory location (register,
variable--whatever), since ostensibly their lifetimes don't overlap.
But continuation re-invocation can cause foo() to return multiple
times, and each time it should print "1", but it won't if x and y use
the same "slot" (it would print "2" each time after the first). In
truth, their lifetimes do overlap, due to the hidden (potential) loops
created by continuations.
The problem isn't preservation across calls per se, it's the implicit
loops. Continuations are basically gumming up tons of potential
optimizations.
JEff
Yes, I know it's not technically correct, but it is consistent, useful,
and doesn't kill our optimizations. If the compilers know about these
semantics, then there is no loss of functionality or interoperability.
Functions that aren't prepared to deal with continuations won't be able
to anyway (and in fact, these semantics may work better in those cases).
Luke
I expect I will in a bit -- work's got me backed up and I've got
too-big a backlog of perl6-internals mail to read. (This isn't a good
time to slip things in past me, though :) I'll try and dig through
this thread later today.
--
Dan
--------------------------------------it's like this-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk
On Mon, 29 Nov 2004 14:51:43 +0100, Leopold Toetsch <l...@toetsch.at> wrote:
> Luke Palmer <lu...@luqui.org> wrote:
> > It seems to me that there is no good solution to this problem without
> > annotating the register set or killing the register allocator.
>
> I think I've proposed a reasonable solution: putting lexicals in
> registers.
I would appreciate it if Dan (who I cc'ed this to directly) would
weigh in on this thread. I think that we have present most of the
options as clearly as we can, and a decision on how to move forward
would be appreciated.
Matt
I think I've proposed a reasonable solution: putting lexicals in
registers.
> Luke
leo
I haven't been able to devote much time lately, but this week I hope
to modify the register allocator to deal with these arcs, as well as
enhancements to the spill code. I expect that in some cases it will
have a definite impact on speed, and others it won't. Most of the test
cases (make test) don't have that many variables, for instance. I
hope to get this patch out in about a week.
There has been much discussion about how these arcs will impact the
performance, complexity, etc. What we need are some hard numbers, so
as to compare results.
Perhaps Leo can impliment his architecture, and compare to this.
Really, I think that Leo's idea is a form of spilling in itself, but
without the extra load/store instructions. That kind of spilling is
much better. HLL compilers can optimize this further, by adding
annotations to subs, saying no full continuations called by any
subroutines (actually, only one subroutine calling a full continuation
would not impact the register allocator). Analysis could also be
helpful in certain cases.
Bill
Except... we've already declared that return continuations are
special, and preserve the registers in the 16-31 range. So when we
return from foo, regardless of how or how many times, the pointer to
x's PMC will be in a register if it was in there before the call to
foo, if it's in the preserved range. So in this case there's no
problem. Things'll look like:
x = 1 # new P16, .Integer; P16 = 1 # P16 has pointer value 0x04
foo() # foo invocation
print x # P16 still has pointer value 0x04
y = 2 # new P16, .Integer; P16 = 2 # P16 now has pointer value 0x08
return y # Passes back 0x08
With more or less clarity.
I'm reading these threads a bit out of order so I think I'm missing
some context here, but lexicals in registers works out fine as long
as the backing pad is kept up to date. (And since PMCs are all
reference types that's pretty easy) Lexicals *exclusively* in
registers won't work -- the languages we're doing require pads.
If the big issue is spilling properly, lexicals could be removed
entirely from the spill picture with proper PIR notation and some
fleshing out of how we're handling pads. For example it'd be fine to
do something like:
.sub foo
.lexicals x, y, z
%1 = new Integer
%2 = new String
%3 = new Hash
%1 = 12
%2 = "foo"
%3[%2] = %1
.end
%nnn could work the same as $Pnnn notation but refer to slots in the
current pad. Since these are always PMCs, and always have a
functioning backing store, they can be dropped as need be and
reloaded.
Originally they were going to be tied in heavily to the full array
notation so lexicals wouldn't be required to take up registers --
code like:
a = b + c
could've been expressed by a single op:
add lexbase[1], lexbase[2], lexbase[3]
and take a lot of pressure of the register allocator, but...
On Tue, 30 Nov 2004 08:28:35 -0500, Dan Sugalski <d...@sidhe.org> wrote:
> At 1:45 AM -0800 11/29/04, Jeff Clites wrote:
>
>
> Except... we've already declared that return continuations are
> special, and preserve the registers in the 16-31 range. So when we
> return from foo, regardless of how or how many times, the pointer to
> x's PMC will be in a register if it was in there before the call to
> foo, if it's in the preserved range. So in this case there's no
> problem. Things'll look like:
>
> x = 1 # new P16, .Integer; P16 = 1 # P16 has pointer value 0x04
> foo() # foo invocation
> print x # P16 still has pointer value 0x04
> y = 2 # new P16, .Integer; P16 = 2 # P16 now has pointer value 0x08
> return y # Passes back 0x08
>
> With more or less clarity.
I think that the concern is for the circumstance where foo() promotes
it return continuation to a full continuation. Then, that guarantee
is no longer provided (I think), and repeated invocation could leave y
in P16 rather than x.
Nope. The guarantee's still there. Promotion will force things up the
call chain to get marked as un-recyclable, but registers still get
restored on invocation.
On Tue, 30 Nov 2004 09:49:54 -0500, Dan Sugalski <d...@sidhe.org> wrote:
> At 9:36 AM -0500 11/30/04, Matt Fowles wrote:
>
>
> Nope. The guarantee's still there. Promotion will force things up the
> call chain to get marked as un-recyclable, but registers still get
> restored on invocation.
In that case, I am confused. When does the guarantee NOT apply?
Thanks,
When a general-purpose continuation's taken. Return continuations are
special case, designed to make sub calls and returns work as people
expect in the face of the underlying CPS mechanism we have. (They
are, if you like, a cheat) Exception continuations are also
special-case continuations. As are sub and method PMCs when you get
right down to it.
A general purpose continuation has two things attached to it. The
first is a destination address -- where in the code to go when the
continuation is invoked. The second thing attached is an environment
-- the lexical pad pointer, namespace pointer, opcode library
pointer, security information, top stack frame pointers, and so on.
Basically everything in the interpreter structure outside of the
internal bookkeeping and interpreter global information. (The context
structure's misnamed. It's not the context that continuations care
about)
When a continuation (or, really, anything) is invoked, parts of the
interpreter struct are overwritten with information from the
continuation, parts are left alone, and parts are effectively nulled
out. Which bits do what depends entirely on the
continuation/invokable, as each type saves/ignores/nulls different
things.
Oh. No, it won't. We've declared that return continuations will
always leave the top half registers in the state they were when the
return continuation was taken. In this case, when it's taken to pass
into foo, P16 is 0x04. When that return continuation is invoked, no
matter where or how many times, P16 will be set to 0x04. This does
make return continuations slightly different from 'plain'
continuations, but I think this is acceptable.
>None of this should have anything to do with return continuations
>specifically, since this is the case where the body of foo (or
>something called from it) creates a "real" continuation, which as I
>understand it is supposed to promote the return continuations to
>"real" continuations all the way up the stack.
The return continuations have to maintain their returny-ness
regardless, otherwise they can't be trusted and we'd need to
unconditionally reload the registers after the return from foo(),
since there's no way to tell whether we were invoked via a normal
return continuation chain invocation, or whether something funky
happened down deep in the call chain.
On Tue, 30 Nov 2004 10:15:00 -0800, Jeff Clites <jcl...@mac.com> wrote:
>
>
>
> On Nov 30, 2004, at 5:28 AM, Dan Sugalski wrote:
>
> > At 1:45 AM -0800 11/29/04, Jeff Clites wrote:
> > Except... we've already declared that return continuations are
> > special, and preserve the registers in the 16-31 range. So when we
> > return from foo, regardless of how or how many times, the pointer to
> > x's PMC will be in a register if it was in there before the call to
> > foo, if it's in the preserved range. So in this case there's no
> > problem. Things'll look like:
> >
> > x = 1 # new P16, .Integer; P16 = 1 # P16 has pointer value 0x04
> > foo() # foo invocation
> > print x # P16 still has pointer value 0x04
> > y = 2 # new P16, .Integer; P16 = 2 # P16 now has pointer value
> > 0x08
> > return y # Passes back 0x08
> >
> > With more or less clarity.
>
> But the problem isn't preservation per se. When a continuation
> (originally captured inside of foo) is invoked, the frame will be
> restored with the register contents it had when it last executed, so
> P16 in your annotations will have pointer value 0x08 after the "first"
> time that continuation is invoked (because "y = 2" will have executed
> and changed the register contents). That will result in "print x"
> printing the value "2", which is wrong from the perspective of the code
> (that line should always print "1"). To do-the-right-thing, the
> register allocator either has to use a separate register to hold y, or
> needs to do some other preservation dance (instead of relying on
> preserved registers). And again, I think the reason for this is that
> the lifetimes of x and y overlap, so you can't just use the same
> register to store them. The only surprising part of all of this is that
> their lifetimes in fact overlap--they only overlap due to
> continuations, and wouldn't otherwise. It's the implicit branch from
> the "return" to the op-after-the-call-to-foo that's causing them to
> overlap.
We can handle this fairly easily by having the promotion swap in a new
invoke vtable for the return continuation. The new invoke vtable will
put into place a copy of its context (rather than the context itself).
Thus promoted return continuations can be invoked repeatedly (at the
cost of a memory copy), and non-promoted ones can be invoked once and
then immediately added to the continuation free list (possibly by
their invoke function).
But the problem isn't preservation per se. When a continuation
(originally captured inside of foo) is invoked, the frame will be
restored with the register contents it had when it last executed, so
P16 in your annotations will have pointer value 0x08 after the "first"
time that continuation is invoked (because "y = 2" will have executed
and changed the register contents). That will result in "print x"
printing the value "2", which is wrong from the perspective of the code
(that line should always print "1"). To do-the-right-thing, the
register allocator either has to use a separate register to hold y, or
needs to do some other preservation dance (instead of relying on
preserved registers). And again, I think the reason for this is that
the lifetimes of x and y overlap, so you can't just use the same
register to store them. The only surprising part of all of this is that
their lifetimes in fact overlap--they only overlap due to
continuations, and wouldn't otherwise. It's the implicit branch from
the "return" to the op-after-the-call-to-foo that's causing them to
overlap.
None of this should have anything to do with return continuations
specifically, since this is the case where the body of foo (or
something called from it) creates a "real" continuation, which as I
understand it is supposed to promote the return continuations to "real"
continuations all the way up the stack.
JEff
On Tue, 30 Nov 2004 11:20:50 -0800, Jeff Clites <jcl...@mac.com> wrote:
> On Nov 30, 2004, at 10:27 AM, Dan Sugalski wrote:
>
>
>
> > At 10:15 AM -0800 11/30/04, Jeff Clites wrote:
>
> > Oh. No, it won't. We've declared that return continuations will always
> > leave the top half registers in the state they were when the return
> > continuation was taken. In this case, when it's taken to pass into
> > foo, P16 is 0x04. When that return continuation is invoked, no matter
> > where or how many times, P16 will be set to 0x04. This does make
> > return continuations slightly different from 'plain' continuations,
> > but I think this is acceptable.
>
> Ah, I see.
>
>
>
> >> None of this should have anything to do with return continuations
> >> specifically, since this is the case where the body of foo (or
> >> something called from it) creates a "real" continuation, which as I
> >> understand it is supposed to promote the return continuations to
> >> "real" continuations all the way up the stack.
> >
> > The return continuations have to maintain their returny-ness
> > regardless, otherwise they can't be trusted and we'd need to
> > unconditionally reload the registers after the return from foo(),
> > since there's no way to tell whether we were invoked via a normal
> > return continuation chain invocation, or whether something funky
> > happened down deep in the call chain.
>
> Yeah, so I think that won't work correctly. Here's an example from Ruby
> which I posted in a previous thread. If the return from the call to
> strange() by outer() always restores the registers as of the point the
> (return) continuation was created, then the below would print out "a =
> 1" over and over, but really it's intended that the value should
> increase, so with the behavior you describe, the following Ruby code
> wouldn't work right:
>
> % cat continuation6.ruby
> def strange
> callcc {|continuation| $saved = continuation}
> end
>
> def outer
> a = 0
> strange()
> a = a + 1
> print "a = ", a, "\n"
> end
>
> # these two lines are "main"
> outer()
> $saved.call
>
> % ruby continuation6.ruby
> a = 1
> a = 2
> a = 3
> a = 4
> a = 5
> a = 6
> a = 7
> a = 8
> a = 9
> a = 10
> ...infinite loop, by design
a would need to be stored in a RubyInt PMC. In such a situation it
would work as the register for a is actually a pointer to its value.
An optimizer might try and lower a to an I register, but that would be
an invalid optimization.
Ah, you see, that's where the Cunning Plan comes in. :)
> Here's an example from Ruby which I posted in a previous thread. If
>the return from the call to strange() by outer() always restores the
>registers as of the point the (return) continuation was created,
>then the below would print out "a = 1" over and over, but really
>it's intended that the value should increase, so with the behavior
>you describe, the following Ruby code wouldn't work right:
But it will, you see. All that happens is that the PMC register that
holds a reference to a continues to hold a reference to a. The
*variable* that the register refers to is constant. The value,
though, isn't, and will change over time as it ought to. This isn't
any different than the way we're handling globals and lexicals --
something like, assuming a_variable has an initial value of 0:
foo = global "a_variable"
foo = foo + 1
bar = global "a_variable"
print bar
print foo
this'd print 1 1.
In this example:
>% cat continuation6.ruby
>def strange
> callcc {|continuation| $saved = continuation}
>end
>
>def outer
> a = 0
> strange()
> a = a + 1
> print "a = ", a, "\n"
>end
Through the joys of reference types, a will continue to increase
forevermore, assuming the compiler hasn't incorrectly put a in an int
register. (Which'd be wrong) Remember the PMC and string registers
hold pointers to pmc/string structures, which is all we're preserving
-- the *pointer* to the structure, not the contents of the structure.
(Though that's an interesting thing to do for other reasons, like,
say, transactions, we're not going to be doing that) The contents can
change over and over without the register itself ever changing.
># these two lines are "main"
>outer()
>$saved.call
>
>% ruby continuation6.ruby
>a = 1
>a = 2
>a = 3
>a = 4
>a = 5
>a = 6
>a = 7
>a = 8
>a = 9
>a = 10
>...infinite loop, by design
>
>JEff
> At 10:15 AM -0800 11/30/04, Jeff Clites wrote:
> Oh. No, it won't. We've declared that return continuations will always
> leave the top half registers in the state they were when the return
> continuation was taken. In this case, when it's taken to pass into
> foo, P16 is 0x04. When that return continuation is invoked, no matter
> where or how many times, P16 will be set to 0x04. This does make
> return continuations slightly different from 'plain' continuations,
> but I think this is acceptable.
Ah, I see.
>> None of this should have anything to do with return continuations
>> specifically, since this is the case where the body of foo (or
>> something called from it) creates a "real" continuation, which as I
>> understand it is supposed to promote the return continuations to
>> "real" continuations all the way up the stack.
>
> The return continuations have to maintain their returny-ness
> regardless, otherwise they can't be trusted and we'd need to
> unconditionally reload the registers after the return from foo(),
> since there's no way to tell whether we were invoked via a normal
> return continuation chain invocation, or whether something funky
> happened down deep in the call chain.
Yeah, so I think that won't work correctly. Here's an example from Ruby
which I posted in a previous thread. If the return from the call to
strange() by outer() always restores the registers as of the point the
(return) continuation was created, then the below would print out "a =
1" over and over, but really it's intended that the value should
increase, so with the behavior you describe, the following Ruby code
wouldn't work right:
% cat continuation6.ruby
def strange
callcc {|continuation| $saved = continuation}
end
def outer
a = 0
strange()
a = a + 1
print "a = ", a, "\n"
end
# these two lines are "main"
I can see that there is true magic in the power of using references in
this way. Nonetheless, how can the compiler figure out that it can't
use an integer here? The compiler should use integers when it can,
but it sounds like you are saying that when a variable crosses a sub
call (which might invoke a continuation) it will then have to be a PMC
or String to do the right thing.
> Remember the PMC and string registers
> hold pointers to pmc/string structures, which is all we're preserving
> -- the *pointer* to the structure, not the contents of the structure.
> (Though that's an interesting thing to do for other reasons, like,
> say, transactions, we're not going to be doing that) The contents can
> change over and over without the register itself ever changing.
>
>
>
> ># these two lines are "main"
> >outer()
> >$saved.call
> >
Bill
> >JEff
>
> --
> Dan
> In this example:
>
>> % cat continuation6.ruby
>> def strange
>> callcc {|continuation| $saved = continuation}
>> end
>>
>> def outer
>> a = 0
>> strange()
>> a = a + 1
>> print "a = ", a, "\n"
>> end
>
> Through the joys of reference types, a will continue to increase
> forevermore, assuming the compiler hasn't incorrectly put a in an int
> register. (Which'd be wrong)
Separate question, but then what would happen for languages which _do_
use primitive types? (Presumably, Perl6 would do that in the "my int"
case.) If proper behavior requires basically never using the primitive
I/N types, that seems like a waste.
> Remember the PMC and string registers hold pointers to pmc/string
> structures, which is all we're preserving -- the *pointer* to the
> structure, not the contents of the structure.
Sure.
> The contents can change over and over without the register itself ever
> changing.
But in this Ruby case, "a = a + 1" actually creates a new Fixnum
instance, so "a" ends up holding a different instance each time--you
can verify that by printing out a.id in the print statement. (And if
you don't like that, you can get the effect I'm talking about with an a
= MyCustomClass.new(a.int_value() + 1) sort of thing.)
So it would seem that what you are saying only works if the register
isn't holding the Fixnum instance, but instead is holding some
RubyReference instance which references the Fixnum instance. But that
seems like a needless extra object, in the lexical variable case. (I
could see having the intervening object in the global case, to avoid
repeated hash lookups.)
JEff
Yep. And that's something that the compiler will just have to know (or
give IMCC enough information to do something about it).
And, IIRC, an S register wouldn't work either, since strings are value
types at the bytecode level.
Luke
On Tue, 30 Nov 2004 22:12:30 -0800, Bill Coffman <bill.c...@gmail.com> wrote:
> On Tue, 30 Nov 2004 14:45:39 -0500, Dan Sugalski <d...@sidhe.org> wrote:
>
>
> > At 11:20 AM -0800 11/30/04, Jeff Clites wrote:
> > >% cat continuation6.ruby
> > >def strange
> > > callcc {|continuation| $saved = continuation}
> > >end
> > >
> > >def outer
> > > a = 0
> > > strange()
> > > a = a + 1
> > > print "a = ", a, "\n"
> > >end
> >
> > Through the joys of reference types, a will continue to increase
> > forevermore, assuming the compiler hasn't incorrectly put a in an int
> > register. (Which'd be wrong)
>
> I can see that there is true magic in the power of using references in
> this way. Nonetheless, how can the compiler figure out that it can't
> use an integer here? The compiler should use integers when it can,
> but it sounds like you are saying that when a variable crosses a sub
> call (which might invoke a continuation) it will then have to be a PMC
> or String to do the right thing.
This has come up before. It turns out that the compiler will likely
be able to make very few optimizations of this sort on its own.
However, user code rarely needs this kind of optimization. Library
code is what should really get it, and the library author can provide
hint/directives to the compiler for cases like this. If you took a
continuation inside a callback from a library routine and then invoked
it, I don't think that you would have any reasonable guess what the
library routine would do, so either behavior would seem reasonable.
Two potential options. One, they have a backing PMC for the lexicals
(which they'll probably need anyway) and flush/restore at spots where
things could get lost. Two, they don't actually get a low-level type
but the compiler cheats and acts as if it was in spots where it's
safe. (I admit, I'd always planned on having "my int $foo" use a PMC.
The win would be for "my int @foo" which would also get a PMC, but
one that had optimized backing store for the values)
>>The contents can change over and over without the register itself
>>ever changing.
>
>But in this Ruby case, "a = a + 1" actually creates a new Fixnum
>instance, so "a" ends up holding a different instance each time--you
>can verify that by printing out a.id in the print statement.
This is where the magic of objects comes in, for particularly loose
values of "magic". Ruby uses a double-reference system for objects
the same way that perl 5 does -- that is, the underlying structure
for a doesn't hold the object, rather it holds a pointer to another
structure that holds the object. So it looks like:
(name a) -> (a struct, addr 0x04) -> (object struct addr 0x08)
and after the store it looks like:
(name a) -> (a struct, addr 0x04) -> (object struct addr 0x0C)
The PMC register would hold the 0x04 struct pointer, basically a
Reference PMC, and assignments to it just switch the thing it refers
to.
Essentially things like I and N registers are value types, PMC and
strings are (for us) reference types, and many objects (including
ruby and perl's objects) are double-reference types.
Which, yeah, means we've been ignoring some important object things.
Time to deal with that, too.
Generally it can't. Unfortunately our target languages are painfully
difficult (and in the general case, nearly impossible) to optimize. :(
But so it sounds like I and N registers are a bit of a waste then, no?
They won't really be used. Even in your "my int @foo" case, you could
have an optimized backing store for the array, but you'd have to
on-the-fly create a PMC to hold any values you pulled out of the array,
e.g. for a subsequent "x = @foo[1]" (in Perl6 syntax)--x would have to
hold a full PMC there, it seems, so nothing there would end up in an I
register (except as in intermediate in immediately creating the PMC).
Seems like instead we'd be able to simply use an I register, and be
done with it. The returny-ness of a return continuation seems bogus, if
it's supposed to survive promotion to a real continuation. All we need
is some sort of per-frame storage (like lexical pads provide, but for
all register types), and yes explicit moving of things in-and-out of
that storage if we want to re-use registers (and only necessary across
function call if we do plan on re-using a register). That's what
happens in hardware CPUs.
But even with all of what you say below, I still think there's an issue
with overlapping variable lifetimes. Consider my original pseudocode
example again, with one added print statement:
x = 1
foo()
print x
print y
y = 2
return y
The "first time" foo() returns, the "print y" should find y holding
undef (or something like that), and subsequent times it should find it
holding 2. (That's certainly what the analogous goto would do.) That
means that even with an intermediate Reference struct, you still can't
use the same register to hold the Reference for x and the Reference for
y, because their lifetimes overlap, but only due to the presence of
continuations--without them, you could re-use the register. Re-entering
the frame needs to leave all of the local variables intact, and that's
stretching their lifetimes out to extend across most of the frame. Or
so it would seem.
>>> The contents can change over and over without the register itself
>>> ever changing.
>>
>> But in this Ruby case, "a = a + 1" actually creates a new Fixnum
>> instance, so "a" ends up holding a different instance each time--you
>> can verify that by printing out a.id in the print statement.
>
> This is where the magic of objects comes in, for particularly loose
> values of "magic". Ruby uses a double-reference system for objects the
> same way that perl 5 does -- that is, the underlying structure for a
> doesn't hold the object, rather it holds a pointer to another
> structure that holds the object. So it looks like:
>
> (name a) -> (a struct, addr 0x04) -> (object struct addr 0x08)
>
> and after the store it looks like:
>
> (name a) -> (a struct, addr 0x04) -> (object struct addr 0x0C)
>
> The PMC register would hold the 0x04 struct pointer, basically a
> Reference PMC, and assignments to it just switch the thing it refers
> to.
But what's the point of that--seems like a waste of an intermediate
struct. And I assume the intermediate struct would be a PMC, no?
> Essentially things like I and N registers are value types, PMC and
> strings are (for us) reference types, and many objects (including ruby
> and perl's objects) are double-reference types.
In Ruby "everything is an object", so we'd have double references even
for numbers (number-ish PMCs). That loops back to the point of my
question in the thread "Basic compilation example (a + b)?". If "a + b"
compiles as "add P16, P18, P17", and registers actually hold some
intermediate Reference type, then the MMD of "add" would be
type-dispatching on the reference type, not the type of the objects
inside (as the MMD is currently written). Since Ruby (and Python) have
type-less variables, that implies to me a single Reference type at
least for them, so our MMD would always call the same function for
"add", and not do any useful dispatching. Or should our MMD instead
"look inside" the Reference held in the register?
JEff
> But so it sounds like I and N registers are a bit of a waste then, no?
A bit? They would be utterly useless.
As a plain:
add Px, Py, Pz
(as well as almost any other PMC related opcode) can be overloaded by
running a PASM/PIR function, there is no usage for other registers then
P at all. (S registers have value sematics, they are of course
invalidated too)
Perl6 natural types would be unimplementable - go figure ...
That'll cause the biggest architecture change ever. But we don't do such
changes now - says Dan - so I'm not worried ;)
And Dan's "proposal" does still not solve the reuse of temporary
registers.
> JEff
leo
As with most technical problems, fully specifying them, is often half
the battle. In this case, I think we're getting close to
understanding the issues at least.
[please treat all statements as possible questions]
First, consider my original post in this thread:
http://www.nntp.perl.org/group/perl.perl6.internals/27383
To summarize:
Full continuations are powerful but expensive. They are like hidden
goto's and add arcs to the control flow graph. This causes more
registers to interfere.
Proposed Solutions:
- Accept the arcs and probable spilling. We still have the regions
between sub calls.
- Put labels on certain subs that denote they might call or be called
by a continuation.
- Pragmas indicating ranges where continuations won't be called (or will be).
- Restore registers from lexicals somewhere, after each function call.
- Accept incorrectness, as the current implementation does (discovered
when new allocator failed 2 tests, and Leo saw the problem).
Next, consider Dan's message, "Lexicals, continuations, and register
allocation":
On Tue, 30 Nov 2004 10:22:29 -0500, Dan Sugalski <d...@sidhe.org> wrote:
> So far as I can tell there are two cases here:
>
> 1) A return continuation
>
> 2) A regular continuation
>
[snip return contuations, which are far from solved, but likely a
subset of the below]
>
> The second case, where code takes an arbitrary continuation that
> returns to location X, wherever X is. I'm missing the problem there,
> too. Assuming there's a way to note the destination to the register
> allocator (with those points being places where all registers must be
> assumed to be trash) I'm not seeing the problem either. There are
> only two cases here, where the destination is marked and where it
> isn't. If it's marked, the register allocator assumes things are
> dirty and loads everything, which is fine. If it's unmarked, the code
> has essentially shot itself and everything quietly goes to hell since
> you've lied to the register allocator and you shouldn't do that.
> Which is fine too. Don't Do That.
If I read the above correctly, Dan is advocating the strongest
restriction of all, which is to specify any arcs that might be added.
That is, specify all sub_i -> sub_j, where "->" means that a
continuation saved in sub_j is invoked by sub_i.
To reclassify the reasonable solutions:
1a. Concentrate on restoring registers after calling -- probably using
the lexicals.
1b. Possibly increase the size of the register set, and/or try to use
lexicals or globals (I think of this as a kind of improved spilling)..
2a. Insert all the arcs into the CFG, which will increase spilling, but
2b. Reduce the number of arcs in the CFG, by introducing various
annotations on subs indicating when they do or do not save or invoke a
continuation. Especially target library functions.
Currently, I'm trying to work on 2a right now, but my day job is
making that tough lately. Would like to see more attention payed to
2b, by those who care about the issue. In particular I'd like to see
HLL designers say, "Yeah, 2b looks like a great idea!", and maybe a
few, "yeah, let's specifically implement the following..."
Then, there are additional problems to contend with...
On Wed, 1 Dec 2004 09:49:34 -0800, Jeff Clites <jcl...@mac.com> wrote:
> But so it sounds like I and N registers are a bit of a waste then, no?
> They won't really be used. Even in your "my int @foo" case, you could
> have an optimized backing store for the array, but you'd have to
> on-the-fly create a PMC to hold any values you pulled out of the array,
> e.g. for a subsequent "x = @foo[1]" (in Perl6 syntax)--x would have to
> hold a full PMC there, it seems, so nothing there would end up in an I
> register (except as in intermediate in immediately creating the PMC).
[snippage of examples]
If I understand right, PerlArray's would be used for "my int @foo".
If you want to use these values, you have to copy back and forth from
I* registers, for the most part.
The ISN registers can be used in the nether regions, between sub calls
(even the lower half of the registers can be used there). Subs which
are garanteed not to call or save continuations are safe (see 2b
above), and ISN registers can be used while crossing those.
However, we now know that only P registers can cross unsafe subs,
unless ... well, there's probably certain cases where ISN's can be
used. That question would take more thought.
My sense is that we want to support continuations, but we don't want
to be crushed under the heavy load. Perhaps we can adapt the policy
that if one is brazen enough to use full continuations, then s/he
should be expected to at least inform the compiler about it.
Bill
> First, consider my original post in this thread:
> http://www.nntp.perl.org/group/perl.perl6.internals/27383
> To summarize:
> Full continuations are powerful but expensive. They are like hidden
> goto's and add arcs to the control flow graph. This causes more
> registers to interfere.
> Proposed Solutions:
...
> - Put labels on certain subs that denote they might call or be called
> by a continuation.
> - Pragmas indicating ranges where continuations won't be called (or
> will be).
...
> Next, consider Dan's message, "Lexicals, continuations, and register
> allocation":
> On Tue, 30 Nov 2004 10:22:29 -0500, Dan Sugalski <d...@sidhe.org> wrote:
>>
>> The second case, where code takes an arbitrary continuation that
>> returns to location X, wherever X is. I'm missing the problem there,
>> too. Assuming there's a way to note the destination to the register
>> allocator (with those points being places where all registers must be
>> assumed to be trash) I'm not seeing the problem either. There are
>> only two cases here, where the destination is marked and where it
>> isn't. If it's marked, the register allocator assumes things are
>> dirty and loads everything, which is fine. If it's unmarked, the code
>> has essentially shot itself and everything quietly goes to hell since
>> you've lied to the register allocator and you shouldn't do that.
>> Which is fine too. Don't Do That.
>
> If I read the above correctly, Dan is advocating the strongest
> restriction of all, which is to specify any arcs that might be added.
> That is, specify all sub_i -> sub_j, where "->" means that a
> continuation saved in sub_j is invoked by sub_i.
...
> To reclassify the reasonable solutions:
...
> 2a. Insert all the arcs into the CFG, which will increase spilling, but
...
> Would like to see more attention payed to
> 2b, by those who care about the issue. In particular I'd like to see
> HLL designers say, "Yeah, 2b looks like a great idea!", and maybe a
> few, "yeah, let's specifically implement the following..."
...
> My sense is that we want to support continuations, but we don't want
> to be crushed under the heavy load. Perhaps we can adapt the policy
> that if one is brazen enough to use full continuations, then s/he
> should be expected to at least inform the compiler about it.
Above I've left just the parts relevant to "explicitly label somehow
function calls which might capture full continuations". I've not been
able to tell if the intention is that the user would have some special
syntax to use in the HLL, or if the HLL compiler is expected to put in
the indications at compile-time. The problem with the former approach
is that existing languages which use continuations (Scheme, Ruby) don't
have this--we can't change the languages in such a fundamental way. The
problem with the latter approach is that the HLL itself doesn't
know--it would need to label any function call which could call
anything which could call something which would capture a real
continuation. In practice that would end up meaning that the HLL
compiler would have to mark all function call sites this way, so you'd
still end up with the maximal set of arcs--just with the HLL compiler
doing the work, rather than the register allocator. It doesn't really
help the issue, just moves the work around.
I think Leo originally brought up the "labeling" idea, and seemed to
think that the HLL always had enough info to label all of the call
sites which could be affected by continuations. Here's my Ruby example
again, and an explanation of why I think the HLL compiler can't do the
necessary labeling:
def strange
callcc {|continuation| $saved = continuation}
end
def outer
a = 0
strange()
a = a + 1
print "a = ", a, "\n"
end
# these two lines are "main"
outer()
$saved.call
This program prints out an ever-increasing sequence of numbers.
Now, "outer" could have been defined in its own compilation unit. The
HLL compiler has no way of knowing that the call to strange() might
capture a continuation--there's no syntactic indication inside of the
definition of "outer", or at the site of the call to "outer" from the
main program flow. The most an HLL compiler for Ruby could do would be
to assume that all function calls might capture a continuation, and
that puts us back in the same boat we're already in--we get all of the
arcs.
I think we have basically 3 choices: support continuations such that
they work correctly in all HLL situations and accept likely poor
performance, or support only escape continuations, or devise a strategy
whereby Parrot itself doesn't provide continuations, but allows them to
exist at the HLL level somehow. I don't know of a way the third
approach could work (though maybe someone has a clever idea), and the
first two both carry significant down sides.
JEff
This sounds like a much better solution than the one I've just proposed.
No saving/restoring, the register frame size is known at compile time
(so the double-indirection can be reduced to single), JIT still works.
So, scrap my proposal. Seriously consider his. My comments about "this
is a real problem and needs a real solution" still hold.
Luke
Luke Palmer writes:
> I looked through google groups and couldn't find leo's solution
> ("putting lexicals in registers", and I can't figure out what that
> means; couldn't you have a loop counter that isn't a lexical?).
>
> I think we all agree that this is a major problem, and that to ignore it
> would be a fatal mistake in Parrot's support of continuations. And
> parrot not supporting continuations means parrot not supporting
> coroutines. And that means parrot not supporting Perl 6 :-/
>
> We would like a solution that allows optimization, keeps JIT around, and
> behaves correctly. The main problem with "Dan's solution" (which I also
> couldn't find, but I'm inferring by talk about it) is that it restores
> things values when they may have already changed.
>
> So--here's the naive proposal. Naive because I'm not very good at
> foreseeing performance issues, etc. Keep in mind that this is a
> major architecture change (which we're not doing, right?), but can be
> made opaque with proper support from IMCC.
>
> No more value semantics.
>
> That's it. All values on the bytecode level are pointers. I's are
> pointers to ints (that's still a win since you don't need to keep going
> through a vtable), etc. P's are just as they used to be.
>
> Then to use an int, you have to allocate it first. Perhaps this can be
> made efficient with a specialized small-object pool attached onto the
> register frame for cache locality (which I don't really understand).
>
> Anyway, I'm expecting you smart guys to shoot this down. But
> something's got to be done about this problem. Proposals have to
> happen.
>
> Luke
>
> I think Leo originally brought up the "labeling" idea,
Yes, one of the first ideas I had. I've tossed it later for similar
reasons you are showing here with the ruby example. It just doesn't
work. The second was refetching from lexicals. I tossed it later because
not every register that is preserved around a call is a lexical (or
global).
I think we have discussed these two approaches long enough to be sure
that they don't work.
> I think we have basically 3 choices: support continuations such that
> they work correctly in all HLL situations and accept likely poor
> performance, or support only escape continuations, or devise a strategy
> whereby Parrot itself doesn't provide continuations, but allows them to
> exist at the HLL level somehow.
These choices are not really desirable, IMHO. Number 3 doesn't work for
technical reasons anyway.
Frankly I think, I've presented a way to make continuations work
correctly. I did *not* hear any technical or otherwise reasonable
argument that it wouldn't work or that it's untenable, nothing, nada.
A rule #1 overriding is neither technical nor reasonable in a case,
where correctness is the problem.
> JEff
leo
>
> I think we have basically 3 choices: support continuations such that
> they work correctly in all HLL situations and accept likely poor
> performance, or support only escape continuations, or devise a
> strategy whereby Parrot itself doesn't provide continuations, but
> allows them to exist at the HLL level somehow. I don't know of a way
> the third approach could work (though maybe someone has a clever
> idea), and the first two both carry significant down sides.
Actually these are only two approaches (as 'provide only escape
continuations and let compilers fend for themselves' is same as 3rd
approach).
Yes, it's possible to make the continuations efficient - Schemes have
been doing this for a long time (note that possible != easy). In
particular, Leo's variable-sized frames look like a workable solution to
me and aren't intrusive at PIR level, are moderately intrusive at the VM
implementation level, they do heavily impact things on assembly/bytecode
level... and ought to be quite speedy. Modulo unforeseen problems *evil
cackle*
Biting the bullet and just spilling a lot is also acceptable, or at
least it looks like that for me. You get a constant factor penalty and
it's not a huge constant factor, esp. if you have a bulk spill/reload
opcode. Read Bill's posts about hammering the allocator into doing this
- I wouldn't put the words in his mouth but from his posts this seems
completely feasible. The architectural impact is low across the board
(if Bill is right about being able to do this easily in the register
allocator), and while slower (compared to variable frames), this doesn't
seem too bad a solution.
A scheme that requires en-masse dropping of I and N registers and adding
double-referenced P registers does not sound like a good idea to me.
Performance-wise, it's no better (but is possibly worse) than refetching
everything from lexicals after each function call - you simply refetch
from the extra refs rather than the lexical pad, and you have to do it
*every time* you access a variable, and you can't do bulk spill. VM
implementation impact should be lower than the 2 solutions above, but
every single code generator in existence will break, and if this is ever
fixed, they'll have to be broken again.
It's also possible to provide continuations for specific languages only,
for example by using trampoline. That's why you can implement Scheme on
top of JVM. No, this doesn't interoperate with languages that don't use
the trampoline, at least not bidirectionally. Considering that Parrot is
primarily targetted for Perl6, this begs the question of whether Perl6
needs real continuations. So far, nothing I've seen in Apocalypses or
Synopsi (Synopse? Synopses?) looked unimplementable without them -
junctions are datatypes that affect function calls at call sites, not at
their point of creation (since they can be introspected), the laziness
is handlable by closures and coroutines can be handled specially - the
way Python and Lua handle them. Still... continuations are a pretty neat
toy to have. ;)
Miro
Well, since I (and I'm sure many others) have been skipping over a lot
of messages in this thread, could you either restate your solution or
give a pointer to your proposal message?
Thanks,
Luke
I must admit I am a little confused as to what the problem is. Given
that any continuation which was originally created as a return
continuation will restore its registers, all code which uses P/S
registers will work as expected. Furthermore, I/N registers will be
usable accross function calls, it is just that continuations will
cause the same to computate as before to be repeated (which is really
not that much of a spanner in the works).
Please don't draw the same diagrams and claim that registers life is
secretly extended. That is not the case, as the invocation of the
(once) return continuation will restore the register allocation to the
correct state that code after it is expecting.
> I must admit I am a little confused as to what the problem is. Given
> that any continuation which was originally created as a return
> continuation will restore its registers, all code which uses P/S
> registers will work as expected. Furthermore, I/N registers will be
> usable accross function calls, it is just that continuations will
> cause the same to computate as before to be repeated (which is really
> not that much of a spanner in the works).
> Please don't draw the same diagrams and claim that registers life is
> secretly extended. That is not the case, as the invocation of the
> (once) return continuation will restore the register allocation to the
> correct state that code after it is expecting.
Ok. I'll try to summarize, where I see the problem. No diagrams, just
code :)
$I0 = 42 # set I16, 42 42
$N0 = 2.5 # set N16, 2.5 ..101...
$S0 = "a" # set S16, "a" 0x1004 -> "a"
$P0 = "a" # set P16, "a" 0x2008 -> "a"
loop:
foo() # set P0, ...; invokecc
We have some temporary variables and a function call. Variables are used
beyond that point, so the register allocator puts these in the preserved
register range. The function C<foo()> might or might not capture the
continuation created by the C<invokecc> opcode.
Let's assume, it is captured, and stored into a global, if it wasn't
already, i.e. the first time. According to Dan's plan, the function
return restores the register contents to the state of the creation of
the return continuation, which is shown in the right column.
$I0 += 1 # add I16, 1 43
$N0 *= 2.0 # mul N16, 2.0 .101....
$S0 .= "b" # concat S16, "b" 0x1008 -> "ab"
inc $P0 # inc P16 0x2008 -> "b"
dec a # dec P17 0x200c -> 1
if a goto loop # if P17, loop
A note WRT strings: the concat might or might not assign a new string to
S16. It depends on the capacity of the string buffer. But generally:
string operations do create new string headers with a different memory
address like shown here. While S registers hold pointers, they have
value semantics.
Now we loop once over the function call. This creates a new return
continuation and on function return registers are restored to their new
values (44, 10.0, "abb", "c"). All fine till here.
The loop counter "a" reaches zero. Now the next instruction is
another function call.
bar() # set P0, ... invokecc
The "bar()" function extracts the return continuation captured in the
first call to "foo()" from the global and invokes it. Control flow
continues right after the "invokecc" opcode that called "foo()".
This would restore the register contents to the first state shown above.
That is, not only I and N registers would be clobbered also S registers
are involved.
Above code could only use P registers. Or in other words: I, N, and S
registers are almost[1] useless.
Perl6 syntax like "my int $i;" would need PMCs. Extracting an integer
from an array that stores natural ints would need a PMC.
And further: to achieve this functionality of restoring registers, there
must AFAIK be some copying involved (probably when the return
continuation is captured and converted into a real continuation).
This would make the usage of continuations again very expensive and very
likely unusable for backtracking in the rules engine or such.
[1] What about I,S,N register usage between function calls or in leaf
functions (that don't call further)?
A piece of code like:
$I0 = 10
loop:
$P0 = shift ar
dec $I0
if $I0 goto loop
would look reasonable sane with a loop counter in an I register. But
what happens if the "shift" extracts values from the array and the array
creates these values lazily with a generator function? You never know if
there is a continuation involved.
To me this all smells like a huge change of the Parrot architecture.
Please correct me, if any of my assumptions is wrong.
leo
On Fri, 3 Dec 2004 09:26:24 +0100, Leopold Toetsch <l...@toetsch.at> wrote:
> Ok. I'll try to summarize, where I see the problem. No diagrams, just
> code :)
>
> [snip]
Thanks for the clear explanation. I did not realize that S registers
could switch pointers, that does make things a little harder. I have
a recommendation for a possible hybrid solution. Incur the cost of
spilling I,S,N registers heavily. Restore the state of P register.
I suggest this because it seems likely that P registers will have much
greater pressure on them then the others. This would allow us to
minimize the spilling of them (where spilling is most likely), while
still allowing the usage of I,S,N registers (which are less likely to
be used to the point of spilling.
On Fri, 3 Dec 2004 09:26:24 +0100, Leopold Toetsch <l...@toetsch.at> wrote:
> Ok. I'll try to summarize, where I see the problem. No diagrams, just
> code :)
>
> [snip]
Thanks for the clear explanation. I did not realize that S registers
could switch pointers, that does make things a little harder. I have
a recommendation for a possible hybrid solution. Incur the cost of
spilling I,S,N registers heavily. Restore the state of P register.
I suggest this because it seems likely that P registers will have much
greater pressure on them then the others. This would allow us to
minimize the spilling of them (where spilling is most likely), while
still allowing the usage of I,S,N registers (which are less likely to
be used to the point of spilling.
(Sorry for sending this to you twice Leo, accidentally did not cc the list)
> I looked through google groups and couldn't find leo's solution
> ("putting lexicals in registers", and I can't figure out what that
> means; couldn't you have a loop counter that isn't a lexical?).
I believe its the first in the thread 'Lexicals, continuations, and
register allocation', from 23 Nov.
http://groups-beta.google.com/group/perl.perl6.internals/msg/
5d407a49e35702c0?dmode=source
Richard
>>Frankly I think, I've presented a way to make continuations work
>>correctly. I did *not* hear any technical or otherwise reasonable
>>argument that it wouldn't work or that it's untenable, nothing, nada.
>
> Well, since I (and I'm sure many others) have been skipping over a lot
> of messages in this thread, could you either restate your solution or
> give a pointer to your proposal message?
Subject: Lexicals, continuations, and register allocator
Date: Tue, 23 Nov 2004
You can add to the proposed work arounds:
c) don't use I,N,S registers, and return continuations restore register
contents.
> Thanks,
> Luke
leo
> Thanks for the clear explanation. I did not realize that S registers
> could switch pointers, that does make things a little harder. I have
> a recommendation for a possible hybrid solution. Incur the cost of
> spilling I,S,N registers heavily. Restore the state of P register.
My conclusion was that with the copying approach I,S,N registers are
unusable.
> I suggest this because it seems likely that P registers will have much
> greater pressure on them then the others.
Who knows that (now)?
Here is the register usage line Dan has posted some time ago:
| registers needed: I3597, N0, S962, P6207
(these are of course virtual registers but 271 registers got spilled)
It'll be ultimate fun to map registers to 16 preserved P over a
call, BTW.
> Matt
leo
> Matt Fowles <uber...@gmail.com> wrote:
>
>> Thanks for the clear explanation. I did not realize that S registers
>> could switch pointers, that does make things a little harder. I have
>> a recommendation for a possible hybrid solution. Incur the cost of
>> spilling I,S,N registers heavily. Restore the state of P register.
>
> My conclusion was that with the copying approach I,S,N registers are
> unusable.
But you only need to copy when the frame you're restoring is a full
continuation (and, actually, if copy on write works at a per register
level, copy on write might be the way to go). If it's a return
continuation you can simply use the stored state.
I'd submit that, in the vast majority of cases you're not going to be
dealing with full continuations, and on the occasions when you are the
programmer using them will be aware of the cost and will be willing to
pay it.
> Piers Cawley writes:
>> I'd submit that, in the vast majority of cases you're not going to be
>> dealing with full continuations, and on the occasions when you are the
>> programmer using them will be aware of the cost and will be willing to
>> pay it.
>
> Yeah probably. Except the problem isn't the cost. The problem is the
> semantics. If you copy the registers, then when you invoke the
> continuation, their *values* restore to what they were when you made the
> continuation. These are not proper semantics, and would result in
> subtle, incorrect infinite loops.
PMCs don't relocate, so the values you're restoring are simply the
addresses of said PMCs. The Numeric registers are value registers
anyway so no problem there (since there's no way of making a pointer to
the contents of such a register AFAICT). I'm not sure about string
registers.
And anyway, copying is how it used to work, and work it did, albeit slowly.
Yeah probably. Except the problem isn't the cost. The problem is the
semantics. If you copy the registers, then when you invoke the
continuation, their *values* restore to what they were when you made the
continuation. These are not proper semantics, and would result in
subtle, incorrect infinite loops.
Luke
>> Matt Fowles <uber...@gmail.com> wrote:
>>
>>> Thanks for the clear explanation. I did not realize that S registers
>>> could switch pointers, that does make things a little harder. I have
>>> a recommendation for a possible hybrid solution. Incur the cost of
>>> spilling I,S,N registers heavily. Restore the state of P register.
>>
>> My conclusion was that with the copying approach I,S,N registers are
>> unusable.
> But you only need to copy when the frame you're restoring is a full
> continuation
Yes. With the effect that semantics of I,S,N (i.e. value registers)
suddenly changes.
> I'd submit that, in the vast majority of cases you're not going to be
> dealing with full continuations, and on the occasions when you are the
> programmer using them will be aware of the cost and will be willing to
> pay it.
*If* the programmer is aware of the fact that a subroutine can return
multiple times, he can annotate the source so that a correct CFG is
created that prevents register reusing alltogether. The problem is
gone in the first place.
*If* that's not true, you'd get the effect that suddenly I,S,N registers
restore to some older values which makes this registers de facto unusable.
leo
But they're bloody value registers. They're *supposed* to restore to the
state they were in when the function was originally called. Which is
what copying semantics does.
Further to my last response. If you have things set up so that you can
return multiple times from the same function invocation then the return
continuation should have been replaced with a full continuation before
the first return, so even the first return will use copying semantics,
and the registers will always be restored to the state they were in when
the function was first called, which is absolutely the right thing to
do.
> Further to my last response. If you have things set up so that you can
> return multiple times from the same function invocation then the return
> continuation should have been replaced with a full continuation before
> the first return, so even the first return will use copying semantics,
> and the registers will always be restored to the state they were in when
> the function was first called, which is absolutely the right thing to
> do.
Here is again the example I've brought recently. Please go through it
and tell me what's wrong with my conclusion.
leo
Is that guaranteed? Because it probably needs to be.
>
> Now we loop once over the function call. This creates a new return
> continuation and on function return registers are restored to their new
> values (44, 10.0, "abb", "c"). All fine till here.
>
> The loop counter "a" reaches zero. Now the next instruction is
> another function call.
>
> bar() # set P0, ... invokecc
>
> The "bar()" function extracts the return continuation captured in the
> first call to "foo()" from the global and invokes it. Control flow
> continues right after the "invokecc" opcode that called "foo()".
>
> This would restore the register contents to the first state shown above.
> That is, not only I and N registers would be clobbered also S registers
> are involved.
That's correct. What's the problem? Okay, you've created an infinite
loop, but what you're describing is absolutely the correct behaviour for
a continuation. If you need any state to be 'protected' from taking the
continuation then it needs to be in a lexical or a mutated PMC. This is
just how continuations are supposed to work.
> Above code could only use P registers. Or in other words: I, N, and S
> registers are almost[1] useless.
No they're not. But you should expect them to be reset if you take a
(full) continuation back to them.
Presumably if foo() doesn't store a full continuation, the restoration
just reuses an existing register frame and, if foo has made a full
continuation its return does a restore by copying?
>> ... While S registers hold pointers, they have
>> value semantics.
> Is that guaranteed? Because it probably needs to be.
It's the current implementation and tested.
>> This would restore the register contents to the first state shown above.
>> That is, not only I and N registers would be clobbered also S registers
>> are involved.
> That's correct. What's the problem? Okay, you've created an infinite
> loop, but what you're describing is absolutely the correct behaviour for
> a continuation.
Ok. It's a bit mind-twisting but OTOH it's the same as setjmp/longjmp
with all implications on CPU registers. C has the volatile keyword to
avoid clobbering of a register due to a longjmp.
>> Above code could only use P registers. Or in other words: I, N, and S
>> registers are almost[1] useless.
> No they're not. But you should expect them to be reset if you take a
> (full) continuation back to them.
The problem I have is: do we know where registers may be reset? For
example:
$I0 = 10
loop:
$P0 = shift array
dec $I0
if $I0 goto loop
What happens if the array PMC's C<shift> get overloaded and does some
fancy stuff with continuations. My gut feeling is that the loop might
suddenly turn into an infinite loop, depending on some code behind the
scenes ($I0 might be allocated into the preserved register range or not
depending on allocation pressure).
Second: if we don't have a notion that a continuation may capture and
restore a register frame, a compiler can hardly use any I,S,N registers
because some library code or external function might just restore these
registers.
> Presumably if foo() doesn't store a full continuation, the restoration
> just reuses an existing register frame and, if foo has made a full
> continuation its return does a restore by copying?
Yes, that should be a reasonable implementation.
leo
This is, of course, why so many languages that have full continuations
use reference types throughout, even for numbers. And immutable strings...
So my conclusion that (in combination with restoring registers to the
values of continuation creation) I,S,N registers are almost unusable is
correct?
What about my proposal "Lexicals, continuations, and register
allocation"? Would that provide proper semantics for continuations?
leo
On Wed, 8 Dec 2004 20:29:00 +0100, Leopold Toetsch <l...@toetsch.at> wrote:
> So my conclusion that (in combination with restoring registers to the
> values of continuation creation) I,S,N registers are almost unusable is
> correct?
I would disagree. Let me take the above example and work with it a little:
$I0 = 10
loop:
$P0 = shift array
dec $I0
if $I0 goto loop
We are (for the moment) assuming that "shift array" somehow causes a
full continuations to be taken and then invoked it in a subsequent
call. Then this code would infinite loop; however, so would this code
as the second call is returning through the first calls continuation.
$P0 = shift array
$P1 = shift array
On the other hand, if every call to "shift array" took a full
continuation, did some stuff, and eventually returned through its
return continuation. Then neither would infinite loop, as every call
to "shift array" would have its own return continuation.
What this means is that care must be taken when you are writing code
that you expects to be invoked multiple times. However, if you are a
function that on your second invocation returns via the continuation
from you first invocation, you should probably expect to be called
again because it happened the first time! If you are expecting other
behavior, it is probably because one person wrote the whole chain of
calls and had some extra knowledge about the caller. This author may
have to be a little wary about value vs reference semantics, but
programmers are fairly used to that pitfall by now.
>What this means is that care must be taken when you are writing code
>that you expects to be invoked multiple times. However, if you are a
>function that on your second invocation returns via the continuation
>from you first invocation, you should probably expect to be called
>again because it happened the first time!
>
The contentious point is precisely that there is no way to tell at the
compile time that you will be invoked multiple times. If something you
pull from a library or a callback passed to you captures the
continuation, you may notice that your return continuation had been
promoted, but at that point it's already too late to kill all I/N/S
register usage and to double-reference all PMCs. Of course, unless you
keep the original AST and keep recompiling it to PIR. I'd argue that
it's vastly more complex and error prone than Leo's solution he refered
to in his post. It's also no better than refetching all lexicals.
Also, note that return-continuation-restores semantics is simply a
(possibly premature) optimisation in its own right. CPS is supposed to
treat every continuation the same (which turned out to be inefficient),
and then return continuations were added to simplify the most common
case. So, while return continuation is unpromoted, it's perfectly okay
for it to behave any way it wants. Once it does get promoted, I'd argue
that it should behave like a normal continuation first because it's more
practical (see above) and second because that way it doesn't break CPS
semantics.
Miro
> ... This author may
> have to be a little wary about value vs reference semantics, but
> programmers are fairly used to that pitfall by now.
Yes. But this still boils down to the question, if an author or compiler
knows that a full continuation is involved and that I,S,N registers
aren't usable for a piece of code.
> Matt
leo
> I would disagree. Let me take the above example and work with it a little:
>
> $I0 = 10
> loop:
> $P0 = shift array
> dec $I0
> if $I0 goto loop
>
> We are (for the moment) assuming that "shift array" somehow causes a
> full continuations to be taken and then invoked it in a subsequent
> call. Then this code would infinite loop; however, so would this code
Is shift a vtable method? IIRC Dan said that we're not going to be able
to take continuations that cross PMC vtable invocation.
Nicholas Clark
>> We are (for the moment) assuming that "shift array" somehow causes a
>> full continuations to be taken and then invoked it in a subsequent
>> call. Then this code would infinite loop; however, so would this code
> Is shift a vtable method?
Yep.
> ... IIRC Dan said that we're not going to be able
> to take continuations that cross PMC vtable invocation.
Yes. Coincidently I went through Dan's blog yesterday and found a
sentence that it isn't allowed to escape from overloaded vtable/MMD
methods. Makes a lot of sense.
I've no clue how we can enforce that, though.
> Nicholas Clark
leo
Which, ordinarily, isn't what you should want.
> I'd use them for almost all my int-y variables where I didn't
> need any additional cruft [...]
The prevalence of this sort of thinking is precisely why I cringed
when I read that in the Apocalypse, and why I am nearly certain
that inclusion of these types in Perl6 is a Very Bad Idea. People
will use them all the time without thinking, because "it'll be faster".
There are two very strong arguments against this:
1. It's not "faster" enough to make any worthwhile difference in
the overwhelming majority of cases. On a modern system with
a decent-sized L2 cache (and bear in mind that by the time
this code enters production, what we right now consider a
modern system will be old and cheap on Ebay), the performance
impact of double referencing is next to nothing.
2. There's a reason high-level data structures were developed:
they obviate entire classes of subtle, hard-to-debug bugs.
(Not as subtle or hard-to-find as pointer-related bugs, but
nevertheless, subtle and hard to find.) Throwing out those
advantages for a few picoseconds of potential optimization
is, in the vast majority of cases, a very poor trade.
Yes, there are a few cases where using these variable types
would be appropriate, in carefully considered situations, but
people who are thinking, "Wow, this is the greatest thing since
sliced bread, I'm going to use this for most my integer
variables" have failed to learn the hard lessons the software
industry has had to learn over the last thirty years.
To illustrate, I'll go back and more closely examine your
statement above:
> I'd use them for almost all my int-y variables where I didn't
> need any additional cruft [...]
That additional "cruft", as you put it, is in fact a safety net.
The question you should be asking yourself is, how do you KNOW
whether you're going to need the "cruft" or not, in any given
instance? Let's consider as our given instance the precise
example you cite:
> like as an index in a for loop (who needs all the cruft there?
> You just want a number that goes up one every time!).
For the moment, let's ignore the fact that in Perl6 the C-style for
loop is (thank goodness) going away, and for is merely a synonym for
foreach. Someone will certainly implement a module that provides a
C-style for loop (implemented in terms of while, I suppose), and so
we'll assume you're using that:
my int $x;
c_style_for ($x = 0; $x < $y; ++$x) {
do_stuff();
}
The question is, how do you know this isn't buggy? How do you
know that $x won't overflow? How do you know that $x won't overflow
only in certain special circumstances -- perhaps only if the size of
the program's input is abnormally large, or perhaps only late at
night or late in the month, or perhaps only on certain architectures,
or in other special circumstances you'll neglect to test? You don't.
In many cases you *can't* know that, and even in the cases where you
can, it will generally involve more desk-checking and precise
calculation of pathological cases than most programmers want to do.
In rare cases the performance of a particular bit of code is so
critical (perhaps because it is embedded in a triple-nested loop)
that it's worth working out all those details and calculating exactly
what the maximum possible value is and thus what size variable you
need. In normal situations, however, it's not worth it, and you
won't do it. A high-level data type saves you from needing to
calculate that for every variable (and, indeed, in many cases it's
not even POSSIBLE to calculate the max possible value), because it
will autopromote if necessary rather than overflow. At least, the
Apocalypse hinted that if it doesn't do that by default, there will
be an easily invoked pragma to make it do that. Personally I'd
prefer that all variables do that all the time, and if the programmer
every *actually* wants overflow semantics for some hyperunusual
reason (e.g., to emulate legacy computer-system behavior), that can
be easily implemented as a special class of OverFlowingInt object
that checks after each operation to see if its value has exceeded
some maximum and if so throws an exception or does some modular
arithmetic or something, depending on which type of overflow you
prefer.
> Nearly all modules could make great use of them for speedy
> internal variables, too.
I certainly hope not. Mindlessly using lowlevel types without regard
for the consequences is NOT the direction Perl programmers should want
to go.