Calling conventions and opcodes (and of course the semantics of these) define the ABI of the Parrot VM. Any change in the ABI creates incompatibilities and the need to rewrite compilers that target Parrot.
The current calling conventions, as set in stone in pdd03 [1], are IMHO limited and suboptimal. I've already shown with the cachegrind analysis of the fib benchmark [2] that we just can't keep the scheme in the long run for performance reasons. Further, the current calling scheme doesn't cope with MMD, as argument ordering isn't provided:
foo(int i, float v)
and
foo(float v, int i)
end up with the same call signature setup "I2 := I4 := 1".
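The collision can be modelled outside Parrot. The following is only a sketch in Python, not Parrot code: it assumes the current scheme boils a call down to per-type argument counts, and the helper names `count_signature` and `ordered_signature` are made up for illustration.

```python
from collections import Counter

def count_signature(arg_types):
    """Hypothetical model of the current scheme: only per-type counts survive."""
    return frozenset(Counter(arg_types).items())

def ordered_signature(arg_types):
    """What multi-method dispatch needs: the types in call order."""
    return tuple(arg_types)

sig_a = ["int", "float"]   # foo(int i, float v)
sig_b = ["float", "int"]   # foo(float v, int i)

# The count-based signatures are indistinguishable, the ordered ones are not.
same_counts = count_signature(sig_a) == count_signature(sig_b)
same_order = ordered_signature(sig_a) == ordered_signature(sig_b)
print(same_counts, same_order)
```

With counts alone the two variants of foo look identical to the callee, which is why a dispatcher cannot pick between them.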
Finally, the current scheme doesn't cope with any of the fancy call stuff like named arguments, default values, scalar vs list context, and whatnot. For HLL interoperability Parrot should provide a scheme that has at least some common denominator to support major target languages.
Some of my arguments are of course related to performance and optimization, which we shouldn't consider much until we have a complete Parrot VM. But changing the ABI can't be done much later, as too much existing code is impacted.
To get around this problem I propose an abstraction layer around the calling scheme.
2) Proposal
2.1) @ARGS
@ARGS is a pseudo-array (hash, object) that handles call arguments and return values. It allows indexed and named access to arguments in a straightforward manner:
$I0 = @ARGS[0]      # get 1st argument as plain int
set I16, @ARGS[0]   # same, PASM
$P0 = @ARGS[1]      # get 2nd arg as PMC
This replaces the current scheme of addressing arguments:
$I0 = I5   # 1st arg in I5
$P0 = P5   # 2nd arg in P5
This should work of course the other way too:
@ARGS[0] = 42   # set first return value
                # or sub param to pass in
or
push @ARGS, 42
We probably need two of these: one for the current incoming arguments and one for outgoing arguments per call. Return values can arrive in the @ARGS of the call. So it's more likely called @IN_ARGS and @OUT_ARGS or some such.
2.2) named access of arguments:
$P0 = @ARGS['a'] # get argument named 'a'
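To make the intended behaviour of 2.1 and 2.2 concrete, here is a toy model in Python. It is a sketch only: the class name `Args`, the `push` signature with an optional `name`, and the grow-on-assignment behaviour are all assumptions for illustration, not part of the proposal.

```python
class Args:
    """Toy model of @ARGS: positional slots, some of which also carry a name."""

    def __init__(self):
        self._values = []
        self._names = {}   # argument name -> positional index

    def push(self, value, name=None):
        # Append a value; optionally remember a name for it.
        if name is not None:
            self._names[name] = len(self._values)
        self._values.append(value)

    def __getitem__(self, key):
        # A string key means named access, an int key means indexed access.
        if isinstance(key, str):
            return self._values[self._names[key]]
        return self._values[key]

    def __setitem__(self, key, value):
        index = self._names[key] if isinstance(key, str) else key
        # Assumption: assigning past the end grows the array.
        while len(self._values) <= index:
            self._values.append(None)
        self._values[index] = value

    def __len__(self):
        return len(self._values)

out_args = Args()
out_args[0] = 42                   # like: @ARGS[0] = 42
out_args.push("hello", name="a")   # like: push @ARGS, ... with a name attached
print(out_args[0], out_args["a"], len(out_args))
```

The point of the model is that the caller and callee only ever see the index or the name; where the value physically lives is left to the implementation.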
2.3) @ARGS methods and attributes
@ARGS provides all array-ish access methods:
argv = elements @ARGS   # get argument count
@ARGS = 4               # want to pass 4 args to sub
All necessary introspection and functionality can be implemented with existing opcodes or syntax:
$I0 = typeof @ARGS[2]             # get type ID of 3rd argument
$P0 = @ARGS."__ro"(3)             # get 4th argument as read-only (COW) copy
@ARGS."__push"($P0)               # set next call arg
@ARGS."__push_flatten"(a)         # flatten "a" and add to outgoing args
@ARGS."__set_default"("a", $P0)   # set default value for arg named "a"
and so on.
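A rough Python model of three of these methods, under stated assumptions: the class `OutArgs`, its attribute names, and the rule that an explicitly passed named argument wins over its default are illustrative guesses, since the proposal does not spell out the semantics.

```python
class OutArgs:
    """Toy model of the outgoing @ARGS with push, push_flatten and set_default."""

    def __init__(self):
        self.positional = []
        self.named = {}      # named arguments actually supplied
        self.defaults = {}   # fallback values for named arguments

    def push(self, value):
        # Like @ARGS."__push"($P0): append one argument.
        self.positional.append(value)

    def push_flatten(self, values):
        # Like @ARGS."__push_flatten"(a): append each element of an aggregate.
        self.positional.extend(values)

    def set_default(self, name, value):
        # Like @ARGS."__set_default"("a", $P0).
        self.defaults[name] = value

    def get_named(self, name):
        # Assumption: a supplied value takes precedence over the default.
        if name in self.named:
            return self.named[name]
        return self.defaults[name]

args = OutArgs()
args.push(1)
args.push_flatten([2, 3])
args.set_default("a", 10)
print(args.positional, args.get_named("a"))
```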
2.4) variadic arguments (Perl @_, Python *a)
The @ARGS array can directly map to an @_ array:
@ARGS = argv
or:
@ARGS[2] = argv # ($a, $b, *@c)
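One way to read the ($a, $b, *@c) case is that everything from index 2 onward is collected into a single array. The Python sketch below models that reading on the receiving side; `bind_slurpy` is a hypothetical helper and the interpretation itself is an assumption.

```python
def bind_slurpy(args, fixed_count):
    """Split args into the fixed leading parameters and a slurpy remainder."""
    fixed = list(args[:fixed_count])
    rest = list(args[fixed_count:])
    return fixed, rest

# ($a, $b, *@c) called with five arguments
(a, b), c = bind_slurpy([1, 2, 3, 4, 5], 2)
print(a, b, c)
```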
2.5) return context
Yesterday's conversation on IRC (yes!) has clearly shown that the current calling conventions are lacking information about scalar vs list vs void context.
sub foo { want.List ?? (1,2,3) :: 1 } # or some such
This information could also be attached to @ARGS. E.g.
@ARGS."return_list"(1)
3) implementation idea
For each call we have to allocate a context structure with a new register frame. As shown in [3] this can be done efficiently with a sliding register window and variable sized call frames. An overall layout of a call frame could look like:
+---------+-----------+-----------+-----------+
| context | in @ARGS  | registers | out @ARGS |
+---------+-----------+-----------+-----------+
But actually it doesn't really matter now. With the abstraction we can do it as we like and optimize it like hell later.
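For illustration only, the layout above can be modelled as four consecutive regions of a variable sized frame. The Python sketch below computes where each region would start; the class `FrameLayout`, its field names, and the slot counts are invented for the example and say nothing about the real implementation.

```python
from dataclasses import dataclass

@dataclass
class FrameLayout:
    """Sizes (in slots) of the four regions of one hypothetical call frame."""
    context_size: int
    in_args: int
    registers: int
    out_args: int

    def offsets(self):
        # Start offset of each region, counted from the frame base.
        in_start = self.context_size
        reg_start = in_start + self.in_args
        out_start = reg_start + self.registers
        return {
            "context": 0,
            "in_args": in_start,
            "registers": reg_start,
            "out_args": out_start,
        }

    def total(self):
        # Overall frame size; differs per sub because each region is variable.
        return self.context_size + self.in_args + self.registers + self.out_args

frame = FrameLayout(context_size=4, in_args=2, registers=8, out_args=3)
print(frame.offsets(), frame.total())
```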
4) Conclusion
The @ARGS abstraction hides all the nasty call stuff inside a consistent interface, which is easily extensible and not as limited as the current layout in pdd03. Changing the ABI later isn't really a good idea, so an abstraction like this should be defined ASAP to provide all freedom for the implementation.
The attributes, methods, and semantics of @ARGS (or a similar abstraction) should be consolidated by the various HLL folks, so that we eventually have a consistent usable cross-language call interface that just works.
leo
[1] docs/pdds/pdd03_calling_conventions.pod
[2] Subject: "Why is the fib benchmark still slow - part 1"
NB: part 2 would have been the argument/return value copying between register frames (src/sub.c:copy_regs())
[3] Subject: "[Summary] Register stacks again"
    Subject: "[PROPOSAL] for a new calling scheme"
Leopold Toetsch wrote:
> Below inline attached is a scheme for an abstraction layer around
> calling conventions.
>
> Comments welcome,
> leo
>
> 2.5) return context
>
> Yesterdays conversation on IRC (yes!) has clearly shown that the
> current calling conventions are lacking information about scalar vs
> list vs void context.
>
> sub foo { want.List ?? (1,2,3) :: 1 } # or some such
>
> This information could also be attached to @ARGS. E.g.
>
> @ARGS."return_list"(1)
Would it be possible to attach it to the continuation? Then in the course of tail-calling the information continues to be available just where it's needed.
Roger Hale <r...@theworld.com> wrote:
> Leopold Toetsch wrote:
> > sub foo { want.List ?? (1,2,3) :: 1 } # or some such
> > This information could also be attached to @ARGS. E.g.
> > @ARGS."return_list"(1)
> Would it be possible to attach it to the continuation? Then in the
> course of tail-calling the information continues to be available just
> where it's needed.
As @ARGS (or @IN_ARGS, @OUT_ARGS) is being stored in the context, and that context is defacto the continuation, yes - a tail-call would inherit this information.
Leopold Toetsch wrote:
> Roger Hale <r...@theworld.com> wrote:
>>Leopold Toetsch wrote:
>>> sub foo { want.List ?? (1,2,3) :: 1 } # or some such
>>>This information could also be attached to @ARGS. E.g.
>>> @ARGS."return_list"(1)
>>Would it be possible to attach it to the continuation? Then in the
>>course of tail-calling the information continues to be available just
>>where it's needed.
> As @ARGS (or @IN_ARGS, @OUT_ARGS) is being stored in the context, and
> that context is defacto the continuation, yes - a tail-call would
> inherit this information.
> leo
But as each tail-call supplies a new @ARGS, how can this be the case?
One can also think of {scalar, list, ...} context as the continuation's signature...
Roger Hale <roger.h...@rcn.com> wrote:
> Leopold Toetsch wrote:
>> As @ARGS (or @IN_ARGS, @OUT_ARGS) is being stored in the context, and
>> that context is defacto the continuation, yes - a tail-call would
>> inherit this information.
>> leo
> But as each tail-call supplies a new @ARGS, how can this be the case?
We would have two parts in the context: @IN_ARGS, @OUT_ARGS. The C<tailcall> opcode can preserve that part with the return context.
> One can also think of {scalar, list, ...} context as the continuation's
> signature...
Leopold Toetsch wrote:
> Roger Hale <roger.h...@rcn.com> wrote:
>>Leopold Toetsch wrote:
>>>As @ARGS (or @IN_ARGS, @OUT_ARGS) is being stored in the context, and
>>>that context is defacto the continuation, yes - a tail-call would
>>>inherit this information.
>>But as each tail-call supplies a new @ARGS, how can this be the case?
> We would have two parts in the context: @IN_ARGS, @OUT_ARGS. The
> C<tailcall> opcode can preserve that part with the return context.
It seems to me that both @IN_ARGS and @OUT_ARGS get used for other things (the tail-calls' arguments) in a chain of tail-calls. Consider this chain:
A calls B(@OUT_ARGS 1)[continuation: A*] in context c
C(@IN_ARGS 2)[c10n: A*] wants to know context c, as it's getting ready to return something. Neither @IN_ARGS (the arguments C received from B) nor @OUT_ARGS (the arguments of any call C may make) has this information, but the continuation (I propose) does; and this continues to be good for whoever wants to know: the return object holds the return context.
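The idea of keeping the return context on the continuation can be sketched in Python. This is a model, not Parrot code: the `Continuation` class, its `context` field, and the function names are assumptions, and it presumes that B tail-calls C with A's continuation passed through untouched.

```python
class Continuation:
    """Hypothetical return continuation that also records the wanted context."""

    def __init__(self, name, context):
        self.name = name
        self.context = context   # "scalar", "list" or "void"

def c(in_args, cont):
    # C never sees A's arguments, only the continuation; that is enough
    # to find out which context the eventual receiver wants.
    if cont.context == "list":
        return [1, 2, 3]
    return 1

def b(in_args, cont):
    # Tail call: fresh arguments for C, same continuation A*.
    return c([2], cont)

def a_calls_b(context):
    a_star = Continuation("A*", context)
    return b([1], a_star)

print(a_calls_b("list"), a_calls_b("scalar"))
```

However long the chain of tail calls gets, each callee's arguments change while the continuation, and with it the context, stays the same.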
From: Roger Hale <roger.h...@rcn.com>
Date: Thu, 07 Apr 2005 04:23:41 -0400
Leopold Toetsch wrote:
> Roger Hale <roger.h...@rcn.com> wrote:
>
>>Leopold Toetsch wrote:
>>
>>>As @ARGS (or @IN_ARGS, @OUT_ARGS) is being stored in the context, and
>>>that context is defacto the continuation, yes - a tail-call would
>>>inherit this information.
>
>>But as each tail-call supplies a new @ARGS, how can this be the case?
>
> We would have two parts in the context: @IN_ARGS, @OUT_ARGS. The
> C<tailcall> opcode can preserve that part with the return context.
It seems to me that both @IN_ARGS and @OUT_ARGS get used for other things (the tail-calls' arguments) in a chain of tail-calls.
The definition of a tail call is that it returns its callee's results back to its caller unmodified. So if @OUT_ARGS is used for other things, then it's not a tail call.
Consider this chain:
A calls B(@OUT_ARGS 1)[continuation: A*] in context c
C(@IN_ARGS 2)[c10n: A*] wants to know context c, as it's getting ready to return something. Neither @IN_ARGS (the arguments C received from B) nor @OUT_ARGS (the arguments of any call C may make) has this information . . .
I don't think this is a real situation. If B passes C the continuation it got from A, then the call to C is indeed a tail call [1], and cannot have different @OUT_ARGS from the call to B, because B never regains control when C returns. Your notation in that case has to be:
Or do you mean something different from a tail call? If so, could you please express it in a programming language?
. . . but the continuation (I propose) does; and this continues to be good for whoever wants to know: the return object holds the return context.
No?
regards, Roger
I believe so, but I think this is what Leo meant by "... that context is defacto the continuation." There doesn't need to be a separate "return object" because it would be one-to-one with the continuation.
Bob Rogers wrote:
> From: Roger Hale <roger.h...@rcn.com>
> Date: Thu, 07 Apr 2005 04:23:41 -0400
> Leopold Toetsch wrote:
> > Roger Hale <roger.h...@rcn.com> wrote:
> >>Leopold Toetsch wrote:
> >>>As @ARGS (or @IN_ARGS, @OUT_ARGS) is being stored in the context, and
> >>>that context is defacto the continuation, yes - a tail-call would
> >>>inherit this information.
> >>But as each tail-call supplies a new @ARGS, how can this be the case?
> > We would have two parts in the context: @IN_ARGS, @OUT_ARGS. The
> > C<tailcall> opcode can preserve that part with the return context.
> It seems to me that both @IN_ARGS and @OUT_ARGS get used for other
> things (the tail-calls' arguments) in a chain of tail-calls.
> The definition of a tail call is that it returns its callee's results
> back to its caller unmodified.
Agreed, but...
> So if @OUT_ARGS is used for other
> things, then it's not a tail call.
I don't understand. @OUT_ARGS aren't the arguments returned (to my understanding), they're the arguments to the next function in sequence.
> Consider this chain:
> A calls B(@OUT_ARGS 1)[continuation: A*] in context c
> C(@IN_ARGS 2)[c10n: A*] wants to know context c, as it's getting
> ready to return something. Neither @IN_ARGS (the arguments C
> received from B) nor @OUT_ARGS (the arguments of any call C may make)
> has this information . . .
> I don't think this is a real situation. If B passes C the continuation
> it got from A, then the call to C is indeed a tail call [1],
Yes...
> and cannot
> have different @OUT_ARGS from the call to B, because B never regains
> control when C returns.
I don't see what this has to do with the continuation passed with the call. A tail call is an arbitrary call with arbitrary arguments, in final position before returning. (If it omitted the tail-call optimization and supplied a continuation B* of its own, B* would be the identity stub
-> @return_args { A* @return_args }
as it were; but these are not B's @OUT_ARGS, but B*'s @IN_ARGS and @OUT_ARGS, and A*'s @IN_ARGS, to my understanding.) B can call C with anything it likes, because it still has control before calling C.
What forces B to call C with the same arguments it was called with?
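The point can be illustrated with a small continuation-passing sketch in Python. All names here (`a_star`, `b_optimized`, `b_unoptimized`, `b_star`) and the arithmetic are invented for the example: B passes C arguments of its own choosing, and whether B hands over A* directly or an identity stub B* that forwards to A*, the caller A receives the same results.

```python
def a_star(results):
    # A's continuation: receives whatever the chain finally returns.
    return ("A received", results)

def c(in_args, cont):
    # C returns its results by invoking the continuation it was given.
    return cont([x * 2 for x in in_args])

def b_optimized(in_args, cont):
    # Tail call: B picks new arguments for C and passes A* straight through.
    return c([in_args[0] + 1], cont)

def b_unoptimized(in_args, cont):
    # Without the optimization B supplies its own continuation B*,
    # an identity stub that forwards the results unchanged to A*.
    def b_star(results):
        return cont(results)
    return c([in_args[0] + 1], b_star)

print(b_optimized([1], a_star))
print(b_unoptimized([1], a_star))
```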
> Or do you mean something different from a tail call? If so, could you
> please express it in a programming language?
> . . . but the continuation (I propose) does; and this continues to be
> good for whoever wants to know: the return object holds the return
> context.
> No?
> regards,
> Roger
> I believe so, but I think this is what Leo meant by "... that context is
> defacto the continuation." There doesn't need to be a separate "return
> object" because it would be one-to-one with the continuation.
Sorry, by "return object" I was only meaning the continuation; you are quite right. Just using a different term for parallelism with "return context", but I see it only introduced confusion.
From: Roger Hale <roger.h...@rcn.com>
Date: Mon, 11 Apr 2005 09:30:32 -0400
Bob Rogers wrote:
> From: Roger Hale <roger.h...@rcn.com>
> Date: Thu, 07 Apr 2005 04:23:41 -0400
>
> Leopold Toetsch wrote:
> > Roger Hale <roger.h...@rcn.com> wrote:
> >
> >>Leopold Toetsch wrote:
> >>
> >>>As @ARGS (or @IN_ARGS, @OUT_ARGS) is being stored in the context, and
> >>>that context is defacto the continuation, yes - a tail-call would
> >>>inherit this information.
> >
> >>But as each tail-call supplies a new @ARGS, how can this be the case?
> >
> > We would have two parts in the context: @IN_ARGS, @OUT_ARGS. The
> > C<tailcall> opcode can preserve that part with the return context.
>
> It seems to me that both @IN_ARGS and @OUT_ARGS get used for other
> things (the tail-calls' arguments) in a chain of tail-calls.
>
> The definition of a tail call is that it returns its callee's results
> back to its caller unmodified.
Agreed, but...
> So if @OUT_ARGS is used for other > things, then it's not a tail call.
I don't understand. @OUT_ARGS aren't the arguments returned (to my understanding), they're the arguments to the next function in sequence.
My mistake; I had thought "@OUT_ARGS" meant "results". I see I didn't read Leo's original proposal carefully enough, and you were just following his terminology; my apologies. I agree that information about return context can't live in @ARGS (in or out) directly.
> . . . but the continuation (I propose) does; and this continues to be
> good for whoever wants to know: the return object holds the return
> context.
>
> No?
>
> regards,
> Roger
>
> I believe so, but I think this is what Leo meant by "... that context is
> defacto the continuation." There doesn't need to be a separate "return
> object" because it would be one-to-one with the continuation.
Sorry, by "return object" I was only meaning the continuation; you are quite right. Just using a different term for parallelism with "return context", but I see it only introduced confusion.
So it sounds like we are all saying the same thing now?