branched off pdd03 changes

Leopold Toetsch

Jul 6, 2005, 12:01:22 PM

to Perl 6 Internals

1) I've created a branch (branches/leo-ctx5) of my current work on
getting pdd03 implemented

2) please give it a try

e.g.
export SVNPARROT=https://svn.oerl.org/parrot
cp -R trunk leo-ctx5
cd leo-ctx5
svn switch $SVNPARROT/branches/leo-ctx5

3) a lot is still broken but I hope mostly just PASM syntax using old
calling conventions

Things that need changes:

* compile opcode

It did a function call under the hood, which isn't fixed. But I think
we should just toss the compile opcode instead. So when you had:

.local pmc comp, r
comp = compreg "PIR"
r = compile comp, code

just use:

r = comp(code)

It's simpler and allows us to pass arguments to the compiler eventually.

* optional arguments

PGE defines e.g. a sub emit:

.sub emit ( PMC code, STR fmt, [ STR str1, STR str2, INT int1 ] )

with the optional params str1, str2, int1. I've set the :optional bit
accordingly. But when calling the sub, you can't use the INT argument
amidst STR arguments any more:

emit (P, S, S, S) # ok
emit (P, S, S, I) # ok
emit (P, S, I, S) # err

I've reordered a lot of these arguments, but might have missed some - or
messed a few up - some PGE tests are still failing.

* argc{I,S,P,N}

These are now opcodes instead of variables. This changes the usage slightly:

$I0 = argcI # ok
if argcI < 2 ... # err

I'd be glad if folks can go through the failing tests (one exception
test hangs) and change old PASM call syntax to new.

Thanks,
leo

Leopold Toetsch

Jul 6, 2005, 12:29:21 PM

to Perl 6 Internals

Leopold Toetsch wrote:

> Things that need changes:

[ some more I forgot ]

* python PMCs that mess with interpreter context or duplicate Parrot's
function call API need rework - compiling dynclasses/py* is disabled.

* I've not converted two of the NCI signatures (L, T) to the new scheme.
Could be done of course, but handling lists of things inside NCI seems
to be overkill to me. First, we already have too many signatures there, and
second, the {,Un}ManagedStruct PMC can handle almost arbitrary
structures and arrays of items.

leo

Jens Rieks

Jul 6, 2005, 1:20:28 PM

to perl6-i...@perl.org

On Wednesday 06 July 2005 18:01, Leopold Toetsch wrote:
> export SVNPARROT=https://svn.oerl.org/parrot
export SVNPARROT=https://svn.perl.org/parrot

> cp -R trunk leo-ctx5
> cd leo-ctx5
> svn switch $SVNPARROT/branches/leo-ctx5

jens

Patrick R. Michaud

Jul 7, 2005, 12:18:21 AM

to perl6-i...@perl.org

First, my thanks to Leo and Chip for their ongoing work on the
Parrot calling conventions. I'm looking forward to the outcome.

As of the leo-ctx5 branch, however, I find I'm confused about
what the new conventions are supposed to achieve, and indeed
even how they are going to work when they're finished. So, this
is going to be a longish message with my initial confusions and
observations about the new calling conventions, in the hopes of
clearing things up a bit.

First, let me provide some context -- in my career I've programmed in a
lot of different languages and environments, including Perl, C, shells,
awk, C++, Java, PHP, and Pascal, as well as 68000, 6809, Z-80, IBM/360,
and VAX-11 assembly languages. So, in learning the new Parrot calling
conventions I find I'm trying to map them into something I'm familiar
with, and having a lot of trouble doing that. Perhaps Parrot just
doesn't fit well into any of those models.

Also, I've been working under the assumption that it's generally
better (faster/more efficient) for frequently used integer and
string values to be held and manipulated in I and S registers rather
than PMCs. Perhaps this assumption is invalid. Indeed, it's possible
that I'm working under a number of incorrect assumptions about how to
do things in Parrot, so if anyone can think of a better way to
approach the things that PGE is trying to do, please feel free to
speak up.

With that background out of the way, let me give some examples
of confusion and/or difficulties I'm encountering in moving PGE
to the new calling conventions as implemented in leo-ctx5 (and
as discussed today on #parrot). My questions and comments below
are formed from today's experience and conversations; it's
entirely possible I'm missing the bigger picture, or that what
I'm finding is all based on a misconception somewhere.

First, from reading pdd03 I was under the impression that
parameter passing was going to end up purely positional --
that each argument would automatically be converted into the
type required by the corresponding target register. This is
the way I interpreted the words "standard conversion" in pdd03:

=head3 Type Conversions

Unlike the C<set_*> opcodes, the C<get_*> opcodes must perform
conversion from one register type to another. Here are the conversion
rules:

* When the target is an I, N, or S register, storage will
behave like an C<assign> (standard conversion).

However, according to Leo's journal, the I, N, and S registers
do not convert among themselves but instead perform type checking
and throw an exception on mismatches. (cf. "Type checking"
at http://use.perl.org/~leo/journal/25491.)

For subroutines that have fixed parameter lists, perhaps this
is desirable. (Note, however, that I come from a Perl background,
where autoconversion of values is the norm.) It does give me
pause that PIR will autoconvert arguments to and from PMCs but
not other registers -- i.e.:

    .sub "foo"                    .sub "bar"
        .param pmc arg1               .param string arg1
        print arg1                    print arg1
    .end                          .end

    foo(1)    # valid             bar(1)    # type mismatch

I suppose if we consider pmcs to be a "universal type" and
registers to be "restricted types" this makes sense, but in either
case above I have to write extra statements somewhere to get an
integer argument into a string register in the subroutine.

But my real confusion comes from the handling of "optional" parameters
in subroutine calls. PGE uses a number of subroutine calls with
optional parameters -- I think that this results in code that is
more readable and maintainable, and also can avoid any overhead
from setting up and passing lots of "dummy" arguments that aren't
going to be used in a particular subroutine invocation anyway.

One of the PGE functions that has trouble with the new calling
conventions is emit(), which is used to generate the PIR instructions
corresponding to a rule being compiled. emit() was designed to
look and act a lot like C's sprintf(), in that it uses %d and %s as
placeholders for values to be substituted in a string on output:

emit(code, "if rep == %d goto %s", min, next)

Here, "code" is the accumulated PIR instructions, "min" is an int
register containing the minimum number of repetitions, and "next"
is the string label to be branched to for executing the next
component of the rule. In the old-style calling conventions,
emit() was simply defined as:

.sub "emit" method
.param pmc code # accumulating code object
.param string out # string to output
.param string str1 # first %s substitution
.param string str2 # second %s substitution
.param string str3 # third %s substitution
.param int int1 # first %d substitution
# ...

Since int arguments (min) always went to the int register parameters
and string arguments (next) always went to the string register parameters,
all was fine regardless of the order they appeared in the argument list.
I could tell how many of each was sent to emit by looking at the
argcI and argcS pseudo-variables.
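
The old behaviour described above can be pictured with a small Python model. This is only a sketch, not Parrot code: the function name bind_by_type and the use of Python ints and strs to stand in for I and S registers are assumptions made for illustration.

```python
def bind_by_type(args):
    """Toy model of the old convention: each argument goes to the next
    free register of its own type, so the relative order of int and
    string arguments does not matter."""
    banks = {"I": [], "S": [], "P": []}
    for value in args:
        if isinstance(value, int):
            banks["I"].append(value)      # int argument -> next I register
        elif isinstance(value, str):
            banks["S"].append(value)      # string argument -> next S register
        else:
            banks["P"].append(value)      # anything else is treated as a PMC
    # Stand-ins for the argcI / argcS / argcP pseudo-variables.
    counts = {"argc" + kind: len(values) for kind, values in banks.items()}
    return banks, counts


code = object()  # hypothetical stand-in for the accumulated code object
banks, counts = bind_by_type([code, "if rep == %d goto %s", 3, "next_label"])
print(counts)
```

In this model, swapping the last two arguments leaves both the banks and the counts unchanged, which is why the original calls to emit() worked in either order.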

With the new PIR calling conventions, emit() is to be written
with ":optional" on the optional parameters, like so:

.sub "emit" method
.param pmc code # accumulating code object
.param string out # string to output
.param string str1 :optional # first %s substitution
.param string str2 :optional # second %s substitution
.param string str3 :optional # third %s substitution
.param int int1 :optional # first %d substitution
# ...

That's fine, but under the new calling conventions the original
calls to emit() no longer work, because for some reason PIR thinks
that any optional int arguments have to occur *after* any optional
string arguments, thus requiring:

emit(code, "if rep == %d goto %s", next, min)

This construction somewhat offends my sense of style,
because the "next" and "min" arguments are reversed from
the order of the %d and %s placeholders in the format string.
Also, it's not clear why PIR requires the integer parameter
to be last in the sequence -- after all, if it could figure out
that "min" should go in the "int1" parameter (skipping "str2"
and "str3"), why couldn't it figure out the same sort of thing for

emit(code, "if rep == %d goto %s", min, next)

such that "min" goes into the first optional int, and "next" goes
into the first optional string? Clearly PIR isn't doing type
checking of optional parameters anyway; at most it's doing a
form of "type matching".

But it would be really nice if I didn't have to worry about the
register types at all. That is in fact what I thought the new calling
conventions were intended to do -- free me from having to worry
about matching the register types between caller and callee.
(In Perl I don't have to worry about matching argument types --
it just converts values as needed based on context.)

I do notice that I can get this behavior if I use PMCs for the
arguments of emit():

.sub "emit" method
.param pmc code # accumulating code object
.param string out # string to output
.param pmc sub1 :optional # first %s substitution
.param pmc sub2 :optional # second %s substitution
.param pmc sub3 :optional # third %s substitution
# ...

# and later...
emit(code, "if rep == %s goto %s", min, next)

Here, even though "min" is an int argument and "next" is a string
argument, they both get placed into the "sub1" and "sub2" pmc parameters
respectively, and I can just replace the first %s with the string
value of "sub1" and the second %s with the string value of "sub2".

Of course, the downside is that with this approach I'm creating PMCs
just to get to a string value, and inside emit() I have to convert
each PMC into a string to be able to do various string operations
on the argument. For arguments that were in string registers to
begin with, we've needlessly converted them into pmcs and back to
string registers again.

Things get really odd if we have a mix of optional pmc and other
parameters. For example, consider the following:

    .sub "foo"                        .sub "bar"
        .param pmc abc                    .param pmc abc
        .param pmc sub1 :optional         .param string sub1 :optional
        .param pmc sub2 :optional         .param string sub2 :optional
        .param pmc sub3 :optional         .param string sub3 :optional
        .param int int1 :optional         .param int int1 :optional
        ...                               ...

    foo($P0, "hello", 0)              bar($P0, "hello", 0)

For the call to "foo", the arguments end up in the "sub1" and
"sub2" parameters, while in the call to "bar" the arguments end
up in the "sub1" and "int1" parameters. Thus, in the call to "foo"
if the caller wants to get that 0 argument into the "int1" parameter
it has to be the fifth argument, while in the call to "bar" the zero
can be any argument as long as it's the first int. So here, the
caller really needs to know the arguments used in the callee.

And I was a bit surprised when Leo mentioned on IRC that the
following is illegal:

.sub "baz"
.param pmc abc
.param string arg1 :optional
.param int arg2 :optional
.param string arg3 :optional # illegal (?)

According to Leo, any optional parameters have to be grouped together
by register type (all strings together, all ints together, all pmcs
together, etc.).

So, optional parameters are not purely positional, and they
don't strictly match by register type. Instead it seems to be a
hybrid register-type-and-sequence pattern, where values are automatically
converted if the next argument or parameter in the list is a pmc,
but otherwise we skip any optional arguments until we find the
group of parameter registers that match the type of argument
we're currently at.
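
The hybrid rule as summarised above can be modelled in a few lines of Python. This is a toy reading of the behaviour reported in this thread, not the branch's actual implementation; bind_hybrid, type_of, and the EMIT table are hypothetical names, and Python ints and strs stand in for I and S values.

```python
def type_of(value):
    """Map a Python value to a register kind: I, S, or P (anything else)."""
    if isinstance(value, int):
        return "I"
    if isinstance(value, str):
        return "S"
    return "P"


def bind_hybrid(params, args):
    """params is a list of (name, kind, optional) tuples.
    An argument binds to the next parameter if either side is a PMC or the
    kinds match; otherwise optional parameters are skipped until one fits."""
    bound = {}
    pos = 0
    for value in args:
        kind = type_of(value)
        while True:
            if pos >= len(params):
                raise TypeError("no parameter left for %r" % (value,))
            name, want, optional = params[pos]
            pos += 1
            if want == "P" or kind == "P" or want == kind:
                bound[name] = value
                break
            if not optional:
                raise TypeError("type mismatch for parameter %s" % name)
    return bound


# Parameter list of emit() as declared with :optional in the message above.
EMIT = [
    ("code", "P", False), ("out", "S", False),
    ("str1", "S", True), ("str2", "S", True), ("str3", "S", True),
    ("int1", "I", True),
]
```

Under this model, emit(P, S, S, I) binds the int to int1, while emit(P, S, I, S) runs out of parameters once the int has skipped past the string group, which matches the "ok" and "err" cases listed at the start of the thread.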

Personally, I don't think that having Parrot do register type checking
among only the I, S, and N registers is all that useful to me.
(There's no type checking for P registers -- they always convert.)
And it's not immediately clear to me what advantage we're gaining by
automatically converting values to/from P registers but not the
others.

I think it would be much more useful -- and result in cleaner PIR
code -- if all parameters were positional, with Parrot automatically
converting values among the register types as needed, similar to
how Perl (5) handles arguments. If type checking of the register
arguments is needed, then perhaps that can be offered as a pragma,
trace option, errorson flag, or some other option that tells parrot
to not autoconvert the I/S/N registers but throw an exception instead.
Or perhaps we go the other way, and have a pragma on subs that says to
convert argument values positionally instead of tossing exceptions for
them.

I have no problem with doing whatever the Parrot design ultimately
requires; it's just that at the moment I don't seem to fully
understand the rationale or design goals of the latest convention.

Lastly, while on the topic of calling conventions, has there been
any thought given as to a standard convention for named argument passing
in Parrot subroutines? There are a number of places in PGE where
that would be really useful (and necessary...).

Pm

Leopold Toetsch

Jul 7, 2005, 6:43:58 AM

to Patrick R. Michaud, perl6-i...@perl.org

Patrick R. Michaud wrote:

> With the new PIR calling conventions, emit() is to be written
> with ":optional" on the optional parameters, like so:
>
> .sub "emit" method
> .param pmc code # accumulating code object
> .param string out # string to output
> .param string str1 :optional # first %s substitution
> .param string str2 :optional # second %s substitution
> .param string str3 :optional # third %s substitution
> .param int int1 :optional # first %d substitution
> # ...
>
> That's fine, but under the new calling conventions the original
> calls to emit() no longer work

To me it seems, however you declare the "emit" function (with mixed
optional native types), it either isn't following the principle of
strict positional argument passing, or it does the wrong thing due to
type conversions.

emit(P, S, S, I, S) # int1 passed out of order Exp.pir:534 /backref
emit(P, S, I, S) # a lot of code e.g. min, label
emit(P, S, S, S) # Exp.pir:701 /charmatch, label

The only sane way seems to be to define emit as

emit(code, out, ?str1, ?str2, ?str3, ?str4)

and do the substitutions just in order, i.e. treat %d and %s the same.
An integer argument will be converted to a string (which is good, as you
need one for the substr anyway).
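
The in-order substitution suggested here might look roughly like the following Python sketch. It is not PGE's emit(): the list used for the accumulated code and the regular-expression approach are assumptions made for illustration.

```python
import re


def emit(code, fmt, *subs):
    """Append fmt to code, replacing each %d or %s, in order, with the
    string form of the next substitution argument."""
    remaining = iter(subs)

    def replace(match):
        # %d and %s are treated the same: the value is simply stringified.
        return str(next(remaining))

    code.append(re.sub(r"%[ds]", replace, fmt))
    return code


code = []
emit(code, "if rep == %d goto %s", 3, "R1_next")
print(code[0])
```

Because every substitution is stringified, the caller can pass min and next in the same order as the placeholders, whatever their register types.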

> Pm

leo

Leopold Toetsch

Jul 7, 2005, 6:13:04 AM

to Patrick R. Michaud, perl6-i...@perl.org

Patrick R. Michaud wrote:

> Also, I've been working under the assumption that it's generally
> better (faster/more efficient) for frequently used integer and
> string values to be held and manipulated in I and S registers rather
> than PMCs. Perhaps this assumption is invalid.

The assumption is of course true.

> =head3 Type Conversions
>
> Unlike the C<set_*> opcodes, the C<get_*> opcodes must perform
> conversion from one register type to another. Here are the conversion
> rules:
>
> * When the target is an I, N, or S register, storage will
> behave like an C<assign> (standard conversion).
>
> However, according to Leo's journal, the I, N, and S registers
> do not convert among themselves but instead perform type checking
> and throw an exception on mismatches. (cf. "Type checking"
> at http://use.perl.org/~leo/journal/25491.)

I've just misinterpreted this para and thought that we convert only
to/from PMCs but not between native types. I'll fix that.

> I think it would be much more useful -- and result in cleaner PIR
> code -- if all parameters were positional, with Parrot automatically
> converting values among the register types as needed, similar to
> how Perl (5) handles arguments.

Thanks for the long explanation, showing the troubles of the current
state. The problem mostly arises from the switch of the old scheme to
the new one of course. I found code in libraries that had e.g. call
arguments swapped:

foo(I, S) # .sub foo (S, I)

This worked fine in the old scheme but is of course at least strange, if
you compare it to other programming languages.

So on the way to first fix existing code and tests, type checking isn't
that bad, but it is still wrong in the long run.

The only sane way to go is strictly positional, with conversions and
maybe optional warnings or errors.
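
A minimal Python sketch of "strictly positional, with conversions and optional errors" follows. It only models the idea: bind_positional and the strict flag are hypothetical, and Python's int(), float(), and str() stand in for Parrot's own conversion rules, which may differ (for example in how a non-numeric string converts to an integer).

```python
# Toy model only: the kinds I, N, S, P stand for Parrot's register types.
CONVERT = {"I": int, "N": float, "S": str, "P": lambda value: value}


def bind_positional(params, args, strict=False):
    """params is a list of (name, kind). Argument i always goes to
    parameter i and is converted to that parameter's kind. With
    strict=True, a needed conversion raises an error instead."""
    if len(args) > len(params):
        raise TypeError("too many arguments")
    bound = {}
    for (name, kind), value in zip(params, args):
        converted = CONVERT[kind](value)
        if strict and kind != "P" and type(converted) is not type(value):
            raise TypeError("argument for %s needs conversion to %s" % (name, kind))
        bound[name] = converted
    return bound


params = [("label", "S"), ("count", "I")]
print(bind_positional(params, [42, "7"]))
```

With strict left off, the int 42 lands in the string parameter as "42"; with strict=True the same call raises, which corresponds to the optional warning or error mode.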

> Lastly, while on the topic of calling conventions, has there been
> any thought given as to a standard convention for named argument passing
> in Parrot subroutines? There are a number of places in PGE where
> that would be really useful (and necessary...).

The plan is to support named arguments. There is already a Pair PMC in
the tree for that.

> Pm

leo

Patrick R. Michaud

Jul 7, 2005, 9:28:02 AM

to Leopold Toetsch, perl6-i...@perl.org

On Thu, Jul 07, 2005 at 12:13:04PM +0200, Leopold Toetsch wrote:
> I've just misinterpreted this para and thought that we convert only
> to/from PMCs but not between native types. I'll fix that.

That will be *very* nice.

> >Lastly, while on the topic of calling conventions, has there been
> >any thought given as to a standard convention for named argument passing
> >in Parrot subroutines? There are a number of places in PGE where
> >that would be really useful (and necessary...).
>
> The plan is to support named arguments. There is already a Pair PMC in
> the tree for that.

I definitely need to look into that. Thanks for the clarifications.

Pm

Patrick R. Michaud

Jul 7, 2005, 9:04:10 AM

to Leopold Toetsch, perl6-i...@perl.org

On Thu, Jul 07, 2005 at 12:43:58PM +0200, Leopold Toetsch wrote:
> To me it seems, however you declare the "emit" function (with mixed
> optional native types), it either isn't following the principle of
> strict positional argument passing, or it does the wrong thing due to
> type conversions.

> ...

> The only sane way seems to be to define emit as
>
> emit(code, out, ?str1, ?str2, ?str3, ?str4)

I agree entirely, this is exactly what I'm planning (and was
hoping) to be able to do. I'm still working out how I want
to handle the "docut" parameter in emitsub, but it will
probably become a string-register flag instead of an int.

Pm

Jonathan Worthington

Jul 14, 2005, 7:21:22 PM

to Leopold Toetsch, Perl 6 Internals

"Leopold Toetsch" <l...@toetsch.at> wrote:
>
> I'd be glad if folks can go through the failing tests (one exception test
> hangs) and change old PASM call syntax to new.
>

It's a drop in the ocean, but the attached patch fixes one that's been missed.

Here's the test output on Win32 with MSVC++ for the branch (with my patch).

Failed Test                   Status Wstat Total Fail  Failed  List of Failed
--------------------------------------------------------------------------------
t\dynclass\foo.t                   8  2048     9    8  88.89%  1-5, 7-9
t\library\yaml_parser_syck.t                   1    1 100.00%  1
t\op\gc.t                          1   256    19    1   5.26%  13
t\op\spawnw.t                      3   768     6    3  50.00%  4-6
t\perl\Parrot_IO.t                 8  2048    55    8  14.55%  23-29, 45
t\pmc\io.t                                    31    1   3.23%  1
t\pmc\mmd.t                        6  1536    30    6  20.00%  20-22, 24-25, 28
t\pmc\namespace.t                  1   256    15    1   6.67%  15
t\pmc\object-meths.t               1   256    28    1   3.57%  11
t\pmc\objects.t                    5  1280    62    7  11.29%  29-31, 53, 55-57
t\pmc\pair.t                       1   256     3    1  33.33%  3
t\pmc\perlnum.t                               55    1   1.82%  55
t\pmc\pmc.t                        1   256    24    1   4.17%  5
t\pmc\ref.t                                   12    1   8.33%  6
t\pmc\threads.t                    7  1792    11    7  63.64%  2-5, 7-9
t\src\compiler.t                   1   256     1    1 100.00%  1
t\src\manifest.t                   1   256     5    1  20.00%  3
4 tests and 81 subtests skipped.
Failed 17/157 test scripts, 89.17% okay. 50/2658 subtests failed, 98.12% okay.
NMAKE : fatal error U1077: 'c:\Perl\bin\perl.exe' : return code '0x2'
Stop.

However, a lot of tests are failing because of this:-

# Can't spawn ".\parrot.exe --gc-debug
"D:\Jonathan\hacking\parrot\branches\leo-ctx5\t\op\gc_13.pir"": Bad file
descriptor at lib/Parrot/Test.pm line 238.

I've not been able to track down why that's happening yet.

Thanks,

Jonathan

extend.diff