[PROPOSAL] infix MMD operators

Leopold Toetsch

Mar 27, 2005, 9:37:41 AM
to Perl 6 Internals
1) Mixed infix operations

Opcodes that take one PMC and one native type argument, like:

op add(in PMC, in PMC, in INT)

should probably become plain vtable methods again. There isn't much
"multi" in the dispatch as the dispatch degenerates to a search in the
class of the left argument's MRO. OTOH this may have some implications
as Perl6 is treating these as normal multi subs. Maybe we need just a
fake "int" class for the sake of MMD.

2) all vtable and MMD functions should be NCI methods

The current static lookup in the MMD_table doesn't work correctly,
especially when methods are overloaded. While the C<mmdvtregister>
opcode can be used to install a new method for a class pair, it is not
usable to get the inheritance right. If e.g. a method is inserted
dynamically in the middle of a class's MRO, all possible classes would
need adjustment. I don't see a way to do this correctly.

It's much simpler to just do a dynamic lookup once and cache the result
inside the PIC (polymorphic inline cache) [1].
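
A minimal Python sketch of the "look up once, then cache" idea above. The Dispatcher class, its cache dictionary (standing in for the PIC) and the Integer/PyInt classes are hypothetical names for illustration, not Parrot APIs, and the search order is a simplification of real MMD distance rules:

```python
# Hypothetical model of "look up once, then cache"; not Parrot's real API.
class Integer:
    def __init__(self, value):
        self.value = value


class PyInt(Integer):
    pass


class Dispatcher:
    def __init__(self):
        self.table = {}    # (name, left class, right class) -> function
        self.cache = {}    # stands in for the PIC
        self.lookups = 0   # number of full searches performed

    def register(self, name, left, right, fn):
        self.table[(name, left, right)] = fn
        self.cache.clear()  # a new method may change any cached answer

    def find(self, name, left, right):
        key = (name, type(left), type(right))
        fn = self.cache.get(key)
        if fn is None:
            self.lookups += 1
            fn = self._search(*key)
            self.cache[key] = fn
        return fn

    def _search(self, name, lcls, rcls):
        # Simplified: walk the left MRO first, then the right one.
        for l in lcls.__mro__:
            for r in rcls.__mro__:
                fn = self.table.get((name, l, r))
                if fn is not None:
                    return fn
        raise TypeError("no candidate for " + name)


mmd = Dispatcher()
mmd.register("__add", Integer, Integer,
             lambda l, r: Integer(l.value + r.value))

a, b = PyInt(2), PyInt(3)
first = mmd.find("__add", a, b)(a, b)    # full search, result cached
second = mmd.find("__add", a, b)(a, b)   # answered from the cache
searches_before_override = mmd.lookups

# Inserting a more specific method later only has to drop the cache.
mmd.register("__add", PyInt, PyInt,
             lambda l, r: PyInt(l.value + r.value))
third = mmd.find("__add", a, b)(a, b)
```

In this sketch a late registration only clears the cache; no per-class table needs adjusting, which is the property the static MMD_table lacks.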

3) tcl and py classes should inherit all common functions

Currently too much code is just cut&pasted. Removing these duplicates
also simplifies 5)

4) internal opcode change (PMC ops only)

The PASM/PIR compilers convert the user visible opcode

d = l + r # or add d, l, r

to a new opcode:

infix "__add", d, l, r

Rationale: all these infix opcodes with PMCs do the same thing: locate
the method and either call the C function (inside the NCI method) or run
a piece of user code. We therefore need just one opcode for all these
methods.

This also allows easy extensions like:

infix "__hyper_add", d, l, r

5) infix method signature change:

METHOD PMC* add( [INTERP, SELF,] PMC* rhs, PMC* dest) {
    if (!dest)
        dest = pmc_new(INTERP, SELF->vtable->base_type);
    ...
    return dest;
}

If the destination PMC is passed in, it's used else a new PMC of an
appropriate type is created.
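
The "use dest if given, else create one" rule can be modelled in a few lines of Python. This is only a sketch: the Integer class and its value field are assumptions made for illustration, and type(self)() stands in for pmc_new(INTERP, SELF->vtable->base_type):

```python
class Integer:
    def __init__(self, value=0):
        self.value = value

    def add(self, rhs, dest=None):
        # Mirrors the proposed signature: reuse dest when it is passed
        # in, otherwise create a new object of the left operand's type.
        if dest is None:
            dest = type(self)()
        dest.value = self.value + rhs.value
        return dest


l, r = Integer(2), Integer(3)
existing = Integer()
reused = l.add(r, existing)   # like "add d, l, r" with an existing d
fresh = l.add(r)              # like "n_add d, l, r"
```

One method body serves both the reuse case and the create-new case, so only the caller decides which behaviour it gets.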

We need this basically for 4 reasons:

a) Python
All Python scalars are immutable. The current scheme to morph the
destination PMC to the type of the result is basically wrong. All Python
operations on scalars always return a new result PMC.

b) operations with temporary PMCs

$P0 = a + b

the current canonical sequence is:

$P0 = new Undef
$P0 = a + b

This takes two opcodes, and the morphing is a slow and bulky operation;
see src/pmc.c:pmc_reuse().

c) overridden infix methods return a new PMC, there is no destination to
pass in.

multi infix<+>(Int $l, Int $r) { ... return $d; }

or

def myadd(self, r):
    return ..
class I(object):
    __add__ = myadd

d) singleton return values. You can't morph a destination to a
singleton as this would require changing the PMC's address.

6) new user-visible opcodes that create new destination PMCs:

op n_add(out PMC, in PMC, in PMC)
^^^^^^^
or

d = n+ l, r # or some such

Within the pragma ".use n_operators" the plain:

d = l + r

also translates to "n_add"

The internal opcode is:

infix_n "__add", d, l, r

7) separate inplace methods

Opcodes like:

d += r # add d, r

are currently using the normal add method with the destination set to
SELF. This is suboptimal, especially when the destination PMC is
morphed to a different type (e.g. due to bigint promotion), which
destroys the value of SELF.

It's just cleaner to have distinct inplace methods. They are very likely
needed anyway, as method overloading would not work if the inplace
operations are the same.

Therefore we have:

infix "__i_add", d, r

and in *.pmc

METHOD void i_add( [INTERP, SELF, ] PMC* r) {...}
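
A rough Python model of why a distinct inplace method helps with overloading. Integer, LoggedInt and the inplace_calls counter are hypothetical names; the point is only that "d += r" and "d = l + r" can be overridden independently once they are separate methods:

```python
class Integer:
    def __init__(self, value=0):
        self.value = value

    def add(self, rhs, dest=None):
        if dest is None:
            dest = type(self)()
        dest.value = self.value + rhs.value
        return dest

    def i_add(self, rhs):
        # In-place form: no destination argument, SELF is updated.
        self.value += rhs.value


class LoggedInt(Integer):
    """Hypothetical subclass that overrides only the in-place form."""

    def __init__(self, value=0):
        super().__init__(value)
        self.inplace_calls = 0

    def i_add(self, rhs):
        self.inplace_calls += 1
        super().i_add(rhs)


d = LoggedInt(1)
same = d
d.i_add(Integer(2))        # "d += r": d keeps its identity
s = d.add(Integer(4))      # "s = d + r": new object, override not involved
```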


[1] See also:

Subject: "PIC for more MOPS but not only"
Subject: "PIC again"

Comments welcome,
leo

Matt Fowles

Mar 27, 2005, 9:53:53 PM
to Leopold Toetsch, Perl 6 Internals
Leo~


On Sun, 27 Mar 2005 16:37:41 +0200, Leopold Toetsch <l...@toetsch.at> wrote:
> 5) infix method signature change:
>
> METHOD PMC* add( [INTERP, SELF,] PMC* rhs, PMC* dest) {
> if (!dest)
> dest = pmc_new(INTERP, SELF->vtable->base_type);
> ...
> return dest;
> }
>
> If the destination PMC is passed in, it's used else a new PMC of an
> appropriate type is created.

I would actually appreciate a refresher on the original motivation
behind never autogenerating a LHS. I recall being told it has
something to do with tied data, but I was never clear on exactly how
that related. I would think that tied data would only require
VTABLE_assign method, and would not care how its RHS was created (via
an add or mul or whatever).

Thus I would argue for having most operators create their result (but
having a special assign that would call a VTABLE method) and forcing
languages with active data to go through a two step assignment

$P0 = $P1 + $P2 # P0 created
$P3 <- $P0 # P3 gets to run its tied code

and languages like python which have immutable scalars could always use

$P0 = $P1 + $P2 # P0 created


One concern that occurs to me is that this would cause more new PMC
allocations. But I am not sure whether or not that is true.

> 7) separate inplace methods
>
> Opcodes like:
>
> d += r # add d, r
>
> are currently using the normal add method with the destination set to
> SELF. This is suboptimal, especially, when the destination PMC is
> morphed to a different type (e.g. due to bigint promotion) which
> destroys the value of SELF.
>
> It's just cleaner to have distinct inplace methods, it's very likely
> also needed anyway, as method overloading would not work if the inplace
> operations are the same.
>
> Therefore we have:
>
> infix "__i_add", d, r
>
> and in *.pmc
>
> METHOD void i_add( [INTERP, SELF, ] PMC* r) {...}
>

I think this one is very necessary.

Matt
--
"Computer Science is merely the post-Turing Decline of Formal Systems Theory."
-???

Bob Rogers

Mar 27, 2005, 10:17:14 PM
to Leopold Toetsch, Perl 6 Internals
From: Leopold Toetsch <l...@toetsch.at>
Date: Sun, 27 Mar 2005 16:37:41 +0200

1) Mixed infix operations

Opcodes that take one PMC and one native type argument, like:

op add(in PMC, in PMC, in INT)

should probably become plain vtable methods again. There isn't much
"multi" in the dispatch as the dispatch degenerates to a search in the
class of the left argument's MRO. OTOH this may have some implications
as Perl6 is treating these as normal multi subs.

IMHO, one can have too much overloading. It seems cleaner to
distinguish between "+, the (sometimes overloaded) HLL operator" and
"add, the Parrot addition operator" so that compiled code can opt out of
the overloading when the compiler knows that it really needs to do mere
addition.

Maybe we need just a fake "int" class for the sake of MMD.

I think this is a really good idea. Indeed, I think it's hard to do
otherwise; you need some place to store the "isa" relationships between
PMC classes for primitive types. Beyond that, I think it's good to
minimize the distinction between PMC classes and ParrotClass classes
from the perspective of HLLs; implementors are then freer to use a
ParrotClass to start, and reimplement as PMCs later for speed.

BTW, I just noticed that "pmclass ParrotObject extends ParrotClass
..."; is this right? Seems like the class and instance hierarchies
ought to be disjoint . . .

5) infix method signature change . . .

We need this basically for 4 reasons:

a) Python
All Python scalars are immutable. The current scheme to morph the
destination PMC to the type of the result is basically wrong. All Python
operations on scalars always return a new result PMC.

This is also how Common Lisp prefers to view the world.

. . .

7) separate inplace methods

Opcodes like:

d += r # add d, r

are currently using the normal add method with the destination set to
SELF. This is suboptimal, especially, when the destination PMC is
morphed to a different type (e.g. due to bigint promotion) which
destroys the value of SELF.

It's just cleaner to have distinct inplace methods, it's very likely
also needed anyway, as method overloading would not work if the inplace
operations are the same.

It also allows "$a = $a + $b" to have different semantics from
"$a += $b" for suitably defined operators and values. Not that I can
think of a reasonable example off the top of my head . . .

-- Bob Rogers
http://rgrjr.dyndns.org/

Leopold Toetsch

Mar 28, 2005, 3:53:46 AM
to Matt Fowles, perl6-i...@perl.org
Matt Fowles <uber...@gmail.com> wrote:
> Leo~

> On Sun, 27 Mar 2005 16:37:41 +0200, Leopold Toetsch <l...@toetsch.at> wrote:
>> 5) infix method signature change:
>>
>> METHOD PMC* add( [INTERP, SELF,] PMC* rhs, PMC* dest) {
>> if (!dest)
>> dest = pmc_new(INTERP, SELF->vtable->base_type);
>> ...
>> return dest;
>> }
>>
>> If the destination PMC is passed in, it's used else a new PMC of an
>> appropriate type is created.

> I would actually appreciate a refresher on the original motivation
> behind never autogenerating a LHS. I recall being told it has
> something to do with tied data, but I was never clear on exactly how
> that related.

I'd say: tied variables or references to variables:

$r = \$a;
$a = $a + 2;

If the add operation produced a new destination, the reference is lost.
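
The lost reference can be seen in a small Python analogy. It is only an analogy: Python names are not Perl references, and the Scalar class is a hypothetical stand-in for a PMC:

```python
class Scalar:
    def __init__(self, value):
        self.value = value


a = Scalar(1)
r = a                      # plays the role of "$r = \$a"

# In-place update: the object behind "a" changes, so r sees it too.
a.value = a.value + 2
seen_through_r = r.value

# New destination: "a" now names a different object, r keeps the old one.
a = Scalar(a.value + 2)
after_rebind = (a.value, r.value)
```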

> ... I would think that tied data would only require
> VTABLE_assign method, and would not care how its RHS was created (via
> an add or mul or whatever).

You can write it in both ways:

temp = a + 2
assign a, temp

or

a = a + 2 # current semantics, a modified in place

or with lexicals/globals:

a = find_lex "a" # or a = global "a"
a = a + 2

The final operation to assign the value to the perl var has IMHO always
to reuse the existing var.

> Thus I would argue for having most operators create their result (but
> having a special assign that would call a VTABLE method) and forcing
> languages with active data to go through a two step assignment

> $P0 = $P1 + $P2 # P0 created
> $P3 <- $P0 # P3 gets to run its tied code

We could do it that way too. OTOH it always needs two opcodes to
achieve the current effect of working with existing destination PMCs.

> and languages like python which have immutable scalars could always use

> $P0 = $P1 + $P2 # P0 created

As said, we can create either "__add" or "__n_add" depending on some
pragma or depending on the HLL.

> One concern that occurs to me is that this would cause more new PMC
> allocations. But I am not sure whether or not that is true.

The C<temp> above isn't needed, so yes.

>> infix "__i_add", d, r

> I think this one is very necessary.

Yep

> Matt

leo

Leopold Toetsch

Mar 28, 2005, 4:27:58 AM
to Bob Rogers, perl6-i...@perl.org
Bob Rogers <rogers...@rgrjr.dyndns.org> wrote:

> IMHO, one can have too much overloading. It seems cleaner to
> distinguish between "+, the (sometimes overloaded) HLL operator" and
> "add, the Parrot addition operator" so that compiled code can opt out of
> the overloading when the compiler knows that it really needs to do mere
> addition.

The compiler doesn't and can't know that. The only way to be sure that
"add" isn't overloaded is in HLLs that have a notion of "closed" or
"finalized" classes. Python doesn't have such a construct. Therefore the
plan is to do a method lookup once (MMD search per default) and then
cache the result.

> Maybe we need just a fake "int" class for the sake of MMD.

> I think this is a really good idea. Indeed, I think it's hard to do
> otherwise; you need some place to store the "isa" relationships between
> PMC classes for primitive types.

It's probably not so much the "isa" relationship, but this bothers me:

infix<+>(int $l, Int $r) { ... }

has to look into the class of "$l" for MMD candidates.

> ... Beyond that, I think it's good to
> minimize the distinction between PMC classes and ParrotClass classes
> from the perspective of HLLs; implementors are then freer to use a
> ParrotClass to start, and reimplement as PMCs later for speed.

Yes. The distinction is already vanishing. E.g. both PMCs and
ParrotClasses now have the MRO field so that a method lookup is the same
for both cases.

> BTW, I just noticed that "pmclass ParrotObject extends ParrotClass
> ..."; is this right? Seems like the class and instance hierarchies
> ought to be disjoint . . .

I don't know, but probably yes.

> This is also how Common Lisp prefers to view the world.

Probably almost all HLLs except Perl.

> It also allows "$a = $a + $b" to have different semantics from
> "$a += $b" for suitably defined operators and values. Not that I can
> think of a reasonable example off the top of my head . . .

Yep.

> -- Bob Rogers

leo

Leopold Toetsch

Mar 28, 2005, 5:22:29 AM
to perl6-i...@perl.org
Leopold Toetsch <l...@toetsch.at> wrote:

> 5) infix method signature change:

> METHOD PMC* add( [INTERP, SELF,] PMC* rhs, PMC* dest) {
> if (!dest)
> dest = pmc_new(INTERP, SELF->vtable->base_type);
> ...
> return dest;
> }

> If the destination PMC is passed in, it's used else a new PMC of an
> appropriate type is created.

The same scheme should of course be introduced with all the unary prefix
and postfix operators:

METHOD PMC* absolute( [INTERP, SELF,] PMC* dest) {
    if (PMC_IS_NULL(dest))
        dest = pmc_new(INTERP, SELF->vtable->base_type);
    VTABLE_set_integer_native(INTERP, dest, abs(PMC_int_val(SELF)));
    return dest;
}

and

op abs(out PMC, in PMC)
op n_abs(out PMC, in PMC)

aka

unary "__absolute", Px, Py
unary "__n_absolute", Px, Py

which is the same as:

Px = Py."__absolute"(Px)
Px = Py."__absolute"()

leo

Matt Fowles

Mar 28, 2005, 9:02:40 AM
to l...@toetsch.at, perl6-i...@perl.org
Leo~


On Mon, 28 Mar 2005 12:22:29 +0200, Leopold Toetsch <l...@toetsch.at> wrote:
> Leopold Toetsch <l...@toetsch.at> wrote:
>
> > 5) infix method signature change:
>
> > METHOD PMC* add( [INTERP, SELF,] PMC* rhs, PMC* dest) {
> > if (!dest)
> > dest = pmc_new(INTERP, SELF->vtable->base_type);
> > ...
> > return dest;
> > }
>
> > If the destination PMC is passed in, it's used else a new PMC of an
> > appropriate type is created.
>
> The same scheme should of course be introduced with all the unary prefix
> and postfix operators:
>
> METHOD PMC* absolute( [INTERP, SELF,] PMC* dest) {
> if (PMC_IS_NULL(dest))
> dest = pmc_new(INTERP, SELF->vtable->base_type);
> VTABLE_set_integer_native(INTERP, dest, abs(PMC_int_val(SELF)));
> return dest;
> }
>
> and
>
> op abs(out PMC, in PMC)
> op n_abs(out PMC, in PMC)

Why bother with the IS_NULL check if we have the "n_" variant already?
Why not have one option unconditionally use the destination pmc and
the other unconditionally create a new destination pmc?

Leopold Toetsch

Mar 29, 2005, 2:37:36 AM
to Matt Fowles, perl6-i...@perl.org
Matt Fowles <uber...@gmail.com> wrote:
> Leo~

> Why bother with the IS_NULL check if we have the "n_" variant already?
> Why not have one option unconditionally use the destination pmc and
> the other unconditionally create a new destination pmc?

I think we can just have one method with the same functionality. While
it would work for the builtin one, the problem arises with overloading.
For more complicated methods like "add", a lot of code duplication
is also avoided.

But overloading is still a problem anyway. When we have perl semantics:

$a = $b + $c;

the best way to translate it to PIR is probably:

a = new PerlUndef # at scope start
...
a = b + c # current op, modifying "a" in place

But when infix<+> is overloaded the internally executed code has to be:

temp = "__add"(b, c) # multi sub "__add" returns new val
assign a, temp

to achieve the same semantics like in the non-overloaded case.
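
The two-step sequence for the overloaded case can be modelled roughly in Python. Scalar, assign and user_add are hypothetical names chosen for this sketch; user_add stands for a user-defined multi sub that can only return a new value:

```python
class Scalar:
    def __init__(self, value=None):
        self.value = value

    def assign(self, other):
        # Copy the value into the existing variable; identity is kept.
        self.value = other.value


def user_add(b, c):
    # Hypothetical overloaded infix<+>: it can only return a new value.
    return Scalar(b.value + c.value)


a = Scalar()            # "a = new PerlUndef" at scope start
alias = a               # anything else that refers to the same variable
b, c = Scalar(2), Scalar(3)

temp = user_add(b, c)   # temp = "__add"(b, c)
a.assign(temp)          # assign a, temp
```

After the assign step, everything that referred to the original "a" still sees the result, which matches what the in-place opcode gives in the non-overloaded case.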

> Matt

leo

Bob Rogers

Mar 29, 2005, 8:19:17 PM
to l...@toetsch.at, perl6-i...@perl.org
From: Leopold Toetsch <l...@toetsch.at>
Date: Mon, 28 Mar 2005 11:27:58 +0200

Bob Rogers <rogers...@rgrjr.dyndns.org> wrote:

> IMHO, one can have too much overloading. It seems cleaner to
> distinguish between "+, the (sometimes overloaded) HLL operator" and
> "add, the Parrot addition operator" so that compiled code can opt out of
> the overloading when the compiler knows that it really needs to do mere
> addition.

The compiler doesn't and can't know that.

Seems to me that depends on the compiler . . .

The only way to be sure that "add" isn't overloaded is in HLLs that
have a notion of "closed" or "finalized" classes. Python doesn't have
such a construct. Therefore the plan is to do a method lookup once
(MMD search per default) and then cache the result.

I guess I was hoping for access to a lower-level mechanism. FWIW,
Common Lisp is an example of a dynamic HLL that doesn't allow certain
ops to be overloaded (at least not directly). But the existing "add" is
already too generic, so I'd have to fake this with explicit type checks
anyway. Oh, well.

> Maybe we need just a fake "int" class for the sake of MMD.

> I think this is a really good idea. Indeed, I think it's hard to do
> otherwise; you need some place to store the "isa" relationships between
> PMC classes for primitive types.

It's probably not so much the "isa" relationship, but this bothers me:

infix<+>(int $l, Int $r) { ... }

has to look into the class of "$l" for MMD candidates.

I'm afraid I don't understand; how is "int" different from "Int"? And
why would one need both?

> ... Beyond that, I think it's good to
> minimize the distinction between PMC classes and ParrotClass classes
> from the perspective of HLLs; implementors are then freer to use a
> ParrotClass to start, and reimplement as PMCs later for speed.

Yes. The distinction is already vanishing. E.g. both PMCs and
ParrotClasses now have the MRO field so that a method lookup is the same
for both cases.

Great; glad to hear it.

Leopold Toetsch

Mar 30, 2005, 1:57:55 AM
to Bob Rogers, perl6-i...@perl.org
Bob Rogers <rogers...@rgrjr.dyndns.org> wrote:

> I guess I was hoping for access to a lower-level mechanism. FWIW,
> Common Lisp is an example of a dynamic HLL that doesn't allow certain
> ops to be overloaded (at least not directly).

Overloading is a syntactic construct that the compiler supports (or
not). It's more or less explicit, though:

Perl6: infix<+>(...) {...}
Python: __add__ = myadd or even c.__dict__.update(myops)

Both compilers have to create a multi sub or a method called "__add" in
Parrot and store the sub or method in an appropriate namespace either
via C<store_lexical>, C<store_global>, or C<add_method>. If your compiler
doesn't allow overloading of some core functionality, it ought to emit an
error and not install the overloaded function.

Parrot's builtin "__add" is still there, probably as:

ns = interpinfo .INTERP_INFO_ROOT_NAMESPACE
m = ns["\0__parrot_core"; "__add"]
a = m(b, c)

or

iclass = getclass "Integer"
a = iclass."__add"(b, c)

or similar.

But given the dynamic nature of our target languages, I don't see any way
to keep a "primitive" add_p_p_p opcode, which is BTW not more efficient
or faster.

> already too generic, so I'd have to fake this with explicit type checks
> anyway. Oh, well.

Why? When Perl6 overloads e.g. infix<+>(Int, Int) it's overloading the
"__add" multi sub of the Perl6 class PerlInt. No Python or Lisp "__add"
method is involved here, nor Parrot's Integer.

> infix<+>(int $l, Int $r) { ... }

> has to look into the class of "$l" for MMD candidates.

> I'm afraid I don't understand; how is "int" different from "Int"? And
> why would one need both?

The lowercased "classes" denote natural C types in Perl6. Above
corresponds to an opcode:

op add(out PMC, in INT, in PMC)
add_p_i_p

which we don't have BTW. The "_i" stands for the native C type INTVAL.

leo

Bob Rogers

Mar 30, 2005, 11:16:03 PM
to l...@toetsch.at, perl6-i...@perl.org
From: Leopold Toetsch <l...@toetsch.at>
Date: Wed, 30 Mar 2005 08:57:55 +0200

Bob Rogers <rogers...@rgrjr.dyndns.org> wrote:

> I guess I was hoping for access to a lower-level mechanism. FWIW,
> Common Lisp is an example of a dynamic HLL that doesn't allow certain
> ops to be overloaded (at least not directly).

Overloading is a syntactic construct that the compiler supports (or
not). It's more or less explicit, though . . .

Thanks for taking the time to explain this; I appreciate all the bits.
I confess that I didn't understand this part the first time I read it;
the key, I think, is that you are assuming (as I did not) that each and
every language built on top of Parrot will define its own PMC classes,
even for primitive arithmetic types, when necessary to get the right MMD
operator semantics. Is this correct?

> already too generic, so I'd have to fake this with explicit type checks
> anyway. Oh, well.

Why? When Perl6 overloads e.g. infix<+>(Int, Int) it's overloading the
"__add" multi sub of the Perl6 class PerlInt. No Python or Lisp "__add"
method is involved here, nor Parrot's Integer.

One of my motivations for exploring Common Lisp in Parrot is summed up
in four letters: CPAN. So I expect to want to call Perl[56] code and
get back Perl data types for further mungery in Lisp (if I ever get that
far). It might be the right thing in that case to accept the additional
Perl semantics, but I'm not completely convinced. I tend to think of
operator semantics as being "language scoped," but perhaps this is just
personal bias, as I can't think of a really compelling reason why it
should be so. It may be just that I am not used to thinking in terms of
interactions between multiple languages.

But I am also concerned that a proliferation of low-level PMC types
will be an obstacle to interoperability. It seems simpler to me to
represent each distinct kind of mathematical object as a single PMC
class (e.g. for integers and complex numbers, but emphatically not
strings), and use different operator implementations to capture the
semantics of each language. But different languages will differ on what
should be distinct, and having language-dependent operators seems kinda
ugly, so it's hard to make a strong case for this. Besides, it would be
a major change from Parrot's current vtable-oriented design.

On the other hand, the new MMD arithmetic moves more of the semantics
away from the individual PMC class vtable methods and into the MMD
operator, so maybe it's not so far-fetched.

But all of this may just be a sign that I'm not yet sufficiently well
indoctrinated into Parrot culture to "get it." I'm still trying,
though. ;-}

> infix<+>(int $l, Int $r) { ... }

> has to look into the class of "$l" for MMD candidates.

> I'm afraid I don't understand; how is "int" different from "Int"? And
> why would one need both?

The lowercased "classes" denote natural C types in Perl6. Above
corresponds to an opcode:

op add(out PMC, in INT, in PMC)
add_p_i_p

which we don't have BTW. The "_i" stands for the native C type INTVAL.

leo

Then I would argue that whether something is in a register or not should
have no effect on MMD semantics. Suppose you pick some suitable PMC
class (e.g. Int and Float) which is representationally identical (or at
least you can go from PMC to register and back without loss), and define
that as the assumed class of all such registers? Seems to me that doing
anything else is likely to cause confusion without providing a useful
distinction. You wouldn't want the result of MMD dispatch to change
because the compiler got cleverer about assigning things to registers,
would you?

Leopold Toetsch

Mar 31, 2005, 2:38:15 AM
to Bob Rogers, perl6-i...@perl.org
Bob Rogers <rogers...@rgrjr.dyndns.org> wrote:
> From: Leopold Toetsch <l...@toetsch.at>
> Date: Wed, 30 Mar 2005 08:57:55 +0200

>
> ... , is that you are assuming (as I did not) that each and
> every language built on top of Parrot will define its own PMC classes,
> even for primitive arithmetic types, when necessary to get the right MMD
> operator semantics. Is this correct?

Not necessarily. I'd like to have core scalar types with the most common
functionality, which - e.g. for Integer PMCs - seems to be an arbitrary
integer type with automatic promotion to bigint. If your language is
happy with that type and all semantics are the same, there is no need to
subclass the PMC.

But normally all these dynamic languages have some kind of
introspection. E.g.

$ python
>>> 1+2j
(1+2j)

The string representation of a different language could be "1+2i".
Not to speak about different semantics like using "+" for string
concatenation.

> One of my motivations for exploring Common Lisp in Parrot is summed up
> in four letters: CPAN. So I expect to want to call Perl[56] code and
> get back Perl data types for further mungery in Lisp (if I ever get that
> far). It might be the right thing in that case to accept the additional
> Perl semantics, but I'm not completely convinced. I tend to think of
> operator semantics as being "language scoped," but perhaps this is just
> personal bias, as I can't think of a really compelling reason why it
> should be so.

I don't know. But Perl scalar operator semantics seems to be to morph the
destination PMC to the value of the operation:

.language perl
a = Undef
a = b + c # a may be promoted to bigint

Python semantics are that scalar operators always create a new result.
Python scalars including strings are immutable.

.language python
a = b + c # new "a" value created

The question now is, what happens, when you call a Perl module that
happens to return the scalar "b":

.language lisp
a = b + c

Lisp works here AFAIK like Python, i.e. a new destination should be
created. So it seems that under the pragma "lisp" or "python" this
should translate to:

n_add a, b, c # not existing op that creates new "a"

and within perl semantics

add a, b, c # reuse "a"

> But I am also concerned that a proliferation of low-level PMC types
> will be an obstacle to interoperability. It seems simpler to me to
> represent each distinct kind of mathematical object as a single PMC
> class

I don't get that.

> ... You wouldn't want the result of MMD dispatch to change
> because the compiler got cleverer about assigning things to registers,
> would you?

Parrot itself doesn't do this optimization. Only the HLL compiler can
emit such code, if the language provides a type system. Perl6 allows:

my int ($i, $j, $k); # use natural integer
$i = $j + $k # I0 = I1 + I2

This integer $i can't promote to a bigint for example.

The more important case is:

my int @ints; # array of plain integers

There are a few rare cases, where the HLL to parrot translator could
optimize to natural types though:

for i in xrange(1, 10):
    print i

Python's xrange produces only natural ints.

> -- Bob Rogers

leo

Bob Rogers

Mar 31, 2005, 11:02:13 PM
to l...@toetsch.at, perl6-i...@perl.org
From: Leopold Toetsch <l...@toetsch.at>
Date: Thu, 31 Mar 2005 09:38:15 +0200

Bob Rogers <rogers...@rgrjr.dyndns.org> wrote:
> From: Leopold Toetsch <l...@toetsch.at>
> Date: Wed, 30 Mar 2005 08:57:55 +0200

>
> ... , is that you are assuming (as I did not) that each and
> every language built on top of Parrot will define its own PMC classes,
> even for primitive arithmetic types, when necessary to get the right MMD
> operator semantics. Is this correct?

Not necessarily. I'd like to have core scalar types with the most common
functionality, which - e.g. for Integer PMCs - seems to be an arbitrary
integer type with automatic promotion to bigint. If your language is
happy with that type and all semantics are the same, there is no need to
subclass the PMC.

That sounds good, but I'm not sure that can be used in all situations.
After all, the notion of "all semantics" is much broader with MMD. For
instance, even if Int+Int=>(Int or Bigint) was what was wanted, there
would then be no way to disallow Int+String, since I assume an
Int+String=>Number method will exist that coerces the string to a number
first. I imagine you could do this by creating a subclass of Int,
e.g. LispInt, and then defining a LispInt+String method that throws an
error. But even then, you still lose Lisp semantics if calls into other
code return PerlInt's, or whatever.
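
A small Python sketch of this situation, assuming a simple left-first MRO search rather than Parrot's actual MMD rules. All class and function names here (Int, LispInt, PerlInt, String, infix_add) are hypothetical:

```python
class Int:
    def __init__(self, value):
        self.value = value


class String:
    def __init__(self, text):
        self.text = text


class LispInt(Int):
    pass


class PerlInt(Int):
    pass


def add_int_string(l, r):
    # The assumed generic method: coerce the string to a number first.
    return Int(l.value + int(r.text))


def add_lispint_string(l, r):
    raise TypeError("LispInt + String is not allowed")


ADD = {
    (Int, String): add_int_string,
    (LispInt, String): add_lispint_string,
}


def infix_add(l, r):
    # Simplified search: most specific left class first.
    for lc in type(l).__mro__:
        for rc in type(r).__mro__:
            fn = ADD.get((lc, rc))
            if fn is not None:
                return fn(l, r)
    raise TypeError("no candidate")


# A PerlInt handed back by foreign code still gets the coercing method.
perl_result = infix_add(PerlInt(1), String("2")).value

try:
    infix_add(LispInt(1), String("2"))
    lisp_rejected = False
except TypeError:
    lisp_rejected = True
```

The subclass method blocks LispInt+String, but a PerlInt arriving from other code bypasses it, which is the loss of Lisp semantics described above.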

But normally all these dynamic languages have some kind of
introspection. E.g.

$ python
>>> 1+2j
(1+2j)

The string representation of a different language could be "1+2i".
Not to speak about different semantics like using "+" for string
concatenation.

That is a good point, but subclassing isn't the only way to skin this
particular cat. To follow your example, the Lisp syntax for this
particular complex number is "#C(1 2)", but this is produced only by
printing operators; there is no "coerce-to-string" operation per se. So
Lisp could define "__lisp__print_object" methods on each of the base
arithmetic classes, and then it would be able to get away without
subclassing them (at least as far as this issue is concerned). This
also has the advantage that this same (eq_addr) number will appear as
"(1+2j)" when printed by Python code and "#C(1 2)" when printed by Lisp
code, which is (IMHO) as it should be.
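
As a rough illustration of per-language printing without subclassing, here is a Python sketch. The PRINTERS table and print_object function are hypothetical stand-ins for methods like "__lisp__print_object"; Python's built-in complex type plays the shared number class, and the formatting only handles non-negative imaginary parts:

```python
def print_python(z):
    return "({:g}+{:g}j)".format(z.real, z.imag)


def print_lisp(z):
    return "#C({:g} {:g})".format(z.real, z.imag)


# One printing routine per language, all defined on the same number type.
PRINTERS = {"python": print_python, "lisp": print_lisp}


def print_object(language, value):
    return PRINTERS[language](value)


z = complex(1, 2)                       # one and the same object
as_python = print_object("python", z)
as_lisp = print_object("lisp", z)
```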

> One of my motivations for exploring Common Lisp in Parrot is summed up
> in four letters: CPAN. So I expect to want to call Perl[56] code and
> get back Perl data types for further mungery in Lisp (if I ever get that
> far). It might be the right thing in that case to accept the additional
> Perl semantics, but I'm not completely convinced. I tend to think of
> operator semantics as being "language scoped," but perhaps this is just
> personal bias, as I can't think of a really compelling reason why it
> should be so.

I don't know. But Perl scalar operator semantics seems to be to morph the
destination PMC to the value of the operation:

.language perl
a = Undef
a = b + c # a may be promoted to bigint

Python semantics are that scalar operators always create a new result.
Python scalars including strings are immutable.

.language python
a = b + c # new "a" value created

The question now is, what happens, when you call a Perl module that
happens to return the scalar "b":

.language lisp
a = b + c

Lisp works here AFAIK like Python, e.g. a new destination should be
created.

That is correct.

So it seems that under the pragma "lisp" or "python" this should
translate to:

n_add a, b, c # not existing op that creates new "a"

and within perl semantics

add a, b, c # reuse "a"

Yes. And, now that you mention it, I can see headaches when Perl
scribbles on Lisp or Python constant values. So it may be necessary
(shudder) to have interface routines between Perl code and the rest of
the world in order to avoid this. I hope I'm wrong.

> But I am also concerned that a proliferation of low-level PMC types
> will be an obstacle to interoperability. It seems simpler to me to
> represent each distinct kind of mathematical object as a single PMC
> class

I don't get that.

I'm sorry; I'm not sure how to explain it better. Except perhaps to say
that I think that numbers as mathematical objects ought to be
represented in a way that is independent of the programming language
that spawned them, to the greatest extent possible.

Also, the Int+String example above seems to point to a situation
where code could behave one way with its "native" numeric values, but
very differently with "foreign" numbers that represent the same
mathematical values. Admittedly, this particular example seems like a
corner case, but on the other hand it seems very odd, and a source of
more debugging headaches, that the "same" numbers could give different
results.

> ... You wouldn't want the result of MMD dispatch to change
> because the compiler got cleverer about assigning things to registers,
> would you?

Parrot itself doesn't do this optimization. Only the HLL compiler can
emit such code, if the language provides a type system. Perl6 allows:

my int ($i, $j, $k); # use natural integer
$i = $j + $k # I0 = I1 + I2

This integer $i can't promote to a bigint for example.

Oops; I forgot about promotion. So now I think I understand why Parrot
needs to distinguish between situations where a machine integer can be
promoted to a bigint, and ones where it can't.

But in this case the choice not to promote has already been made by
the compiler. Presumably, assignment to an integer register includes an
implicit "mod" operation, so the compiler can't possibly have wanted to
promote there. And writing

P0 = I1 + I2

should probably be seen as a request to permit promotion on overflow.
So isn't the promotion choice always determined by the destination
register type? If that is so, then it seems to me that you really have
a total of four addition MMD operators:

* __addi produces a native integer in an I register.
* __addn produces a float in an N register.
* __add stores into an existing destination PMC.
* __n_add creates a new destination PMC.

The latter two might promote to BigInt -- or produce anything else,
depending on what they are given.

Would that work? If so, please pick saner names; I was trying to
reuse the existing names, but that gives rise to "n confusion."
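
A rough Python sketch of the four variants may make the distinction concrete. It makes several assumptions: a 32-bit I register, wrap-around on overflow (the implicit "mod" presumed above), and a hypothetical `Box` class in place of a PMC. The function names only mirror the list above and are not existing opcodes.

```python
WORD_BITS = 32  # assumed width of a native I register, for illustration only

def wrap_int(value, bits=WORD_BITS):
    """Reduce value into the signed two's-complement range of `bits` bits."""
    mask = (1 << bits) - 1
    value &= mask
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value

class Box:
    """Hypothetical PMC stand-in holding an arbitrary-precision value."""
    def __init__(self, value=0):
        self.value = value

def addi(left, right):
    """I-register destination: the result stays a machine integer (wraps)."""
    return wrap_int(left + right)

def addn(left, right):
    """N-register destination: the result is a float."""
    return float(left) + float(right)

def add(dest, left, right):
    """Existing PMC destination: store into it; may hold a big integer."""
    dest.value = left + right
    return dest

def n_add(left, right):
    """New PMC destination: allocate the result; may hold a big integer."""
    return Box(left + right)

big = 2**31 - 1
assert addi(big, 1) == -2**31        # wrapped, no promotion
assert n_add(big, 1).value == 2**31  # promoted to a larger integer
```

In this sketch the operand types never decide whether promotion happens; only the choice of destination does.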

The more important case is:

my int @ints; # array of plain integers

There are a few rare cases, where the HLL to parrot translator could
optimize to natural types though:

for i in xrange(1, 10):
    print i

Python's xrange produces only natural ints.

So that is sufficient to bind the type of i. The Lisp idiom is actually
quite close, but people frequently declare indices of various sorts to
be machine integers[1] for speed, which makes it not at all rare.

And so even in Lisp, you would want (and expect) different behavior
for "int" vs. "Int". It's a pity I wasn't thinking more clearly
yesterday . . .

[1] Actually, "fixnums", which are a few bits smaller.

Leopold Toetsch

Apr 1, 2005, 1:42:53 AM4/1/05
to Bob Rogers, perl6-i...@perl.org
Bob Rogers <rogers...@rgrjr.dyndns.org> wrote:
> From: Leopold Toetsch <l...@toetsch.at>

> ..., since I assume an
> Int+String=>Number method will exist that coerces the string to a number
> first. I imagine you could do this by creating a subclass of Int,
> e.g. LispInt, and then defining a LispInt+String method that throws an
> error.

Yes, as said the builtin Integer can just provide the most common
functionality. When it comes to infix operations with strings languages
tend to be pretty different, so that there'll be a subclass for most of
the languages.

> ... But even then, you still lose Lisp semantics if calls into other
> code return PerlInt's, or whatever.

It depends, either you have a PerlInt for e.g.

$Data::Dumper::Terse = 1;

then you'd need Perl semantics to set the value.

Or you get a result of some module, then you'd probably do the
equivalent of:

lispint = perlint

But I'm handwaving here.

> $ python
> >>> 1+2j
> (1+2j)

> That is a good point, but subclassing isn't the only way to skin this
> particular cat. To follow your example, the Lisp syntax for this
> particular complex number is "#C(1 2)", but this is produced only by
> printing operators; there is no "coerce-to-string" operation per se. So
> Lisp could define "__lisp__print_object" methods on each of the base
> arithmetic classes, and then it would be able to get away without
> subclassing them (at least as far as this issue is concerned).

No, not really. There's a common method already in core:

o."__repr"() # $o.perl in Perl6 speak

> also has the advantage that this same (eq_addr) number will appear as
> "(1+2j)" when printed by Python code and "#C(1 2)" when printed by Lisp
> code, which is (IMHO) as it should be.

I don't think that this effect can be achieved by
"__lisp__print_object". Python or Perl wouldn't know that it exists.

> Yes. And, now that you mention it, I can see headaches when Perl
> scribbles on Lisp or Python constant values. So it may be necessary
> (shudder) to have interface routines between Perl code and the rest of
> the world in order to avoid this. I hope I'm wrong.

No, I think that's pretty correct. But you have to define an interface
anyway (defperl my-perl-foo ...) or some such, I presume.

> > But I am also concerned that a proliferation of low-level PMC types
> > will be an obstacle to interoperability. It seems simpler to me to
> > represent each distinct kind of mathematical object as a single PMC
> > class

> I don't get that.

> I'm sorry; I'm not sure how to explain it better. Except perhaps to say
> that I think that numbers as mathematical objects ought to be
> represented in a way that is independent of the programming language
> that spawned them, to the greatest extent possible.

Well, classes and subclasses, with common functionality in the base
class.

> Also, the Int+String example above seems to point to a situation
> where code could behave one way with its "native" numeric values, but
> very differently with "foreign" numbers that represent the same
> mathematical values. Admittedly, this particular example seems like a
> corner case, but on the other hand it seems very odd, and a source of
> more debugging headaches, that the "same" numbers could give different
> results.

That's right. And we can't do much against that:

$perl_var = python_str + $perl_str

executed inside a Perl environment would concatenate, *except* when Perl
provides a more derived multi sub that does it the Perl way:

$ perl -le "print '02a' + '3x'"
5

That means some of the nasty cases can be caught with some effort.
That's one of the big advantages of MMD.
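
A minimal sketch of that idea in Python follows. It assumes a dispatch table keyed on class pairs and a "smallest total MRO distance wins" rule, which is one plausible policy and not necessarily Parrot's. `Str`, `PerlStr`, `register`, `infix_add` and `leading_number` are hypothetical names, and the numification handles only a leading run of ASCII digits.

```python
class Str:
    """Hypothetical base string class; '+' concatenates."""
    def __init__(self, text):
        self.text = text

class PerlStr(Str):
    """Hypothetical Perl-flavoured subclass; '+' should add numerically."""

ADD_METHODS = {}  # (left class, right class) -> implementation

def register(left_cls, right_cls):
    """Decorator that installs a method for one class pair."""
    def decorator(func):
        ADD_METHODS[(left_cls, right_cls)] = func
        return func
    return decorator

def leading_number(text):
    """Simplified Perl-like numification: '02a' -> 2, 'x' -> 0."""
    digits = ""
    for ch in text:
        if ch not in "0123456789":
            break
        digits += ch
    return int(digits) if digits else 0

@register(Str, Str)
def add_concat(left, right):
    return left.text + right.text

@register(Str, PerlStr)
def add_numeric(left, right):
    return leading_number(left.text) + leading_number(right.text)

def infix_add(left, right):
    """Call the applicable method with the smallest total MRO distance."""
    best = None
    for (lcls, rcls), func in ADD_METHODS.items():
        if isinstance(left, lcls) and isinstance(right, rcls):
            dist = (type(left).__mro__.index(lcls)
                    + type(right).__mro__.index(rcls))
            if best is None or dist < best[0]:
                best = (dist, func)
    if best is None:
        raise TypeError("no applicable add method")
    return best[1](left, right)

assert infix_add(Str("02a"), Str("3x")) == "02a3x"   # generic: concatenate
assert infix_add(Str("02a"), PerlStr("3x")) == 5     # more derived: numeric
```

Because the lookup happens at call time, registering the more derived pair later changes the result without touching the generic method.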

> And so even in Lisp, you would want (and expect) different behavior
> for "int" vs. "Int".

Fine.

> -- Bob Rogers

leo

Bob Rogers

Apr 3, 2005, 6:08:07 PM4/3/05
to l...@toetsch.at, perl6-i...@perl.org
From: Leopold Toetsch <l...@toetsch.at>
Date: Fri, 1 Apr 2005 08:42:53 +0200

Bob Rogers <rogers...@rgrjr.dyndns.org> wrote:
> From: Leopold Toetsch <l...@toetsch.at>

> . . .

> $ python
> >>> 1+2j
> (1+2j)

> That is a good point, but subclassing isn't the only way to skin this
> particular cat. To follow your example, the Lisp syntax for this
> particular complex number is "#C(1 2)", but this is produced only by
> printing operators; there is no "coerce-to-string" operation per se. So
> Lisp could define "__lisp__print_object" methods on each of the base
> arithmetic classes, and then it would be able to get away without
> subclassing them (at least as far as this issue is concerned).

No, not really. There's a common method already in core:

o."__repr"() # $o.perl in Perl6 speak

Seems to me that this is a method that produces Python syntax, and works
only for Python classes. Since Lisp uses a different printed
representation, Lisp needs a distinct operation. And since "__repr__"
is defined only on Python PMCs, those methods would need to be moved to
more generic PMC classes in order for numbers generated by Lisp (or
anything else) to use Python syntax when printed in Python.

> also has the advantage that this same (eq_addr) number will appear as
> "(1+2j)" when printed by Python code and "#C(1 2)" when printed by Lisp
> code, which is (IMHO) as it should be.

I don't think that this effect can be achieved by
"__lisp__print_object". Python or Perl wouldn't know that it exists.

They wouldn't have to know -- or even care. All such methods would be
defined and used only by the Lisp system, so that wouldn't affect other
languages. Their PMC classes would all inherit suitable methods for
Parrot-defined base classes, and so they would still print correctly in
Lisp.

In fact, I have noticed two common styles of interoperation between
coexisting Lisp systems:

1. System A can define methods for its own generic (MMD) functions
that specialize on System B's classes. The generic function has a name
that belongs to one of System A's packages, so only code that is aware
of System A can call it.

2. System A can define methods for generic functions of System B
that specialize on one or more of System A's classes. This would appear
to trespass into System B's semantic space, except that System B will
only invoke those new methods if it has instances of those System A
classes. And it can't do that by accident, because those classes have
names in System A's packages.

The "__lisp__print_object" vs. "__repr__" thing is of course an
example of the second style, with Parrot as System B. If necessary, of
course, I could do this for arithmetic operations as well, which is what
I think you've been trying to tell me all along. If so, I apologize for
being dense.
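
Both styles can be sketched with Python's `functools.singledispatch`, which dispatches on the first argument only and so is just a one-argument approximation of MMD. All class and function names below (`CoreComplex`, `ASymbol`, `b_repr`, `a_print`) are hypothetical; the two output formats are the ones quoted earlier in this thread.

```python
from functools import singledispatch

# --- "System B": owns a generic function and a class -------------------
class CoreComplex:
    """Hypothetical shared complex-number class owned by System B."""
    def __init__(self, re, im):
        self.re, self.im = re, im

@singledispatch
def b_repr(obj):
    """System B's generic printer; the default just uses str()."""
    return str(obj)

@b_repr.register(CoreComplex)
def _b_repr_complex(obj):
    return f"({obj.re}+{obj.im}j)"

# --- "System A": a second system layered on top -------------------------
class ASymbol:
    """Hypothetical class owned by System A."""
    def __init__(self, name):
        self.name = name

# Style 1: A's own generic function, with a method specialised on B's class.
@singledispatch
def a_print(obj):
    return b_repr(obj)  # anything A has no opinion on prints B's way

@a_print.register(CoreComplex)
def _a_print_complex(obj):
    return f"#C({obj.re} {obj.im})"

# Style 2: a method on B's generic function, specialised on A's class.
@b_repr.register(ASymbol)
def _b_repr_symbol(obj):
    return obj.name.upper()

z = CoreComplex(1, 2)
assert b_repr(z) == "(1+2j)"     # the same object, printed by System B
assert a_print(z) == "#C(1 2)"   # and printed by System A
assert b_repr(ASymbol("foo")) == "FOO"
```

Neither system has to modify the other's classes: style 1 is invisible to System B, and style 2 only fires when System B is actually handed an `ASymbol`.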

At this point I've degenerated to thinking aloud, so I'll quit now.
But this thread has helped me visualize how language interoperation can
work, and I hope it has shed some light on the issues generally (though
perhaps only for others of similar density).

Leopold Toetsch

Apr 13, 2005, 8:55:56 AM4/13/05
to Perl 6 Internals
Leopold Toetsch wrote:

If there are no objections, I'll continue with:

> 5) infix method signature change:
>
> METHOD PMC* add( [INTERP, SELF,] PMC* rhs, PMC *dest) {
> if (!dest)
> dest = pmc_new(INTERP, SELF->vtable->base_type);
> ...
> return dest;
> }
>
> If the destination PMC is passed in, it's used else a new PMC of an
> appropriate type is created.
>
> We need this basically for 4 reasons:

[ ... ]

and

> 7) separate inplace methods
>
> Opcodes like:
>
> d += r # add d, r

To simplify the MMD function signature change, dynclasses scalars should
inherit all identical code from Parrot core scalars.
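
As a loose illustration of why the two changes fit together, here is a Python analogue with hypothetical classes. `type(self)()` plays the role of `pmc_new(INTERP, SELF->vtable->base_type)`, so a subclass that inherits `add` unchanged still gets results of its own type.

```python
class Scalar:
    """Hypothetical core scalar; stands in for a Parrot core PMC."""
    def __init__(self, value=0):
        self.value = value

    def add(self, rhs, dest=None):
        # Mirrors the proposed signature: use dest if one was passed in,
        # otherwise create a new object of the receiver's own type.
        if dest is None:
            dest = type(self)()
        dest.value = self.value + rhs.value
        return dest

class DynScalar(Scalar):
    """Hypothetical dynclass scalar: inherits add() without any copy."""

result = DynScalar(2).add(Scalar(3))
assert type(result) is DynScalar and result.value == 5

target = Scalar()
assert DynScalar(2).add(Scalar(3), target) is target  # destination reused
```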

leo
