hyper op - proof of concept

Leopold Toetsch

Apr 20, 2004, 10:28:57 AM

to P6I

I've implemented a (rather hackish and incomplete) new opcode called
C<hyper>. Usage looks like:

    ar = new IntList
    ar = 1000000
    hyper
    ar = 10

or
    hyper
    ar += 10

The attached tests fill an integer array with one Meg int constants and
then increment each value 5 times.

Here are the timing results (unoptimized parrot)
$ time parrot -j i.imc
6060

real 0m1.927s

$ time parrot ih.imc
6060

real 0m0.328s

Plain run core or JIT doesn't matter for the hyper version.

Only the set and add opcodes with constants are done. But it's not hard
to extend the scheme and generalize it.

Comments welcome,
leo

$ cat ih.imc
.sub _main @MAIN
    loadlib P11, "myops_ops"
    .local pmc ar
    ar = new IntList
    ar = 1000000
    hyper
    ar = 10
    .local int j
    j = 0
lp1:
    hyper
    ar += 10
    inc j
    if j < 5 goto lp1
    set $I0, ar[0]
    print $I0
    set $I0, ar[999999]
    print $I0
    print "\n"
.end

$ cat i.imc
.sub _main @MAIN
    .local pmc ar
    ar = new IntList
    ar = 1000000
    .local int i
    i = 0
lp:
    ar[i] = 10
    inc i
    if i < 1000000 goto lp
    .local int j
    j = 0
lp1:
    i = 0
lp2:
    $I0 = ar[i]
    $I0 += 10
    ar[i] = $I0
    inc i
    if i < 1000000 goto lp2
    inc j
    if j < 5 goto lp1
    set $I0, ar[0]
    print $I0
    set $I0, ar[999999]
    print $I0
    print "\n"
.end

# from dynops/myops.ops
op hyper() {
    opcode_t *pc = expr NEXT();
    INTVAL i, l;
    UINTVAL c;
    op_info_t *opinfo = &interpreter->op_info_table[*pc];
    PMC *ar = REG_PMC(pc[1]);   /* TODO inspect opcode */
    List *list = PMC_data(ar);
    List_chunk *chunk;
    INTVAL v = pc[2], *p;
    int op = *pc;

    l = list_length(interpreter, list);
    i = 0;
    /* fast path: walk each chunk's storage directly */
    for (chunk = list->first; chunk; chunk = chunk->next) {
        if (chunk->flags & sparse)
            goto slow;
        p = PObj_bufstart(&chunk->data);
        for (c = 0; c < chunk->items; ++c, ++p) {
            switch (op) {
                case 906:       /* set_p_ic */
                    *(INTVAL*) p = v;
                    break;
                case 460:       /* add_p_ic */
                    *(INTVAL*) p += v;
                    break;
            }
        }
        i += c;
    }
    goto done;
slow:
    /* NOTE: the slow path assigns v regardless of op, so only set works here */
    for (; i < l; ++i)
        list_assign(interpreter, list, i, INTVAL2PTR(void*, v),
                    enum_type_INTVAL);
done:
    pc += opinfo->arg_count;
    goto ADDRESS(pc);
}

PS needs latest CVS for a missing intlist vtable.
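
The fast path in the op above boils down to walking each chunk's contiguous storage with a raw pointer. A minimal standalone C sketch of that idea follows. Chunk, HyperOp and hyper_apply are hypothetical names used only for illustration, not Parrot's List API, and the sketch selects the operation once per chunk rather than once per element, which the posted op does not do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for Parrot's List/List_chunk; not the real structs. */
typedef struct Chunk {
    long         *data;   /* contiguous storage for this chunk */
    size_t        items;  /* number of elements in data */
    struct Chunk *next;   /* next chunk, or NULL */
} Chunk;

typedef enum { HYPER_SET, HYPER_ADD } HyperOp;

/* Apply `op` with constant `v` to every element; returns elements touched. */
static size_t hyper_apply(Chunk *first, HyperOp op, long v)
{
    size_t touched = 0;
    for (Chunk *chunk = first; chunk != NULL; chunk = chunk->next) {
        long *p = chunk->data;
        assert(p != NULL || chunk->items == 0);
        /* Select the operation once per chunk, not once per element. */
        switch (op) {
        case HYPER_SET:
            for (size_t c = 0; c < chunk->items; ++c)
                p[c] = v;
            break;
        case HYPER_ADD:
            for (size_t c = 0; c < chunk->items; ++c)
                p[c] += v;
            break;
        }
        touched += chunk->items;
    }
    return touched;
}
```

Setting every element to 10 and then adding 10 five times leaves 60 in each slot, which matches the "6060" printed by both test programs.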

Dan Sugalski

Apr 20, 2004, 11:53:17 AM

to Leopold Toetsch, P6I

At 4:28 PM +0200 4/20/04, Leopold Toetsch wrote:
>I've implemented a (rather hackish and incomplete) new opcode called
>C<hyper>. Usage looks like:
>
> ar = new IntList
> ar = 1000000
> hyper
> ar = 10
>
>or
> hyper
> ar += 10
>
>The attached tests fill an integer array with one Meg int constants
>and then increment each value 5 times.

Y'know... let's just go all the way with this, since we're going to have to.

We'll add a hyper version of all the vtable entries. Since this is
going to bloat the hell out of the vtable, we'll do it by adding a
VTABLE *hyper to the main vtable structure and hang it off there. We
can work out a default version that unrolls the operations that
pretty much everyone'll inherit so there won't have to be a huge
number of slots reserved for every new class. The classes that care
can then override the defaults and that's fine.
--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk
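
One way to picture the "VTABLE *hyper hanging off the main vtable" proposal is the C sketch below. It is only a sketch under stated assumptions: the VTable and PMC structs are reduced to a single add_int slot, and every name in it (default_hyper_add_int, default_hyper_vtable, hyper_add_int and so on) is hypothetical rather than taken from Parrot's headers.

```c
#include <assert.h>
#include <stddef.h>

typedef struct PMC PMC;

/* Hypothetical, heavily reduced vtable: one scalar slot, one hyper table. */
typedef struct VTable {
    void (*add_int)(PMC *self, long v);   /* regular operation */
    const struct VTable *hyper;           /* hyper variants hang off here */
} VTable;

struct PMC {
    const VTable *vtable;
    long          value;      /* used by scalar PMCs */
    PMC         **elems;      /* used by aggregate PMCs */
    size_t        n_elems;
};

static void int_add_int(PMC *self, long v) { self->value += v; }

/* Default hyper add: unroll into one regular add_int call per element. */
static void default_hyper_add_int(PMC *self, long v)
{
    for (size_t i = 0; i < self->n_elems; ++i) {
        PMC *e = self->elems[i];
        assert(e->vtable->add_int != NULL);
        e->vtable->add_int(e, v);
    }
}

/* Shared default hyper table that most classes would simply inherit. */
static const VTable default_hyper_vtable = { default_hyper_add_int, NULL };

static const VTable int_vtable   = { int_add_int, &default_hyper_vtable };
static const VTable array_vtable = { NULL,        &default_hyper_vtable };

/* What a hyper-add op would do: one extra indirection via ->hyper. */
static void hyper_add_int(PMC *agg, long v)
{
    agg->vtable->hyper->add_int(agg, v);
}
```

A class that cares about speed would point its hyper member at its own table instead of default_hyper_vtable; everything else inherits the unrolling default.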

Leopold Toetsch

Apr 20, 2004, 2:10:20 PM

to Dan Sugalski, perl6-i...@perl.org

Dan Sugalski <d...@sidhe.org> wrote:

> We'll add a hyper version of all the vtable entries. Since this is
> going to bloat the hell out of the vtable, we'll do it by adding a
> VTABLE *hyper to the main vtable structure and hang it off there.

Aren't the relevant vtable slots for aggregates unused anyway? Can we
define that

add Pagg, Pagg, Ix

or such is just the hyper add?

The only one that's used, AFAIK, is C<set Pagg, Ix>, which would be better
as C<set_elements> anyway.

leo

Dan Sugalski

Apr 20, 2004, 2:32:09 PM

to l...@toetsch.at, perl6-i...@perl.org

At 8:10 PM +0200 4/20/04, Leopold Toetsch wrote:
>Dan Sugalski <d...@sidhe.org> wrote:
>
>> We'll add a hyper version of all the vtable entries. Since this is
>> going to bloat the hell out of the vtable, we'll do it by adding a
>> VTABLE *hyper to the main vtable structure and hang it off there.
>
>Aren't the relevant vtable slots for aggregates unused anyway?

Only because we've not gotten around to writing the code. :)

I've actually come across the need locally, but haven't had the time
in the schedule to write the code that'd be needed. It's easier to
get the compiler to generate the long code, though less optimal in
the long run.

> Can we
>define that
>
> add Pagg, Pagg, Ix
>
>or such is just the hyper add?

Nope. :)

Aaron Sherman

Apr 20, 2004, 4:20:18 PM

to Dan Sugalski, Perl6 Internals List

On Tue, 2004-04-20 at 11:53, Dan Sugalski wrote:

> Y'know... let's just go all the way with this, since we're going to have to.
>
> We'll add a hyper version of all the vtable entries.

Another of those darned "I don't get it" posts, but I'll keep this one
short.

Why does Parrot need this? What's so special about hyper operations that
makes Parrot want to take them on?

The way I have read Perl 6's Apocalypses hyper-operations are
essentially (sometimes lazy) operations on arrays and/or lists of
values. So:

@a = @b >>+<< @c

Is just a rather simple for loop that traverses the two lists, adding
them together and constructing a new list. You don't need a vtable entry
for that.

You *do* need to be able to override infix:>>+<<, but that's just a
perlop. Does it need to be Parrot? Does every single infix:... thing in
Perl 6 need to have support in Parrot? As long as Perl 6 can iterate
over values that it gets from Parrot (e.g. PMCs that come from Perl 6,
Perl 5, Ruby, whatever), then all hyperops should be easily implemented
in Perl 6's compiler, no?

Confused.

--
Aaron Sherman <a...@ajs.com>
Senior Systems Engineer and Toolsmith
"It's the sound of a satellite saying, 'get me down!'" -Shriekback

Chris

Apr 20, 2004, 4:30:15 PM

to Aaron Sherman, Dan Sugalski, Perl6 Internals List

On Tue, 2004-04-20 at 1:26PM, Aaron Sherman wrote:
>
> Another of those darned "I don't get it" posts, but I'll keep
> this one short.
>
> Why does Parrot need this? What's so special about hyper
> operations that makes Parrot want to take them on?

I'm not sure I entirely get it myself, but perhaps it's a good thing if it
can take some work off the shoulders of the Perl6 compiler writers. After
all, if a lot of Perl6 can be mapped fairly directly to Parrot, without
hurting the efforts of other compiler writers, it'll mean an even quicker
working Perl6 once the language design is (more or less) finished.

Dan Sugalski

Apr 20, 2004, 4:30:42 PM

to Aaron Sherman, Perl6 Internals List

At 4:20 PM -0400 4/20/04, Aaron Sherman wrote:
>On Tue, 2004-04-20 at 11:53, Dan Sugalski wrote:
>
>> Y'know... let's just go all the way with this, since we're going to have to.
>>
>> We'll add a hyper version of all the vtable entries.
>
>Another of those darned "I don't get it" posts, but I'll keep this one
>short.
>
>Why does Parrot need this? What's so special about hyper operations that
>makes Parrot want to take them on?

Because they can be overridden separately from the regular version of
the operation.

Because if they're separate, people can do Really Evil Things with
them. (Like, say, custom matrix manipulation code with assembly
routines that use MMX, SSE, or Altivec functionality)

Because the things being operated on may give the compilers
insufficient information to generate at compile time the proper array
iteration code.

Pick one. They're all good reasons.

Aaron Sherman

Apr 20, 2004, 5:15:03 PM

to Dan Sugalski, Perl6 Internals List

On Tue, 2004-04-20 at 16:30, Dan Sugalski wrote:

> Because they can be overridden separately from the regular version of
> the operation.

Of course.

Quoting A3 (note, syntax has changed, but as far as I know, the content has not):

@a ^* @b

is equivalent to this:

parallel { $^a * $^b } @a, @b

So, while they can be overridden separately in the client language, I
don't think that requires any support from Parrot (after all, Python
objects won't override their infix:>>+<<, but they might override their
infix:+, and that should carry over to Perl 6 when hyper-adding Python
ints).

> Because if they're separate, people can do Really Evil Things with
> them. (Like, say, custom matrix manipulation code with assembly
> routines that use MMX, SSE, or Altivec functionality)
>
> Because the things being operated on may give the compilers
> insufficient information to generate at compile time the proper array
> iteration code.

Your talk began with the woes of dynamic languages. You explained to us
in great detail how polymorphism and the lack of any sort of static data
lead to the inability to optimize in this way.

What changed?

Specifically, if I have:

@a >>+<< @b

Why is it that I can apply MMX instructions, even if I think I know the
types of everything in @a and @b (e.g. because of a declaration)? What
if element 10 of @a is REALLY a URI fetch object that got put in there
by some library? Worse, will the MMX instructions obey the semantics of
a host of language-specific PMCs all crammed into one array?

You seem to be eager to do this, and I won't try to stop you, but I
guess I am curious why you want to do this so badly....

Leopold Toetsch

Apr 20, 2004, 4:55:34 PM

to Dan Sugalski, perl6-i...@perl.org

Dan Sugalski <d...@sidhe.org> wrote:
> At 8:10 PM +0200 4/20/04, Leopold Toetsch wrote:

[ unused scalar vtables in aggregates ]

>>Aren't the relevant vtable slots for aggregates unused anyway?

> Only because we've not gotten around to writing the code. :)

Do you want to reserve these just for implementing perl's scalar context
of arrays or hashes, or is there more behind the scene?

leo

Leopold Toetsch

Apr 20, 2004, 6:06:00 PM

to Aaron Sherman, perl6-i...@perl.org

Aaron Sherman <a...@ajs.com> wrote:

> Why does Parrot need this? What's so special about hyper operations that
> makes Parrot want to take them on?

The special thing is optimizing for integers and floats. The bad thing
is overridden »op« for aggregates holding PMCs. These might do whatever
they want, e.g.

hyper
Px += 4 # add 4 to each column in data base

and worse, we have two versions of delegate/tie/whatever, the aggregate
itself can provide »+« and each aggregate member can have a
different C<add> operation.

> The way I have read Perl 6's Apocalypses hyper-operations are
> essentially (sometimes lazy) operations on arrays and/or lists of
> values. So:

> @a = @b >>+<< @c

> Is just a rather simple for loop that traverses the two lists, adding
> them together and constructing a new list. You don't need a vtable entry
> for that.

Well, yes. Except for the special case, which is nice though:

$ time parrot ih.imc #[1]
real 0m0.370s

$ time perl i.pl #[2]
real 0m5.656s

That'll be around a factor of twenty for an optimized build, which can
make a difference ...

leo

[1] unoptimized parrot, ih.imc from my proof of concept
[2]
$ cat i.pl
#!perl
use strict;
my (@ar); # is array of int
for (0 .. 999_999) {
    $ar[$_] = 10;
}
for (1 .. 5) {
    for (0 .. 999_999) {
        $ar[$_] += 10;
    }
}
print $ar[0], $ar[999_999], "\n";

Jeff Clites

Apr 20, 2004, 8:07:09 PM

to l...@toetsch.at, Perl 6 Internals

On Apr 20, 2004, at 3:06 PM, Leopold Toetsch wrote:

> hyper
> Px += 4 # add 4 to each column in data base

How does this look in pasm? Is it supposed to be:

hyper
add P0, 4

or is it:

hyperadd P0, 4

If it's the former, it seems really odd to have an op which modifies
the meaning of the next op. It seems confusing, for an assembly syntax.

Or is the op going to be "hyper", with "add" passed to it as a
parameter?:

hyper P0, 4, .add # or something

Just a bit confused about the syntax....

JEff

Leopold Toetsch

Apr 21, 2004, 3:17:05 AM

to Jeff Clites, Perl 6 Internals

Jeff Clites wrote:
> On Apr 20, 2004, at 3:06 PM, Leopold Toetsch wrote:
>
>> hyper
>> Px += 4 # add 4 to each column in data base
>
>
> How does this look in pasm? Is it supposed to be:
>
> hyper
> add P0, 4

Exactly that.

> If it's the former, it seems really odd to have an op which modifies the
> meaning of the next op. It seems confusing, for an assembly syntax.

I want to avoid opcode explosion. Having hyper ops of all math, bitwise,
logical, and string opcodes with all operand permutations would blow
opcode count and codesize up to insanity.

> Or is the op going to be "hyper", with "add" passed to it as a parameter?:
>
> hyper P0, 4, .add # or something

I can imagine that the final ops will look something like this.

> Just a bit confused about the syntax....

The syntax isn't laid out yet.

> JEff

leo
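
The prefix form discussed in this message can be sketched as a toy dispatch loop in C: hyper is a single opcode that changes how the following opcode is applied, so no per-operation hyper opcodes are needed. Everything here is an assumption made for illustration, including the opcode numbers, the two-word instruction layout, and the Machine struct; it is not how Parrot's run loop or the posted op is implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcode numbers for a toy bytecode; not Parrot's real ones. */
enum { OP_END, OP_SET_IC, OP_ADD_IC, OP_HYPER };

typedef struct Machine {
    long    scalar;   /* target of a plain op */
    long   *ar;       /* target of a hyper-prefixed op */
    size_t  len;      /* number of elements in ar */
} Machine;

/* Apply one scalar op to a single slot. */
static void apply(long op, long *slot, long v)
{
    switch (op) {
    case OP_SET_IC: *slot  = v; break;
    case OP_ADD_IC: *slot += v; break;
    default:        assert(0 && "unsupported op");
    }
}

/* Instruction layout: [OP_HYPER] OP, constant.  Ends at OP_END. */
static void run(Machine *m, const long *pc)
{
    while (*pc != OP_END) {
        int hyper = (*pc == OP_HYPER);
        if (hyper)
            ++pc;                   /* prefix: modifies the op that follows */
        long op = pc[0], v = pc[1];
        if (hyper) {
            for (size_t i = 0; i < m->len; ++i)
                apply(op, &m->ar[i], v);
        } else {
            apply(op, &m->scalar, v);
        }
        pc += 2;                    /* skip opcode plus its constant */
    }
}
```

The same apply() serves both the plain and the hyper case, which is the point of the prefix: the opcode count grows by one instead of doubling.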

Larry Wall

Apr 21, 2004, 12:36:38 AM

to Perl6 Internals List

On Tue, Apr 20, 2004 at 04:30:42PM -0400, Dan Sugalski wrote:
: At 4:20 PM -0400 4/20/04, Aaron Sherman wrote:
: >On Tue, 2004-04-20 at 11:53, Dan Sugalski wrote:
: >
: >> Y'know... let's just go all the way with this, since we're going to have
: >> to.
: >>
: >> We'll add a hyper version of all the vtable entries.
: >
: >Another of those darned "I don't get it" posts, but I'll keep this one
: >short.
: >
: >Why does Parrot need this? What's so special about hyper operations that
: >makes Parrot want to take them on?
:
: Because they can be overridden separately from the regular version of
: the operation.

Well, for the record I've never said that they can be, but...

: Because if they're separate, people can do Really Evil Things with

: them. (Like, say, custom matrix manipulation code with assembly
: routines that use MMX, SSE, or Altivec functionality)
:
: Because the things being operated on may give the compilers
: insufficient information to generate at compile time the proper array
: iteration code.

I don't know if these rate a bunch of vtable entries, though.
If they're a little slower to dispatch it's no biggie, since even
with hardware acceleration the actual vector ops will tend to dominate
the timings.

Larry

Dan Sugalski

Apr 21, 2004, 8:37:07 AM

to l...@toetsch.at, perl6-i...@perl.org

More behind the scenes. (Though that's a good reason too) The problem
is that we've got quite a few cases where we can't tell at compile
time what the heck's going on with a PMC, so there's no good way to
know if it's an aggregate or a scalar. If we overload the vtable
entries we're going to get into trouble.

Separate functions, separate entries. I already made the mistake of
trying to overload some of the other entries in the past. We've fixed
that up and I'd rather not do that again.

Yeah, having hyper versions of the ops does blow out the opcode list
a lot, but the alternative is to end up with a half-assed system
that'll have the math guys down on my head. I'd as soon skip that
one. :) I'm OK with putting limitations on it--pmc-only, for example,
so we don't have to deal with S/I/N versions of the hyper ops.

Leopold Toetsch

Apr 21, 2004, 10:00:58 AM

to Dan Sugalski, perl6-i...@perl.org

Dan Sugalski <d...@sidhe.org> wrote:
> At 10:55 PM +0200 4/20/04, Leopold Toetsch wrote:
>>
>>Do you want to reserve these just for implementing perl's scalar context
>>of arrays or hashes, or is there more behind the scene?

> More behind the scenes. (Though that's a good reason too) The problem
> is that we've got quite a few cases where we can't tell at compile
> time what the heck's going on with a PMC, so there's no good way to
> know if it's an aggregate or a scalar. If we overload the vtable
> entries we're going to get into trouble.

I don't get that. Why should we need to know at compile time, if we are
dealing with an aggregate or not? It doesn't matter. At runtime we end
up with:

pmc->vtable->add_int() # example below

whatever the PMC is. When it's kind of a scalar, it will support the
operation. When it's a PerlArray you get:

add_int() not implemented in class 'PerlArray'

I don't see a reason, why

new P0, .PerlArray
add P0, 1

should be meaningful or supported--or much worse:

shl P0, 31

With separate vtable entries, you'll either have to duplicate the whole
MMD slots, with one extra indirection for vtable->hyper->func, or
overloading is an all or nothing operation. Both are suboptimal.

> Separate functions, separate entries. I already made the mistake of
> trying to overload some of the other entries in the past. We've fixed
> that up and I'd rather not do that again.

That's reasonable of course, but I don't see why we can't just define:

- an Array's vtable slot for scalar operation C<op> means C< »op« >

But this still means that

add P0, 1

is an error - because the C<hyper> prefix is missing.

> Yeah, having hyper versions of the ops does blow out the opcode list
> a lot, but the alternative is to end up with a half-assed system
> that'll have the math guys down on my head. I'd as soon skip that
> one. :)

We don't need a hyper version of each opcode, if we go with the scheme
I've proposed, either

hyper
op args

or maybe

hyper .OP, args

Such a scheme also fits nicely with Luke's summary (thanks BTW) that is:

@a »op« @b

always means

map { $a[$_] op $b[$_] } 0..max(+@a, +@b)

The C<hyper> prefix provides the map part and the needed environment
and then calls the appropriate vtable (for PMCs) to do the actual work.
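
As a rough C sketch of "hyper supplies the map, the element operation does the work": the loop below applies a caller-supplied binary function index by index over the longer of the two inputs. The names BinOp, op_add and hyper_binary are hypothetical, the loop covers indices 0 to max(na, nb) - 1, and padding the shorter operand with 0 is an assumption, since the thread does not settle the extension rule.

```c
#include <assert.h>
#include <stddef.h>

typedef long (*BinOp)(long, long);

static long op_add(long x, long y) { return x + y; }

/* out[i] = op(a[i], b[i]) for i in [0, max(na, nb)); the shorter operand is
 * padded with 0 (an assumption, see the note above).
 * `out` must have room for max(na, nb) elements. Returns the result length. */
static size_t hyper_binary(BinOp op, const long *a, size_t na,
                           const long *b, size_t nb, long *out)
{
    assert(op != NULL && out != NULL);
    size_t n = na > nb ? na : nb;
    for (size_t i = 0; i < n; ++i) {
        long x = i < na ? a[i] : 0;
        long y = i < nb ? b[i] : 0;
        out[i] = op(x, y);
    }
    return n;
}
```

For PMC elements the op pointer would be replaced by a call through each element's vtable; for natural types the loop body can stay as plain arithmetic.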

> ... I'm OK with putting limitations on it--pmc-only, for example,

> so we don't have to deal with S/I/N versions of the hyper ops.

PMC-only means, that you'll always have to call e.g. get_integer on the
PMC, because the PMC might be tied. This limitation isn't really good
for performance reasons. People might use it most likely in combination
with natural typed arrays.

AFAIK Perl6 is the only language that provides these hyper ops, so we
should support these efficiently, especially with natural types.

leo

Simon Glover

Apr 21, 2004, 10:13:57 AM

to Leopold Toetsch, Dan Sugalski, perl6-i...@perl.org

On Wed, 21 Apr 2004, Leopold Toetsch wrote:

> PMC-only means, that you'll always have to call e.g. get_integer on the
> PMC, because the PMC might be tied. This limitation isn't really good
> for performance reasons. People might use it most likely in combination
> with natural typed arrays.

Absolutely -- I really, _really_ want to be able to use hyper ops with
fixed size, floating point arrays, and to have that be as fast as
possible, as that should make it possible to implement something like
PDL in the core.

Simon

Aaron Sherman

Apr 21, 2004, 10:24:39 AM

to Leopold Toetsch, Perl6 Internals List

On Tue, 2004-04-20 at 18:06, Leopold Toetsch wrote:
> Aaron Sherman <a...@ajs.com> wrote:

This horse is getting a bit ripe, so I'm going to skip most of the
detail. I think we all agree on most of the basics, we just disagree on
what to do with them. That's cool.

I do want to pick a couple of small nits though:

> Well, yes. Except for the special case, which is nice though:
>
> $ time parrot ih.imc #[1]
> real 0m0.370s
>
> $ time perl i.pl #[2]
> real 0m5.656s

That's unrealistic. In P6, you should be able to take:

@a >>+<< @b

and turn it into:

    # Trivial example of hyper-operation, untested pseudo-IMCC
    # Just take __Perl_Ary_a and add it to __Perl_Ary_b and put
    # the result in tmp5
    .local int tmp1
    tmp1 = 0
    .local int tmp2
    tmp2 = __Perl_Ary_a
    .local int tmp3
    tmp3 = __Perl_Ary_b
    .local int tmp4
    # Not sure what the ? is below... is there a typeof?
    .local ? PerlArray tmp5
    tmp5 = new .PerlArray
    # We auto-extend here... that may not be P6's eventual MO
    # but it's enough to get the point across
    if tmp2 >= tmp3 goto AutoExtend_HYPER_1
    __Perl_Ary_a = tmp3
    tmp4 = tmp3
    goto PRE_HYPER_1
AutoExtend_HYPER_1:
    __Perl_Ary_b = tmp2
    tmp4 = tmp2
PRE_HYPER_1:
    tmp5 = tmp4
BEGIN_HYPER_1:
    if tmp1 >= tmp4 goto END_HYPER_1
    tmp5[tmp1] = __Perl_Ary_a[tmp1] + __Perl_Ary_b[tmp1]
CONT_HYPER_1:
    # I forget if there's an inc op....
    tmp1 = tmp1 + 1
    goto BEGIN_HYPER_1
END_HYPER_1:

Are we seriously suggesting that after JIT, that's going to be as slow
as raw Perl, or even any slower than:

.local ? PerlArray tmp1
hyper
tmp1 = __Perl_Ary_a >>+<< __Perl_Ary_b

?! If so, I'm curious to know why. It seems to me that you're just
moving the work from the Perl 6 compiler all the way down to the JIT,
but the resulting code is the same, no?

I would agree that a bulk array copy and iterators should go in Parrot.
That much would speed up many things (especially the above code).

Putting Perl 6 features into Parrot without factoring out their modular
essence would seem to me to result in a great deal of duplication, but
now I'm starting to get close to that horse again....

Aaron Sherman

Apr 21, 2004, 10:44:32 AM

to Simon Glover, Perl6 Internals List

On Wed, 2004-04-21 at 10:13, Simon Glover wrote:

> Absolutely -- I really, _really_ want to be able to use hyper ops with
> fixed size, floating point arrays, and to have that be as fast as
> possible, as that should make it possible to implement something like
> PDL in the core.

Mistake.

You don't want to have to convert to-and-from arrays of PMCs in order to
do those ops, and regardless of what kind of hyper-nifty-mumbo-jumbo you
put into Parrot, that's exactly what you're going to have to do.

In fact, Parrot Data Language (if there were such a thing) would likely
introduce its own runtime-loadable opcode set to operate on a new PMC
type called a piddle. Then, each client language could define (in a
module/library) its own means of interacting with a piddle. For example
in Perl, you might:

multi method new(Class $class, int @ary) {...}
multi method new(Class $class, float @ary) {...}
multi method new(Class $class, int $value) {...}
multi method new(Class $class, Octets $value: %*_) {...}

and then you would override BUILD in order to emit your special piddle
opcodes.

Then, in user-space:

my PDL::Piddle $foo = [1,2,3,4,5,6];

Does what you expect, and

$foo + $bar

is special.

Simon Glover

Apr 21, 2004, 11:04:56 AM

to Aaron Sherman, Perl6 Internals List

On Wed, 21 Apr 2004, Aaron Sherman wrote:

> On Wed, 2004-04-21 at 10:13, Simon Glover wrote:
>
> > Absolutely -- I really, _really_ want to be able to use hyper ops with
> > fixed size, floating point arrays, and to have that be as fast as
> > possible, as that should make it possible to implement something like
> > PDL in the core.
>
> Mistake.
>
> You don't want to have to convert to-and-from arrays of PMCs in order to
> do those ops, and regardless of what kind of hyper-nifty-mumbo-jumbo you
> put into Parrot, that's exactly what you're going to have to do.
>

Why? I was under the impression that in Perl 6 it was going to be
possible to declare arrays that only contain values of a particular
type -- I believe the syntax I saw was:

my @array is float;

although I've not been following p6l, so this may have changed somewhat.
Are you saying that despite that, those values still have to be PMCs?
If so, then you're quite right -- it would make no sense to convert them
to floats and then back again -- but this also means that core Perl 6
is not going to be nearly as useful to me as I had hoped.

> In fact, Parrot Data Language (if there were such a thing) would likely
> introduce its own runtime-loadable opcode set to operate on a new PMC
> type called a piddle. Then, each client language could define (in a
> module/library) its own means of interacting with a piddle. For example
> in Perl, you might:
>
> multi method new(Class $class, int @ary) {...}
> multi method new(Class $class, float @ary) {...}
> multi method new(Class $class, int $value) {...}
> multi method new(Class $class, Octets $value: %*_) {...}
>
> and then you would override BUILD in order to emit your special piddle
> opcodes.
>
> Then, in user-space:
>
> my PDL::Piddle $foo = [1,2,3,4,5,6];
>
> Does what you expect, and
>
> $foo + $bar
>
> is special.

This is all very well, but it places a much larger hurdle in the way of
someone wanting to use Perl 6 for these kinds of computationally
intensive array manipulations, particularly if our hypothetical Perl 6
version of PDL isn't a core module.

Simon

Dan Sugalski

Apr 21, 2004, 11:30:36 AM

to Simon Glover, Aaron Sherman, Perl6 Internals List

At 11:04 AM -0400 4/21/04, Simon Glover wrote:
>On Wed, 21 Apr 2004, Aaron Sherman wrote:
> >
>> You don't want to have to convert to-and-from arrays of PMCs in order to
>> do those ops, and regardless of what kind of hyper-nifty-mumbo-jumbo you
>> put into Parrot, that's exactly what you're going to have to do.
>>
>
> Why? I was under the impression that in Perl 6 it was going to be
> possible to declare arrays that only contain values of a particular
> type -- I believe the syntax I saw was:
>
> my @array is float;

I don't know what the current syntax is, but it doesn't
matter--you'll be able to do this.

Aggregates don't have to hold PMCs, they can hold base types. Binary
vtable functions that know the internal representation of both sides
(probably because you've used MMD to get there, but hey, peek if you
want) can dive right into the underlying bits and go wild without
ever touching a PMC if they want.
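
A hedged C sketch of that "dive right into the underlying bits" path: when both operands are known to hold raw floats, the add loops over the storage directly and never creates or dereferences a PMC. The Aggregate struct, the ElemType tag and add_float_arrays are hypothetical names for illustration; in Parrot the type knowledge would come from MMD or from inspecting the PMCs, not from a tag like this.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ELEM_FLOAT, ELEM_PMC } ElemType;

/* Hypothetical aggregate that may hold raw doubles instead of PMCs. */
typedef struct Aggregate {
    ElemType type;
    size_t   n;
    double  *floats;   /* valid only when type == ELEM_FLOAT */
} Aggregate;

/* Returns 1 if the raw fast path ran, 0 if the caller must fall back to
 * generic per-element dispatch (mixed types or mismatched lengths). */
static int add_float_arrays(Aggregate *dst, const Aggregate *a,
                            const Aggregate *b)
{
    assert(dst != NULL && a != NULL && b != NULL);
    if (dst->type != ELEM_FLOAT || a->type != ELEM_FLOAT ||
        b->type != ELEM_FLOAT)
        return 0;
    if (a->n != b->n || dst->n != a->n)
        return 0;
    for (size_t i = 0; i < a->n; ++i)
        dst->floats[i] = a->floats[i] + b->floats[i];
    return 1;
}
```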

Most of PDL should be doable efficiently in bytecode without having
to drop to C. (Fortran, possibly, but only because I'd bet that a
Fortran version of most of the matrix ops would be a bit faster,
given most Fortran compilers have amazingly good optimizers, which
our JIT will likely never be able to touch)

Dan Sugalski

Apr 21, 2004, 11:14:35 AM

to l...@toetsch.at, perl6-i...@perl.org

At 4:00 PM +0200 4/21/04, Leopold Toetsch wrote:
>Dan Sugalski <d...@sidhe.org> wrote:
>> At 10:55 PM +0200 4/20/04, Leopold Toetsch wrote:
>>>
>>>Do you want to reserve these just for implementing perl's scalar context
>>>of arrays or hashes, or is there more behind the scene?
>
>> More behind the scenes. (Though that's a good reason too) The problem
>> is that we've got quite a few cases where we can't tell at compile
>> time what the heck's going on with a PMC, so there's no good way to
>> know if it's an aggregate or a scalar. If we overload the vtable
>> entries we're going to get into trouble.
>
>I don't get that. Why should we need to know at compile time, if we are
>dealing with an aggregate or not?

So we know what we're doing.

There are really three functions here:

1) Access PMC as a single element
2) Access PMC as a collection of elements
3) Access an element of the PMC

#3 is done with the much-maligned operation_pmc_keyed vtable entries
#1 is done with the plain operation_pmc vtable entries
#2, which corresponds to the hyper operation,
doesn't have a spot in the vtables.

I'm drawing the distinction between an operation
on the container and an operation on all the
container's contents here. I think it's the right
distinction.

>With separate vtable entries, you'll either have to duplicate the whole
>MMD slots, with one extra indirection for vtable->hyper->func, or
>overloading is an all or nothing operation. Both are suboptimal.

The extra indirection for hyper ops isn't a big
deal. The operation itself should take long
enough that the indirection's lost in the noise,

> > Yeah, having hyper versions of the ops does blow out the opcode list
>> a lot, but the alternative is to end up with a half-assed system
>> that'll have the math guys down on my head. I'd as soon skip that
>> one. :)
>
>We don't need a hyper version of each opcode, if we go with the scheme
>I've proposed, either
>
> hyper
> op args
>
>or maybe
>
> hyper .OP, args

I dunno. It's awfully early to be wedging in
hacks--we ought to at least wait until we've hit
1.0...

>Such a scheme also fits nicely with Luke's summary (thanks BTW) that is:
>
> @a »op« @b
>
>always means
>
> map { $a[$_] op $b[$_] } 0..max(+@a, +@b)

I think we'll find that if we

> > ... I'm OK with putting limitations on it--pmc-only, for example,
>> so we don't have to deal with S/I/N versions of the hyper ops.
>
>PMC-only means, that you'll always have to call e.g. get_integer on the
>PMC, because the PMC might be tied. This limitation isn't really good
>for performance reasons. People might use it most likely in combination
>with natural typed arrays.

What I'm saying is that if someone does:

$foo >>*<< 1

that the 1 is a PMC, rather than a low-level
integer, in an attempt to reduce the number of
ops in the core and slots in the hyper vtable.

This is an efficiency limitation. I'm willing to
give it up and go the full route, but that'll
make the optable somewhat large. (OTOH we can
stick 'em all in hyper.ops, I suppose)

>AFAIK Perl6 is the only language that provides these hyper ops, so we
>should support these efficiently, especially with natural types.

I think we're going to disagree on which way's efficient here.

Leopold Toetsch

Apr 21, 2004, 12:05:57 PM

to Dan Sugalski, perl6-i...@perl.org

Dan Sugalski <d...@sidhe.org> wrote:

> I'm drawing the distinction between an operation
> on the container and an operation on all the
> container's contents here. I think it's the right
> distinction.

Sure. But the prefix C<hyper> just is the distinction. PerlArray's add,
add_int, bitwise whatever vtable slots are all unused. There isn't any
usage for these except throwing an exception or being overloaded as an
C<hyper> operation.

The C<hyper> prefix could look at the vtable and if it's overloaded just
call it. If it's not overloaded, a loop like that in my proof of concept
is run which (for PMCs) calls the aggregate members' vtables.
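
That dispatch rule can be sketched in C as follows. This is a minimal sketch, assuming a NULL function pointer stands for "not implemented in this class"; the PMC layout and the names scalar_add_int, lazy_add_int and hyper_add_int are hypothetical and reduced to a single slot.

```c
#include <assert.h>
#include <stddef.h>

typedef struct PMC PMC;
typedef void (*AddIntFn)(PMC *self, long v);

struct PMC {
    AddIntFn add_int;   /* NULL means "not implemented in this class" */
    long     value;
    PMC    **elems;     /* members, for aggregates */
    size_t   n_elems;
};

static void scalar_add_int(PMC *self, long v) { self->value += v; }

/* Example overload on an aggregate: record a pending offset in `value`
 * instead of touching every member. */
static void lazy_add_int(PMC *self, long v) { self->value += v; }

/* Returns 1 if the aggregate's own overload ran, 0 if the fallback loop ran. */
static int hyper_add_int(PMC *agg, long v)
{
    if (agg->add_int != NULL) {     /* slot overloaded: just call it */
        agg->add_int(agg, v);
        return 1;
    }
    for (size_t i = 0; i < agg->n_elems; ++i) {
        PMC *e = agg->elems[i];     /* fallback: each member's own vtable */
        assert(e->add_int != NULL);
        e->add_int(e, v);
    }
    return 0;
}
```

No second vtable is needed in this variant; the cost is that the aggregate's scalar slots can no longer mean anything other than the hyper operation.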

>>With separate vtable entries, you'll either have to duplicate the whole
>>MMD slots, with one extra indirection for vtable->hyper->func, or
>>overloading is an all or nothing operation. Both are suboptimal.

> The extra indirection for hyper ops isn't a big
> deal.

No. But it is for MMD.

>> hyper
>> op args
>>
>>or maybe
>>
>> hyper .OP, args

> I dunno. It's awfully early to be wedging in
> hacks--we ought to at least wait until we've hit
> 1.0...

I do consider 1000 more opcodes to be a hack ;)

leo

Leopold Toetsch

Apr 21, 2004, 11:52:06 AM

to Aaron Sherman, perl6-i...@perl.org

Aaron Sherman <a...@ajs.com> wrote:
> On Tue, 2004-04-20 at 18:06, Leopold Toetsch wrote:

>> Well, yes. Except for the special case, which is nice though:
>>
>> $ time parrot ih.imc #[1]
>> real 0m0.370s
>>
>> $ time perl i.pl #[2]
>> real 0m5.656s

> That's unrealistic.

No. A real test. I did post all the ingredients. You can run it yourself.
It uses an array of natural ints (which you can declare in Perl6), fills
it with values, and increments them (or does some other operation on them).

> ... In P6, you should be able to take:

> @a >>+<< @b

> and turn it into:

I don't get that shuffling tmp's around. Anyway, it doesn't matter. Of
course the Perl6 compiler can roll its own hyper loop for all the hyper
ops.

> Are we seriously suggesting that after JIT, that's going to be as slow
> as raw Perl, or even any slower than:

> .local ? PerlArray tmp1
> hyper
> tmp1 = __Perl_Ary_a >>+<< __Perl_Ary_b

For example: With the C<hyper> prefix (and the internal implementation)
Parrot will know that this operation will use C<max(@a.elem, @b.elem) +
1> new PMCs. So before even starting this operation, we can query the
memory subsystem and, if there are fewer free PMCs, we can trigger
a DOD run and then turn DOD *off* - it won't be of any use to run more DOD
runs - these can't yield more recycled PMCs.
So instead of running 100 useless DOD runs, we run one.
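
The reservation idea can be modelled with a toy pool in C. This is only a sketch under loud assumptions: Pool, dod_run, grow, hyper_reserve and hyper_release are hypothetical names, and the counters stand in for Parrot's real memory subsystem, which is not modelled here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a PMC header pool; all fields are hypothetical. */
typedef struct Pool {
    size_t free_pmcs;     /* headers available right now */
    size_t reclaimable;   /* dead headers a DOD run would recover */
    size_t dod_runs;      /* how many DOD runs have happened */
    int    dod_blocked;   /* nonzero while DOD is turned off */
} Pool;

static void dod_run(Pool *p)
{
    p->free_pmcs += p->reclaimable;
    p->reclaimable = 0;
    p->dod_runs++;
}

static void grow(Pool *p, size_t n) { p->free_pmcs += n; }

/* Before a hyper op that needs `needed` new PMCs: run DOD at most once,
 * allocate whatever is still missing, then block further DOD runs. */
static void hyper_reserve(Pool *p, size_t needed)
{
    assert(p != NULL);
    if (p->free_pmcs < needed) {
        dod_run(p);
        if (p->free_pmcs < needed)
            grow(p, needed - p->free_pmcs);
    }
    p->dod_blocked = 1;
}

/* After the hyper op: allow DOD runs again. */
static void hyper_release(Pool *p)
{
    assert(p != NULL);
    p->dod_blocked = 0;
}
```

A compiler-generated loop allocates one PMC at a time and cannot give the allocator this up-front count, which is the difference being argued here.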

> ?! If so, I'm curious to know why. It seems to me that you're just
> moving the work from the Perl 6 compiler all the way down to the JIT,
> but the resulting code is the same, no?

No. Parrot can e.g. access List internals. Look at my patch in the
original post. And there was *no* JIT involved with the C<hyper> op.

leo

Leopold Toetsch

Apr 21, 2004, 1:29:19 PM

to Aaron Sherman, perl6-i...@perl.org

Aaron Sherman <a...@ajs.com> wrote:

> In fact, Parrot Data Language (if there were such a thing) would likely
> introduce its own runtime-loadable opcode set to operate on a new PMC
> type called a piddle.

It would likely base the piddle object on Parrot's internal datatypes
like IntvalArray or FloatvalArray, which (then) already provide all the
low-level ops to do the desired operation. No XS, no loadable ops (which
is C code too), no compiler needed to use PDL.

E.g. $piddle->raw_data could be an IntvalArray. Perl6's hyper ops work
with that as well as with overloaded PDL ops (though I know nothing about PDL
internals).

leo

Jeff Clites

Apr 21, 2004, 1:24:42 PM

to l...@toetsch.at, Dan Sugalski, perl6-i...@perl.org

On Apr 21, 2004, at 9:05 AM, Leopold Toetsch wrote:

> Dan Sugalski <d...@sidhe.org> wrote:
>
>> I'm drawing the distinction between an operation
>> on the container and an operation on all the
>> container's contents here. I think it's the right
>> distinction.
>
> Sure. But the prefix C<hyper> just is the distinction. PerlArray's add,
> add_int, bitwise whatever vtable slots are all unused. There isn't any
> usage for these except throwing an exception or being overloaded as an
> C<hyper> operation.

Although, I would think that "@ary += 1" would extend the length of the
array by one. That is, I can think of logical uses for the
currently-unimplemented arithmetic ops on PerlArray.

> The C<hyper> prefix could look at the vtable and if it's overloaded just
> call it. If it's not overloaded, a loop like that in my proof of concept
> is run, which (for PMCs) calls the aggregate members' vtable.

So are you saying, have separate vtable slots for the hyper operations,
but then you only have to fill in the vtable->hyper_add() slot _if_ you
want it to do something other than applying vtable->add() in a loop
(otherwise, do what your proof-of-concept did)? If so, that makes a lot
of sense to me.

On the other hand, from what's been said it sounds like Perl6 may not
intend to allow you to override a hyper op independently. If so, we
don't need the separate slots, it seems, but we do need a
vtable->hyper() which somehow takes an op as a parameter, since only
the PMC itself knows how to get at its "logical contents" as a
container.
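
The first alternative (an optional hyper slot that falls back to an element-wise loop) could look roughly like this in C. This is a minimal sketch, not Parrot's vtable layout: Array, ArrayVtable and the function names are hypothetical, and the elements are plain integers rather than PMCs.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Array Array;

/* Hypothetical, heavily reduced vtable with one optional hyper slot. */
typedef struct {
    /* NULL means "not overridden": use the default element-wise loop. */
    void (*hyper_add_int)(Array *self, long value);
} ArrayVtable;

struct Array {
    const ArrayVtable *vtable;
    long  *items;
    size_t count;
};

/* Default behaviour: apply the scalar add to every element. */
void default_hyper_add_int(Array *self, long value)
{
    for (size_t i = 0; i < self->count; ++i)
        self->items[i] += value;
}

/* Dispatch: call the override if the slot is filled, else the default. */
void hyper_add_int(Array *self, long value)
{
    assert(self != NULL);
    if (self->vtable && self->vtable->hyper_add_int)
        self->vtable->hyper_add_int(self, value);
    else
        default_hyper_add_int(self, value);
}

/* Example override: a class that only touches even-indexed elements. */
void even_only_hyper_add_int(Array *self, long value)
{
    for (size_t i = 0; i < self->count; i += 2)
        self->items[i] += value;
}
```

A class that leaves the slot empty gets the plain loop for free; only a class that wants different behaviour fills it in.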

....time passes....
The more I think about it, the more it doesn't make sense to me to
allow >>+<< to be overridden independently of +. Conceptually (in
general, and based on what Luke says), this:

@a »+« 1;

doesn't read as "apply hyper-add to...", but rather as "hyper-apply
'add' to...". That is, if the syntax had been something like:

array.applyToEachElement(&add, 1);
or
array.applyToEachElement(+, 1);

then there wouldn't be any temptation to think of >>+<< as a separate
operator.

FWIW, there's an ObjC add-on framework which defines an "each" method,
to allow you to do things such as:

[[array each] increment];

to increment each element of an array, rather than the "default" way,
which would be:

[array makeObjectsPerformSelector:@selector(increment)];

Just mentioning this, because it brings up the idea of doing something
like:

new P0, .PerlArray
# fill it....
hyper P1, P0 # P1 is now a wrapper object, which holds a reference to P0
add P1, 1 # applying an op to the wrapper really hyper-applies it to P0

That is, have a "hyper-wrapper" PMC, which knows how to hyper-apply any
op to the PMC it's wrapping.

This would let us avoid a bytecode syntax in which either an op
modifies the meaning of an op later in the stream, or an op needs to
take another op as a parameter. This hyper-wrapper PMC would then be
the only thing which needs to know about hyper-ness (I think),
including encapsulating the knowledge about what to do when applying
ops to containers of unequal length (since some ops might treat this as
extending the shorter one with 1's, other ops may fill with 0's, etc.).

It would also avoid adding any vtable operations, or any ops. (The
"hyper" op I used above could be replaced by a simple "new" + "set"
sequence.)
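
A minimal C sketch of the wrapper idea, assuming an aggregate of plain integers: IntArray, HyperWrapper, hyper_wrap and hyper_apply are hypothetical names, not Parrot types, and handling of containers of unequal length is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-in for an aggregate PMC holding plain integers. */
typedef struct {
    long  *items;
    size_t count;
} IntArray;

/* Hypothetical wrapper: holds a reference to the aggregate, owns nothing. */
typedef struct {
    IntArray *target;
} HyperWrapper;

/* A binary scalar op, e.g. add or multiply. */
typedef long (*ScalarOp)(long element, long operand);

long op_add(long element, long operand) { return element + operand; }
long op_mul(long element, long operand) { return element * operand; }

/* Corresponds to "hyper P1, P0": wrap an existing aggregate. */
HyperWrapper hyper_wrap(IntArray *target)
{
    HyperWrapper w = { target };
    assert(target != NULL);
    return w;
}

/* Corresponds to "add P1, 1": an op applied to the wrapper is applied
 * to each element of the wrapped aggregate. */
void hyper_apply(HyperWrapper w, ScalarOp op, long operand)
{
    for (size_t i = 0; i < w.target->count; ++i)
        w.target->items[i] = op(w.target->items[i], operand);
}
```

The wrapped array is modified in place, so every holder of a reference to it sees the result. The wrapper itself carries no state beyond that reference.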

Hmm, I'm actually liking the idea of that approach.

JEff

Dan Sugalski

Apr 21, 2004, 3:15:37 PM

to Jeff Clites, l...@toetsch.at, perl6-i...@perl.org

At 10:24 AM -0700 4/21/04, Jeff Clites wrote:
>So are you saying, have separate vtable slots for the hyper
>operations, but then you only have to fill in the
>vtable->hyper_add() slot _if_ you want it to do something other than
>applying vtable->add() in a loop (otherwise, do what your
>proof-of-concept did)? If so, that makes a lot of sense to me.

Which is what we'd be doing anyway. Splitting the hyper operations
out into a secondary vtable is there because generally the entries
won't get overridden--sharing the table will cut down on memory usage
a bit and hopefully get us a bit more cache coherency.

>On the other hand, from what's been said it sounds like Perl6 may
>not intend to allow you to override a hyper op independently.

Wouldn't be the first bonus thing Larry gets in perl 6, then. One of
the Secret Toy Surprises in every box of Perl 6 Flakes. :)

>The more I think about it, the more it doesn't make sense to me to
>allow >>+<< to be overridden independently of +.

The math folks tell me it makes sense. I can come up with a
half-dozen non-contrived examples, and will if I have to. :-P

>then there wouldn't be any temptation to think of >>+<< as a
>separate operator.

I think... that'd be bad, generally speaking. (And not just because
the math folks have tensors and know what to do with 'em) It's a
separate set of operations and, while there are *default* behaviors,
there's reason to override those defaults while not touching other
things.

Doug McNutt

Apr 21, 2004, 4:01:35 PM

to perl6-i...@perl.org

At 15:15 -0400 4/21/04, Dan Sugalski wrote:
>At 10:24 AM -0700 4/21/04, Jeff Clites wrote:
>>then there wouldn't be any temptation to think of >>+<< as a separate operator.
>
>I think... that'd be bad, generally speaking. (And not just because the math folks have tensors and know what to do with 'em) It's a separate set of operations and, while there are *default* behaviors, there's reason to override those defaults while not touching other things.

It's nice to see that they are no longer "vector" operators. Thanks Larry.

But >>+<< really is a vector sum in the mathematical sense. It's >>*<< that I would really like to override some day so that, with a scalar result expected, it will return the sum of the products taken component wise - the dot product.

$dotproduct = @myclassvectorA >>*<< @myclassvectorB;

It could also return a cross product (not just a component-wise product) in a list context.

@myclassvectorX = @myclassvectorA >>*<< @myclassvectorB;

Multiplying a vector by a scalar should do just what the hyper-op does.

@mylongervector = $scalefactor >>*<< @myclassvectorA;

I understand that it's not practical extraction and report generation and understandably shouldn't be built into perl 6, but the individual override capability seems important to me.
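
For reference, the three operations being distinguished here, written out in C over plain double arrays. The function names are illustrative only; this says nothing about how Perl 6 or Parrot would spell them.

```c
#include <assert.h>
#include <stddef.h>

/* Component-wise product: out[i] = a[i] * b[i]. */
void hyper_mul(const double *a, const double *b, double *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] * b[i];
}

/* Dot product: the sum of the component-wise products, a single scalar. */
double dot_product(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}

/* Scaling a vector by a scalar: out[i] = k * a[i]. */
void scale(double k, const double *a, double *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = k * a[i];
}
```

The first and third are element-wise maps, while the dot product also reduces the products to one value; that reduction is the behaviour the override would add.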

--
--> There are 10 kinds of people: those who understand binary, and those who don't <--

Leopold Toetsch

Apr 21, 2004, 5:23:55 PM

to Jeff Clites, perl6-i...@perl.org

Jeff Clites <jcl...@mac.com> wrote:
> On Apr 21, 2004, at 9:05 AM, Leopold Toetsch wrote:

> Although, I would think that "@ary += 1" would extend the length of the
> array by one. That is, I can think of logical uses for the
> currently-unimplemented arithmetic ops on PerlArray.

I can think of "@ar .= 'foo'" doing something--or not.

But that's all not really useful. Setting the length (or better
elements) of an array is a totally different thing than "P0 = I0". It's
C<elements P0, I0> *if* we have a dedicated opcode/vtable for it, or it's
just C<P0.elements(I0)>, i.e. a method.

This is the same madness as defining C<S0 += S1> as concatenate or even
worse.

The C<$len = @ar> semantics of Perl arrays aren't really Parrot's
semantics. It's a Perl language thingy IMHO to emit "len = elements ar".

leo

Leopold Toetsch

Apr 21, 2004, 6:14:27 PM

to Jeff Clites, perl6-i...@perl.org

Jeff Clites <jcl...@mac.com> wrote:
> On Apr 21, 2004, at 9:05 AM, Leopold Toetsch wrote:

[ just another f'up to separate items ]

> So are you saying, have separate vtable slots for the hyper operations,
> but then you only have to fill in the vtable->hyper_add() slot _if_ you
> want it to do something other than applying vtable->add() in a loop
> (otherwise, do what your proof-of-concept did)? If so, that makes a lot
> of sense to me.

We have:

Parray->vtable->add ... unused

and it's never used in a hyper loop (given that aggregate elements
aren't arrays by themselves). Hyper-add for PMCs just calls

Pelem->vtable->add ... scalars have it

on each aggregate member. The Parray->vtable->add slot is still unused
and will give you the "unimplemented" exception.

But if a class inherits from PerlArray, it can override

Parray->vtable->add ... delegated, ok

and use it as the hyper-add operation.

The idea is that

hyper .ADD, args # or hyper \n add args

or whatever the syntax really is, works as an operator prefix. The
C<hyper> prefix inspects the operation and either dispatches directly to
the overridden vtable or runs its own map-like loop. The latter can call
an appropriate vtable or roll its own optimized version for natural
types.

> On the other hand, from what's been said it sounds like Perl6 may not
> intend to allow you to override a hyper op independently.

I don't care and mathematicians will like to be able to override e.g.
»*«.

> ... If so, we
> don't need the separate slots, it seems, but we do need a
> vtable->hyper() which somehow takes an op as a parameter, since only
> the PMC itself knows how to get at its "logical contents" as a
> container.

C<hyper> per se isn't an operation. It's a "do this sequence of
operations with ..." prefix. Keeping the prefix notation inside just
keeps that logically consistent.

> ....time passes....

> ... but rather as "hyper-apply
> 'add' to...". That is, if the syntax had been something like:

Hah. Same thoughts. Yes.

> array.applyToEachElement(&add, 1);

Yep. And that's exactly what I try to achieve. Without +1000 additional
opcodes and (~50) vtables.

> then there wouldn't be any temptation to think of >>+<< as a separate
> operator.

The C<+> operator is never applied to the array. It's always for the
elements. C< »+« > isn't an opcode, that's it. It doesn't need a vtable,
it doesn't need multiple implementations, all with the whole loop
repeated over and over.

> That is, have a "hyper-wrapper" PMC, which knows how to hyper-apply any
> op to the PMC it's wrapping.

That *is* an interesting idea. It could hide some of the ugliness of my
approach. OTOH the usage of that hyper-PMC is a bit clumsy; reading the
code doesn't really tell you what's happening:

hyper # ok we do hyper add
add P0, 10

or

.param pmc h
add h, 10 # what's that - h like hyper or not?

> This would let us avoid a bytecode syntax in which either an op
> modifies the meaning of an op later in the stream, or an op needs to
> take another op as a parameter.

There is already one op (wrapper__) that executes another op ;) But
anyway, C<hyper> does modify the behavior of e.g. C<add>, so I don't
have a problem with implementing it exactly like that.

> ... This hyper-wrapper PMC would then be
> the only thing which needs to know about hyper-ness (I think),

Yes. Including delegation. All transparent. But the drawbacks are in
situations like passing such a PMC around.

> JEff

leo

Larry Wall

Apr 21, 2004, 1:51:00 PM

to Perl6 Internals List

On Wed, Apr 21, 2004 at 11:04:56AM -0400, Simon Glover wrote:
: Why? I was under the impression that in Perl 6 it was going to be
: possible to declare arrays that only contain values of a particular
: type -- I believe the syntax I saw was:
:
: my @array is float;

Just for the record, that's

my num @array;

You don't use "is" on any array unless you want to change the indexing
policy (think "tie" in Perl 5). The standard "is Array" type knows
how to deal with compactly stored low-level types as well as PMCs.
The type you declare out front is the return type of a single array
or hash element. The type you apply with "is" is the type of the
array or hash as an aggregate. Most aggregate types are parameterized
types based on the return type, if you think about it.

In any event, it is absolutely my intent that the builtin array
types of Perl 6 support PDL directly, both in terms of efficiency
and flexibility. You ain't seen Apocalypse 9 yet, but that's what
it's all about. Straight from my rfc list file:

ch09/116 Efficient numerics with perl
ch09/117.rr Perl syntax support for ranges
ch09/122 types and structures
ch09/123 Builtin: lazy
ch09/124 Sort order for any hash
ch09/136 Implementation of hash iterators
ch09/142 Enhanced Pack/Unpack
ch09/169.rr Proposed syntax for matrix element access and slicing.
ch09/202 Arrays: Overview of multidimensional array RFCs (RFC 203 through RFC 207)
ch09/203 Arrays: Notation for declaring and creating arrays
ch09/204 Arrays: Use list reference for multidimensional array access
ch09/205 Arrays: New operator ';' for creating array slices
ch09/206 Arrays: @#arr for getting the dimensions of an array
ch09/207 Arrays: Efficient Array Loops
ch09/225 Data: Superpositions
ch09/231 Data: Multi-dimensional arrays/hashes and slices
ch09/247 pack/unpack C-like enhancements
ch09/266 Any scalar can be a hash key
ch09/268.rr Keyed arrays
ch09/273 Internal representation of Pseudo-hashes using attributes.
ch09/282 Open-ended slices
ch09/341 unified container theory

Larry

Larry Wall

Apr 21, 2004, 3:46:21 PM

to perl6-i...@perl.org

On Wed, Apr 21, 2004 at 03:15:37PM -0400, Dan Sugalski wrote:
: The math folks tell me it makes sense. I can come up with a
: half-dozen non-contrived examples, and will if I have to. :-P

I've said this before, and I'll keep repeating it till it sinks in.
The math folks are completely, totally, blazingly untrustworthy on
this subject. Hyper means one thing and one thing only in Perl 6.
If you give me that little prize in the cereal box I will throw
it out immediately with extreme prejudice. The mathematicians
have all of Unicode to choose from. They can define their own
infix_circumfix_metaoperator if they can come up with a consistent
meaning for it. They can't have my »« without a fight.

Larry

Larry Wall

Apr 21, 2004, 4:14:23 PM

to perl6-i...@perl.org

On Wed, Apr 21, 2004 at 02:01:35PM -0600, Doug McNutt wrote:
: I understand that it's not practical extraction and report generation
: and understandably shouldn't be built into perl 6, but the individual
: override capability seems important to me.

I think if you want a mathematical dwim metaoperator, that's fine,
but it should be something other than Perl 6's hyper »«.

But I repeat myself...

Larry

Larry Wall

Apr 21, 2004, 8:16:03 PM

to Leopold Toetsch, Jeff Clites, perl6-i...@perl.org

On Thu, Apr 22, 2004 at 12:14:27AM +0200, Leopold Toetsch wrote:
: I don't care and mathematicians will like to be able to override e.g.
: »*«.

None of my mail on this subject seems to be getting through to
p6i, and I'm getting frustrated. Perl 6 will *not* be allowing
mathematicians to turn »« into a math-dwimmer. It's strictly
for parallelism. Perl 6 will let the mathematicians define their
own infix_circumfix_metaoperator if they like, or define any other
Unicode operators that they like, but they can't have »«.

Larry

Aaron Sherman

Apr 22, 2004, 10:29:39 AM

to Perl6 Internals List

On Wed, 2004-04-21 at 13:51, Larry Wall wrote:

> In any event, it is absolutely my intent that the builtin array
> types of Perl 6 support PDL directly, both in terms of efficiency
> and flexibility. You ain't seen Apocalypse 9 yet, but that's what
> it's all about. Straight from my rfc list file:

Ok, the combination of Dan's (perhaps overzealous) emphasis on the
dynamic nature of Parrot's client languages and my assumption that we
had learned all there was to learn about the storage of aggregates
misled me here.

That said, I now see why hyper goes in Parrot... maybe. It depends on
how dynamic Perl is about lazy arrays (e.g. "my int @foo = 1..Inf") and
what happens when I:

my int @foo = 1..3;
$foo[0] = URI::AutoFetch.new("http://numberoftheweek.math.gov/");

If that's polymorphic, we're hosed. If it's an auto-conversion, then
we're good. I like the polymorphic version for a lot of reasons, but
I'll understand if we can't get that.

Thanks all!

Larry Wall

Apr 22, 2004, 12:26:57 PM

to Perl6 Internals List

On Thu, Apr 22, 2004 at 10:29:39AM -0400, Aaron Sherman wrote:
: That said, I now see why hyper goes in Parrot... maybe. It depends on
: how dynamic Perl is about lazy arrays (e.g. "my int @foo = 1..Inf")

As dynamic as it needs to be. The built-in array type has to know how
much of the array is really there already, and how to build the rest of the
array on demand.

: and what happens when I:
:
: my int @foo = 1..3;
: $foo[0] = URI::AutoFetch.new("http://numberoftheweek.math.gov/");
:
: If that's polymorphic, we're hosed. If it's an auto-conversion, then
: we're good. I like the polymorphic version for a lot of reasons, but
: I'll understand if we can't get that.

It's auto-conversion. An "int" declaration is a strong guarantee that
the thing is never going to be required to store a random scalar. On
the other hand, an "Int" declaration merely requires that the scalar
stored "does" Int.

Larry

Aaron Sherman

Apr 22, 2004, 3:58:40 PM

to Larry Wall, Perl6 Internals List

On Wed, 2004-04-21 at 15:46, Larry Wall wrote:
> On Wed, Apr 21, 2004 at 03:15:37PM -0400, Dan Sugalski wrote:
> : The math folks tell me it makes sense. I can come up with a
> : half-dozen non-contrived examples, and will if I have to. :-P
>
> I've said this before, and I'll keep repeating it till it sinks in.
> The math folks are completely, totally, blazingly untrustworthy on
> this subject. [...] They can't have my »« without a fight.

Ah... now you see the true face of the age-old Linguistics-Mathematics
wars! ;-)

But seriously, to summarize what I've learned from this thread:

* "my int @foo" will compile down into an efficient representation
* PDL (and its like) will be able to use this to efficiently
  perform high-level operations on arrays, but only built-in
  operations
* If someone (e.g. PDL) wants to implement other operations and
  their hyper-equivalent, they can do it in a high level language
  like P6 or as run-time loadable parrot opcodes (which PDL will
  certainly have to do, since most of their ops are in an ancient
  and gigantic Fortran lib).

Sounds like problem solved to me! Thanks Larry.

Dan Sugalski

Apr 22, 2004, 4:11:57 PM

to Perl6 Internals List

At 3:58 PM -0400 4/22/04, Aaron Sherman wrote:
>On Wed, 2004-04-21 at 15:46, Larry Wall wrote:
>> On Wed, Apr 21, 2004 at 03:15:37PM -0400, Dan Sugalski wrote:
>> : The math folks tell me it makes sense. I can come up with a
>> : half-dozen non-contrived examples, and will if I have to. :-P
>>
>> I've said this before, and I'll keep repeating it till it sinks in.
>> The math folks are completely, totally, blazingly untrustworthy on
>> this subject. [...] They can't have my »« without a fight.
>
>Ah... now you see the true face of the age-old Linguistics-Mathematics
>wars! ;-)

Just to be clear here, for the archive:

Perl 6's hyper operators *will* use hyper vtable
slots in the PMCs, and will use the hyper
versions of the opcodes. Leo and I are going to
fight out the "Do we have lots of ops or do we
use a shift opcode" implementation details.
Whether or not Larry allows overriding the hyper
vtable slots in the base grammar is up to him. So:

$a = $b + $c # VTABLE_add($b, $c, $a)
$a = $b[1] + $c # VTABLE_add_keyed($b, key(1), $c, key(), $a, key())
$a = $b >>+<< $c # VTABLE_add_hyper($b, $c, $a)

The default hyper functions will do whatever
piecewise activity is defined, though I expect
that'll be overridden in a few unusual cases.

Aaron Sherman

Apr 23, 2004, 2:09:53 PM

to Leopold Toetsch, Perl6 Internals List

Note: We've moved past hyper-ops (I hope!), but there are still some
details in this post that deserve a response on tangential topics.

On Wed, 2004-04-21 at 11:52, Leopold Toetsch wrote:
> Aaron Sherman <a...@ajs.com> wrote:

> > That's unrealistic.
>
> No. A real test.

Sorry, I was not clear enough. Yes, of course, non-Parrot Perl 5 is
going to be slow at this, but we expect that and your results showed
nothing surprising.

What might be interesting is to compare Parrot to Parrot doing this with
and without a hyper-operator. That's all I was trying to say.

As for the DOD: you have an excellent point, but it extends far beyond
the hyper-operators. I'm starting to think that front-ends like the
Python compiler or the Perl 6 compiler are going to need controls over
the DOD for just the reasons you cite. After all, they know when they
are about to start doing some large looping operation that's all highly
constrained with respect to allocation. It would make sense to gather
the resources they need, lock down DOD, do what they need to do and then
unlock the DOD...

Dan Sugalski

Apr 23, 2004, 2:28:43 PM

to Perl6 Internals List

At 2:09 PM -0400 4/23/04, Aaron Sherman wrote:
>As for the DOD: you have an excellent point, but it extends far beyond
>the hyper-operators. I'm starting to think that front-ends like the
>Python compiler or the Perl 6 compiler are going to need controls over
>the DOD for just the reasons you cite. After all, they know when they
>are about to start doing some large looping operation that's all highly
>constrained with respect to allocation. It would make sense to gather
>the resources they need, lock down DOD, do what they need to do and then
>unlock the DOD...

While that capability's already in there and has been for years, most
of the literature I've come across indicates that, except in very
specific circumstances, twiddling garbage collection behaviour is a
bad idea. There are too many factors that affect performance, too
much information that's not available at either code-writing or
compile time to do it right, and too many subtle and/or bizarre
things that cause nasty effects.

This is the sort of thing that should be left to Parrot to deal with.
Apps and compilers aren't in a good position to make the right
decision. (Not to say that we shouldn't have infrastructure in place
to make things work better but, again, it's parrot-level stuff that
apps and compilers shouldn't deal with outside of perhaps referencing
the information gathered from a profiling run or twelve)

Leopold Toetsch

Apr 23, 2004, 2:52:40 PM

to Aaron Sherman, perl6-i...@perl.org

Aaron Sherman <a...@ajs.com> wrote:

> What might be interesting is to compare Parrot to Parrot doing this with
> and without a hyper-operator. That's all I was trying to say.

I'd posted that as well. Here again with an O3 build of parrot:

$ time parrot ih.imc
6060

real 0m0.214s
user 0m0.150s
sys 0m0.040s

$ time parrot -j i.imc
6060

real 0m1.793s
user 0m1.750s
sys 0m0.040s

Both are using the same natural int array. But the builtin does use list
internals to directly access list chunk memory. This is of course only
possible for natural types in our aggregates, where no vtable is needed
to access the aggregate's data members. But the numbers do indicate that
number crunchers want just that.
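
The two access patterns can be contrasted with a small C sketch. It is a toy chunked list under stated assumptions, not Parrot's List implementation: Chunk, ChunkList and the function names are hypothetical, and the generic accessor here simply walks the chunks on every call.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_ITEMS 4   /* tiny on purpose; real chunks would be larger */

typedef struct Chunk {
    long items[CHUNK_ITEMS];
    size_t used;            /* number of valid items in this chunk */
    struct Chunk *next;
} Chunk;

typedef struct {
    Chunk *first;
} ChunkList;

/* Generic path: locate element i by walking the chunks, as a keyed
 * access has to do on every call. Returns NULL if i is out of range. */
long *list_get_ptr(ChunkList *list, size_t i)
{
    assert(list != NULL);
    for (Chunk *c = list->first; c; c = c->next) {
        if (i < c->used)
            return &c->items[i];
        i -= c->used;
    }
    return NULL;
}

/* Element-at-a-time add through the generic accessor. */
void add_keyed_loop(ChunkList *list, size_t length, long v)
{
    for (size_t i = 0; i < length; ++i) {
        long *p = list_get_ptr(list, i);
        if (p)
            *p += v;
    }
}

/* Hyper-style add: walk each chunk's memory directly. */
void add_hyper(ChunkList *list, long v)
{
    for (Chunk *c = list->first; c; c = c->next)
        for (size_t k = 0; k < c->used; ++k)
            c->items[k] += v;
}
```

Both functions produce the same result. The second one avoids a lookup per element, which is only possible because the element type is a plain integer stored inline in the chunk.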

> As for the DOD: you have an excellent point, but it extends far beyond
> the hyper-operators. I'm starting to think that front-ends like the
> Python compiler or the Perl 6 compiler are going to need controls over
> the DOD for just the reasons you cite. After all, they know when they
> are about to start doing some large looping operation that's all highly
> constrained with respect to allocation. It would make sense to gather
> the resources they need, lock down DOD, do what they need to do and then
> unlock the DOD...

Well, it's unlikely that we can expose all the details, all the more as
such details may change. We could have a generalized version of such an
operation though:

i_need_now_x_pmcs_and_wont_dispose_any_start 100000
# ... deep clone code or loop
i_need_now_x_pmcs_and_wont_dispose_any_end

er EOPCODETOOLONG :)

But as soon as there are any temps created inside that code, this might
already fail miserably because of resource exhaustion. OTOH a compiler
can already emit:

sweep 1
sweepoff
... deep clone or some such
sweepon

leo

Dan Sugalski

Apr 23, 2004, 3:34:11 PM

to Perl6 Internals List

At 3:25 PM -0400 4/23/04, Aaron Sherman wrote:
>That I did not know about, but noticed Dan pointing it out too. I'm
>still learning a lot here,

It might be best, for everyone's peace of mind, blood pressure, and
general edification, to take a(nother) run through the documentation.
The stuff in docs/pdds isn't too out of date (mostly) and all the
opcodes have POD, so you can do something like:

perldoc ops/var.ops

to see the documentation for all the variable opcodes. Or at least
all the ones in var.ops.

While diving in feet-first does get you going, looking for the rocks
and deep water first is never ill-advised... :)

Aaron Sherman

Apr 23, 2004, 3:25:28 PM

to Leopold Toetsch, Perl6 Internals List

On Fri, 2004-04-23 at 14:52, Leopold Toetsch wrote:
> Aaron Sherman <a...@ajs.com> wrote:
>
> > What might be interesting is to compare Parrot to Parrot doing this with
> > and without a hyper-operator. That's all I was trying to say.
>
> I'd posted that as well. Here again with an O3 build of parrot:

Oops, missed that. Thanks! I'm shocked by the difference in
performance... it makes me wonder how efficient the optimization+JIT is
when the two operations are SO different. I must simply not understand
what's going on at the lowest level here. More investigation needed on
my part, as I'm sure this will be an important point for me to
understand in later topics that I'll run into writing Parrot code.

> > As for the DOD: you have an excellent point, but it extends far beyond
> > the hyper-operators. I'm starting to think that front-ends like the
> > Python compiler or the Perl 6 compiler are going to need controls over
> > the DOD for just the reasons you cite. After all, they know when they
> > are about to start doing some large looping operation that's all highly
> > constrained with respect to allocation. It would make sense to gather
> > the resources they need, lock down DOD, do what they need to do and then
> > unlock the DOD...
>
> Well, it's unlikely that we can expose all the details the more that
> such details may change. We could have a generalized version of such an
> operation though:
>
> i_need_now_x_pmcs_and_wont_dispose_any_start 100000
> # ... deep clone code or loop
> i_need_now_x_pmcs_and_wont_dispose_any_end
>
> er EOPCODETOOLONG :)

Heh, yeah I getcha. It would be interesting, but as you point out it's
ugly and specialized.

> sweep 1
> sweepoff
> ... deep clone or some such
> sweepon

That I did not know about, but noticed Dan pointing it out too. I'm
still learning a lot here, and while I know it's frustrating, I hope to
condense what I learn into some usable forms (perhaps adding to the FAQ
as I suggested to Dan). I don't always agree with the two of you, but
that's not required. I just need to understand enough that I can get the
work done that I want to do, and make it efficient enough that people
actually USE it ;-)

Leopold Toetsch

Apr 23, 2004, 5:03:57 PM

to Aaron Sherman, perl6-i...@perl.org

Aaron Sherman <a...@ajs.com> wrote:
> On Fri, 2004-04-23 at 14:52, Leopold Toetsch wrote:
>>
>> I'd posted that as well. Here again with an O3 build of parrot:

> Oops, missed that. Thanks! I'm shocked by the difference in
> performance... it makes me wonder how efficient the optimization+JIT is
> when the two operations are SO different.

The difference isn't due to bad JIT code. It's simply that an internal
hyper (prefix) opcode with INTVALs or FLOATVALs *and* with the knowledge
of the underlying array can achieve a lot more than a generalized
scheme of some keyed opcodes plus a loop. These keyed opcodes might
already be optimized like using INTVALs for keys or even (useless and
unimplementable - hi Dan:) multi-keyed opcodes.

The Perl6ish »op« is an example where generalization doesn't really
help. Introducing distinct vtable slots in every non-aggregate PMC for
unused math operations in aggregates is suboptimal, as is
permuting opcodes with hyper or multi-keyed variants.

Hyper op isn't an opcode, it's a "map aggregate's members to deal with
an opcode" operation. It should be implemented like that.

And that's the difference in performance I've shown.

> ... I must simply not understand
> what's going on at the lowest level here.

Well, the "lowest level" here is the wide performance range from an
interpreted "everything is an object" language "down" to optimized C
code. Have a look (again) at the mops tests. They reach from 2 MOps to
800 MOps (on Athlon 800). I.e. between these two POVs you have a factor
of 400 in that tight loop case (which of course isn't typical for RL
programs but it shows the range nevertheless).

leo

Aaron Sherman

Apr 25, 2004, 10:27:41 AM

to Dan Sugalski, Perl6 Internals List

On Fri, 2004-04-23 at 15:34, Dan Sugalski wrote:
> At 3:25 PM -0400 4/23/04, Aaron Sherman wrote:
> >That I did not know about, but noticed Dan pointing it out too. I'm
> >still learning a lot here,
>
> It might be best, for everyone's peace of mind, blood pressure, and
> general edification, to take a(nother) run through the documentation.
> The stuff in docs/pdds isn't too out of date (mostly) and all the
> opcodes have POD, so you can do something like:

Yeah, I've been plowing through it a piece at a time. I'm currently
still mowing down the DOD docs which (given that I've been in
application space for the last 8 years, and the world of GC has changed
radically in that time) is a hard read. There are 14,304 lines of POD in
the docs subdir and its immediate subdirs. That's a fair amount of
reading, especially for something as dense as technical documentation.

> While diving in feet-first does get you going, looking for the rocks
> and deep water first is never ill-advised... :)

Is that really what I'm doing?

It's also the case that there's a HUGE amount of documentation and
source code, and I doubt that ANYONE coming to this list and asking
questions will understand all of it. I would be so egotistical as to
even suggest that I've read more of the source and docs than most who
will be asking questions in the next few years.

Given that, getting the stupid stuff out of the way now, and putting it
in a highly indexed form (e.g. a mailing list FAQ) that people on the
list can be pointed at, might save EVEN MORE blood pressure.
