newbie question....

Matt Greenwood

Mar 11, 2004, 6:06:56 PM

to perl6-i...@perl.org

Hi all,
I have a newbie question. If the answer exists in a doc, just
point the way (I browsed the docs directory). What is the design
rationale for so many opcodes in parrot? What are the criteria for
adding/deleting them?

Thanks,
Matt

Karl Brodowsky

Mar 11, 2004, 7:51:12 PM

to perl6-i...@perl.org

Matt Greenwood wrote:

> I have a newbie question. If the answer exists in a doc, just
> point the way (I browsed the docs directory). What is the design
> rationale for so many opcodes in parrot?

Let me try as another newbie... ;-)

Since the opcodes of parrot are not directly supported by any existing hardware,
at least not now ;-), they have to be mapped to native code during execution.
This costs something per parrot-operation. So if there are many different opcodes
in parrot with powerful functionality behind them, this overhead does not hurt so
much, because a parrot instruction gets a lot of stuff done. At least I heard this
kind of explanation for Perl5, which uses something slightly like parrot internally
as well.

Maybe this reduces the answer from the real experts to a yes/no? ;-)

Best regards,

Karl

Leopold Toetsch

Mar 12, 2004, 2:07:06 AM

to Matt Greenwood, perl6-i...@perl.org

Matt Greenwood <Matt.Gr...@twosigma.com> wrote:
> Hi all,
> I have a newbie question. If the answer exists in a doc, just
> point the way (I browsed the docs directory). What is the design
> rationale for so many opcodes in parrot?

We have four different register types. Each has to be covered by its own
opcode variants, which leads to a lot of opcode permutations:

$ grep -w add docs/ops/math.pod
=item B<add>(inout INT, in INT)
=item B<add>(inout NUM, in INT)
=item B<add>(inout NUM, in NUM)
=item B<add>(in PMC, in INT)
=item B<add>(in PMC, in NUM)
=item B<add>(in PMC, in PMC)
=item B<add>(out INT, in INT, in INT)
=item B<add>(out NUM, in NUM, in INT)
=item B<add>(out NUM, in NUM, in NUM)
=item B<add>(in PMC, in PMC, in INT)
=item B<add>(in PMC, in PMC, in NUM)
=item B<add>(in PMC, in PMC, in PMC)

We could of course provide only the very last one, but that would
prohibit any optimizations. Opcodes with native types running in the JIT
code are many times faster than their PMC counterparts.

> ... What are the criteria for
> adding/deleting them?

On demand :)

> Thanks,
> Matt

leo

Jared Rhine

Mar 11, 2004, 6:25:48 PM

to Matt Greenwood, perl6-i...@perl.org

[Matt == Matt.Gr...@twosigma.com on Thu, 11 Mar 2004 18:06:56 -0500]

Matt> What is the design rationale for so many opcodes in parrot?

Completeness and performance. Many of the opcodes are type-specific
variants of other multi-type opcodes.

Given that 99+% of parrot code will be automatically generated from
language compilers, the performance benefits of additional specialized
opcodes outweigh the inability to keep all the opcodes in a human's
head at once.

Matt> What are the criteria for adding/deleting them?

Consensus among parrot developers. To be an opcode, a particular
function should really need to be implemented in C to work properly.

-- ja...@wordzoo.com

"Better to be of a rare breed than a long line." -- TDK

Matt Greenwood

Mar 12, 2004, 8:55:28 AM

to l...@toetsch.at, perl6-i...@perl.org

I completely agree that you would have multiple *of the same* opcode for
the different types. I guess the question I was (too delicately) asking,
is why you have opcodes that are usually in standard libraries, and even
some that aren't. For example; fact, exsec..., why have both concat and
add...?

Matt

> -----Original Message-----
> From: Leopold Toetsch [mailto:l...@toetsch.at]
> Sent: Friday, March 12, 2004 2:07 AM
> To: Matt Greenwood
> Cc: perl6-i...@perl.org
> Subject: Re: newbie question....
>
> Matt Greenwood <Matt.Gr...@twosigma.com> wrote:
> > Hi all,
> > I have a newbie question. If the answer exists in a doc, just
> > point the way (I browsed the docs directory). What is the design
> > rationale for so many opcodes in parrot?
>
> We have four different register types. They have to be covered by
> opcode, which leads to a lot of opcode permutations:
>
> $ grep -w add docs/ops/math.pod
> =item B<add>(inout INT, in INT)
> =item B<add>(inout NUM, in INT)
> =item B<add>(inout NUM, in NUM)
> =item B<add>(in PMC, in INT)
> =item B<add>(in PMC, in NUM)
> =item B<add>(in PMC, in PMC)
> =item B<add>(out INT, in INT, in INT)
> =item B<add>(out NUM, in NUM, in INT)
> =item B<add>(out NUM, in NUM, in NUM)
> =item B<add>(in PMC, in PMC, in INT)
> =item B<add>(in PMC, in PMC, in NUM)
> =item B<add>(in PMC, in PMC, in PMC)
>
> We could of course only provide the very last one but that would
> prohibit any optimizations. Opcodes with native types running in the
> JIT code are many times faster than their PMC counterparts.
>
> > ... What are the criteria for
> > adding/deleting them?
>

Dan Sugalski

Mar 12, 2004, 10:03:19 AM

to Matt Greenwood, perl6-i...@perl.org

Whether we have a lot or not actually depends on how you count. (Last
time I checked the x86 still beat us, but that was a while back) In
absolute, unique op numbers we have more than pretty much any other
processor, but that is in part because we have *no* runtime op
variance.

For example, if you look you'll see we have 28 binary "add" ops.
.NET, on the other hand, only has one, and most hardware CPUs have a
few, two or three. However... for us each of those add ops has a very
specific, fixed, and invariant parameter list. The .NET version, on
the other hand, is specified to be fully general, and has to take the
two parameters off the stack and do whatever the right thing is,
regardless of whether they're platform ints, floats, objects, or a
mix of these. With most hardware CPUs you'll find that several bits
in each parameter are dedicated to identifying the type of the
parameter (int constant, register number, indirect offset from a
register). In both cases (.NET and hardware) the engine needs to
figure out *at runtime* what kind of parameters it's been given.
Parrot, on the other hand, figures it out *at compile time*.

Now, for hardware this isn't a huge deal--it's a well-known problem,
they've a lot of transistors (and massive parallelism) to throw at
it, and it only takes a single pipeline stage to go from the raw to
the decoded form. .NET does essentially the same thing, decoding the
parameter types and getting specific, when it JITs the code. (And
runs pretty darned slowly when running without a JIT, though .NET was
designed to have a JIT always available)

Parrot doesn't have massive parallelism, nor are we counting on
having a JIT everywhere or in all circumstances. We could waste a
bunch of bits encoding type information in the parameters and figure
it all out at runtime, but... why bother? Since we *know* with
certainty at compile (or assemble) time what the parameter types are,
there's no reason to not take advantage of it. So we do.

It's also important to note that there's no less code involved (or,
for the hardware, complexity) doing it our way or the
decode-at-runtime way--all the code is still there in every case,
since we all have to do the same things (add a mix of ints, floats,
and objects, with a variety of ways of finding them) so there's no
real penalty to doing it our way. It actually simplifies the JIT some
(no need to puzzle out the parameter types), so in that we get a win
over other platforms since JIT expenses are paid by the user every
run, while our form of decoding's only paid when you compile.

Finally, there's the big "does it matter, and to whom?" question. As
someone actually writing parrot assembly, it looks like parrot only
has one "add" op--when emitting pasm or pir you use the "add"
mnemonic. That it gets qualified and assembles down to one variant or
another based on the (fixed at assemble time) parameters is just an
implementation detail. For those of us writing op bodies, it just
looks like we've got an engine with full signature-based dispatching
(which, really, we do--it's just a static variant), so rather than
having to have a big switch statement or chain of ifs at the
beginning of the add op we just write the specific variants
identified by function prototype and leave it to the engine to choose
the right variant.
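
A rough sketch of the static signature dispatch described above, written in Python rather than Parrot's actual C and ops files. The qualified names such as `add_i_i_i`, the helper `qualify`, and the rule that a register's first letter gives its type are assumptions made for illustration, not a description of the real assembler.

```python
# Hypothetical op table: qualified name -> op body. Each body serves exactly
# one fixed parameter list, so it never has to inspect operand types.
OPS = {
    "add_i_i_i": lambda a, b: a + b,         # INT = INT + INT
    "add_n_n_n": lambda a, b: a + b,         # NUM = NUM + NUM
    "add_n_n_i": lambda a, b: a + float(b),  # NUM = NUM + INT
}


def qualify(mnemonic, operands):
    """Assemble-time step: choose the variant from the register kinds."""
    # Assumed convention: the first letter of a register name is its type.
    name = "_".join([mnemonic] + [reg[0].lower() for reg in operands])
    if name not in OPS:
        raise ValueError(f"no such op variant: {name}")
    return name


def run(program, regs):
    """Run loop: a plain table lookup, with no type decoding."""
    for name, (dst, a, b) in program:
        regs[dst] = OPS[name](regs[a], regs[b])
    return regs


# The author writes only "add"; the variant is fixed before anything runs.
source = [("I2", "I0", "I1"), ("N1", "N0", "I2")]
program = [(qualify("add", operands), operands) for operands in source]
regs = run(program, {"I0": 2, "I1": 3, "N0": 0.5})
print(program[0][0], program[1][0], regs["I2"], regs["N1"])
```

The point of the split is that `qualify` runs once per instruction at assemble time, while `run` pays nothing for type decoding on each execution.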

Heck, we could, if we chose, switch over to a system with a single
add op with tagged parameter types and do runtime decoding without
changing the source for the ops at all--the op preprocessor could
glom them all together and autogenerate the big switch/if ladder at
the head of the function. (We're not going to, of course, but we
could. Heck, it might be worth doing if someone wanted to translate
parrot's interpreter engine to hardware, though it'd have bytecode
that wasn't compatible with the software engine)
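
For contrast, a minimal sketch of the alternative mentioned above: a single add op with tagged parameters that decodes at run time, built mechanically from the same per-signature bodies. The tag letters, the `(tag, value)` operand format, and `make_generic` are hypothetical; a dictionary lookup stands in for the generated switch/if ladder.

```python
# Per-signature bodies, keyed by the operand tags they were written for.
VARIANTS = {
    ("i", "i"): lambda a, b: a + b,         # INT + INT
    ("n", "n"): lambda a, b: a + b,         # NUM + NUM
    ("n", "i"): lambda a, b: a + float(b),  # NUM + INT
}


def make_generic(variants):
    """Glue the specific bodies into one op that decodes tags on every call."""
    def generic(a, b):
        # Operands arrive as (tag, value) pairs, so the type information
        # travels with the data instead of being fixed in the op number.
        (tag_a, val_a), (tag_b, val_b) = a, b
        body = variants.get((tag_a, tag_b))
        if body is None:
            raise TypeError(f"add is not defined for ({tag_a}, {tag_b})")
        return body(val_a, val_b)
    return generic


generic_add = make_generic(VARIANTS)
result = generic_add(("n", 1.5), ("i", 2))
print(result)
```

The op bodies are unchanged in both schemes; only the moment at which the variant is selected moves from assemble time to run time.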

As for what the rationale is... well, it's a combination of whim and
necessity for adding them, and brutal reality for deleting them.

Our ops fall into two basic categories. The first, like add, are just
basic operations that any engine has to perform. The second, like
time, are low-level library functions. (Where the object ops fall is
a matter of some opinion, though I'd put most of them in the "basic
operation" category)

For something like hardware, splitting standard library from the CPU
makes sense--often the library requires resources that the hardware
doesn't have handy. (I wouldn't, for example, want to contemplate
implementing time functions with cross-timezone and leap-second
calculations with a mass 'o transistors. The System/360 architecture
has a data-formatting instruction that I figure had to tie up a good
10-15% of the total CPU transistors when it was first introduced)
Hardware is also often bit-limited--opcodes need to fit in 8 or 9
bits.

For things like the JVM or .NET, opcodes are also bit-limited (though
there's much less of a real reason to do so) since they only allocate
a byte for their opcode number. Whether that's a good idea or not
depends on the assumptions underlying the design of their engines--a
lot of very good people at Sun and Microsoft were involved in the
design and I fully expect the engines met their design goals.

Parrot, on the other hand, *isn't* bit-limited, since our ops are 32
bits. (A more efficient design on RISC systems where byte-access is
expensive) That opens things up a bunch.

If you think about it, the core opcode functions and the core
low-level libraries are *always* available. Always. The library
functions also have a very fixed parameter list. Fixed parameter
list, guaranteed availability... looks like an opcode function to me.
So they are. We could make them library functions instead, but all
that'd mean would be that they'd be more expensive to call (our
sub/method call is a bit heavyweight) and that you'd have to do more
work to find and call the functions. Seemed silly.

Or, I suppose, you could think of it as if we had *no* opcodes at all
other than end and loadoplib. Heck, we've a loadable opcode
system--it'd not be too much of a stretch to consider all the opcode
functions other than those two as just functions with a fast-path
calling system. The fact that a whole bunch of 'em are available when
you start up's just a convenience for you.

So, there ya go. We've either got two, a reasonable number the same
as pretty much everyone else, an insane number of them, or the
question itself is meaningless. Take your pick, they're all true. :)
--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk

Dan Sugalski

Mar 12, 2004, 12:52:07 PM

to Matt Greenwood, l...@toetsch.at, perl6-i...@perl.org

At 8:55 AM -0500 3/12/04, Matt Greenwood wrote:
>I completely agree that you would have multiple *of the same* opcode for
>the different types. I guess the question I was (too delicately) asking,
>is why you have opcodes that are usually in standard libraries, and even
>some that aren't. For example; fact, exsec...,

I answered this in some detail, but the short answer is "There's no
reason not to"

>why have both concat and
>add...?

Erm... because they do completely different things?

Brent "Dax" Royal-Gordon

Mar 12, 2004, 2:03:39 PM

to Matt Greenwood, perl6-i...@perl.org

Matt Greenwood wrote:
> why have both concat and
> add...?

How, exactly, is taking two strings, making a third string that's big
enough to contain both, and copying the contents of those two strings
into the third one like taking two numbers, doing a binary OR with
carry, and storing the result in a third number?

Some languages overload addition to do both. Other languages don't; in
fact, a Perl add and a Perl concat (to take one example) behave very
differently from one another.

Generally speaking, it's better for compilers to do a bit of extra work
to figure out the argument types involved than it is for them to throw
away information they already have. (Besides, it's not that big a deal
with PMCs--a PythonString can put the same code in its concat_*() and
add_*() vtable entries.)

--
Brent "Dax" Royal-Gordon <br...@brentdax.com>
Perl and Parrot hacker

Oceania has always been at war with Eastasia.

Matt Greenwood

Mar 12, 2004, 2:57:12 PM

to Brent "Dax" Royal-Gordon, perl6-i...@perl.org

> How, exactly, is taking two strings, making a third string that's big
> enough to contain both, and copying the contents of those two strings
> into the third one like taking two numbers, doing a binary OR with
> carry, and storing the result in a third number?

Firstly, you have made an assumption that the addition here is
equivalent to OR and carry, which may be correct for certain
representations of integral datatypes, but certainly isn't for any kind
of floating point arithmetic that I know of.

Secondly, you missed the point that I was making. The current add
opcodes defined in parrot are the following:

add(in PMC, in PMC, in PMC)
add(in PMC, in INT)
add(in PMC, in NUM)
add(in PMC, in PMC)
add(in PMC, in PMC, in INT)
add(in PMC, in PMC, in NUM)
add(inout INT, in INT)
add(inout NUM, in INT)
add(inout NUM, in NUM)
add(out INT, in INT, in INT)
add(out NUM, in NUM, in INT)
add(out NUM, in NUM, in NUM)

I was simply asking why there wasn't an

add(out STR, in STR, in STR)

which seems reasonable. This is not a question of operator overloading,
but rather semantics - that's all.

> Some languages overload addition to do both. Other languages don't; in
> fact, a Perl add and a Perl concat (to take one example) behave very
> differently from one another.

Ahh yes, but this includes implicit type conversion, which is not what
you want to do in Parrot (if I am to understand Dan correctly)

DanS> Right now it's flat-out disallowed in parrot, and I'm also
DanS> comfortable with that. (Plan on keeping it that way, honestly)

> Generally speaking, it's better for compilers to do a bit of extra work
> to figure out the argument types involved than it is for them to throw
> away information they already have. (Besides, it's not that big a deal
> with PMCs--a PythonString can put the same code in its concat_*() and
> add_*() vtable entries.)

Agreed, though in this case it's the opposite. The compiler doesn't need
to do any extra work because it knows exactly what argument types it
has.

Thanks,

Matt

Brent "Dax" Royal-Gordon

Mar 12, 2004, 4:59:14 PM

to Matt Greenwood, perl6-i...@perl.org

Matt Greenwood wrote:
> Firstly, you have made an assumption that the addition here is
> equivalent to OR and carry, which may be correct for certain
> representations of integral datatypes, but certainly isn't for any
> kind of floating point arithmetic that I know of.

True enough, but I think I got my point across--concatenation is a
fundamentally different operation from addition.

> Secondly, you missed the point that I was making. The current add
> opcodes defined in parrot are the following:
>

(various combinations of PMC, INT, and NUM)

>
> I was simply asking why there wasn't an
>
> add(out STR, in STR, in STR)
>
> which seems reasonable. This is not a question of operator
> overloading, but rather semantics - that's all.

I suppose that depends on what you want it to do. If you want it to
convert $2 and $3 to integers, add them, convert the result to a string,
and put it in $1, then the answer is "that's not a common enough
operation to warrant adding the extra opcodes"--especially since the
I/S/N registers aren't supposed to be used for anything but optimizations.

If you want it to concatenate $2 and $3 and insert the result into $1,
and remove the "concat" opcode altogether...well, the answer stems from
the existence of add(in PMC, in PMC, in PMC). What should that
do--integer addition, or string concatenation? Remember, some of our
languages don't overload add for strings. We need a separate concat(in
PMC, in PMC, in PMC), so we might as well have concat(out STR, in STR,
in STR) too.
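
A small sketch of why the two ops stay separate at the PMC level. The class names and the `add`/`concat` methods below are hypothetical stand-ins for vtable entries, and the Perl-like numeric conversion is simplified to `float()`, which does not model Perl's handling of non-numeric strings.

```python
class PerlStringPMC:
    """Perl-like string: add and concat mean different things."""

    def __init__(self, text):
        self.text = text

    def add(self, other):
        # Numeric addition: both operands are converted to numbers first.
        return float(self.text) + float(other.text)

    def concat(self, other):
        # String concatenation: the texts are joined, never interpreted.
        return PerlStringPMC(self.text + other.text)


class PythonStringPMC:
    """Python-like string: the language overloads + for strings."""

    def __init__(self, text):
        self.text = text

    def concat(self, other):
        return PythonStringPMC(self.text + other.text)

    # Same code behind both entries, as suggested for a PythonString.
    add = concat


perl_sum = PerlStringPMC("3").add(PerlStringPMC("4"))
perl_cat = PerlStringPMC("3").concat(PerlStringPMC("4")).text
py_sum = PythonStringPMC("3").add(PythonStringPMC("4")).text
print(perl_sum, perl_cat, py_sum)
```

With a single merged op, the Perl-like type could not tell which of its two behaviours the compiler intended.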

Paolo Molaro

Mar 14, 2004, 9:31:54 AM

to perl6-i...@perl.org

My weekly perusing on parrot lists...

On 03/12/04 Dan Sugalski wrote:
> For example, if you look you'll see we have 28 binary "add" ops.
> .NET, on the other hand, only has one, and most hardware CPUs have a

Actually, there are three opcodes: add, add.ovf, add.ovf.un (the last
two throw an exception on overflow with signed or unsigned addition:
does parrot have any way to detect overflow?).

> few, two or three. However... for us each of those add ops has a very
> specific, fixed, and invariant parameter list. The .NET version, on
> the other hand, is specified to be fully general, and has to take the
> two parameters off the stack and do whatever the right thing is,
> regardless of whether they're platform ints, floats, objects, or a
> mix of these. With most hardware CPUs you'll find that several bits

Well, not really: add is specified for fp numbers, 32-bit ints, 64-bit
ints and pointer-sized ints. Addition of objects or structs is handled
by the compiler (by calling the op_Addition static method if it exists,
otherwise the operation is not defined for the types). Also, no mixing
is allowed, except between 32-bit ints and pointer-sized ints;
conversions, if needed, have to be inserted by the compiler.

> in each parameter are dedicated to identifying the type of the
> parameter (int constant, register number, indirect offset from a
> register). In both cases (.NET and hardware) the engine needs to
> figure out *at runtime* what kind of parameters it's been given.

Well, on hardware the opcodes are really different, even if it may look
like they have a major opcode and a sub-opcode specifying the type.

> the decoded form. .NET does essentially the same thing, decoding the
> parameter types and getting specific, when it JITs the code. (And
> runs pretty darned slowly when running without a JIT, though .NET was
> designed to have a JIT always available)

Yes, so it doesn't matter:-) It's like saying that x86 code runs slow if
you run it in an emulator:-) It's true, but almost nobody cares
(especially since IL code can now be run with a jit on x86, ppc, sparc
and itanium - s390, arm, amd64 are in the works).

> Parrot doesn't have massive parallelism, nor are we counting on
> having a JIT everywhere or in all circumstances. We could waste a
> bunch of bits encoding type information in the parameters and figure
> it all out at runtime, but... why bother? Since we *know* with
> certainty at compile (or assemble) time what the parameter types are,
> there's no reason to not take advantage of it. So we do.

Sure, doing things as java does, with different opcodes for different
types is entirely reasonable if you design a VM for interpretation
(though arguably there should be a limit to the combinatorial explosion
of different type arguments). There is only a marginal issue with
generics code that the IL way of doing opcodes allows and the java
style does not, but it doesn't matter much.

> real penalty to doing it our way. It actually simplifies the JIT some
> (no need to puzzle out the parameter types), so in that we get a win
> over other platforms since JIT expenses are paid by the user every
> run, while our form of decoding's only paid when you compile.

This overhead is negligible (and is completely avoided by using the
ahead of time compilation feature of mono).

> Finally, there's the big "does it matter, and to whom?" question. As
> someone actually writing parrot assembly, it looks like parrot only
> has one "add" op--when emitting pasm or pir you use the "add"
> mnemonic. That it gets qualified and assembles down to one variant or

Well, as you mention, someone has to do it and parrot needs to do it
anyway for runtime-generated parrot asm (if parrot doesn't do it already
I guess it will need to do it anyway to support features like eval etc.).
Anyway, if you're going to JIT it doesn't matter if you use one opcode
for add or one opcode for each different kind of addition. If you're
going to interpret the bytecode, having specific opcodes makes sense.

> For things like the JVM or .NET, opcodes are also bit-limited (though
> there's much less of a real reason to do so) since they only allocate
> a byte for their opcode number. Whether that's a good idea or not

Don't know about the JVM, but the CLR doesn't have a single byte limit
for opcodes: two byte opcodes are already specified (and if you consider
prefix opcodes you could say there are 3- and 4-byte opcodes already:
unaligned.volatile.cpblk is such an opcode). Also, the design allows for
any number of bytes per opcode, though I don't think that will ever be
needed: the CLR is designed to provide a fast implementation of the
low-level opcodes and to provide fast method calls: combining the two
you can implement rich semantics in a fast way without needing to change
the VM. There are still a few rough areas that could use a speedup
with specialized opcodes, but there are very few of them and 2-3
additional opcodes will fix them.

> Parrot, on the other hand, *isn't* bit-limited, since our ops are 32
> bits. (A more efficient design on RISC systems where byte-access is
> expensive) That opens things up a bunch.

Note that it also uses much more data cache (and disk space): this may
become relevant especially if parrot is to target embedded systems.
Has anyone done measurements on real-life code to see how much disk space
is used (data cache effects could be measured with cpu counters, but
it's much more difficult)?
For example, adding two regs and storing the result in a third requires 16
bytes of bytecode in parrot. The same expression takes 4 bytes in IL
code in the best case, 7 in more complex but probably more common methods.
The maximum is 13 bytes (in the CLR operations happen on the eval stack,
so a single byte is enough, but I added the opcodes needed to load two
local vars and to store the result: you can consider the CLR a mixed
stack and register machine, but, unlike parrot, there can be as many as
65535 registers, each with its own type).
Anyway, please consider this issue: I'd suggest at least to use a single
opcode_t to store the indexes to the argument and result registers for
an opcode. This would cut down the space required to 8 bytes, still
bigger than IL code, but much more comparable (unless, of course,
opcode_t is changed to be 8 bytes on some platforms...).
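
The byte counts above can be reproduced with a little arithmetic. This is only a sketch: the 4-byte `opcode_t` comes from the thread, while the IL instruction forms named in the comments are my reading of how the 4, 7 and 13 byte figures arise, not something stated in the message.

```python
OPCODE_T_BYTES = 4  # 32-bit opcode_t, as described in the thread


def parrot_op_bytes(n_operands, packed=False):
    """One word for the op number, plus one word per operand,
    or a single shared word when the register indexes are packed."""
    operand_words = 1 if packed else n_operands
    return (1 + operand_words) * OPCODE_T_BYTES


current = parrot_op_bytes(3)               # op + dest + two sources
packed = parrot_op_bytes(3, packed=True)   # op + one word of indexes

# IL: load two locals, add, store one local (assumed instruction forms).
il_bytes = {
    "best (ldloc.0 style, 1 byte each)": 1 + 1 + 1 + 1,
    "common (ldloc.s style, 2 bytes each)": 2 + 2 + 1 + 2,
    "maximum (long ldloc/stloc, 4 bytes each)": 4 + 4 + 1 + 4,
}

print(current, packed, il_bytes)
```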

> functions also have a very fixed parameter list. Fixed parameter
> list, guaranteed availability... looks like an opcode function to me.
> So they are. We could make them library functions instead, but all
> that'd mean would be that they'd be more expensive to call (our
> sub/method call is a bit heavyweight) and that you'd have to do more
> work to find and call the functions. Seemed silly.

Well, a different solution is to speed up function calls: I imagine
nobody would be against that:-)

> So, there ya go. We've either got two, a reasonable number the same
> as pretty much everyone else, an insane number of them, or the
> question itself is meaningless. Take your pick, they're all true. :)

An issue I think you should consider as well with the current parrot
design is this: the last time I built parrot there were 180 vtable slots
(in vtable.dump: not sure this is the actual number, but it seems
reasonable). 4 of them are because of add, for example. This means that
for each type, on a 32 bit system, at least 180*4 bytes are spent on the
vtable. How likely is it that the vtable will grow when parrot
starts getting some real use with compilers starting to target it?
For a moderately complex app that uses 500 different types that amounts
to more than 350 KB of memory already just for the vtables. Or are you
going to discourage the definition of new PMC types and to do vtable
dispatching in a different language-specific way?
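
The memory estimate works out as follows, using only the figures given above (180 slots, 4-byte pointers on a 32 bit system, 500 types); anything beyond the vtable pointers themselves is ignored.

```python
VTABLE_SLOTS = 180    # count reported from vtable.dump in the message
POINTER_BYTES = 4     # 32 bit system
PMC_TYPES = 500       # the "moderately complex app" in the example

bytes_per_type = VTABLE_SLOTS * POINTER_BYTES
total_bytes = bytes_per_type * PMC_TYPES
total_kb = total_bytes / 1024

print(bytes_per_type, total_bytes, round(total_kb, 1))
```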

Thanks.
lupus

--
-----------------------------------------------------------------
lu...@debian.org debian/rules
lu...@ximian.com Monkeys do it better

Tim Bunce

Mar 15, 2004, 7:26:04 AM

to Dan Sugalski, Matt Greenwood, perl6-i...@perl.org

On Fri, Mar 12, 2004 at 10:03:19AM -0500, Dan Sugalski wrote:
> At 6:06 PM -0500 3/11/04, Matt Greenwood wrote:
> >Hi all,
> > I have a newbie question. If the answer exists in a doc, just
> >point the way (I browsed the docs directory). What is the design
> >rationale for so many opcodes in parrot? What are the criteria for
> >adding/deleting them?
>
> Whether we have a lot or not actually depends on how you count. (Last

Is someone tracking the mailing list and adding questions and (good)
answers into the FAQ?

Tim.

Chromatic

Mar 15, 2004, 12:26:52 PM

to Tim Bunce, perl6-i...@perl.org

On Mon, 2004-03-15 at 04:26, Tim Bunce wrote:

> Is someone tracking the mailing list and adding questions and (good)
> answers into the FAQ?

Whoops, I'd planned to add this opcode question and answer to the FAQ
this weekend. Thanks for the reminder, Tim!

-- c