
Bytecode portability and word/int sizes


Melvin Smith

Nov 22, 2003, 12:50:49 PM
to Leopold Toetsch, dan Sugalski, perl6-i...@perl.org
At 12:13 PM 11/22/2003 +0000, you wrote:
> * write intval size into PBC header

Leo, I know this is a first cut at freeze/thaw, and I'm happy you've
done it. Let me make some comments to you and Dan.

I'm pretty sure Dan and I discussed this when I was reworking bytecode
to be portable last year, but I think at the time we decided there were other
things (like freezing PMCs) that needed to be done. Now that it is in, it's
time to re-light this discussion until we get a document out of it. :)


A) I didn't look at your commit to see the reason for storing the INTVAL size
into the bytecode header, so I won't question it yet, but since p6i has argued
this for a couple of years, I want to restate some things.

Parrot currently assumes INTVAL size == OPCODE size because
both get configured as the same integral type, although you can choose
to override it, it is not supported to choose INTVAL > OPCODE, though
the inverse is. So storing it in the header is probably redundant, unless
we change the rules.

While we may have platforms where void */size_t is larger than INT,
we should not have any where void */size_t is smaller than INT.

An int should always pack into 1 opcode_t; if it doesn't, we have configured
INTVAL wrongly (someone chose a 64-bit type on a 32-bit platform),
and the fact that Configure even makes this possible says we have the
typedef hierarchy wrong. I still feel that Parrot_Int should be dependent
on Parrot_Opcode.
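
To make the typedef point concrete, here is a minimal C11 sketch of an INTVAL type that is derived from the opcode type, so a mismatched pair cannot be configured. The names sketch_opcode_t and sketch_intval are hypothetical stand-ins, not what Parrot's Configure actually generates.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Hypothetical stand-ins for the configured types. */
typedef intptr_t        sketch_opcode_t;  /* one bytecode slot, pointer-sized */
typedef sketch_opcode_t sketch_intval;    /* INTVAL derived from the slot type */

/* void * may be wider than the integer type, but never narrower. */
static_assert(sizeof(void *) >= sizeof(sketch_intval),
              "INTVAL is wider than a pointer");

/* An INTVAL must pack into exactly one opcode slot. */
static_assert(sizeof(sketch_intval) <= sizeof(sketch_opcode_t),
              "INTVAL does not fit in one opcode_t");

/* Runtime view of the same rule: returns 1 when an INTVAL fits in a slot. */
int intval_fits_in_opcode(void)
{
    return sizeof(sketch_intval) <= sizeof(sketch_opcode_t);
}
```

Because sketch_intval is a typedef of sketch_opcode_t, choosing a 64-bit INTVAL on top of a 32-bit opcode type is not expressible at all; the assertions only document the intent.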


B) For bytecodes to be portable we have to have some rules about
standard sizes. Yes, we can unpack a 64-bit bytecode on a 32-bit
CPU, but INTVAL constants (non-PMC) get truncated.

Since it is NEVER portable for someone to write code using the Ix
registers with values wider than 32 bits, we have to make a rule. Any use
of integers wider than 32 bits should use a PMC type, not an integral type.

The fact that:

    set I0, 0
LOOP:
    inc I0
    lt I0, <MAXINT CONSTANT>, LOOP

runs differently between platforms is enough reason to make
a least common denominator MAXINT the Parrot standard.
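
A minimal C sketch of why such a loop bound is not portable. It assumes that a 64-bit constant unpacked on a build with 32-bit INTVALs simply keeps its low 32 bits; the exact truncation rule and both helper names are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical: the value a 32-bit INTVAL build would keep from a 64-bit
   constant (low 32 bits, read back as two's complement). */
int32_t truncate_to_32(int64_t constant)
{
    uint32_t low = (uint32_t)((uint64_t)constant & 0xFFFFFFFFu);

    /* Convert without relying on implementation-defined narrowing. */
    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;
    return (int32_t)(low - 0x80000000u) + INT32_MIN;
}

/* Returns 1 when the constant means the same thing on both builds. */
int is_portable_constant(int64_t constant)
{
    return (int64_t)truncate_to_32(constant) == constant;
}
```

With a bound of 2^32 + 5, the 64-bit build loops about four billion times while the truncating build sees a bound of 5, which is the behavioural difference the loop above describes.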

The summary of this:

Parrot must use a PMC Int type to store portable data, so the INTVAL
should be stored as part of the PMC stream, not the bytecode
header, as we may support multiple sizes.

-Melvin

Leopold Toetsch

Nov 22, 2003, 5:34:28 PM
to Melvin Smith, perl6-i...@perl.org
Melvin Smith <mrjol...@mindspring.com> wrote:

> Parrot currently assumes INTVAL size == OPCODE size because
> both get configured as the same integral type, although you can choose
> to override it, it is not supported to choose INTVAL > OPCODE, though
> the inverse is. So storing it in the header is probably redundant, unless
> we change the rules.

1) opcode_t is the size of opcodes, but has to hold pointers too, for
prederef and such. I'd name it the natural wordsize.

2) INTVAL is a VM register category holding some integers, which the VM
can work with -- as is FLOATVAL.
Your conclusion above is just the reverse. The whole machinery relies on
the fact that we can say sizeof(opcode_t) == sizeof(void *).
sizeof(INTVAL) is AFAIK not related to that in any way.

3) If there is such a configure option and it doesn't work, it's a
mistake.

> B) For bytecodes to be portable we have to have some rules about
> standard sizes. Yes, we can unpack a 64-bit bytecode on a 32-bit
> CPU, but INTVAL constants (non-PMC) get truncated.

The concept of having INTVAL constants inside the opcodes is
wrong from a general POV. Please have a look at e.g. jit/arm/ to see
what workarounds immediate constants require.

> Since it is NEVER portable for someone to write code using the Ix
> registers with values > than 32-bit, we have to make a rule.

which is: Parrot core types, their usage and the concept need some
rework when it comes to making 64-bit INTVALs work on 32-bit CPUs.

IIRC: "Make mine SuperSized" by the well known and ever forgotten
Bryan C. Warnock and the discussion following!!!1 was quite informative.

> lt I0, <MAXINT CONSTANT>, LOOP

> Runs differently between platforms is enough reason to make
> a least common denominator MAXINT the Parrot standard.

Yep, that's the summary of the problem. INTVAL constants are not portable,
just as FLOATVAL constants aren't, if they are inline.

> The summary of this:

> Parrot must use a PMC Int type to store portable data, so the INTVAL
> should be stored as part of the PMC stream, not the bytecode
> header, as we may support multiple sizes.

How do you communicate the long-long-long intval constants to the PMC?

> -Melvin

leo

Melvin Smith

Nov 22, 2003, 10:14:42 PM
to leopold Toetsch, dan Sugalski, perl6-i...@perl.org
At 11:34 PM 11/22/2003 +0100, Leopold Toetsch wrote:
>Melvin Smith <mrjol...@mindspring.com> wrote:
>
> > Parrot currently assumes INTVAL size == OPCODE size because
> > both get configured as the same integral type, although you can choose
> > to override it, it is not supported to choose INTVAL > OPCODE, though
> > the inverse is. So storing it in the header is probably redundant, unless
> > we change the rules.
>
>1) opcode_t is the size of opcodes, but has to hold pointers too, for
> prederef and such. I'd name it the natural wordsize.
>
>2) INTVAL is a VM register category holding some integers, which the VM
> can work with -- as is FLOATVAL.
> Your conclusion above is just reverse. The whole machinery relies on

My conclusion isn't reverse, it is the current state of Parrot, is it not?

> the fact that we can say sizeof(opcode_t) == sizeof(void *).
> sizeof(INTVAL) is AFAIK not related to that in any way.

But it IS related, unless you remove immediate operands. Parrot currently
assumes the INTVAL will be in 1 opcode.

set_i_ic is: intregs[i] = curopcode[2]

Anyway, the "unofficial" relationship of opcode_t to INTVAL is only an
implementation problem, not a portability problem.
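
As a minimal C sketch of that one-slot assumption: the layout below (op number, register index, immediate) and all names are hypothetical, chosen only to mirror "intregs[i] = curopcode[2]".

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t sketch_opcode_t;  /* hypothetical: one bytecode slot */
typedef intptr_t sketch_intval;    /* hypothetical: same type as the slot */

enum { SKETCH_NUM_REGS = 32 };

/* Hypothetical set_i_ic: cur_opcode[0] is the op number, [1] the target
   register index, [2] the immediate constant.
   Returns the address of the next op. */
sketch_opcode_t *sketch_set_i_ic(sketch_opcode_t *cur_opcode,
                                 sketch_intval *int_regs)
{
    assert(cur_opcode[1] >= 0 && cur_opcode[1] < SKETCH_NUM_REGS);

    /* One slot in, one INTVAL out: this only works while an INTVAL is no
       wider than an opcode slot. */
    int_regs[cur_opcode[1]] = (sketch_intval)cur_opcode[2];
    return cur_opcode + 3;
}
```

If INTVAL were wider than the slot, the single read of cur_opcode[2] could no longer carry the whole constant, which is the relationship being described.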

I think this is another example of my inability to correctly communicate
an idea. When I say something is unsupported, I'm saying that currently
it doesn't work. Whether it _should_ work is not what I am saying.

Typically, size_t, void * and long int are all defined as the wordsize of the
platform, and this is because the majority of all operations are integer
operations (addresses, indexing, iteration/incrementing). I say "typically"
because I can only speak from experience, and as soon as I say "always"
someone will point out a platform that doesn't fit. :)

On 64-bit platforms C usually provides int as 32-bit and long int as 64-bit.
In any case, there is still a 64-bit integer available, and that is the
type that should be used for INTVAL since it will be optimal for the CPU.

**Again, this does not mean it has to be required to store an INTVAL in 1
opcode; it's just the way it currently is. Java works just fine with single
byte opcodes (for certain definitions of "fine").


> > B) For bytecodes to be portable we have to have some rules about
> > standard sizes. Yes, we can unpack a 64-bit bytecode on a 32-bit
> > CPU, but INTVAL constants (non-PMC) get truncated.
>
>The concept of having INTVAL constants inside the opcodes is
>wrong from a general POV. Please have a look at e.g jit/arm/ what
>immediate constants are requiring as work arounds.

I'm not aware of those problems concerning immediate integers.
Could you describe the issue? I really don't go into the JIT much.

I can't agree that immediate operands are ever "generally wrong" as
there are immediate instructions on most hardware and those
instructions are efficient, since the instruction and operand
are both in the pipeline.

Maybe they don't fit into our JIT design?

> > Since it is NEVER portable for someone to write code using the Ix
> > registers with values > than 32-bit, we have to make a rule.
>
>which is, Parrot core types, their usage and the concept needs some
>rework, when it comes to make 64bit INTVALs work on 32-bit CPUs.

Yes, true, so see my comments down further. Maybe we can arrive
at a conclusion on what the rework should look like.

>IIRC: "Make mine SuperSized" by the well known and ever forgotten
>Bryan C. Warnock and the discussion following!!!1 was quite informative.

I've been saying the same thing Bryan has been saying.
One is: the "default" Ix registers should be the same size as opcode_t, and
both should probably match size_t. Everyone wants fast, basic INT operations,
and having a number in the Ix register LARGER than what the hardware can use
as an address is a waste in the general case.

Ix regs are for:
1) Fast integer stuff
2) Iteration (increment variables)
3) Conditional checks
4) Branching and holding addresses
5) Indexing arrays and strings
6) Holding small constant values and flags

All of this works out fine if we use the native size, and none of this
should be penalized (for example, if we decided to use "default" 64-bit
INTVAL registers on a 32-bit CPU, this is penalization -- which I am NOT
proposing).

It is possible that we can rework Parrot to provide 64-bit registers that
would be portable, and even lay them over our existing register set, which
would require little rework. I expect it will hurt performance if we do it
transparently, but I've not thought about the "transparent" solution(s), if
there are any.

>How do you communicate the long-long-long intval constants to the PMC?

You can't do that now anyway, but the two options I see are:
(1) use a string or (2) use a multiple register op
or..... (3) Fix Parrot not to assume any integer constant must fit into an
Ix register or single opcode! :)

So, on to a solution:

Any solution must:

1) Provide fast integer operations in the native wordsize of the hardware
2) Provide efficient register block saving/restoring
3) Work on any platform in non-JIT mode with no runtime op modification

Here is one idea, though I haven't thought about all of the impacts:

1) The assembler (or IMCC) needs to be smart enough to correctly compile
   64-bit integer constants on all platforms but limit loads to the
   Ix regs to 32-bit.
2) Map 64-bit registers over the same register segment that the Ix
   registers use. We call them Lx. Provide a few operations for accessing
   the Lx registers. This will be a VERY limited set of ops for set/get +
   the math ops + inc/dec and a few others. Maybe even a branch/bsr for Lx.
   Loading of 64-bit constant values into an Lx reg is supported on all
   platforms, and Ix registers are explicitly 32-bit.

3a is optional, depending on how we want to overlay Ix and Lx.

3a) We could map the Ix registers onto a 256 byte segment (currently it is
    128 on 32-bit, 256 on 64-bit). This means that on 32-bit platforms
    there will be a word between Ix registers, but we get a 1 to 1
    intersection and it is easier to rework the register allocator since
    Ix always = Lx for any x.
3b) If we don't actually "map" Ix to Lx as far as the assembly code goes
    (meaning, you can't load Ix and expect the value to be there for Lx),
    then we don't have to pad the Ix register blocks. The downside is Lx
    intersects with Ix and Ix+1 by requirement. No big deal. My gut feeling
    is this is the way to go because the Lx regs will be the rare case.
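
A minimal C11 sketch of option 3b, assuming 32 Ix registers of 32 bits each (the 128-byte segment mentioned above) and an Lx view laid directly over them. The union, its names, and the mapping of Lx onto I(2x) and I(2x+1) are illustrative assumptions, not an existing Parrot structure.

```c
#include <assert.h>   /* assert, static_assert (C11) */
#include <stdint.h>

enum { SKETCH_NUM_IREGS = 32, SKETCH_NUM_LREGS = 16 };

/* Hypothetical register segment for option 3b: no padding between Ix regs. */
typedef union {
    int32_t i[SKETCH_NUM_IREGS];  /* Ix: explicitly 32-bit, 128 bytes total */
    int64_t l[SKETCH_NUM_LREGS];  /* Lx: 64-bit view over the same bytes    */
} sketch_int_segment;

static_assert(sizeof(sketch_int_segment) == 128,
              "segment should stay at 128 bytes");

/* Store a 64-bit value in Lx. In this layout that overwrites I(2x) and
   I(2x+1); which half lands in which Ix depends on byte order. */
void sketch_set_l(sketch_int_segment *seg, int x, int64_t value)
{
    assert(x >= 0 && x < SKETCH_NUM_LREGS);
    seg->l[x] = value;
}
```

The block save/restore cost stays the same as today on 32-bit, since the segment size is unchanged; the price is that code using Lx must treat the two overlapped Ix registers as clobbered.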

If we keep supporting immediate values in the bytestream, the Lx ops would
use 2 opcode slots on 32-bit hardware, 1 opcode slot on 64-bit. It'll still
be packed/unpacked portably, but we'd have to have a portable NEXT() macro
and friends for the op cores. Easy enough to accomplish. Or we could just do
away with immediate values altogether.
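
A minimal C sketch of the two-slot case, assuming 32-bit opcode slots and a low-word-first order. The slot order, the names, and the shape of the NEXT-style macro are all assumptions; a real format would have to fix them in the bytecode spec.

```c
#include <stdint.h>

typedef uint32_t sketch_slot32;   /* hypothetical 32-bit opcode slot */

/* Slots taken by one 64-bit immediate when slots are 32 bits wide. */
enum { SKETCH_L_SLOTS = 2 };

/* Hypothetical portable NEXT(): advance past an op of `nslots` slots. */
#define SKETCH_NEXT(pc, nslots) ((pc) + (nslots))

/* Split a 64-bit constant across two slots, low word first. */
void sketch_pack_l(sketch_slot32 *slots, int64_t value)
{
    uint64_t bits = (uint64_t)value;            /* well-defined modulo 2^64 */
    slots[0] = (sketch_slot32)(bits & 0xFFFFFFFFu);
    slots[1] = (sketch_slot32)(bits >> 32);
}

/* Reassemble the constant from its two slots. */
int64_t sketch_unpack_l(const sketch_slot32 *slots)
{
    uint64_t bits = ((uint64_t)slots[1] << 32) | slots[0];

    /* Convert back without relying on implementation-defined narrowing. */
    if (bits <= (uint64_t)INT64_MAX)
        return (int64_t)bits;
    return (int64_t)(bits - (uint64_t)INT64_MAX - 1u) + INT64_MIN;
}
```

An op such as a hypothetical set_l_lc would then occupy 2 + SKETCH_L_SLOTS slots on 32-bit builds, and the op core would step over it with SKETCH_NEXT(pc, 2 + SKETCH_L_SLOTS) instead of a hard-coded length.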

It is simply _optional_ to use the Lx registers. It is _only_ for code that
explicitly works with values that could potentially be > 32-bit, and for
some reason the user code doesn't wish to use a PMC type. It just so happens
that the code would run faster on 64-bit CPUs, but it would still run on
32-bit, and Ix ops would run optimized on either.

99.9% of core register use will be the "always native" Ix regs.

-Melvin


Melvin Smith

Nov 22, 2003, 10:32:26 PM
to leopold Toetsch, dan Sugalski, perl6-i...@perl.org
At 10:14 PM 11/22/2003 -0500, Melvin Smith wrote:
>At 11:34 PM 11/22/2003 +0100, Leopold Toetsch wrote:
>>Melvin Smith <mrjol...@mindspring.com> wrote:
>> > to override it, it is not supported to choose INTVAL > OPCODE, though
>> > the inverse is. So storing it in the header is probably redundant, unless
>> > we change the rules.
>> Your conclusion above is just reverse. The whole machinery relies on
>
>My conclusion isn't reverse, it is the current state of Parrot, is it not?

Arg, you are correct, I must be dyslexic. I re-read my original
note, even after you pointed it out and still read it in reverse.
How embarrassing. :(

I hope you did get the basic idea of what I "thought" I was saying,
since we were actually agreeing, even if I was convinced that we
were not.

-Melvin


Leopold Toetsch

Nov 23, 2003, 7:07:33 AM
to Melvin Smith, perl6-i...@perl.org
Melvin Smith <mrjol...@mindspring.com> wrote:
> At 11:34 PM 11/22/2003 +0100, Leopold Toetsch wrote:

>>The concept of having INTVAL constants inside the opcodes is
>>wrong from a general POV. Please have a look at e.g jit/arm/ what
>>immediate constants are requiring as work arounds.

> I'm not aware of those problems concerning immediate integers.
> Could you describe the issue? I really don't go into the JIT much.

The ARM architecture can only load 12-bit immediate constants directly.
It's a size limit just like the one we have when going with 32-bit ops and
64-bit INTVALs. This means that integer constants bigger than a certain size
had better be in the constant table. That's of course suboptimal too,
as we would need a lot of op duplication to handle both integer constant
formats.
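
For illustration, a minimal C sketch of that limit. In the classic 32-bit ARM data-processing encoding the 12 bits are split into an 8-bit value and a 4-bit rotate field, so only an 8-bit pattern rotated right by an even amount can be an immediate. The function names are hypothetical, and this is not the check that jit/arm/ actually uses.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (0 <= n < 32). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    return n == 0 ? v : (v << n) | (v >> (32u - n));
}

/* Returns 1 if `value` is an 8-bit pattern rotated right by an even
   amount, i.e. fits the 12-bit immediate field; 0 otherwise. */
int arm_imm_encodable(uint32_t value)
{
    for (unsigned rot = 0; rot < 32; rot += 2) {
        /* Undo a right-rotation by `rot`; 8 bits or fewer must remain. */
        if (rotl32(value, rot) <= 0xFFu)
            return 1;
    }
    return 0;
}
```

Values such as 0xFF or 0xFF000000 pass, while 0x101 or 0x12345678 do not and would have to come from a constant table or be built in several instructions.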

> I can't agree that immediate operands are ever "generally wrong" as
> there are immediate instructions on most hardware and those
> instructions are efficient, since the instruction and operand
> are both in the pipeline.

> Maybe they don't fit into our JIT design?

They don't fit for all architectures and all sizes. JIT design is fine
but has to cope with hardware limitations.

> Ix regs are for:
> 1) Fast integer stuff
> 2) Iteration (increment variables)
> 3) Conditional checks
> 4) Branching and holding addresses
> 5) Indexing arrays and strings

While the operands clearly are of wordsize (opcode_t) I doubt that it
makes sense to say, that arrays or strings can hold max-INTVAL items.

> 6) Holding small constant values and flags

> All of this works out fine if we use the native size, and none of this
> should be penalized (for example if we decided to use "default" 64-bit
> INTVAL registers
> on a 32-bit CPU, this is penalization -- which I am NOT proposing).

The question is: Do we want to support a configuration with 32-bit
opcode_t and 64-bit INTVALs? Perl5 has 64-bit IV support, but it doesn't
have native data types. Perl6 may have both. The languages we are
running will define whether we need such a configuration.

>>How do you communicate the long-long-long intval constants to the PMC?

> You can't do that now anyway, but the two options I see are:
> (1) use a string or (2) use a multiple register op
> or..... (3) Fix Parrot not to assume any integer constant must fit into an
> Ix register or single opcode! :)

or (4) assign_l Px, Lc # assign long long constant from const_table to PMC.

> So, on to a solution:

I think we should first have a decision on the above question; the snipped
proposal is fine.

> -Melvin

leo

Melvin Smith

Nov 23, 2003, 1:59:46 PM
to l...@toetsch.at, perl6-i...@perl.org
At 01:07 PM 11/23/2003 +0100, Leopold Toetsch wrote:
>Melvin Smith <mrjol...@mindspring.com> wrote:
> > At 11:34 PM 11/22/2003 +0100, Leopold Toetsch wrote:
> > Ix regs are for:
> > 1) Fast integer stuff
> > 2) Iteration (increment variables)
> > 3) Conditional checks
> > 4) Branching and holding addresses
> > 5) Indexing arrays and strings
>
>While the operands clearly are of wordsize (opcode_t) I doubt that it
>makes sense to say, that arrays or strings can hold max-INTVAL items.

Maybe not, but the point is that you want the efficient wordsize for
your INTVAL in those cases (ie. _not_ 64-bit on 32-bit architecture, or
vice-versa).

> > All of this works out fine if we use the native size, and none of this
> > should be penalized (for example if we decided to use "default" 64-bit
> > INTVAL registers
> > on a 32-bit CPU, this is penalization -- which I am NOT proposing).
>
>The question is: Do we want to support a configuration with 32-bit
>opcode_t and 64-bit INTVALs? Perl5 has 64-bit IV support, but it doesn't
>has native data types. Perl6 may have both. The languages, we are
>running will define, if we need such a configuration.

The languages don't get to decide for Parrot, though.
Parrot should guarantee existence and size of INTVAL and HUGEINTVAL
registers. How a HLL maps to Parrot registers (or PMCs) is its own business.

To your 1st question, I don't think we should support any case where
sizeof(opcode_t) != sizeof(INTVAL). Both of these should be the most
efficient wordsize, so on each platform they should match.

The 2 cases you are describing would be:

a) 64-bit INTVAL on 32-bit platform (suboptimal)
   & 32-bit opcode_t (ok)

b) 64-bit INTVAL on 64-bit platform (ok)
   & 32-bit opcode_t on 64-bit platform (suboptimal)

Neither of these is a valid configuration IMO.

**We should always guarantee both opcode_t & INTVAL will be optimal for
speed. We never have to guarantee HUGEINTVAL will be.
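
The rule can be written down as a small check. This is a minimal C sketch with hypothetical names; it only encodes the two cases listed above and is not an existing Configure test.

```c
#include <assert.h>

typedef enum {
    SKETCH_CONFIG_OK,
    SKETCH_SUBOPTIMAL_INTVAL,   /* case a: INTVAL is not the native word */
    SKETCH_SUBOPTIMAL_OPCODE    /* case b: opcode_t is not the native word */
} sketch_config;

/* Classify a configuration given the platform word size and the widths
   chosen for opcode_t and INTVAL (all in bits). */
sketch_config sketch_check_config(unsigned word_bits,
                                  unsigned opcode_bits,
                                  unsigned intval_bits)
{
    assert(word_bits == 32 || word_bits == 64);

    if (intval_bits != word_bits)
        return SKETCH_SUBOPTIMAL_INTVAL;
    if (opcode_bits != word_bits)
        return SKETCH_SUBOPTIMAL_OPCODE;
    return SKETCH_CONFIG_OK;
}
```

Only configurations where both widths equal the native word size come back as SKETCH_CONFIG_OK; HUGEINTVAL is deliberately not part of the check, since it carries no speed guarantee.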

-Melvin

