Well, I'm beginning to feel it's habitual for me to periodically pop my head in and waffle on Parrot's core sizes. But waffle I shall.
- opcode_t
This has already been discussed, so I'll sum up. To remain compatible (and efficient) across the spectrum of 32-bit and 64-bit platforms, the value of opcode_t is limited to 32-bits. (Or, more accurately, 31 bits.) Although you could do larger on a 64-bit platform, the use of opcode_t as an array index and memory offset limits it to the size of the addressable memory anyway. (So the value would be downcast by the end, if not before. I can't find a reference to what integer type an array index is.)
Not to mention all the *other* problems we'll have if we've got more than 2^31 different opcodes. (Although that's why there's UUIDs now, isn't there?)
Although Parrot needs to be able to convert 32-bit and 64-bit wide opcodes, there's no reason to process at anything other than native (size_t-ish) size, since a good 90%+ of the uses will be cast that size anyway.
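As a rough illustration of "32-bit on disk, native width in memory", here is a minimal sketch. It assumes a little-endian bytecode stream and a native-width opcode_t typedef; the function name fetch_opcode_le32 is made up for this example and is not Parrot's actual loader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: opcode_t is processed at native (pointer-ish) width. */
typedef intptr_t opcode_t;

/* Decode the index-th 32-bit little-endian opcode from a bytecode
 * buffer and widen it to the native opcode_t. */
static opcode_t fetch_opcode_le32(const unsigned char *buf, size_t index)
{
    const unsigned char *p = buf + index * 4;
    uint32_t raw = (uint32_t)p[0]
                 | ((uint32_t)p[1] << 8)
                 | ((uint32_t)p[2] << 16)
                 | ((uint32_t)p[3] << 24);

    /* The value is limited to 31 bits, so it fits on any platform. */
    assert(raw <= (uint32_t)INT32_MAX);
    return (opcode_t)raw;
}
```

The conversion happens once, at load time; everything after that works on the native type.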
- INTVAL
Early on, I was a big fan of making INTVALs as big as you could. Bitten by integer rollover, watching the struggles of complete 64-bit Int support in Perl 5, huge INTVALs were important to me.
As Parrot has evolved, I've come to realize that what I *really* want is to be able to program with huge INTVALs. Which isn't the same thing.
[Diagram: data flow between the Program, the Opcodes, the Interpreter (Regs, Guts, PMCs), and the System]
So when I write a program, there are going to be two types of numbers, user and system. (For lack of imagination.)
User numbers, of course, are the numbers that exist for their own purpose, and for the user's benefit.
$a = 5; $b = $a * 2 + 6;
System numbers are those marked "internal use only". File numbers, array indices, counters, the language infrastructure. These bubble down to the guts of the interpreter, and eventually to the system. If INTVAL is greater than the natural system width, conversion is in order.
(For the sake of using real numbers, I'll assume 32/64.)
Currently, the flow is, in variable sizes:
Opcodes: 32 (constants are limited by the spec)
PMCs   : 64
Regs   : 64
Guts   : 64/32 mix
System : 32
What's troublesome is the rash of conversions between the system and some guts, those guts and other guts, or those guts and registers. (Besides the extra cost of schlepping around the extra data, size differentials between INTVALs and pointers (which is problematic to begin with), unchecked truncation, and the added burden on the JIT, it's not really a problem.)
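To make the "unchecked truncation" point concrete, here is a small sketch of what a checked conversion at the guts/system boundary might look like. It assumes a 64-bit INTVAL and a 32-bit system integer; intval_to_int32 is a hypothetical helper, not an existing Parrot function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t INTVAL;   /* assumption: a 64-bit INTVAL configuration */

/* Narrow a 64-bit INTVAL to a 32-bit system integer.
 * Returns false, leaving *out untouched, when the value does not fit. */
static bool intval_to_int32(INTVAL v, int32_t *out)
{
    assert(out != NULL);
    if (v < INT32_MIN || v > INT32_MAX)
        return false;
    *out = (int32_t)v;
    return true;
}
```

A plain cast would silently keep the low 32 bits; every boundary crossing needs either this kind of check or a guarantee that the value was never wide in the first place.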
And for what? To be able to add large numbers?
Numbers, as a type in a language that rides upon Parrot, never really reach beyond the boundaries of the PMCs themselves. The majority of numerics passed down through the registers are destined for conversion anyway.
The flow *really* is, in value sizes:
Opcodes: 32 (constants are limited by the spec)
PMCs   : 64
Regs   : 32
Guts   : 32
System : 32
Certainly, much like the physical machine the virtual machine runs on, it needs to support, or at least not preclude, wider numeric types for access by languages. But given the mapping of the bulk of the virtual on the machine onto the physical, that should probably be relegated to just support.
For example, take Perl 5's struggle for maximal bitness. Given that Perl 6 will continue in that direction - and further, if you consider auto-promotion to arbitrarily sized numbers - and the language will provide all of the functionality within its PMCs, why does it need the interpreter to do any more than not get in its way? (Consider, for a moment, that bytecode strives to be portable across all Parrot virtual machines, which implies that nothing in the bytecode, nor in the supporting languages, should be dependent on Parrot being configured with extended integers in the first place.)
On the off chance that a language with extended numerics wants to use registers, what would the feasibility be (from the JIT, compiler, etc) to borrow a page from the physical hardware and simply join two smaller registers together? (The advantage of contiguous memory regions.)
- FLOATVAL
The same principle, with a twist. Like most operating systems, the interpreter doesn't really have a need - in and of itself - for floating point. Floating points pretty much exist entirely for end calculations. So there's much less internal data flow of floats and needless conversions. But there's also much less need for the interpreter itself to have to have configurable sized floats. But then there's little reason not to have configurable sized floats. The JIT, I guess.
- Problems
Well, Parrot's had problems from the beginning with non-"long, double, long" configurations. By keeping INTVAL and FLOATVAL as the maximum size supported (basically either "long" or "long long", or "double" or "long double"), languages can feel free to take advantage of what facilities are available to them, if they so choose.
But what of inter-language operability? Will the registers become the crossroads for data conversions between PMCs from different languages? It doesn't look that way, from the direction that PMCs have gone.
Can we simplify interpreter types this much, while still providing extended numerics to hosted languages?
-- Bryan C. Warnock bwarnock@(gtemail.net|raba.com)
If memory serves me right, Bryan C. Warnock wrote:
> Not to mention all the *other* problems we'll have if we've got more > than 2^31 different opcodes. (Although that's why there's UUIDs now, > isn't there?)
I think parrot has already crossed the limit of 1024 ... (I can't even keep 256 opcodes in my head , let alone 1024 :-)
> And for what? To be able to add large numbers?
No .. to add large numbers very quickly ... ie split registers and enemies ;-)
No sense in keeping an Int64 in 2 32 bit regs if you have a 64 bit CPU & shift-mask-add .. But how can you be sure where the .pbc will run.
> to borrow a page from the physical hardware and simply join two smaller > registers together? (The advantage of contiguous memory regions.)
Well that's only if those two regs are in memory ... the Parrot JIT does use a register allocation scheme , IIRC .
> Can we simplify interpreter types this much, while still providing > extended numerics to hosted languages?
I *had* to hack out a couple of types of parrot to have fixed size types irrespective of implementation size ... (@see dotgnu.ops)
But of course it's a sad situation that Parrot is missing Objects still . Until then those opcodes are there to occupy numbers.
Gopal -- The difference between insanity and genius is measured by success
On Sat, 2003-05-31 at 11:15, Gopal V wrote:
> If memory serves me right, Bryan C. Warnock wrote:
> > Not to mention all the *other* problems we'll have if we've got more
> > than 2^31 different opcodes. (Although that's why there's UUIDs now,
> > isn't there?)
> I think parrot has already crossed the limit of 1024 ... > (I can't even keep 256 opcodes in my head , let alone 1024 :-)
> > And for what? To be able to add large numbers?
> No .. to add large numbers very quickly ... ie split registers and > enemies ;-)
Understood. My point was that - to parallel virtual machines with physical ones - the big drive for 64-bit has never been about squeezing out another point-n percent when doing ultra-high precision math, but more about the ability to represent a range of numbers, such as those needed to address memory or storage. (Which Parrot is completely dependent on the hardware to do.)
> No sense in keeping an Int64 in 2 32 bit regs if you have a 64 bit > CPU & shift-mask-add .. But how can you be sure where the .pbc will > run.
I'm not saying keep Parrot 32-bit. I'm saying there's no reason to run the Parrot core at a width wider than the hardware. (So core Parrot on a 64 bit machine will do 64 bit math. That doesn't prevent languages running atop from using wider types, as long as Parrot is aware.)
> > to borrow a page from the physical hardware and simply join two smaller > > registers together? (The advantage of contiguous memory regions.)
> Well that's only if those two regs are in memory ... the Parrot JIT > does use a register allocation scheme , IIRC .
That's why the big punt on whether it'd be doable in the JIT. I'll let those more clever than me address how to do that.
Of course, I completely forgot that splitting Parrot registers (which is basically casting a register as being wider, and obliterating the register behind it) might introduce alignment problems, so you might only be able to do that for registers mod 2.
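A minimal sketch of the "registers mod 2" idea, assuming a contiguous file of 32-bit integer registers. The IntRegs struct, NUM_INT_REGS, and the read_pair/write_pair helpers are invented for illustration. memcpy is used instead of a pointer cast so the C stays well-defined; which register ends up holding the high half depends on the host byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_INT_REGS 32           /* hypothetical register-file size */

typedef struct {
    int32_t reg[NUM_INT_REGS];    /* contiguous 32-bit integer registers */
} IntRegs;

/* Read registers n and n+1 as one 64-bit value. Only even n is
 * accepted, mirroring the alignment restriction described above. */
static int64_t read_pair(const IntRegs *r, int n)
{
    int64_t v;
    assert(n >= 0 && n % 2 == 0 && n + 1 < NUM_INT_REGS);
    memcpy(&v, &r->reg[n], sizeof v);
    return v;
}

/* Write a 64-bit value over registers n and n+1, obliterating both. */
static void write_pair(IntRegs *r, int n, int64_t v)
{
    assert(n >= 0 && n % 2 == 0 && n + 1 < NUM_INT_REGS);
    memcpy(&r->reg[n], &v, sizeof v);
}
```

Whether a JIT that maps Parrot registers onto hardware registers could do the same is exactly the open question punted on above.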
> > Can we simplify interpreter types this much, while still providing > > extended numerics to hosted languages?
> I *had* to hack out a couple of types of parrot to have fixed size > types irrespective of implementation size ... (@see dotgnu.ops)
Which, I think, is okay.
Perl 6, Lisp, DotGNU - they should be free within their own framework to define their own types.
> But of course it's a sad situation that Parrot is missing > Objects still . Until then those opcodes are there to occupy numbers.
> Gopal
-- Bryan C. Warnock bwarnock@(gtemail.net|raba.com)
On Sat, 2003-05-31 at 11:43, Leopold Toetsch wrote:
> Bryan C. Warnock <bwarn...@raba.com> wrote:
> > The flow *really* is, in value sizes:
> > Opcodes: 32 (constants are limited by the spec)
> In which spec? How would we handle 64 bit INTVAL constants on 32 bit > systems?
Parrotbyte.pod. Googling for 'parrot constant "32 bit"' also returns some discussions. (Although I don't remember - and can't find - any reference to what Dan had suggested for handling what, essentially, are PMC constants.)
> Yep, guts should really be plain C<int> or C<size_t>. There are far too > many U?INTVALs in data structures or whatever.
> I'm not sure, if we need 64 bit INTVAL in regs. But the implementation > in JIT wouldn't be too hard.
I don't think we need them. An awful lot of the numbers making it to the registers are passing through to the guts. And implemented languages have to take into consideration that a 64-bit type isn't available in the first place, so we shouldn't be breaking anything. (Actually, this will make sure we don't break anything.)
> > Can we simplify interpreter types this much, while still providing > > extended numerics to hosted languages?
> For sure.
Okay, let me rephrase.
Can *those of us who aren't Leo* simplify interpreter types this much, while still providing extended numerics to hosted languages? :-)
-- Bryan C. Warnock bwarnock@(gtemail.net|raba.com)
If memory serves me right, Bryan C. Warnock wrote:
> > No .. to add large numbers very quickly ... ie split registers and > > enemies ;-)
> Understood. My point was that - to parallel virtual machines with > physical ones - the big drive for 64-bit has never been about squeezing > out another point-n percent when doing ultra-high precision math, but > more about the ability to represent a range of numbers, such as those > needed to address memory or storage. (Which Parrot is completely > dependent on the hardware to do.)
Let me get this straight .... if we end up using a 64-bit INTVAL on a 32-bit machine , it will suffer a speed loss even when you write a Rot13 converter ?
/me mutters about Dan, Python , Zope labs and banana creme pie
> I'm not saying keep Parrot 32-bit. I'm saying there's no reason to run > the Parrot core at a width wider than the hardware. (So core Parrot on > a 64 bit machine will do 64 bit math. That doesn't prevent languages > running atop from using wider types, as long as Parrot is aware.)
In parallel speaking IL has an "INTVAL" according to hardware and fixed size integers usable in the VM . (read more comments on that below)
> Of course, I completely forgot that splitting Parrot registers (which is > basically casting a register as being wider, and obliterating the
DotGNU Parrot IRC meeting -- 2002-10-19
*****************************Parrot, IL and JVM**************************
[10:26] <Dan> acme: Do you remember if the JVM requires 64 bit ints?
[10:27] <q[acme]> Dan: yup, for longs. see http://java.sun.com/docs/books/vmspec/2nd-edition/html/Overview.doc.html#22239
[10:28] <Dan> Okay, then. That clinches it. Time to do weird things for parrot's ints.
[10:29] <q[acme]> shame that jvm and msil are more hardware-level really ;-)
[10:30] <Dan> I just begrudge the cache fluff that emulated 64 bit ints will bring
[10:31] <Dan> But split registers are a major pain, and I don't want to emulate 64 bit math anywhere if I can avoid it
:-)
> Perl 6, Lisp, DotGNU - they should be free within their own framework > to define their own types.
Though those types are available for any language that's strict about sizes ... like say Java :-) . The mul.ovf and similar operations do throw exceptions on overflow from native...
The reason these were made into new ops instead of PMCs was to allow JIT'ing these in the future if needed ... They are simple enough to be easily JIT'd, but better wait until they are used :-)
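For readers who haven't seen overflow-checked ops, here is a hedged sketch of what a mul.ovf-style operation on a fixed 32-bit type boils down to. This is not the code in dotgnu.ops; the name mul_ovf_i32 is made up, and returning false stands in for throwing an exception.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply two 32-bit integers, reporting overflow instead of wrapping.
 * Returns true and stores the product on success; returns false and
 * leaves *result untouched when the product does not fit in 32 bits. */
static bool mul_ovf_i32(int32_t a, int32_t b, int32_t *result)
{
    assert(result != NULL);
    int64_t wide = (int64_t)a * (int64_t)b;   /* cannot overflow 64 bits */
    if (wide < INT32_MIN || wide > INT32_MAX)
        return false;
    *result = (int32_t)wide;
    return true;
}
```

On most hardware a JIT can do this with a multiply and a flag test, which is why it is a reasonable candidate for an op rather than a PMC method.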
Gopal -- The difference between insanity and genius is measured by success
On Sun, 2003-06-01 at 10:08, Gopal V wrote:
> If memory serves me right, Bryan C. Warnock wrote:
> > > No .. to add large numbers very quickly ... ie split registers and
> > > enemies ;-)
> > Understood. My point was that - to parallel virtual machines with > > physical ones - the big drive for 64-bit has never been about squeezing > > out another point-n percent when doing ultra-high precision math, but > > more about the ability to represent a range of numbers, such as those > > needed to address memory or storage. (Which Parrot is completely > > dependent on the hardware to do.)
> Let me get this straight .... if we endup using a 64-bit INTVAL in > a 32-bit machine , it will suffer a speed loss even when you write > a Rot13 converter ?
Of course. 64-bit math on a 32-bit machine will be a tad slower than 32-bit math. 64-bit math on a 64-bit machine can be a tad slower than 32-bit math. (But that's another story.) Emulating (in software) 64-bit math on a 32-bit machine will be much slower. But we're not going to do that except where we have to. (Which shouldn't be too many places, any more.)
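To show where the extra cost of software emulation comes from, here is a minimal sketch of a 64-bit add built from 32-bit halves. The U64Pair type and add64_emulated name are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit unsigned value held as two 32-bit halves. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} U64Pair;

/* Add two 64-bit values using only 32-bit arithmetic. */
static U64Pair add64_emulated(U64Pair a, U64Pair b)
{
    U64Pair r;
    uint32_t carry;

    r.lo  = a.lo + b.lo;          /* wraps modulo 2^32 */
    carry = (r.lo < a.lo);        /* wrapped if the sum is below an operand */
    r.hi  = a.hi + b.hi + carry;
    return r;
}
```

One logical add becomes two adds, a compare, and a carry, and multiplication and division are considerably worse.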
If the hardware can do 64-bit math, and the compiler can produce the code to do 64-bit math on the hardware, then languages running atop Parrot should be able to do 64-bit math on the hardware.
But that doesn't mean Parrot, which spends most of its time mediating amongst a language's PMCs and hardware services (like IO and signals), needs to be built upon it.
> /me mutters about Dan, Python , Zope labs and banana creme pie
> > I'm not saying keep Parrot 32-bit. I'm saying there's no reason to run > > the Parrot core at a width wider than the hardware. (So core Parrot on > > a 64 bit machine will do 64 bit math. That doesn't prevent languages > > running atop from using wider types, as long as Parrot is aware.)
> In parallel speaking IL has an "INTVAL" according to hardware and fixed > size integers usable in the VM . (read more comments on that below)
> > Of course, I completely forgot that splitting Parrot registers (which is > > basically casting a register as being wider, and obliterating the
> DotGNU Parrot IRC meeting -- 2002-10-19
> *****************************Parrot, IL and JVM**************************
> [10:26] <Dan> acme: Do you remember if the JVM requires 64 bit ints?
> [10:27] <q[acme]> Dan: yup, for longs. see
> http://java.sun.com/docs/books/vmspec/2nd-edition/html/Overview.doc.html#22239
> [10:28] <Dan> Okay, then. That clinches it. Time to do weird things for
> parrot's ints.
> [10:29] <q[acme]> shame that jvm and msil are more hardware-level really ;-)
> [10:30] <Dan> I just begrudge the cache fluff that emulated 64 bit ints
> will bring
> [10:31] <Dan> But split registers are a major pain, and I don't want to
> emulate 64 bit math anywhere if I can avoid it
> :-)
> > Perl 6, Lisp, DotGNU - they should be free within their own framework > > to define their own types.
> Though those types are available for any language that's strict about > sizes ... like say Java :-) . The mul.ovf and similar operations do > throw exceptions on overflow from native...
But even the JVM doesn't cripple itself by mandating that, in order to support 64-bit longs, all integers and operations on integers must be 64-bit wide. Which is *exactly* what Parrot is doing.
> The reason these were made into new ops instead of PMCs was to allow > JIT'ing these in the future if needed ... They are simple enough to be > easily JIT'd, but better wait until they are used :-)
Good point. But for most supported languages other than those statically typed - ie, Parrot's tier 1 audience - those operations are going to be vectored through PMCs anyway. Perhaps even for statically typed languages. After all, Parrot really only has one size register. Have we provided full semantics - like overflow - for all supported integer sizes?
It's these types of problems that have caused me - and I've caught Dan, on occasion - to constantly waffle on what we should be doing.
I think that with most languages handling their numerics via PMCs, there are few places within Parrot that need to be built around "long long"s on a 32-bit platform. (Technically, we can't even *guarantee* "long long"s on a 32-bit platform.)
As Leo and I both documented, the integer registers are iffy. We don't know. (The problem stems from trying to dual-purpose the registers for interpreter-space and user-space calculations. Early on, Dan wasn't expecting too much to use them.)
-- Bryan C. Warnock bwarnock@(gtemail.net|raba.com)
If memory serves me right, Bryan C. Warnock wrote:
Reply inline ... and I've said more than I've quoted ... It could be called as a critical appreciation ... though not much has been appreciated below ... and what I know about parrot can be written on a shirt sleeve ;-) Please do tell me if I've gone off the track below.
> If the hardware can do 64-bit math, and the compiler can produce the > code to do 64-bit math on the hardware, then languages running atop > Parrot should be able to do 64-bit math on the hardware.
What I wanted to say was to have fixed size variables and an interpreter specific internal notation would be ideal. And only if you wanted to operate stuff with direct int registers.. The fixed size variables allow the JIT to decide if we use half-a-register , one or two for each operation according to hardware.
Though I must say this is totally against the INTVAL philosophy ...
But I think the INTVAL philosophy is a creep of a Perl idea into Parrot ie having variable size integers to execute ... Why this interpreter concept (ie perl5) ended up in the engine instead of the code production phase shows how blurry the distinction is for Parrot & Perl6 .
So compiling perl6 code with --with-64bit-integers (or something like that) will use the Int64 instead of Int32 for all "int" variables without affecting the array indexing , addressing features. Thus the engine can adopt the policy of "The compiler asked for Int64" and let it go at that. So we standardize data/integer sizes after the packfile phase of Perl6 code.
So after compiling something that needs 64 bit ints , you could pick it up and run it in a Parrot configured for 32 bit operations and it will use Int64 (2 registers, register pairs or software emulation -- JIT picks) since that is mentioned explicitly in the code . And we'll all be happy. Similarly, 32 bit code compiled and run on a 64 bit platform will have the correct semantics & overflow modes, as it is explicitly marked as Int32 in the packfile.
Having said all that ... Isn't it a bit too late to bring this up ?
> > The reason these were made into new ops instead of PMCs was to allow > > JIT'ing these in the future if needed ... They are simple enough to be > > easily JIT'd, but better wait until they are used :-)
> Good point. But for most supported languages other than those > statically typed - ie, Parrot's tier 1 audience - those operations are > going to be vectored through PMCs anyway. Perhaps even for statically > typed languages. After all, Parrot really only has one size register. > Have we provided full semantics - like overflow - for all supported > integer sizes?
Another point ... how are PMC methods called ? Do they cause a register flush ? Ie do you push registers before moving into a PMC ? Because in x86, an Int32 addition is 1 instruction and a (reg,reg) => (reg) operation as well. Being able to do it inline rather than pass through a stack push, call and pop is also another speed factor.
Such a register flush back to memory would have a side-effect on the performance .. Especially when *all* numerics are via PMCs.
> I think that with most languages handling their numerics via PMCs, there > are few places within Parrot that need to built around "long long"s on a > 32-bit platform. (Technically, we can't even *guarantee* "long long"s > on a 32-bit platform.)
You're making me wonder if Parrot_Int4 should be the user/lang space type and INTVAL the interpreter space type , instead of otherwise.
> As Leo and I both documented, the integer registers are iffy. We don't > know.
Neither do I ... I'm just saying what I can see ..And of course I'm waiting for Dan and Leo to pounce on this thread (and hopefully they'll be quick & merciful ;-)
Gopal -- The difference between insanity and genius is measured by success
Gopal V <gopal...@symonds.net> wrote:
> What I wanted to say was to have fixed size variables and an interpreter
> specific internal notation would be ideal. And only if you wanted to
> operate stuff with direct int registers.. The fixed size variables allow
> the JIT to decide if we use half-a-register , one or two for each operation
> according to hardware.
I think we should see Parrot as what it is - a CPU. A CPU has its natural integer size - called INTVAL in the Parrot CPU. INTVAL size differs because Parrot's CPU depends on the underlying hardware CPU. So we need an additional concept of fixed-size integers (as in, e.g., a C compiler with int and int64_t ...)
The fixed-size int operations could either be done with separate opcodes or with current opcodes (where appropriate) + dotgnu-like ops for size adjustment. Bigger-than-INTVAL types would need special storage + special treatment, so they don't fit very well into this scheme.
But I really have strange feelings against putting an 64bit int into two adjacent 32-bit INTVAL registers. We might end up with another register set e.g. L (LONGINTVAL) plus basic math opcodes.
> But I think the INTVAL philosophy is a creep of a Perl idea into > Parrot ie having variable size integers to execute ... Why this > interpreter concept (ie perl5) ended up in the engine instead of > the code production phase shows how blurry the distinction is for > Parrot & Perl6 .
It seems that this scheme is good for Perl - or Perl users are just used to it - but impractical for typed languages.
> So compiling perl6 code with --with-64bit-integers (or something like that) > will use the Int64 instead of Int32
Perl6 will be (optionally for the user) a typed language too. Why not adopt existing schemes for Perl6: A C<int> is whatever the CPU i.e. parrot i.e. the hardware provides. And if the user wants a C<int64> she will get one. The same concepts hold in C (gcc) with "int" vs "long long".
>> ... - those operations are >> going to be vectored through PMCs anyway.
Not for ultimate performance. If parrot will compete with Java and C#, we probably will need fixed sized natural types.
> Another point ... how are PMC methods called ?. Do they cause a register > flush ?.
Partially yes. But JIT/i386 and JIT/sun4 already call vtable functions directly, and at least JIT/i386 also pushes mapped Parrot registers directly onto the processor stack. But a register flush will be necessary anyway due to exceptions - at least very likely.
> Neither do I ... I'm just saying what I can see ..And of course I'm > waiting for Dan and Leo to pounce on this thread (and hopefully they'll > be quick & merciful ;-)
I think we should combine Bryan's and Gopal's concepts:
- parrot guts use natural integers (the C compiler's "int")
- INTVAL is the natural parrot integer (depending on what hardware parrot was configured for)
- additionally, parrot provides fixed-size integer types and math operations for these.
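The three layers above could be sketched roughly as follows. This is only an illustration: the choice of long for INTVAL is an assumption about one possible configuration, Parrot_Int8 and add_int4_wrap are hypothetical names, and the wraparound relies on the usual two's-complement conversion (implementation-defined in C, but what mainstream compilers do).

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Layer 1: guts use plain "int" (no typedef needed). */

/* Layer 2: INTVAL is the natural integer of the configured Parrot.
 * Assumption: this build picked "long". */
typedef long INTVAL;

/* Layer 3: fixed-size types whose width never depends on the host. */
typedef int32_t Parrot_Int4;
typedef int64_t Parrot_Int8;   /* hypothetical name */

/* Int32 addition with 32-bit wraparound, whatever the INTVAL width.
 * The sum is computed in unsigned arithmetic to avoid undefined
 * behaviour on signed overflow. */
static Parrot_Int4 add_int4_wrap(Parrot_Int4 a, Parrot_Int4 b)
{
    return (Parrot_Int4)((uint32_t)a + (uint32_t)b);
}
```

With this split, bytecode that says "Int32" gets the same overflow behaviour on a 32-bit and a 64-bit Parrot, while untyped code keeps using INTVAL.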
> > Opcodes: 32 (constants are limited by the spec)
> > PMCs   : 64
> > Regs   : 64
> > Guts   : 64/32 mix
> > System : 32
> [snip]
> > The flow *really* is, in value sizes:
> > Opcodes: 32 (constants are limited by the spec)
> > PMCs   : 64
> > Regs   : 32
> > Guts   : 32
> > System : 32
> [snip]
> You seem to forget that there *are* systems out there that have a native 64-bit integer size (and a heavy penalty
> for handling 32-bit ints). Even the pointer size *has* to be 64 bits - the system I write this on has 6 GB of
> memory installed, so a 32 bit pointer is just useless.
I have forgotten nothing.[1] I simply got tired of talking in the abstract. The above also holds true for 64/128. (Well, except that opcode values are still limited to 32-bits, but padded in a 64-bit construct.)
I believe this was reiterated in the thread that followed with Gopal V, but I'll try to clarify here. (I don't always convey my messages clearly.)
This is *not* a proposal that Parrot should be 32 bit.
This *is* a suggestion that the Parrot core should *not* be built around types wider than the underlying physical hardware is.
> So even the 'System' size must be 64-bits, as
> size_t is an unsigned long (and therefore 8 bytes).
And that's right in line with what I was saying. Don't get wrapped up in the numbers. They were just an example.
[1] Actually, I'm sure I've forgotten a lot, which is why I posted this for comments and criticism. But I haven't forgotten about 64 bit platforms, which are my platform of predominant use. That would be rather short-sighted, even for me.
-- Bryan C. Warnock bwarnock@(gtemail.net|raba.com)
Okay, after reading the thread, I think some explanations and notes are in order.
*) Opcodes are limited to 32 bits so we have full bytecode portability. While defining an opcode number past 2 billion is utterly insane, we don't require that there be no holes in the opcode numbers, so theoretically a platform could load in an opcode library and start its numbers at, say, 2^37. I didn't want that to happen.
*) Integer constants are limited to 32 bit signed integers because they're inline. I couldn't think of enough likely reasons to have an integer constant outside the range of +-2^31. For those, I'm figuring people can use PMC or float constants.
*) INTVAL is meant to be the fastest native integer type for integer math that's at least 32 bits. That integer registers are INTVALs is an unfortunate side-effect, and one I'm tempted to do something about.
*) The IRC conversation that Gopal quoted is actually a fairly important one. The engine needs to deal with 64 bit integer math natively, as well as 32 bit integer math. (Smaller math sizes--16 and 8 bits--are easy enough to deal with regardless of what we do)
The last is the important bit. I want, and I think we need, to do 64 bit math. I'm not 100% sure that we actually need to do it as a plain integer rather than as a PMC, but I'm not 100% sure we don't either.
Our options, as I see them, are:
1) Make the I registers 64 bits
2) Make some way to gang together I registers to make 64 bit things
3) Have I registers switchable between 32 and 64 bit somehow
4) Have separate 32 and 64 bit I registers
5) Do guaranteed 64 bit math in PMCs
The first is just out. It's an unreasonable slowdown on 32 bit (and some 64 bit) machines, for no overall win. The majority of integers will be smallish, and most of even the 32 bit range will be wasted.
I don't like option 2, since it means that we speed-penalize 64 bit systems, which seems foolish.
Option 3 wastes half the L1 cache space that I registers take up. Fluffy caches--ick. Plus validating the bytecode will be... interesting, even at runtime.
4 isn't that bad. Not great, as it's more registers, and something of a waste on 64 bit systems, but...
#5 is something of a cop-out, but I'm not quite sure how much.
From what I can think, we need guaranteed 64 bit integers for file offsets, JVM & .NET support, and some fairly special-purpose math stuff. I'd tend to discount the special-purpose math stuff--that's not our target. JVM and .NET don't do much 64 bit stuff, but they do some. The file offset parts are in some ways the least of it, though we do need to have some internal support for 64 bits to get integer values out of PMCs without loss.
Anyway, I'm still somewhat conflicted. Opinions? -- Dan
--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
d...@sidhe.org                         have teddy bears and even
                                      teddy bears get drunk
Dan Sugalski <d...@sidhe.org> wrote:
> *) Opcodes are limited to 32 bits so we have full bytecode
> portability. While defining an opcode number past 2 billion is
> utterly insane, we don't require that there be no holes in the opcode
> numbers,
or several other places. I see no chance (nor any value) in holes in opcode numbers.
> *) Integer constants are limited to 32 bit signed integers because > they're inline.
Yep. But this will cause problems with JIT/Prederef and multi-threading, and it's already causing problems inside JIT on architectures with only small immediate constants. We have to consider these upcoming problems too. I don't see any reason not to have integers in the const_table, optionally/additionally or always.
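One way to picture "integers in the const_table, optionally" is the sketch below: a constant that fits in 32 bits stays inline, anything wider goes into a table and the operand carries its index. All names here (ConstTable, ConstOperand, encode_int_const, decode_int_const) are hypothetical, and this is not Parrot's actual packfile layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CONST_TABLE_MAX 16        /* hypothetical fixed capacity */

typedef struct {
    int64_t entries[CONST_TABLE_MAX];
    size_t  count;
} ConstTable;

typedef struct {
    int     in_table;   /* 0: value is inline; 1: value is a table index */
    int32_t value;
} ConstOperand;

/* Keep small constants inline; spill wide ones to the table. */
static ConstOperand encode_int_const(ConstTable *t, int64_t v)
{
    ConstOperand op;
    if (v >= INT32_MIN && v <= INT32_MAX) {
        op.in_table = 0;
        op.value    = (int32_t)v;
    } else {
        assert(t->count < CONST_TABLE_MAX);
        t->entries[t->count] = v;
        op.in_table = 1;
        op.value    = (int32_t)t->count++;
    }
    return op;
}

/* Recover the full-width constant from either encoding. */
static int64_t decode_int_const(const ConstTable *t, ConstOperand op)
{
    return op.in_table ? t->entries[op.value] : (int64_t)op.value;
}
```

The "always" variant would simply drop the inline branch, at the cost of an extra indirection for every constant.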
> *) INTVAL is meant to be the fastest native integer type for integer > math that's at least 32 bits. That integer registers are INTVALs is > an unfortunate side-effect, and one I'm tempted to do something about.
In one of the FUPs, I had a different definition: An INTVAL is the size of the integer register. The fastest integer type on $arch ought to be just a plain C<int>. I think, when we look at the problem from this side, it should be simpler.
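The two definitions can be compared directly in C99, which has a type for "fastest integer of at least 32 bits". The sketch below uses int_fast32_t as a stand-in for Dan's definition; FastInt and report_widths are names made up for this example, and whether the two types differ depends entirely on the platform and C library.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for "fastest native type of at least 32 bits". */
typedef int_fast32_t FastInt;

/* Fails to compile if FastInt were ever narrower than 32 bits. */
typedef char fastint_is_at_least_32_bits[
    (sizeof(FastInt) * CHAR_BIT >= 32) ? 1 : -1];

/* Print the widths this platform gives the candidate types. */
static void report_widths(void)
{
    printf("int:          %u bits\n", (unsigned)(sizeof(int) * CHAR_BIT));
    printf("int_fast32_t: %u bits\n", (unsigned)(sizeof(FastInt) * CHAR_BIT));
    printf("long long:    %u bits\n", (unsigned)(sizeof(long long) * CHAR_BIT));
}
```

On some 64-bit systems the C library makes int_fast32_t 64 bits wide while "int" stays at 32, which is the disagreement between the two definitions in miniature.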
> 4) Have separate 32 and 64 bit I registers
> 5) Do guaranteed 64 bit math in PMCs
> 4 isn't that bad. Not great, as it's more registers, and something of
> a waste on 64 bit systems, but...
Not a waste. amd-64 aka x86-64 has 32bit ints and 64bit longs and pointers (gcc). This architecture could fully exploit both integer types.
If memory consumption (in register saving) is a big issue, reducing the number of LONG (and N, S) registers could be an option, though I really don't like the irregularities this would cause. But it should be worth some thought: the most used types will probably be PMCs and IREGs, followed by nothing for a while, then N or S or L depending on the program.
> #5 is something of a cop-out, but I'm not quite sure how much.
It will slow down the usual untyped interpreter (PMC) scalars a lot.
I'm - as stated in the thread - for a new register type (L, long).
They have the same relation as 32bit ints and 64 bit longs, with the difference that we guarantee at least these sizes.
> 2) Make some way to gang together I registers to make 64 bit things
[snip]
> I don't like option 2, since it means that we speed-penalize 64 bit
> systems, which seems foolish.
I think that it could be done in a way that doesn't significantly penalize 64 bit machines.
First, define a pair of ops:
op canon_to_native_L(out int, out int, in int, in int)
On a 32 bit machine, this would be implemented as:
I expect that for most of the uses of this, we'd be modifying registers "in-place" from pairs of 32-bit values to single 64-bit values.
On those platforms where no change needs to be done to the data (i.e., the "int" type, and thus our "I" type, is only 32 bits), then when the interpreter loads the bytecode, it could treat it as a no-op, and discard it. Thus, there's no cost to it on those platforms.
On a 64 bit platform, well, we're wasting the memory that could be in $2, but at least there's no *speed* penalty here.
op native_to_canon_L(out int, out int, in int, in int)
This, of course, would do the opposite of the other op, changing the "native" 64-bit integer into two 32-bit values.
The mathematical opcodes would, for long arithmetic, each use a pair of registers for each input. They'd only work on "nativeized" long values.
And of course, on a machine with 64-bit ints, the second register of each pair (c|w)ould be ignored.
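To make the pair-of-registers idea concrete, here is a minimal C sketch of the two conversions. The names (canon_pair, canon_to_native, native_to_canon) are hypothetical and not part of Parrot, and which register of the pair holds the high word is an assumption made only for illustration.

```c
#include <stdint.h>

/* Hypothetical helpers, not Parrot API. A 64-bit value in "canonical"
 * form is a pair of 32-bit words. Which word is "high" is an assumption. */
typedef struct {
    uint32_t hi;  /* most significant 32 bits */
    uint32_t lo;  /* least significant 32 bits */
} canon_pair;

/* Canonical pair -> one native 64-bit integer. */
static int64_t canon_to_native(canon_pair p)
{
    uint64_t bits = ((uint64_t)p.hi << 32) | (uint64_t)p.lo;
    /* Reinterpret the bit pattern as signed (wraps on common compilers). */
    return (int64_t)bits;
}

/* Native 64-bit integer -> canonical pair. */
static canon_pair native_to_canon(int64_t v)
{
    uint64_t bits = (uint64_t)v;
    canon_pair p;
    p.hi = (uint32_t)(bits >> 32);
    p.lo = (uint32_t)(bits & 0xFFFFFFFFu);
    return p;
}
```

A round trip through native_to_canon and canon_to_native should give back the original value, which is the property the two ops would need to share.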
> 3) Have I registers switchable between 32 and 64 bit somehow
> Option 3 wastes half the L1 cache space that I registers takes up. Fluffy caches--ick. Plus validating the bytecode will be... interesting, even at runtime.
Consider the following optimization for my suggestion above:
On those machines where "int" is 32 bits, but a 64 bit "long long" exists, our interpreter, upon loading the bytecode, could detect when pairs of register arguments to 64-bit math ops are adjacent and aligned, and could replace that op with an alternate equivalent one, which is optimized by casting the (interpreter->ctx.int_reg.registers) into a (long long*) and accessing the two registers as one register.
Note that these alternate versions of long-int opcodes would never appear in portable parrot assembly files -- the interpreter would replace register-pair math ops with "long long" ops when loading the bytecode, and only do so when it's safe/valid/correct to do so. Thus, there's no extra work when validating the bytecode.
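A rough C sketch of the load-time check and the fused access follows. Everything here is hypothetical: int_reg_file stands in for the interpreter's integer registers, the register count of 32 is assumed, and the sketch copies with memcpy instead of casting to (long long*) so that it stays clear of C aliasing rules. It is not a claim about how Parrot would actually do it.

```c
#include <stdint.h>
#include <string.h>

#define NUM_INT_REGS 32   /* assumed register count for this sketch */

/* Hypothetical stand-in for the interpreter's integer register file. */
typedef struct {
    int32_t regs[NUM_INT_REGS];
} int_reg_file;

/* A pair can be fused into one 64-bit access when the two registers are
 * adjacent and the first index is even, so the pair starts on an 8-byte
 * boundary relative to the start of the register array. */
static int pair_is_fusable(int first, int second)
{
    return first >= 0 && second == first + 1 &&
           second < NUM_INT_REGS && (first % 2) == 0;
}

/* Read a fused pair in native byte order. */
static int64_t load_fused(const int_reg_file *f, int first)
{
    int64_t v;
    memcpy(&v, &f->regs[first], sizeof v);
    return v;
}

/* Write a 64-bit value over a fused pair in native byte order. */
static void store_fused(int_reg_file *f, int first, int64_t v)
{
    memcpy(&f->regs[first], &v, sizeof v);
}
```

The loader would call something like pair_is_fusable once per op and only swap in the "long long" variant when it returns true; otherwise the portable register-pair op stays in place.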
> 4 isn't that bad. Not great, as it's more registers, and something of a waste on 64 bit systems, but...
When you say "64-bit system" here, do you mean ones where "int" is 64 bits, or where "int" is 32 bits, but there just happens to exist a 64 bit integer type as well?
Certainly for the latter, it's not a waste.
And for the former... well, we'd be wasting half of the memory that's in our "32-bit" registers (since we'd still use 64 bits of storage for each of our registers, even though we're "using" only 32 bits of it), but there's no speed penalty, and unless there's overflow of the 32 LSB, there's little harm in using a 64 bit integer as if it were a 32 bit integer.
The big waste, of course, is that if code doesn't *use* them, then it could be wasteful/costly to save them.
-- $a=24;split//,240513;s/\B/ => /for@@=qw(ac ab bc ba cb ca );{push(@b,$a),($a-=6)^=1 for 2..$a/6x--$|;print "$@[$a%6 ]\n";((6<=($a-=6))?$a+=$_[$a%6]-$a%6:($a=pop @b))&&redo;}
> I'm - as stated in the thread - for a new register type (L, long).
I'm thinking of virtual registers here... ie
    if sizeof(INTVAL) == 4
        'L' is a PMC
    else if sizeof(INTVAL) == 8
        'L' is an INTVAL
So 64 bit opcodes are compiled in differently for each platform? I.e., add a set of virtual 'L' registers, which turn into 32 new PMC regs on 32 bit systems and 32 new INTVALs on 64 bit systems?
That pushes the choice deep down to JIT compile time? (Actually "when the JIT is compiled" time :-)
Since, due to an accident, we have conv_i8 as an opcode and not as an explicit PMC call, we should be able to #ifdef its internals to match the hardware? This should hard-code it at compile time, with none of the runtime overhead.
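In C terms, the compile-time switch might look something like the sketch below. The names LREG and LREG_IS_NATIVE are made up for illustration, INTVAL is assumed here to be a plain long, and struct PMC is only an opaque stand-in. A real build would presumably key off a Configure-generated macro rather than LONG_MAX.

```c
#include <limits.h>

typedef long INTVAL;   /* assumption: INTVAL is the native long */
struct PMC;            /* opaque stand-in for a Parrot PMC */

#if LONG_MAX >= 0x7FFFFFFFFFFFFFFF
/* INTVAL already holds 64 bits: an L register is just an INTVAL. */
typedef INTVAL LREG;
#define LREG_IS_NATIVE 1
#else
/* INTVAL is narrower: an L register refers to a PMC doing the 64-bit math. */
typedef struct PMC *LREG;
#define LREG_IS_NATIVE 0
#endif
```

Since sizeof cannot be used in #if, the sketch tests the range of long instead; the effect is the same choice, made once when the interpreter is compiled.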
After all we saw lots of C compilers do something similar for floats :-)
Gopal -- The difference between insanity and genius is measured by success
On Fri, 2003-06-06 at 15:12, Dan Sugalski wrote:
> Our options, as I see them, are:
> 1) Make the I registers 64 bits
> 2) Make some way to gang together I registers to make 64 bit things
> 3) Have I registers switchable between 32 and 64 bit somehow
> 4) Have separate 32 and 64 bit I registers
> 5) Do guaranteed 64 bit math in PMCs
> The first is just out. It's an unreasonable slowdown on 32 bit (and some 64 bit) machines, for no overall win. The majority of integers will be smallish, and most of even the 32 bit range will be wasted.
I don't necessarily agree that this option is gone. IREGs are basically used for one of two things. To do non-PMC integer math, and to pass things to and from Parrot's guts. (And then you're talking a store and a load. I think passing them throughout Parrot is where the problem is.) So that would leave doing non-PMC integer math. That just doesn't sound like a whole lot. (But then again, I'm assuming that most math will be PMC-based, in order to handle int->num->str->big type conversions. If we want to minimize PMC-math, then perhaps this is a bigger deal.)
You know, there was a day when we'd just write some code and benchmark it to see *how* much slower it is....
No, no, no. Don't get up. I'll do it. :-)
Gluing together most of the IREG-based arithmetics pasm files, removing the prints, and wrapping an iterator around it.
Athlon 1 GHz, Linux 2.4.20. Identical Parrot configurations, save the size of INTVALs.
long long INTVALs: 4.98u @ 54%
long INTVALs     : 4.31u @ 54%
Difference, .67u @ 54%, or about 15%. (With the JIT, long long INTVALs were *much* faster, but only because they cheated and dumped core.)
So what percentage of a program is using the IREGs for math? 10%? 5%? 2%? That's a 1.5% to 0.3% overall slowdown. Keep those numbers in mind.
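For anyone who wants to check the arithmetic, here it is as a small C sketch. The function names are invented for this example; the inputs are the two timings above and the guessed fraction of run time spent in IREG math.

```c
/* Relative slowdown of the slower timing measured against the faster one. */
static double relative_slowdown(double slow, double fast)
{
    return (slow - fast) / fast;
}

/* Whole-program slowdown if only `fraction` of run time is IREG math. */
static double overall_slowdown(double ireg_slowdown, double fraction)
{
    return ireg_slowdown * fraction;
}
```

relative_slowdown(4.98, 4.31) comes to roughly 0.155, and scaling that by 0.10 and 0.02 gives roughly 1.5% and 0.3% respectively.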
> I don't like option 2, since it means that we speed-penalize 64 bit systems, which seems foolish.
See below.
> Option 3 wastes half the L1 cache space that I registers takes up. Fluffy caches--ick. Plus validating the bytecode will be... interesting, even at runtime.
See below.
> 4 isn't that bad. Not great, as it's more registers, and something of a waste on 64 bit systems, but...
See below.
> #5 is something of a cop-out, but I'm not quite sure how much.
See below.
> From what I can think, we need guaranteed 64 bit integers for file offsets, JVM & .NET support, and some fairly special-purpose math stuff. I'd tend to discount the special-purpose math stuff--that's not our target. JVM and .NET don't do much 64 bit stuff, but they do some. The file offset parts are in some ways the least of it, though we do need to have some internal support for 64 bits to get integer values out of PMCs without loss.
See below. Oh, wait. This *is* below. Okay, see here.
Let's back up a step. When it comes to integers, there are two types - no pun intended - of languages. Those that care, and those that don't.
Sized integer math has two properties to it, which are intertwined: dynamic range and mathematical semantics. (Dynamic range states that 8 bits can hold 8 bits worth of stuff, whether it's interpreted as signed, unsigned, or normalized (like exponents in IEEE floating point representations); as either numbers or bits. Mathematical semantics are what make
Although there will be cases where a typed language doesn't really care how large the range is, or what the mathematical semantics are, for a given type, there will be times that it does. So we've either got to provide, somehow, all types, or provide one type that emulates the semantics of all types.
Untyped languages simply don't care what they get underneath, as long as they work. Except, of course, when they're trying to tie into a typed language. (Pass a 16-bit int from Java to Perl, do some stuff, and pass it back, for instance.)
Hardware handles this with different ops, of course, although compilers cheat where they can (or have to). For Parrot, however, that means multiplying the number of ops by 4 or 5. (Multiple IREG ops would still be a common multiple and not an exponent, as you'd promote both integers to the same size.) I think we're op-heavy, already, and Parrot would then have to track integer sizes. (Although for untyped languages, that'd be easy, as they'd all be one size.) Plus, you'd have to map those onto the common set of IREGs. Or create 4 or 5 more. (And then decide how you handle things like integer promotion.)
Of course, you could continue to handle this with one op, albeit smart enough to handle the semantics of whatever size math you're doing. That way, you'd only be doing the slow, 64-bit math when you absolutely needed to.
The problem is, of course, those numbers up top I told you to remember. Writing that smart op is going to cost you far more than a mere 1.5%. You've slowed everything down to speed up one case, which, by the way, didn't speed up because you're jumping through such hoops to avoid it.
Even the JIT may not handle this efficiently. Certainly, at everything less than native size, it's normally a trivial tweak or two:
So that brings us back to one big, flat space. Either the ideal system width, which will run faster, or the largest width possible. If we choose the ideal system width, it may be too small to support typed languages, or the occasional system metric which requires it. (Like 64-bit file offsets, which Dan ever-so-kindly reminded me of.) If we choose the largest width possible, we slow things down, but we can mostly support everything.
I say mostly, because there's no telling how typed languages will feel about being run atop a unitype system, regardless of the size of that one type. Of course, the languages should feel free to either create their own PMCs that map to those types, or create the ops that their compiler would generate to handle the mathematical semantics of those types within Parrot's unitype:
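inline op add_8 (out INT, in INT, in INT) {
    /* C promotes the int8_t operands to int, so cast the sum back down to get 8-bit wraparound */
    $1 = (INTVAL)(int8_t)((int8_t)$2 + (int8_t)$3);
    goto NEXT();
}
inline op add_16 (out INT, in INT, in INT) {
    /* same idea: narrow the sum back to 16 bits before widening to INTVAL */
    $1 = (INTVAL)(int16_t)((int16_t)$2 + (int16_t)$3);
    goto NEXT();
}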
That puts the impetus on each language to track its own types, but all types map correctly in, out, and between languages. (And at the cost of only one more instruction.) And it only affects those in need.
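As a plain C illustration of what such a sized op has to compute (outside the ops-file syntax), the sketch below does the sum in unsigned 8-bit or 16-bit arithmetic and then reinterprets the result as signed. INTVAL is assumed to be a 64-bit type here, and the final unsigned-to-signed narrowing is implementation-defined in older C standards, though it wraps on the usual two's complement compilers.

```c
#include <stdint.h>

typedef int64_t INTVAL;   /* assumption: the wide "unitype" */

/* 8-bit signed add carried in a wide INTVAL. */
static INTVAL add_8(INTVAL a, INTVAL b)
{
    /* Operands are promoted to int for the add; narrowing the sum to
     * uint8_t keeps only the low 8 bits, giving well-defined wraparound. */
    uint8_t sum = (uint8_t)((uint8_t)a + (uint8_t)b);
    return (INTVAL)(int8_t)sum;   /* reinterpret as signed, then widen */
}

/* 16-bit signed add carried in a wide INTVAL. */
static INTVAL add_16(INTVAL a, INTVAL b)
{
    uint16_t sum = (uint16_t)((uint16_t)a + (uint16_t)b);
    return (INTVAL)(int16_t)sum;
}
```

Note that simply casting the two operands is not enough in C, because the addition itself happens at int width; the narrowing has to be applied to the sum.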
But then we're back to where we started, with these big INTVALs running amok needlessly throughout Parrot. We certainly want to minimize their usage, and, practically speaking, their usage is language level math.
After all, C is a typed language, and if we're going to interface with C (or, more accurately, Parrot's internals and the underlying system, which are written in C), then we can do it like above.
Luckily, for us, Parrot's internals (at any given time) are pretty well fixed, which means if an op - say, print - needs to pass a file number, that number will always be an int.
|---- LANGUAGE LAYER ----|----- INTERPRETER LAYER -----|
program <-> registers <-> ops <-> parrot internals <-> system
|---- ARBITRARY SIZES ---|------- SYSTEM SIZES --------|
The boundary between op code and Parrot internals is also the boundary between where arbitrarily sized numbers are needed (for language support) and where they are useless (to the system).
So let's convert when we cross that boundary.
1) We gain a performance boost in Parrot's internals, in both faster and smaller code.
2) We suffer a slight penalty in IREG math. (But we don't suffer a larger penalty trying to avoid it.)
3) We keep Parrot simple, and, well.... KISS.
4) We push the complexity and the decisions of integer types to the specific languages to implement as they see fit - PMC, op, or don't really care - while providing a common type to convert through, and without tying them to one all-encompassing model.
5) The coding rules are simple: ops are built on INTVALs, Parrot internals are not.[1]
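A minimal C sketch of what "convert when we cross that boundary" could look like inside an op. The helper name intval_to_int is hypothetical, INTVAL is assumed to be 64 bits wide, and what an op should do on failure (throw, clamp, whatever) is left open.

```c
#include <limits.h>
#include <stdint.h>

typedef int64_t INTVAL;   /* assumption: the wide language-level integer */

/* Narrow a language-level INTVAL to the plain int an internal routine
 * expects (a file number, say). Returns 1 and stores the value when it
 * fits, 0 when the value is out of range for int. */
static int intval_to_int(INTVAL v, int *out)
{
    if (v < INT_MIN || v > INT_MAX)
        return 0;
    *out = (int)v;
    return 1;
}
```

An op body would call this once on the way in and hand the resulting int to the internals, so nothing below the op layer ever sees an INTVAL.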
How far away are we?
For Parrot internals, it's largely a substitute job. Find the right type for the job, and fix the code. The ops need explicit casting added. The biggest problem is probably the JIT, because mandating 64-bit support means a long long on x86, which doesn't JIT right now. But, overall, that's not that far.
Thoughts?
[1] Of course, you know there *has* to be an exception. Currently, Parrot internally provides some direct support routines explicitly for INTVALs, namely stringification as part of the various *printf routines. I consider those type of routines more of an "op support library" than Parrot internals. (Functionally, although certainly not lexically, as it currently stands.)
-- Bryan C. Warnock bwarnock@(gtemail.net|raba.com)
On Fri, 2003-06-06 at 16:34, Leopold Toetsch wrote:
> > *) Integer constants are limited to 32 bit signed integers because they're inline.
> Yep. But this will cause problems with JIT/Prederef and multi threading, and it's already causing problems inside JIT on architectures with only small immediate constants. We have to consider these upcoming problems too. I don't see any reason not to have optionally/additionally - or always - integers in the const_table.
There must be *some* limit, even if it's the physical limit of the machine. Either that limit is hard - Parrot cannot support integers larger than that type - or it's soft - Parrot will work around the limit by promoting to arbitrary-width numbers.
If it's a soft limit (which it is), then the limit itself is arbitrary.
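To illustrate what a soft limit means in practice: the interpreter has to notice when a result no longer fits and hand off to something wider. The C sketch below only shows the detection half, for a 32-bit limit, using a hypothetical add_fits helper; the promotion to an arbitrary-width number is not shown.

```c
#include <stdint.h>

/* Add two 32-bit values. Returns 1 and stores the sum when it fits in
 * 32 bits; returns 0 when the caller would need to promote to a wider
 * or arbitrary-width representation. */
static int add_fits(int32_t a, int32_t b, int32_t *sum)
{
    int64_t wide = (int64_t)a + (int64_t)b;   /* cannot overflow in 64 bits */
    if (wide < INT32_MIN || wide > INT32_MAX)
        return 0;
    *sum = (int32_t)wide;
    return 1;
}
```

Wherever the limit sits, the same check applies; only the constants change, which is the sense in which the limit is arbitrary.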
> > *) INTVAL is meant to be the fastest native integer type for integer math that's at least 32 bits. That integer registers are INTVALs is an unfortunate side-effect, and one I'm tempted to do something about.
> In one of the FUPs, I had a different definition: An INTVAL is the size of the integer register. The fastest integer type on $arch ought to be just a plain C<int>. I think, when we look at the problem from this side, it should be simpler.
IIRC, Jarkko pointed out that that's not always true. (The *last* time I was waffling on sizes.)
> I'm - as stated in the thread - for a new register type (L, long).
> They have the same relation as 32bit ints and 64 bit longs, with the difference that we guarantee at least these sizes.
I'm not an actor, nor do I play one on TV. That being said, if you can handle making Parrot keep all the registers straight, I'm not averse to this. (What am I saying? Of course *you* can handle that. :-)
-- Bryan C. Warnock bwarnock@(gtemail.net|raba.com)
On Fri, 2003-06-06 at 21:47, Benjamin Goldberg wrote:
> And for the former... well, we'd be wasting half of the memory that's in our "32-bit" registers (since we'd still use 64 bits of storage for each of our registers, even though we're "using" only 32 bits of it), but there's no speed penalty, and unless there's overflow of the 32 LSB, there's little harm in using a 64 bit integer as if it were a 32 bit integer.
> The big waste, of course, is that if code doesn't *use* them, then it could be wasteful/costly to save them.
And it's this sort of rumination that made me think that this is all just false economics.
-- Bryan C. Warnock bwarnock@(gtemail.net|raba.com)