# New Ticket Created by Klaas-Jan Stol
# Please include the string: [perl #41788]
# in the subject line of all future correspondence about this issue.
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=41788 >
Currently, S, N, I and P registers are limited to 2 digits; that is, you can only use:
[S|N|I|P]0 to [S|N|I|P]99.
For instance, this fails:
.sub main
    P333 = new .Integer
    P333 = 1
.end
Since Parrot does not have a limit of 32 registers per type anymore, this is a bug, according to Coke:
[15:34] <kjs> it's limited to 2 digits
[15:35] <@cognominal> kjs: probably a limitation in the lexer :)
[15:35] <@Coke> that's a bug, SFAIK.
If this is indeed a bug, I think it should be easy to fix (as mentioned, probably an IMCC limitation).
The attached patch changes the lexer to accept [S|N|I|P][0-9]+. Note that the patch also changes compilers/imcc/imclexer.c, because a new lexer has to be generated.
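To make the difference concrete, here is a minimal sketch in Python (not the actual flex rule from the patch). It assumes the old rule accepted a register letter followed by at most two digits; that shape is inferred from the behaviour reported in the ticket, not read from the lexer source. The names OLD_REGISTER, NEW_REGISTER and is_register are made up for illustration.

```python
import re

# Assumed shape of the old rule: a register letter plus at most two digits.
OLD_REGISTER = re.compile(r"[SNIP][0-9]{1,2}")
# The rule described for the patch: a register letter plus one or more digits.
NEW_REGISTER = re.compile(r"[SNIP][0-9]+")


def is_register(pattern, token):
    """Return True if the whole token matches the given register pattern."""
    return pattern.fullmatch(token) is not None


for token in ("P0", "P99", "P333"):
    print(token, is_register(OLD_REGISTER, token), is_register(NEW_REGISTER, token))
```

Under that assumption, P333 is rejected by the old pattern and accepted by the new one, which matches the failure described in the original report.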
With this lexer:

$ cat test.pir
.sub main :main
    P111 = new Integer
    P111 = 1
    print P111
.end
$ ./parrot test.pir
1
I've run the tests and didn't find any test failing because of this change. With this change we can have "P99999999999..999". If a limit on the number of digits (or size) in the register name is defined, I can change the lexer to accept only names inside the limit.
On Monday 12 March 2007 10:52, Nuno Carvalho via RT wrote:
> I've run the tests and didn't find any test failing because of this
> change. With this change we can have "P99999999999..999". If a limit on
> the number of digits (or size) in the register name is defined, I can
> change the lexer to accept only names inside the limit.
> More testing is welcome.
What does this do to the register allocator and to memory usage? If I use integer registers 10,000 and 100,000, will Parrot allocate a sparse data structure?
What does this give over using unlimited remappable symbolic registers?
> What does this do to the register allocator and to memory usage? If I use
> integer registers 10,000 and 100,000, will Parrot allocate a sparse data
> structure?
Of course not ;) PASM regs are "physical" registers. The register allocator allocates regs from 0 up. If some other allocator (i.e. a PASM writer) allocates register P100000, it gets what it deserves: a huge waste of memory.
IMHO there should be some limit on the maximum register number (possibly overridable from the command line).
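As an illustration of such an overridable limit, here is a small Python sketch. Everything in it is hypothetical: the flag name --max-registers, the default of 1024 and the helper check_register are assumptions made for the example, not existing parrot options or IMCC functions.

```python
import argparse

# Assumed default for illustration only; not a Parrot constant.
DEFAULT_MAX_REGISTERS = 1024


def build_parser():
    parser = argparse.ArgumentParser(description="register limit sketch")
    # "--max-registers" is a hypothetical flag, not an existing parrot option.
    parser.add_argument("--max-registers", type=int, default=DEFAULT_MAX_REGISTERS)
    return parser


def check_register(number, max_registers):
    """Return the number if it lies in 0..max_registers-1, else raise ValueError."""
    if not 0 <= number < max_registers:
        raise ValueError(f"register number {number} out of range (limit {max_registers})")
    return number


# With the default limit, ordinary register numbers pass.
args = build_parser().parse_args([])
check_register(99, args.max_registers)

# Special-purpose code can raise the limit explicitly.
args = build_parser().parse_args(["--max-registers", "200000"])
check_register(100000, args.max_registers)
```

The point of the design is that a large register number has to be asked for deliberately instead of being accepted silently.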
chromatic wrote:
> On Monday 12 March 2007 10:52, Nuno Carvalho via RT wrote:
>> I've run the tests and didn't find any test failing because of this
>> change. With this change we can have "P99999999999..999". If a limit on
>> the number of digits (or size) in the register name is defined, I can
>> change the lexer to accept only names inside the limit.
>> More testing is welcome.
> What does this do to the register allocator and to memory usage? If I use integer registers 10,000 and 100,000, will Parrot allocate a sparse data structure?
The assumption is that if you're naming actual registers rather than asking Parrot to allocate them for you, you are generating sensible code. :-)
> What does this give over using unlimited remappable symbolic registers?
In the .NET translator it was a lot easier to reference registers numerically when generating PIR; I guess the same will hold in other cases too. Using ".local"s would have made it harder (I assume that's what you meant by symbolic). And yes, admittedly I was using numeric remappable registers ($Inn and so on), rather than Inn directly, but the win for doing so was small (and it was probably a non-win in terms of PIR->PBC time).
On the patch itself, I think it should at least limit the number to something that always fits in a 32-bit integer.
> Jonathan
Do not forget that we are talking about the lexer here. Looking into the grammar file, we can see that the register name matched by the changed lexer rule is translated directly into an INTVAL (unless I'm reading something wrong):
The char* name is exactly the string matched by the lexer rule we are discussing here. The 'color' field in the SymReg type definition is of type INTVAL.
Bottom line: the string matched in the lexer is mapped directly to an INTVAL later, so IMHO the max allowed size of the register name in the lexer should be the max allowed for an INTVAL.
> I've allowed for any number of digits in register names, as long as
> the number specified always fits in a 32-bit integer value. Meaning, if the
> number defined in the register name is less than MAX_INT then it's ok to
> proceed, else get an error to avoid a segmentation fault. Example:
> $ cat test2.pir
> .sub main :main
>     P99999999999999999999 = new Integer
>     P99999999999999999999 = 1
>     print P99999999999999999999
> .end
> $ ./parrot test2.pir
> error:imcc:number in 'P99999999999999999999' out of range
>         in file 'test2.pir' line 2
That's much nicer. Could the error message be "register number in...." though? One extra word of clarity could help.
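The range check described in the quoted message can be sketched as follows. This is a Python illustration, not the C code from the patch: INT32_MAX, REGISTER and parse_register are names invented for the example, and the error text uses the "register number in ..." wording suggested above in place of the message the patch actually prints.

```python
import re

# Largest value of a signed 32-bit integer, used here as the assumed limit.
INT32_MAX = 2**31 - 1
REGISTER = re.compile(r"([SNIP])([0-9]+)")


def parse_register(name, limit=INT32_MAX):
    """Split a register name into (type letter, number), rejecting huge numbers."""
    match = REGISTER.fullmatch(name)
    if match is None:
        raise ValueError(f"'{name}' is not a register name")
    # Python integers do not overflow, so the comparison is safe for any length.
    number = int(match.group(2))
    if number > limit:
        raise ValueError(f"register number in '{name}' out of range")
    return match.group(1), number


print(parse_register("P111"))
try:
    parse_register("P99999999999999999999")
except ValueError as error:
    print("error:", error)
```

In C the comparison has to be done before or during conversion (for example by checking the digit count or using a checked conversion), since the value cannot be stored first and tested afterwards.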
On Tuesday, 13 March 2007 11:55, Nuno Carvalho via RT wrote:
> so IMHO the max allowed size of the register name in the lexer
> should be the max allowed for an INTVAL.
Please folks, get serious. INTVAL allows 2^31/2^63 registers. A register takes 4/8 bytes of memory. Multiply.
Or IOW, allowing arbitrary PASM register numbers (n) is super inefficient. Parrot will allocate contiguous memory (per sub/recursion where it's used) to hold 0..n register cells [1] [2]. Limit it to some reasonable amount (e.g. 1024) and add a command-line switch to increase that for special-purpose code.
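The multiplication can be spelled out with a short Python sketch. It only models the statement above, a contiguous block of cells 0..n at 4 or 8 bytes each; frame_bytes is a name invented for the example and the real per-sub layout in Parrot may carry additional overhead.

```python
def frame_bytes(highest_register, cell_size):
    """Bytes needed for a contiguous block holding cells 0..highest_register."""
    return (highest_register + 1) * cell_size


for highest, cell_size in ((99, 4), (100_000, 4), (2**31 - 1, 4), (2**63 - 1, 8)):
    print(f"highest register {highest}: {frame_bytes(highest, cell_size)} bytes")
```

For the 32-bit case this comes to 2^31 cells of 4 bytes, that is 2^33 bytes (8 GiB) for a single register type in a single sub invocation.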
As a side note: if you now say we could compact these register numbers to a contiguous range, yes: that's what the default dumb & fast register allocator is already doing with e.g. $P123456 and $P10.
Thanks, leo
[1] making this sparse will slow down interpreter execution time by orders of magnitude and is totally nonsensical.
[2] using huge ~INTVAL-ranged register numbers will produce meaningless error messages when dying from out-of-memory errors.
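The compaction mentioned in the side note, mapping symbolic registers such as $P123456 and $P10 onto a contiguous range, can be illustrated roughly like this. It is a simplified Python sketch that numbers registers in order of first use; it is not IMCC's actual allocator, and compact is a name made up for the example.

```python
def compact(symbolic_numbers):
    """Map symbolic register numbers to 0..k-1 in order of first use."""
    mapping = {}
    for number in symbolic_numbers:
        if number not in mapping:
            # The next free physical slot is simply the count of slots used so far.
            mapping[number] = len(mapping)
    return mapping


# $P123456 and $P10 end up in physical slots 0 and 1.
print(compact([123456, 10, 123456]))
```

With this kind of mapping the size of the symbolic number has no effect on memory use; only the count of distinct registers matters.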
> Having a limit is more than reasonable, agreed: the goal of this
> patch was to bring the code into agreement with the docs.
> Consider this a poke to the Architect to verify/replace the previous
> overturn of the original 32-register limit.
> On Mar 13, 2007, at 6:45 PM, Leopold Toetsch wrote:
I agree with most POVs, and I'm available to change the patch as soon as anyone makes a 'sane' and final decision on the limit.
Will Coleda wrote:
> Having a limit is more than reasonable, agreed: the goal of this patch
> was to bring the code into agreement with the docs.
> Consider this a poke to the Architect to verify/replace the previous
> overturn of the original 32-register limit.
The advantage of allowing unlimited registers per sub is that it frees us from spending quite so many cycles on register spilling. What's the advantage of adding a limit?
PIR/PASM should of course parse whatever register numbers we decide are legal. But, the lexer should only impose limitations that are also imposed internally. PIR/PASM are not the only ways to generate bytecode. If we set a lexer limitation of, for example, 250 registers, the internal register allocator could validly create and use registers that can't be directly accessed from PIR/PASM code, which could prove problematic for debugging tools.
My take:
- Change the lexer to allow the maximum size of a 32-bit integer. (This doesn't cost anything, since we're already storing the value as an integer.)
- Leave it to the interpreter and optimizer to decide when a sub should have enough registers for every possible value, and when to employ register spilling/reuse.
- Set coding standards for hand-written PIR to use the temporary register variables ($P0) or local variables (.local pmc foo) instead of direct registers (P0), except when direct register access is absolutely necessary.
If at some point we put an internal restriction on how many registers a sub can have, then we change the lexer to match.