Currently, S, N, I and P registers are limited to 2 digits; that is, you
can only use:
[S|N|I|P]0 to [S|N|I|P]99.
For instance, this fails:
.sub main
P333 = new .Integer
P333 = 1
.end
Since Parrot does not have a limit of 32 registers per type anymore,
this is a bug, according to Coke:
[15:34] <kjs> it's limited to 2 digits
[15:35] <@cognominal> kjs: probably a linitation in the lexer :)
[15:35] <@Coke> that's a bug, SFAIK.
If this is indeed a bug, I think it should be easy to fix (as mentioned,
probably an IMCC limitation).
On Sun Mar 11 08:03:06 2007, kjs wrote:
> Currently, S, N, I and P registers are limited to 2 digits; that is, you
> can only use:
> [S|N|I|P]0 to [S|N|I|P]99.
> For instance, this fails:
> .sub main
> P333 = new .Integer
> P333 = 1
> Since Parrot does not have a limit of 32 registers per type anymore,
> this is a bug, according to Coke:
> [15:34] <kjs> it's limited to 2 digits
> [15:35] <@cognominal> kjs: probably a linitation in the lexer :)
> [15:35] <@Coke> that's a bug, SFAIK.
> If this is indeed a bug, I think it should be easy to fix (as mentioned,
> probably an IMCC limitation).
The attached patch changes the lexer to accept [S|N|I|P][0-9]+. Note
that the patch also changes compilers/imcc/imclexer.c, because the
lexer has to be regenerated.
With this lexer:
$ cat test.pir
.sub main :main
P111 = new Integer
P111 = 1
.end
$ ./parrot test.pir
I've run the tests and didn't find any test failing because of this
change. With this change we can have "P99999999999..999"; if a limit on
the number of digits (or size) in the register name is defined, I can
change the lexer to accept only names within the limit.
More testing is welcome.
> I've run the tests and didn't find any test failling because of this
> change. With this change we can have "P99999999999..999", if a limit to
> the number of digits (or size) in the register name is defined i can
> change the lexer to accept only names inside the limit.
> More testing is welcome.
What does this do to the register allocator and to memory usage? If I use
integer registers 10,000 and 100,000, will Parrot allocate a sparse data
What does this give over using unlimited remappable symbolic registers?
Of course not ;) PASM regs are "physical" registers. The register allocator
allocates regs from 0 up. If some other allocator (i.e. a PASM writer) is
allocating register P100000, it gets what it deserves, that is, a huge waste
IMHO there should be some (possibly per commandline overridable) limit of max
> What does this give over using unlimited remappable symbolic registers?
In the .Net translator it was a lot easier to reference registers
numerically when generating PIR; I guess there will be other cases where
that is the case too. Using ".local"s would have made it harder (I
assume that's what you meant by symbolic). And yes, admittedly I was
using numeric remappable registers ($Inn and so on), rather than Inn
directly, but the win for doing so was small (and it was probably a
non-win in terms of PIR->PBC time).
On the patch itself, I think it should at least be limited to something
that always fits in a 32-bit integer.
Do not forget that we are talking about the lexer here. Looking into the
grammar file, we can see that the register name matched by the changed
lexer rule is directly translated into an INTVAL (unless I'm
reading something wrong):
mk_pasm_reg(Interp *interp, char * name)
SymReg * r;
r->color = atoi(name+1);
The char* name is exactly the string matched in the lexer we are
discussing here. The 'color' field in the SymReg type definition is of type INTVAL.
Bottom line: string matched in lexer is directly mapped to an INTVAL
later, so IMHO the max allowed size of the register name in the lexer
should be the max allowed for an INTVAL.
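To make the point concrete, here is a minimal sketch of a range-checked conversion that could stand in for the bare atoi() call. It is not the patch and not IMCC code: parse_reg_number is a hypothetical helper name, and the 32-bit ceiling is an assumption taken from the limit suggested in this thread. atoi() has no way to report overflow, so the sketch uses strtoll() and checks errno.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not part of IMCC: parse the digits that follow the
 * register-type letter (e.g. "P8888") and reject anything that does not
 * fit in a signed 32-bit integer (assumed limit).
 * Returns 0 on success and stores the number in *out, -1 on error. */
int parse_reg_number(const char *name, int32_t *out)
{
    char *end;
    long long value;

    /* Need a type letter followed by at least one digit. */
    if (name == NULL || name[0] == '\0')
        return -1;
    if (name[1] < '0' || name[1] > '9')
        return -1;

    errno = 0;
    value = strtoll(name + 1, &end, 10);

    /* ERANGE: too large even for long long; *end: trailing junk. */
    if (errno == ERANGE || *end != '\0')
        return -1;
    if (value > INT32_MAX)
        return -1;

    *out = (int32_t)value;
    return 0;
}
```

With a check like this, the caller can raise an "out of range" error for a name such as P99999999999999999999 instead of storing a wrapped or undefined value in the color field.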
> I've allowed for any number of digits in register names, as long that
> the number specified always fits in a 32 integer value. Meaning, if the
> number defined in the register name is less than MAX_INT then it's ok to
> proceed, else get an error to avoid segmentation fault. Example:
> $ cat test.pir
> .sub main :main
> P8888 = new Integer
> P8888 = 1
> print P8888
> $ ./parrot test.pir
> $ cat test2.pir
> .sub main :main
> P99999999999999999999 = new Integer
> P99999999999999999999 = 1
> print P99999999999999999999
> $ ./parrot test2.pir
> error:imcc:number in 'P99999999999999999999' out of range
> in file 'test2.pir' line 2
That's much nicer. Could the error message be "register number in...."
though? One extra word of clarity could help.
Please folks, get serious. INTVAL allows 2^31/2^63 registers. A register is
taking 4/8 bytes of mem. Multiply.
Or IOW allowing arbitrary PASM register numbers (n) is super inefficient.
Parrot will allocate contiguous memory (per sub/recursion where it's used) to
hold 0..n register cells. Limit it to some reasonable amount (e.g.
1024) and make a commandline switch to increase that for special purpose
As a side note: when you now say, we could compact these register numbers to a
contiguous range, yes: that's what the default dumb & fast register allocator
is already doing now with e.g. $P123456 and $P10.
Making this sparse will slow down interpreter execution time by orders of
magnitude and is totally nonsensical.
Using huge ~INTVAL-ranged register numbers will produce meaningless error
messages, when dying due to out of memory errors.
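The "multiply" above can be spelled out with a small sketch. It assumes only what this message states: one contiguous block holding cells 0..n, at 4 or 8 bytes per cell. frame_bytes is a hypothetical name used for illustration, and the real frame layout in Parrot may carry extra overhead that is not modelled here.

```c
#include <stdint.h>

/* Hypothetical helper: bytes needed for one contiguous register frame
 * holding cells 0..highest_reg, at cell_size bytes per cell.
 * Returns 0 if the result would not fit in a uint64_t. */
uint64_t frame_bytes(uint64_t highest_reg, uint64_t cell_size)
{
    uint64_t cells;

    if (highest_reg == UINT64_MAX)
        return 0;                      /* highest_reg + 1 would wrap */
    cells = highest_reg + 1;           /* cells 0..highest_reg inclusive */

    if (cell_size != 0 && cells > UINT64_MAX / cell_size)
        return 0;                      /* multiplication would overflow */
    return cells * cell_size;
}
```

Under these assumptions, frame_bytes(1023, 8) is 8192 bytes for a 1024-register limit, frame_bytes(100000, 8) is 800008 bytes for a single use of P100000, and frame_bytes(2147483647, 4) is 8589934592 bytes (8 GiB) per frame at the top of the 32-bit range.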
Consider this a poke to the Architect to verify/replace the previous
overturn of the original 32-register limit.
Will "Coke" Coleda
I agree with most POVs, and I'm available to change the patch as soon
as anyone makes a 'sane' and final decision on the limit.
The advantage of allowing unlimited registers per sub is that it frees
us from spending quite so many cycles on register spilling. What's the
advantage of adding a limit?
PIR/PASM should of course parse whatever register numbers we decide are
legal. But, the lexer should only impose limitations that are also
imposed internally. PIR/PASM are not the only ways to generate bytecode.
If we set a lexer limitation of, for example, 250 registers, the
internal register allocator could validly create and use registers that
can't be directly accessed from PIR/PASM code, which could prove
problematic for debugging tools.
- Change the lexer to allow the maximum size of a 32 bit integer. (This
doesn't cost anything, since we're already storing the value as an integer.)
- Leave it to the interpreter and optimizer to decide when a sub should
have enough registers for every possible value, and when to employ
- Set coding standards for hand-written PIR to use the temporary
register variables ($P0) or local variables (.local pmc foo) instead of
direct registers (P0), except when direct register access is absolutely
If at some point we put an internal restriction on how many registers a
sub can have, then we change the lexer to match.