[perl #41788] [BUG] Real registers are limited to 2 digits


Klaas-Jan Stol

Mar 11, 2007, 11:03:07 AM
to bugs-bi...@rt.perl.org
# New Ticket Created by Klaas-Jan Stol
# Please include the string: [perl #41788]
# in the subject line of all future correspondence about this issue.
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=41788 >


Currently, S, N, I and P registers are limited to 2 digits; that is, you
can only use:

[S|N|I|P]0 to [S|N|I|P]99.

For instance, this fails:

.sub main
P333 = new .Integer
P333 = 1
.end

Since Parrot does not have a limit of 32 registers per type anymore,
this is a bug, according to Coke:

[15:34] <kjs> it's limited to 2 digits
[15:35] <@cognominal> kjs: probably a limitation in the lexer :)
[15:35] <@Coke> that's a bug, SFAIK.


If this is indeed a bug, I think it should be easy to fix (as mentioned,
probably an IMCC limitation).

regards,
kjs

Nuno Carvalho via RT

Mar 12, 2007, 1:52:31 PM
to perl6-i...@perl.org
Greetings,

On Sun Mar 11 08:03:06 2007, kjs wrote:
> Currently, S, N, I and P registers are limited to 2 digits; that is, you
> can only use:
>
> [S|N|I|P]0 to [S|N|I|P]99.
>
> For instance, this fails:
>
> .sub main
> P333 = new .Integer
> P333 = 1
> .end
>
> Since Parrot does not have a limit of 32 registers per type anymore,
> this is a bug, according to Coke:
>
> [15:34] <kjs> it's limited to 2 digits
> [15:35] <@cognominal> kjs: probably a limitation in the lexer :)
> [15:35] <@Coke> that's a bug, SFAIK.
>
>
> If this is indeed a bug, I think it should be easy to fix (as mentioned,
> probably an IMCC limitation).

The attached patch changes the lexer to accept [S|N|I|P][0-9]+. Note
that the patch also changes compilers/imcc/imclexer.c, because the
lexer has to be regenerated.

With this lexer:
$ cat test.pir
.sub main :main
P111 = new Integer
P111 = 1
print P111
.end
$ ./parrot test.pir
1

I've run the tests and didn't find any test failing because of this
change. With this change we can have "P99999999999..999"; if a limit on
the number of digits (or size) in the register name is defined, I can
change the lexer to accept only names within the limit.

More testing is welcome.

> regards,
> kjs
Best regards,
./smash


Chromatic

Mar 12, 2007, 4:34:41 PM
to perl6-i...@perl.org, parrotbug...@parrotcode.org
On Monday 12 March 2007 10:52, Nuno Carvalho via RT wrote:

> I've run the tests and didn't find any test failing because of this
> change. With this change we can have "P99999999999..999"; if a limit on
> the number of digits (or size) in the register name is defined, I can
> change the lexer to accept only names within the limit.
>
> More testing is welcome.

What does this do to the register allocator and to memory usage? If I use
integer registers 10,000 and 100,000, will Parrot allocate a sparse data
structure?

What does this give over using unlimited remappable symbolic registers?

-- c

Leopold Toetsch

Mar 12, 2007, 5:00:28 PM
to perl6-i...@perl.org, chromatic, parrotbug...@parrotcode.org
On Monday, 12 March 2007 21:34, chromatic wrote:
> What does this do to the register allocator and to memory usage?  If I use
> integer registers 10,000 and 100,000, will Parrot allocate a sparse data
> structure?

Of course not ;) PASM regs are "physical" registers. The register allocator
allocates regs from 0 up. If some other allocator (i.e. a PASM writer)
allocates register P100000, it gets what it deserves: a huge waste of
memory.

IMHO there should be some limit on the maximum register number (possibly
overridable on the command line).

leo

Jonathan Worthington

Mar 12, 2007, 6:25:51 PM
to chromatic, perl6-i...@perl.org, parrotbug...@parrotcode.org
chromatic wrote:
> On Monday 12 March 2007 10:52, Nuno Carvalho via RT wrote:
>
>
>> I've run the tests and didn't find any test failing because of this
>> change. With this change we can have "P99999999999..999"; if a limit on
>> the number of digits (or size) in the register name is defined, I can
>> change the lexer to accept only names within the limit.
>>
>> More testing is welcome.
>>
>
> What does this do to the register allocator and to memory usage? If I use
> integer registers 10,000 and 100,000, will Parrot allocate a sparse data
> structure?
>
The assumption is that if you're naming actual registers rather than
asking Parrot to allocate them for you, you are generating sensible
code. :-)

> What does this give over using unlimited remappable symbolic registers?
>

In the .Net translator it was a lot easier to reference registers
numerically when generating PIR; I guess there will be other cases where
that is the case too. Using ".local"s would have made it harder (I
assume that's what you meant by symbolic). And yes, admittedly I was
using numeric remappable registers ($Inn and so on), rather than Inn
directly, but the win for doing so was small (and it was probably a
non-win in terms of PIR->PBC time).

On the patch itself, I think it should at least be limited to something
that always fits in a 32-bit integer.

Jonathan

Nuno Carvalho via RT

Mar 13, 2007, 6:55:22 AM
to perl6-i...@perl.org
Greetings,

Do not forget that we are talking about the lexer here. Looking at the
grammar file, we can see that the register name matched by the changed
lexer rule is converted directly to an INTVAL (unless I'm reading
something wrong):

SymReg *
mk_pasm_reg(Interp *interp, char * name)
{
    SymReg * r;
    (...)
    r->color = atoi(name+1);
    (...)
}

The char *name is exactly the string matched by the lexer rule we are
discussing here. The 'color' field in the SymReg type definition is of
type INTVAL.

Bottom line: the string matched in the lexer is directly mapped to an INTVAL
later, so IMHO the max allowed size of the register name in the lexer
should be the max allowed for an INTVAL.
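
A minimal C sketch of the range check this implies. The helper name parse_reg_number and the cap MAX_REG_NUMBER are hypothetical, chosen for illustration; neither comes from the patch or from IMCC. Since atoi() offers no way to detect overflow, the sketch uses strtol() and inspects errno instead:

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cap: the thread has not settled on an actual limit. */
#define MAX_REG_NUMBER INT32_MAX

/* Parse the number in a PASM register name such as "P333".
 * Returns 0 and stores the number in *out on success,
 * or -1 if the name is malformed or the number is out of range. */
int parse_reg_number(const char *name, long *out)
{
    char *end;
    long value;

    assert(name != NULL && out != NULL);

    /* Expect one of S, N, I, P as the register type letter. */
    if (name[0] == '\0' || strchr("SNIP", name[0]) == NULL)
        return -1;

    /* Require a digit right away: strtol() alone would also accept
     * leading whitespace and a sign. */
    if (!isdigit((unsigned char)name[1]))
        return -1;

    errno = 0;
    value = strtol(name + 1, &end, 10);

    if (*end != '\0')                 /* trailing non-digit characters */
        return -1;
    if (errno == ERANGE || value > MAX_REG_NUMBER)
        return -1;                    /* number does not fit the cap */

    *out = value;
    return 0;
}
```

A caller would report an "out of range" error when the function returns -1, rather than passing an overflowed value on to the register allocator.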

./smash

Chromatic

Mar 13, 2007, 1:40:03 PM
to perl6-i...@perl.org, parrotbug...@parrotcode.org
On Tuesday 13 March 2007 07:54, Nuno Carvalho via RT wrote:

> I've allowed for any number of digits in register names, as long as
> the number specified always fits in a 32-bit integer value. Meaning, if the
> number defined in the register name is less than MAX_INT then it's ok to
> proceed; otherwise you get an error, to avoid a segmentation fault. Example:
>
> $ cat test.pir
> .sub main :main
> P8888 = new Integer
> P8888 = 1
> print P8888
> .end
> $ ./parrot test.pir
> 1
>
> $ cat test2.pir
> .sub main :main
> P99999999999999999999 = new Integer
> P99999999999999999999 = 1
> print P99999999999999999999
> .end
> $ ./parrot test2.pir
> error:imcc:number in 'P99999999999999999999' out of range
> in file 'test2.pir' line 2

That's much nicer. Could the error message be "register number in...."
though? One extra word of clarity could help.

-- c

Leopold Toetsch

Mar 13, 2007, 6:45:04 PM
to perl6-i...@perl.org
On Tuesday, 13 March 2007 11:55, Nuno Carvalho via RT wrote:
> so IMHO the max allowed size of the register name in the lexer
> should be the max allowed for an INTVAL.

Please folks, get serious. INTVAL allows 2^31/2^63 registers. A register
takes 4/8 bytes of memory. Multiply.

Or IOW allowing arbitrary PASM register numbers (n) is super inefficient.
Parrot will allocate contiguous memory (per sub/recursion where it's used) to
hold 0..n register cells [1] [2]. Limit it to some reasonable amount (e.g.
1024) and make a command-line switch to increase that for special-purpose
code.
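
To make the multiplication concrete, here is a small C sketch. The function name reg_frame_bytes is hypothetical, and the 4- or 8-byte cell size is taken from the figures above rather than from Parrot's source:

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to hold register cells 0..highest contiguously,
 * assuming each cell takes cell_size bytes (4 or 8 in this message).
 * This is the cost for one register type in one sub invocation. */
uint64_t reg_frame_bytes(uint64_t highest, uint64_t cell_size)
{
    assert(cell_size > 0);
    /* Guard against overflow of the 64-bit product. */
    assert(highest < UINT64_MAX / cell_size);
    return (highest + 1) * cell_size;
}
```

Under these assumptions, a highest register number of 1023 with 8-byte cells costs 8192 bytes, P100000 costs 800008 bytes, and a highest number of 2^31 - 1 with 4-byte cells costs 2^33 bytes (8 GiB), each time per sub or recursion level.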

As a side note: if you now say that we could compact these register numbers
into a contiguous range, yes, that's what the default dumb & fast register
allocator already does with, e.g., $P123456 and $P10.
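
The compaction idea can be sketched in C as follows. This is an illustration only, not IMCC's actual allocator; struct reg_map, reg_map_index and MAX_SYMS are hypothetical names. Each symbolic number receives the next free dense index the first time it is seen:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SYMS 64   /* hypothetical capacity for this sketch */

struct reg_map {
    long   sym[MAX_SYMS];  /* symbolic numbers in order of first use */
    size_t count;          /* number of dense indexes handed out */
};

/* Return the dense index for symbolic register number sym, assigning
 * the next free index on first sight. Returns -1 if the table is full. */
int reg_map_index(struct reg_map *m, long sym)
{
    size_t i;

    assert(m != NULL);

    /* Linear search: already mapped? */
    for (i = 0; i < m->count; i++)
        if (m->sym[i] == sym)
            return (int)i;

    if (m->count == MAX_SYMS)
        return -1;

    m->sym[m->count] = sym;
    return (int)m->count++;
}
```

With an empty map, looking up 123456 and then 10 yields indexes 0 and 1, so only two register cells are needed no matter how large the symbolic numbers are.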

Thanks,
leo

[1] Making this sparse will slow down interpreter execution time by orders
of magnitude and is totally nonsensical.

[2] Using huge ~INTVAL-ranged register numbers will produce meaningless error
messages when dying from out-of-memory errors.

Will Coleda

Mar 13, 2007, 7:19:21 PM
to Leopold Toetsch, Allison Randal, Perl 6 Internals
Having a limit is more than reasonable, agreed: the goal of this
patch was to bring the code into agreement with the docs.

Consider this a poke to the Architect to verify/replace the previous
overturn of the original 32-register limit.

--
Will "Coke" Coleda
wi...@coleda.com


Nuno Carvalho via RT

Mar 14, 2007, 7:44:51 AM
to perl6-i...@perl.org
Hi again,

I agree with most POVs, and I'm available to change the patch as soon
as anyone makes a 'sane' and final decision on the limit.

Best regards,
./smash

Klaas-Jan Stol

Mar 14, 2007, 8:23:00 AM
to parrotbug...@parrotcode.org, perl6-i...@perl.org
Nuno Carvalho via RT wrote:
To mention another register-based VM: the Lua VM has 250 registers (though
that's a build option) (http://www.tecgraf.puc-rio.br/~lhf/ftp/doc/jucs05.pdf)

my 2c,

kjs

Allison Randal

Mar 17, 2007, 12:23:24 AM
to Will Coleda, Leopold Toetsch, Perl 6 Internals
Will Coleda wrote:
> Having a limit is more than reasonable, agreed: the goal of this patch
> was to bring the code into agreement with the docs.
>
> Consider this a poke to the Architect to verify/replace the previous
> overturn of the original 32-register limit.

The advantage of allowing unlimited registers per sub is that it frees
us from spending quite so many cycles on register spilling. What's the
advantage of adding a limit?

PIR/PASM should of course parse whatever register numbers we decide are
legal. But, the lexer should only impose limitations that are also
imposed internally. PIR/PASM are not the only ways to generate bytecode.
If we set a lexer limitation of, for example, 250 registers, the
internal register allocator could validly create and use registers that
can't be directly accessed from PIR/PASM code, which could prove
problematic for debugging tools.

My take:

- Change the lexer to allow the maximum size of a 32-bit integer. (This
doesn't cost anything, since we're already storing the value as an integer.)

- Leave it to the interpreter and optimizer to decide when a sub should
have enough registers for every possible value, and when to employ
register spilling/reuse.

- Set coding standards for hand-written PIR to use the temporary
register variables ($P0) or local variables (.local pmc foo) instead of
direct registers (P0), except when direct register access is absolutely
necessary.

If at some point we put an internal restriction on how many registers a
sub can have, then we change the lexer to match.

Allison
