I think it may be a handy thing if someone'd go through and draw up a
set of rules for the use of temps, and things that'll cause the
register coloring algorithm to go mad. (I'd like to avoid 30 minute
compile sessions--it's a tad tedious :)
--
Dan
--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk
> I think it may be a handy thing if someone'd go through and draw up a
> set of rules for the use of temps, and things that'll cause the
> register coloring algorithm to go mad. (I'd like to avoid 30 minute
> compile sessions--it's a tad tedious :)
Are you still sticking everything in one big _main?
leo
Should be faster now by some factor.
How many symbols are in the biggest compilation unit (parrot -v
"registers in .imc")?
leo
Cool, thanks. I've an optimized build of parrot going now, and I'll
see what things look like when it's dine.
>How many symbols are in the biggest compilation unit (parrot -v
>"registers in .imc")?
Dunno what parrot thinks--it's not done yet. grep says 569 .locals
and 473 temp PMC registers. ($Px) I think that can reasonably be
considered A Lot. I'm rejigging the compiler to cut down on the
number of .local declarations, but that'll increase the temp pmc
usage, at least with the relatively simple temp system I've got now.
(I can throw dummy labels in to create fake basic blocks if that'll
help the register coloring code)
Except for the library code, yup. No way around that.
I've put in another factor ~2.5 change for a unit with 2000 temps.
leo
> Dunno what parrot thinks--it's not done yet. grep says 569 .locals
> and 473 temp PMC registers.
I've now enabled some more code that speeds up temp allocation more
(~2.5 for 2000 temps in a unit). This needs that the usage range is
limited to 10 lines. If the code from a compiler looks like the output
from the program below, this works fine.
> ... I'm rejigging the compiler to cut down on the
> number of .local declarations, but that'll increase the temp pmc
> usage, at least with the relatively simple temp system I've got now.
Temps are fine. ".local"s ranging from top of the program (not counting
the declaration) down do hurt. Many small short ranged temps are much
better then long ranged vars because they have more interferences on
each other.
> (I can throw dummy labels in to create fake basic blocks if that'll
> help the register coloring code)
That makes it worse. More blocks, more life ranges to compare.
If you can emit code similar to "gen.pl" it could take advantage of my
last change.
leo
$ cat gen.pl
#!/usr/bin/perl
use strict;
# a = 1 + 1 + 1 ... + 1
my ($i, $t, $N);
$N = $ARGV[0] || 299;
print ".sub _main \@MAIN\n";
$t++;
print "\t\$P$t = new PerlInt\n";
print "\t\$P$t = 1\n";
for $i (0..$N) {
$t++;
print "\t\$P$t = new PerlInt\n";
print "\t\$P$t = 1\n";
$t++;
print "\t\$P$t = new PerlInt\n";
print "\t\$P$t = \$P@{[$t-1]} + \$P@{[$t-2]}\n";
}
print "\tprint \$P$t\n";
print "\tprint \"\\n\"\n";
print ".end\n";
I think that what "real compilers" do to solve this problem is
implement two different algorithms: one for normal subs which tries
to generate optimal code, and a naive one for very large subs with
many virtual registers.
That makes compilation much faster, and the execution penalty doesn't
hurt too much.
Actually, it's (for me) an open question whether the "good" register
allocation should be the default one. Perl (and python and..) users
expect blazing compilation times, so maybe we should reserve it for
higher -O levels.
But then, we won't know how bad are our compilation times until there
are "real" programs written in perl6/parrot.
-angel
This sped it up a lot. The output is:
Starting parse...
sub _MAIN:
registers in .imc: I34, N0, S7, P1014
0 labels, 0 lines deleted, 0 if_branch, 0 branch_branch
0 used once deleted
0 invariants_moved
registers needed: I43, N0, S12, P3327
registers in .pasm: I25, N0, S20, P32 - 464 spilled
2007 basic_blocks, 2079 edges
13691 lines compiled.
Writing sample.pbc
packed code 348208 bytes
sample.pbc written.
Which actually came out in reasonable time, rather than me giving up
after 45 minutes. :) Still takes ages, so I've a lot of work to do
here.
> > ... I'm rejigging the compiler to cut down on the
>> number of .local declarations, but that'll increase the temp pmc
>> usage, at least with the relatively simple temp system I've got now.
>
>Temps are fine. ".local"s ranging from top of the program (not counting
>the declaration) down do hurt. Many small short ranged temps are much
>better then long ranged vars because they have more interferences on
>each other.
Hrm. Does the code currently consider something like:
$P0 = foo
to start a new lifetime for $P0?
Ouch. Register allocation is spending huge times during spilling.
Something is definetely wrong with your code - wrong in the sense of: it
doesn't play with, what imcc expects ;)
You must have too much pseudo-globals in that unit, spanning a huge
range and interfering with one another. Do you use lexical vars or
Parrot globals?
> Hrm. Does the code currently consider something like:
> $P0 = foo
> to start a new lifetime for $P0?
If you don't use $P0 beyond that point yes. Do you name all
temps $P0 or some such? Or are you giving them unique names. You should
do the latter.
leo
Yep, the 45 minute compile times were the hint that maybe imcc and I
were fighting a bit. :)
>You must have too much pseudo-globals in that unit, spanning a huge
>range and interfering with one another. Do you use lexical vars or
>Parrot globals?
I was using .locals for the actual variables in the source program,
and $Px for all the temps the compiler generated. I've been migrating
a lot of the code to use a few .local-defined hashes and indexing
into them, and it looks like that's the way to go. (This'd be easier
if this language had, y'know, actual subroutines and stuff...)
> > Hrm. Does the code currently consider something like:
>
>> $P0 = foo
>
>> to start a new lifetime for $P0?
>
>If you don't use $P0 beyond that point yes. Do you name all
>temps $P0 or some such? Or are you giving them unique names. You should
>do the latter.
I've a lot of 'constant' temps named $P0. I'll go fix that and see where we go.
> I was using .locals for the actual variables in the source program,
Well, you know it: .locals aren't vars.
> and $Px for all the temps the compiler generated. I've been migrating
> a lot of the code to use a few .local-defined hashes and indexing
> into them, and it looks like that's the way to go.
If you use a hash anyway, just use globals - this seems to match the
target language's POV:
$Px = global "foo" # fetch
global "foo" = $Py # store
If spilling is still needed, the spill code can be improved, if the
variable already has a store (like a lexical or a global). Needs only
cutting down the life range, and a refetch in the other locations of
usage. This saves the store and simplifies spilling.
But if you treat all your variables like that, there will be no spilling
at all.
> ... (This'd be easier
> if this language had, y'know, actual subroutines and stuff...)
Yep.
leo