We already have register frames which are saved and restored
very efficiently.
For instances where a module has package variables and
globals, there exists an overhead upon repeatedly invoking
a sub in one of those modules.
I believe it would be quite handy to store register frames
off for random access (not on the pad stack). The access
could either be by name, or stashed in a PMC and held
on the stack itself.
IMCC could analyze a module, decide if the optimization makes
sense, and place commonly used values (constants or
variables) in a pre-packaged register frame. (or more than 1)
This is done at compile / load time of course. If it were all
constants, compile time works, but for PMCs and Strings
it would have to be built at loadtime.
Upon invoking a busy routine, it _might_ be more efficient (since
we already save register frames anyway) to initialize the
upper frame (top 16 registers) with a pre-built register set.
It might also be more useful to have it more granular than
16, maybe in 4s or 8s. By doing life analysis and some
weighting, IMCC might be able to turn multiple symbol
lookups into 1.
Comments & questions welcome.
-Melvin
Why am I thinking of the "register" keyword in C?
Liz
I have no idea and I can't see the relationship. :)
With Parrot's architecture, we have overhead of storing and retrieving
globals, lexicals and package variables by _name_. This was a
design choice, but it has ramifications.
C has none of this overhead since it does virtually all of its symbol resolution
at compile time or load time (except in the case of dlopen and company).
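As a rough analogy outside Parrot (Python also resolves globals by name with a hash lookup on every access), the cost difference between per-use name lookup and resolve-once looks like the sketch below. All names here are invented for illustration.

```python
# Illustrative analogy, not Parrot code: per-access lookup by name
# versus resolving once and reusing the binding.

GLOBALS = {
    "MyLDAP::top": "dn=foo, cn=bar, co=blah",
    "MyLDAP::user": "melvin",
}

def by_name_each_time():
    # one hash lookup per access, repeated on every call
    return GLOBALS["MyLDAP::user"] + "@" + GLOBALS["MyLDAP::top"]

# resolve once ("load time"), then reuse the direct references
_user = GLOBALS["MyLDAP::user"]
_top = GLOBALS["MyLDAP::top"]

def resolved_once():
    # no name lookup at all on this path
    return _user + "@" + _top
```

Both return the same value; the difference is only in how many lookups each call performs, which is the overhead being discussed.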
My email was concerned with storing/retrieving multiple variables
with a single lookup, not with hinting to the compiler.
-Melvin
I guess my point was that it might make sense to be able
to hint to IMCC, so that IMCC can do a better job of grouping
variables together.
Please ignore me if I'm not making any sense. I'm just coming down
after a _long_ day and may not be thinking too clearly anymore... ;-)
Liz
Okay. Can you show an example of this optimization? I'd rather see it
in an HLL talking about PIR, as opposed to in PIR itself.
Luke
I just realized my response sounded crass, and wasn't meant to be.
I welcome comments, I just didn't understand what relation
you were getting at. Feel free to point it out to me.
The context: Jonathan was asking about importing
constants at runtime and/or constant namespaces.
Dan and I were discussing the issues and how routines
with lots of package globals or constants would spend
a significant part of their time retrieving symbols. Jonathan
did not want compile time constants, Dan did not want
"importable" constants that mutate the bytecode at runtime,
so I was trying to come up with a compromise, ugly as
it may be.
Weird optimizations are ok in my book if they can be
hidden in a back-end compiler.
-Melvin
Sure, although what I am talking about is something that the compiler
would insert itself, not something we would expose at the HLL.
But anyway, a typical module with package variables (syntax may vary)
might do:
package MyLDAP;
import LDAPLib;
# Package variables
string top = "dn=foo, cn=bar, co=blah";
string user = "melvin";
string password = "abcd";
sub do_something
{
do_something_else(top, user, password);
}
Assume that the 3 variables are commonly used in this package (in various
routines).
Assume that there are other variables and constants that are private to
this package.
Or forget package altogether, they might be globals.
The Parrot translation of the sub do_something() has to fetch the 3 package
variables on every invocation.
.sub _do_something
.local string top
.local string user
.local string password
top = fetch "MyLDAP::top"
user = fetch "MyLDAP::user"
password = fetch "MyLDAP::password"
_do_something_else(top, user, password)
.end
This would translate to bytecode that assigns 3 registers to
the variables. Each invocation those registers get initialized
by a fetch (hash lookup).
For the simple case, I'd like IMCC to look at the package (all
subs) and compute a use-count of certain variables, pre-package
the N most common (4, 8?) into a register block, and restore
that block of registers at the entrance of the routine, rather than
doing the 3 fetch lookups.
# Beware pseudo-ops :)
_do_something:
    fetch P29, "MyLDAP::top"
    fetch P30, "MyLDAP::user"
    fetch P31, "MyLDAP::password"
    _do_something_else(P29, P30, P31)  # Assume the fetch worked for the
                                       # top 4 regs

Changes to:

_do_something:
    fetchreggroup "MyLDAP::g1"         # 1 lookup and you get all 3
    _do_something_else(P29, P30, P31)  # Assume the fetch worked for the
                                       # top 4 regs, so P28 just gets
                                       # clobbered with a NULL
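The transformation above could be sketched in Python terms, with the group table and all names invented for illustration: instead of one hash lookup per variable, the loader builds a block once and the routine restores it with a single lookup.

```python
# Hypothetical sketch of the "register group" idea. The names
# (globals_, groups, "MyLDAP::g1") are invented for illustration.

globals_ = {
    "MyLDAP::top": "dn=foo, cn=bar, co=blah",
    "MyLDAP::user": "melvin",
    "MyLDAP::password": "abcd",
}

# built once at load time by the compiler/loader
groups = {
    "MyLDAP::g1": tuple(globals_[k] for k in
                        ("MyLDAP::top", "MyLDAP::user", "MyLDAP::password"))
}

def do_something_three_fetches():
    # three separate hash lookups on every invocation...
    top = globals_["MyLDAP::top"]
    user = globals_["MyLDAP::user"]
    password = globals_["MyLDAP::password"]
    return (top, user, password)

def do_something_group_fetch():
    # ...versus one lookup that restores the whole block
    return groups["MyLDAP::g1"]
```

The two routines return the same values; the grouped version trades N lookups for one, at the cost of keeping the pre-built block in sync.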
I can see some potential problems to solve with regard to some
languages where variables are dynamic and can be "undefined",
such as Perl6, but the optimization would certainly work for
constants in all languages. The only problem with Perl6 would be
if a global or package variable's address changed after it was stored
in the register group at bytecode load time (which could probably happen).
Anytime we cache something dynamic, we have to make sure the caches
know about changes. I think that is where notifications might help.
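A minimal sketch of the notification idea, with all class and method names invented: writes to the global store notify subscribed caches, so a cached register block is invalidated and refetched instead of going stale.

```python
# Illustrative sketch of notification-driven cache invalidation.
# GlobalStore and RegisterBlockCache are invented names.

class GlobalStore:
    def __init__(self):
        self._vars = {}
        self._watchers = []          # caches to notify on every store

    def subscribe(self, callback):
        self._watchers.append(callback)

    def store(self, name, value):
        self._vars[name] = value
        for callback in self._watchers:
            callback(name)           # the notification hook

    def fetch(self, name):
        return self._vars[name]

class RegisterBlockCache:
    def __init__(self, store, names):
        self.store, self.names = store, names
        self.block = None
        store.subscribe(self.invalidate)

    def invalidate(self, name):
        if name in self.names:
            self.block = None        # stale; refetch lazily on next use

    def get(self):
        if self.block is None:
            self.block = tuple(self.store.fetch(n) for n in self.names)
        return self.block
```

A mutated global invalidates only the blocks that contain it; untouched blocks keep paying a single lookup per use.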
For constants it is easy. IMCC might say, "this routine requires us to
initialize at least 3 registers with a constant value, let's make it into
a register block."
This may be a premature optimization, but for certain cases I think it's
pretty nifty.
-Melvin
Which is very hard not to happen as soon as you get into Exporter
land. ;-( For example:
use Scalar::Util qw(blessed weaken reftype);
use POSIX;
>Anytime we cache something dynamic, we have to make sure the caches
>know about changes. I think that is where notifications might help.
>
>For constants it is easy. IMCC might say, "this routine requires us
>to initialize
>at least 3 registers with a constant value, let's make it into a
>register block"
>
>This may be a premature optimization, but for certain cases I think it's pretty
>nifty.
This smells like premature optimization to me for languages such as Perl[\d].
The number of times a variable occurs in a program may have _no_
relation to how many times it will be accessed. So what's the
optimization then?
If you're thinking about this, then maybe a better heuristic would be
to group globals into groups that are _only_ referenced within a
specific scope and fetch them on scope entry and store them on scope
exit. But then, anything like eval"" or the equivalent of a glob
assignment (or even worse: an event) within that scope, will cause
problems.
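The scope-entry/scope-exit heuristic might look like the Python sketch below (names invented). It also makes the stated hazard concrete: anything that reads or writes the real globals while the scope is active sees only the stale originals, because the scope works on local copies.

```python
# Sketch of "fetch on scope entry, store on scope exit" for a group of
# globals that are only referenced inside one scope. Invented names.

from contextlib import contextmanager

globals_ = {"a": 1, "b": 2}

@contextmanager
def scoped(names):
    local = {n: globals_[n] for n in names}   # fetch the group on entry
    try:
        yield local                           # scope body uses copies only
    finally:
        globals_.update(local)                # store the group on exit

with scoped(["a", "b"]) as g:
    g["a"] += 10
    # an eval"", glob assignment, or event handler firing here would
    # read globals_["a"] == 1, not the updated copy: the hazard above
```

After the scope exits the writeback lands, so globals_["a"] is 11.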
But please, people around me always tell me that I'm way too
negative. That I'm always saying why things _can't_ happen. I'd
like to be proven wrong... ;-)
Liz
> If you're thinking about this, then maybe a better heuristic would be
> to group globals into groups that are _only_ referenced within a
> specific scope and fetch them on scope entry and store them on scope
> exit. But then, anything like eval"" or the equivalent of a glob
> assignment (or even worse: an event) within that scope, will cause
> problems.
Storing lexicals or globals isn't needed:
$ cat g.pasm
new P0, .PerlInt
set P0, 4
store_global "$a", P0
# ...
find_global P1, "$a"
inc P1
find_global P2, "$a"
print P2
print "\n"
end
$ parrot g.pasm
5
So the optimization is to just keep lexicals/globals in registers as long
as we have some. Where currently spilling is done, we just forget about
that register (but not *reuse* it, C<new> or such is ok) - and refetch
the variable later.
So the *only* current optimization is: we need HLL directives for
lexicals and globals so that the spilling code and register allocator
can use this information. That is: we can always cut the life range of
lexicals/globals, *if* we refetch, where we now fetch from the spill
array.
> Liz
leo
Aren't constant strings going to be saved in a way that lets the
address of the saved string be used to avoid string comparisons?
(As is done for hash keys in perl5.) Perhaps that's already done.
Then bytecode could 'declare' all the string constants it contains.
The byteloader could merge them into the global saved strings pool
and 'fixup' references to them in the bytecode.
If the namespace lookup code knew when it was being given saved
string pointers it could avoid the string compares as it walks the
namespace tree.
Maybe all that's been done. Here's an idea that builds on that:
Perhaps a variant of a hash that worked with large integers (pointers)
as keys could be of some use here. The namespace could be a tree of
these 'integer hashes'. Most namespace lookups use constants which
can be 'registered' in the unique string pool at byteloader time.
To lookup a non-constant string you just need to check if it's in
the unique string pool and get that pointer if it is. If it's not
then you know it doesn't exist anywhere. If it is you do the lookup
using the address of the string in the pool.
The JudyL functions (http://judy.sourceforge.net/) provide a very
efficient 'integer hash'.
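In Python terms the scheme might be sketched as below; this is purely illustrative, with `sys.intern` standing in for the unique string pool and `id()` standing in for the pointer used as an integer key.

```python
# Illustrative sketch: intern constant strings once, then key the
# namespace by object identity instead of comparing characters.

import sys

pool = {}                 # the "unique string pool"

def register(s):
    # done at byteloader time for constant strings
    interned = sys.intern(s)
    pool[interned] = interned
    return interned

namespace = {}            # keyed by the pooled string's identity

key = register("MyLDAP::top")
namespace[id(key)] = "dn=foo, cn=bar, co=blah"

def lookup(s):
    # a non-constant string must first be found in the pool; if it is
    # not there, the name cannot exist anywhere in the namespace
    interned = sys.intern(s)
    if interned not in pool:
        return None
    return namespace[id(interned)]
```

The one string comparison happens in the pool membership test; the namespace walk itself is pure integer (pointer) comparison.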
Tim.
Well... sorta. A lot of that stuff's known at compile time.
>>Anytime we cache something dynamic, we have to make sure the caches
>>know about changes. I think that is where notifications might help.
>>
>>For constants it is easy. IMCC might say, "this routine requires us
>>to initialize
>>at least 3 registers with a constant value, let's make it into a
>>register block"
>>
>>This may be a premature optimization, but for certain cases I think
>>it's pretty
>>nifty.
>
>This smells like premature optimization to me for languages such as Perl[\d].
To some extent, yes. I just had a really nasty thought, and I think
the compiler writers need to get Official Rulings on behavior.
With perl, for example, it's distinctly possible that this:
our $foo; # It's a global
$foo = 12;
if ($foo > 10) {
print $foo;
}
will require fetching $foo's PMC out of the global namespace three
times, once for each usage. I don't know offhand if this is how perl
5 works (I think it might be) and we should check for perl 6, python,
and ruby. This is mainly because of the possibility of tied or
overridden namespaces, which would argue for a refetch on each use.
*Not* refetching is a perfectly valid thing, and not specifying is
also perfectly valid, but we need to check.
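The refetch-versus-cache question can be illustrated with a Python stand-in for a tied namespace (the `TracingNamespace` class is invented): a namespace that observes reads behaves differently depending on whether the compiler caches the first fetch.

```python
# Illustrative stand-in for a tied stash: every read runs a hook, so
# caching the first fetch would silently skip later hook invocations.

class TracingNamespace(dict):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.fetches = 0

    def __getitem__(self, key):
        self.fetches += 1          # the "tie" hook fires on every read
        return super().__getitem__(key)

ns = TracingNamespace(foo=12)

# refetch-per-use semantics: the hook fires once per access
big_enough = ns["foo"] > 10        # fetch 1
value = ns["foo"]                  # fetch 2
```

With refetch-per-use, `ns.fetches` ends up at 2; a compiler that cached the first fetch would leave it at 1, which is exactly the observable difference the Official Ruling has to settle.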
--
Dan
--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk
Constant strings get an entry in the constant_table. So two constant
strings of the *same* code segment differ if their addresses differ.
> (As is done for hash keys in perl5.) Perhaps that's already done.
Not yet, but it's worth looking at.
> The byteloader could merge them into the global saved strings pool
> and 'fixup' references to them in the bytecode.
That's not possible in general. E.g. eval()ing pieces of code with
varying string constants would grow the "global" string constant pool
forever.
> Perhaps a variant of a hash that worked with large integers (pointers)
> as keys could be of some use here.
That doesn't play well with dynamic code AFAIK. Namespace keys can be
string vars too.
> The JudyL functions (http://judy.sourceforge.net/) provide a very
> efficient 'integer hash'.
I had a look at that some time ago, but the internals are horribly
complex and it was leaking memory too.
> Tim.
leo
No, come on, Dan. It's far worse than that.
It'll be possible, from Perl-space, to override either > or print, and it
should well be possible for them to, for instance, tie their operands. Wee!
--
>but I'm one guy working weekends - what the hell is MS's excuse?
"We don't care, we don't have to, we're the phone company."
- Ben Jemmet, Paul Tomblin.
Two things:
1) Lexicals should be reasonably fast, as they're integer indexable
in most cases. (The only time we *need* hashlike access is when we're
doing symbolic lookup, which is pretty uncommon. Downright impossible
for many languages)
2) I can easily see having something equivalent to the lookback op
for register stacks. Won't help for ints and floats, but it'll work
fine for PMCs and strings.
> No, come on, Dan. It's far worse than that.
> It'll be possible, from Perl-space to override either > or print, and it
> should well be possible for them to, for instance, tie their operands. Wee!
That's not the problem. But if the overridden C<< > >> op (or the print)
changes $foo's namespace on the fly, then a refetch would be necessary.
I'd prefer to have a hint:
our $foo is volatile;
The normal case would be to fetch $foo exactly once - before any loop.
Honestly an overridden op could insert new rules in the code and
recompile everything. If we have to always check for such nasty things,
then we can forget all performance and optimizations.
leo
Yeah, but we still control what they get. We've not gone so
completely mad that we hand in strings and code snippets for runtime
evaluation...
That, at least, we don't have to worry about. None of the existing
languages (well, Tcl might, but I don't care there) require going
quite so mad, and if Larry tries, well... we have ways to deal with
that. :)
In Perl5, a pointer to the GV is cached in the GVSV op (or in the case of
threaded builds, in the pad (with an index into the pad stored in the op
(is that enough levels of parenthesis for you?))). So if the stashes were
tied, you'd be stuffed. There was a recent thread on p5p as to whether
tieing of stashes should be disallowed.
--
The perl5 internals are a complete mess. It's like Jenga - to get the
perl5 tower taller and do something new you select a block somewhere in
the middle, with trepidation pull it out slowly, and then carefully
balance it somewhere new, hoping the whole edifice won't collapse as a
result.
- Nicholas Clark.
sub Foo::TIESCALAR { bless \$_[1],$_[0] }
sub Foo::FETCH { warn "FETCH\n"; ${$_[0]} }
sub Foo::STORE { warn "STORE\n"; ${$_[0]} = $_[1] }
our $foo; # It's a global
tie $foo,'Foo',$foo;
$foo = 12;
if ($foo > 10) {
print $foo;
}
__END__
STORE
FETCH
FETCH
12
Don't know why you think it would be fetched 3 times, but when using
tied variables in Perl 5, a fetch is done _every time_ the value of
the tied variable is needed.
Liz
I thought I'd seen that before: http://www.oreillynet.com/pub/wlg/555
--
Life's a switch() and then you die().
Oooh, plagiarism! Sue sue sue!
I'll update the attribution for future .sigs.
--
Any [programming] language that doesn't occasionally surprise the
novice will pay for it by continually surprising the expert.
-- Larry Wall
You misunderstand. I'm talking about fetching the PMC for the
variable *out of the stash* every time. This is done with the assumption
that if the *stash* is tied, then every use of a variable from the
stash must refetch it out so the stash tie code can decide what it
wants to do.