We already have register frames which are saved and restored
very efficiently.
For instances where a module has package variables and
globals, there exists an overhead upon repeatedly invoking
a sub in one of those modules.
I believe it would be quite handy to store register frames
off for random access (not on the pad stack). The access
could either be by name, or stashed in a PMC and held
on the stack itself.
IMCC could analyze a module, decide if the optimization makes
sense, and place commonly used values (constants or
variables) in a pre-packaged register frame. (or more than 1)
This is done at compile / load time of course. If it were all
constants, compile time works, but for PMCs and Strings
it would have to be built at loadtime.
Upon invoking a busy routine, it _might_ be more efficient (since
we already save register frames anyway) to initialize the
upper frame (top 16 registers) with a pre-built register set.
It might also be more useful to have it more granular than
16, maybe in 4s or 8s. By doing life analysis and some
weighting, IMCC might be able to turn multiple symbol
lookups into 1.
Comments & questions welcome.
-Melvin
Why am I thinking of the "register" keyword in C?
Liz
I have no idea and I can't see the relationship. :)
With Parrot's architecture, we have overhead of storing and retrieving
globals, lexicals and package variables by _name_. This was a
design choice, but it has ramifications.
C has none of this overhead since it does virtually all of its symbol resolution
at compile time or load time (except in the case of dlopen and company).
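As a rough analogy outside Parrot (Python also resolves globals by name with a hash lookup on every access), the cost difference between per-use name lookup and resolve-once looks like the sketch below. All names here are invented for illustration.

```python
# Illustrative analogy, not Parrot code: per-access lookup by name
# versus resolving once and reusing the binding.

GLOBALS = {
    "MyLDAP::top": "dn=foo, cn=bar, co=blah",
    "MyLDAP::user": "melvin",
}

def by_name_each_time():
    # one hash lookup per access, repeated on every call
    return GLOBALS["MyLDAP::user"] + "@" + GLOBALS["MyLDAP::top"]

# resolve once ("load time"), then reuse the direct references
_user = GLOBALS["MyLDAP::user"]
_top = GLOBALS["MyLDAP::top"]

def resolved_once():
    # no name lookup at all on this path
    return _user + "@" + _top
```

Both return the same value; the difference is only in how many lookups each call performs, which is the overhead being discussed.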
My email was concerned with storing/retrieving multiple variables
with a single lookup, not with hinting to the compiler.
-Melvin
I guess my point was that it might make sense to be able
to hint to IMCC, so that IMCC can do a better job of grouping
variables together.
Please ignore me if I'm not making any sense. I'm just coming down
after a _long_ day and may not be thinking too clearly anymore... ;-)
Liz
Okay. Can you show an example of this optimization? I'd rather see it
in an HLL talking about PIR, as opposed to in PIR itself.
Luke
I just realized my response sounded crass, and wasn't meant to be.
I welcome comments, I just didn't understand what relation
you were getting at. Feel free to point it out to me.
The context: Jonathan was asking about importing
constants at runtime and/or constant namespaces.
Dan and I were discussing the issues and how routines
with lots of package globals or constants would spend
a significant part of their time retrieving symbols. Jonathan
did not want compile time constants, Dan did not want
"importable" constants that mutate the bytecode at runtime,
so I was trying to come up with a compromise, ugly as
it may be.
Weird optimizations are ok in my book if they can be
hidden in a back-end compiler.
-Melvin
Sure, although what I am talking about is something that the compiler
would insert itself, not something we would expose at the HLL.
But anyway, a typical module with package variables (syntax may vary)
might do:
package MyLDAP;
import LDAPLib;
# Package variables
string top = "dn=foo, cn=bar, co=blah";
string user = "melvin";
string password = "abcd";
sub do_something
{
do_something_else(top, user, password);
}
Assume that the 3 variables are commonly used in this package (in various
routines).
Assume that there are other variables and constants that are private to
this package.
Or forget package altogether, they might be globals.
The Parrot translation of the sub do_something() has to fetch the 3 package
variables on every invocation.
.sub _do_something
.local string top
.local string user
.local string password
top = fetch "MyLDAP::top"
user = fetch "MyLDAP::user"
password = fetch "MyLDAP::password"
_do_something_else(top, user, password)
.end
This would translate to bytecode that assigns 3 registers to
the variables. Each invocation those registers get initialized
by a fetch (hash lookup).
For the simple case, I'd like IMCC to look at the package (all
subs) and compute a use-count of certain variables, pre-package
the N most common (4, 8?) into a register block, and restore
that block of registers at the entrance of the routine, rather than
doing the 3 fetch lookups.
# Beware pseudo-ops :)
_do_something:
    fetch P29, "MyLDAP::top"
    fetch P30, "MyLDAP::user"
    fetch P31, "MyLDAP::password"
    _do_something_else(P29, P30, P31)  # Assume the fetch worked for the
                                       # top 4 regs

Changes to:

_do_something:
    fetchreggroup "MyLDAP::g1"         # 1 lookup and you get all 3
    _do_something_else(P29, P30, P31)  # Assume the fetch worked for the
                                       # top 4 regs, so P28 just gets
                                       # clobbered with a NULL
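The transformation above could be sketched in Python terms, with the group table and all names invented for illustration: instead of one hash lookup per variable, the loader builds a block once and the routine restores it with a single lookup.

```python
# Hypothetical sketch of the "register group" idea. The names
# (globals_, groups, "MyLDAP::g1") are invented for illustration.

globals_ = {
    "MyLDAP::top": "dn=foo, cn=bar, co=blah",
    "MyLDAP::user": "melvin",
    "MyLDAP::password": "abcd",
}

# built once at load time by the compiler/loader
groups = {
    "MyLDAP::g1": tuple(globals_[k] for k in
                        ("MyLDAP::top", "MyLDAP::user", "MyLDAP::password"))
}

def do_something_three_fetches():
    # three separate hash lookups on every invocation...
    top = globals_["MyLDAP::top"]
    user = globals_["MyLDAP::user"]
    password = globals_["MyLDAP::password"]
    return (top, user, password)

def do_something_group_fetch():
    # ...versus one lookup that restores the whole block
    return groups["MyLDAP::g1"]
```

The two routines return the same values; the grouped version trades N lookups for one, at the cost of keeping the pre-built block in sync.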
I can see some potential problems to solve with regard to some
languages where variables are dynamic and can be "undefined",
such as Perl6, but the optimization would certainly work for
constants in all languages. The only problem with Perl6 would be
if a global or package variable's address changed after it was stored
in the register group at bytecode load time (which could probably happen).
Anytime we cache something dynamic, we have to make sure the caches
know about changes. I think that is where notifications might help.
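A minimal sketch of the notification idea, with all class and method names invented: writes to the global store notify subscribed caches, so a cached register block is invalidated and refetched instead of going stale.

```python
# Illustrative sketch of notification-driven cache invalidation.
# GlobalStore and RegisterBlockCache are invented names.

class GlobalStore:
    def __init__(self):
        self._vars = {}
        self._watchers = []          # caches to notify on every store

    def subscribe(self, callback):
        self._watchers.append(callback)

    def store(self, name, value):
        self._vars[name] = value
        for callback in self._watchers:
            callback(name)           # the notification hook

    def fetch(self, name):
        return self._vars[name]

class RegisterBlockCache:
    def __init__(self, store, names):
        self.store, self.names = store, names
        self.block = None
        store.subscribe(self.invalidate)

    def invalidate(self, name):
        if name in self.names:
            self.block = None        # stale; refetch lazily on next use

    def get(self):
        if self.block is None:
            self.block = tuple(self.store.fetch(n) for n in self.names)
        return self.block
```

A mutated global invalidates only the blocks that contain it; untouched blocks keep paying a single lookup per use.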
For constants it is easy. IMCC might say, "this routine requires us to
initialize at least 3 registers with a constant value, let's make it into
a register block."
This may be a premature optimization, but for certain cases I think it's
pretty nifty.
-Melvin
Which is very hard not to happen as soon as you get into Exporter
land. ;-( For example:
use Scalar::Util qw(blessed weaken reftype);
use POSIX;
>Anytime we cache something dynamic, we have to make sure the caches
>know about changes. I think that is where notifications might help.
>
>For constants it is easy. IMCC might say, "this routine requires us
>to initialize
>at least 3 registers with a constant value, let's make it into a
>register block"
>
>This may be a premature optimization, but for certain cases I think it's pretty
>nifty.
This smells like premature optimization to me for languages such as Perl[\d].
The number of times a variable occurs in a program may have _no_
relation to how many times it will be accessed. So what's the
optimization then?
If you're thinking about this, then maybe a better heuristic would be
to group globals into groups that are _only_ referenced within a
specific scope and fetch them on scope entry and store them on scope
exit. But then, anything like eval"" or the equivalent of a glob
assignment (or even worse: an event) within that scope, will cause
problems.
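The scope-entry/scope-exit heuristic might look like the Python sketch below (names invented). It also makes the stated hazard concrete: anything that reads or writes the real globals while the scope is active sees only the stale originals, because the scope works on local copies.

```python
# Sketch of "fetch on scope entry, store on scope exit" for a group of
# globals that are only referenced inside one scope. Invented names.

from contextlib import contextmanager

globals_ = {"a": 1, "b": 2}

@contextmanager
def scoped(names):
    local = {n: globals_[n] for n in names}   # fetch the group on entry
    try:
        yield local                           # scope body uses copies only
    finally:
        globals_.update(local)                # store the group on exit

with scoped(["a", "b"]) as g:
    g["a"] += 10
    # an eval"", glob assignment, or event handler firing here would
    # read globals_["a"] == 1, not the updated copy: the hazard above
```

After the scope exits the writeback lands, so globals_["a"] is 11.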
But please, people around me always tell me that I'm way too
negative. That I'm always saying why things _can't_ happen. I'd
like to be proven wrong... ;-)
Liz
> If you're thinking about this, then maybe a better heuristic would be
> to group globals into groups that are _only_ referenced within a
> specific scope and fetch them on scope entry and store them on scope
> exit. But then, anything like eval"" or the equivalent of a glob
> assignment (or even worse: an event) within that scope, will cause
> problems.
Storing lexicals or globals isn't needed:
$ cat g.pasm
new P0, .PerlInt
set P0, 4
store_global "$a", P0
# ...
find_global P1, "$a"
inc P1
find_global P2, "$a"
print P2
print "\n"
end
$ parrot g.pasm
5
So the optimization is to just keep lexicals/globals in registers as long
as we have some. Where currently spilling is done, we just forget about
that register (but not *reuse* it, C<new> or such is ok) - and refetch
the variable later.
So the *only* current optimization is: we need HLL directives for
lexicals and globals so that the spilling code and register allocator
can use this information. That is: we can always cut the life range of
lexicals/globals, *if* we refetch, where we now fetch from the spill
array.
> Liz
leo
Aren't constant strings going to be saved in a way that lets the
address of the saved string be used to avoid string comparisons?
(As is done for hash keys in perl5.) Perhaps that's already done.
Then bytecode could 'declare' all the string constants it contains.
The byteloader could merge them into the global saved strings pool
and 'fixup' references to them in the bytecode.
If the namespace lookup code knew when it was being given saved
string pointers it could avoid the string compares as it walks the
namespace tree.
Maybe all that's been done. Here's an idea that builds on that:
Perhaps a variant of a hash that worked with large integers (pointers)
as keys could be of some use here. The namespace could be a tree of
these 'integer hashes'. Most namespace lookups use constants which
can be 'registered' in the unique string pool at byteloader time.
To lookup a non-constant string you just need to check if it's in
the unique string pool and get that pointer if it is. If it's not
then you know it doesn't exist anywhere. If it is you do the lookup
using the address of the string in the pool.
The JudyL functions (http://judy.sourceforge.net/) provide a very
efficient 'integer hash'.
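In Python terms the scheme might be sketched as below; this is purely illustrative, with `sys.intern` standing in for the unique string pool and `id()` standing in for the pointer used as an integer key.

```python
# Illustrative sketch: intern constant strings once, then key the
# namespace by object identity instead of comparing characters.

import sys

pool = {}                 # the "unique string pool"

def register(s):
    # done at byteloader time for constant strings
    interned = sys.intern(s)
    pool[interned] = interned
    return interned

namespace = {}            # keyed by the pooled string's identity

key = register("MyLDAP::top")
namespace[id(key)] = "dn=foo, cn=bar, co=blah"

def lookup(s):
    # a non-constant string must first be found in the pool; if it is
    # not there, the name cannot exist anywhere in the namespace
    interned = sys.intern(s)
    if interned not in pool:
        return None
    return namespace[id(interned)]
```

The one string comparison happens in the pool membership test; the namespace walk itself is pure integer (pointer) comparison.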
Tim.
Well... sorta. A lot of that stuff's known at compile time.
>>Anytime we cache something dynamic, we have to make sure the caches
>>know about changes. I think that is where notifications might help.
>>
>>For constants it is easy. IMCC might say, "this routine requires us
>>to initialize
>>at least 3 registers with a constant value, let's make it into a
>>register block"
>>
>>This may be a premature optimization, but for certain cases I think
>>it's pretty
>>nifty.
>
>This smells like premature optimization to me for languages such as Perl[\d].
To some extent, yes. I just had a really nasty thought, and I think
the compiler writers need to get Official Rulings on behavior.
With perl, for example, it's distinctly possible that this:
our $foo; # It's a global
$foo = 12;
if ($foo > 10) {
print $foo;
}
will require fetching $foo's PMC out of the global namespace three
times, once for each usage. I don't know offhand if this is how perl
5 works (I think it might be) and we should check for perl 6, python,
and ruby. This is mainly because of the possibility of tied or
overridden namespaces, which would argue for a refetch on each use.
*Not* refetching is a perfectly valid thing, and not specifying is
also perfectly valid, but we need to check.
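The refetch-versus-cache question can be illustrated with a Python stand-in for a tied namespace (the `TracingNamespace` class is invented): a namespace that observes reads behaves differently depending on whether the compiler caches the first fetch.

```python
# Illustrative stand-in for a tied stash: every read runs a hook, so
# caching the first fetch would silently skip later hook invocations.

class TracingNamespace(dict):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.fetches = 0

    def __getitem__(self, key):
        self.fetches += 1          # the "tie" hook fires on every read
        return super().__getitem__(key)

ns = TracingNamespace(foo=12)

# refetch-per-use semantics: the hook fires once per access
big_enough = ns["foo"] > 10        # fetch 1
value = ns["foo"]                  # fetch 2
```

With refetch-per-use, `ns.fetches` ends up at 2; a compiler that cached the first fetch would leave it at 1, which is exactly the observable difference the Official Ruling has to settle.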
--
Dan
--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk
Constant strings get an entry in the constant_table. So two constant
strings of the *same* code segment differ if their addresses differ.
> (As is done for hash keys in perl5.) Perhaps that's already done.
Not yet, but it's worth looking at.
> The byteloader could merge them into the global saved strings pool
> and 'fixup' references to them in the bytecode.
That's not possible in general. E.g. eval()ing pieces of code with
varying string constants would grow the "global" string constant pool
forever.
> Perhaps a variant of a hash that worked with large integers (pointers)
> as keys could be of some use here.
That doesn't play well with dynamic code AFAIK. Namespace keys can be
string vars too.
> The JudyL functions (http://judy.sourceforge.net/) provide a very
> efficient 'integer hash'.
I had a look at that some time ago, but the internals are horribly
complex and it was leaking memory too.
> Tim.
leo
No, come on, Dan. It's far worse than that.
It'll be possible, from Perl-space, to override either > or print, and it
should well be possible for them to, for instance, tie their operands. Wee!
--
>but I'm one guy working weekends - what the hell is MS's excuse?
"We don't care, we don't have to, we're the phone company."
- Ben Jemmet, Paul Tomblin.
Two things:
1) Lexicals should be reasonably fast, as they're integer indexable
in most cases. (The only time we *need* hashlike access is when we're
doing symbolic lookup, which is pretty uncommon. Downright impossible
for many languages)
2) I can easily see having something equivalent to the lookback op
for register stacks. Won't help for ints and floats, but it'll work
fine for PMCs and strings.
> No, come on, Dan. It's far worse than that.
> It'll be possible, from Perl-space to override either > or print, and it
> should well be possible for them to, for instance, tie their operands. Wee!
That's not the problem. But if the overridden C<< > >> op (or the print)
changes $foo's namespace on the fly, then a refetch would be necessary.
I'd prefer to have a hint:
our $foo is volatile;
The normal case would be to fetch $foo exactly once - before any loop.
Honestly an overridden op could insert new rules in the code and
recompile everything. If we have to always check for such nasty things,
then we can forget all performance and optimizations.
leo
Yeah, but we still control what they get. We've not gone so
completely mad that we hand in strings and code snippets for runtime
evaluation...
That, at least, we don't have to worry about. None of the existing
languages (well, Tcl might, but I don't care there) require going
quite so mad, and if Larry tries, well... we have ways to deal with
that. :)
In Perl5, a pointer to the GV is cached in the GVSV op (or in the case of
threaded builds, in the pad (with an index into the pad stored in the op
(is that enough levels of parenthesis for you?))). So if the stashes were
tied, you'd be stuffed. There was a recent thread on p5p as to whether
tieing of stashes should be disallowed.
--
The perl5 internals are a complete mess. It's like Jenga - to get the
perl5 tower taller and do something new you select a block somewhere in
the middle, with trepidation pull it out slowly, and then carefully
balance it somewhere new, hoping the whole edifice won't collapse as a
result.
- Nicholas Clark.
sub Foo::TIESCALAR { bless \$_[1],$_[0] }
sub Foo::FETCH { warn "FETCH\n"; ${$_[0]} }
sub Foo::STORE { warn "STORE\n"; ${$_[0]} = $_[1] }
our $foo; # It's a global
tie $foo,'Foo',$foo;
$foo = 12;
if ($foo > 10) {
print $foo;
}
__END__
STORE
FETCH
FETCH
12
Don't know why you think it would be fetched 3 times, but when using
tied variables in Perl 5, a fetch is done _every time_ the value of
the tied variable is needed.
Liz
I thought I'd seen that before: http://www.oreillynet.com/pub/wlg/555
--
Life's a switch() and then you die().
Oooh, plagiarism! Sue sue sue!
I'll update the attribution for future .sigs.
--
Any [programming] language that doesn't occasionally surprise the
novice will pay for it by continually surprising the expert.
-- Larry Wall
You misunderstand. I'm talking about fetching the PMC for the
variable *out of the stash* every time. This is done with the assumption
that if the *stash* is tied, then every use of a variable from the
stash must refetch it out so the stash tie code can decide what it
wants to do.