Register usage is now kept in an array per register type and is
calculated for non-jitted sections too.
Register usage can be tracked per section (a sequence of either jitted
or non-jitted ops not separated by branches) or per basic block. This
will later make it possible to avoid some register loads when there
are preserved registers.[1][2]
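To illustrate the bookkeeping described above, here is a minimal sketch
of a per-section usage table with one array per register type. All
names (section_usage, reg_ref, count_usage, most_used) and the count of
32 registers per type are assumptions made for this example; they are
not the actual structures in jit.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; not the real jit.c data structures. */
enum { REG_INT, REG_NUM, REG_TYPES };    /* register types tracked */
#define NUM_REGS 32                       /* assumed registers per type */

typedef struct {
    int type;   /* REG_INT or REG_NUM */
    int reg;    /* register number, 0..NUM_REGS-1 */
} reg_ref;

typedef struct {
    int is_jitted;                        /* 1 if the section holds jitted ops */
    int usage[REG_TYPES][NUM_REGS];       /* use count, one array per type */
} section_usage;

/* Count how often each register is referenced within one section. */
static void count_usage(section_usage *s, const reg_ref *refs, size_t n)
{
    size_t i;
    memset(s->usage, 0, sizeof s->usage);
    for (i = 0; i < n; i++) {
        assert(refs[i].type >= 0 && refs[i].type < REG_TYPES);
        assert(refs[i].reg >= 0 && refs[i].reg < NUM_REGS);
        s->usage[refs[i].type][refs[i].reg]++;
    }
}

/* Return the most-used register of a type, or -1 if none is used. */
static int most_used(const section_usage *s, int type)
{
    int r, best = -1, best_count = 0;
    for (r = 0; r < NUM_REGS; r++) {
        if (s->usage[type][r] > best_count) {
            best_count = s->usage[type][r];
            best = r;
        }
    }
    return best;
}
```

The same table works for a basic block instead of a section; only the
range of ops fed to count_usage changes.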
There are some commented-out attempts to optimize register
loads/stores, but we have to solve at least [1] before going further
here.
All allocated memory in jit.c is now freed (except for
REQUIRES_CONSTANT_POOL; I don't know what it does).
There are now more tunable defines in jit_emit.h, where the
per-architecture settings are kept:
- preserved registers
- don't allocate registers that are used only once
- allocate regs per section or per block
- align jump targets[3]
All have reasonable defaults that match the old behaviour, so no
changes to architecture files should be needed currently.
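As a rough sketch of how such tunables with overridable defaults could
look: every EX_* macro name and every default value below is invented
for illustration and is not the set of defines actually in jit_emit.h.
An architecture file would define a macro before this point to override
the default.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tunables; names and defaults are assumptions. */
#ifndef EX_PRESERVED_REGS
#  define EX_PRESERVED_REGS 0       /* callee-saved registers the JIT may use */
#endif
#ifndef EX_ALLOCATE_ONCE_USED
#  define EX_ALLOCATE_ONCE_USED 1   /* 0: skip registers used only once */
#endif
#ifndef EX_ALLOCATE_PER_BLOCK
#  define EX_ALLOCATE_PER_BLOCK 0   /* 0: per section, 1: per basic block */
#endif
#ifndef EX_JUMP_ALIGN
#  define EX_JUMP_ALIGN 0           /* 0: no alignment, else a power of two */
#endif

/* Decide whether a Parrot register with this use count gets a
 * hardware register, honouring the once-used tunable. */
static int want_register(int use_count)
{
    assert(use_count >= 0);
    if (use_count == 0)
        return 0;
    if (use_count == 1)
        return EX_ALLOCATE_ONCE_USED;
    return 1;
}

/* Round a code offset up to the configured jump-target alignment. */
static size_t align_jump_target(size_t offset)
{
#if EX_JUMP_ALIGN > 0
    return (offset + EX_JUMP_ALIGN - 1) & ~(size_t)(EX_JUMP_ALIGN - 1);
#else
    return offset;
#endif
}
```

With the #ifndef guards, an architecture that says nothing gets the
defaults, which is what keeps existing architecture files unchanged.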
[1] we currently have no information that e.g. C<popi> changes the
Parrot registers. For all other ops we have the ARGDIR flags. Proposed
change for pop opcodes: set ARGDIR_OUT on the opcode.
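A sketch of how the proposal in [1] might be consumed, assuming that
slot 0 of an op's direction array stands for the opcode itself. That
layout, the EX_ARGDIR_* values, and the ex_op_info type are assumptions
made for this example, not the real op-info definitions.

```c
#include <assert.h>

/* Hypothetical direction flags modelled on the ARGDIR idea. */
enum {
    EX_ARGDIR_IGNORED = 0,
    EX_ARGDIR_IN      = 1,
    EX_ARGDIR_OUT     = 2,
    EX_ARGDIR_INOUT   = 3
};

#define EX_MAX_SLOTS 4

typedef struct {
    const char *name;
    int argc;                 /* slots in use, including slot 0 (the opcode) */
    int dirs[EX_MAX_SLOTS];   /* direction flag per slot */
} ex_op_info;

/* An op flagged OUT on its opcode slot is treated as changing
 * registers it does not name, so cached values must be reloaded. */
static int op_clobbers_all(const ex_op_info *op)
{
    return (op->dirs[0] & EX_ARGDIR_OUT) != 0;
}

/* Ordinary case: does the op write the register in argument slot i? */
static int op_writes_arg(const ex_op_info *op, int i)
{
    assert(i >= 1 && i < op->argc);
    return (op->dirs[i] & EX_ARGDIR_OUT) != 0;
}
```

A pop op would then carry EX_ARGDIR_OUT in slot 0, while an op such as
a three-argument add would carry it only on its destination slot.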
[2] when we know that an external opcode doesn't throw an exception,
we could avoid saving registers too.
[3] I don't know if this will be needed. I saw a drastic slowdown
in mops.pasm when there was an odd branch target, but that was not
actually the reason; code size (or where the loop is located) seems to
be the culprit, see the end of jit/i386/jit_emit.h. Does anyone know
what's going on here?
leo
The problem is that we do want to allocate a hardware register for a Parrot
register that is used only once in the section, since the section can be
executed more than once. If you don't mind, I want to remove
ALLOCATE_REGISTERS_ALWAYS.
Daniel Grunblatt.
> The problem is that we do want to allocate a hardware register for a Parrot
> register that is used only once in the section, since the section can be
> executed more than once. If you don't mind, I want to remove
> ALLOCATE_REGISTERS_ALWAYS.
I do mind, until all platform concerns are sorted out. On my Athlon the
inner loop in mops.pasm is as fast with one memory access as with two
registers.
We probably really want to allocate registers for RISC platforms where
we can, but not on i386, where a register is a scarce resource.
It doesn't hurt to keep it. And platforms can turn it on or off.
> Daniel Grunblatt.
leo
But that's not true on my Intel PIII nor on my PI 233. I get a 50% speedup on
both.
>
> We probably really want to allocate registers for RISC platforms where
> we can, but not on i386, where a register is a scarce resource.
>
> It doesn't hurt to keep it. And platforms can turn it on or off.
Keep it for what?
If there is anything else in the section that deserves a register more,
that's OK, but if nothing else is going to use it, why would we keep
it?
>
> > Daniel Grunblatt.
>
> leo
> But that's not true on my Intel PIII nor on my PI 233. I get a 50% speedup on
> both.
Ok then - it's gone.
Thanks for your input and your compile fix commit,
leo