I'm trying to port LCC to a custom 32-bit CPU implemented in an FPGA.
The CPU can only address words, and its instruction set places a lot of
constraints on the operands that can be used.
I want to control exactly which registers each operation can use.
The registers are organized as follows:
- $R0..15: 32-bit general-purpose registers.
- $C0..15: 32-bit "Counter" registers. They can be used to load a
  constant or an address and to do indirect memory addressing.
  They also support post-increment/decrement if you append ++ or --
  to the register name.
- $CON: a register hardwired to 0xFFFFFFFF.
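To illustrate the indirect addressing with post-increment/decrement, here is a minimal C model. It assumes that "post" means the address is used first and the register is stepped afterwards, and that one step is one word (the CPU is word-addressed). The names Step and load_indirect are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of d($C), d($C++) and d($C--). */
typedef enum { STEP_NONE = 0, STEP_INC = 1, STEP_DEC = -1 } Step;

/* Read the word addressed by *creg, then apply the post-step to the
   register. Memory is modelled as an array of 32-bit words. */
uint32_t load_indirect(const uint32_t *mem, uint32_t mem_words,
                       uint32_t *creg, Step step)
{
    assert(*creg < mem_words);      /* stay inside the modelled memory */
    uint32_t value = mem[*creg];    /* the access uses the old address */
    *creg += (uint32_t)step;        /* assumed: one word per step, wraps mod 2^32 */
    return value;
}
```

With a register holding 1, a d($C++) access reads word 1 and leaves 2 in the register.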
Here are the operand possibilities for the arithmetic add instruction:
add $Cb, $Ra, $Rd
add $Cb, $CON, $Rd
add $CON, $Ra, $Rd
add d($Cb[--|++]), $Ra, $Rd
add d($Cb[--|++]), $CON, $Rd
add d($Cb[--|++]), d(Imm | $Ca[--|++]), $Rd
where:
- $Rd is the destination register
- Imm represents an immediate value
- d(xxx) represents a memory access
sub, and, or, xor have the same operand constraints.
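To make the operand table concrete, here is a minimal C sketch that encodes the six forms above as a lookup table. The enum and the function alu_operands_ok are hypothetical names used only for illustration; they are not part of LCC or of my MD file.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operand classes, named only for this sketch. */
typedef enum {
    OP_R,        /* $Rn */
    OP_C,        /* $Cn */
    OP_CON,      /* $CON */
    OP_MEM_C,    /* d($Cn), d($Cn++), d($Cn--) */
    OP_MEM_IMM,  /* d(Imm) */
    OP_KINDS
} OperandKind;

/* legal[first][second] mirrors the six forms listed above;
   every entry not named here is false. */
static const bool legal[OP_KINDS][OP_KINDS] = {
    [OP_C]     = { [OP_R] = true, [OP_CON] = true },
    [OP_CON]   = { [OP_R] = true },
    [OP_MEM_C] = { [OP_R] = true, [OP_CON] = true,
                   [OP_MEM_C] = true, [OP_MEM_IMM] = true },
};

/* True if (first, second, dest) is one of the listed add/sub/and/or/xor forms. */
bool alu_operands_ok(OperandKind first, OperandKind second, OperandKind dest)
{
    assert(first < OP_KINDS && second < OP_KINDS && dest < OP_KINDS);
    return dest == OP_R && legal[first][second];
}
```

For example, alu_operands_ok(OP_C, OP_R, OP_R) accepts "add $Cb, $Ra, $Rd", while alu_operands_ok(OP_R, OP_R, OP_R) is rejected because none of the listed forms takes a $R as first operand.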
Some other basic instructions:
loadi Imm, $C ; loads Imm into $C
mov $R, d(xxx) ; moves $R to the memory cell at xxx
Some general questions I have:
Here is my simplified MD file, if anyone wants to explore/use it:
http://pastebin.com/m6e300dcc
Are my rules correctly designed? I.e., should I use "reg" everywhere a
register is needed and constrain which registers can actually be used
elsewhere, for example in target()? Or should I separate the rules,
using different nonterminals for different register types (Rreg for $R
registers, Creg for $C registers...)?
That's what I did in a previous attempt, but it doesn't always work. I
guess I must do more work in target().
(http://pastebin.com/m655de885)
I'd like to force some common subexpressions to be recomputed instead
of being stored in a temporary. I guess I must set the "mayrecalc" flag
on the node, but in which of the MD file's functions should I do this?
Problem(s) I have:
I can't get LCC to choose the right set of registers according to the
operand constraints, or to move values between sets ($R to $C and
$C to $R).
Some examples of expected outputs, to give you an idea:
ASGNI1(VREGP(1), CNSTI1(3))
ASGNI1(ADDRGP1(i), INDIRI1(VREGP(1)))
ASGNI1(ADDRGP1(j), INDIRI1(VREGP(1)))
ASGNI1(ADDRGP1(k), INDIRI1(VREGP(1)))
----------
int i, j, k;
void main(void)
{
    i = 3;
    j = 3;
    k = 3;
}
Expected:
loadi 3,$C15 ; loads constant '3' into $C15
and $C15,$CON,$R15 ; loads $C15 into $R15 ($C15 AND 0xFFFFFFFF into $R15)
mov $R15,d(i) ; moves $R15 to the memory cell occupied by i
mov $R15,d(j)
mov $R15,d(k)
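For reference, the expected sequence can be spelled out as a small C helper: load the constant into a $C register, copy it into a $R register with "and ... $CON", then store it once per global. This is only a sketch of the text I want emitted, not LCC code. The function name emit_const_stores and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the expected instruction sequence into buf
   and returns its length (excluding the terminating NUL). */
size_t emit_const_stores(char *buf, size_t size, int value, int creg, int rreg,
                         const char *const *globals, size_t nglobals)
{
    /* loadi can only target a $C register; the AND with $CON
       (0xFFFFFFFF) copies the value into a $R register. */
    int n = snprintf(buf, size, "loadi %d,$C%d\nand $C%d,$CON,$R%d\n",
                     value, creg, creg, rreg);
    assert(n >= 0 && (size_t)n < size);

    for (size_t i = 0; i < nglobals; i++) {
        size_t used = strlen(buf);
        /* mov needs a $R register as its first operand */
        n = snprintf(buf + used, size - used, "mov $R%d,d(%s)\n",
                     rreg, globals[i]);
        assert(n >= 0 && (size_t)n < size - used);
    }
    return strlen(buf);
}
```

Called with value 3, registers 15 and 15, and the globals i, j, k, it produces the five lines of the expected listing (without the comments).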
What I actually get:
loadi 3,$C7 ; OK
mov $C7,d(i) ; Forbidden 1st operand: it should be a $R register
mov $C7,d(j)
mov $C7,d(k)
or
loadi 3,$R15 ; Forbidden 2nd operand: it should be a $C register
mov $R15,d(i) ; OK
mov $R15,d(j)
mov $R15,d(k)
----------
What I tried:
Modify target():
case CNST+I:
case CNST+U:
case CNST+P:
    {
        setreg(p, Cregw);
    }
    break;
--> Doesn't work: it uses a $R for the loadi (the default wildcard
    given in rmap() for type I)
case CNST+I:
case CNST+U:
case CNST+P:
    {
        setreg(p, Creg[x]);
    }
    break;
--> Doesn't work: $Cx is used for the loadi but also for the mov! It
    should be mov $R, ...
I tried to force the right child of ASGN nodes to a $R register:
case ASGN+I:
case ASGN+U:
case ASGN+P:
    rtarget(p, 1, Rreg[0]);
    break;
case CNST+I:
case CNST+U:
case CNST+P:
    {
        setreg(p, Creg[0]);
    }
    break;
It produces the output:
loadi 3,$C0
(requate(5b5698): tmp=$R15 src=$C0)
(requate arm 5 at 5b5168)
(requate arm 5 at 5b5288)
(requate arm 5 at 5b53a8)
(requate arm 7 at 0)
[LOAD 15,0]
[LOAD 16,0]
mov $R0,d(i)
[LOAD 16,0]
mov $R0,d(j)
[LOAD 16,0]
mov $R0,d(k)
Now it does use LOAD nodes, so I could put the right code there, but
there are still 2 problems:
- Why are so many LOADs issued? One should be enough.
- I'm forced to use fixed $C and $R registers. I want to be able to do
  this with any $C/$R combination!
So,
- setreg() seems to work only with specific registers, not wildcards.
- rtarget() can only take a register symbol, not a wildcard.
Should I try to create 2 new functions, say setregw() and wtarget(),
that would work like setreg() and rtarget() but could handle register
wildcards?
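To show the behaviour I am after, here is a toy C model of "targeting a wildcard": each register set is a bit mask, and a request names a set plus a mask of acceptable registers instead of one fixed register. This is not LCC's allocator and uses none of its data structures. RegFile, take_any and the SET_* ids are hypothetical names, and scanning from the highest register down is an arbitrary choice for the sketch.

```c
#include <assert.h>
#include <stdint.h>

enum { SET_R, SET_C, NSETS };   /* hypothetical register-set ids */

typedef struct {
    uint16_t free_mask[NSETS];  /* bit n set => register n of that set is free */
} RegFile;

/* Mark all 16 registers of both sets as free. */
void regfile_init(RegFile *rf)
{
    rf->free_mask[SET_R] = 0xFFFF;
    rf->free_mask[SET_C] = 0xFFFF;
}

/* Take any free register of `set` that is allowed by `candidates`.
   Returns the register number, or -1 if none is available. */
int take_any(RegFile *rf, int set, uint16_t candidates)
{
    assert(set >= 0 && set < NSETS);
    uint16_t avail = rf->free_mask[set] & candidates;
    for (int n = 15; n >= 0; n--) {
        if (avail & (1u << n)) {
            rf->free_mask[set] &= (uint16_t)~(1u << n);  /* now in use */
            return n;
        }
    }
    return -1;
}
```

A constant would then ask for take_any(&rf, SET_C, 0xFFFF) and the store operand for take_any(&rf, SET_R, 0xFFFF), with the copy between the two sets emitted wherever the sets differ.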
Is the problem that my instruction set is too weird to be used easily
with LCC?
Is the problem that my target's register layout is too weird to be used
easily with LCC?
Any help will be greatly appreciated!
I have already read and searched for an answer in the source code and
in Fraser and Hanson's book, but with no luck.
If you need any clarification or other information, do not hesitate to
ask me; I'll gladly try to be as clear as possible.
Cyril