Is there anyone who has written an LLVM register allocator who would be
willing to help me understand the basic ideas? I understand the
algorithm, it's LLVM that I don't understand. For example:
- When allocating registers, how do I know which register class to use,
and which registers of that class are available?
- How do I know which operands of a Machine Instruction are candidates
for (simple) register allocation (e.g., are of type int)?
- How do I replace a virtual register with a physical one?
- Do I need to generate spill code to handle virtual registers that
cannot be replaced with physical ones, and if so, how?
Susan Horwitz
_______________________________________________
LLVM Developers mailing list
LLV...@cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
You may find the PBQP allocator implementation useful as a reference
to what's involved in adding a new allocator to LLVM. It's located in
lib/CodeGen/RegAllocPBQP.cpp, with supporting files in the lib/CodeGen/
PBQP directory.
I'm no expert on the LLVM register allocation interfaces, so I'll
defer to those who are regarding the specifics of your questions.
-Jim
You may find the PBQP allocator implementation useful as a reference
to what's involved in adding a new allocator to LLVM. It's located in
lib/CodeGen/RegAllocPBQP.cpp, with supporting files in the lib/CodeGen/
PBQP directory.
> Is there anyone who has written an LLVM register allocator who would
> be
> willing to help me understand the basic ideas? I understand the
> algorithm, it's LLVM that I don't understand. For example:
>
> - When allocating registers, how do I know which register class to
> use,
> and which registers of that class are available?
> - How do I know which operands of a Machine Instruction are candidates
> for (simple) register allocation (e.g., are of type int)?
> - How do I replace a virtual register with a physical one?
> - Do I need to generate spill code to handle virtual registers that
> cannot be replaced with physical ones, and if so, how?
You could re-write it in place (using MachineRegisterOperand::setReg(<reg>)...
Hi Susan,You may find the PBQP allocator implementation useful as a reference
to what's involved in adding a new allocator to LLVM. It's located in
lib/CodeGen/RegAllocPBQP.cpp, with supporting files in the lib/CodeGen/
PBQP directory.Yes - as far as working allocators go PBQP is pretty simple. If you're just interested in LLVM API details you can focus on the lower half of the RegAllocPBQP.cpp source file (everything from PBQPRegAlloc::findVRegIntervalsToAlloc()). The rest of the source (class declaration at the top of RegAllocPBQP.cpp aside) is mostly concerned with PBQP algorithm specifics, such as constructing cost matrices, or carrying out the PBQP graph reduction algorithm.
> On Mon, 19 Oct 2009, Chris Lattner wrote:
>
>> Other advice: if you're looking to simplify this for students, I'd
>> recommend staying away from X86 or ARM, which use subregs heavily.
>> If you work with (e.g.) the sparc backend, you can avoid them
>> completely, simplifying the problem.
>
> Chris -
>
> Thanks for your reply! But I'm confused about the above. In
> particular, another reply said the following:
>
>> For each virtual register the
>> MachineRegisterInfo::getRegClass(<vreg>) method will give you a
>> TargetRegisterClass. You can use the allocation_order_begin/
>> allocation_order_end methods to iterate over the allocable physregs
>> in that class.
>
> If I can get the appropriate physical register for each virtual one
> this way, then how does X86's use of sub-registers complicate my
> register-allocation code?
Each virtual register has an assigned register class. However,
register classes relate to each other, and the machine IR also has
subreg references. For example, this is how X86 handles AL/AX/EAX/RAX
all aliasing each other. In the Sparc backend, the only aliases are
in the FPU, and it doesn't use subregs to model them at this point.
-Chris
On Oct 20, 2009, at 12:46 PM, Susan Horwitz wrote:
> On Tue, 20 Oct 2009, Chris Lattner wrote:
>
>> Each virtual register has an assigned register class. However,
>> register classes relate to each other, and the machine IR also has
>> subreg references. For example, this is how X86 handles AL/AX/EAX/
>> RAX all aliasing each other. In the Sparc backend, the only aliases
>> are in the FPU, and it doesn't use subregs to model them at this
>> point.
>
> So if AL is a sub-register of EAX (assume this is true even if not),
> then will getAliasSet(AL) include EAX, and will getAliasSet(EAX)
> include AL? If yes, then I think I'm OK.
Yes, I believe so.
> So if AL is a sub-register of EAX (assume this is true even if not),> then will getAliasSet(AL) include EAX, and will getAliasSet(EAX)
> include AL? If yes, then I think I'm OK.
Yes, I believe so.
I've made some progress writing my register allocator, but now I'm stuck.
I have 2 questions for you:
1. I tried running the PBQP allocator (as a dynamic pass), but that didn't
work. When I type this:
llc -f -load Debug/lib/regalloc.so -regalloc=pbqp simple.bc
I get the following error:
llc: /afs/cs.wisc.edu/p/course/cs701-horwitz/public/llvm-root/llvm/include/llvm/Support/CommandLine.h:483: void llvm::cl::parser<DataType>::addLiteralOption(const char*, const DT&, const char*) [with DT = llvm::FunctionPass* (*)(), DataType = llvm::FunctionPass* (*)()]: Assertion `findOption(Name) == Values.size() && "Option already exists!"' failed.
Can you tell from this what I'm doing wrong?
2. I tried writing a very simple register allocator. It works as follows:
Step 1: Find out which physical registers are already used. Do this by
iterating over all instructions and testing each operand. Store results
(including aliases) in a set.
Step 2: For each target reg class, save one (unused) physical register to
be used for spills. (Some classes have no unused physical registers.
This would be detected in Step 3 if there were a virtual register in that
class, but it hasn't happened for my tests so far.)
Step 3: Iterate over all instructions again allocating virtual registers
to available physical ones (in the appropriate class) or, if no such
physical reg is available, allocate the virtual register both to the
saved "spill" register for its class and to a stack slot. Keep track of
the set of virtual registers already processed so none is processed twice.
When a physical register is allocated, add it (and any aliases) to the set
of used physical registers.
Step 4: Run the spiller.
This allocator works on very simple C code, but fails (with an
uninformative segmentation fault) as soon as I include a call to scanf, or
a loop, or an if-else.
Do you see any obvious errors in the approach outlined above?
If not, can you suggest a way to find out what is going wrong?
Thanks for any help you can give me!!
1. I tried running the PBQP allocator (as a dynamic pass), but that didn't
work....
Can you tell from this what I'm doing wrong?
2. I tried writing a very simple register allocator. It works as follows...
This allocator works on very simple C code, but fails (with an uninformative segmentation fault) as soon as I include a call to scanf, or a loop, or an if-else.
Do you see any obvious errors in the approach outlined above?
If not, can you suggest a way to find out what is going wrong?
Thanks for any help you can give me!!
1) Some machine instructions have implicit register operands. If you are not including these in your calculations you will probably have some errors. My apologies for not mentioning this earlier.You can find the set of implicit uses and defs by querying the TargetInstDesc object for each MachineInstr (call MachineInstr::getDesc() to get it, and then the TargetInstrDesc::getImplicitUses() and TargetInstrDesc::getImplicitDefs() methods to find the regs used/definied by this instr).
> There are any number of things that can go wrong in register allocation, so
> it's hard for me to guess without seeing your code.
>
> Possible issues:
>
> 2) How are you making sure that interfering virtregs never receive the same
> physreg? If you're using the LiveIntervals analysis (and the
> LiveInterval::overlaps(LiveInterval &) test) you should be ok, since it gets
> a lot of testing (though bugs are not unheard of). If you've written your
> own liveness analysis that would be one of the first places I'd check for
> correctness.
I'm doing VERY simple reg allocation, just to see if I can get something
to work. I only allocate a physical register to ONE virtual register (the
first such virtual I find that hasn't already had a physical register
allocated to it). So I don't think my problem has to do with
interference.
> In addition, what do you do if/when you encounter an instruction with two or
> more operands (in the same class) that have been spilled?
For each instruction, I iterate over the operands. For each operand that
is a virtual register "r", if "r" hasn't been given a physical register
and there is one available (in "r"'s class), then I allocate that physical
register to "r". If not, I call assignVirt2StackSlot(r) and
assignVirt2Phys(r, sp), where sp is s physical register in "r"'s class
that has been saved to be used as a spill register.
I don't do anything special for the case you mention, so that may be my
problem. I thought that what I am doing, plus calling the spiller at the
end, would magically take care of everything. If not, can you point me at
the relevant documentation or an example from existing code?
I realized that I was unclear about where the actual problem is: My
allocator runs fine, it is (as you suspected) the generated code that
crashes. I can look at the generated (X86) code and I see the problem for
one simple example: there's an instruction
movl %edi, 4(%edi)
that should have been
movl %edi, 4(%esp)
but I don't understand how the code gets generated, so this isn't very
helpful. Is there a way for me to look at the machine instructions before
register allocation?
Thanks again!
Susan
You can run llc with -debug which shows debug information from all
codegen phases.
You can also use -debug-only=<name> to debug only a specific codegen pass.
I see output like this, where I think that instead of LINEAR SCAN your
allocator will be run:
********** MACHINEINSTRS **********
entry:
20 MOV32mr <fi#1>, 1, %reg0, 0, %reg0, %EDI<kill>
28 %reg1026<def> = MOV32rm <fi#1>, 1, %reg0, 0, %reg0
36 MOV32mr <fi#0>, 1, %reg0, 0, %reg0, %reg1026<kill>
44 %EAX<def> = MOV32rm <fi#0>, 1, %reg0, 0, %reg0
64 RET %EAX<imp-usekill>
********** LINEAR SCAN **********
********** Function: foo
fixed intervals:
%reg14,inf = [0,6:0)[6,14:1)[14,22:2) 0@?-(6) 1@? 2@? -> DI
%reg2,inf = [46,54:1)[54,66:0) 0@54-(66) 1@? -> AL
%reg23,inf = [0,22:0) 0@?-(22) -> EDI
%reg1,inf = [46,54:1)[54,66:0) 0@54-(66) 1@? -> AH
%reg15,inf = [0,6:0)[6,14:1)[14,22:2) 0@?-(6) 1@? 2@? -> DIL
%reg3,inf = [46,54:1)[54,66:0) 0@54-(66) 1@? -> AX
%reg19,inf = [46,66:0) 0@46-(66) -> EAX
*** CURRENT ***: %reg1026,inf = [30,38:0) 0@30-(38)
processing active intervals:
processing inactive intervals:
allocating current interval: EAX
active intervals:
%reg1026,inf = [30,38:0) 0@30-(38) -> EAX
inactive intervals:
interval %reg1026,inf = [30,38:0) 0@30-(38) expired
********** REGISTER MAP **********
[reg1026 -> EAX]
**** Local spiller rewriting function 'foo':
**** Machine Instrs (NOTE! Does not include spills and reloads!) ****
# Machine code for foo():
<fi#0>: size is 4 bytes, alignment is 4 bytes, at location [SP+8]
<fi#1>: size is 4 bytes, alignment is 4 bytes, at location [SP+8]
Live Ins: EDI in VR#1025
Live Outs: EAX
....
Best regards
--Edwin
%1024 = xxx
...
yyy = %1024
and I find no physical register for %1024, then I assign it to physical
register %edi and to a stackslot. That creates code like this:
%edi = xxx
store from %edi to the stack
...
yyy = %edi
The last instruction is wrong. There needs to be a load from the
stackslot into %edi before copying from there into yyy.
The documentation says:
If the indirect strategy is used, after all the virtual registers
have been mapped to physical registers or stack slots, it is
necessary to use a spiller object to place load and store
instructions in the code. Every virtual that has been mapped to a
stack slot will be stored to memory after been defined and will be
loaded before being used.
But this doesn't seem to be happening; the stores to memory are there but
the loads are not.
Any ideas what's going wrong?
If not, any advice on how to generate the loads myself??
Susan
But this doesn't seem to be happening; the stores to memory are there but the loads are not.
Any ideas what's going wrong?
If not, any advice on how to generate the loads myself??
I've also found that RegAllocSimple.cpp is a good model for me if I decide
to give up on using the spiller and instead add the spill/reload code
myself.
So I'm currently feeling much more optimistic about this project.
Thanks for all the (very quick!) help!!
Susan
llvm-gcc -emit-llvm -O0 -c replace.c -o replace.bc
opt -mem2reg replace.bc > replace.ssa
mv replace.ssa replace.bc
llc -f -regalloc=simple replace.bc
gcc replace.s
Then I ran it like this (file "input" also attached):
a.out This XXXX < input
I get a Segmentation fault with the current version of LLVM. With the old
version it runs, but I get garbage output.
If I create the ".s" file using just
llc -f replace.bc
everything is fine.
Does anyone know how to fix either version of RegAllocSimple?
Susan Horwitz
Why not use RegAllocLocal?
Evan
> <replace.c><input.txt>_______________________________________________
As a warm-up exercise (before implementing register allocation via graph
coloring) I'm trying to implement a very simple scheme in which NO
pseudo-registers are allocated to physical registers across instructions.
So for each virtual register I call assignVirt2StackSlot. I thought that
I should call assignVirt2Phys as well for each virtual register in an
instruction, mapping that vreg to an unused preg. For example, for an
instruction like this:
%reg1024<def> = MOV32rr %ESP
I assign virtual register 1024 to some available physical register in the
appropriate class so that the generated code will assign to that physical
register. (I also assign virtual register 1024 to a stack slot so that
the value assigned to the physical register is spilled to that slot.)
If a virtual register is used in an instruction, rather than defined, I
again assign it to a physical register preg (but not to a stack slot
because it will already have been assigned a slot) so that the generated
code will load from the stack into that preg.
However, if the same virtual register is used in two different
instructions, I get an abort with an error message saying that I'm trying
to assign a physical register to a virtual register that's already been
mapped. If I don't call assignVirt2Phys then I get an abort with an error
message saying that some virtual register was not assigned a physical
register.
I've tried calling addSpillPoint and addRestorePoint for every instruction
that involves a virtual register, and also not calling those functions,
and I still get the same aborts.
I would very much appreciate any help!
Susan Horwitz
> However, if the same virtual register is used in two different instructions, I get an abort with an error message saying that I'm trying to assign a physical register to a virtual register that's already been mapped. If I don't call assignVirt2Phys then I get an abort with an error message saying that some virtual register was not assigned a physical register.
Each vreg should be mapped to a singe physreg. The rewriter will
rewrite any operand that refers to the virtreg to use the physreg it
is mapped to (including the operands of the loads/stores inserted for
that virtreg). If no mapping is made, or you attempt to redefine the
mapping (without calling VirtRegMap::clearAllVirt()), you'll get
assertion failures as you saw.
You really want to iterate over the function assigning a physreg to
each virtreg the first time you encounter it, being careful not to
assign the same physreg to two virtual registers if they're ever used
in the same instruction.
Unfortunately this simple approach will not work in all cases. If, for
instance, a virtreg were used in a set of instructions which also
used, between them, every possible physreg, then there'd be no physreg
left to allocate to the virtreg.
You can correct this (at some cost to performance) by inserting new
virtregs (see MachineRegisterInfo::createVirtualRegister), and copies
(see TargetInstrInfo::copyRegToReg) before allocation. You could do
the following:
For each instruction i:
For each operand o:
if o is a register operand:
create a new virtreg v with the same register class as o
if i uses o:
insert a copy from o to v before i
if i defines o:
insert a copy from v to o after i
rename o to use v
I believe this should be sufficient to guarantee that you can find a
physreg for every virtreg which does not conflict with the choices for
any other virtreg in that instruction. You would only need to spill
the original vregs, not the ones you inserted in the pre-pass.
One gotcha - Two address instructions. If an operand is used as both
def and use in a two address instruction (see
MachineInstr::isRegTiedToUseOperand &
MachineInstr::isRegTiedToDefOperand) you only want to insert one new
vreg for the two operands and re-use it.
> I've tried calling addSpillPoint and addRestorePoint for every instruction that involves a virtual register, and also not calling those functions, and I still get the same aborts.
You definitely need these calls. They're not the source of your trouble.
Cheers,
Lang.