> Some background first: this issue is for a backend for an 8-bit microcontroller with only 8-bit regs. However, it has a few 16-bit instructions that only work with fixed register pairs, so it doesn't allow all combinations of regs. This introduces some problems: if data wider than 8 bits is expanded into 8-bit operations, the 16-bit instructions never get selected, and the reg allocator should also allocate adjacent regs to form the pairs. The most important 16-bit instruction is data movement; this instruction can move a register pair in a single cycle.
Are you doing this only to take advantage of a double move instruction?
If so, I think you are better off just using 8-bit registers. The 16-bit code will never be as good as 8-bit code if you don't have a number of 16-bit instructions.
[...]
> typedef unsigned short t;
> t foo(t a, t b, t c)
> {
> return a+b;
> }
[...]
> This is fine until we get to the register allocation stage, there it does:
> BB#0: derived from LLVM BB %entry
> Live Ins: %R25R24 %R23R22
> %R18<def> = COPY %R24
> %R19<def> = COPY %R25
> %R24<def> = COPY %R22<kill>, %R25R24<imp-def>
> %R24<def> = ADDRdRr %R24, %R18<kill>, %SREG<imp-def>
> %R25<def> = COPY %R23<kill>
> %R25<def> = ADCRdRr %R25, %R19<kill>, %SREG<imp-def,dead>, %SREG<imp-use,kill>
> RET %R25R24<imp-use,kill>
>
> Here instead of splitting the pair into its own subregs, it's copying each subreg into r18 and r19 making useless movements.
> The optimal code should be:
> add r24, r22
> adc r25, r23
>
> So my first question would be how to solve this?
You should not expect LLVM to generate optimal code for toy examples; that is not the design goal. Your example requires a lot of trickery to compile optimally, and it is not clear that such tricks would benefit more realistic code.
How are you doing on real code?
/jakob
_______________________________________________
LLVM Developers mailing list
LLV...@cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Without doing what I mentioned, and letting LLVM expand all operations wider than 8 bits as you asked, the code produced is excellent, setting aside that many of the moves there should be 16-bit moves (reducing code size) and that register allocation should pair registers properly. Something also important for me is that the code is better than gcc's. When I say "right" reg allocation I don't mean it's doing things wrong; I mean it's picking regs freely without pairing them, because I don't know how to make it do that. So now I have to push things further and implement these details so that the backend emits those 16-bit instructions, which I don't know how to insert, and this is where I need help.
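
To make concrete what "expanding operations wider than 8 bits" means for the add example quoted above, here is a minimal C sketch that models the add/adc sequence on two 8-bit halves. The `pair8` struct and the helper names are hypothetical, made up only for this illustration; they are not part of LLVM or of the backend.

```c
#include <stdint.h>

/* Hypothetical model of a register pair such as r25:r24 (hi:lo). */
typedef struct {
    uint8_t lo;
    uint8_t hi;
} pair8;

/* Split a 16-bit value into its two 8-bit halves. */
static pair8 split16(uint16_t v) {
    pair8 p = { (uint8_t)(v & 0xFF), (uint8_t)(v >> 8) };
    return p;
}

/* Join the two 8-bit halves back into a 16-bit value. */
static uint16_t join16(pair8 p) {
    return (uint16_t)(((unsigned)p.hi << 8) | p.lo);
}

/* 16-bit add expanded into two 8-bit operations:
 * "add" on the low bytes, then "adc" on the high bytes plus the carry. */
static pair8 add16_expanded(pair8 a, pair8 b) {
    unsigned lo = (unsigned)a.lo + b.lo;          /* add lo, lo */
    unsigned carry = lo >> 8;                     /* carry out of the low byte */
    unsigned hi = (unsigned)a.hi + b.hi + carry;  /* adc hi, hi */
    pair8 r = { (uint8_t)lo, (uint8_t)hi };
    return r;
}
```

For instance, `join16(add16_expanded(split16(0x12FF), split16(0x0001)))` gives 0x1300: the low-byte add overflows and the carry is picked up by the high-byte add. Note that nothing here needs extra temporaries, which is why the copies into r18 and r19 look avoidable.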
Hello Lang, thanks for the suggestion :) It's very interesting. I'll read the email you pointed to so I can understand how it works. Btw, does this mean that only your allocator is able to handle this type of constraint?