[LLVMdev] Auto-vectorization and phi nodes

92 views
Skip to first unread message

Vesa Norilo

unread,
Feb 19, 2013, 5:40:26 AM2/19/13
to llv...@cs.uiuc.edu
Hi all,

Sorry if this is a dumb or FAQ or the wrong list!

I'm currently investigating LLVM vectorization of my generated code. My
codegen emits a lot of recursions that step through arrays via pointers.
The recursions are nicely optimized into loops, but the loop
vectorization can't seem to work on them because of phi nodes that point
to gep nodes.

Some simple IR to demonstrate; it vectorizes nicely with opt -O3
-vectorize-loops -force-vector-width until I uncomment the phi/gep nodes.

define void @add_vector(float* noalias %a, float* noalias %b, float*
noalias %c, i32 %num)
{
Top:
br label %Loop
Loop:
%i = phi i32 [0,%Top],[%i.next,%Loop]

; phi and gep - won't vectorize
; %a.ptr = phi float* [%a,%Top],[%a.next,%Loop]
; %b.ptr = phi float* [%b,%Top],[%b.next,%Loop]
; %c.ptr = phi float* [%c,%Top],[%c.next,%Loop]

; %a.next = getelementptr float* %a.ptr, i32 1
; %b.next = getelementptr float* %b.ptr, i32 1
; %c.next = getelementptr float* %c.ptr, i32 1

; induction variable as index - will vectorize
%a.ptr = getelementptr float* %a, i32 %i
%b.ptr = getelementptr float* %b, i32 %i
%c.ptr = getelementptr float* %c, i32 %i

%a.val = load float* %a.ptr
%b.val = load float* %b.ptr
%sum = fadd float %a.val, %b.val
store float %sum, float* %c.ptr

%i.next = add i32 %i, 1
%more = icmp slt i32 %i.next, %num
br i1 %more, label %Loop, label %End
End:
ret void
}

So it seems that the loop vectorizer would like the pointer stepping to
be converted to base+index. However as expected, clang doesn't care
whether C code is written as pointer arithmetic or table index.

Is there a pass that converts simple pointer arithmetic to base+index?
If not, should I write one (shouldn't be too hard for my limited use
case) or try to emit more vector-friendly code from the front end?

Thanks a bunch!
Vesa Norilo
_______________________________________________
LLVM Developers mailing list
LLV...@cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Hal Finkel

unread,
Feb 19, 2013, 10:51:11 AM2/19/13
to Vesa Norilo, llv...@cs.uiuc.edu
As I recall, loop strength reduction can do this; but that happens only very late in the compilation process (well after vectorization). It would probably be better to update the loop vectorizer to deal with this directly. Nadav?

-Hal

Nadav Rotem

unread,
Feb 19, 2013, 12:22:58 PM2/19/13
to Hal Finkel, Vesa Norilo, llv...@cs.uiuc.edu
Hi Vesa,

The pass IndVars changes the induction variables to allow SCEV to analyze them and enable other optimizations. This is the canonicalization phase. Later on, LSR lowers the canonicalized induction variables to induction variables that map nicely to the target's addressing modes. In many cases it can remove some of the induction variables.

I suspect that the loop vectorizer does not vectorize the code because SCEV fails to detect the induction variable. Can you run the loop vectorizer with the '-debug' option and check why it fails ?

Thanks,
Nadav

Vesa Norilo

unread,
Feb 19, 2013, 1:09:15 PM2/19/13
to Nadav Rotem, llv...@cs.uiuc.edu
Hi Nadav and Hal and thanks for the help!

To the best of my understanding, indvars doesn't complain and an
induction variable is detected. However, the loop vectorizer says:

LV: Checking a loop in "add_vector"
LV: Found a loop: Loop
LV: Found an induction variable.
LV: Found an unidentified PHI. %a.ptr = phi float* [ %a, %Top ], [
%a.next, %Loop ]
LV: Can't vectorize the instructions or CFG
LV: Not vectorizing.

Should indvars have changed this phi node into something else earlier
during optimization?

In case I missed something important from the log, I put it all up here:
http://pastebin.com/yBS5RhVN

The IR fed into opt is at the end of the log, functionally unchanged. My
command line is opt -S -O3 -vectorize-loops -force-vector-width=8 -debug

Thanks again!
Vesa

Nadav Rotem

unread,
Feb 19, 2013, 1:16:48 PM2/19/13
to Vesa Norilo, llv...@cs.uiuc.edu


On Feb 19, 2013, at 10:09 AM, Vesa Norilo <vno...@siba.fi> wrote:

> Hi Nadav and Hal and thanks for the help!
>
> To the best of my understanding, indvars doesn't complain and an induction variable is detected. However, the loop vectorizer says:
>
> LV: Checking a loop in "add_vector"
> LV: Found a loop: Loop
> LV: Found an induction variable.
> LV: Found an unidentified PHI. %a.ptr = phi float* [ %a, %Top ], [ %a.next, %Loop ]
> LV: Can't vectorize the instructions or CFG
> LV: Not vectorizing.
>

Yes. SCEV does not identify this PHI. You can add a breakpoint in "isInductionVariable" to verify this. How are you generating this code ? Do you have a C example for it ?

Thanks,
Nadav

Vesa Norilo

unread,
Feb 19, 2013, 1:33:17 PM2/19/13
to Nadav Rotem, llv...@cs.uiuc.edu
19.2.2013 20:16, Nadav Rotem kirjoitti:
>
> On Feb 19, 2013, at 10:09 AM, Vesa Norilo <vno...@siba.fi> wrote:
>
>> LV: Found an unidentified PHI. %a.ptr = phi float* [ %a, %Top ], [ %a.next, %Loop ]
>>
> Yes. SCEV does not identify this PHI. You can add a breakpoint in "isInductionVariable" to verify this. How are you generating this code ? Do you have a C example for it ?
>
> Thanks,
> Nadav
>
>

Sorry, no C example. I'm writing a frontend for a functional language
that generates a bunch of tail calls that iterate through arrays by
passing incremented pointers and a decrementing induction var as
arguments to tail recursion. LLVM makes very neat scalar loop code out
of this, but fails to vectorize.

I wrote the IR I posted by hand to get a minimal example. The reason why
it's quite tricky for me to emit [baseptr+indvar] style IR is that base
pointer isn't local to the function body until after tail call
optimization. Any suggestions are welcome!

I actually wrote a quick pass that moves those phi nodes out of a loop
body and replaces them with base+index, but the resulting machine code
is rather bad (full of gather/scatter), probably because I needed to
screw with the order of llvm optimization passes without a deep enough
understanding of how they interact.

All the best,
Vesa
Reply all
Reply to author
Forward
0 new messages