Hi all,
I am trying to optimize the following loop with PLUTO and Polly:
>>>>>>>>>>>>>>>>>>>>>>>>
for (i = 0; i < nx; i++) {
  for (j = 0; j < ny; j++) {
    if (i < ny-1)
      A[i+1][j] = A[i][j];
    if (j < ny-1)
      A[i][j+1] = A[i][j];
  }
}
>>>>>>>>>>>>>>>>>>>>>>>>>
PLUTO transforms it into the following (this is only part of PLUTO's output):
>>>>>>>>>>>>>>>>>>>>>>>>>
for (t1=2*ny-2;t1<=nx+ny-2;t1++) {
  lbp=max(1,t1-nx+1);
  ubp=ny-1;
#pragma omp parallel for private(lbv,ubv,t3)
  for (t2=lbp;t2<=ubp;t2++) {
    A[(t1-t2)][(t2-1)+1] = A[(t1-t2)][(t2-1)];;
  }
}
>>>>>>>>>>>>>>>>>>>>>>>>>>
As the "nx+ny-2" upper bound on t1 shows, PLUTO has skewed this loop nest.
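To make clear what I mean by skewing, here is a minimal sketch on a simpler kernel than the one above. This is not PLUTO's output and not my actual test case: the kernel `B[i][j] = B[i-1][j] + B[i][j-1]`, the sizes `NX`/`NY`, and all function names are assumptions chosen only for illustration. It applies the skew t1 = i + j, t2 = j, so every point in one t1 wavefront depends only on points in the previous wavefront.

```c
#include <string.h>

#define NX 6 /* hypothetical sizes, chosen only for this demo */
#define NY 5

static int max_i(int a, int b) { return a > b ? a : b; }
static int min_i(int a, int b) { return a < b ? a : b; }

/* Set the first row and first column to 1, everything else to 0. */
static void init(int B[NX][NY]) {
    for (int i = 0; i < NX; i++)
        for (int j = 0; j < NY; j++)
            B[i][j] = (i == 0 || j == 0) ? 1 : 0;
}

/* Original order: point (i, j) reads (i-1, j) and (i, j-1). */
static void run_original(int B[NX][NY]) {
    for (int i = 1; i < NX; i++)
        for (int j = 1; j < NY; j++)
            B[i][j] = B[i - 1][j] + B[i][j - 1];
}

/* Skewed order: t1 = i + j, t2 = j, hence i = t1 - t2.
 * Both reads come from wavefront t1 - 1, so the iterations of the
 * inner t2 loop are independent of each other. */
static void run_skewed(int B[NX][NY]) {
    for (int t1 = 2; t1 <= NX + NY - 2; t1++) {
        int lb = max_i(1, t1 - NX + 1); /* from i <= NX - 1 */
        int ub = min_i(NY - 1, t1 - 1); /* from i >= 1      */
        for (int t2 = lb; t2 <= ub; t2++)
            B[t1 - t2][t2] = B[t1 - t2 - 1][t2] + B[t1 - t2][t2 - 1];
    }
}

/* Returns 1 if both execution orders produce the same array. */
int skew_matches_original(void) {
    int a[NX][NY], b[NX][NY];
    init(a);
    init(b);
    run_original(a);
    run_skewed(b);
    return memcmp(a, b, sizeof a) == 0;
}
```

Calling `skew_matches_original()` from a `main` checks that the skewed order computes the same values as the original order. The outer bound `NX + NY - 2` and the `max(1, t1 - NX + 1)` lower bound have the same shape as the bounds in the PLUTO output above, which is why I read that output as a skewed loop.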
I have tried the following commands and options with Polly:
>>>>>>>>>>>>>>>>>>>>>>>>>
clang -S -g -emit-llvm ../test1.cc -Xclang -disable-O0-optnone -o test1.ll
opt -S -polly-canonicalize test1.ll -o beforepolly.ll
opt -polly-allow-nonaffine -polly-opt-fusion=max -polly-tiling=false \
    -polly-detect -polly-scops -polly-simplify -polly-optree -polly-delicm \
    -polly-simplify -polly-dependences -polly-opt-isl -polly-ast \
    -polly-codegen beforepolly.ll
>>>>>>>>>>>>>>>>>>>>>>>>>
However, I cannot get a similar transformation from Polly. I checked the LLVM IR dumped after these passes, and the new loop nest built by Polly has the same structure as the original code.
Did I miss something, such as an option or a pragma?