[Bug 15643] New: 'No' Progress within isl

44 views
Skip to first unread message

bugzill...@llvm.org

unread,
Apr 1, 2013, 5:49:23 PM4/1/13
to poll...@googlegroups.com
Bug ID 15643
Summary 'No' Progress within isl
Product Projects
Version 3.2
Hardware PC
OS Linux
Status NEW
Severity normal
Priority P
Component Polly
Assignee poll...@googlegroups.com
Reporter johannes...@gmx.de
Classification Unclassified

Created attachment 10266 [details]
C source file, kernel operations on 2D image

While

 opt -polly-codegen -polly-opt-isl -polly-ignore-aliasing  kernel.ll
-polly-detect-only=apply_kernel_3x3

works fine,

 opt -polly-codegen -polly-opt-isl -polly-ignore-aliasing  kernel.ll
-polly-detect-only=apply_kernel_5x5

seems to make no progress at all.

Polly git: 331e829324157cca54491d268410f8ee2d33cd65
Cloog git: c7721fc941db89dd1afc6240eaceea46d0bcad17
isl   git: 9f82ab3cd18ac34f883c30594111e4eb17426e11

kernel.ll is generated by:
clang -emit-llvm -S kernel.c -o kernel.bc
opt -S kernel.bc -polly-prepare -mem2reg -polly-indvars -o kernel.ll


Stacktrace after running opt for a while:
==============================================================
#0  0x00007ffff65bb724 in isl_seq_combine () from
/home/johannes/git/POLLY/install/lib/libisl.so.10
#1  0x00007ffff65bff9a in isl_tab_add_row () from
/home/johannes/git/POLLY/install/lib/libisl.so.10
#2  0x00007ffff65c0202 in isl_tab_add_ineq () from
/home/johannes/git/POLLY/install/lib/libisl.so.10
#3  0x00007ffff6588509 in basic_map_collect_diff.part.1 () from
/home/johannes/git/POLLY/install/lib/libisl.so.10
#4  0x00007ffff6588b3c in map_subtract () from
/home/johannes/git/POLLY/install/lib/libisl.so.10
#5  0x00007ffff6552adf in isl_access_info_compute_flow () from
/home/johannes/git/POLLY/install/lib/libisl.so.10
#6  0x00007ffff65537f3 in compute_flow () from
/home/johannes/git/POLLY/install/lib/libisl.so.10
#7  0x00007ffff655b109 in isl_hash_table_foreach () from
/home/johannes/git/POLLY/install/lib/libisl.so.10
#8  0x00007ffff65d42ed in isl_union_map_foreach_map () from
/home/johannes/git/POLLY/install/lib/libisl.so.10
#9  0x00007ffff6553bd0 in isl_union_map_compute_flow () from
/home/johannes/git/POLLY/install/lib/libisl.so.10
#10 0x00007ffff68fd8e3 in
polly::Dependences::calculateDependences(polly::Scop&) () from
install/lib/LLVMPolly.so
#11 0x00007ffff68fdba4 in polly::Dependences::runOnScop(polly::Scop&) () from
install/lib/LLVMPolly.so
#12 0x00007ffff691e946 in polly::ScopPass::runOnRegion(llvm::Region*,
llvm::RGPassManager&) () from install/lib/LLVMPolly.so
#13 0x000000000151ea44 in llvm::RGPassManager::runOnFunction(llvm::Function&)
()
#14 0x000000000167cf29 in llvm::FPPassManager::runOnFunction(llvm::Function&)
()
#15 0x000000000167d11a in llvm::FPPassManager::runOnModule(llvm::Module&) ()
#16 0x000000000167d477 in llvm::MPPassManager::runOnModule(llvm::Module&) ()
#17 0x000000000167da79 in llvm::PassManagerImpl::run(llvm::Module&) ()
#18 0x000000000167dc8b in llvm::PassManager::run(llvm::Module&) ()
#19 0x00000000008bdf05 in main ()


You are receiving this mail because:
  • You are the assignee for the bug.

Sven Verdoolaege

unread,
Apr 5, 2013, 12:12:51 PM4/5/13
to bugzill...@llvm.org, poll...@googlegroups.com
On Mon, Apr 01, 2013 at 09:49:23PM +0000, bugzill...@llvm.org wrote:
> Stacktrace after running opt for a while:
> ==============================================================
> #0 0x00007ffff65bb724 in isl_seq_combine () from
> /home/johannes/git/POLLY/install/lib/libisl.so.10
> #1 0x00007ffff65bff9a in isl_tab_add_row () from
> /home/johannes/git/POLLY/install/lib/libisl.so.10
> #2 0x00007ffff65c0202 in isl_tab_add_ineq () from
> /home/johannes/git/POLLY/install/lib/libisl.so.10
> #3 0x00007ffff6588509 in basic_map_collect_diff.part.1 () from
> /home/johannes/git/POLLY/install/lib/libisl.so.10
> #4 0x00007ffff6588b3c in map_subtract () from
> /home/johannes/git/POLLY/install/lib/libisl.so.10
> #5 0x00007ffff6552adf in isl_access_info_compute_flow () from
> /home/johannes/git/POLLY/install/lib/libisl.so.10
> #6 0x00007ffff65537f3 in compute_flow () from
> /home/johannes/git/POLLY/install/lib/libisl.so.10
> #7 0x00007ffff655b109 in isl_hash_table_foreach () from
> /home/johannes/git/POLLY/install/lib/libisl.so.10
> #8 0x00007ffff65d42ed in isl_union_map_foreach_map () from
> /home/johannes/git/POLLY/install/lib/libisl.so.10
> #9 0x00007ffff6553bd0 in isl_union_map_compute_flow () from
> /home/johannes/git/POLLY/install/lib/libisl.so.10

What do the inputs to this function look like?
(Use isl_union_map_dump)
Which version of isl are you using?

skimo

Tobias Grosser

unread,
Apr 10, 2013, 6:42:20 AM4/10/13
to sk...@kotnet.org, Sven Verdoolaege, bugzill...@llvm.org, poll...@googlegroups.com
88 isl_union_map_compute_flow(
89 isl_union_map_copy(Read), isl_union_map_copy(Write),
90 isl_union_map_copy(MayWrite), isl_union_map_copy(Schedule),
&RAW, NULL,
91 NULL, NULL);
92
93 isl_union_map_compute_flow(
(gdb) call isl_union_map_dump(Read)
[p_0, ny, p_2, p_3, p_4, p_5, p_6, p_7, p_8, p_9, p_10, p_11, p_12,
p_13, p_14, p_15, p_16, p_17, p_18, p_19, p_20, p_21, p_22, p_23, p_24,
p_25, p_26] -> { Stmt_for_body4[i0] -> MemRef_dx[o0] : (8o0 = p_7 + 8i0
and i0 >= 0 and i0 <= -2 + p_0) or (8o0 = p_7 + 8i0 and i0 >= 0 and i0
<= -2 + p_0) or (8o0 = p_7 + 8i0 and i0 >= 0 and i0 <= -2 + p_0);
Stmt_for_body4[i0] -> MemRef_u[o0] : (8o0 = p_2 + 8i0 and i0 >= 0 and i0
<= -2 + p_0) or (8o0 = p_3 + 8i0 and i0 >= 0 and i0 <= -2 + p_0) or (8o0
= p_4 + 8i0 and i0 >= 0 and i0 <= -2 + p_0) or (8o0 = p_5 + 8i0 and i0
>= 0 and i0 <= -2 + p_0) or (8o0 = p_6 + 8i0 and i0 >= 0 and i0 <= -2 +
p_0) or (8o0 = p_8 + 8i0 and i0 >= 0 and i0 <= -2 + p_0) or (8o0 = p_9 +
8i0 and i0 >= 0 and i0 <= -2 + p_0) or (8o0 = p_10 + 8i0 and i0 >= 0 and
i0 <= -2 + p_0) or (8o0 = p_11 + 8i0 and i0 >= 0 and i0 <= -2 + p_0) or
(8o0 = p_12 + 8i0 and i0 >= 0 and i0 <= -2 + p_0) or (8o0 = p_13 + 8i0
and i0 >= 0 and i0 <= -2 + p_0) or (8o0 = p_14 + 8i0 and i0 >= 0 and i0
<= -2 + p_0) or (8o0 = p_15 + 8i0 and i0 >= 0 and i0 <= -2 + p_0) or
(8o0 = p_16 + 8i0 and i0 >= 0 and i0 <= -2 + p_0) or (8o0 = p_17 + 8i0
and i0 >= 0 and i0 <= -2 + p_0) or (8o0 = p_18 + 8i0 and i0 >= 0 and i0
<= -2 + p_0) or (8o0 = p_19 + 8i0 and i0 >= 0 and i0 <= -2 + p_0) or
(8o0 = p_20 + 8i0 and i0 >= 0 and i0 <= -2 + p_0) or (8o0 = p_21 + 8i0
and i0 >= 0 and i0 <= -2 + p_0) or (8o0 = p_22 + 8i0 and i0 >= 0 and i0
<= -2 + p_0) or (8o0 = p_2 + 8i0 and i0 >= 0 and i0 <= -2 + p_0) or (8o0
= p_8 + 8i0 and i0 >= 0 and i0 <= -2 + p_0) or (8o0 = p_23 + 8i0 and i0
>= 0 and i0 <= -2 + p_0) or (8o0 = p_13 + 8i0 and i0 >= 0 and i0 <= -2
+ p_0) or (8o0 = p_18 + 8i0 and i0 >= 0 and i0 <= -2 + p_0) or (8o0 =
p_3 + 8i0 and i0 >= 0 and i0 <= -2 + p_0) or (8o0 = p_9 + 8i0 and i0 >=
0 and i0 <= -2 + p_0) or (8o0 = p_24 + 8i0 and i0 >= 0 and i0 <= -2 +
p_0) or (8o0 = p_14 + 8i0 and i0 >= 0 and i0 <= -2 + p_0) or (8o0 = p_19
+ 8i0 and i0 >= 0 and i0 <= -2 + p_0) or (8o0 = p_5 + 8i0 and i0 >= 0
and i0 <= -2 + p_0) or (8o0 = p_11 + 8i0 and i0 >= 0 and i0 <= -2 + p_0)
or (8o0 = p_25 + 8i0 and i0 >= 0 and i0 <= -2 + p_0) or (8o0 = p_16 +
8i0 and i0 >= 0 and i0 <= -2 + p_0) or (8o0 = p_21 + 8i0 and i0 >= 0 and
i0 <= -2 + p_0) or (8o0 = p_6 + 8i0 and i0 >= 0 and i0 <= -2 + p_0) or
(8o0 = p_12 + 8i0 and i0 >= 0 and i0 <= -2 + p_0) or (8o0 = p_26 + 8i0
and i0 >= 0 and i0 <= -2 + p_0) or (8o0 = p_17 + 8i0 and i0 >= 0 and i0
<= -2 + p_0) or (8o0 = p_22 + 8i0 and i0 >= 0 and i0 <= -2 + p_0);
Stmt_for_body4[i0] -> MemRef_dy[o0] : (8o0 = p_7 + 8i0 and i0 >= 0 and
i0 <= -2 + p_0) or (8o0 = p_7 + 8i0 and i0 >= 0 and i0 <= -2 + p_0) or
(8o0 = p_7 + 8i0 and i0 >= 0 and i0 <= -2 + p_0) }
$2 = 0
(gdb) call isl_union_map_dump(Write)
[p_0, ny, p_2, p_3, p_4, p_5, p_6, p_7, p_8, p_9, p_10, p_11, p_12,
p_13, p_14, p_15, p_16, p_17, p_18, p_19, p_20, p_21, p_22, p_23, p_24,
p_25, p_26] -> { Stmt_for_body4[i0] -> MemRef_dx[o0] : (8o0 = p_7 + 8i0
and i0 >= 0 and i0 <= -2 + p_0) or (8o0 = p_7 + 8i0 and i0 >= 0 and i0
<= -2 + p_0) or (8o0 = p_7 + 8i0 and i0 >= 0 and i0 <= -2 + p_0) or (8o0
= p_7 + 8i0 and i0 >= 0 and i0 <= -2 + p_0); Stmt_for_body4[i0] ->
MemRef_dy[o0] : (8o0 = p_7 + 8i0 and i0 >= 0 and i0 <= -2 + p_0) or (8o0
= p_7 + 8i0 and i0 >= 0 and i0 <= -2 + p_0) or (8o0 = p_7 + 8i0 and i0
>= 0 and i0 <= -2 + p_0) or (8o0 = p_7 + 8i0 and i0 >= 0 and i0 <= -2 +
p_0) }
$3 = 0
(gdb) call isl_union_map_dump(MayWrite)
[p_0, ny, p_2, p_3, p_4, p_5, p_6, p_7, p_8, p_9, p_10, p_11, p_12,
p_13, p_14, p_15, p_16, p_17, p_18, p_19, p_20, p_21, p_22, p_23, p_24,
p_25, p_26] -> { }
$4 = 0
(gdb) call isl_union_map_dump(Schedule)
[p_0, ny, p_2, p_3, p_4, p_5, p_6, p_7, p_8, p_9, p_10, p_11, p_12,
p_13, p_14, p_15, p_16, p_17, p_18, p_19, p_20, p_21, p_22, p_23, p_24,
p_25, p_26] -> { Stmt_for_body4[i0] -> scattering[0, i0, 0] }

> Which version of isl are you using?

isl 0.11.1

I did not yet look into why this input is generated and if we could
generate some less complex input.

Cheers,
Tobias

Tobias Grosser

unread,
Apr 10, 2013, 7:25:03 AM4/10/13
to sk...@kotnet.org, polly-dev
OK, now I looked. It is actually pretty easy.

The kernel looks as follows (the 5x5 is similar)

void apply_kernel_3x3(double* u, double* dx, double* dy, long nx, long ny) {
long i,j;
for (i = 1; i < nx - 1; i++) {
for (j = 1; j < ny - 1; j++) {

dx[i*ny+j] = u[(i-1)*ny+j-1] + 2 * u[(i-1)*ny+j] +
u[(i-1)*ny+j+1];
dx[i*ny+j] += - u[(i+1)*ny+j-1] - 2 * u[(i+1)*ny+j] -
u[(i+1)*ny+j+1];

dy[i*ny+j] = - u[(i-1)*ny+j-1] - 2 * u[(i)*ny+j-1] -
u[(i+1)*ny+j-1];
dy[i*ny+j] += u[(i-1)*ny+j+1] + 2 * u[(i)*ny+j+1] +
u[(i+1)*ny+j+1];

}
}
}

Due to the linearized variable sized arrays we get expressions such as i
* ny. Those expressions can not be expressed in the polyhedral model.
As a result we only detect the j-loop as a SCoP and hide the i * ny
style expressions in scop invariant parameters. This has two problems:

1) We do not detect the full kernel

If we would understand the multi-dimensionality of the array references,
we could detect the full kernel and could analyze it in detail

2) Large number of parameters causes slow downs

Without delinearization we get a large number of independent parameters.
With increasing dimensionality the number of parameters grows fast, such
that even at 3D the dependence analysis becomes slow.


There are several solutions:

1) Use fixed size arrays

This makes the problem go away and allows us to fully optimize the kernel

2) Make Polly understand the multi-dimensionality of the arrays

This needs some work in Polly, but it would be the optimal solution
as we should both be fast and precise after detecting the
multi-dimensionality.

3) Fix the slow analysis

We should find a way to either do our analysis fast or just bail out
when we realize we can not do anything useful here.

Cheers,
Tobias

Sven Verdoolaege

unread,
Apr 10, 2013, 7:54:21 AM4/10/13
to Tobias Grosser, polly-dev
On Wed, Apr 10, 2013 at 01:25:03PM +0200, Tobias Grosser wrote:
> void apply_kernel_3x3(double* u, double* dx, double* dy, long nx, long ny) {
> long i,j;
> for (i = 1; i < nx - 1; i++) {
> for (j = 1; j < ny - 1; j++) {
>
> dx[i*ny+j] = u[(i-1)*ny+j-1] + 2 * u[(i-1)*ny+j] + u[(i-1)*ny+j+1];
> dx[i*ny+j] += - u[(i+1)*ny+j-1] - 2 * u[(i+1)*ny+j] - u[(i+1)*ny+j+1];
>
> dy[i*ny+j] = - u[(i-1)*ny+j-1] - 2 * u[(i)*ny+j-1] - u[(i+1)*ny+j-1];
> dy[i*ny+j] += u[(i-1)*ny+j+1] + 2 * u[(i)*ny+j+1] + u[(i+1)*ny+j+1];
>
> }
> }
> }
>
> Due to the linearized variable sized arrays we get expressions such as i *
> ny. Those expressions can not be expressed in the polyhedral model.
> As a result we only detect the j-loop as a SCoP and hide the i * ny style
> expressions in scop invariant parameters.

Why do you want to compute dependences that are parametric in these
expressions?

skimo

Tobias Grosser

unread,
Apr 10, 2013, 8:05:13 AM4/10/13
to sk...@kotnet.org, Sven Verdoolaege, polly-dev
I don't think I got the question.

What we currently is to take the largest region that is still a scop,
calculate dependences for it and run the schedule optimizer. All
unconditionally.

Are you saying there is an easy way to understand that calculating
dependences here does not make sense? Or are you proposing that we
should generate a different polyhedral representation?

Tobias

Sven Verdoolaege

unread,
Apr 10, 2013, 9:56:28 AM4/10/13
to Tobias Grosser, polly-dev
On Wed, Apr 10, 2013 at 02:05:13PM +0200, Tobias Grosser wrote:
> On 04/10/2013 01:54 PM, Sven Verdoolaege wrote:
> >On Wed, Apr 10, 2013 at 01:25:03PM +0200, Tobias Grosser wrote:
> >>void apply_kernel_3x3(double* u, double* dx, double* dy, long nx, long ny) {
> >> long i,j;
> >> for (i = 1; i < nx - 1; i++) {
> >> for (j = 1; j < ny - 1; j++) {
> >>
> >> dx[i*ny+j] = u[(i-1)*ny+j-1] + 2 * u[(i-1)*ny+j] + u[(i-1)*ny+j+1];
> >> dx[i*ny+j] += - u[(i+1)*ny+j-1] - 2 * u[(i+1)*ny+j] - u[(i+1)*ny+j+1];
> >>
> >> dy[i*ny+j] = - u[(i-1)*ny+j-1] - 2 * u[(i)*ny+j-1] - u[(i+1)*ny+j-1];
> >> dy[i*ny+j] += u[(i-1)*ny+j+1] + 2 * u[(i)*ny+j+1] + u[(i+1)*ny+j+1];
> >>
> >> }
> >> }
> >>}
> >>
> >>Due to the linearized variable sized arrays we get expressions such as i *
> >>ny. Those expressions can not be expressed in the polyhedral model.
> >>As a result we only detect the j-loop as a SCoP and hide the i * ny style
> >>expressions in scop invariant parameters.
> >
> >Why do you want to compute dependences that are parametric in these
> >expressions?
>
> I don't think I got the question.
>
> What we currently is to take the largest region that is still a scop,
> calculate dependences for it and run the schedule optimizer. All
> unconditionally.

What do you mean by "unconditionally"?

> Are you saying there is an easy way to understand that calculating
> dependences here does not make sense? Or are you proposing that we should
> generate a different polyhedral representation?

I'm just wondering why you want to compute dependences in terms of
all these extra parameters. You end up with lots of special cases
where there is a dependence and lots of special cases where theres
is no dependence. In what way do you exploit these sepcial cases?

skimo

Tobias Grosser

unread,
Apr 10, 2013, 10:03:25 AM4/10/13
to sk...@kotnet.org, Sven Verdoolaege, polly-dev
On 04/10/2013 03:56 PM, Sven Verdoolaege wrote:
> On Wed, Apr 10, 2013 at 02:05:13PM +0200, Tobias Grosser wrote:
>> On 04/10/2013 01:54 PM, Sven Verdoolaege wrote:
>>> On Wed, Apr 10, 2013 at 01:25:03PM +0200, Tobias Grosser wrote:
>>>> void apply_kernel_3x3(double* u, double* dx, double* dy, long nx, long ny) {
>>>> long i,j;
>>>> for (i = 1; i < nx - 1; i++) {
>>>> for (j = 1; j < ny - 1; j++) {
>>>>
>>>> dx[i*ny+j] = u[(i-1)*ny+j-1] + 2 * u[(i-1)*ny+j] + u[(i-1)*ny+j+1];
>>>> dx[i*ny+j] += - u[(i+1)*ny+j-1] - 2 * u[(i+1)*ny+j] - u[(i+1)*ny+j+1];
>>>>
>>>> dy[i*ny+j] = - u[(i-1)*ny+j-1] - 2 * u[(i)*ny+j-1] - u[(i+1)*ny+j-1];
>>>> dy[i*ny+j] += u[(i-1)*ny+j+1] + 2 * u[(i)*ny+j+1] + u[(i+1)*ny+j+1];
>>>>
>>>> }
>>>> }
>>>> }
>>>>
>>>> Due to the linearized variable sized arrays we get expressions such as i *
>>>> ny. Those expressions can not be expressed in the polyhedral model.
>>>> As a result we only detect the j-loop as a SCoP and hide the i * ny style
>>>> expressions in scop invariant parameters.
>>>
>>> Why do you want to compute dependences that are parametric in these
>>> expressions?
>>
>> I don't think I got the question.
>>
>> What we currently is to take the largest region that is still a scop,
>> calculate dependences for it and run the schedule optimizer. All
>> unconditionally.
>
> What do you mean by "unconditionally"?

I mean we always compute the dependences without checking beforehand if
the result may be useful for us.

>> Are you saying there is an easy way to understand that calculating
>> dependences here does not make sense? Or are you proposing that we should
>> generate a different polyhedral representation?
>
> I'm just wondering why you want to compute dependences in terms of
> all these extra parameters. You end up with lots of special cases
> where there is a dependence and lots of special cases where theres
> is no dependence. In what way do you exploit these sepcial cases?

Sorry, I still don't get it. I do not want to compute these dependences
the way we do it today, but I don't see a better solution (except of
delinearization).

I can imagine two solutions:

1) Detect that we can not do anything useful before calculating the
dependences

We do not do this just because I don't have any good check/heuristic
that would help here. In case you have an idea of some check that we
could execute to bail out before dependence calculation, please let us now.

2) Generate less parameters

I have no idea how to reduce the number of parameters. Treating i * ny
as a new parameter is the only way I know to handle such multiplications.

Cheers,
Tobias

Sven Verdoolaege

unread,
Apr 10, 2013, 11:02:51 AM4/10/13
to Tobias Grosser, polly-dev
On Wed, Apr 10, 2013 at 04:03:25PM +0200, Tobias Grosser wrote:
> 2) Generate less parameters
>
> I have no idea how to reduce the number of parameters. Treating i * ny as a
> new parameter is the only way I know to handle such multiplications.

Then why don't you just say that there may be an access for
any value of those expressions?

skimo

Tobias Grosser

unread,
Apr 10, 2013, 11:11:58 AM4/10/13
to sk...@kotnet.org, Sven Verdoolaege, polly-dev
True, we could do this (and in fact Polly supports this in general
already). The question is when to do this. For now we only over
approximate with may accesses, if there is no other choice. In this case
however we are not forced to over approximate and I we would need to
decide when the over-approximation would speed up dependence analysis
without loosing precision.

Unfortunately, each kernel that contains a parameter and where the
parameter value is previously calculated by a multiplication may contain
a very similar pattern. If we always over approximate as soon
as we see parameters that are defined by multiplication, we may loose
the ability to optimize these kernels.

Tobias


Sven Verdoolaege

unread,
Apr 10, 2013, 12:10:24 PM4/10/13
to Tobias Grosser, polly-dev
On Wed, Apr 10, 2013 at 05:11:58PM +0200, Tobias Grosser wrote:
> On 04/10/2013 05:02 PM, Sven Verdoolaege wrote:
> >On Wed, Apr 10, 2013 at 04:03:25PM +0200, Tobias Grosser wrote:
> >>2) Generate less parameters
> >>
> >>I have no idea how to reduce the number of parameters. Treating i * ny as a
> >>new parameter is the only way I know to handle such multiplications.
> >
> >Then why don't you just say that there may be an access for
> >any value of those expressions?
>
> True, we could do this (and in fact Polly supports this in general already).
> The question is when to do this. For now we only over approximate with may
> accesses, if there is no other choice. In this case however we are not
> forced to over approximate and I we would need to decide when the
> over-approximation would speed up dependence analysis without loosing
> precision.

Of course, you are going to lose precision in the dependence analysis.
The question is, why do you care? That is, how do you hope to exploit
the extra precision?

skimo

Tobias Grosser

unread,
Apr 10, 2013, 12:17:05 PM4/10/13
to sk...@kotnet.org, Sven Verdoolaege, polly-dev
For this case I do not care at all. It would obviously better to just
bail out. The question is how to detect the cases where I do not care.

Look at this example:

Here exact dependences will allow us to prove the absence of dependences
for certain values of 'b'.

for (i = 0; i < 10; i++)
A[i + b] = A[i]

The very same code could be written like this:

for (i = 0; i < 10; i++)
A[i + (n * k)] = A[i]

If we over always approximate as soon as we see a multiplication of
parameters, we would lose in cases like this interesting information.

Over approximating if there is no other solution is a simple decision.
If we want to follow your path we need some other way to decide when to
over approximate. Any suggestions?

Cheers,
Tobi





Sven Verdoolaege

unread,
Apr 10, 2013, 12:23:12 PM4/10/13
to Tobias Grosser, polly-dev
How do you exploit the fact that there may not be any dependences
for some values of b?

skimo

Tobias Grosser

unread,
Apr 10, 2013, 12:29:39 PM4/10/13
to sk...@kotnet.org, Sven Verdoolaege, polly-dev
I do not at the moment.

Here another one:

for (i = 0; i < 10; i++)
A[2i + b] = A[2i + b + 1]

We currently prove this one does not carry dependences and we will
parallelize it potentially.


Another option that I see is that we derive more information about the
parameters. If we can e.g. prove that they are positive, this may
simplify the set of dependences that are calculated ??

Tobias



Tobias Grosser

unread,
Apr 10, 2013, 12:47:17 PM4/10/13
to sk...@kotnet.org, Sven Verdoolaege, polly-dev
Another thing that I could imagine is to add some heuristic to the
dependence analysis, that allows it to over-approximate as soon
it sees the dependences explode. This could e.g. be as soon as access
functions with two different parameters are compared. Sven, do you
believe there is an easy condition that we could check for that would
avoid the explosion in this case, while still analyzing the case above?

Thanks,
Tobias

Sven Verdoolaege

unread,
Apr 10, 2013, 5:27:32 PM4/10/13
to Tobias Grosser, polly-dev
On Wed, Apr 10, 2013 at 06:29:39PM +0200, Tobias Grosser wrote:
> On 04/10/2013 06:23 PM, Sven Verdoolaege wrote:
> >On Wed, Apr 10, 2013 at 06:17:05PM +0200, Tobias Grosser wrote:
> >>Here exact dependences will allow us to prove the absence of dependences for
> >>certain values of 'b'.
> >>
> >>for (i = 0; i < 10; i++)
> >> A[i + b] = A[i]
> >
> >How do you exploit the fact that there may not be any dependences
> >for some values of b?
>
> I do not at the moment.

Then I don't think you should worry about the loss in accuracy for now.

> Here another one:
>
> for (i = 0; i < 10; i++)
> A[2i + b] = A[2i + b + 1]
>
> We currently prove this one does not carry dependences and we will
> parallelize it potentially.

AFAICS, you are generating a separate parameter for each expression
(even if it is identical to some other expression), so this situation
cannot arise based on the parameters you introduce.

> Another option that I see is that we derive more information about the
> parameters. If we can e.g. prove that they are positive, this may simplify
> the set of dependences that are calculated ??

How would that help?
What would help is if you can equalities between the parameters,
say expressing that they are equal or that the differ by ny.

skimo

Tobias Grosser

unread,
Apr 10, 2013, 5:42:04 PM4/10/13
to sk...@kotnet.org, Sven Verdoolaege, polly-dev
On 04/10/2013 11:27 PM, Sven Verdoolaege wrote:
> On Wed, Apr 10, 2013 at 06:29:39PM +0200, Tobias Grosser wrote:
>> On 04/10/2013 06:23 PM, Sven Verdoolaege wrote:
>>> On Wed, Apr 10, 2013 at 06:17:05PM +0200, Tobias Grosser wrote:
>>>> Here exact dependences will allow us to prove the absence of dependences for
>>>> certain values of 'b'.
>>>>
>>>> for (i = 0; i < 10; i++)
>>>> A[i + b] = A[i]
>>>
>>> How do you exploit the fact that there may not be any dependences
>>> for some values of b?
>>
>> I do not at the moment.
>
> Then I don't think you should worry about the loss in accuracy for now.
>
>> Here another one:
>>
>> for (i = 0; i < 10; i++)
>> A[2i + b] = A[2i + b + 1]
>>
>> We currently prove this one does not carry dependences and we will
>> parallelize it potentially.
>
> AFAICS, you are generating a separate parameter for each expression
> (even if it is identical to some other expression), so this situation
> cannot arise based on the parameters you introduce.

Good point.

That is and is not true. Identical region invariant (sub)expressions are
mapped to the same parameter. However, looking into this test case,
I see that this does not work as expected. Scalar evolution builds sub
expression such that constant offsets are moved to the innermost level.
This has the unfortunate effect that Polly sees a large number of
different region invariant subexpressions that are all mapped to
different parameters.

Here the 3d case:

p0: (1 smax (-1 + %ny))
p1: %ny
p2: {0,+,(8 * %ny)}<%for.cond>
p3: {8,+,(8 * %ny)}<%for.cond>
p4: {16,+,(8 * %ny)}<%for.cond>
p5: {(8 + (8 * %ny)),+,(8 * %ny)}<%for.cond>
p6: {(16 * %ny),+,(8 * %ny)}<%for.cond>
p7: {(8 + (16 * %ny)),+,(8 * %ny)}<%for.cond>
p8: {(16 + (16 * %ny)),+,(8 * %ny)}<%for.cond>
p9: {(8 * %ny),+,(8 * %ny)}<%for.cond>
p10: {(16 + (8 * %ny)),+,(8 * %ny)}<%for.cond>

We should probably hoist the loop invariant part out of the loop.

>> Another option that I see is that we derive more information about the
>> parameters. If we can e.g. prove that they are positive, this may simplify
>> the set of dependences that are calculated ??
>
> How would that help?
> What would help is if you can equalities between the parameters,
> say expressing that they are equal or that the differ by ny.

As explained above, the right solution is to recognize that there are
just two parameters: (%ny) and (%ny * %for.conf)

Tobi

Sven Verdoolaege

unread,
Apr 11, 2013, 6:23:03 AM4/11/13
to Tobias Grosser, polly-dev
On Wed, Apr 10, 2013 at 06:47:17PM +0200, Tobias Grosser wrote:
> Another thing that I could imagine is to add some heuristic to the
> dependence analysis, that allows it to over-approximate as soon
> it sees the dependences explode. This could e.g. be as soon as access
> functions with two different parameters are compared. Sven, do you believe
> there is an easy condition that we could check for that would avoid the
> explosion in this case, while still analyzing the case above?

If a given expression only appears in one access, then
don't introduce a parameter for that expression.

skimo

Tobias Grosser

unread,
Apr 11, 2013, 6:25:19 AM4/11/13
to sk...@kotnet.org, Sven Verdoolaege, polly-dev
If I have an expression

A[5i + (something_complex) + 10]

are you saying I should treat this as

A[5i + 10]

instead of

A[5i + p + 10]

?

This does not seem to be correct.

Tobi

Sven Verdoolaege

unread,
Apr 11, 2013, 6:31:02 AM4/11/13
to Tobias Grosser, polly-dev
No, you should treat as a potential access with access relation

{ [...i...] -> A[a] :
exists something_complex: a = 5i + (something_complex) + 10 }

skimo

Tobias Grosser

unread,
Apr 15, 2013, 5:38:09 AM4/15/13
to sk...@kotnet.org, polly-dev
Ah, I forgot to reply.

Yes, this may help. However, we can probably only do this transformation
right before the dependence analysis. If we e.g. want to modify/update
the memory access functions (as needed for GPU code generation), we will
still need access relations that explicitly reference such parameters.

Before using the simplification you suggested above, we should probably
fix the large number of parameters we introduce as suggested by me earlier.

Tobias

bugzill...@llvm.org

unread,
Dec 18, 2013, 6:31:21 AM12/18/13
to poll...@googlegroups.com
Tobias Grosser changed bug 15643
What Removed Added
Status NEW RESOLVED
Resolution --- FIXED

Comment # 3 on bug 15643 from Tobias Grosser
I just checked with 197560. The  number of parameters does not explode any more
and dependence analysis is now performed very quick (< 0.1 sec).

We still do not detect the whole scop. This could be fixed with Sebastian's
delinearization pass. However, as the bug report was about the slowness, we
close it now. Please open a new bug report to track the delinearization issue.

Reply all
Reply to author
Forward
0 new messages