[PATCH] D35761: [Polly][WIP] Use SCEV information for the second level aliasing

11 views
Skip to first unread message

Roman via Phabricator

unread,
Jul 22, 2017, 3:25:31 AM7/22/17
to garee...@gmail.com, tob...@grosser.es, ll...@meinersbur.de, doer...@cs.uni-saarland.de, siddu...@gmail.com, llvm-c...@lists.llvm.org, poll...@googlegroups.com, w...@google.com
gareevroman created this revision.
Herald added a reviewer: bollu.

We introduce another level of alias metadata to distinguish the individual non-aliasing accesses that have inter iteration alias-free base pointers marked with "Inter iteration alias-free" mark nodes. To distinguish two accesses, the comparison of raw pointers representing base pointers is used.

In case of, for example, ublas's prod function that implements GEMM, and DeLiCM we can get accesses to same location represented by different raw pointers. Consequently, we create different alias sets that can prevent accesses from, for example, being sinked or hoisted.

To avoid the issue, we compare the corresponding SCEV information instead of the corresponding raw pointers.


https://reviews.llvm.org/D35761

Files:
include/polly/CodeGen/IRBuilder.h
lib/CodeGen/IRBuilder.cpp
test/ScheduleOptimizer/kernel_gemm___%for.body---%for.end24.jscop.transformed
test/ScheduleOptimizer/pattern-matching-based-opts_12.ll

D35761.107776.patch

Tobias Grosser via Phabricator

unread,
Jul 31, 2017, 10:01:17 AM7/31/17
to garee...@gmail.com, phabr...@grosser.es, ll...@meinersbur.de, doer...@cs.uni-saarland.de, siddu...@gmail.com, llvm-c...@lists.llvm.org, poll...@googlegroups.com
grosser added a comment.

Hi Roman,

thanks for the patch. I looked into it a couple of times, but need to have another deeper look. Two high-level questions already:

1. Why do we need to use SCEV? Should we not be able to tell form our information which base pointers are expected to be identical?
2. What exactly must be checked in the new test case here. What was generated before, why was this wrong, and what does the new code generate.

Also, I may have missed this earlier. Why are the number of elements in the check lines growing so much. Is this expected?

CHECK-NEXT: !43 = !{!3, !4, !0, !6, !7, !11, !14, !16, !18, !20, !22, !24, !26, !28, !30, !32, !34, !36, !38, !40}


https://reviews.llvm.org/D35761



Tobias Grosser via Phabricator

unread,
Aug 4, 2017, 2:54:25 AM8/4/17
to garee...@gmail.com, phabr...@grosser.es, ll...@meinersbur.de, doer...@cs.uni-saarland.de, siddu...@gmail.com, llvm-c...@lists.llvm.org, poll...@googlegroups.com
grosser requested changes to this revision.
grosser added a comment.
This revision now requires changes to proceed.

Marking this as requesting changes, to move this out of the review queue for now.

Roman, it would be great if you could answer my questions before i look deeper.


https://reviews.llvm.org/D35761



Roman via Phabricator

unread,
Aug 5, 2017, 5:19:33 AM8/5/17
to garee...@gmail.com, phabr...@grosser.es, ll...@meinersbur.de, doer...@cs.uni-saarland.de, siddu...@gmail.com, llvm-c...@lists.llvm.org, poll...@googlegroups.com, w...@google.com
gareevroman updated this revision to Diff 109861.
gareevroman added a comment.



> 1. Why do we need to use SCEV? Should we not be able to tell form our information which base pointers are expected to be identical?

I think we should be able to tell that. However, it seems we don't do it.

The test case swaps the following access relations (the original JSCoP can be found in the new version of the patch):

{

"kind" : "read",
"relation" : "{ Stmt_for_body6[i0, i1, i2] -> MemRef_C1[0] }"

}

{

"kind" : "read",
"relation" : "{ Stmt_for_body6[i0, i1, i2] -> MemRef_C[i0, i1] }"

}

Subsequently, for some reason, Polly generates different accesses to elements of the matrix C. For example, in case of C[3][7], Polly generates the following code:

%459 = mul nsw i64 96, %polly.indvar98
%460 = mul nsw i64 4, %polly.indvar134
%461 = add nsw i64 %459, %460
%462 = add nsw i64 %461, 3
%463 = mul nsw i64 8, %polly.indvar128
%464 = add nsw i64 %463, 7
%465 = mul nsw i64 256, %polly.indvar
%466 = add nsw i64 %465, %polly.indvar140
br label %polly.stmt.for.body6940

polly.stmt.for.body6940: ; preds = %polly.stmt.for.body6914
%polly.access.cast.C941 = bitcast [1024 x double]* %C to double*
%467 = mul nsw i64 96, %polly.indvar98
%468 = mul nsw i64 4, %polly.indvar134
%469 = add nsw i64 %467, %468
%470 = add nsw i64 %469, 3
%polly.access.mul.C942 = mul nsw i64 %470, 1024
%471 = mul nsw i64 8, %polly.indvar128
%472 = add nsw i64 %471, 7
%polly.access.add.C943 = add nsw i64 %polly.access.mul.C942, %472
%polly.access.C944 = getelementptr double, double* %polly.access.cast.C941, i64 %polly.access.add.C943
%tmp_p_scalar_945 = load double, double* %polly.access.C944, align 8, !alias.scope !136, !noalias !137
%polly.access.cast.Packed_A946 = bitcast [24 x [256 x [4 x double]]]* %Packed_A to double*
%polly.access.mul.Packed_A947 = mul nsw i64 %polly.indvar134, 256
%polly.access.add.Packed_A948 = add nsw i64 %polly.access.mul.Packed_A947, %polly.indvar140
%polly.access.mul.Packed_A949 = mul nsw i64 %polly.access.add.Packed_A948, 4
%polly.access.add.Packed_A950 = add nsw i64 %polly.access.mul.Packed_A949, 3
%polly.access.Packed_A951 = getelementptr double, double* %polly.access.cast.Packed_A946, i64 %polly.access.add.Packed_A950
%tmp1_p_scalar_952 = load double, double* %polly.access.Packed_A951, align 8, !alias.scope !7, !noalias !10
%polly.access.cast.Packed_B953 = bitcast [256 x [256 x [8 x double]]]* %Packed_B to double*
%polly.access.mul.Packed_B954 = mul nsw i64 %polly.indvar128, 256
%polly.access.add.Packed_B955 = add nsw i64 %polly.access.mul.Packed_B954, %polly.indvar140
%polly.access.mul.Packed_B956 = mul nsw i64 %polly.access.add.Packed_B955, 8
%polly.access.add.Packed_B957 = add nsw i64 %polly.access.mul.Packed_B956, 7
%polly.access.Packed_B958 = getelementptr double, double* %polly.access.cast.Packed_B953, i64 %polly.access.add.Packed_B957
%tmp2_p_scalar_959 = load double, double* %polly.access.Packed_B958, align 8, !alias.scope !6, !noalias !8
%p_mul960 = fmul double %tmp1_p_scalar_952, %tmp2_p_scalar_959
%p_add961 = fadd double %tmp_p_scalar_945, %p_mul960
%polly.access.C1962 = getelementptr double, double* %C1, i64 0
%tmp3_p_scalar_963 = load double, double* %polly.access.C1962, align 8, !alias.scope !3, !noalias !13
%p_add18964 = fadd double %tmp3_p_scalar_963, %p_add961
%scevgep965 = getelementptr [1024 x double], [1024 x double]* %C, i64 %462, i64 %464
store double %p_add18964, double* %scevgep965, align 8, !alias.scope !138, !noalias !139
%polly.indvar_next141 = add nsw i64 %polly.indvar140, 1
%polly.loop_cond142 = icmp sle i64 %polly.indvar_next141, 255
br i1 %polly.loop_cond142, label %polly.loop_header137, label %polly.loop_exit139

Let's consider how accesses to the matrix C are formed. Indices of the first dimension are equal:



%459 = mul nsw i64 96, %polly.indvar98
%460 = mul nsw i64 4, %polly.indvar134
%461 = add nsw i64 %459, %460
%462 = add nsw i64 %461, 3



%467 = mul nsw i64 96, %polly.indvar98
%468 = mul nsw i64 4, %polly.indvar134
%469 = add nsw i64 %467, %468
%470 = add nsw i64 %469, 3



Indices of the second dimension are equal too:



%463 = mul nsw i64 8, %polly.indvar128
%464 = add nsw i64 %463, 7



%471 = mul nsw i64 8, %polly.indvar128
%472 = add nsw i64 %471, 7



However, to store a value we access the two dimensional array:

%scevgep965 = getelementptr [1024 x double], [1024 x double]* %C, i64 %462, i64 %464

store double %p_add18964, double* %scevgep965, align 8

To read the value we access the one dimensional array:

%polly.access.cast.C941 = bitcast [1024 x double]* %C to double*



%polly.access.mul.C942 = mul nsw i64 %470, 1024



%polly.access.add.C943 = add nsw i64 %polly.access.mul.C942, %472
%polly.access.C944 = getelementptr double, double* %polly.access.cast.C941, i64 %polly.access.add.C943
%tmp_p_scalar_945 = load double, double* %polly.access.C944, align 8

Consequently, as far as I understand, we have two different base pointers that point to the same location. Since we compare raw pointers to determine a second level alias set, Polly generates different second level alias set for these read and write accesses to the matrix C. In case of DeLiCM, we have a similar situation. However, I haven't manage to get a reduced test case.

Probably, it'd be good to fix the problem of the redundant code generation of Polly. Unfortunately, I'm busy at the moment and can't do it. In any case, I think that it makes sense to make the second level aliasing use the SCEV information, since it makes it more robust.

> 2. What exactly must be checked in the new test case here.

Check that we don't create different alias sets for locations represented by different raw pointers.

> What was generated before, why was this wrong,

Previously, we generated 64 second level alias sets instead of 32 second level alias sets, since we comparee raw pointers to determine second level alias sets.

> and what does the new code generate.

32 second level alias sets.

> Also, I may have missed this earlier. Why are the number of elements in the check lines growing so much. Is this expected?



> CHECK-NEXT: !43 = !{!3, !4, !0, !6, !7, !11, !14, !16, !18, !20, !22, !24, !26, !28, !30, !32, !34, !36, !38, !40}

I think this is expected. The innermost loop computes 32 different elements of the matrix C. Consequently, 32 different second level alias sets are generated.

P.S.: Sorry, I've found out that we check the wrong output. I've updated the test case.


https://reviews.llvm.org/D35761

Files:
include/polly/CodeGen/IRBuilder.h
lib/CodeGen/IRBuilder.cpp
test/ScheduleOptimizer/kernel_gemm___%for.body---%for.end24.jscop
test/ScheduleOptimizer/kernel_gemm___%for.body---%for.end24.jscop.transformed
test/ScheduleOptimizer/pattern-matching-based-opts_12.ll

D35761.109861.patch

Tobias Grosser via Phabricator

unread,
Aug 5, 2017, 5:11:06 PM8/5/17
to garee...@gmail.com, phabr...@grosser.es, ll...@meinersbur.de, doer...@cs.uni-saarland.de, siddu...@gmail.com, llvm-c...@lists.llvm.org, poll...@googlegroups.com
grosser accepted this revision.
grosser added a comment.
This revision is now accepted and ready to land.

OK, that's fine for me than. Can you possibly add to the test case check lines for the load and store to illustrate to which alias data they refer to?


https://reviews.llvm.org/D35761



Phabricator via Phabricator

unread,
Aug 8, 2017, 12:52:41 PM8/8/17
to garee...@gmail.com, phabr...@grosser.es, ll...@meinersbur.de, doer...@cs.uni-saarland.de, siddu...@gmail.com, llvm-c...@lists.llvm.org, poll...@googlegroups.com
This revision was automatically updated to reflect the committed changes.
Closed by commit rL310380: Use SCEV information for the second level aliasing (authored by romangareev).

Changed prior to commit:
https://reviews.llvm.org/D35761?vs=109861&id=110221#toc

Repository:
rL LLVM

https://reviews.llvm.org/D35761

Files:
polly/trunk/include/polly/CodeGen/IRBuilder.h
polly/trunk/lib/CodeGen/IRBuilder.cpp
polly/trunk/test/ScheduleOptimizer/kernel_gemm___%for.body---%for.end24.jscop
polly/trunk/test/ScheduleOptimizer/kernel_gemm___%for.body---%for.end24.jscop.transformed
polly/trunk/test/ScheduleOptimizer/pattern-matching-based-opts_14.ll

D35761.110221.patch

Roman via Phabricator

unread,
Aug 8, 2017, 12:57:04 PM8/8/17
to garee...@gmail.com, phabr...@grosser.es, ll...@meinersbur.de, doer...@cs.uni-saarland.de, siddu...@gmail.com, llvm-c...@lists.llvm.org, poll...@googlegroups.com
gareevroman added a comment.

In https://reviews.llvm.org/D35761#833110, @grosser wrote:

> OK, that's fine for me than. Can you possibly add to the test case check lines for the load and store to illustrate to which alias data they refer to?


AFAIU, it'd require to add IR and make it dependent on the generated code. I'm not sure that it's a good approach.

Tobias Grosser via Phabricator

unread,
Aug 8, 2017, 1:00:48 PM8/8/17
to garee...@gmail.com, phabr...@grosser.es, ll...@meinersbur.de, doer...@cs.uni-saarland.de, siddu...@gmail.com, llvm-c...@lists.llvm.org, poll...@googlegroups.com
grosser reopened this revision.
grosser added a comment.
This revision is now accepted and ready to land.

I am slightly confused. Don't we add IR in all our test cases?

Roman via Phabricator

unread,
Oct 7, 2019, 8:41:11 AM10/7/19
to garee...@gmail.com, phabr...@grosser.es, ll...@meinersbur.de, jdoe...@anl.gov, siddu...@gmail.com, jav...@graphcore.ai, llvm-c...@lists.llvm.org, poll...@googlegroups.com, w...@google.com, czhe...@cn.ibm.com, felipe.de.aze...@intel.com
This revision was automatically updated to reflect the committed changes.
Closed by commit rG1563f039f504: Use SCEV information for the second level aliasing (authored by gareevroman).
Herald added a subscriber: javed.absar.
Herald added a project: LLVM.

Changed prior to commit:
https://reviews.llvm.org/D35761?vs=110221&id=223531#toc

Repository:
rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D35761/new/

https://reviews.llvm.org/D35761

Files:
polly/include/polly/CodeGen/IRBuilder.h
polly/lib/CodeGen/IRBuilder.cpp
polly/test/ScheduleOptimizer/kernel_gemm___%for.body---%for.end24.jscop
polly/test/ScheduleOptimizer/kernel_gemm___%for.body---%for.end24.jscop.transformed
polly/test/ScheduleOptimizer/pattern-matching-based-opts_14.ll

D35761.223531.patch

Michael Kruse via Phabricator

unread,
Sep 20, 2021, 11:20:46 PM9/20/21
to garee...@gmail.com, phabr...@grosser.es, jdoe...@anl.gov, siddu...@gmail.com, jav...@graphcore.ai, poll...@googlegroups.com, bhuvanend...@amd.com, yanli...@intel.com, doug...@gmail.com
Meinersbur added a reverting change: rGcad9f98a2ad9: [Polly] Don't generate inter-iteration noalias metadata..
Reply all
Reply to author
Forward
0 new messages