Polly usage issues

27 views

Skip to first unread message

Ejjeh, Adel

unread,

Aug 18, 2022, 7:01:04 PM8/18/22

to poll...@googlegroups.com

Hello

I recently picked up an old project that I worked on using polly (I was using polly on llvm 9.0 at the time) with the goal of expanding the functionality that I had. The general idea is that I to detect when I have two functions that have a producer / consumer relationship and pipeline them. This is part of a bigger project where I am building a compiler for FPGA code-generation under the HPVM project (hpvm.cs.illinois.edu). I use polly to determine when the access patterns of the producer/consumer do not match and transform the consumer to make the pipeline generation legal. My original implementation handled very basic cases with lots of limitations, and at the time the approach I took after getting feedback and advice from this mailing list was to combine the producer and consumer functions into a single function so that the Scops and Relations in the producer/consumer have the same ISL context allowing me to manipulate and compare the schedules and access relations. The issue I am facing now, as I try to expand the functionality to support more general loops (including nesting) and more complex access patterns, is that Polly is not always detecting the Scops, and in some cases it is detecting a single Scop spanning both loops.

I have the following example:

Producer:

for (size_t i = 0; i < row_size; ++i) {

out1_in2[i] = in[i] * 2;

}

Consumer:

for (size_t i = 0; i < row_size; ++i) {

size_t Left = i - 1;

size_t Middle = i;

size_t Right = i + 1;

float in_L = (i > 0) ? out1_in2[Left] : 0;

float in_M = out1_in2[Middle];

float in_R = (i < row_size) ? out1_in2[Right] : 0;

out2_in3[i] = (in_L + in_M + in_R) /

}

My pass first combines the two functions into a single merged function, then runs the ScopInfoAnalysis on the merged function as such:

auto &MergedSI = FAM.getResult<ScopInfoAnalysis>(*FMerged);

In the above example, I noticed that there is only one Scop generated for the merged function. I ran through gdb and noticed that in the function ScopDetection::FindScops(), first the two subregions are detected as valid regions, but then, it tries to expand the region in the following code:

for (Region *CurrentRegion : ToExpand) {

// Skip invalid regions. Regions may become invalid, if they are element of

// an already expanded region.

if (!ValidRegions.count(CurrentRegion))

continue;

// Skip regions that had errors.

bool HadErrors = lookupRejectionLog(CurrentRegion)->hasErrors();

if (HadErrors)

continue;

Region *ExpandedR = expandRegion(*CurrentRegion);

if (!ExpandedR)

continue;

R.addSubRegion(ExpandedR, true);

ValidRegions.insert(ExpandedR);

removeCachedResults(*CurrentRegion);

removeCachedResultsRecursively(*ExpandedR);

}

This leads to it merging the two regions of the two loops into a single region and generating a Scop for that region instead. I cannot seem to understand why this is happening, and I was wondering if there is a way to enforce that it will return two separate Scops for the separate loops? Would it be better if I manually built the Scops for the loops that I am interested in instead of using the ScopInfoAnalysis pass?

One thing that I have noticed through some debugging and testing I’ve done is that the merging of the regions is occurring when the loops do not have loop guard branches. In this example (and the one below) the loop guard branches for these loops are being explicitly removed by some of our passes. I’ve tested a different case where the loops have guard branches and I do get two separate scops for the producer and the consumer. I am wondering if I should instead be manually creating scops using ScopBuilder while specifying the regions that I need the Scop for to enforce having two separate Scops for the producer/consumer in these scenarios. Any advice here would be appreciated!

Another problem I am facing when using nested loops as such:

Producer:

for (size_t i = 0; i < row_size; ++i) {

for (size_t j = 0; j < col_size; ++j) {

out1_in2[i * col_size + j] = in[i * col_size + j] * 2;

}

Consumer:

for (size_t i = 0; i < row_size; ++i) {

for (size_t j = 0; j < col_size; ++j) {

size_t TopLeft = (i - 1) * col_size + j - 1;

size_t TopMiddle = (i - 1) * col_size + j;

size_t TopRight = (i - 1) * col_size + j + 1;

size_t CenterLeft = i * col_size + j - 1;

size_t CenterMiddle = i * col_size + j;

size_t CenterRight = i * col_size + j + 1;

size_t BottomLeft = (i + 1) * col_size + j - 1;

size_t BottomMiddle = (i + 1) * col_size + j;

size_t BottomRight = (i + 1) * col_size + j + 1;

float in_TL = (i > 0 && j > 0) ? out1_in2[TopLeft] : 0;

float in_TM = (i > 0) ? out1_in2[TopMiddle] : 0;

float in_TR = (i > 0 && j < col_size) ? out1_in2[TopRight] : 0;

float in_CL = (j > 0) ? out1_in2[CenterLeft] : 0;

float in_CM = out1_in2[CenterMiddle];

float in_CR = (j < col_size) ? out1_in2[CenterRight] : 0;

float in_BL = (i < row_size && j > 0) ? out1_in2[BottomLeft] : 0;

float in_BM = (i < row_size) ? out1_in2[BottomMiddle] : 0;

float in_BR = (i < row_size && j < col_size) ? out1_in2[BottomRight] : 0;

out2_in3[i * col_size + j] = (in_TL + in_TM + in_TR + in_CL + in_CM +

in_CR + in_BL + in_BM + in_BR) /

}

In this scenario the regions are also getting merged, but now no Scops are generated. Instead, ScopBuilder is generating a message: “SCoP detected but dismissed” which seems to be coming from the ScopBuilder() constructor. Any ideas about this as well will also be helpful.

Thanks

-Adel

Michael Kruse

unread,

Aug 22, 2022, 9:04:16 PM8/22/22

to Ejjeh, Adel, poll...@googlegroups.com

Hi Ejjeh,

in the usual use case you will want the SCoP to be as large as
possible. Larger SCoPs mean more opportunities for optimizations. For
instance, with two sequential loops in the same SCoP, they can be
fused into a single loop. If fusing is not profitable, they can still
be optimized independently. If you don't want it, you could add an
option that causes `ScopDetection::expandRegion` to exit immediately.
However, you use case sounds like you would rather like to have more
control over the SCoP region in general since ScopDetection might also
find a SCoP too small rather than too large. In this case you might
want to do what findScops does from your own code on the region you
want to be analyzed. isValidRegion will tell you whether it actually
is a proper SCoP or not.

Using ScopBuilder alone is not a solution since it depends on
DetectionContext and others created by ScopDetection.

"SCoP detected but dismissed" is a result of
hasFeasibleRuntimeContext() returning false. This may have many
reasons, one of them is that the polyhedral representation becomes too
complicated. This is likely the cause here since size_t requires
integer over/underflow checks.
Use -Rpass-analysis=polly-scops to emit which assumptions are required.

Michael

--
Tardyzentrismus verboten!

Reply all

Reply to author

Forward

0 new messages