[llvm-dev] Alignment analysis


Thomas Raoux via llvm-dev

Jan 25, 2022, 1:27:15 AM
to llvm...@lists.llvm.org

I have been looking for a way to propagate alignment information to
load/store instructions. I see that instcombine does some alignment
propagation based on the value-tracking helper computeKnownBits;
however, because of a compile-time heuristic, it won't recurse much
when the value comes from a phi.

Is there a way to do more aggressive alignment propagation? I'm
generating code going through the NVPTX target and because of the
missing alignment information the backend will have to break up vector
loads into scalar ones causing performance problems. (I also tried
AlignmentFromAssumptionsPass but it doesn't help).

I assume this may be a problem for other backends as well. How is it handled?

Here is a simple example showing how limited the propagation is:

In the code below, `p[offset]` should be known to be aligned, since the
offset comes from a phi(0, offset << 8). However, the alignment
analysis doesn't detect it:

float *getAlignedPtr() __attribute__((assume_aligned (32)));
void anchor();

float loadaligned(int offset, bool c) {
  float* p = getAlignedPtr();
  int offset1 = offset << 8;
  if(c) {
    offset = offset1;
  } else {
    offset = 0;
  }
  // p[offset1] and p[0] are marked as aligned but not p[offset]
  return p[offset] + p[offset1] + p[0];
}
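
For reference, the reasoning behind the expected alignment can be spelled out as trailing-zero arithmetic. The sketch below is only a toy model, not LLVM code. The helper names (`shlTrailingZeros`, `phiTrailingZeros`, `elementAlign`) are made up for illustration, and it assumes a power-of-two element size.

```cpp
#include <algorithm>
#include <cstdint>

// Toy model: track how many low bits of an index are known to be zero.
// kWordBits means "the value is the constant 0".
constexpr unsigned kWordBits = 64;

// x << amount adds `amount` known-zero low bits.
unsigned shlTrailingZeros(unsigned tz, unsigned amount) {
    return std::min(kWordBits, tz + amount);
}

// A phi can only promise what every incoming value promises.
unsigned phiTrailingZeros(unsigned a, unsigned b) {
    return std::min(a, b);
}

// Lower bound on the alignment (in bytes) of base + index * elemSize,
// where elemSize == 1 << log2ElemSize.
uint64_t elementAlign(uint64_t baseAlign, unsigned indexTZ,
                      unsigned log2ElemSize) {
    unsigned offsetTZ = std::min(63u, indexTZ + log2ElemSize);
    uint64_t offsetAlign = uint64_t{1} << offsetTZ;
    return std::min(baseAlign, offsetAlign);
}
```

With an unknown `offset` (0 known trailing zeros), `offset << 8` has 8, the constant 0 has all bits zero, so the phi has 8. With 4-byte floats the byte offset is a multiple of 1024, so `elementAlign(32, 8, 2)` gives 32. Without the phi information, `elementAlign(32, 0, 2)` gives only 4.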


Johannes Doerfert via llvm-dev

Jan 25, 2022, 10:00:40 AM
to Thomas Raoux, llvm...@lists.llvm.org
Hi Thomas,

the `sext` of offset confuses the alignment computation.
If you make offset and offset1 `long int` it seems to work
fine: https://godbolt.org/z/zYdro6YPz

I haven't dug in deeper but this might help. It is probably worth
filing a bug/issue.

~ Johannes

Thomas Raoux via llvm-dev

Jan 25, 2022, 11:34:06 AM
to Johannes Doerfert, llvm...@lists.llvm.org
Hi Johannes,

It is not really the `sext` that is confusing the analysis. The
problem is that computeKnownBits will only go one level up past the
phi, so if there is any instruction between the `shl` and the phi,
the analysis fails.

Here is the same example with long int and an extra `add` in between:

This is done on purpose, because of this limitation in the phi handling:
// Recurse, but cap the recursion to one level, because we don't
// want to waste time spinning around in loops.
computeKnownBits(IncValue, Known2, MaxAnalysisRecursionDepth - 1, RecQ);
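
To make the effect of that cap concrete, here is a small self-contained toy model. It is not LLVM's implementation: `Node`, `knownTrailingZeros`, and the `phiBudget` parameter are hypothetical, and `kMaxDepth = 6` is assumed to mirror MaxAnalysisRecursionDepth. Constants are answered before the depth check, which is also an assumption of this sketch.

```cpp
#include <algorithm>
#include <vector>

// Toy IR node (hypothetical, not an LLVM type).
struct Node {
    enum Kind { Unknown, Leaf, Shl, Add, Phi } kind;
    unsigned amount;               // Leaf: known trailing zeros; Shl: shift amount
    std::vector<const Node*> ops;  // operands
};

constexpr unsigned kMaxDepth = 6;  // assumed recursion limit
constexpr unsigned kAllBits = 64;  // "value is the constant 0"

// Known trailing zeros of n. phiBudget is how many levels may be
// explored past a phi; 1 models the cap quoted above.
unsigned knownTrailingZeros(const Node& n, unsigned depth, unsigned phiBudget) {
    if (n.kind == Node::Leaf) return n.amount;  // constants need no recursion
    if (depth >= kMaxDepth) return 0;           // out of budget: know nothing
    switch (n.kind) {
    case Node::Shl:
        return std::min(kAllBits,
            knownTrailingZeros(*n.ops[0], depth + 1, phiBudget) + n.amount);
    case Node::Add:  // tz(a + b) >= min(tz(a), tz(b))
        return std::min(knownTrailingZeros(*n.ops[0], depth + 1, phiBudget),
                        knownTrailingZeros(*n.ops[1], depth + 1, phiBudget));
    case Node::Phi: {
        // Jump close to the limit so only phiBudget levels remain.
        unsigned capped = kMaxDepth - std::min(phiBudget, kMaxDepth);
        unsigned childDepth = std::max(depth + 1, capped);
        unsigned result = kAllBits;
        for (const Node* op : n.ops)
            result = std::min(result,
                              knownTrailingZeros(*op, childDepth, phiBudget));
        return result;
    }
    default:
        return 0;  // Unknown
    }
}
```

In this model, phi(0, x << 8) yields 8 known trailing zeros with a budget of 1, because the `shl` sits directly under the phi. For phi(0, (x << 8) + 1024) the `shl` is one level further away, so the result is 0 with a budget of 1 and 8 with a budget of 2.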


Johannes Doerfert via llvm-dev

Jan 25, 2022, 12:16:22 PM
to Thomas Raoux, llvm...@lists.llvm.org
Good point.

Cut-offs are always tricky. One could certainly make
this a command line option though.

~ Johannes
