Hi Matteo,
I think Roman (CC'ed) had a similar problem recently.
IIRC, the solution that was discussed was to enable memory promotion of the alloca into a vector.
Once that happen the existing instcombine logic might already create your select.
If the above turns out to be not sufficient for a more complex case, you might want to
consider building something on top of https://reviews.llvm.org/D80991, or maybe Shinji (CC'ed)
might even ;)
Cheers,
Johannes
, the following input can be used: define i64 @_Z3srcm(i64 %Flags) { entry: %Memory = alloca [2 x i64], align 16 %and = lshr i64 %Flags, 6 %shr = and i64 %and, 1 %0 = bitcast [2 x i64]* %Memory to i8* %arrayidx = getelementptr inbounds [2 x i64], [2 x i64]* %Memory, i64 0, i64 0 store i64 5369966919, i64* %arrayidx, align 16 %arrayidx1 = getelementptr inbounds [2 x i64], [2 x i64]* %Memory, i64 0, i64 1 store i64 5369966790, i64* %arrayidx1, align 8 %arrayidx2 = getelementptr inbounds [2 x i64], [2 x i64]* %Memory, i64 0, i64 %shr %1 = load i64, i64* %arrayidx2, align 8 ret i64 %1 } define i64 @_Z3tgtm(i64 %Flags) { entry: %0 = and i64 %Flags, 64 %trunc = icmp eq i64 %0, 0 %. = select i1 %trunc, i64 5369966919, i64 5369966790 ret i64 %. } At this point I decided to replicate the 'tgt' function logic and coded a quick LLVM pass that: 1. uses the known/unknown computed bits information to identify a non-volatile 'load' instruction that uses an index proved to have only two possible values; 2. check if there's at least one 'store' to the accessed buffer using one of the two indexes; 3. converts the single 'load' instruction into two 'load' instructions using the concrete indexes; 4. generates a 'select' instruction that returns one of the two loaded values, using as condition a check on the index. The pass seems to be working fine, but I'm left wondering if LLVM is purposefully avoiding such an optimization, and if so what is the reason to do so (e.g. hard to prove that the optimization is actually going to improve the quality of the code, the logic I'm using is completely off the rails). Assuming the logic it's correct and this could be seen as a new custom optimization pass, what would be the suggested way to implement it in a solid and generic fashion? Thanks for any insight, Matteo
_______________________________________________ LLVM Developers mailing list llvm...@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> IIRC, the solution that was discussed was to enable memory promotion of the alloca into a vector.
I'm working on that :)
But i'm not sure if we will be okay doing that unconditionally,
is spilling going to be able to undo all that?
> Once that happen the existing instcombine logic might already create your select.
>
> If the above turns out to be not sufficient for a more complex case, you might want to
> consider building something on top of https://reviews.llvm.org/D80991, or maybe Shinji (CC'ed)
> might even ;)
>
> Cheers,
> Johannes
Roman
Assuming the logic it's correct and this could be seen as a new custom optimization pass, what would be the suggested way to implement it in a solid and generic fashion?