However, I discovered that LoopVectorize.cpp
(http://llvm.org/docs/doxygen/html/LoopVectorize_8cpp_source.html) has the method
InnerLoopVectorizer::getOrCreateTripCount(), which seems to do a better job of computing the
trip count, even though the implementations do not differ much. The differences are subtle:
first of all, getOrCreateTripCount() doesn't call
hasLoopInvariantBackedgeTakenCount().
Please don't hesitate to comment on why InnerLoopVectorizer::getOrCreateTripCount()
works better. I will try to come back with more info myself.
Thank you,
Alex
PS: Could you please recommend one important paper on scalar evolution?
_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
I have not tried InnerLoopVectorizer::getOrCreateTripCount(), which does not exist in LLVM 3.4.2.
I suppose I should first try the latest LLVM release, but at the moment I'm having problems accessing llvm.org to download it.
For the paper on SCEV, I have read
O. Bachmann, P. S. Wang, and E. V. Zima. Chains of recurrences - a method to expedite the
evaluation of closed-form functions. In ISSAC ’94, pages 242–249. ACM, July 1994.
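To give a feel for the idea, here is a minimal sketch of how a chain of recurrences {c0,+,c1,+,...,+,ck} can be evaluated at iteration n, either by stepping or in closed form as the sum of c_j * C(n, j). The function names evalAddRec and evalByStepping are my own, chosen for illustration; this is not LLVM code, and it ignores overflow and wrap flags.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Closed form: value of {c0,+,c1,+,...,+,ck} at iteration n (n >= 0)
// is sum over j of coeffs[j] * C(n, j).
int64_t evalAddRec(const std::vector<int64_t>& coeffs, int64_t n) {
  int64_t result = 0;
  int64_t binom = 1;  // C(n, 0)
  for (std::size_t j = 0; j < coeffs.size(); ++j) {
    result += coeffs[j] * binom;
    // C(n, j+1) = C(n, j) * (n - j) / (j + 1); the division is exact.
    binom = binom * (n - static_cast<int64_t>(j)) / static_cast<int64_t>(j + 1);
  }
  return result;
}

// Reference: advance the recurrence one iteration at a time.
// Each step adds the (old) next coefficient into the current one.
int64_t evalByStepping(std::vector<int64_t> state, int64_t n) {
  for (int64_t it = 0; it < n; ++it)
    for (std::size_t j = 0; j + 1 < state.size(); ++j)
      state[j] += state[j + 1];
  return state.empty() ? 0 : state[0];
}
```

For example, evalAddRec({1, 1}, n) models the recurrence {1,+,1}, i.e. the value 1 + n at iteration n.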
Best,
Andrew
On Friday, 19 May 2017, 3:03, via llvm-dev <llvm...@lists.llvm.org> wrote:
Message: 3
Date: Thu, 18 May 2017 21:30:33 +0300
From: Alex Susu via llvm-dev <llvm...@lists.llvm.org>
To: llvm-dev <llvm...@lists.llvm.org>
Subject: [llvm-dev] Computing loop trip counts with Scalar evolution
Message-ID: <079d36ea-a81f-5281...@gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Your function is weird because the compiler can't prove that the
induction variables don't overflow. Here's the output of "clang -O2
-emit-llvm -S | opt -analyze -scalar-evolution":
Determining loop execution counts for: @MatMul
Loop %for.body13: Unpredictable backedge-taken count.
Loop %for.body13: Unpredictable max backedge-taken count.
Loop %for.body13: Predicated backedge-taken count is (-1 + %Acols)
Predicates:
{1,+,1}<%for.body13> Added Flags: <nssw>
Loop %for.body6: Unpredictable backedge-taken count.
Loop %for.body6: Unpredictable max backedge-taken count.
Loop %for.body6: Predicated backedge-taken count is (-1 + %Bcols)
Predicates:
{1,+,1}<%for.body6> Added Flags: <nssw>
Loop %for.cond2.preheader: Unpredictable backedge-taken count.
Loop %for.cond2.preheader: Unpredictable max backedge-taken count.
Loop %for.cond2.preheader: Predicated backedge-taken count is (-1 + %Arows)
Predicates:
{1,+,1}<%for.cond2.preheader> Added Flags: <nssw>
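To illustrate what a predicated count means, here is a small sketch under an assumption of mine: the induction variable is narrower than the loop bound. The MatMul source is not shown in this thread, so the int8_t loop below is a hypothetical stand-in, and both function names are made up for the example. The count n - 1 is only valid when the predicate "{1,+,1} does not signed-wrap" holds; otherwise the loop never reaches the bound.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical loop being modelled (an assumption, not the MatMul code):
//   for (int8_t i = 0; i < n; ++i) { ... }   // n is a 32-bit value

// Backedge-taken count under the no-signed-wrap predicate on {1,+,1}.
// Returns nullopt when the loop is not entered or the predicate fails.
std::optional<int64_t> predicatedBackedgeTakenCount(int32_t n) {
  if (n < 1) return std::nullopt;         // loop guard fails, body never runs
  if (n > INT8_MAX) return std::nullopt;  // i + 1 would wrap before reaching n
  return static_cast<int64_t>(n) - 1;
}

// Brute-force reference: run the loop with explicit two's-complement
// wraparound and count backedges; give up after maxIters backedges.
std::optional<int64_t> simulateBackedges(int32_t n, int64_t maxIters) {
  int8_t i = 0;
  if (!(i < n)) return std::nullopt;
  int64_t backedges = 0;
  while (true) {
    // (loop body would execute here)
    i = (i == INT8_MAX) ? INT8_MIN : static_cast<int8_t>(i + 1);
    if (!(i < n)) return backedges;  // exit without taking the backedge
    if (++backedges > maxIters) return std::nullopt;  // treated as non-terminating
  }
}
```

A client of the predicated count would have to check the predicate at run time (here, n <= INT8_MAX) before relying on n - 1, and fall back to the original loop otherwise.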
-Eli
--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
We have the predicated backedge-taken count, so the loop should be vectorizable.
I think the fact that getOrCreateTripCount doesn't use the predicated backedge-taken count is probably a bug (and might blow up under certain conditions).
-Silviu