Hello all,
In https://www.youtube.com/watch?v=qq0q1hfzidg, Adam Nemet (cc'ed) describes optimization remarks and some future plans for the project. I had a few follow-up questions:
1. As an example of future work to be done, the talk mentions expanding the set of optimization passes that emit remarks. However, the Clang User Manual mentions that "optimization remarks do not really make sense outside of the major transformations (e.g.: inlining, vectorization, loop optimizations)." [1] I am wondering: which passes exist today that are most in need of supporting optimization remarks? Should all passes emit optimization remarks, or are there indeed passes for which optimization remarks "do not make sense"?
2. I tried running llvm/utils/opt-viewer/opt-viewer.py to produce an HTML dashboard for the optimization remark YAML generated from a large C++ program. Unfortunately, the Python script does not finish, even after over an hour of processing. It appears performance has been brought up before by Bob Haarman (cc'ed), and some optimizations have been made since. [2] I wonder if I'm passing in bad input (6,000+ YAML files -- too many?), or if there's still some room to speed up the opt-viewer.py script? I tried the C++ implementation as well, but that never completed either. [3]
Overall I'm excited to make greater use of optimization remarks, and to contribute in any way I can. Please let me know if you have any thoughts on my questions above!
- Brian Gesiak
_______________________________________________ LLVM Developers mailing list llvm...@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-- Hal Finkel Lead, Compiler Technology and Programming Languages Leadership Computing Facility Argonne National Laboratory
On Jun 20, 2017, at 1:13 AM, Brian Gesiak <modo...@gmail.com> wrote:
> Hello all,
> In https://www.youtube.com/watch?v=qq0q1hfzidg, Adam Nemet (cc'ed) describes optimization remarks and some future plans for the project. I had a few follow-up questions:
> 1. As an example of future work to be done, the talk mentions expanding the set of optimization passes that emit remarks. However, the Clang User Manual mentions that "optimization remarks do not really make sense outside of the major transformations (e.g.: inlining, vectorization, loop optimizations)." [1] I am wondering: which passes exist today that are most in need of supporting optimization remarks? Should all passes emit optimization remarks, or are there indeed passes for which optimization remarks "do not make sense"?
> 2. I tried running llvm/utils/opt-viewer/opt-viewer.py to produce an HTML dashboard for the optimization remark YAML generated from a large C++ program. Unfortunately, the Python script does not finish, even after over an hour of processing. It appears performance has been brought up before by Bob Haarman (cc'ed), and some optimizations have been made since. [2] I wonder if I'm passing in bad input (6,000+ YAML files -- too many?), or if there's still some room to speed up the opt-viewer.py script? I tried the C++ implementation as well, but that never completed either. [3]
Adam, thanks for all the suggestions!
One nice aspect of the `-Rpass` family of options is that I can filter based on what I want. If I only want to see which inlines I missed, I could use `clang -Rpass-missed="inline"`, for example. On the other hand, the optimization remark YAML always includes remarks from all passes (as far as I can tell), which increases the time opt-viewer.py and other tools take to parse it. Would you be open to including options to, for example, only emit optimization remarks related to loop vectorization, or to not emit any analysis remarks? Or is it important that the YAML always include all remarks?
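To illustrate the kind of filtering being asked about, here is a minimal Python sketch that post-filters a remarks file by pass and remark kind without doing a full YAML parse. It assumes each remark is a separate YAML document that begins with a `--- !<Kind>` line and contains a `Pass:` line; the function names and the sample records are made up for illustration and are not part of opt-viewer.

```python
import re

# Matches the "Pass:" key at the start of a line inside one remark document.
PASS_RE = re.compile(r"^Pass:\s*(\S+)", re.MULTILINE)

# Hand-written sample in the assumed layout (not real compiler output).
SAMPLE = """\
--- !Missed
Pass:            inline
Name:            NoDefinition
Function:        main
...
--- !Passed
Pass:            loop-vectorize
Name:            Vectorized
Function:        kernel
...
--- !Analysis
Pass:            loop-vectorize
Name:            CantVectorize
Function:        kernel
...
"""


def split_remarks(text):
    """Split a remarks stream into one string per '--- !<Kind>' document."""
    docs, current = [], None
    for line in text.splitlines(keepends=True):
        if line.startswith("--- !"):
            if current is not None:
                docs.append("".join(current))
            current = [line]
        elif current is not None:
            current.append(line)
    if current is not None:
        docs.append("".join(current))
    return docs


def filter_remarks(text, passes=None, kinds=None):
    """Keep only remarks whose pass and kind are in the given sets.

    A value of None for either argument means "do not filter on this".
    """
    kept = []
    for doc in split_remarks(text):
        kind = doc.splitlines()[0][len("--- !"):].strip()
        match = PASS_RE.search(doc)
        pass_name = match.group(1) if match else None
        if passes is not None and pass_name not in passes:
            continue
        if kinds is not None and kind not in kinds:
            continue
        kept.append(doc)
    return "".join(kept)
```

For example, `filter_remarks(SAMPLE, passes={"inline"}, kinds={"Missed"})` keeps only the first record. Filtering after the fact like this still pays the cost of writing every remark, which is why an emit-time option would be more effective.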
Let me know what you think! In the meantime, I'll try to add the progress bar you mention in llvm.org/PR33522. Thanks!
- Brian Gesiak
Ah, I see, that makes sense. Thanks!
> On Tue, Jun 20, 2017 at 1:50 AM, Adam Nemet <ane...@apple.com> wrote:
>
> We desperately need a progress bar in opt-viewer. Let me know if you want to add it; otherwise I will. I filed llvm.org/PR33522 for this.
>
> In terms of improving the performance, I am pretty sure the bottleneck is still YAML parsing so:
>
> - If PGO is used, we can have a threshold to not even emit remarks on cold code; this should dramatically improve performance (llvm.org/PR33523).
> - I expect that some sort of binary encoding of YAML would speed up parsing but I haven’t researched this topic yet...
I added progress indicators in https://reviews.llvm.org/D34735, and it
seems like it takes a while for the Python scripts to read some of the
larger YAML files produced for my program. I'll try to look into
binary YAML encoding later this week.
A threshold preventing remarks from being emitted on cold code sounds
good to me as well. Hal, do you agree, or is this also something that
tools at a higher level should be responsible for ignoring?
Sounds good.
>
> A threshold preventing remarks from being emitted on cold code sounds
> good to me as well. Hal, do you agree, or is this also something that
> tools at a higher level should be responsible for ignoring?
I agree that this makes sense. Tools can also threshold at a higher 
level, but this kind of generic filtering can be done without exposing 
users to any unnecessary implementation details.
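As a rough sketch of the tool-side thresholding mentioned here, the Python below drops remarks below a hotness cutoff after they have been parsed. It assumes each parsed remark is a dict that may carry an integer `Hotness` entry (present only when profile data was used); `filter_by_hotness` and the sample values are hypothetical.

```python
def filter_by_hotness(remarks, threshold):
    """Return the remarks whose hotness is at least `threshold`.

    Remarks with no 'Hotness' entry are treated as hotness 0, so they are
    dropped by any positive threshold (an assumption of this sketch).
    """
    return [r for r in remarks if r.get("Hotness", 0) >= threshold]


# Hand-written sample records, not real compiler output.
SAMPLE_REMARKS = [
    {"Pass": "inline", "Name": "Inlined", "Hotness": 300},
    {"Pass": "gvn", "Name": "LoadElim", "Hotness": 2},
    {"Pass": "licm", "Name": "Hoisted"},  # no profile data for this one
]

hot_only = filter_by_hotness(SAMPLE_REMARKS, 100)
```

A filter like this only saves rendering time; the remarks still have to be emitted and parsed first, which is the cost an emit-time threshold would avoid.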
-Hal
>
> - Brian Gesiak
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
> On Tue, Jun 20, 2017 at 1:50 AM, Adam Nemet <ane...@apple.com> wrote:
>
> I expect that some sort of binary encoding of YAML would speed up parsing but I haven’t researched this topic yet...
I was under the impression that YAML had some sort of standard binary
encoding format, sort of like JSON and BSON [1], but this doesn't
appear to be the case. Did you have something specific in mind here,
or did you mean having optimization remarks optionally emit a
different, non-human-readable format? If so, MessagePack [2] or
protocol buffers [3] appear to be widely used.
[1] http://bsonspec.org
[2] http://msgpack.org/index.html
[3] https://developers.google.com/protocol-buffers/
- Brian Gesiak
You may want to look at these files with opt-stats and see what type of remarks are at the top of the list. If they are missed remarks, we may need to work harder to remove false positives.
Adam
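For readers who want a quick tally without running the full tooling, the following Python sketch counts remarks by kind, pass, and name. It is not opt-stats itself, only an approximation of the idea. It assumes each remark document starts with `--- !<Kind>` and lists `Pass:` before `Name:` at the start of a line; the function name and sample data are hypothetical.

```python
from collections import Counter

# Hand-written sample in the assumed layout (not real compiler output).
SAMPLE = """\
--- !Missed
Pass:            inline
Name:            NoDefinition
Function:        main
...
--- !Missed
Pass:            inline
Name:            NoDefinition
Function:        helper
...
--- !Passed
Pass:            inline
Name:            Inlined
Function:        main
...
"""


def remark_histogram(text):
    """Count remarks keyed by (kind, pass, name) using a line scan."""
    counts = Counter()
    kind = pass_name = None
    for line in text.splitlines():
        if line.startswith("--- !"):
            # A new remark document begins; reset the per-remark state.
            kind, pass_name = line[len("--- !"):].strip(), None
        elif line.startswith("Pass:"):
            pass_name = line.split(":", 1)[1].strip()
        elif line.startswith("Name:") and kind is not None:
            name = line.split(":", 1)[1].strip()
            counts[(kind, pass_name, name)] += 1
    return counts


histogram = remark_histogram(SAMPLE)
```

Calling `histogram.most_common(10)` then lists the most frequent remark types, which is the view that helps spot a flood of missed remarks.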
I didn’t really have anything specific in mind. If we want to completely change the file format for opt remarks from YAML, we probably want to carefully evaluate the options wrt the potential speed-up.
Adam
Hi,
I've been asked at $WORK to take a look at `-opt-remarks`, so here
are a couple of thoughts.
1) When LTO is on, the output isn't particularly easy to read. I guess
this can be mitigated with some filtering approach; Simon and I
discussed it offline.
2) Yes, indeed `opt-viewer` takes forever for large testcases to
process. I think that it could lead to exploring a better
representation than YAML which is, indeed, a little slow to parse. To
be honest, I'm torn about this.
YAML is definitely really convenient as we already use it somewhere in
tree, and it has an easy textual repr. OTOH, it doesn't seem to scale
that nicely.
3) There are lots of optimizations which are still missing from the
output, in particular PGO remarks (including, e.g. branch info
probabilities which still use the old API as far as I can tell
[PGOInstrumentation.cpp])
4) `opt-remarks` heavily relies on the fidelity of the DebugLoc
attached to instructions. Things get a little hairy at -O3 (or with
-flto) because there are optimization bugs so transformations don't
preserve debuginfo. This is not entirely orthogonal but something can
be worked on in parallel (bonus point, this would also help SamplePGO
& debuginfo experience). With `-flto` the problem gets amplified more,
as expected.
5) I found a couple of issues when trying the support, but I'm actively
working on them.
https://bugs.llvm.org/show_bug.cgi?id=33773
https://bugs.llvm.org/show_bug.cgi?id=33776
That said, I think optimization remarks support is coming along nicely.
--
Davide
The issue is twofold:
1) With LTO, the number of remarks generated skyrockets because whole
module visibility makes IPO more effective (i.e. you end up inlining
much more, etc.). As a side effect, more aggressive inlining/IPCP
exposes more intraprocedural optimizations, which in turn generate more
remarks.
2) As pointed out earlier, DI is not always reliable.
>
>
> 2) Yes, indeed `opt-viewer` takes forever for large testcases to
> process. I think that it could lead to exploring a better
> representation than YAML which is, indeed, a little slow to parse. To
> be honest, I'm torn about this.
> YAML is definitely really convenient as we already use it somewhere in
> tree, and it has an easy textual repr. OTOH, it doesn't seem to scale
> that nicely.
>
>
> Agreed.  We now have a mitigation strategy with -pass-remarks-hotness-threshold but this is something that we may have to solve in the long run.
>
At some point, I guess we might just slowly moving away from
>
>
> 3) There are lots of optimizations which are still missing from the
> output, in particular PGO remarks (including, e.g. branch info
> probabilities which still use the old API as far as I can tell
> [PGOInstrumentation.cpp])
>
>
> Yes, how about we file bugs for each pass that still uses the old API (I am looking at ICP today) and then we can split up the work and then finally remove the old API?
>
That sounds like a plan.
> Also on exposing PGO info, I have a patch that adds a pass I call HotnessDecorator.  The pass emits a remark for each basic block.  Then opt-viewer is made aware of these and the remarks are special-cased to show hotness for a line unless there is already a remark on the line.  The idea is that since we only show hotness as part of the remark if a block does not contain a remark we don’t see its hotness.  E.g.:
>
>
Yes, feel free to post for review once you have it ready.
>
>
> 4) `opt-remarks` heavily relies on the fidelity of the DebugLoc
> attached to instructions. Things get a little hairy at -O3 (or with
> -flto) because there are optimization bugs so transformations don't
> preserve debuginfo. This is not entirely orthogonal but something can
> be worked on in parallel (bonus point, this would also help SamplePGO
> & debuginfo experience). With `-flto` the problem gets amplified more,
> as expected.
>
> 5) I found a couple of issues when trying the support, but I'm actively
> working on them.
> https://bugs.llvm.org/show_bug.cgi?id=33773
> https://bugs.llvm.org/show_bug.cgi?id=33776
>
> That said, I think optimization remarks support is coming along nicely.
>
>
> Yes, I’ve been really happy with the progress. Thanks for all the help from everybody!
At some point, I guess we might just consider the generated HTML
report a fallback and have opt-remarks more integrated into the
developer's workflow.
I personally use Visual Studio daily to compile clang and it would be
nice to have remarks there as a plugin. I can imagine something
similar happening for Xcode/CLion/Emacs etc.
Thanks,
On Jul 14, 2017, at 8:21 AM, Davide Italiano via llvm-dev <llvm...@lists.llvm.org> wrote:
On Jul 14, 2017, at 10:22 AM, Davide Italiano <dav...@freebsd.org> wrote:
On Jul 18, 2017, at 11:49 AM, Sam Elliott <as...@cs.washington.edu> wrote:
I may grab a few of these bugs in the next few days.
Am I correct in thinking that only the following passes use the new OptimizationRemark system (or is searching for `INITIALIZE_PASS_DEPENDENCY(OptimizationRemarkEmitterWrapperPass)` not the correct way to find them)?
- GVN
- Loop Data Prefetch
- Loop Distribution
- Simplify Instructions
- Loop Vectorize
- SLP Vectorize
- Loop Interchange
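The textual search described in this message can be sketched in Python as below. It only reports source files that contain the macro invocation verbatim, so whether the result is a complete list of passes using the new system remains the open question asked above. The function name, the file suffixes, and any root directory passed in are assumptions for illustration.

```python
import os

# The exact text searched for, taken from the message above.
MARKER = "INITIALIZE_PASS_DEPENDENCY(OptimizationRemarkEmitterWrapperPass)"


def files_mentioning(root, marker=MARKER, suffixes=(".cpp", ".h")):
    """Return sorted paths (relative to `root`) of files containing `marker`."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" avoids failing on files that are not valid UTF-8.
            with open(path, encoding="utf-8", errors="replace") as source:
                if marker in source.read():
                    hits.append(os.path.relpath(path, root))
    return sorted(hits)
```

Running `files_mentioning(...)` on the directory that holds the transformation passes in a local checkout would give a starting list to compare against the one in the message.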