[llvm-dev] Next steps for optimization remarks?

Brian Gesiak via llvm-dev

unread,

Jun 19, 2017, 7:14:29 PM6/19/17

to llvm-dev, Xinliang David Li

Hello all,

In https://www.youtube.com/watch?v=qq0q1hfzidg, Adam Nemet (cc'ed) describes optimization remarks and some future plans for the project. I had a few follow-up questions:

1. As an example of future work to be done, the talk mentions expanding the set of optimization passes that emit remarks. However, the Clang User Manual mentions that "optimization remarks do not really make sense outside of the major transformations (e.g.: inlining, vectorization, loop optimizations)." [1] I am wondering: which passes exist today that are most in need of supporting optimization remarks? Should all passes emit optimization remarks, or are there indeed passes for which optimization remarks "do not make sense"?

2. I tried running llvm/utils/opt-viewer/opt-viewer.py to produce an HTML dashboard for the optimization remark YAML generated from a large C++ program. Unfortunately, the Python script does not finish, even after over an hour of processing. It appears performance has been brought up before by Bob Haarman (cc'ed), and some optimizations have been made since. [2] I wonder if I'm passing in bad input (6,000+ YAML files -- too many?), or if there's still some room to speed up the opt-viewer.py script? I tried the C++ implementation as well, but that never completed either. [3]

Overall I'm excited to make greater use of optimization remarks, and to contribute in any way I can. Please let me know if you have any thoughts on my questions above!

[1] https://clang.llvm.org/docs/UsersManual.html#options-to-emit-optimization-reports

[2] http://lists.llvm.org/pipermail/llvm-dev/2016-November/107039.html

[3] https://reviews.llvm.org/D26723

- Brian Gesiak

Hal Finkel via llvm-dev

unread,

Jun 19, 2017, 7:28:42 PM6/19/17

to Brian Gesiak, llvm-dev, Xinliang David Li

On 06/19/2017 06:13 PM, Brian Gesiak via llvm-dev wrote:

Hello all,

In https://www.youtube.com/watch?v=qq0q1hfzidg, Adam Nemet (cc'ed) describes optimization remarks and some future plans for the project. I had a few follow-up questions:

1. As an example of future work to be done, the talk mentions expanding the set of optimization passes that emit remarks. However, the Clang User Manual mentions that "optimization remarks do not really make sense outside of the major transformations (e.g.: inlining, vectorization, loop optimizations)." [1] I am wondering: which passes exist today that are most in need of supporting optimization remarks? Should all passes emit optimization remarks, or are there indeed passes for which optimization remarks "do not make sense"?

Obviously there is a continuous spectrum of transformation effects between "major" and "minor", and moreover, we have different consumers of the remarks. Remarks that would be too noisy if directly viewed by a human (because Clang prints them all, for example), might make perfect sense if interpreted by some tool. llvm-opt-report, for example, demonstrates how a tool can collect many remarks and aggregate them into a more succinct form.

If you're looking for an area to contribute, I'd recommend looking at how to better output (and display) the "why not" of transformations that didn't fire. Memory dependencies that block vectorization and loop-invariant code motion, for example, would be really useful if mapped back to source-level constructs for presentation to the user.

-Hal

2. I tried running llvm/utils/opt-viewer/opt-viewer.py to produce an HTML dashboard for the optimization remark YAML generated from a large C++ program. Unfortunately, the Python script does not finish, even after over an hour of processing. It appears performance has been brought up before by Bob Haarman (cc'ed), and some optimizations have been made since. [2] I wonder if I'm passing in bad input (6,000+ YAML files -- too many?), or if there's still some room to speed up the opt-viewer.py script? I tried the C++ implementation as well, but that never completed either. [3]

Overall I'm excited to make greater use of optimization remarks, and to contribute in any way I can. Please let me know if you have any thoughts on my questions above!

[1] https://clang.llvm.org/docs/UsersManual.html#options-to-emit-optimization-reports

[2] http://lists.llvm.org/pipermail/llvm-dev/2016-November/107039.html

[3] https://reviews.llvm.org/D26723

- Brian Gesiak

_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

Brian Gesiak via llvm-dev

unread,

Jun 19, 2017, 8:55:32 PM6/19/17

to Hal Finkel, llvm-dev, Xinliang David Li

Hal, thank you! I had forgotten about llvm-opt-report, even though it was mentioned during Adam's talk. I think a tool that parses the YAML, like llvm-opt-report, might be a way to sidestep my HTML generation performance problem as well. I'm most interested in, as you mentioned, transformations that were not applied, and surfacing those to users. Thanks for suggesting vectorization and LICM blockers, I'll look into those.

- Brian Gesiak

Adam Nemet via llvm-dev

unread,

Jun 20, 2017, 4:50:29 AM6/20/17

to Brian Gesiak, llvm-dev, Xinliang David Li

On Jun 20, 2017, at 1:13 AM, Brian Gesiak <modo...@gmail.com> wrote:

Hello all,

In https://www.youtube.com/watch?v=qq0q1hfzidg, Adam Nemet (cc'ed) describes optimization remarks and some future plans for the project. I had a few follow-up questions:

1. As an example of future work to be done, the talk mentions expanding the set of optimization passes that emit remarks. However, the Clang User Manual mentions that "optimization remarks do not really make sense outside of the major transformations (e.g.: inlining, vectorization, loop optimizations)." [1] I am wondering: which passes exist today that are most in need of supporting optimization remarks? Should all passes emit optimization remarks, or are there indeed passes for which optimization remarks "do not make sense”?

I think that we want to report most optimizations. Where I think we need to be a bit more careful is missed optimizations. For those, we should try to report cases where there is a good chance that the user may be able to take some action to enable the transformation (e.g. pragma, restrict, source modification or cost model overrides).

2. I tried running llvm/utils/opt-viewer/opt-viewer.py to produce an HTML dashboard for the optimization remark YAML generated from a large C++ program. Unfortunately, the Python script does not finish, even after over an hour of processing. It appears performance has been brought up before by Bob Haarman (cc'ed), and some optimizations have been made since. [2] I wonder if I'm passing in bad input (6,000+ YAML files -- too many?), or if there's still some room to speed up the opt-viewer.py script? I tried the C++ implementation as well, but that never completed either. [3]

Do you have libYAML installed to get the C parser for YAML; the Python parser is terribly slow? opt-viewer issues a warning if it needs to fall back on the Python parser.

We desperately need a progress bar in opt-viewer. Let me know if you want to add it otherwise I will. I filed llvm.org/PR33522 for this.

In terms of improving the performance, I am pretty sure the bottleneck is still YAML parsing so:

- If PGO is used, we can have a threshold to not even emit remarks on cold code, this should dramatically improve performance, llvm.org/PR33523

- I expect that some sort of binary encoding of YAML would speed up parsing but I haven’t researched this topic yet...

- There is a simple tool called opt-stats.py next to the opt-viewer which provide stats on the different types of remarks. We can see which ones are overly noisy and try to reduce the false positive rate. For example, last time I checked the inlining remark that reports missing definition of the callee was the top missed remark. We should not report this for system headers where there is not much the user can do.

Adam

Brian Gesiak via llvm-dev

unread,

Jun 27, 2017, 2:48:52 PM6/27/17

to Adam Nemet, llvm-dev, Xinliang David Li

Adam, thanks for all the suggestions!

One nice aspect of the `-Rpass` family of options is that I can filter based on what I want. If I only want to see which inlines I missed, I could use `clang -Rpass-missed="inline"`, for example. On the other hand, optimization remark YAML always include remarks from all passes (as far as I can tell), which increases the amount of time it takes opt-viewer.py and other tools to parse. Would you be open to including options to, for example, only emit optimization remarks related to loop vectorization, or to not emit any analysis remarks? Or is it important that the YAML always include all remarks?

Let me know what you think! In the meantime, I'll try to add the progress bar you mention in llvm.org/PR33522. Thanks!

- Brian Gesiak

Hal Finkel via llvm-dev

unread,

Jun 28, 2017, 11:24:22 AM6/28/17

to Brian Gesiak, Adam Nemet, llvm-dev, Xinliang David Li

On 06/27/2017 01:48 PM, Brian Gesiak via llvm-dev wrote:

Adam, thanks for all the suggestions!

One nice aspect of the `-Rpass` family of options is that I can filter based on what I want. If I only want to see which inlines I missed, I could use `clang -Rpass-missed="inline"`, for example. On the other hand, optimization remark YAML always include remarks from all passes (as far as I can tell), which increases the amount of time it takes opt-viewer.py and other tools to parse. Would you be open to including options to, for example, only emit optimization remarks related to loop vectorization, or to not emit any analysis remarks? Or is it important that the YAML always include all remarks?

I don't object to adding some kind of filtering option, but in general it won't help. An important goal here is to provide analysis (and other) tools to users that present this information at a higher level. The users won't, and shouldn't, know exactly what kinds of messages the tools use. This is already somewhat true for llvm-opt-report, and will be even more true in the future.

-Hal

Let me know what you think! In the meantime, I'll try to add the progress bar you mention in llvm.org/PR33522. Thanks!

- Brian Gesiak

_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Brian Gesiak via llvm-dev

unread,

Jun 28, 2017, 2:57:06 PM6/28/17

to Hal Finkel, llvm-dev, Xinliang David Li

> On Wed, Jun 28, 2017 at 8:13 AM, Hal Finkel <hfi...@anl.gov> wrote:
>
> I don't object to adding some kind of filtering option, but in general it won't help. An important goal here is to provide analysis (and other) tools to users that present this information at a higher level. The users won't, and shouldn't, know exactly what kinds of messages the tools use. This is already somewhat true for llvm-opt-report, and will be even more true in the future.

Ah, I see, that makes sense. Thanks!

> On Tue, Jun 20, 2017 at 1:50 AM, Adam Nemet <ane...@apple.com> wrote:
>
> We desperately need a progress bar in opt-viewer. Let me know if you want to add it otherwise I will. I filed llvm.org/PR33522 for this.
>
> In terms of improving the performance, I am pretty sure the bottleneck is still YAML parsing so:
>
> - If PGO is used, we can have a threshold to not even emit remarks on cold code, this should dramatically improve performance, llvm.org/PR33523
> - I expect that some sort of binary encoding of YAML would speed up parsing but I haven’t researched this topic yet...

I added progress indicators in https://reviews.llvm.org/D34735, and it
seems like it takes a while for the Python scripts to read some of the
larger YAML files produced for my program. I'll try to look into
binary YAML encoding later this week.

A threshold preventing remarks from being emitted on cold code sounds
good to me as well. Hal, do you agree, or is this also something that
tools at a higher level should be responsible for ignoring?

Hal Finkel via llvm-dev

unread,

Jun 28, 2017, 3:03:55 PM6/28/17

to Brian Gesiak, llvm-dev, Xinliang David Li

On 06/28/2017 01:56 PM, Brian Gesiak wrote:
>> On Wed, Jun 28, 2017 at 8:13 AM, Hal Finkel <hfi...@anl.gov> wrote:
>>
>> I don't object to adding some kind of filtering option, but in general it won't help. An important goal here is to provide analysis (and other) tools to users that present this information at a higher level. The users won't, and shouldn't, know exactly what kinds of messages the tools use. This is already somewhat true for llvm-opt-report, and will be even more true in the future.
> Ah, I see, that makes sense. Thanks!
>
>> On Tue, Jun 20, 2017 at 1:50 AM, Adam Nemet <ane...@apple.com> wrote:
>>
>> We desperately need a progress bar in opt-viewer. Let me know if you want to add it otherwise I will. I filed llvm.org/PR33522 for this.
>>
>> In terms of improving the performance, I am pretty sure the bottleneck is still YAML parsing so:
>>
>> - If PGO is used, we can have a threshold to not even emit remarks on cold code, this should dramatically improve performance, llvm.org/PR33523
>> - I expect that some sort of binary encoding of YAML would speed up parsing but I haven’t researched this topic yet...
> I added progress indicators in https://reviews.llvm.org/D34735, and it
> seems like it takes a while for the Python scripts to read some of the
> larger YAML files produced for my program. I'll try to look into
> binary YAML encoding later this week.

Sounds good.

>
> A threshold preventing remarks from being emitted on cold code sounds
> good to me as well. Hal, do you agree, or is this also something that
> tools at a higher level should be responsible for ignoring?

I agree that this makes sense. Tools can also threshold at a higher
level, but this kind of generic filtering can be done without exposing
users to any unnecessary implementation details.

-Hal

>
> - Brian Gesiak

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________

Brian Gesiak via llvm-dev

unread,

Jun 29, 2017, 1:50:50 AM6/29/17

to Hal Finkel, llvm-dev, Xinliang David Li

Please excuse all the questions! Circling back to this:

> On Tue, Jun 20, 2017 at 1:50 AM, Adam Nemet <ane...@apple.com> wrote:
>
> I expect that some sort of binary encoding of YAML would speed up parsing but I haven’t researched this topic yet...

I was under the impression that YAML had some sort of standard binary
encoding format, sort of like JSON and BSON [1], but this doesn't
appear to be the case. Did you have something specific in mind here,
or did you mean having optimization remarks optionally emit a
different, non-human-readable format? If so, MessagePack [2] or
protocol buffers [3] appear to be widely used.

[1] http://bsonspec.org
[2] http://msgpack.org/index.html
[3] https://developers.google.com/protocol-buffers/

- Brian Gesiak

Adam Nemet via llvm-dev

unread,

Jun 29, 2017, 1:15:30 PM6/29/17

to Brian Gesiak, llvm-dev, Xinliang David Li

> On Jun 28, 2017, at 8:56 PM, Brian Gesiak <modo...@gmail.com> wrote:
>
>> On Wed, Jun 28, 2017 at 8:13 AM, Hal Finkel <hfi...@anl.gov> wrote:
>>
>> I don't object to adding some kind of filtering option, but in general it won't help. An important goal here is to provide analysis (and other) tools to users that present this information at a higher level. The users won't, and shouldn't, know exactly what kinds of messages the tools use. This is already somewhat true for llvm-opt-report, and will be even more true in the future.
>
> Ah, I see, that makes sense. Thanks!
>
>> On Tue, Jun 20, 2017 at 1:50 AM, Adam Nemet <ane...@apple.com> wrote:
>>
>> We desperately need a progress bar in opt-viewer. Let me know if you want to add it otherwise I will. I filed llvm.org/PR33522 for this.
>>
>> In terms of improving the performance, I am pretty sure the bottleneck is still YAML parsing so:
>>
>> - If PGO is used, we can have a threshold to not even emit remarks on cold code, this should dramatically improve performance, llvm.org/PR33523
>> - I expect that some sort of binary encoding of YAML would speed up parsing but I haven’t researched this topic yet...
>
> I added progress indicators in https://reviews.llvm.org/D34735, and it
> seems like it takes a while for the Python scripts to read some of the
> larger YAML files produced for my program.

You may want look at these files with opt-stats and see what type of remarks are on the top of the list. If they are missed remarks, we may need to work harder to remove false positives.

Adam

Adam Nemet via llvm-dev

unread,

Jun 29, 2017, 1:33:54 PM6/29/17

to Brian Gesiak, llvm-dev, Xinliang David Li

> On Jun 29, 2017, at 7:50 AM, Brian Gesiak <modo...@gmail.com> wrote:
>
> Please excuse all the questions! Circling back to this:
>
>> On Tue, Jun 20, 2017 at 1:50 AM, Adam Nemet <ane...@apple.com> wrote:
>>
>> I expect that some sort of binary encoding of YAML would speed up parsing but I haven’t researched this topic yet...
>
> I was under the impression that YAML had some sort of standard binary
> encoding format, sort of like JSON and BSON [1], but this doesn't
> appear to be the case. Did you have something specific in mind here,
> or did you mean having optimization remarks optionally emit a
> different, non-human-readable format? If so, MessagePack [2] or
> protocol buffers [3] appear to be widely used.

I didn’t really have anything specific in mind. If we want to completely change the file format for opt remarks from YAML, we probably want to carefully evaluate the options wrt the potential speed-up.

Adam

Davide Italiano via llvm-dev

unread,

Jul 14, 2017, 11:22:40 AM7/14/17

to Brian Gesiak, llvm-dev, Xinliang David Li, Whittaker, Simon

Hi,
I've been asked at $WORK to take a look at `-opt-remarks` , so here
are a couple of thoughts.

1) When LTO is on, the output isn't particularly easy to read. I guess
this can be mitigated with some filtering approach, I and Simon
discussed it offline.

2) Yes, indeed `opt-viewer` takes forever for large testcases to
process. I think that it could lead to exploring a better
representation than YAML which is, indeed, a little slow to parse. To
be honest, I'm torn about this.
YAML is definitely really convenient as we already use it somewhere in
tree, and it has an easy textual repr. OTOH, it doesn't seem to scale
that nicely.

3) There are lots of optimizations which are still missing from the
output, in particular PGO remarks (including, e.g. branch info
probabilities which still use the old API as far as I can tell
[PGOInstrumentation.cpp])

4) `opt-remarks` heavily relies on the fidelity of the DebugLoc
attached to instructions. Things get a little hairy at -O3 (or with
-flto) because there are optimizations bugs so transformations don't
preserve debuginfo. This is not entirely orthogonal but something can
be worked on in parallel (bonus point, this would also help SamplePGO
& debuginfo experience). With `-flto` the problem gets amplified more,
as expected.

5) I found a couple of issue when trying the support, but I'm actively
working on them.
https://bugs.llvm.org/show_bug.cgi?id=33773
https://bugs.llvm.org/show_bug.cgi?id=33776

That said, I think optimization remarks support is coming along nicely.

--
Davide

Simon Whittaker via llvm-dev

unread,

Jul 14, 2017, 12:50:22 PM7/14/17

to Davide Italiano, llvm-dev, Xinliang David Li

>process. I think that it could lead to exploring a better representation than YAML which is, indeed, a little slow to parse

As a datapoint the codebase of a recent PlayStation4 game produces over 10GiB of YAML files, of course I'm tending to run opt-viewer on just a subset of these to get a reasonable workflow.

Just looking at one of the graphics libraries, which is a reasonable granularity to examine, we have ~10MiB of object file for x86-64, including debug info. This produces ~70MiB of YAML which takes 48s to parse (optrecord.gather_results) and 25s to produce a total of ~70MiB of HTML (generate_report) on a decent i7 with SSD. Not terrible but probably too slow for our end-users.

Brian, did you get time to try out some alternative representations? Although we've not done any finer-grained profiling of the above we also suspect a binary representation might improve things. If you've not looked at this yet we might be able to investigate over the next couple of weeks. If you already have then I'd be happy to test against the codebase above and see what the difference is like.

To echo Davide, we don't want to sound too negative - the remarks work is definitely a good direction to be going in and is already useful.

Thanks,

Simon

Adam Nemet via llvm-dev

unread,

Jul 14, 2017, 1:20:15 PM7/14/17

to Davide Italiano, llvm-dev, Xinliang David Li, Whittaker, Simon

[Resending with smaller image to stay within the size threshold of llvm-dev]

Can you please elaborate?

2) Yes, indeed `opt-viewer` takes forever for large testcases to
process. I think that it could lead to exploring a better
representation than YAML which is, indeed, a little slow to parse. To
be honest, I'm torn about this.
YAML is definitely really convenient as we already use it somewhere in
tree, and it has an easy textual repr. OTOH, it doesn't seem to scale
that nicely.

Agreed. We now have a mitigation strategy with -pass-remarks-hotness-threshold but this is something that we may have to solve in the long run.

3) There are lots of optimizations which are still missing from the
output, in particular PGO remarks (including, e.g. branch info
probabilities which still use the old API as far as I can tell
[PGOInstrumentation.cpp])

Yes, how about we file bugs for each pass that still uses the old API (I am looking at ICP today) and then we can split up the work and then finally remove the old API?

Also on exposing PGO info, I have a patch that adds a pass I call HotnessDecorator. The pass emits a remark for each basic block. Then opt-viewer is made aware of these and the remarks are special-cased to show hotness for a line unless there is already a remark on the line. The idea is that since we only show hotness as part of the remark if a block does not contain a remark we don’t see its hotness. E.g.:

4) `opt-remarks` heavily relies on the fidelity of the DebugLoc
attached to instructions. Things get a little hairy at -O3 (or with
-flto) because there are optimizations bugs so transformations don't
preserve debuginfo. This is not entirely orthogonal but something can
be worked on in parallel (bonus point, this would also help SamplePGO
& debuginfo experience). With `-flto` the problem gets amplified more,
as expected.

5) I found a couple of issue when trying the support, but I'm actively
working on them.
https://bugs.llvm.org/show_bug.cgi?id=33773
https://bugs.llvm.org/show_bug.cgi?id=33776

That said, I think optimization remarks support is coming along nicely.

Yes, I’ve been really happy with the progress. Thanks for all the help from everybody!

Adam

Davide Italiano via llvm-dev

unread,

Jul 14, 2017, 1:22:57 PM7/14/17

to Adam Nemet, llvm-dev, Xinliang David Li, Whittaker, Simon

On Fri, Jul 14, 2017 at 10:10 AM, Adam Nemet <ane...@apple.com> wrote:
>
>
> On Jul 14, 2017, at 8:21 AM, Davide Italiano via llvm-dev <llvm...@lists.llvm.org> wrote:
>

> Can you please elaborate?
>

The issue is twofold:
1) With LTO, the number of remarks generated skyrockets because whole
module visibility makes IPO more effective (i.e. you end up inlining
much more etc..). As a side effect, more aggressive inlining/IPCP
expose more intraprocedural optimizations which in turn generates more
remarks.
2) As pointed out earlier, DI is not always reliable.

>
>
> 2) Yes, indeed `opt-viewer` takes forever for large testcases to
> process. I think that it could lead to exploring a better
> representation than YAML which is, indeed, a little slow to parse. To
> be honest, I'm torn about this.
> YAML is definitely really convenient as we already use it somewhere in
> tree, and it has an easy textual repr. OTOH, it doesn't seem to scale
> that nicely.
>
>

> Agreed. We now have a mitigation strategy with -pass-remarks-hotness-threshold but this is something that we may have to solve in the long run.
>

At some point, I guess we might just slowly moving away from

>
>
> 3) There are lots of optimizations which are still missing from the
> output, in particular PGO remarks (including, e.g. branch info
> probabilities which still use the old API as far as I can tell
> [PGOInstrumentation.cpp])
>
>

> Yes, how about we file bugs for each pass that still uses the old API (I am looking at ICP today) and then we can split up the work and then finally remove the old API?
>

That sounds like a plan.

> Also on exposing PGO info, I have a patch that adds a pass I call HotnessDecorator. The pass emits a remark for each basic block. Then opt-viewer is made aware of these and the remarks are special-cased to show hotness for a line unless there is already a remark on the line. The idea is that since we only show hotness as part of the remark if a block does not contain a remark we don’t see its hotness. E.g.:
>
>

Yes, feel free to post for review once you have it ready.

>
>
> 4) `opt-remarks` heavily relies on the fidelity of the DebugLoc
> attached to instructions. Things get a little hairy at -O3 (or with
> -flto) because there are optimizations bugs so transformations don't
> preserve debuginfo. This is not entirely orthogonal but something can
> be worked on in parallel (bonus point, this would also help SamplePGO
> & debuginfo experience). With `-flto` the problem gets amplified more,
> as expected.
>
> 5) I found a couple of issue when trying the support, but I'm actively
> working on them.
> https://bugs.llvm.org/show_bug.cgi?id=33773
> https://bugs.llvm.org/show_bug.cgi?id=33776
>
> That said, I think optimization remarks support is coming along nicely.
>
>

> Yes, I’ve been really happy with the progress. Thanks for all the help from everybody!

At some point, I guess we might just consider the HTML generated
report as a fallback and having the opt-remarks more integrated in the
developer's workflow.
I personally use Visual studio daily to compile clang and it would be
nice to have remarks there as a plugin. I can imagine something
similar happening for XCode/CLion/Emacs etc..

Thanks,

Adam Nemet via llvm-dev

unread,

Jul 14, 2017, 1:26:32 PM7/14/17

to Davide Italiano, llvm-dev, Xinliang David Li, Whittaker, Simon

On Jul 14, 2017, at 8:21 AM, Davide Italiano via llvm-dev <llvm...@lists.llvm.org> wrote:

Can you please elaborate?

2) Yes, indeed `opt-viewer` takes forever for large testcases to
process. I think that it could lead to exploring a better
representation than YAML which is, indeed, a little slow to parse. To
be honest, I'm torn about this.
YAML is definitely really convenient as we already use it somewhere in
tree, and it has an easy textual repr. OTOH, it doesn't seem to scale
that nicely.

Agreed. We now have a mitigation strategy with -pass-remarks-hotness-threshold but this is something that we may have to solve in the long run.

3) There are lots of optimizations which are still missing from the
output, in particular PGO remarks (including, e.g. branch info
probabilities which still use the old API as far as I can tell
[PGOInstrumentation.cpp])

Yes, how about we file bugs for each pass that still uses the old API (I am looking at ICP today) and then we can split up the work and then finally remove the old API?

Also on exposing PGO info, I have a patch that adds a pass I call HotnessDecorator. The pass emits a remark for each basic block. Then opt-viewer is made aware of these and the remarks are special-cased to show hotness for a line unless there is already a remark on the line. The idea is that since we only show hotness as part of the remark if a block does not contain a remark we don’t see its hotness. E.g.:

4) `opt-remarks` heavily relies on the fidelity of the DebugLoc
attached to instructions. Things get a little hairy at -O3 (or with
-flto) because there are optimizations bugs so transformations don't
preserve debuginfo. This is not entirely orthogonal but something can
be worked on in parallel (bonus point, this would also help SamplePGO
& debuginfo experience). With `-flto` the problem gets amplified more,
as expected.

5) I found a couple of issue when trying the support, but I'm actively
working on them.
https://bugs.llvm.org/show_bug.cgi?id=33773
https://bugs.llvm.org/show_bug.cgi?id=33776

That said, I think optimization remarks support is coming along nicely.

Yes, I’ve been really happy with the progress. Thanks for all the help from everybody!

Adam

Adam Nemet via llvm-dev

unread,

Jul 14, 2017, 1:32:35 PM7/14/17

to Davide Italiano, llvm-dev, Xinliang David Li, Whittaker, Simon

Ah ok, you meant increased quantity. Sure. On inlining there is actually a low-hanging fruit: https://bugs.llvm.org/show_bug.cgi?id=33786

2) As pointed out earlier, DI is not always reliable.

2) Yes, indeed `opt-viewer` takes forever for large testcases to
process. I think that it could lead to exploring a better
representation than YAML which is, indeed, a little slow to parse. To
be honest, I'm torn about this.
YAML is definitely really convenient as we already use it somewhere in
tree, and it has an easy textual repr. OTOH, it doesn't seem to scale
that nicely.

Agreed. We now have a mitigation strategy with -pass-remarks-hotness-threshold but this is something that we may have to solve in the long run.

At some point, I guess we might just slowly moving away from

3) There are lots of optimizations which are still missing from the
output, in particular PGO remarks (including, e.g. branch info
probabilities which still use the old API as far as I can tell
[PGOInstrumentation.cpp])

Yes, how about we file bugs for each pass that still uses the old API (I am looking at ICP today) and then we can split up the work and then finally remove the old API?

That sounds like a plan.

Also on exposing PGO info, I have a patch that adds a pass I call HotnessDecorator. The pass emits a remark for each basic block. Then opt-viewer is made aware of these and the remarks are special-cased to show hotness for a line unless there is already a remark on the line. The idea is that since we only show hotness as part of the remark if a block does not contain a remark we don’t see its hotness. E.g.:

Yes, feel free to post for review once you have it ready.

Will do.

4) `opt-remarks` heavily relies on the fidelity of the DebugLoc
attached to instructions. Things get a little hairy at -O3 (or with
-flto) because there are optimizations bugs so transformations don't
preserve debuginfo. This is not entirely orthogonal but something can
be worked on in parallel (bonus point, this would also help SamplePGO
& debuginfo experience). With `-flto` the problem gets amplified more,
as expected.

5) I found a couple of issue when trying the support, but I'm actively
working on them.
https://bugs.llvm.org/show_bug.cgi?id=33773
https://bugs.llvm.org/show_bug.cgi?id=33776

That said, I think optimization remarks support is coming along nicely.

Yes, I’ve been really happy with the progress. Thanks for all the help from everybody!

At some point, I guess we might just consider the HTML generated
report as a fallback and having the opt-remarks more integrated in the
developer's workflow.
I personally use Visual studio daily to compile clang and it would be
nice to have remarks there as a plugin. I can imagine something
similar happening for XCode/CLion/Emacs etc..

Exactly.

Adam

Thanks,

--
Davide

Adam Nemet via llvm-dev

unread,

Jul 14, 2017, 1:52:39 PM7/14/17

to Davide Italiano, llvm-dev, Xinliang David Li, Whittaker, Simon

On Jul 14, 2017, at 10:22 AM, Davide Italiano <dav...@freebsd.org> wrote:

Filed https://bugs.llvm.org/show_bug.cgi?id=33789 to remove the old API and blockers for the 7 passes that need to be migrated. Anybody wanting to help with this, please feel free to grab any of the bugs.

Thanks!

Adam

Sam Elliott via llvm-dev

unread,

Jul 18, 2017, 2:49:33 PM7/18/17

to Adam Nemet via llvm-dev, Adam Nemet

I may grab a few of these bugs in the next few days.

Am I correct in thinking that only the following passes use the new OptimizationRemark system (or is searching for `INITIALIZE_PASS_DEPENDENCY(OptimizationRemarkEmitterWrapperPass)` not the correct way to find them?)

- GVN

- Loop Data Prefetch

- Loop Distribution

- Simplify Instructions

- Loop Vectorize

- SLP Vectorize

- Loop Interchange

There seem to be more results if i search for `#include "llvm/Analysis/OptimizationDiagnosticInfo.h"`.

Sam

_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

--

Archibald Sam Elliott

as...@cs.washington.edu

PhD Student, PLSE Group

Adam Nemet via llvm-dev

unread,

Jul 18, 2017, 4:14:54 PM7/18/17

to Sam Elliott, Adam Nemet via llvm-dev

Hi Sam,

On Jul 18, 2017, at 11:49 AM, Sam Elliott <as...@cs.washington.edu> wrote:

I may grab a few of these bugs in the next few days.

Am I correct in thinking that only the following passes use the new OptimizationRemark system (or is searching for `INITIALIZE_PASS_DEPENDENCY(OptimizationRemarkEmitterWrapperPass)` not the correct way to find them?)

- GVN
- Loop Data Prefetch
- Loop Distribution
- Simplify Instructions
- Loop Vectorize
- SLP Vectorize
- Loop Interchange

Yes, you are correct. I came up with the list of passes for the bugs by searching for emitOptimizationRemark. That is the set of API we’d like to remove.