This RFC suggests an API to enable customizable instrumentation of pass
execution. The intent is to have a common machinery to implement all the
pass-execution-debugging features mentioned here.
The prime target of the interface is the new pass manager. The overall
approach and most of the implementation details should be equally
applicable to the legacy one, though.
Background
==========
There are quite a few important debugging facilities in LLVM that affect
pass execution sequence:
  -print-after/-print-before[-all]
      execute IR-print pass before or after a particularly interesting
      pass (or all passes)
  -verify-each
      execute verifier pass after each
  -opt-bisect-limit
      execute passes only up to a selected "pass-counter"
  -time-passes
      track execution time for each pass
There are also quite a few similar ideas floating around, e.g.:
  -git-commit-after-all
  -debugify-each
All these facilities essentially require instrumentation of pass
execution points in the pass manager, and each is implemented in the
legacy pass manager in its own custom way, e.g.:
  * -time-passes has a bunch of dedicated code in each of the pass managers
  * -print-before/after/verify-each insert additional passes before/after
    the passes in the pipeline
None of these features is implemented for the new pass manager, which is
an obvious problem for the new pass manager transition.
Proposal
========
Main idea:
  - introduce an API that allows instrumenting points of pass execution
  - access through LLVM Context (allows to control life-time and scope
    in multi-context execution)
  - wrap it into an analysis for easier access from pass managers
Details:
  1. introduce llvm::PassInstrumentation

     This is the main interface that handles the customization and
     provides instrumentation calls
       - resides in IR
       - is accessible through LLVMContext::getPassInstrumentation()
         (with context owning this object).

  2. every single point of Pass execution in the (new) PassManager(s)
     will query this analysis and run instrumentation call specific to
     a particular point.

     Instrumentation points:

        bool BeforePass (PassID, PassExecutionCounter);
        void AfterPass (PassID, PassExecutionCounter);

          Run before/after a particular pass execution
          BeforePass instrumentation call returns true if this
          execution is allowed to run.

          'PassID'
               certain unique identifier for a pass (pass name?).

          'PassExecutionCounter'
               a number that uniquely identifies this particular pass
               execution in current pipeline, as tracked by Pass Manager.

        void StartPipeline()
        void EndPipeline()

          Run at the start/end of a pass pipeline execution.
          (useful for initialization/finalization purposes)

  3. custom callbacks are registered with
     PassInstrumentation::register* interfaces

     A sequence of registered callbacks is called at each
     instrumentation point as appropriate.

  4. introduce llvm::ExecutionCounter to track execution of passes
     (akin to DebugCounter, yet enabled in Release mode as well?)

     Note: it is somewhat nontrivial to uniquely track pass executions
     with counters in the new pass manager, as its pipeline schedule can
     be dynamic. Ideas are welcome on how to efficiently implement
     unique execution tracking that does not break in the presence of
     fixed-point iteration passes like RepeatedPass/DevirtSCCRepeatedPass.

     Also, the intent is for execution counters to be able to provide
     thread-safety in multi-threaded pipeline execution (though no work
     is planned for it yet).

  5. introduce a new analysis llvm::PassInstrumentationAnalysis

     This is a convenience wrapper to provide access to
     PassInstrumentation via the analysis framework.
     If using the analysis is not convenient (?legacy) then
     PassInstrumentation can be queried directly from LLVMContext.
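
To make items 1-3 a bit more concrete, here is a minimal standalone sketch
of the callback mechanism. It is not the prototype from the review; the
class only borrows the names used in this RFC, and the register* method
names, the toy pipeline loop and the bisect-style callback are assumptions
made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the proposed llvm::PassInstrumentation.
// Names and signatures follow the RFC text, not an existing LLVM header.
class PassInstrumentation {
public:
  using BeforePassFn = std::function<bool(const std::string &, uint64_t)>;
  using AfterPassFn = std::function<void(const std::string &, uint64_t)>;

  void registerBeforePassCallback(BeforePassFn CB) {
    BeforeCallbacks.push_back(std::move(CB));
  }
  void registerAfterPassCallback(AfterPassFn CB) {
    AfterCallbacks.push_back(std::move(CB));
  }

  // Calls every registered callback; the pass may run only if none vetoes.
  bool BeforePass(const std::string &PassID, uint64_t Counter) const {
    bool ShouldRun = true;
    for (const auto &CB : BeforeCallbacks)
      ShouldRun = CB(PassID, Counter) && ShouldRun;
    return ShouldRun;
  }

  void AfterPass(const std::string &PassID, uint64_t Counter) const {
    for (const auto &CB : AfterCallbacks)
      CB(PassID, Counter);
  }

private:
  std::vector<BeforePassFn> BeforeCallbacks;
  std::vector<AfterPassFn> AfterCallbacks;
};

// Toy pass manager loop: returns the names of the passes that actually ran.
std::vector<std::string> runPipeline(const std::vector<std::string> &Passes,
                                     const PassInstrumentation &PI) {
  std::vector<std::string> Executed;
  uint64_t Counter = 0;
  for (const std::string &PassID : Passes) {
    ++Counter; // execution counter as tracked by the pass manager
    if (!PI.BeforePass(PassID, Counter))
      continue; // vetoed by instrumentation
    Executed.push_back(PassID); // stands in for running the pass
    PI.AfterPass(PassID, Counter);
  }
  return Executed;
}

// An -opt-bisect-limit style callback: allow only the first Limit executions.
std::vector<std::string> runWithBisectLimit(uint64_t Limit) {
  PassInstrumentation PI;
  PI.registerBeforePassCallback(
      [Limit](const std::string &, uint64_t Counter) { return Counter <= Limit; });
  return runPipeline({"sroa", "instcombine", "gvn"}, PI);
}

void exampleUsage() {
  assert(runWithBisectLimit(2).size() == 2); // "gvn" is skipped
  assert(runWithBisectLimit(0).empty());
}
```

In this sketch every BeforePass callback is invoked even after one of them
has vetoed the pass, so that stateful callbacks still see each execution;
whether the real interface should short-circuit instead is left open.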
Additional goals
================
  - layering problem
    Currently OptBisect/OptPassGate has a layering issue: interface
    dependencies on all the "IR units", even those that are analyses
    (Loop, CallGraphSCC).
    A generic PassInstrumentation facility allows injecting arbitrary
    callbacks at run time, removing any compile-time interface
    dependencies on internals of those callbacks, effectively solving
    this layering issue.

  - life-time/scope control for multi-context execution
    Currently there are issues with multi-context execution of, say,
    -time-passes, which stores its data in global maps.
    With LLVMContext owning PassInstrumentation there should be no
    problem with multi-context execution (callbacks can be made to own
    the instrumentation data).
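
A small sketch of the "callbacks own the instrumentation data" idea, under
stated assumptions: ToyContext is a hypothetical stand-in for an
LLVMContext that owns its callbacks, and per-pass run counts stand in for
the timing data so that the example stays deterministic.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for an LLVMContext that owns its instrumentation.
struct ToyContext {
  std::vector<std::function<void(const std::string &)>> AfterPassCallbacks;

  void afterPass(const std::string &PassID) const {
    for (const auto &CB : AfterPassCallbacks)
      CB(PassID);
  }
};

using RunCounts = std::map<std::string, unsigned>;

// Registers a callback that owns its data through a shared_ptr capture,
// so no global map is involved. Run counts stand in for timing data.
std::shared_ptr<RunCounts> installRunCounter(ToyContext &Ctx) {
  auto Data = std::make_shared<RunCounts>();
  Ctx.AfterPassCallbacks.push_back(
      [Data](const std::string &PassID) { ++(*Data)[PassID]; });
  return Data;
}

void exampleUsage() {
  ToyContext A, B;
  auto CountsA = installRunCounter(A);
  auto CountsB = installRunCounter(B);
  A.afterPass("gvn");
  A.afterPass("gvn");
  B.afterPass("gvn");
  assert((*CountsA)["gvn"] == 2); // context A saw two runs
  assert((*CountsB)["gvn"] == 1); // context B is unaffected
}
```

Because each context holds its own callbacks and each callback holds its
own data, two contexts never share state.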
Open Questions
==============
  - what's the best way to handle ownership of PassInstrumentation?
    Any problems with owning by LLVMContext?
    Something similar to TargetLibraryInfo (owned by
    TargetLibraryAnalysis/TargetLibraryInfoWrapperPass)?

  - using PassInstrumentationAnalysis or directly querying LLVMContext
    PassInstrumentationAnalysis appeared to be a nice idea, but only
    until I tried querying it in the new pass manager framework, and the
    number of hoops to jump through makes me shiver a bit...
    Querying LLVMContext is plain and straightforward, but we do not
    have a generic way to access LLVMContext from a PassManager template
    (need to introduce generic IRUnit::getContext?)
Implementation
==============
PassInstrumentationAnalysis proof-of-concept unfinished prototype
implementation:
(Heavily under construction, do not enter without wearing a hard hat...)
https://reviews.llvm.org/D47858
_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
bool BeforePass (PassID, IRUnit&, PassExecutionCounter);
void AfterPass (PassID, IRUnit&, PassExecutionCounter);
regards,
Fedor.
On 06/07/2018 03:00 AM, Fedor Sergeev via llvm-dev wrote:
> 2. every single point of Pass execution in the (new) PassManager(s)
> will query
> this analysis and run instrumentation call specific to a
> particular point.
>
> Instrumentation points:
>
> bool BeforePass (PassID, PassExecutionCounter);
> void AfterPass (PassID, PassExecutionCounter);
>
> Run before/after a particular pass execution
> BeforePass instrumentation call returns true if this
> execution is allowed to run.
>
> 'PassID'
> certain unique identifier for a pass (pass name?).
>
> 'PassExecutionCounter'
> a number that uniquely identifies this particular
> pass execution
> in current pipeline, as tracked by Pass Manager.
>
> void StartPipeline()
> void EndPipeline()
>
> Run at the start/end of a pass pipeline execution.
> (useful for initialization/finalization purposes)
We had already talked about this, so unsurprisingly I'm generally in favor of the direction. Some comments below.
On Thu, Jun 7, 2018 at 2:00 AM Fedor Sergeev <fedor....@azul.com> wrote:
> - access through LLVM Context (allows to control life-time and scope
>   in multi-context execution)
> - wrap it into an analysis for easier access from pass managers
Why not simply make it an analysis, and leave LLVM context out?
Because this is very pass specific, I think it would be substantially cleaner for it to be more specifically based in the pass infrastructure.
I also think that this can be more cleanly designed by focusing on the new PM. The legacy PM has reasonable solutions for these problems already, and I think the design can be made somewhat simpler if we don't have to support both in some way.
My hope would be that there are two basic "layers" to this. Alongside a particular PassManager, we would have an analysis that instruments the running passes. This would just expose the basic API to track and control pass behavior and none of the "business logic".
Then I would hope that the Passes library can build an instance of this analysis with callbacks (or a type parameter that gets type erased internally) which handles all the business logic.
I consider LLVM context to be a good reference point for "compilation-local singleton stuff".
My task is to provide a way to handle callbacks per-compilation-context, and preferably have a single copy of those
(possibly stateful) callbacks per compilation.
In my implementation (linked at the end of RFC) I'm using PassInstrumentationImpl to have a single copy of object.
What entity should *own* PassInstrumentationImpl object to make it unique per-compilation?
Hi Fedor,
2018-06-07 17:48 GMT+02:00 Fedor Sergeev via llvm-dev <llvm...@lists.llvm.org>:
> [...]
> I consider LLVM context to be a good reference point for "compilation-local singleton stuff".
> My task is to provide a way to handle callbacks per-compilation-context, and preferably have a single copy of those
> (possibly stateful) callbacks per compilation.
> In my implementation (linked at the end of RFC) I'm using PassInstrumentationImpl to have a single copy of object.
> What entity should *own* PassInstrumentationImpl object to make it unique per-compilation?
Both the PassBuilder and the AnalysisManager owning the analysis are unique per compilation for all intents and purposes. Just making it an analysis does not force you to extend the contract of IRUnitT to access the context. The PassBuilder is also exposed to pass plugins, so you get support for instrumentation plugins for free.
Thanks Craig, that's exactly what I mean, stopping at particular changes inside a pass.
We've had a -pass-max option here for some time and have hand-added
instrumentation to various passes to honor it. It's saved us man-years
of debug time. I was planning on sending it upstream but saw this
effort with pass execution instrumentation and thought it might fit
there.
Initially I think some very simple APIs in PassInstrumentationAnalysis
would be fine, something like:
  // PIA - PassInstrumentationAnalysis
  if (PIA->skipTransformation()) {
    return;
  }
  // Do it.
  PIA->didTransformation();
This kind of interface also encourages good pass design like doing all
the analysis for a transformation before actually doing the
transformation. Some passes mix analysis with transformation and those
are much harder to instrument to support -pass-max operation.
In our implementation we can set a -pass-max per pass, like
-pass-max=instcombine=524. A global index might be even more useful.
If it interacted with opt-bisect, even better. It seems like APIs that
cover both the opt-bisect pass-level operation and the finer-grained
operation could be quite powerful. As passes opt in to the
finer-grained control, the opt-bisect limit would automatically become
more powerful.
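
As a rough illustration of the skipTransformation()/didTransformation()
pair, the following standalone sketch applies a global transformation
limit inside a toy pass. TransformLimiter and clampNegatives are
hypothetical names invented for this example; nothing here is an existing
LLVM interface, and the actual -pass-max implementation may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper modelled on the skipTransformation()/didTransformation()
// pair suggested above; neither function exists in LLVM.
class TransformLimiter {
public:
  explicit TransformLimiter(uint64_t Limit) : Max(Limit) {}

  // True once the allowed number of transformations has been applied.
  bool skipTransformation() const { return Done >= Max; }
  // Records that one transformation was applied.
  void didTransformation() { ++Done; }
  uint64_t applied() const { return Done; }

private:
  uint64_t Max;
  uint64_t Done = 0;
};

// Toy pass: replaces each negative value with zero, one transformation per
// element, with all checking done before the change is made.
void clampNegatives(std::vector<int> &Values, TransformLimiter &PIA) {
  for (int &V : Values) {
    if (V >= 0)
      continue; // analysis says there is nothing to do
    if (PIA.skipTransformation())
      continue; // limit reached, leave the value alone
    V = 0;      // do it
    PIA.didTransformation();
  }
}

void exampleUsage() {
  std::vector<int> Values = {-1, 2, -3, -4};
  TransformLimiter Limiter(2); // allow at most two transformations
  clampNegatives(Values, Limiter);
  assert((Values == std::vector<int>{0, 2, 0, -4}));
  assert(Limiter.applied() == 2);
}
```

Bisecting over the limit then narrows a miscompile down to the single
transformation whose index first makes the problem appear.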
I've always wanted a bugpoint that could point not just to a pass but to
a specific transformation within a pass.
-David
On 06/12/2018 12:04 AM, David A. Greene wrote:
> I was going to write something up about fine-grained opt-bisect but
> didn't get to it last week.
>
> We've had a -pass-max option here for some time and have hand-added
> instrumentation to various passes to honor it. It's saved us man-years
> of debug time. I was planning on sending it upstream but saw this
> effort with pass execution instrumentation and thought it might fit
> there.
>
> Initially I think some very simple APIs in PassInstrumentationAnalysis
> would be fine, something like:
>
> // PIA - PassInstrumentationAnalysis
> if (PIA->skipTransformation()) {
> return;
> }
> // Do it.
> PIA->didTransformation();
That should be easily doable (though the interface would be part of
PassInstrumentation
rather than PassInstrumentationAnalysis).
>
> This kind of interface also encourages good pass design like doing all
> the analysis for a transformation before actually doing the
> transformation. Some passes mix analysis with transformation and those
> are much harder to instrument to support -pass-max operation.
I'm not sure everybody would agree on this definition of a good pass
design :)
Ability to mix analysis with transformation might appear to be rather useful
when heavy analysis is only needed in a very special corner case of an
overall
transformation.
regards,
Fedor.
> On 06/12/2018 12:04 AM, David A. Greene wrote:
>> // PIA - PassInstrumentationAnalysis
>> if (PIA->skipTransformation()) {
>> return;
>> }
>> // Do it.
>> PIA->didTransformation();
> That should be easily doable (though the interface would be part of
> PassInstrumentation
> rather than PassInstrumentationAnalysis).
Ok. The way I envision this working from a user standpoint is
-opt-bisect-limit <n> would mean "n applications of code
transformation." where "code transformation" could mean an entire pass
run or individual transforms within a pass. Each pass would decide what
it supports.
>> This kind of interface also encourages good pass design like doing all
>> the analysis for a transformation before actually doing the
>> transformation. Some passes mix analysis with transformation and those
>> are much harder to instrument to support -pass-max operation.
> I'm not sure everybody would agree on this definition of a good pass
> design :)
> Ability to mix analysis with transformation might appear to be rather useful
> when heavy analysis is only needed in a very special corner case of an
> overall
> transformation.
Yes, I'm sure there are exceptions. I'm not referring to things like
instcombine that have individual rules that guard transformations and
the pass iterates applying transformations when rules are matched.
That's straightforward to instrument.
The harder cases are where the analysis phase itself does some
transformation (possibly to facilitate analysis) and then decides the
larger-goal transformation is not viable. If the pass then tries to
undo the first transformation, it's possible that -pass-max will result
in code that never would have been generated, because it could do the
first transformation but then not undo it because it hit the max number
of transforms. Sometimes it's difficult to find where things are undone
and update the transformation index (basically allow the undo and
decrement the index to reflect the undo).
In code:
  if (not hit max)
    do analysis transform
    ++index
  return

<some other function>
  if (transform legal)
    if (not hit max)
      do big transform
      ++index
  return

<some third function>
  if (need to undo analysis transform)
    if (not hit max)
      undo it
      ++index
Sometimes it is not obvious that these three places are logically
connected. Ideally we wouldn't increment the index for the analysis
transform or we would allow the undo and decrement the index, but it's
not always clear from the code that that is what should happen.
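
The following standalone sketch plays the three steps out in one toy
function, to show how a limit that is hit between the first step and the
undo leaves behind a state an unlimited run never produces. Everything in
it (runToyPass, the string standing in for the IR, the UndoIgnoresLimit
switch) is hypothetical and only illustrates the problem described above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Toy model of the three logically connected steps from the pseudocode.
// Max is the transformation limit; the string stands in for the IR.
// UndoIgnoresLimit selects the "allow the undo and decrement the index"
// variant. All names here are hypothetical.
std::string runToyPass(uint64_t Max, bool BigTransformLegal,
                       bool UndoIgnoresLimit) {
  uint64_t Index = 0;
  std::string IR = "original";
  bool Prepared = false;

  // Step 1: preparatory rewrite done to facilitate the legality check.
  if (Index < Max) {
    IR = "prepared";
    Prepared = true;
    ++Index;
  }

  // Step 2: the larger-goal transformation, if it turns out to be legal.
  if (Prepared && BigTransformLegal) {
    if (Index < Max) {
      IR = "transformed";
      ++Index;
    }
    return IR;
  }

  // Step 3: undo the preparatory rewrite.
  if (Prepared) {
    if (UndoIgnoresLimit) {
      IR = "original";
      --Index; // the rewrite and its undo cancel out
    } else if (Index < Max) {
      IR = "original";
      ++Index;
    }
  }
  return IR;
}

void exampleUsage() {
  // Without a limit in the way, an illegal big transform leaves no trace.
  assert(runToyPass(100, false, false) == "original");
  // Limit hit between steps 1 and 3: the undo is blocked.
  assert(runToyPass(1, false, false) == "prepared");
  // Letting the undo through restores the expected result.
  assert(runToyPass(1, false, true) == "original");
}
```

In real passes the three steps live in different functions, which is what
makes the connection between them easy to miss.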
-David
On 06/13/2018 07:46 PM, David A. Greene wrote:
> Fedor Sergeev <fedor....@azul.com> writes:
>
>> On 06/12/2018 12:04 AM, David A. Greene wrote:
>>> // PIA - PassInstrumentationAnalysis
>>> if (PIA->skipTransformation()) {
>>> return;
>>> }
>>> // Do it.
>>> PIA->didTransformation();
>> That should be easily doable (though the interface would be part of
>> PassInstrumentation
>> rather than PassInstrumentationAnalysis).
> Ok. The way I envision this working from a user standpoint is
> -opt-bisect-limit <n> would mean "n applications of code
> transformation." where "code transformation" could mean an entire pass
> run or individual transforms within a pass. Each pass would decide what
> it supports.
I would rather not merge pass-execution and in-pass-transformation
numbers into a single number.
It will only confuse users on what is being controlled.
Especially as in-pass control is going to be opt-in only.
>
>
>>> This kind of interface also encourages good pass design like doing all
>>> the analysis for a transformation before actually doing the
>>> transformation. Some passes mix analysis with transformation and those
>>> are much harder to instrument to support -pass-max operation.
>> I'm not sure everybody would agree on this definition of a good pass
>> design :)
>> Ability to mix analysis with transformation might appear to be rather useful
>> when heavy analysis is only needed in a very special corner case of an
>> overall
>> transformation.
> Yes, I'm sure there are exceptions. I'm not referring to things like
> instcombine that have individual rules that guard transformations and
> the pass iterates applying transformations when rules are matched.
> That's straightforward to instrument.
>
> The harder cases are where the analysis phase itself does some
> transformation (possibly to facilitate analysis) and then decides the
As Philip has already pointed out, analyses by design are expected to be
non-mutating.
regards,
Fedor.
>> The harder cases are where the analysis phase itself does some
>> transformation (possibly to facilitate analysis) and then decides the
>> larger-goal transformation is not viable. If the pass then tries to
>> undo the first transformation, it's possible that -pass-max will result
>> in code that never would have been generated, because it could do the
>> first transformation but then not undo it because it hit the max number
>> of transforms. Sometimes it's difficult to find where things are undone
>> and update the transformation index (basically allow the undo and
>> decrement the index to reflect the undo).
> It should be pointed out that analyses don't transform the IR. At
> least not in the new PassManager, which I think we should focus on in
> this proposal.
I'm not talking about analysis passes as such. I'm talking about
transformation passes that check various conditions before doing
transformations. They have to check legality, profitability, etc. Most
of the time this is well-separated but sometimes things can get pretty
convoluted and it's not always clear where the "logical changes" are, as
opposed to component changes that make up a single logical change.
>> Ok. The way I envision this working from a user standpoint is
>> -opt-bisect-limit <n> would mean "n applications of code
>> transformation." where "code transformation" could mean an entire pass
>> run or individual transforms within a pass. Each pass would decide what
>> it supports.
> I would rather not merge pass-execution and in-pass-transformation
> numbers into a single number.
> It will only confuse users on what is being controlled.
> Especially as in-pass control is going to be opt-in only.
Oh, ok. I'm fine with that too. Do we want this finer-grained control
on a global basis, or a per-pass basis? For example, should something
like -transform-max=<n> apply over the whole compilation run, so that
every pass checks the limit, or should it work like
-transform-max=<pass>=<n>, where only pass <pass> checks the limit? If
the latter, then -opt-bisect-limit (or bugpoint) can identify the pass
and another run with -transform-max can identify the specific transform
within the pass.
The latter is how we have things set up here and it seems to work well,
but I can also see utility in a global limit because then you don't need
two separate runs to isolate the problem.
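
To compare the two forms side by side, here is a minimal sketch that
accepts either "<n>" or "<pass>=<n>" as the option value. The
-transform-max flag is only a proposal in this thread, and the
TransformMax class, its parsing and its method names are assumptions made
for illustration (malformed numbers make std::stoull throw).

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical model of the proposed -transform-max option value.
class TransformMax {
public:
  // Value is "<n>" for a global limit or "<pass>=<n>" for a per-pass limit.
  explicit TransformMax(const std::string &Value) {
    std::string::size_type Eq = Value.find('=');
    if (Eq == std::string::npos) {
      Limit = std::stoull(Value);
    } else {
      Pass = Value.substr(0, Eq);
      Limit = std::stoull(Value.substr(Eq + 1));
    }
  }

  // Returns true if PassID may apply one more transformation, and counts it
  // against the limit when the limit applies to that pass.
  bool shouldTransform(const std::string &PassID) {
    if (!Pass.empty() && PassID != Pass)
      return true; // the limit targets a different pass
    if (Done >= Limit)
      return false;
    ++Done;
    return true;
  }

private:
  std::string Pass;   // empty means the limit is global
  uint64_t Limit = 0; // maximum number of counted transformations
  uint64_t Done = 0;  // counted transformations so far
};

void exampleUsage() {
  TransformMax Global("2");
  assert(Global.shouldTransform("instcombine"));
  assert(Global.shouldTransform("gvn"));
  assert(!Global.shouldTransform("gvn")); // third transform overall

  TransformMax PerPass("instcombine=1");
  assert(PerPass.shouldTransform("gvn")); // not counted
  assert(PerPass.shouldTransform("instcombine"));
  assert(!PerPass.shouldTransform("instcombine"));
}
```

With the per-pass form, other passes run unrestricted, which matches the
two-run workflow: first find the pass, then find the transform within it.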
I'd like to start building this off the pass instrumentation stuff as
soon as it gets integrated. Could you copy me on Phabricator when they
land there? Thanks!
>> The harder cases are where the analysis phase itself does some
>> transformation (possibly to facilitate analysis) and then decides the
> As Philip has already pointed out, analyses by design are expected to
> be non-mutating.
See my reply to Philip. I'm talking about various analyses that happen
within transformation passes.
> I'd like to start building this off the pass instrumentation stuff as
> soon as it gets integrated. Could you copy me on Phabricator when they
> land there? Thanks!
BTW, I am "greened" on Phabricator.
> This seems to be pretty much orthogonal to the pass manager
> instrumentation. In fact, there is nothing keeping you from
> implementing this for your pass right now using debug counters. That's
> mostly their job, and they are independent of the pass manager
> implementation.
Yes, it sounds like that will work. When I did things on our end, those
didn't exist.
>> See my reply to Philip. I'm talking about various analyses that
>> happen within transformation passes.
>
> I see, then I just misunderstood what you meant by analysis. I believe
> what you were going for here can as well be implemented on top of
> debug counters.
I don't think anything is needed other than debug counters, if I'm
understanding how they work. If we wanted some kind of global limit
that would require more, but we haven't found a need for that.
Thanks for the pointer!
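
For readers who have not used debug counters, the idea can be sketched
with a small standalone model. This is not LLVM's DebugCounter class: the
ToyDebugCounter name and its exact skip/count semantics (skip the first
Skip executions, then allow the next Count) are simplifying assumptions
chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Simplified, hypothetical model of a debug counter: the guarded code runs
// only for executions in the window [Skip, Skip + Count).
class ToyDebugCounter {
public:
  ToyDebugCounter(uint64_t SkipFirst, uint64_t AllowNext)
      : Skip(SkipFirst), Count(AllowNext) {}

  // Called once per candidate transformation; true means "go ahead".
  bool shouldExecute() {
    uint64_t N = Seen++;
    return N >= Skip && N < Skip + Count;
  }

private:
  uint64_t Skip;
  uint64_t Count;
  uint64_t Seen = 0;
};

// Counts how many of the given number of attempts are allowed to execute.
unsigned countExecuted(ToyDebugCounter &Counter, unsigned Attempts) {
  unsigned Executed = 0;
  for (unsigned I = 0; I < Attempts; ++I)
    if (Counter.shouldExecute())
      ++Executed; // the guarded transformation would run here
  return Executed;
}

void exampleUsage() {
  ToyDebugCounter Counter(2, 3); // skip two, then allow three
  assert(countExecuted(Counter, 10) == 3);
}
```

A counter like this is local to the code it guards, which is why it works
independently of the pass manager implementation.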
On Wed, Jun 13, 2018 at 8:03 PM David A. Greene via llvm-dev <llvm...@lists.llvm.org> wrote:
Fedor Sergeev <fedor....@azul.com> writes:
>> Ok. The way I envision this working from a user standpoint is
>> -opt-bisect-limit <n> would mean "n applications of code
>> transformation." where "code transformation" could mean an entire pass
>> run or individual transforms within a pass. Each pass would decide what
>> it supports.
> I would rather not merge pass-execution and in-pass-transformation
> numbers into a single number.
> It will only confuse users on what is being controlled.
> Especially as in-pass control is going to be opt-in only.
Oh, ok. I'm fine with that too. Do we want this finer-grained control
on a global basis, or a per-pass basis? For example, should something
like -transform-max=<n> apply over the whole compilation run, so that
every pass checks the limit, or should it work like
-transform-max=<pass>=<n>, where only pass <pass> checks the limit? If
the latter, then -opt-bisect-limit (or bugpoint) can identify the pass
and another run with -transform-max can identify the specific transform
within the pass.
This seems to be pretty much orthogonal to the pass manager instrumentation. In fact, there is nothing keeping you from implementing this for your pass right now using debug counters. That's mostly their job, and they are independent of the pass manager implementation.
> My problem with debug counters is that they are ... well ...
> debug-only :)
> I was planning to use debug counters for the purpose of pass execution
> counting but then
> realized that it will not work in Release mode at all.
>
> But other than that, debug counters seem to be exactly the machinery
> designed for opt-in control of internal pass activity.
Why were debug counters made debug-only in the first place? We
certainly use our -pass-max stuff in release builds. Most of the time a
debug build is fine but for some codes a debug build is way too slow to
allow bisecting in a reasonable amount of time.
> It seems reasonable to me to require that assertions are on when you're
> trying to debug the compiler. Not so much to require that the compiler
> itself has been built with `-O0` :)
Thanks for explaining things. It's a little weird, name-wise, to use
ENABLE_ASSERTIONS to get debug counters but I can see how it would make
sense.