Re: [mlir] (RFC) Generalizing the MLIR Pass Manager


Chris Lattner

Aug 8, 2019, 12:45:58 AM
to River Riddle, MLIR
On Aug 7, 2019, at 5:49 PM, 'River Riddle' via MLIR <ml...@tensorflow.org> wrote:
Hi all,
I'd like to propose an RFC for generalizing the pass manager in MLIR given that the previous proposal to rethink the representations of Functions/Modules as Operations was accepted by the community. Now that this proposal has been implemented, the fundamental design of the pass manager will also require a rethink. Currently the pass manager infrastructure supports two different types of passes: Module Passes and Function Passes; with each operating on the respectively named IR entity. With Modules and Functions now represented with operations, this seems overly limiting. For example, dialects may want to define a custom function operation, like the LLVM dialect does, and write transformation passes on that abstraction. The builtin func operation may also appear in other regions than the top-level module.

Agreed, thank you for tackling this!

Given the above, I propose that we abstract the pass manager infrastructure to work on arbitrary operations at arbitrary levels of nesting. To accomplish this several pieces of the infrastructure need to be generalized:

Pass Manager Structure


In the current pass manager, there are two levels of nesting in the form of ModulePassManagers(MPM) and FunctionPassManagers(FPM); where FPMs may be nested within MPMs to form pass pipelines. This closely models the legacy relationship between Function and Module before they became operations. Now that Function and Module are modelled as operations, the level at which an operation may be nested is arbitrary.

Yes.  Ops with regions that are known IsolatedFromAbove are the natural unit of parallelization in a hierarchical representation like MLIR.

This means that the system needs to be expanded to support multiple levels of nesting. To support this, we introduce the concept of an OpPassManager(OPM).

What happens to function/module pass manager?

An OPM runs passes that operate on operations of a specific type, e.g. FuncOp/ModuleOp/etc. As alluded to above, OPMs support arbitrary levels of nesting. Nesting a new pass manager is as simple as invoking the nest method on any OPM instance. This pipeline nesting *must* be explicit now that operations do not have a set nesting level, as Functions did previously.
// Building a function pipeline.
OpPassManager &parent = …;
auto &pm = parent.nest<FuncOp>();
pm.addPass(createCSEPass());
pm.addPass(createCanonicalizationPass());

  

The types of operations that are supported by an OPM are those marked as IsolatedFromAbove. This restriction is necessary as Passes must not modify state at or above the operation being operated on in order to preserve the ability for MLIR to be multi-threaded at every level of the pass manager. The rationale behind this can be found here: https://github.com/tensorflow/mlir/blob/master/g3doc/Rationale.md#multithreading-the-compiler

I don’t really understand what you are getting at here.  I’m sorry if this is obvious, but I always assumed we would do something like:

typedef OpPassManager<FuncOp> FunctionPassManager;
typedef OpPassManager<ModuleOp> ModulePassManager;

Which implies that the default behavior of OpPassManager is to do a postorder traversal of the region tree of the program, visiting ops that match the template argument (which must be IsolatedFromAbove).

While we want to make it conceptually possible for people to write passes on “their own kind of function” (like LLVM IR functions), the main purpose of these foreign function representations is for interop with external systems like LLVM, it isn’t to be able to write LLVM IR transformations in MLIR.

As such, I think that pushing for more-or-less-standardization on FunctionPassManager (which is a specialization of a generic thing!) is a good thing, and optimizing for simplicity in practice is also useful.

Command Line Specification

Along with the C++ API, the interface for building a pipeline from the command line (for tools like mlir-opt) must also change. The structure of the pipeline must become explicit as it can no longer be implicitly inferred from the type of pass being added. The pipeline specification format will work similarly to LLVM’s new pass manager, i.e. by providing a pipeline string that encodes the structure and passes to run. The syntax for this specification is as follows:
pipeline: op-name `(` pipeline-element (`,` pipeline-element)* `)`
pipeline-element: pipeline | pass-name

Example:
// The current specification:
$ mlir-opt foo.mlir -cse -canonicalize -lower-to-llvm

// The new specification:
$ mlir-opt foo.mlir -pass-pipeline="func(cse, canonicalize), lower-to-llvm"

:-(. This punishes by far the most common case.  My limited muscle memory won’t know how to use this.  Can we do better?

-Chris



River Riddle

Aug 8, 2019, 1:32:48 AM
to MLIR


On Wednesday, August 7, 2019 at 9:45:58 PM UTC-7, Chris Lattner wrote:
On Aug 7, 2019, at 5:49 PM, 'River Riddle' via MLIR <ml...@tensorflow.org> wrote:
Hi all,
I'd like to propose an RFC for generalizing the pass manager in MLIR given that the previous proposal to rethink the representations of Functions/Modules as Operations was accepted by the community. Now that this proposal has been implemented, the fundamental design of the pass manager will also require a rethink. Currently the pass manager infrastructure supports two different types of passes: Module Passes and Function Passes; with each operating on the respectively named IR entity. With Modules and Functions now represented with operations, this seems overly limiting. For example, dialects may want to define a custom function operation, like the LLVM dialect does, and write transformation passes on that abstraction. The builtin func operation may also appear in other regions than the top-level module.

Agreed, thank you for tackling this!

Given the above, I propose that we abstract the pass manager infrastructure to work on arbitrary operations at arbitrary levels of nesting. To accomplish this several pieces of the infrastructure need to be generalized:

Pass Manager Structure


In the current pass manager, there are two levels of nesting in the form of ModulePassManagers(MPM) and FunctionPassManagers(FPM); where FPMs may be nested within MPMs to form pass pipelines. This closely models the legacy relationship between Function and Module before they became operations. Now that Function and Module are modelled as operations, the level at which an operation may be nested is arbitrary.

Yes.  Ops with regions that are known IsolatedFromAbove are the natural unit of parallelization in a hierarchical representation like MLIR.

This means that the system needs to be expanded to support multiple levels of nesting. To support this, we introduce the concept of an OpPassManager(OPM).

What happens to function/module pass manager?
The idea is that the current 'PassManager' will still operate on the top-level module. Any pass managers nested under that are just OpPassManagers, which can operate on any IsolatedFromAbove op (e.g. FuncOp, ModuleOp, etc.). So essentially, the function pass manager is being generalized into a generic operation pass manager. The module pass manager will remain as is so that we can retain a common interaction point, as well as keep all of the other pass manager options in one place.
 
Also important to note that we don't actually have a user-visible function pass manager ATM.


An OPM runs passes that operate on operations of a specific type, e.g. FuncOp/ModuleOp/etc. As alluded to above, OPMs support arbitrary levels of nesting. Nesting a new pass manager is as simple as invoking the nest method on any OPM instance. This pipeline nesting *must* be explicit now that operations do not have a set nesting level, as Functions did previously.
// Building a function pipeline.
OpPassManager &parent = …;
auto &pm = parent.nest<FuncOp>();
pm.addPass(createCSEPass());
pm.addPass(createCanonicalizationPass());

  

The types of operations that are supported by an OPM are those marked as IsolatedFromAbove. This restriction is necessary as Passes must not modify state at or above the operation being operated on in order to preserve the ability for MLIR to be multi-threaded at every level of the pass manager. The rationale behind this can be found here: https://github.com/tensorflow/mlir/blob/master/g3doc/Rationale.md#multithreading-the-compiler

I don’t really understand what you are getting at here.  I’m sorry if this is obvious, but I always assumed we would do something like:

typedef OpPassManager<FuncOp> FunctionPassManager;
typedef OpPassManager<ModuleOp> ModulePassManager;
The OpPassManager is not templated, to allow for opaquely building a pipeline. As mentioned above, the PassManager is still the main interaction point, so an OpPassManager can only be constructed from another pass manager and not directly by the user. For example, if my dialect has a function pass pipeline, we don't have to assume the operation that we are nesting within:

// This 'pm' could correspond to a module, an lto.translation_unit, or some
// other unknown op.
void buildPipeline(OpPassManager &pm) {
  auto &funcPm = pm.nest<FuncOp>();
  ...
}
 

Which implies that the default behavior of OpPassManager is to do a postorder traversal of the region tree of the program, visiting ops that match the template argument (which must be IsolatedFromAbove).
Yes, it operates in the same way that the pass manager does today.
 

While we want to make it conceptually possible for people to write passes on “their own kind of function” (like LLVM IR functions), the main purpose of these foreign function representations is for interop with external systems like LLVM, it isn’t to be able to write LLVM IR transformations in MLIR.
I'd rather not attach too much to functions or the LLVM dialect; there are going to be many different kinds of top-level operations. For example, today it is impossible to run a pass on a SPIRV module operation. It is also impossible to run passes on nested modules.

module {
  module {

  }
  spv.module {

  }
}
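Under the proposed explicit nesting, a pipeline could in principle target each of those nested modules. A hypothetical invocation might look like the following, where `my-spv-pass` is an invented placeholder for a SPIR-V dialect pass, not a real pass name:

```
$ mlir-opt foo.mlir -pass-pipeline="module(cse), spv.module(my-spv-pass)"
```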
 

As such, I think that pushing for more-or-less-standardization on FunctionPassManager (which is a specialization of a generic thing!) is a good thing, and optimizing for simplicity in practice is also useful.
What do you mean by standardization on FunctionPassManager? Are you suggesting we keep the same restrictions we have today? i.e. passes only run on FuncOp? Can you elaborate a bit more on what you mean here?
 

Command Line Specification

Along with the C++ API, the interface for building a pipeline from the command line (for tools like mlir-opt) must also change. The structure of the pipeline must become explicit as it can no longer be implicitly inferred from the type of pass being added. The pipeline specification format will work similarly to LLVM’s new pass manager, i.e. by providing a pipeline string that encodes the structure and passes to run. The syntax for this specification is as follows:
pipeline: op-name `(` pipeline-element (`,` pipeline-element)* `)`
pipeline-element: pipeline | pass-name

Example:
// The current specification:
$ mlir-opt foo.mlir -cse -canonicalize -lower-to-llvm

// The new specification:
$ mlir-opt foo.mlir -pass-pipeline="func(cse, canonicalize), lower-to-llvm"

:-(. This punishes by far the most common case.  My limited muscle memory won’t know how to use this.  Can we do better?
I welcome any and all suggestions on better ways to represent this, I don't really like it. The problem though is that operations don't have a set nesting level, so the structure needs to be expressed explicitly somehow.

-- River
 

-Chris



River Riddle

Aug 8, 2019, 2:01:47 AM
to Madhur Amilkanthwar, MLIR


On Wed, Aug 7, 2019 at 10:57 PM Madhur Amilkanthwar <madhu...@gmail.com> wrote:
Wouldn't it be nice to have something like below (red lines below)? The idea is to talk to PassManager via two key methods 1. addPass() and 2. addPassManager. 

PassManager pm;

pm.addPass(new MyModulePass());


// Add a few function passes.

OpPassManager &fpm = getFuncOpPassManager();

pm.addPassManager(fpm);

// OpPassManager &fpm = pm.nest<FuncOp>();

 
What would 'getFuncOpPassManager' do here? Each pass manager needs to explicitly own each of its child pass managers. This is what the 'nest' call gets you; I'm not particularly attached to the name of the method if you have a better idea.


-- River
 

fpm.addPass(new MyFunctionPass());

fpm.addPass(new MyFunctionPass2());


// Run the pass manager on a module.

Module m = ...;

if (failed(pm.run(m)))

    ... // One of the passes signaled a failure


--
You received this message because you are subscribed to the Google Groups "MLIR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mlir+uns...@tensorflow.org.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/mlir/1bbe8dd7-eee8-41c3-a533-0d476c354641%40tensorflow.org.


--
Disclaimer: Views, concerns, thoughts, questions, ideas expressed in this mail are my own and my employer has no stake in it.
Thank You.
Madhur D. Amilkanthwar

Mehdi AMINI

Aug 8, 2019, 11:52:30 AM
to River Riddle, MLIR
Hi,


On Wed, Aug 7, 2019 at 5:55 PM 'River Riddle' via MLIR <ml...@tensorflow.org> wrote:
Hi all,
I'd like to propose an RFC for generalizing the pass manager in MLIR given that the previous proposal to rethink the representations of Functions/Modules as Operations was accepted by the community. Now that this proposal has been implemented, the fundamental design of the pass manager will also require a rethink. Currently the pass manager infrastructure supports two different types of passes: Module Passes and Function Passes; with each operating on the respectively named IR entity. With Modules and Functions now represented with operations, this seems overly limiting. For example, dialects may want to define a custom function operation, like the LLVM dialect does, and write transformation passes on that abstraction. The builtin func operation may also appear in other regions than the top-level module.

Given the above, I propose that we abstract the pass manager infrastructure to work on arbitrary operations at arbitrary levels of nesting. To accomplish this several pieces of the infrastructure need to be generalized:

Pass Manager Structure


In the current pass manager, there are two levels of nesting in the form of ModulePassManagers(MPM) and FunctionPassManagers(FPM); where FPMs may be nested within MPMs to form pass pipelines. This closely models the legacy relationship between Function and Module before they became operations. Now that Function and Module are modelled as operations, the level at which an operation may be nested is arbitrary. This means that the system needs to be expanded to support multiple levels of nesting. To support this, we introduce the concept of an OpPassManager(OPM).



An OPM runs passes that operate on operations of a specific type, e.g. FuncOp/ModuleOp/etc. As alluded to above, OPMs support arbitrary levels of nesting. Nesting a new pass manager is as simple as invoking the nest method on any OPM instance. This pipeline nesting *must* be explicit now that operations do not have a set nesting level, as Functions did previously.

// Building a function pipeline.

OpPassManager &parent = …;

auto &pm = parent.nest<FuncOp>();

pm.addPass(createCSEPass());

pm.addPass(createCanonicalizationPass());

  

The types of operations that are supported by an OPM are those marked as IsolatedFromAbove. This restriction is necessary as Passes must not modify state at or above the operation being operated on in order to preserve the ability for MLIR to be multi-threaded at every level of the pass manager. The rationale behind this can be found here: https://github.com/tensorflow/mlir/blob/master/g3doc/Rationale.md#multithreading-the-compiler

Pass Structure

Now that the structure of the pass manager has been detailed, we can discuss the passes they are going to be operating on. Just as the PassManager has been generalized, so too will the Passes. The main pass in the new infrastructure will be the OperationPass.

OperationPass

An OperationPass is a transformation pass that opaquely runs on an operation of the current pass manager. As such, OperationPasses can be placed within any OpPassManager instance. This pass allows for performing transformations on operations that don’t necessarily need to know about the derived op class. This covers a large fraction of the transformations that are written, e.g. very few of the transformation passes that we have today rely on specific invariants of a FuncOp or ModuleOp. Passes like Canonicalization and CSE may operate at any level of nesting, but running at the FuncOp level allows for realizing the benefits of multi-threading. Having passes run on different operation types allows for the pass to use other mechanisms for selective execution, such as traits placed on the operation or some configuration passed in on pass construction.


The definition of an OperationPass is very similar to that of a FunctionPass or ModulePass today:

struct MyPass : public OperationPass<MyPass> {

  /// Operation passes must override the 'runOnOperation' method.

  void runOnOperation() override {

    // 'getOperation' provides access to the operation currently being operated on.

    Operation *op = getOperation();

    ...   

  }

};


OpPass
An OpPass is a transformation pass that runs on an instance of a specific operation type. Unlike OperationPass, an OpPass may only be placed within an OpPassManager that operates on operations of the same kind. OpPasses are defined very similarly to OperationPasses:

/// When defining an OpPass, the operation type must also be specified.

struct MyPass : public OpPass<FuncOp, MyPass> {

  void runOnOperation() override {

    // 'getOperation' now returns the proper derived operation type.

    FuncOp op = getOperation();

    ...   

  }

};



Analysis Management

In terms of analysis management, nested pass managers will have the same relationship that exists today between ModulePassManager and FunctionPassManager. Passes can query analyses on parent/child operations, at any level of nesting, with getCachedParentAnalysis and getChildAnalysis/getCachedChildAnalysis respectively. This is simply a generalization of the existing getCachedModuleAnalysis/getFunctionAnalysis methods.

Pass Pipeline Building

As mentioned above, pipeline building in the pass manager is now explicit rather than implicit. This essentially means that the following pipeline:

PassManager pm;

pm.addPass(new MyModulePass());


// Add a few function passes.

pm.addPass(new MyFunctionPass());

pm.addPass(new MyFunctionPass2());


// Run the pass manager on a module.

Module m = ...;

if (failed(pm.run(m)))

    ... // One of the passes signaled a failure.


Would now be constructed like:

PassManager pm;

pm.addPass(new MyModulePass());


// Add a few function passes.

OpPassManager &fpm = pm.nest<FuncOp>();

fpm.addPass(new MyFunctionPass());

fpm.addPass(new MyFunctionPass2());


// Run the pass manager on a module.

Module m = ...;

if (failed(pm.run(m)))

    ... // One of the passes signaled a failure.



I'm happy to see the weird nesting "magic" going away :)
This is better aligned with the new pass-manager in LLVM as well I believe.

 

Command Line Specification

Along with the C++ API, the interface for building a pipeline from the command line (for tools like mlir-opt) must also change. The structure of the pipeline must become explicit as it can no longer be implicitly inferred from the type of pass being added. The pipeline specification format will work similarly to LLVM’s new pass manager, i.e. by providing a pipeline string that encodes the structure and passes to run. The syntax for this specification is as follows:

pipeline: op-name `(` pipeline-element (`,` pipeline-element)* `)`

pipeline-element: pipeline | pass-name


Example:

// The current specification:

$ mlir-opt foo.mlir -cse -canonicalize -lower-to-llvm


// The new specification:

$ mlir-opt foo.mlir -pass-pipeline="func(cse, canonicalize), lower-to-llvm"


This will reflect better the actual structure of the pass manager, I don't have a better suggestion for the syntax.

If we are looking into the command line, it would have been nice to re-think the way passes register and expose their options so that we could pass options individually to each instance of a pass (and get rid entirely of cl::opt). That's somewhat beyond the scope of the pass-manager redesign itself, though.

-- 
Mehdi


 



Thoughts?


Madhur Amilkanthwar

Aug 9, 2019, 1:38:16 PM
to River Riddle, MLIR
Wouldn't it be nice to have something like below (red lines below)? The idea is to talk to PassManager via two key methods 1. addPass() and 2. addPassManager. 

PassManager pm;

pm.addPass(new MyModulePass());


// Add a few function passes.

OpPassManager &fpm = getFuncOpPassManager();

pm.addPassManager(fpm);

// OpPassManager &fpm = pm.nest<FuncOp>();

fpm.addPass(new MyFunctionPass());

fpm.addPass(new MyFunctionPass2());


// Run the pass manager on a module.

Module m = ...;

if (failed(pm.run(m)))

    ... // One of the passes signaled a failure



Madhur Amilkanthwar

Aug 9, 2019, 1:38:16 PM
to River Riddle, MLIR
Got it. I think the name 'nest' is confusing. I would vote for something like 'addPassManager' to be consistent with 'addPass'. 