Interpreted PatternMatch Execution


Jeff Niu

Sep 24, 2019, 8:06:23 PM
to ml...@tensorflow.org

Hey MLIR community,


I’m an intern this fall in the TensorFlow MLIR team working with River and Mehdi. I am working on an interpreted pattern-matcher and rewrite framework for MLIR. The goal is to make the MLIR framework, which currently relies on TableGen and/or pure C++, easily extensible at runtime.


Users would be able to define patterns in some higher-level DSL/eDSL (in Swift, Python, etc.) or possibly in a static configuration file (e.g. JSON, YAML). These higher-level descriptions of the rewrites would be converted to a bytecode (or similar) suitable to be consumed by a light interpreter embedded in the MLIR compiler. This is akin to the mechanism in LLVM's DAG selection framework: a state machine generated by TableGen is interpreted to perform instruction selection. A difference from the LLVM implementation will be making it possible to load more patterns at runtime, without having to recompile the entire compiler or being bound by the C++ ABI, as with a C++ plugin approach.
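
To make this concrete, here is a minimal, self-contained sketch of what "a light interpreter consuming a match bytecode" could look like. This is only an illustration, not the project's actual design: FakeOp and MatchInsn are hypothetical stand-ins for mlir::Operation and for whatever encoding we settle on. Each bytecode instruction checks one property of the operation under match, and the interpreter walks the instruction list until a check fails or the whole pattern succeeds.

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-in for an IR operation; the real framework would
// operate on mlir::Operation.
struct FakeOp {
  std::string name;                // e.g. "tf.Add"
  unsigned numOperands;
  std::vector<std::string> attrs;  // attribute names present on the op
};

// One "bytecode" instruction: a single check against the op under match.
struct MatchInsn {
  enum Kind { CheckOpName, CheckNumOperands, CheckHasAttr } kind;
  std::string strArg;  // used by CheckOpName / CheckHasAttr
  unsigned intArg;     // used by CheckNumOperands
};

// The interpreter: run the checks in order; the pattern matches only if
// all of them succeed.
bool runMatcher(const std::vector<MatchInsn> &program, const FakeOp &op) {
  for (const MatchInsn &insn : program) {
    switch (insn.kind) {
    case MatchInsn::CheckOpName:
      if (op.name != insn.strArg) return false;
      break;
    case MatchInsn::CheckNumOperands:
      if (op.numOperands != insn.intArg) return false;
      break;
    case MatchInsn::CheckHasAttr:
      if (std::find(op.attrs.begin(), op.attrs.end(), insn.strArg) ==
          op.attrs.end())
        return false;
      break;
    }
  }
  return true;
}

int main() {
  // "Match a two-operand tf.Add that carries a 'T' attribute."
  std::vector<MatchInsn> pattern = {
      {MatchInsn::CheckOpName, "tf.Add", 0},
      {MatchInsn::CheckNumOperands, "", 2},
      {MatchInsn::CheckHasAttr, "T", 0},
  };
  FakeOp op{"tf.Add", 2, {"T"}};
  std::cout << (runMatcher(pattern, op) ? "matched\n" : "no match\n");
}

A real implementation would of course also bind captured operands and attributes so the rewrite half of the pattern can refer to them, and would dispatch over many patterns at once rather than one at a time.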


A direction we would like to explore in the implementation is to create a dialect in MLIR itself to represent this “bytecode” that drives the matching and rewrite. By representing the patterns in MLIR, optimizing the matching can be addressed as a compiler problem within existing MLIR infrastructure. Patterns with similar structure and constraints can be optimized to speed up execution. For instance, repeated comparisons (like looking up the same attribute) across multiple patterns can be reused by removing redundant operations in the MLIR representation.
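
As a rough illustration of the kind of optimization meant here (again a hypothetical sketch, not the planned dialect): if each pattern's match program is flattened into an ordered list of checks, patterns that begin with the same checks can be merged into a prefix tree, so shared checks such as a repeated attribute lookup are evaluated once per operation instead of once per pattern.

#include <map>
#include <memory>
#include <string>
#include <vector>

// Each pattern's match program, reduced to an ordered list of
// string-encoded checks, e.g. {"op-name=tf.Add", "has-attr=T"}.
using MatchProgram = std::vector<std::string>;

// A node in the prefix tree; edges are labelled with checks, and a node
// lists the patterns fully matched once every check on the path from the
// root has succeeded.
struct TrieNode {
  std::vector<int> matchedPatterns;
  std::map<std::string, std::unique_ptr<TrieNode>> children;
};

// Merge several linear match programs into one prefix tree. Patterns that
// start with the same checks share a path, so those checks run only once
// when the tree is walked against an operation.
std::unique_ptr<TrieNode> buildTrie(const std::vector<MatchProgram> &patterns) {
  auto root = std::make_unique<TrieNode>();
  for (int i = 0; i < static_cast<int>(patterns.size()); ++i) {
    TrieNode *node = root.get();
    for (const std::string &check : patterns[i]) {
      std::unique_ptr<TrieNode> &child = node->children[check];
      if (!child)
        child = std::make_unique<TrieNode>();
      node = child.get();
    }
    node->matchedPatterns.push_back(i);
  }
  return root;
}

int main() {
  std::vector<MatchProgram> patterns = {
      {"op-name=tf.Add", "has-attr=T", "num-operands=2"},
      {"op-name=tf.Add", "has-attr=T", "has-attr=device"},
  };
  auto root = buildTrie(patterns);
  // Both patterns share the "op-name=tf.Add" -> "has-attr=T" prefix, so a
  // walk of the tree performs those two checks once instead of twice.
  (void)root;
}

This is roughly how the TableGen-generated SelectionDAG matcher tables already factor common checks in LLVM; representing the patterns in MLIR means the same factoring can be done with ordinary transformations over that representation rather than an ad-hoc data structure.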


Please feel free to share any questions or suggestions, including pointers to similar/related work in other frameworks.


-- Jeff


Renato Golin

Sep 25, 2019, 6:19:07 AM
to Jeff Niu, MLIR
Hi Jeff,

This sounds like a very interesting project, and one of the main
reasons why I got interested in MLIR. It'd be great if you could copy
me in the reviews, for my own education. :)

Some comments inline.

On Wed, 25 Sep 2019 at 01:06, 'Jeff Niu' via MLIR <ml...@tensorflow.org> wrote:
> The goal is to make the MLIR framework, which currently relies on TableGen and/or pure C++, easily extensible at runtime.

While this is a commendable goal, it's also a Pandora's box. TableGen
is far from perfect, but the patterns we generate are extensively
tested in the way they interact with others. This would be impossible
to do with dynamic patterns / rewrites.

The main problem is that the space of all combinations of TableGen
patterns is very large, but because they are matched statically, we
only need to test a very small subset of that space. Adding a dynamic
component to that would increase complexity by orders of magnitude
without solving a comparable number of cases.

I'm not advocating against such a move; by all means, I think it's a
worthwhile goal. But we need to make sure there are constraints,
artificial ones if need be, to curb the combinatorial nature of
dynamically changing the compiler's behaviour.

You may have to spend considerable time protecting the compiler from
crashing. It may sound drastic, but this was very common in LLVM when
changing things as simple as the order of passes.

> Users would be able to define patterns in some higher-level DSL/eDSL (in Swift, Python, etc.) or possibly in a static configuration file (e.g. JSON, YAML). These higher-level descriptions of the rewrites would be converted to a bytecode (or similar) suitable to be consumed by a light interpreter embedded in the MLIR compiler.

A common rewrite language for MLIR sounds interesting, no matter
where the patterns come from. The main issue is where the validation
comes from.

Say I want to swap two instructions: when do I guarantee the swap is
legal? Before applying it (and paying the cost up front), or only
after N transformations have been done, paying the cost just for the
series of transformations that are deemed to have a positive effect?
Depending on which way you go, the algorithms will end up with very
different cost models.

> This is akin to a similar mechanism in LLVM with the DAG selection framework: a state machine generated by TableGen is interpreted to perform instruction selection. A difference with the LLVM implementation will be to make it possible to load more patterns at runtime, without having to recompile the entire compiler or be bound by the C++ ABI like with a C++ plugin approach.

Remember, being bound is not a bad thing, as it simplifies the
validation process. I wouldn't get rid of all bounds in an attempt to
make it "really flexible".

> A direction we would like to explore in the implementation, is to create a dialect in MLIR itself to represent this “bytecode” that drives the matching and rewrite. By representing the patterns in MLIR, optimizing the matching can be addressed as a compiler problem within existing MLIR infrastructure. Patterns with similar structure and constraints can be optimized to speed up execution. For instance, repeated comparisons (like looking up the same attribute) across multiple patterns can be reused by removing redundant operations in the MLIR representation.

This sounds to me like temporary metadata that gets lost after the
pass is done. Usually, that kind of information gets stored in global
state in the compiler (results of analysis passes, which need to be
invalidated and so on), but having it in the IR itself, with some
guaranteed semantics, is interesting.

One could export the MLIR in this dialect, feed it into another
program, export it again and feed it back to the MLIR pass. As long as
the dialect is stable and simple, the internal rules can end up being
just like LLVM's metadata: "if I understand it, I use it; if not, I
may have to drop it", and let the tools negotiate at their own level
(above the compiler).

Overall, I think this is a pretty nice project and I'd love to hear
more about it. :)

cheers,
--renato

Jeff Niu

Sep 25, 2019, 5:06:12 PM
to MLIR
Hi Renato,

Thanks for the comments!

The goal of this project is to provide users with an option other than TableGen for defining patterns.
The TableGen C++ backend will remain the go-to for more complex use-cases. At some point, 
there will be a TableGen backend for dynamic PatternMatch as well, which will hopefully share
similar infrastructure with the current one.

The dynamic PatternMatch will provide a "standard library" of sorts (similar to OpBase.td), which
users could extend, e.g. a TF PatternMatch lib. The basic library should provide enough functionality
so that most patterns could be expressed dynamically. The flexibility comes from not being bound
to TableGen -> C++, at the cost of some features and the ability to embed native code.
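
To sketch how such a library could surface to dynamic patterns (purely hypothetical; ConstraintRegistry and registerConstraint are names invented for illustration, not an existing API): the core library registers named predicates, a dialect-specific library such as a TF PatternMatch lib registers more, and dynamically loaded patterns refer to them by name.

#include <functional>
#include <iostream>
#include <map>
#include <string>

// Hypothetical stand-in for an operation; a real library would take
// mlir::Operation*.
struct FakeOp {
  std::string name;
  unsigned numResults;
};

// A registry of named constraints. Core MLIR would pre-populate it (the
// OpBase.td analogue), and dialect-specific libraries would extend it.
class ConstraintRegistry {
public:
  using Constraint = std::function<bool(const FakeOp &)>;

  void registerConstraint(const std::string &name, Constraint fn) {
    constraints[name] = std::move(fn);
  }

  // Dynamic patterns refer to constraints by name only; unknown names are
  // simply rejected, so a missing library cannot crash the compiler.
  bool check(const std::string &name, const FakeOp &op) const {
    auto it = constraints.find(name);
    return it != constraints.end() && it->second(op);
  }

private:
  std::map<std::string, Constraint> constraints;
};

int main() {
  ConstraintRegistry registry;
  // "Core" constraints, available to every dynamic pattern.
  registry.registerConstraint(
      "single-result", [](const FakeOp &op) { return op.numResults == 1; });
  // A TF-specific extension registered by a separate library.
  registry.registerConstraint("is-tf-op", [](const FakeOp &op) {
    return op.name.rfind("tf.", 0) == 0;  // name starts with "tf."
  });

  FakeOp op{"tf.Add", 1};
  std::cout << registry.check("is-tf-op", op) << " "
            << registry.check("single-result", op) << "\n";
}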

Presently, the dynamic patterns are run as a single FunctionPass, but they could be split into multiple
passes that are run at different times. But the key part of the project is how the patterns are optimized/compressed.

I'll be sure to keep you updated on the progress.

-- Jeff

Renato Golin

Sep 26, 2019, 5:56:17 AM
to Jeff Niu, MLIR
On Wed, 25 Sep 2019 at 22:06, 'Jeff Niu' via MLIR <ml...@tensorflow.org> wrote:
> The dynamic PatternMatch will provide a "standard library" of sorts (similar to OpBase.td), which
> users could extend, E.g. a TF PatternMatch lib. The basic library should provide enough functionality
> so that most patterns could be expressed dynamically. The flexibility comes from not being bound
> to TableGen -> C++, at the cost of some features and the ability to embed native code.

Oh, so the TableGen dynamic PatternMatch will implement the "library
functions" and the external matchers will use these functions to
encode their own patterns?

That'd be restrictive enough to help with combinatorial validation,
but generic enough (if it evolves with users' needs) to allow pretty
general high-level changes.

> Presently, the dynamic patterns are run as a single FunctionPass, but they could be split into multiple
> that are run at different times. But the key part of the project is how the patterns are optimized/compressed.

It would be interesting if it were split into at least two parts:
first, the optimisation of the patterns themselves, then the
application of the optimised transformations to the MLIR graph.

I think opening the pass manager to dynamic reordering of passes would
be a can of worms, though. A more conservative approach would be, if
needed, to allow early/late pass bindings and perhaps a list of
analysis dependencies.

If these two passes could also be plugged in as shared libraries, we
could have a very dynamic way to transform MLIR, yet restricted by the
common library and optimisation rules.
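
Roughly what I have in mind, as a sketch only (the entry-point name
mlirRegisterDynamicPatterns is made up; dlopen/dlsym are just the
usual POSIX calls): the plugin exposes a single C-linkage entry point,
the host resolves and calls it, and the returned buffer is handed to
the interpreter and optimisation passes.

// build with: g++ load_patterns.cpp -ldl
#include <dlfcn.h>   // POSIX dynamic loading: dlopen, dlsym, dlerror
#include <iostream>

// Hypothetical plugin interface. Keeping the entry point extern "C" and
// the payload a plain struct avoids tying plugins to the host's C++ ABI.
struct PatternBuffer {
  const char *data;  // serialized match/rewrite programs
  unsigned size;
};
using RegisterFn = PatternBuffer (*)();

int main(int argc, char **argv) {
  if (argc < 2) {
    std::cerr << "usage: load-patterns <plugin.so>\n";
    return 1;
  }
  // Load the plugin and resolve its single C-linkage entry point.
  void *handle = dlopen(argv[1], RTLD_NOW);
  if (!handle) {
    std::cerr << "dlopen failed: " << dlerror() << "\n";
    return 1;
  }
  auto registerFn = reinterpret_cast<RegisterFn>(
      dlsym(handle, "mlirRegisterDynamicPatterns"));
  if (!registerFn) {
    std::cerr << "missing entry point: " << dlerror() << "\n";
    return 1;
  }
  PatternBuffer buffer = registerFn();
  std::cout << "loaded " << buffer.size << " bytes of pattern bytecode\n";
  // The bytes would then be handed to the pattern optimiser and the
  // interpreter pass.
}

Keeping the boundary at a C function returning plain data is what
avoids the C++ ABI coupling Jeff mentioned, while still letting the
patterns themselves stay in the common, validated library vocabulary.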

I never liked how LLVM IR uses ethereal metadata for keeping
information around across different stages and I saw MLIR's dialects
as part of solving that problem.

This work is another part, making MLIR a powerful optimisation engine
with an arbitrary level of complexity, but explicitly so, where users
know the cost they're paying.

cheers,
--renato