TF to mid-level MLIR forms doc/contributions + roadmap


Uday Bondhugula

May 12, 2020, 9:44:33 AM
to MLIR
Hello,

There are broadly two topics that I'm asking about below. 

1) I was looking at the TensorFlow MLIR overview and I couldn't find information documenting the typical layers (dialects and conversions) to go from a TF GraphDef into mid-level MLIR forms like the linalg, Affine, or SCF dialects. For TF to TF Lite itself, this README.md provides a good high-level summary (although it's difficult to find and not linked from https://www.tensorflow.org/mlir/overview). But for TF through the MLIR layers and code generation, there isn't such an overview AFAICS. I've learnt that this is the typical conversion pipeline:

GraphDef -> MLIR TF executor -> MLIR TF Control -> MLIR TF (Standard CF) -> MLIR xla_hlo -> MLIR xla_lhlo -> linalg / affine / loops

And each of these conversions has a corresponding pass in tf-mlir-translate or tf-opt (all in tf-opt except the first, which is an IR translation), with a description for the command-line option among numerous other options. Is this complete and accurate? I think it would be useful to have a README.md similar to the Lite one at https://github.com/tensorflow/tensorflow/tree/master/tensorflow/compiler/mlir/tensorflow providing a summary, especially for all those interested in contributing to the pipeline. Alternatively, there could be a description of these intermediate dialects and conversion passes like there is at mlir.llvm.org (Dialects and Passes) - I guess the ones for TF MLIR aren't auto-generated, so perhaps it wouldn't be practical to maintain them separately.
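To make this concrete, here is roughly the command-line sequence I have in mind - just a sketch assembled from the options I found, so the exact flag names may be off or version-dependent:

```
# Import the GraphDef into the MLIR TF executor dialect (an IR translation).
tf-mlir-translate -graphdef-to-mlir model.pbtxt > executor.mlir

# TF executor -> TF control -> TF dialect (with standard control flow).
tf-opt -tf-executor-to-control-conversion -tf-raise-control-flow executor.mlir > tf.mlir

# TF -> xla_hlo -> xla_lhlo -> linalg (and onward to affine/loops).
tf-opt -xla-legalize-tf -hlo-legalize-to-lhlo -lhlo-legalize-to-linalg tf.mlir > linalg.mlir
```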

2) On a related note, the tensor compute dialect work/proposal in progress on MLIR's list is about developing something subsuming {xla_hlo, xla_lhlo, linalg}, and linalg is presumably expanding to the left. Given this, a few concrete questions on the roadmap regarding contributing to the larger end-to-end pipeline: would it be useful to invest more in TF to xla_hlo legalization to complete the missing pieces and get, say, a full model working, or are most people just waiting for the "tensor compute dialect" so that the legalization from the MLIR TF dialect can be done straight to that dialect when it's close to ready? Currently, linalg and lhlo are in overlapping spaces, and with known/named ops on tensor types, linalg is already covering the hlo space as well. Is there any early work on cloning the tf -> xla_hlo conversion to tf -> tcd/linalg, or is that too early? And finally, are the MLIR TF layers above the tensor compute dialect (TF executor -> TF control -> TF) mostly "design stable" with some parts yet to be implemented? Is there a particular roadmap for those?

Thanks,
Uday

Mehdi AMINI

May 12, 2020, 5:59:51 PM
to Uday Bondhugula, MLIR
Hey Uday,


On Tue, May 12, 2020 at 6:44 AM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:
Hello,

There are broadly two topics that I'm asking about below. 

1) I was looking at the TensorFlow MLIR overview and I couldn't find information documenting the typical layers (dialects and conversions) to go from a TF GraphDef into mid-level MLIR forms like the linalg, Affine, or SCF dialects. For TF to TF Lite itself, this README.md provides a good high-level summary (although it's difficult to find and not linked from https://www.tensorflow.org/mlir/overview). But for TF through the MLIR layers and code generation, there isn't such an overview AFAICS. I've learnt that this is the typical conversion pipeline:

GraphDef -> MLIR TF executor -> MLIR TF Control -> MLIR TF (Standard CF) -> MLIR xla_hlo -> MLIR xla_lhlo -> linalg / affine / loops

Note that `MLIR TF Control` is deprecated and mostly unused (I think TFLite still has it in their pipeline, but this is just a missing cleanup).
There are also some subtleties around the various HLO dialects, but at a high level this picture is correct.
 

And each of these conversions has a corresponding pass in tf-mlir-translate or tf-opt (all in tf-opt except the first, which is an IR translation), with a description for the command-line option among numerous other options. Is this complete and accurate? I think it would be useful to have a README.md similar to the Lite one at https://github.com/tensorflow/tensorflow/tree/master/tensorflow/compiler/mlir/tensorflow providing a summary, especially for all those interested in contributing to the pipeline. Alternatively, there could be a description of these intermediate dialects and conversion passes like there is at mlir.llvm.org (Dialects and Passes) - I guess the ones for TF MLIR aren't auto-generated, so perhaps it wouldn't be practical to maintain them separately.

These are really good questions, and we're really overdue for an update on this: I was actually planning to address this last month, but for various reasons it didn't happen (you may notice that I have a slot in the weekly meeting agenda for this).
We've also been pivoting a lot since the beginning of 2020 to realign ourselves with respect to TFRT, which is our primary runtime target at the moment. So we actually have to document and reorganize many different pipelines depending on the use case.

 

2) On a related note, the tensor compute dialect work/proposal in progress on MLIR's list is about developing something subsuming {xla_hlo, xla_lhlo, linalg}, and linalg is presumably expanding to the left. Given this, a few concrete questions on the roadmap regarding contributing to the larger end-to-end pipeline: would it be useful to invest more in TF to xla_hlo legalization to complete the missing pieces and get, say, a full model working, or are most people just waiting for the "tensor compute dialect" so that the legalization from the MLIR TF dialect can be done straight to that dialect when it's close to ready? Currently, linalg and lhlo are in overlapping spaces, and with known/named ops on tensor types, linalg is already covering the hlo space as well. Is there any early work on cloning the tf -> xla_hlo conversion to tf -> tcd/linalg, or is that too early?

Again, this is a multi-dimensional problem: for example, it may depend on the horizon (the next two quarters vs the next year) or on the product integration (for example, LHLO exists for the sole purpose of integrating better with XLA and providing a path for incremental migration from XLA to MLIR).
The work on TCP is bootstrapping (the IREE folks are mostly leading this right now in https://github.com/google/mlir-npcomp, trying to get to a quick end-to-end flow that can inform the design).
We aren't working on TF->TCP/Linalg; on the other hand, there are other changes in flight around the design of TF/HLO that may impact how we structure this in the future, but this isn't clear enough yet.

 
And finally, are the MLIR TF layers above the tensor compute dialect (TF executor -> TF control -> TF) mostly "design stable" with some parts yet to be implemented? Is there a particular roadmap for those?

(It is TF executor -> TF, without going through control)
The closest thing we have to a production pipeline is likely our TPU XLA bridge, where we run this ahead of time / before execution: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/mlir/tensorflow/transforms/bridge.cc#L74
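If you want to poke at parts of this from the command line, something along these lines should approximate it (a sketch only - the registered pipeline names may have drifted since I last looked):

```
# Run the registered TF "standard" pipeline (import cleanups, shape
# inference, optimizations) on an already-imported module.
tf-opt -tf-standard-pipeline imported.mlir

# The TPU bridge itself is assembled programmatically in bridge.cc
# (CreateTPUBridgePipeline) rather than being exposed as a single flag.
```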

Note that the JIT part of the pipeline is using XLA for the backend right now, and so it relies on the ability to inject shapes at runtime and statically infer the shapes for the computation.
 
Best,


-- 
Mehdi

Stella Laurenzo

May 12, 2020, 6:32:51 PM
to Uday Bondhugula, Sean Silva, MLIR
On Tue, May 12, 2020 at 6:44 AM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:
Hi Uday! 

There is so much history in the TF/XLA ecosystem that it is hard to parse through. In addition, we have two compilation flows currently:
  • TF->xla_hlo->IREE Flow->LinAlg->...  (in use by IREE)
  • TF->xla_hlo->xla_lhlo->... (in use by more traditional TF backends)
We ultimately believe that much of this can be rebased on a high-level TCP abstraction, and we've found it valuable to start working on that as part of the npcomp scale model, focusing on extracting the patterns that bridge the various layers that have grown on the IREE and traditional TF sides into an independent place where we can look at them and evaluate how to evolve/upstream/etc. We'd like to see more convergence and upstreaming of the frontend layers, which is why we (the IREE team) are building out the npcomp prototype to aggregate the necessary pieces in preparation for concrete design discussions and upstreaming. +Sean Silva, who is working on this in the mlir-npcomp repo.

In addition to this, we are continuing to push on the TF->XLA path, and it is mature enough to be handling some fairly non-trivial models. Here is IREE's HLO op coverage for various backends (all exercising the HLO->LinAlg path): https://google.github.io/iree/HLOOpCoverage. We don't have a corresponding list of supported TF ops published yet, but we do have our model-level coverage annotated in IREE's TensorFlow tests build file. Most models we care about are in the frustrating space of needing "one or two more op variants", but notably ResNet50 and MobileNet are compiling/running via both the LLVM/CPU and Vulkan-SPIRV codegen paths. There are also a handful of models we track privately (mainly sequence and various audio models). Notably, on the IREE side right now, these are (almost) exclusively forward-pass (inference) only, and we need to expand our inventory to include loss functions. We largely consider the TF->XLA path to be "design stable" for static shapes. Most of the work we are putting into it is either related to extending op coverage or generalizing things to have some support for dynamic shapes.
 

Thanks,
Uday


Uday Bondhugula

May 13, 2020, 1:27:41 AM
to MLIR
Hi Mehdi,


On Wednesday, May 13, 2020 at 3:29:51 AM UTC+5:30, Mehdi AMINI wrote:
Hey Uday,


On Tue, May 12, 2020 at 6:44 AM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:
GraphDef -> MLIR TF executor -> MLIR TF Control -> MLIR TF (Standard CF) -> MLIR xla_hlo -> MLIR xla_lhlo -> linalg / affine / loops

Note that `MLIR TF Control` is deprecated and mostly unused (I think TFLite still has it in their pipeline, but this is just a missing cleanup).

I see - but I can't immediately tell (even from the links below on the TPU XLA bridge) which pass finally gets you out of the executor dialect and into the standard MLIR TF dialect (without going through TF control), i.e., the substitute for -tf-executor-to-control-conversion -tf-raise-control-flow.

 
There are also some subtleties around the various HLO dialects, but at a high level this picture is correct.

 
 

And each of these conversions has a corresponding pass in tf-mlir-translate or tf-opt (all in tf-opt except the first, which is an IR translation), with a description for the command-line option among numerous other options. Is this complete and accurate? I think it would be useful to have a README.md similar to the Lite one at https://github.com/tensorflow/tensorflow/tree/master/tensorflow/compiler/mlir/tensorflow providing a summary, especially for all those interested in contributing to the pipeline. Alternatively, there could be a description of these intermediate dialects and conversion passes like there is at mlir.llvm.org (Dialects and Passes) - I guess the ones for TF MLIR aren't auto-generated, so perhaps it wouldn't be practical to maintain them separately.

These are really good questions, and we're really overdue for an update on this: I was actually planning to address this last month, but for various reasons it didn't happen (you may notice that I have a slot in the weekly meeting agenda for this).
We've also been pivoting a lot since the beginning of 2020 to realign ourselves with respect to TFRT, which is our primary runtime target at the moment. So we actually have to document and reorganize many different pipelines depending on the use case.

Thanks - this would really help!
 

 

2) On a related note, the tensor compute dialect work/proposal in progress on MLIR's list is about developing something subsuming {xla_hlo, xla_lhlo, linalg}, and linalg is presumably expanding to the left. Given this, a few concrete questions on the roadmap regarding contributing to the larger end-to-end pipeline: would it be useful to invest more in TF to xla_hlo legalization to complete the missing pieces and get, say, a full model working, or are most people just waiting for the "tensor compute dialect" so that the legalization from the MLIR TF dialect can be done straight to that dialect when it's close to ready? Currently, linalg and lhlo are in overlapping spaces, and with known/named ops on tensor types, linalg is already covering the hlo space as well. Is there any early work on cloning the tf -> xla_hlo conversion to tf -> tcd/linalg, or is that too early?

Again, this is a multi-dimensional problem: for example, it may depend on the horizon (the next two quarters vs the next year) or on the product integration (for example, LHLO exists for the sole purpose of integrating better with XLA and providing a path for incremental migration from XLA to MLIR).
The work on TCP is bootstrapping (the IREE folks are mostly leading this right now in https://github.com/google/mlir-npcomp, trying to get to a quick end-to-end flow that can inform the design).
We aren't working on TF->TCP/Linalg; on the other hand, there are other changes in flight around the design of TF/HLO that may impact how we structure this in the future, but this isn't clear enough yet.

 
And finally, are the MLIR TF layers above the tensor compute dialect (TF executor -> TF control -> TF) mostly "design stable" with some parts yet to be implemented? Is there a particular roadmap for those?

(It is TF executor -> TF, without going through control)
The closest thing we have to a production pipeline is likely our TPU XLA bridge, where we run this ahead of time / before execution: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/mlir/tensorflow/transforms/bridge.cc#L74

Thanks - these are very useful references. For the TF to TPU XLA bridge, I still couldn't immediately tell from the code what the substitute for TF executor -> TF control -> TF is, i.e., where exactly you get out of the TF executor dialect and into the TF dialect. (Minor: the doc comment on CreateTPUBridgePipeline is missing its second line.) And CompileSerializedMlirToXlaHlo's precondition is that its input is already in the TF dialect (although potentially with TF's control flow model). So I'm missing a reference for the middle step.
 

Note that the JIT part of the pipeline is using XLA for the backend right now, and so it relies on the ability to inject shapes at runtime and statically infer the shapes for the computation.

Thanks.

- Uday
 
 

Mehdi AMINI

May 13, 2020, 1:42:42 AM
to Uday Bondhugula, MLIR
On Tue, May 12, 2020 at 10:27 PM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:
I see - but I can't immediately tell (even from the links below on the TPU XLA bridge) which pass finally gets you out of the executor dialect and into the standard MLIR TF dialect (without going through TF control), i.e., the substitute for -tf-executor-to-control-conversion -tf-raise-control-flow.

This is the combination of `-tf-executor-island-coarsening -canonicalize`
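Concretely (an untested sketch):

```
# Coarsen the tf_executor islands into a single island; canonicalization
# then folds away the trivial graph/island wrapper, leaving tf dialect ops.
tf-opt -tf-executor-island-coarsening -canonicalize input.mlir
```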
 

 

Uday Bondhugula

May 13, 2020, 1:51:43 AM
to MLIR
Hi Stella,

Thank you for responding in detail. Some questions below.

On Wednesday, May 13, 2020 at 4:02:51 AM UTC+5:30, Stella Laurenzo wrote:



Hi Uday! 

There is so much history in the TF/XLA ecosystem that it is hard to parse through. In addition, we have two compilation flows currently:
  • TF->xla_hlo->IREE Flow->LinAlg->...  (in use by IREE)
The nodes in my chain above were intended to be dialects. From the above, it looks like <IREE Flow> is replacing the conversion through lhlo with things that will hopefully get into the tensor compute dialect and be merged with LinAlg? By "IREE flow", did you mean conversions or new op abstractions? I should check the code, but perhaps you meant a single conversion pass (and no new ops / dialect there).


  • TF->xla_hlo->xla_lhlo->... (in use by more traditional TF backends)
 
We ultimately believe that much of this can be rebased on a high-level TCP abstraction, and we've found it valuable to start working on that as part of the npcomp scale model, focusing on extracting the patterns that bridge the various layers that have grown on the IREE and traditional TF sides into an independent place where we can look at them and evaluate how to evolve/upstream/etc. We'd like to see more convergence and upstreaming of the frontend layers, which is why we (the IREE team) are building out the npcomp prototype to aggregate the necessary pieces in preparation for concrete design discussions and upstreaming. +Sean Silva, who is working on this in the mlir-npcomp repo.

This would be great, and npcomp could also serve as an excellent reference for an end-to-end implementation of a user-facing programming model - perhaps the first as well.

 

In addition to this, we are continuing to push on the TF->XLA path, and it is mature enough to be handling some fairly non-trivial models. Here is IREE's HLO op coverage for various backends (all exercising the HLO->LinAlg path): https://google.github.io/iree/HLOOpCoverage. We don't have a corresponding list of supported TF ops published yet, but we do have our model-level

Thanks, this is useful. Perhaps this is what contributors should look at in the short term (my original question on what to use to get to mid-level forms of MLIR) - if you already have great coverage for HLO to LinAlg and this will perhaps evolve into the conversion into tensor compute, it probably makes sense for contributors to invest here. From LinAlg, there is already a path into Affine/SCF to go further without any uncertainty.
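For instance, I'd expect upstream mlir-opt alone to take linalg the rest of the way down to loops - a sketch, modulo exact flag names:

```
# Lower linalg ops to affine loops ...
mlir-opt -convert-linalg-to-affine-loops linalg.mlir
# ... or to scf.for loops.
mlir-opt -convert-linalg-to-loops linalg.mlir
```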
 
coverage annotated in IREE's TensorFlow tests build file. Most models we care about are in the frustrating space of needing "one or two more op variants", but notably ResNet50 and MobileNet are compiling/running via both the LLVM/CPU and Vulkan-SPIRV codegen paths. There are also a handful of models we track privately (mainly sequence and various audio models). Notably, on the IREE side right now, these are (almost) exclusively forward-pass (inference) only, and we need to expand our inventory to include loss functions. We largely consider the TF->XLA path to be "design stable" for static shapes. Most of the work we are putting into it is either related to extending op coverage or generalizing things to have some support for dynamic shapes.

Thanks very much - this is exactly what I was looking for! Very useful to know, and really nice that you have those models working end-to-end. Has there been any thought of having such end-to-end tests in the MLIR-TensorFlow repo itself? Everything needed for the LLVM/CPU path is already being linked in (tf-opt has everything through the MLIR LLVM dialect).

- Uday
 
 


Stella Laurenzo

May 13, 2020, 2:50:35 AM
to Uday Bondhugula, MLIR
On Tue, May 12, 2020 at 10:51 PM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:
The nodes in my chain above were intended to be dialects. From the above, it looks like <IREE Flow> is replacing the conversion through lhlo with things that will hopefully get into the tensor compute dialect and be merged with LinAlg? By "IREE flow", did you mean conversions or new op abstractions? I should check the code, but perhaps you meant a single conversion pass (and no new ops / dialect there).

My apologies for being somewhat imprecise in my description. I should take some time and draw this out, as it has been a point of confusion (and a fast-moving target). The diagram in our README is the best we have right now, but it also lacks precision at the level I suspect you want.

IREE thinks about this in three layers:
  1. Frontend pipeline
  2. IREE input dialects
  3. Backend dialects
Currently, the only frontend pipeline that we support is TensorFlow, and the specific passes that we use to legalize to the IREE input dialects are here. At this level, we often think of this as a conversion from the "tf" dialect to "xla_hlo", but there is actually more going on: the input to this pipeline is actually an MLIR representation of the on-disk TensorFlow SavedModel, which logically contains an MLIR representation of a GraphDef plus additional metadata encoded in the SavedModel (e.g. public function signatures, variables, etc.). This includes the dialects tf, tf_executor, and tf_saved_model. The result of running the TensorFlow legalization pipeline is an MLIR module containing functions of mostly xla_hlo ops, but it will also contain IREE-specific ops and types from the IREE flow dialect (minimally including "flow" ops for things not representable in xla_hlo, such as variables and assignment). Most of the flow ops are optional and can be used to enhance or access additional features if needed (e.g. by inputting a program that has already been partitioned into flow.dispatch_region ops instead of letting IREE's partitioner handle it). One important difference from the default way that TensorFlow thinks about this is that we legalize early to CFG, whereas traditional TensorFlow/XLA representations use each dialect's native control flow primitives instead.

So in effect, the entry to "#2. IREE input dialects" above is a mixture of xla_hlo, [iree] flow, [mlir] std represented with CFG control flow constructs. Stateless models can be completely expressed in xla_hlo, and in fact, this is how the bulk of our individual tests are represented. In the future, we will minimally extend these input dialects to include TCP, and as things evolve, I suspect we will end up inverting the relationship, treating xla_hlo in a similar way to how we treat the TensorFlow dialects today (i.e. having a Frontend Pipeline that legalizes from xla_hlo -> tcp + [iree] flow + [mlir] std).

Currently, backends are invoked on each partitioned flow.dispatch_region once it has been outlined and buffer assignment has been performed. The result is a small xla_hlo program with additional IR for ABI interfaces, shape/buffer information, etc. We invoke backend-specific pipelines to lower these dispatch regions to executables. The backends are currently vmla (reference interpreter), llvm (cpu), and Vulkan/SPIR-V. Just as we eventually plan to migrate to TCP as an input dialect, we would make a similar move on the backends, but there will probably be a significant period of time in which both exist.

Both of our codegen backends (llvm cpu and Vulkan/SPIR-V) use common conversions from xla_hlo -> LinAlg and then proceed to do more target-specific lowering from there. I'm hand-waving over a lot of important details here, but those are the primary layers and dialects.

The key point is that we think of TCP as existing at perhaps half a level of abstraction above xla_hlo (the current prototypes add some concepts, like islands, that are used for clustering and carrying program-level constraints that must be satisfied, and that come up quickly when considering what it takes to interface with more dynamic frontends than XLA has historically serviced).




This would be great, and npcomp could also serve as an excellent reference for an end-to-end implementation of a user-facing programming model - perhaps the first as well.

I really hope so, and that is my interest in it. I think we're overdue for a new reference point at this level, and building it out end to end is helping us answer, from a fresh perspective, what it takes to model the level of dynamism that people expect -- and to focus on getting the layers right to legalize down from there. I suspect that getting this layer right will make it both useful as a user-facing programming model and as a target for similar things to lower into. We'll be talking about it more at upcoming ODM sessions and on the LLVM Discourse, but you're welcome to join our semi-frequent discussions on the #npcomp channel of IREE's Discord server in the meantime. Again, we see this as something new that will ultimately align with upstream MLIR, but we're taking our time to prototype on a small scale before engaging in the more detailed design discussions that will come next.
 

 


Thanks, this is useful. Perhaps this is what contributors should look at in the short term (my original question on what to use to get to mid-level forms of MLIR) - if you already have great coverage for HLO to LinAlg and this will perhaps evolve into the conversion into tensor compute, it probably makes sense for contributors to invest here. From LinAlg, there is already a path into Affine/SCF to go further without any uncertainty.

That is our view of the situation, and since our target is whole-program compilation for devices that were never served well by TensorFlow's existing compiler stacks, we had more latitude to explore the "pure" lowering paths built from upstream MLIR components (i.e. build out the xla_hlo dialects and lower directly to LinAlg, bypassing lhlo and other interop-focused conversions), versus needing to consider the levels of interop with existing XLA-based technologies that the TensorFlow side is trying to balance more directly. I suspect that it all comes together in the end, but we've been exploring the more green-field approaches, whereas the TensorFlow-based compiler efforts have been focusing on bridging to technologies that are already in use and serving a lot of needs. Two sides of the same coin in the long run...
 
 

Thanks very much - this is exactly what I was looking for! Very useful to know, and really nice that you have those models working end-to-end. Has there been any thought of having such end-to-end tests in the MLIR-TensorFlow repo itself? Everything needed for the LLVM/CPU path is already being linked in (tf-opt has everything through the MLIR LLVM dialect).

So far, we've not discussed that specifically, and we have some interest in keeping the repos separate for now, if for no other reason than that it enforces some layering (though we are all literally working together on the shared problem across these codebases). We've been trying to push more of the infra toward the LLVM/MLIR core rather than the TensorFlow repo. IREE is also still quite young, and while the coverage is improving, we have a decent road ahead of us to get to credible performance -- and I suspect that as we get closer to that point, it will be the right time to re-approach these repo/dependency decisions.

Right now, I would interpret our work here as a pretty decent proxy indicating that quite a bit is working through the tf2xla bridge and fairly standard MLIR-based tooling. The next points of high-level evolution of the representations will likely come from TCP and our npcomp experiments, and that will lead to some reshuffling of the component boundaries. And of course, all of this is dependent on us collectively achieving the performance metrics that get us to a point of viability.
 


Uday Bondhugula

May 13, 2020, 11:47:47 PM
to MLIR


On Wednesday, May 13, 2020 at 11:12:42 AM UTC+5:30, Mehdi AMINI wrote:


This is the combination of `-tf-executor-island-coarsening -canonicalize`

Thanks - this works. This wasn't clear from any of the documentation I found - CreateTPUBridgePipeline also has TPU-specific rewrites interspersed.

Uday Bondhugula

May 16, 2020, 8:48:37 PM
to MLIR

Thanks very much for all this information!

- Uday

Uday Bondhugula

May 20, 2020, 3:11:01 PM
to MLIR
Hi Stella,

Overall, it looks like there is a lot going on, with really three things/repos here: MLIR, the TensorFlow MLIR dialects in TF, and IREE, which is sandwiched between the two for the xla_hlo to linalg parts. I still can't see why the conversion from xla_hlo into linalg has to live outside the tensorflow tree. While xla_hlo is in the tensorflow tree and linalg is in MLIR proper, it is odd that the bridge between these two is being developed in a third place - although it looks like there are other reasons you do want such an end-to-end toolchain. But having three things, I feel, would really hurt contributions from outsiders: with three discussion forums, three repositories, and the corresponding dependencies on upstream versions, there are, as things stand, probably very few people who know what the layers are and where they are headed.

From your description, it looks like a part of the IREE bridge will end up in tensorflow (the part that converts to TCP), and a larger part would go into MLIR proper. This basically means that if the xla_hlo -> tcp/linalg conversion development happens in tensorflow, the right parts could incrementally be moved into MLIR. Until then, I assume there won't be other programming models (other than TF) needing those parts of IREE (as you mention, TF is the only pipeline IREE supports). So, unless there is a plan to soon move the pieces converting into the evolving tcp dialect into the tensorflow and mlir trees, it looks like that process would get protracted, and until then there wouldn't be a single reasonable choice, and effort would be duplicated.

The fact that you have ResNet and MobileNet working is already great. The conversions available out of LHLO, in contrast, are minimal, but easy to add quickly (esp. to affine). Linalg, OTOH, provides more conversions with less boilerplate, and with auto-generated stuff going forward for the named ops, I recall. So the IREE flow path appears to be the only active one with a plan to target what will become tcp? (I haven't kept track of the MLIR/ONNX efforts. And LHLO is a buffer-based abstraction, and it wouldn't appear ideal to make it target some form of TCP - but that's possible as well.) Perhaps IREE flow should just be moved into TF and MLIR right away?

- Uday

 



Stella Laurenzo

May 20, 2020, 4:07:53 PM
to Uday Bondhugula, MLIR
You are, of course, correct in your estimation of the awkwardness here. Leaving aside LHLO, which IREE does not have a stake in, we definitely would like to see a repository where the HLO dialect and its outbound conversions could live in one place. Right now, the HLO->LinAlg(on buffers) conversions directly target some IREE HAL and shape abstractions that have no correspondence in either upstream repo, which is the anchor keeping them in IREE at the moment. In addition to some design work needed to get these items abstracted properly for life outside of IREE, we would be open to patches that build up the HLO->LinAlg conversions in the TensorFlow repo. That is an eventual goal of ours, but it has been somewhat difficult to get done.

Personally, I would like to see an independent repo for HLO, its outbound conversions, and a common place for infra related to the aspects IREE is handling (buffer assignment, shape materialization, etc.). While some of that may ultimately belong in upstream MLIR, we currently lack a viable repo like this, organized like peer LLVM projects and suitable for taking a dependency on. That is something that is actively being discussed but has not been resolved. For a number of reasons having to do with repository layout and build-system constraints, IREE has more exacting requirements at present than can be served by literal existence in the TensorFlow repo, so we would like to see these repositories laid out in more of a dependency-friendly fashion that TensorFlow can also depend on, versus moving more pieces into the TensorFlow mono-repo. There is a fair bit of effort involved in pulling that off, however. If there are short-term things that would help you make progress if they existed in one place or another, then we can see about addressing those first.



Stella Laurenzo

May 21, 2020, 12:36:01 AM
to Uday Bondhugula, MLIR
Leaving aside the above, I believe this is what you are looking for: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/mlir/xla/transforms/xla_legalize_to_linalg.cc

(I had forgotten that we had established these common patterns in the TensorFlow repo for the things that don't require further tie-ins to IREE to lower)
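If I remember right, the patterns in that file back passes registered along these lines (flag names from memory, unverified):

```
# xla_hlo (on tensors) -> linalg on tensors, and
# xla_lhlo (on buffers) -> linalg on buffers.
tf-opt -hlo-legalize-to-linalg hlo.mlir
tf-opt -lhlo-legalize-to-linalg lhlo.mlir
```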

Uday R Bondhugula

May 21, 2020, 1:07:10 AM
to Stella Laurenzo, MLIR


On 21/05/2020 10:05, Stella Laurenzo wrote:
> Leaving aside the above, I believe this is what you are looking for:
> https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/mlir/xla/transforms/xla_legalize_to_linalg.cc

Right - this is clearly a good place to seek contributions in the interim for those not wanting to depend on three things. The link you provided earlier, https://google.github.io/iree/HLOOpCoverage, shows great HLO op coverage via IREE, but the coverage for HLO/LHLO to Linalg or even LHLO to Affine is really small in comparison (pretty much just pointwise ops):
https://github.com/tensorflow/tensorflow/blob/203c1de5a4e54079304f154eee1745e6ee3eb3b2/tensorflow/compiler/mlir/xla/transforms/xla_legalize_to_linalg.cc#L645

On the other hand, the IREE part covers almost everything and converts from xla_hlo into linalg:
https://github.com/google/iree/blob/bb44cf05aaf761cf46c1556ce05cd79bc8a88eb0/iree/compiler/Conversion/HLOToLinalg/HLOToLinalgOnBuffers.cpp#L336

I'm now confused as to why this code doesn't live in TensorFlow - it has nothing to do with IREE, only with xla_hlo and linalg. The only difference is that it uses the MLIR/LLVM code style as opposed to the Google/TF style.

- Uday


>
> (I had forgotten that we had established these common patterns in the
> TensorFlow repo for the things that don't require further tie-ins to
> IREE to lower)
>
> On Wed, May 20, 2020 at 1:07 PM Stella Laurenzo <laur...@google.com
> <mailto:laur...@google.com>> wrote:
>
>
>
> On Wed, May 20, 2020 at 12:11 PM 'Uday Bondhugula' via MLIR
> <ml...@tensorflow.org <mailto:ml...@tensorflow.org>> wrote:
>
> Hi Stella,
>
> On Wednesday, May 13, 2020 at 12:20:35 PM UTC+5:30, Stella
> Laurenzo wrote:
>
>
>
> On Tue, May 12, 2020 at 10:51 PM 'Uday Bondhugula' via MLIR
> <ml...@tensorflow.org> wrote:
>
> Hi Stella,
>
> Thank you responding in detail. Some questions below.
>
> On Wednesday, May 13, 2020 at 4:02:51 AM UTC+5:30,
> Stella Laurenzo wrote:
>
>
>
> On Tue, May 12, 2020 at 6:44 AM 'Uday Bondhugula'
> via MLIR <ml...@tensorflow.org> wrote:
>
> Hello,
>
> There are broadly two topics that I'm asking
> about below.
>
> 1) I was looking at TensorFlow MLIR overview
> <https://www.tensorflow.org/mlir/overview> and I
> couldn't find information documenting the
> typical layers (dialects and conversions) to go
> from TF GraphDef into mid-level MLIR forms like
> the linalg, Affine or SCF dialects. For TF to TF
> Lite itself, this README.md
> <https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/mlir/lite/README.md> provides
> a good high-level summary (although it's
> difficult to find and not linked from
> https://www.tensorflow.org/mlir/overview).  But
> for TF through MLIR layers and code generation,
> there isn't such an overview AFAICS. I've learnt
> that this is the typical conversion pipeline:
>
> GraphDef -> MLIR TF executor -> MLIR TF Control
> -> MLIR TF (Standard CF) -> MLIR xla_hlo -> MLIR
> xla_lhlo -> linalg / affine / loops
>
> And each of these conversions has a
> corresponding pass in tf-mlir-translate or
> tf-opt (all in tf-opt except the first which is
> an IR translation) with a description for the
> cmd line option among numerous other options. Is
> this all and accurate?  I think it would be
> useful to have a README.md similar to the Lite
> one at
> https://github.com/tensorflow/tensorflow/tree/master/tensorflow/compiler/mlir/tensorflow providing
> a summary, especially to all those who are
> interested in contributing to the pipeline. Or
> else there could be a description of these
> intermediate dialects and conversion passes like
> there is at mlir.llvm.org <http://mlir.llvm.org>
> (Dialects
> <https://mlir.llvm.org/docs/Dialects/> and
> Passes <https://mlir.llvm.org/docs/Passes/>) - I
> guess the ones for TF MLIR aren't auto-generated
> and so it wouldn't be practical perhaps to
> maintain separately.
>
> 2) On a related note, the tensor compute dialect
> work/proposal in progress
> <https://llvm.discourse.group/t/development-of-high-level-tensor-compute-primitives-dialect-s-and-transformations/388> on
> * TF->xla_hlo->IREE Flow->LinAlg->...  (in use by
> IREE)
>
> The nodes in my chain above were intended to be
> dialects. The above would look like <IREE flow> is
> replacing the conversion through lhlo with things that
> will hopefully get into the tensor compute dialect and
> merged with LinAlg? By "IREE flow", did you mean
> conversions or new op abstractions? I should check the
> code, but you perhaps meant a single conversion pass
> (but no new ops / dialect there).
>
>
> My apologies for being somewhat imprecise in my description.
> I should take some time and draw this out, as it has been a
> point of confusion (and has been a fast moving target). The
> diagram in our README <https://github.com/google/iree> is
> the best we have right now but also lacks precision at the
> level I suspect you want.
>
> IREE thinks about this in three layers:
>
> 1. Frontend pipeline
> 2. IREE input dialects
> 3. Backend dialects
>
> Currently, the only frontend pipeline that we support is
> TensorFlow, and the specific passes that we use to legalize
> to the IREE Input dialects are here
> <https://github.com/google/iree/blob/8b50d98aa1f0b084f2a2715d10e977c4e4be902c/integrations/tensorflow/bindings/python/pyiree/tf/compiler/__init__.py#L48>.
> At this level, we often think of this as a conversion from
> the "tf" dialect to "xla_hlo", but there is actually more
> going on here: the input to this pipeline is actually an MLIR
> representation of the on-disk TensorFlow SavedModel, which
> logically contains an MLIR representation of a GraphDef and
> additional metadata encoded in the SavedModel (i.e. public
> function signatures, variables, etc). This includes the
> dialects tf
> <https://source.corp.google.com/piper///depot/google3/third_party/tensorflow/compiler/mlir/tensorflow/ir/tf_ops.td>,
> tf_executor
> <https://source.corp.google.com/piper///depot/google3/third_party/tensorflow/compiler/mlir/tensorflow/ir/tf_executor_ops.td>,
> and tf_savedmodel
> <https://source.corp.google.com/piper///depot/google3/third_party/tensorflow/compiler/mlir/tensorflow/ir/tf_saved_model_ops.td>.
> The result of running the TensorFlow legalization pipeline
> is an MLIR module containing functions of mostly xla_hlo ops
> <https://source.corp.google.com/piper///depot/google3/third_party/tensorflow/compiler/mlir/xla/ir/hlo_ops.td>,
> but it will also contain IREE-specific ops and types from
> the IREE flow
> <https://github.com/google/iree/blob/master/iree/compiler/Dialect/Flow/IR/FlowOps.td>
> dialect (minimally including "flow" ops for things not
> representable in xla_hlo, such as variables, assignment).
> Most of the flow ops are optional and can be used to enhance
> or access additional features if needed (i.e. by inputting a
> program that has already been partitioned into
> flow.dispatch_region ops instead of letting IREE's
> partitioner handle it). One important difference from the
> default way that TensorFlow thinks about this is that we
> legalize early to CFG, whereas traditional TensorFlow/XLA
> representations use each dialect's native control flow
> primitives instead.
>
> So in effect, the entry to "#2. IREE input dialects" above
> is a mixture of xla_hlo, [iree] flow, [mlir] std represented
> with CFG control flow constructs. Stateless models can be
> completely expressed in xla_hlo, and in fact, this is how
> the bulk of our individual tests are represented
> <https://github.com/google/iree/tree/master/iree/test/e2e>.
> * TF->xla_hlo->xla_lhlo->... (in use by more
> traditional TF backends)
>
> We ultimately believe that much of this can be
> rebased on a high level TCP abstraction, and we've
> found it valuable to start working on that as part
> of the npcomp scale model, focusing on extracting
> the patterns to bridge the various layers that have
> grown in the IREE and traditional TF side into an
> independent place that we can look at and evaluate
> how to evolve/upstream/etc. We'd like to see more
> convergence and upstreaming of the frontend layers,
> which is why we (IREE-team) are building out the
> npcomp prototype to aggregate the necessary pieces
> in preparation for concrete design discussions and
> upstreaming. +Sean Silva who is working on this in
> the mlir-npcomp repo
> <https://github.com/google/mlir-npcomp>.
>
>
> This would be great and npcomp could also serve as an
> excellent reference for an end-to-end implementation of
> a user facing programming model - perhaps the first as well.
>
>
> I really hope so and that is my interest in it. I think
> we're overdue for a new reference point at this level and
> building it out end to end is helping us answer more from a
> fresh perspective what it takes to model the level of
> dynamism that people expect -- and focus on getting the
> layers right to legalize down from there. I suspect that
> getting this layer right will be both useful as a
> user-facing programming model and as a target for similar
> things to lower into. We'll be talking about it more at
> upcoming ODM sessions and on the LLVM discourse, but you're
> welcome to join our semi-frequent discussions on our #npcomp
> channel of IREE's discord server
> <https://discord.gg/26P4xW4> in the meantime. Again, we see
> <https://github.com/google/iree/blob/master/integrations/tensorflow/e2e/BUILD>.
>

Mahesh Ravishankar

unread,
May 21, 2020, 1:33:00 AM5/21/20
to Uday R Bondhugula, Stella Laurenzo, MLIR
On Wed, May 20, 2020 at 10:07 PM 'Uday R Bondhugula' via MLIR <ml...@tensorflow.org> wrote:


On 21/05/2020 10:05, Stella Laurenzo wrote:
> Leaving aside the above, I believe this is what you are looking for:
> https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/mlir/xla/transforms/xla_legalize_to_linalg.cc

Right - this is clearly a good place to seek contributions in the
interim for those not wanting to depend on three repositories. The link
you provided earlier, https://google.github.io/iree/HLOOpCoverage, shows
great HLO op coverage via IREE, but the in-tree coverage for HLO/LHLO to
Linalg or even LHLO to Affine is really small in comparison (pretty much
just pointwise ops):
https://github.com/tensorflow/tensorflow/blob/203c1de5a4e54079304f154eee1745e6ee3eb3b2/tensorflow/compiler/mlir/xla/transforms/xla_legalize_to_linalg.cc#L645

On the other hand, the IREE part covers almost everything while
converting from xla_hlo into linalg:
https://github.com/google/iree/blob/bb44cf05aaf761cf46c1556ce05cd79bc8a88eb0/iree/compiler/Conversion/HLOToLinalg/HLOToLinalgOnBuffers.cpp#L336


This is mostly an intermediate solution until we can have a more robust solution in core. Some examples:
- xla_hlo.reduce is lowered to a linalg.indexed_generic op to handle the initialization value (see the sketch below).
- It might be better to convert the xla_hlo.pad operation to a linalg.pad operation in the long term.
- The lowering of xla_hlo.reduce_window to linalg.pool* operations is an example of the experimentation. We were planning to lower these to linalg.indexed_generic as well, but later decided to add pooling operations since that seemed like the right approach for linalg. But here too, we haven't really worked out how to handle padding (and whether it even makes sense to handle it in the op itself or to have a separate pad operation as mentioned above).

Once we have a better idea of how these ops can be lowered and fused effectively, we plan to move them out of IREE into either the Tensorflow HLO -> Linalg conversion, or MLIR core.
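To make the first point concrete, here is a rough sketch of a sum reduction over the second dimension lowered onto buffers - not code from the IREE tree; the shapes and the current linalg.indexed_generic trait syntax are assumed:

#map_in = affine_map<(d0, d1) -> (d0, d1)>
#map_out = affine_map<(d0, d1) -> (d0)>
linalg.indexed_generic {
    args_in = 1 : i64, args_out = 1 : i64,
    indexing_maps = [#map_in, #map_out],
    iterator_types = ["parallel", "reduction"]} %in, %out {
^bb0(%i: index, %j: index, %a: f32, %b: f32):
  // On the first reduction iteration, start from the init value
  // (0.0 for a sum) instead of whatever is in the output buffer.
  %zero = constant 0.0 : f32
  %c0 = constant 0 : index
  %first = cmpi "eq", %j, %c0 : index
  %acc = select %first, %zero, %b : f32
  %sum = addf %a, %acc : f32
  linalg.yield %sum : f32
} : memref<?x?xf32>, memref<?xf32>

This is also why plain linalg.generic is awkward for reductions: its body cannot tell the first iteration apart, so the init value has to come in via the index arguments (as above) or via a dedicated init operand.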



--
Mahesh

Uday R Bondhugula

unread,
May 21, 2020, 1:44:35 AM5/21/20
to Mahesh Ravishankar, Stella Laurenzo, MLIR
Hi Mahesh,

All of the above makes sense except for why these conversions have to be
developed outside the TensorFlow tree. They have xla_hlo on their left
and linalg on their right, both of which are available in the TF tree,
and some of the LHLO -> LinAlg conversions already live there. Having
them in TF would make it possible for more users to use and contribute
to them - otherwise those users are looking at a very incomplete
picture or duplicating a lot of work (I think most don't even know that
IREE has these conversions!). Having to depend on IREE just for those
critical bridges may not be an option for many, because IREE has its
own backends and goals for experimenting with and exercising end-to-end
flows.

Is there a concern about how frequently the MLIR dependency is updated
in TensorFlow vis-a-vis IREE (with the latter using something much
closer to the MLIR tip)? Even so, that looks clearly outweighed by the
other considerations.

Thanks,
- Uday


> I'm now confused why this code really doesn't live in TensorFlow? It
> has nothing to do with IREE but only with xla_hlo and linalg. The only
> thing is that it's using the MLIR/LLVM code style as opposed to the
> Google/TF style.
>
> - Uday
>
> > (I had forgotten that we had established these common patterns in
> > the TensorFlow repo for the things that don't require further
> > tie-ins to IREE to lower)

Mahesh Ravishankar

unread,
May 21, 2020, 2:24:29 AM5/21/20
to Uday R Bondhugula, Stella Laurenzo, MLIR
The goal of everything in IREE's codegen is to move as much as possible into the TF tree or MLIR core. I think it's a distinction between what we think is a workable intermediate solution vs. something that is along the path of a more long-term solution. This is subjective, but the former lives in IREE and the latter in TF or MLIR. We have moved things out of IREE codegen into both TF and MLIR in the past, so happy to move (some of) these into TF if they make sense to be there.
What you pointed to earlier is going from HLO to Linalg on buffers. That is OK to do in IREE because of its architecture, but in the TF tree it really needs to be LHLO to Linalg on buffers (rather, I don't think an HLO to Linalg on buffers conversion makes sense in the TF tree). Even in IREE we want to keep this layer as thin as possible. For each of the ops there, our current thinking is that these are intermediate solutions that will be replaced with a more long-term solution in Linalg, but they allow us to make forward progress on other goals of IREE.
 
> Is there a concern on how frequently the MLIR dependency is updated on
> TensorFlow vis-a-vis IREE (with the latter using something much closer
> to the MLIR tip)? But this looks clearly outweighed by the other
> considerations.

This is not the reason. IREE depends on both TensorFlow and MLIR, and MLIR gets updated within IREE at the same time the TensorFlow MLIR gets updated.


--
Mahesh

Mehdi AMINI

unread,
May 21, 2020, 11:50:00 AM5/21/20
to Mahesh Ravishankar, MLIR, Stella Laurenzo, Uday R Bondhugula
On Wed, May 20, 2020 at 10:32 PM 'Mahesh Ravishankar' via MLIR <ml...@tensorflow.org> wrote:

> Once we have a better idea of how these ops can be lowered and fused effectively, we plan to move them out of IREE into either the Tensorflow HLO -> Linalg conversion, or MLIR core.

HLO conversion can’t be in Core as there is a dependency on TF.
I am with Uday here, I don’t understand why these haven’t been developed in TF in the first place?

— 
Mehdi


Stella Laurenzo

unread,
May 21, 2020, 1:15:01 PM5/21/20
to Mehdi AMINI, Mahesh Ravishankar, MLIR, Uday R Bondhugula
On Thu, May 21, 2020 at 8:49 AM Mehdi AMINI <joke...@gmail.com> wrote:

> HLO conversion can't be in Core as there is a dependency on TF.
> I am with Uday here, I don't understand why these haven't been developed in TF in the first place?

Most of it has. What remains in IREE either (a) has an IREE dependency that cannot yet be represented in either upstream repo, or (b) was more intertwined in the recent past and just hasn't been moved yet. If there are specific things left in IREE's HLOToLinAlg tree which don't depend on IREE dialects and are useful in the TensorFlow tree, then they can be moved.

Mahesh Ravishankar

unread,
May 21, 2020, 4:58:27 PM5/21/20
to Stella Laurenzo, Mehdi AMINI, MLIR, Uday R Bondhugula
> HLO conversion can't be in Core as there is a dependency on TF.
> I am with Uday here, I don't understand why these haven't been developed in TF in the first place?


These are lowered from HLO to Linalg on buffers, which is valid to do within IREE due to IREE's architecture: the question of buffer allocation has been solved in IREE before codegen. IIUC, in TF the plan is to go XLA-HLO to XLA-LHLO to Linalg on buffers (using XLA's machinery). We don't plan to use LHLO in IREE. Maybe some, if not all, of this code can be made to work for both IREE and XLA (just like the pointwise ops exist today). We can move things into TF (or some repo that has the HLO dialect).


--
Mahesh

Uday Bondhugula

unread,
May 21, 2020, 5:36:06 PM5/21/20
to MLIR
I still don't see the reasoning that this has to stay outside because it is evolving or takes a shortcut as far as the conversion goes. The key point is that the TF tree itself doesn't have an alternative for this missing piece: both of the ends are in TF/MLIR proper, and the bridge between the two is outside TF! (Note the vast difference in op coverage via IREE vs. in TF proper.) I'm afraid this will only delay or hurt both adoption and contributions, because one would have to look at three repositories here. There are several dialects that are evolving and improving their lowering paths while living in the MLIR tree - IMO that needn't be the reason to keep the conversions out, because AFAIK there is no alternate competing path in tree nor any proposal to build one (I'm referring to HLO to LinAlg and thus from TF to mid-level forms). But I have seen at least a couple of requests on Discourse asking for an end-to-end reference path - so there are definitely folks waiting to use and contribute. And for the conversions part, external folks contributing to them while they are in IREE is often a stretch and doesn't look practical - it's a big level of indirection when the original thing being looked at is TF and MLIR.

- Uday

Stella Laurenzo

unread,
May 21, 2020, 6:37:48 PM5/21/20
to Uday Bondhugula, MLIR
Uday, I don't disagree with the criticism, and I suspect that there is not an answer that will satisfy you in the short term. *I* am also frustrated by the lack of a complete story in TensorFlow itself, but being aware of some of the directions being pursued, I am willing to extend more credit towards the progress being made on getting out of this current transient state.

"Simply" moving IREE into TensorFlow is not in the cards for a lot of reasons, both technical and non-technical. Were we to do so, it might help with some narrow, near-term friction to clone and make progress on certain parts, but it comes at a pretty massive cost in terms of needing to conform to the build and dependency limitations of the TensorFlow monorepo, neither of which is conducive to cross-platform development or to external parties taking a dependency on it. Whether we agree or disagree on those points, they are important enough to IREE that we are looking to more structural changes to the TensorFlow monorepo versus dropping an entirely separate and independently buildable project into it. With that said, when the design and layering allow, we extend components directly in the TensorFlow monorepo (versus forking, etc.).

Consider that repository management is an evolving topic and that, as illustrated by TFRT, IREE (and others), it is considered valuable to progress down the (painful) path of decoupling components of the monorepo. The bug here is that the current state is not coherent -- not that meaningful components are being developed outside of the monorepo. I'm not going to deny, either, that the birth and history of this stuff was chaotic and unplanned: the IREE team was historically completely disjoint from TensorFlow, interacting primarily as a contributor with the value of being a first-class community member *using* both LLVM and TensorFlow technology to target resource-constrained devices. Along the way, we built and contributed the HLO dialect/tools, the TensorFlow MLIR-SavedModel infrastructure, much of LinAlg, much of MLIR-SPIRV, a significant amount of the HLO support for dynamic shapes, and a significant amount of the MLIR-tf2xla bridge. At each step, when faced with the decision to upstream something that was a clear win and where consensus on the design was feasible, we did so and continue to do so. Conversely, when we were able to converge on a design point locally that did not have a solution upstream, we built locally to prove the concept and kept moving forward, often continuing the underlying design discussions along the way.

The result, in the present day, is a mismatch in the maturity levels of specific components. It is no coincidence that the mismatches are specifically around design points that tend to differ a lot between large-scale and resource-constrained systems (e.g. buffer/memory management, approaches to fusion, concurrency modeling, etc.). I can assure you that work is happening on both sides to converge to similar capability levels at these different scales.

So again, if you see specific things in the codebases that can be isolated and moved, patches are welcome: we do prioritize upstreaming such things but are sometimes lazy about it for one reason or another. Regarding collaboration on specific plans to have upstream TensorFlow emerge as a good e2e vehicle for the kind of development you are looking to do, I will need to defer to my TensorFlow colleagues.
 


Uday Bondhugula

unread,
May 22, 2020, 12:37:26 AM5/22/20
to MLIR



Stella, thank you again for explaining at length. I think I understand the constraints and costs involved better now. 
 
> Regarding collaboration on specific plans to have upstream TensorFlow emerge as a good e2e vehicle for the kind of development you are looking to do, I will need to defer to my TensorFlow colleagues.
I think this comment leaves it as a cliffhanger. :) Irrespective of the direction TF takes here, we obviously agree that leaving the path for TF compilation through MLIR disconnected in TF proper can't be an option. Compared to that, even the more radical alternative of doing the reverse looks reasonable: moving everything MLIR out of TF and into IREE, with IREE becoming the "TF with MLIR" besides everything else it is - given that its e2e flow plans to support other frontends, as shown in the nice figure in IREE's README.

- Uday

Stella Laurenzo

unread,
May 22, 2020, 1:33:35 AM5/22/20
to Uday Bondhugula, MLIR
Lol - a cliffhanger was not my intent :) A full plan, though, in a form suitable for aligning on both internally at Google and externally, is certainly beyond what I can enunciate right now -- and it will involve more than the two of us speculating. There are a lot of important developments happening across the Google-sponsored MLIR projects -- and we do need to balance where those parts will be soon versus just the current snapshot. I will point out that while IREE is "ahead" right now in terms of getting the e2e story in place, we are all working towards some real proof points of viability -- covering the integrations, the codegen strategy, and true performance comparisons. We do have the marble dropping all the way through, but it is still quite a way from being a generally usable, high-performance implementation. In a real sense, it matters less which facet of the projects gets to the top of the mountain first than that we start flipping those bits and proving the technology direction for what we think it is capable of.

In addition to that, we really need to get the repository situation aligned so that external people can experiment, contribute to and depend on the tooling that is being built in a more effective way. Your feedback is definitely taken on that front -- it is just going to take some time to action.
 


Mehdi AMINI

unread,
May 22, 2020, 1:38:01 AM5/22/20
to Mahesh Ravishankar, Stella Laurenzo, MLIR, Uday R Bondhugula
> These are lowered from HLO to Linalg on buffers

You mean "Linalg on tensors" here I believe.

> ..., which is valid to do within IREE due to IREE's architecture.

As long as you can have a lowering path from LinAlg on tensors to something down, what is specific to IREE?

> There is a question of buffer allocation that has been solved in IREE before codegen. IIUC in TF the plan is to use XLA-HLO to XLA-LHLO to Linalg on buffers (using XLA's machinery). We don't plan to use LHLO in IREE. Maybe some, if not all of this code can be made to work for both IREE and XLA (just like pointwise ops exist today). We can move things into TF (or some repo that has HLO dialect).

You seem to assume that because we have LHLO, we can't or won't use HLO -> "Linalg on tensors" in any of the TF-ecosystem codegen-related projects? This does not seem obvious to me though.

If we have, for example, a good CPU codegen path upstream which starts with Linalg-on-tensors, then we'd be very happy to migrate something like tf_compile to use it.

The only reason for LHLO to exist is to give us an anchor point with respect to XLA: in the short term we can preserve the very robust XLA machinery for layout/buffer assignment (and other optimizations) at the HLO level and plug back into MLIR CodeGen.

So, any codegen-related handling of *HLO (and conversion from *HLO to Linalg/Affine and similar dialects) is welcome under tensorflow/compiler/mlir/xla/...

Cheers,

-- 
Mehdi

Stella Laurenzo

unread,
May 22, 2020, 1:51:17 AM5/22/20
to Mehdi AMINI, Mahesh Ravishankar, MLIR, Uday R Bondhugula
And this is the main point for me: LinAlg-on-tensors is still undergoing quite a bit of evolution to be generally usable, and something filling that level of the stack needs to be mature for all of this to work together. We've been pushing on it on the IREE side, but it is not ready for primetime yet. This stuff is all still fairly experimental and being worked out in-situ.

Mahesh Ravishankar

unread,
May 22, 2020, 4:42:43 PM5/22/20
to Stella Laurenzo, Mehdi AMINI, MLIR, Uday R Bondhugula
No, I mean HLO to Linalg on buffers. There are some operations (like matmul, conv, and anything with reduction iterator types) that cannot be lowered to Linalg on tensors right now. To be able to handle these HLO ops, we convert them to Linalg on buffers. We can do this within IREE because we put them in their own dispatch regions: since IREE allocates buffers for the inputs and outputs of the dispatch region, this conversion is possible. The ops converted directly from HLO to Linalg on buffers this way are the only ones that live in IREE. The conversion of XLA-HLO to Linalg on tensors is already in the TF tree (and used in IREE).

P.S.: We are aware of the phase-ordering issue in IREE. Ideally we want to convert from XLA-HLO to Linalg on tensors, perform fusion, and then create dispatch regions with each linalg op/remaining xla op in its own dispatch region. This patch is one step towards evaluating that, and along with Linalg fusion on tensors that is also in core, these patches (here, here) should help the effort. This approach should work for TF as well, but I have to really see how that plays out. We haven't really pushed on this in IREE yet because we are currently focusing on things downstream in the compilation first.



 
 
> You seem to assume that because we have LHLO, we can't or won't use HLO -> "Linalg on Tensor" in any of the TF-ecosystem codegen-related project? This does not seem obvious to me though.

I don't think I was saying that. As I mentioned above, HLO -> Linalg on tensors would work for TF as well, and I would hope you can use it. All that code lives in TF already (and the fusion code in MLIR core). The logic for converting XLA-HLO to Linalg on buffers that lives in IREE would be reusable for LHLO to Linalg as well. So then it's a matter of someone picking that up and pushing it forward.
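For a flavor of what that reuse looks like, the buffer-level mapping is essentially one-to-one - the ops and shapes below are illustrative, not lifted from either tree:

// An LHLO dot on pre-allocated buffers ...
"xla_lhlo.dot"(%a, %b, %c) : (memref<4x8xf32>, memref<8x16xf32>, memref<4x16xf32>) -> ()
// ... maps directly onto the Linalg named op over the same buffers:
linalg.matmul(%a, %b, %c) : memref<4x8xf32>, memref<8x16xf32>, memref<4x16xf32>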


--
Mahesh

Mehdi AMINI

unread,
May 22, 2020, 4:46:35 PM5/22/20
to Mahesh Ravishankar, Stella Laurenzo, MLIR, Uday R Bondhugula
Ah! I'm glad I asked, that makes a lot of sense indeed, thanks for clarifying!

Uday Bondhugula

unread,
May 23, 2020, 1:14:36 AM5/23/20
to MLIR
I'm actually interested in this part and in contributing to it. But again, there appear to be two options here. (1) The HLO to Linalg on buffers code in IREE could be reused to do LHLO to Linalg in TF proper (removing any IREE-connected pieces and integrating it in TF), and HLO to LHLO for the missing ops could also be done in TF. Does this make sense as something to submit to TF? (2) The other option: since the LHLO path would be subsumed by HLO -> Linalg on tensors -> Linalg on buffers, it may instead make sense to contribute the conversions from HLO to Linalg on tensors that involve reductions. When you said that the conversion to 'linalg on tensors' doesn't work for operations that involve reductions, is that due to missing design in linalg or just missing conversions? I assume the conversion from 'linalg on tensors' to 'linalg on buffers' exists in TF and MLIR proper and works for all ops (I'm referring to just the naive conversion, not any buffer reuse optimizations)?

Mahesh Ravishankar

unread,
May 23, 2020, 5:40:02 PM5/23/20
to Uday Bondhugula, MLIR
I would really prefer (2). The issue with reductions so far is just how to handle the initialization value; Nicolas has suggested we take an extra operand in such cases.
For example, the current matmul in Linalg on buffers is

linalg.matmul(%a, %b, %c) : memref<..>, memref<..>, memref<..>

In tensors it would be

%d = linalg.matmul(%a, %b, %c) : tensor<..>, tensor<..>, tensor<..> -> tensor<..>

where %c and %d have the same shape and %c is the initial value of the result tensor %d. We could potentially add an attribute to explicitly state this relationship so that buffer allocation schemes can use it to get a smaller footprint. This is WIP and Nicolas can provide more details on it.
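For illustration, such an attribute might look like the following - the attribute name is purely hypothetical, not an existing Linalg attribute:

// Hypothetical attribute tying operand #2 (%c) to result #0 (%d), so a
// buffer allocator may write the result in place over %c's buffer.
%d = linalg.matmul(%a, %b, %c) {init_operand = 2 : i64}
    : tensor<..>, tensor<..>, tensor<..> -> tensor<..>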

Btw, I don't know if there is a standard way of going from Linalg on tensors to Linalg on buffers. MLIR core recently gained an allocator, but I am not tracking that closely enough to know whether it has been used to do buffer allocation for the Linalg-on-tensors to Linalg-on-buffers conversion. Nevertheless, a naive conversion should be possible: Linalg ops allow mixing tensor and buffer operands, so you can allocate in an incremental fashion.
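A minimal sketch of that naive, incremental scheme, assuming static shapes and std-dialect allocation (nothing here is taken from an existing pass): allocate a fresh buffer per produced tensor, seed it with the init value, and run the buffer form of the op.

// Tensor form: %d = linalg.matmul(%a, %b, %c) : tensor<..> ... -> tensor<..>
%d_buf = alloc() : memref<4x16xf32>
// Seed the result buffer with the init value %c.
linalg.copy(%c_buf, %d_buf) : memref<4x16xf32>, memref<4x16xf32>
// Accumulate into the seeded buffer.
linalg.matmul(%a_buf, %b_buf, %d_buf)
    : memref<4x8xf32>, memref<8x16xf32>, memref<4x16xf32>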

  


Uday Bondhugula

unread,
May 23, 2020, 10:07:23 PM5/23/20
to MLIR



> Btw, I don't know if there is a standard way of going from Linalg on
> tensors to Linalg on buffers. MLIR core recently gained an allocator,
> but I am not tracking that as closely to know if it has been used to
> do buffer allocation for Linalg on tensor to Linalg on buffer
> conversion.

This part would also be important to know about before jumping in. Moreover, I ran into other issues - it looks like the IREE passes aren't at all what I thought; can you see here:

- Uday

 

Mehdi AMINI

unread,
May 23, 2020, 11:50:08 PM5/23/20
to Mahesh Ravishankar, Stella Laurenzo, MLIR, Uday R Bondhugula
Actually Uday's question prompted me to look at some of the code, and it isn't clear why most of the code in HLOToLinalgOnBuffers.cpp isn't located with the HLO dialect?

Mahesh Ravishankar

unread,
May 24, 2020, 12:19:00 PM5/24/20
to Uday Bondhugula, MLIR
On Sat, May 23, 2020 at 7:07 PM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:

> This part would also be important to know about before jumping in.

+1

 
> Moreover, I ran into other issues - looks like the IREE passes aren't at all what I thought; can you see here:

It depends on what you expect :). IREE's codegen passes are structured to work with the input they get from the upstream IREE passes (and with the way IREE specifies its buffers to enable more efficient descriptor set usage). Also, this is WIP, so it doesn't work with an arbitrary sequence of HLO ops, rather with what the codegen is expected to handle based on what the upstream passes do. A good starting point is to look at the tests in the folder where the pass lives, which show what the expected input looks like. These are IREE-specific issues that might not be relevant to this forum - it would be great if we could move that conversation to IREE mailing lists/bugs, etc., where other people might weigh in and we can go into more details there.
 


--
Mahesh

Mahesh Ravishankar

unread,
May 24, 2020, 12:24:00 PM5/24/20
to Mehdi AMINI, Stella Laurenzo, MLIR, Uday R Bondhugula
I think this was discussed earlier in the thread. IREE does buffer allocation before the codegen passes are invoked. What is happening in the pass you mentioned is taking the buffers for the inputs/outputs of the dispatch region and using them. IREE also represents buffers in a way that reflects (and therefore optimizes) descriptor set usage. There is also work on handling dynamic shapes, which is very preliminary. There is some separation of concerns there to keep the actual conversion logic silo-ed off from all the other things that need to be handled within IREE at this stage of the codegen, so some of that could be shared between TF and IREE. Maybe we can chat offline about how to share the code while still making it work for both IREE and TF. This again is going into details of IREE; I see that you are looking at the bug that Uday mentioned above - let's continue discussing there.


--
Mahesh

Stephan Herhut

unread,
May 26, 2020, 11:20:14 AM5/26/20
to Mahesh Ravishankar, Stella Laurenzo, Mehdi AMINI, MLIR, Uday R Bondhugula


On Thu, May 21, 2020 at 10:58 PM 'Mahesh Ravishankar' via MLIR <ml...@tensorflow.org> wrote:

> These are lowered from HLO to Linalg on buffers, which is valid to do within IREE due to IREE's architecture. [...] We can move things into TF (or some repo that has the HLO dialect).


There are different approaches in TF, depending on the project. Work that uses the XLA buffer assignment and only uses MLIR for final code generation starts with LHLO and then uses different lowering strategies via LinAlg, or directly to loops, depending on the operation. We essentially use LinAlg there whenever we are interested in fusion, and otherwise might use different approaches. Recently, we also started experimenting with performing the fusion from HLO via LinAlg on tensors and then going to LinAlg on buffers. MLIR core has a generic buffer allocation that enables this.
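As a concrete, illustrative example of the LinAlg route for a pointwise op - the shapes and the linalg.generic trait syntax here are assumed, not taken from the actual lowering code:

#id2d = affine_map<(d0, d1) -> (d0, d1)>
// "xla_lhlo.add"(%lhs, %rhs, %out)
//     : (memref<2x2xf32>, memref<2x2xf32>, memref<2x2xf32>) -> ()
// lowers to a pointwise linalg.generic over the same buffers:
linalg.generic {args_in = 2 : i64, args_out = 1 : i64,
                indexing_maps = [#id2d, #id2d, #id2d],
                iterator_types = ["parallel", "parallel"]} %lhs, %rhs, %out {
^bb0(%a: f32, %b: f32, %o: f32):
  %sum = addf %a, %b : f32
  linalg.yield %sum : f32
} : memref<2x2xf32>, memref<2x2xf32>, memref<2x2xf32>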

Cheers
  Stephan

--
Stephan

Uday Bondhugula

Jun 5, 2020, 4:35:22 AM
to MLIR
[Sorry for top posting, but this isn't a direct continuation, although it's on the same topic.]

I just tried out the TF -> XLA path via MLIR (starting from a GraphDef) on a VGG-sized model, and it's really great to see nearly all conversions working through the xla_hlo dialect, with only the minor interventions I'm listing below. I'm posting this to check whether someone is already working on these, whether contributions are being sought to complete the missing pieces, or whether there are any non-trivial issues preventing them.

Here's the MLIR pipeline I ran the TFv2 graphdef through:

$ tf-mlir-translate -graphdef-to-mlir -tf-enable-shape-inference-on-import=true model.pbtxt | tf-opt -tf-executor-island-coarsening -canonicalize -tf-decompose-resource-ops -tf-promote-resources-to-args -canonicalize -xla-legalize-tf
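
(To give a flavor of the last step, -xla-legalize-tf rewrites TF ops into their xla_hlo counterparts; schematically, with value names made up and attribute details elided:)

    %0 = "tf.AddV2"(%a, %b) : (tensor<4xf32>, tensor<4xf32>) -> tensor<4xf32>
    // becomes
    %0 = "xla_hlo.add"(%a, %b) : (tensor<4xf32>, tensor<4xf32>) -> tensor<4xf32>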

The only things that got in the way are:

1) tf.VarIsInitializedOp
I can't find this op in the tf/mlir tree except in test cases; its bool result is unused, and it appears to be a non-mutating check. It precedes every tf.ReadVariableOp. It looks like it could just be erased via a simple pattern (perhaps it just needs to be added to the registered TF ops and marked side-effect free)?

    %outputs_102, %control_103 = tf_executor.island wraps "tf.VarIsInitializedOp"(%outputs_98) {device = ""} : (tensor<!tf.resource<tensor<64xf32>>>) -> tensor<i1>

On a minor note, it's not clear why there are two result values when the type signature lists only a single 0-d i1 tensor.
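
To illustrate the suggestion: with the op modeled as side-effect free, canonicalization should be able to DCE it, schematically turning (value names made up)

    %init, %ctl0 = tf_executor.island wraps "tf.VarIsInitializedOp"(%var) {device = ""} : (tensor<!tf.resource<tensor<64xf32>>>) -> tensor<i1>
    %val, %ctl1 = tf_executor.island wraps "tf.ReadVariableOp"(%var) {device = ""} : (tensor<!tf.resource<tensor<64xf32>>>) -> tensor<64xf32>

into just the read:

    %val, %ctl1 = tf_executor.island wraps "tf.ReadVariableOp"(%var) {device = ""} : (tensor<!tf.resource<tensor<64xf32>>>) -> tensor<64xf32>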

2) At the end of island-coarsening and canonicalize, there are still enclosing tf_executor ops:

func @main() {
    tf_executor.graph {
        %outputs_0, %control_1 = tf_executor.island wraps "tf.Const"() {value = dense<1> : tensor<2xi32>} : () -> tensor<2xi32>
        %outputs_2, %control_3 = tf_executor.island wraps "tf.Const"() {value = dense<1> : tensor<2xi32>} : () -> tensor<2xi32>
        ...
        tf_executor.island {
            ...
            tf_executor.yield
        }
        tf_executor.fetch
    }
    return
}

Is there something preventing -canonicalize from dropping the enclosing tf_executor ops? It does coarsen them to this form with a single island.
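
For reference, the fully unwrapped form I'd expect (schematic):

    func @main() {
      %0 = "tf.Const"() {value = dense<1> : tensor<2xi32>} : () -> tensor<2xi32>
      ...
      return
    }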

Thanks,
Uday


On Wednesday, May 13, 2020 at 4:02:51 AM UTC+5:30, Stella Laurenzo wrote:
On Tue, May 12, 2020 at 6:44 AM 'Uday Bondhugula' via MLIR <ml...@tensorflow.org> wrote:
[...]

Hi Uday! 

There is so much history in the TF/XLA ecosystem that it is hard to parse through. In addition, we have two compilation flows currently:
  • TF->xla_hlo->IREE Flow->LinAlg->...  (in use by IREE)
  • TF->xla_hlo->xla_lhlo->... (in use by more traditional TF backends)
We ultimately believe that much of this can be rebased on a high-level TCP abstraction, and we've found it valuable to start working on that as part of the npcomp scale model, focusing on extracting the patterns that bridge the various layers grown on the IREE and traditional TF sides into an independent place where we can evaluate how to evolve/upstream them. We'd like to see more convergence and upstreaming of the frontend layers, which is why we (the IREE team) are building out the npcomp prototype to aggregate the necessary pieces in preparation for concrete design discussions and upstreaming. +Sean Silva, who is working on this in the mlir-npcomp repo.

In addition to this, we are continuing to push on the TF->XLA path, and it is mature enough to be handling some fairly non-trivial models. Here is IREE's HLO op coverage for various backends (all exercising the HLO->LinAlg path): https://google.github.io/iree/HLOOpCoverage. We don't have a corresponding list of supported TF ops published yet, but we do have our model-level coverage annotated in IREE's TensorFlow tests build file. Most models we care about are in the frustrating space of needing "one or two more op variants", but notably ResNet50 and MobileNet are compiling/running via both the LLVM/CPU and Vulkan/SPIR-V codegen paths. There are also a handful of models we track privately (mainly sequence and various audio models). Notably, on the IREE side right now, these are (almost) exclusively forward-pass (inference) only, and we need to expand our inventory to include loss functions. We largely consider the TF->XLA path to be "design stable" for static shapes. Most of the work we are putting into it is either related to extending op coverage or generalizing things to add some support for dynamic shapes.
     


Uday Bondhugula

Jun 25, 2020, 12:57:00 AM
to MLIR


On Friday, June 5, 2020 at 2:05:22 PM UTC+5:30, Uday Bondhugula wrote:
[...]

1) tf.VarIsInitializedOp
I can't find this op in the tf/mlir tree except in test cases; its bool result is unused, and it appears to be a non-mutating check. It precedes every tf.ReadVariableOp. It looks like it could just be erased via a simple pattern (perhaps it just needs to be added to the registered TF ops and marked side-effect free)?

    %outputs_102, %control_103 = tf_executor.island wraps "tf.VarIsInitializedOp"(%outputs_98) {device = ""} : (tensor<!tf.resource<tensor<64xf32>>>) -> tensor<i1>


This issue is now addressed with this PR: https://github.com/tensorflow/tensorflow/pull/40744

tf.VarIsInitializedOp will no longer get in the way of a conversion into the TF dialect and further on to MLIR xla_hlo.

- Uday

     