OpenXLA overall architecture & components

Mehdi AMINI

May 15, 2023, 8:13:24 PM

to OpenXLA Discuss

Hi all,

I was looking for some public info on the overall architecture of the OpenXLA project, trying to define “what are we trying to build?” and how to start articulating all this. Unfortunately I couldn’t find much, and it may be time to get some alignment here to avoid a chaotic, iterative, ad-hoc merge of the components.

When we started OpenXLA (I wrote the first doc circa 11/2021!), the goal was to create a community-driven project around the XLA compiler, which was already undergoing a transition to gradually incorporate more of MLIR internally. Beyond that, we worked on modularizing the project: finding within XLA the abstract components and the extension points. While the XLA codebase may come across as a bit monolithic, the overall architecture is actually pretty decoupled: you can theoretically slot your own compiler at various points of the stack! That led to the goal of OpenXLA being “an ecosystem of reusable ML infrastructure components”: while XLA presents a consistent assembly of such components, they should be designed and built to be reused separately. One of the first examples was how StableHLO was extracted and built into a separate repository decoupled from XLA.

The overlap between this evolution of XLA in the context of OpenXLA and IREE led to a merge of IREE into OpenXLA, and to an agreement to adopt the IREE execution environment in OpenXLA as the replacement for XLA’s current execution environment (other than the runtime specificities, both projects were already adopting the same MLIR-based codegen).

There hasn’t been much high-level public discussion on the future of OpenXLA since then, and I find it incredibly hard to discuss RFCs without a top-level view: I see changes to individual components or to the interfaces between components, but I can’t evaluate them without positioning them in a big picture.

"IREE" is mentioned multiple times in isolation from anything else, and I’ve been trying to find some reference on “What is IREE?” and, more importantly, “What is IREE within the context of OpenXLA?”, because I would expect that, with IREE joining the OpenXLA project, the definition of "IREE" evolves to mesh with OpenXLA.

Looking back, it was announced in December that IREE is joining the OpenXLA project, without more details, and the last slide from the 1/24 community meeting just mentioned “more details to come”. Stella mentioned at the OpenXLA summit a couple of weeks ago that we should see a transition over the next 18 months towards the “OpenXLA compiler”, which is “combining both IREE and XLA”. I'm interested in looking into what this combination means.

I have written slides about the OpenXLA stack many times over the last year; the most recent time was a couple of months ago, when Jacques and I presented OpenXLA privately to a company. In our presentation, IREE was introduced this way in the broader OpenXLA stack, on the following slide (where the arrow is):

To elaborate a bit more on how I see the components in this picture, it should be roughly the following:

This does not give a detailed and complete picture of IREE, but as far as I understand, this integrates the specifics of IREE into the general stack provided by OpenXLA: IREE mostly brings to OpenXLA a modern execution engine, with new low-level abstraction capabilities and a set of compiler abstractions and transformations to map a sequence of operations to an efficient dynamic schedule.

If we zoom in and really want to see what IREE provides, it seems like the following will make it more explicit and accurate:

I’m trying to build some shared vocabulary and understanding here, and get to an overall architecture diagram and a description of the modular components in OpenXLA, their roles, and their interactions. I defined some on the diagram above; here is a first attempt to describe these:

OpenXLA: the whole project. I would describe it as “an ecosystem of ML Infrastructure modular components that can be assembled to form e2e stacks targeting CPU/GPU with extension points enabling xPU targets”.
StableHLO: “a portability layer between ML frameworks and ML compilers, is an operation set for high-level operations (HLO) that supports dynamism, quantization, and sparsity. Furthermore, it can be serialized into MLIR bytecode to provide compatibility guarantees.” It is a stable format that is intentionally decoupled from MHLO, which in turn is positioned as a compiler IR (no stability and different ergonomic goals).
OpenXLA compiler: the component that takes StableHLO as input and generates an output for an execution environment. The out-of-the-box execution environment for the OpenXLA compiler is IREE, and the IREE compiler uses extension points from the OpenXLA compiler to plug in as the execution environment. It should be possible to plug in other execution environments for platforms which haven’t adopted IREE.
High-level optimization and device mapping: the component in OpenXLA that operates at the full-graph level and performs transformations accordingly. It also considers the topology of the target (for multi-device environments), performs all necessary sharding/partitioning, and optimizes the scheduling of cross-device communication to overlap it with computation. It is parameterized and customized for a given execution environment (platform/runtime/…).
Device Optimizer: the point where the partitioning is complete and the code has a “single device” view of the program and is optimized accordingly. This is a level where some linalg fusions may happen, for example (in cases where linalg is being used). It is composed of generic and reusable transformations, but it is likely invoked and customized by a particular “execution environment compiler” (like the IREE compiler), since there is a dance that starts to take place with lowering towards a particular environment (and possibly a particular platform).
Pluggable HW specific codegen: the point where a single “fusion” or “dispatch” is handed over to the vendor plugin to generate the executable for that fusion (e.g. to generate ptx/cubin). We can plug in the Triton compiler as a codegen for specific kinds of “fusions” here.
Execution Environment Compiler: in OpenXLA this is the “IREE Compiler”. It takes as input the output of the “High-level optimization and device mapping” and sets up the “Device Optimizer” according to its needs (to lower and transform the program as needed). Other environments are possible (a “no-runtime” embedded CPU environment compiler, for example), which could reuse the “Device Optimizer” but not map to the same kind of runtime abstractions as IREE.
Runtime: not directly part of the compiler; it is the set of components that are available on the target platform to allow execution of the resulting program. It is extensible in similar ways to the compiler, with different goals and constraints. There is a strong coupling between the execution environment compiler and the runtime.
MLIR: an infrastructure providing tools for building compilers, with reusable dialects and codegen components. OpenXLA is built using MLIR, reusing the provided codegen components as much as possible and contributing back any improvements or new components (including tools) developed for the needs of OpenXLA.
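
One way to picture how the components described above compose is the toy Python sketch below. It is only an illustration under stated assumptions: `Program`, `high_level_optimization`, `device_optimizer`, `pluggable_codegen` and `compile_program` are hypothetical names invented here, not OpenXLA or IREE APIs, and each stage merely records what a real component would do.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Program:
    """Toy stand-in for a module flowing through the stack (hypothetical)."""
    ops: List[str]
    level: str = "stablehlo"
    log: List[str] = field(default_factory=list)


# A stage is any callable that takes a program and returns a program.
Stage = Callable[[Program], Program]


def high_level_optimization(num_devices: int) -> Stage:
    """Full-graph stage, parameterized by the target topology."""
    def run(program: Program) -> Program:
        program.log.append(f"partitioned for {num_devices} device(s)")
        program.level = "single-device"
        return program
    return run


def device_optimizer(program: Program) -> Program:
    """Single-device stage: this is where fusions/dispatches would be formed."""
    program.log.append("grouped ops into dispatches")
    program.level = "dispatches"
    return program


def pluggable_codegen(plugins: Dict[str, Callable[[str], str]],
                      fallback: Callable[[str], str]) -> Stage:
    """Hand each dispatch to a plugin if one claims it, else to the fallback."""
    def run(program: Program) -> Program:
        program.ops = [plugins.get(op, fallback)(op) for op in program.ops]
        program.log.append("generated code per dispatch")
        program.level = "executable"
        return program
    return run


def compile_program(program: Program, stages: List[Stage]) -> Program:
    """Run the stages in order; the list itself is the 'assembly' of the stack."""
    for stage in stages:
        program = stage(program)
    return program


stages = [
    high_level_optimization(num_devices=2),
    device_optimizer,
    pluggable_codegen(
        plugins={"matmul": lambda op: f"triton:{op}"},
        fallback=lambda op: f"default:{op}",
    ),
]
result = compile_program(Program(ops=["matmul", "add"]), stages)
print(result.level)
print(result.ops)
```

In this sketch, swapping the `plugins` mapping plays the role of plugging in a vendor codegen, and replacing the whole `stages` list plays the role of a different execution environment compiler reusing some of the same stages.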

And for simplicity, the diagrams above side-step PJRT, which abstracts the instantiation of the platform and the compiler away from the frameworks.

There are likely other ways to slice and dice the overall architecture, and I’d be interested to discuss this further and come up with some diagrams, terminology, and component descriptions we could put online as documentation of OpenXLA.

--

Mehdi

Stella Laurenzo

May 15, 2023, 10:27:36 PM

to Mehdi AMINI, OpenXLA Discuss

Thanks for taking a stab at the definitions. Personally, I'm not very good at diagrams and would love the help you are offering. I've set myself a reminder to come back and jot down some notes and thoughts.

Also, in case it wasn't obvious, some of this stuff has been very heavily debated within Google with respect to what is in and outside of OpenXLA, and in an attempt to not confuse people with intermediate states, I think we've overcorrected and shared too little of the final state. And what we have shared has had naming difficult enough that it became hard to piece apart for anyone who was not in the debates.

I think it is overdue that we set that right, and thank you for your help and for calling it out.


--
You received this message because you are subscribed to the Google Groups "OpenXLA Discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openxla-discu...@openxla.org.
To view this discussion on the web visit https://groups.google.com/a/openxla.org/d/msgid/openxla-discuss/CANF-O%3DbopQtt%2B5uks87_nViSqKk4ZpV3u64yFJ1%3DZDVq9eJRWw%40mail.gmail.com.
For more options, visit https://groups.google.com/a/openxla.org/d/optout.

Vinod Grover US

May 16, 2023, 12:24:06 PM

to OpenXLA Discuss, Stella Laurenzo, Mehdi AMINI

My understanding was that both HLO and the device optimizer should accept StableHLO. In Mehdi's new picture, the device optimizer only recognizes linalg.

Julian Jones

May 16, 2023, 1:12:18 PM

to Mehdi AMINI, OpenXLA Discuss

This is helpful, thank you Mehdi!

Eugene Zhulenev

May 16, 2023, 1:31:24 PM

to Mehdi AMINI, OpenXLA Discuss

On Mon, May 15, 2023 at 5:13 PM Mehdi AMINI <joke...@gmail.com> wrote:

I’m trying to build some shared vocabulary and understanding here, and get to an overall architecture diagram, a description of the modular components in OpenXLA and their role and interactions. I defined some on the diagram above, here is a first attempt to describe these:

This is very close to how I think of OpenXLA.

OpenXLA: the whole project. I would describe it as “an ecosystem of ML Infrastructure modular components that can be assembled to form e2e stacks targeting CPU/GPU with extension points enabling xPU targets”.
StableHLO: “a portability layer between ML frameworks and ML compilers, is an operation set for high-level operations (HLO) that supports dynamism, quantization, and sparsity. Furthermore, it can be serialized into MLIR bytecode to provide compatibility guarantees.” It is a stable format that is intentionally decoupled from MHLO, which in turn is positioned as a compiler IR (no stability and different ergonomic goals).
OpenXLA compiler: the component that takes StableHLO as input and generates an output for an execution environment. The out-of-the-box execution environment for the OpenXLA compiler is IREE, and the IREE compiler uses extension points from the OpenXLA compiler to plug in as the execution environment. It should be possible to plug in other execution environments for platforms which haven’t adopted IREE.

I think that PjRt will be another integration point for users that don't want to take any dependency on the IREE compiler and want to build everything from scratch (although the IREE VM / non-HAL runtime can still be reused for running host executables).

High-level optimization and device mapping: the component in OpenXLA that operates at the full-graph level and performs transformations accordingly. It also considers the topology of the target (for multi-device environments), performs all necessary sharding/partitioning, and optimizes the scheduling of cross-device communication to overlap it with computation. It is parameterized and customized for a given execution environment (platform/runtime/…).

With mhlo going away, I'm not sure at what level this will happen. For example, in openxla-nvgpu we had to write a custom stablehlo transpose folding (https://github.com/openxla/openxla-nvgpu/commit/52e88c15f7c660badf7119210420f8920ddd048b), because previously it was in mhlo and we relied on it.
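
As a rough illustration of what a transpose-folding simplifier does, here is a toy Python sketch. It is not the code from the commit linked above and does not use MLIR: `Value`, `Transpose` and `fold_transposes` are hypothetical names, and the only convention assumed is that result dimension `i` of a transpose comes from operand dimension `permutation[i]`.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Value:
    """A leaf tensor value in the toy expression tree."""
    name: str


@dataclass(frozen=True)
class Transpose:
    """result.shape[i] == operand.shape[permutation[i]]."""
    operand: "Expr"
    permutation: Tuple[int, ...]


Expr = Union[Value, Transpose]


def fold_transposes(expr: Expr) -> Expr:
    """Collapse chains of transposes and drop identity transposes."""
    if isinstance(expr, Value):
        return expr
    inner = fold_transposes(expr.operand)
    perm = expr.permutation
    if isinstance(inner, Transpose):
        # transpose(transpose(x, p), q) == transpose(x, r) with r[i] = p[q[i]].
        perm = tuple(inner.permutation[i] for i in perm)
        inner = inner.operand
    if perm == tuple(range(len(perm))):
        # The combined permutation is the identity, so no transpose is needed.
        return inner
    return Transpose(inner, perm)


x = Value("x")
round_trip = fold_transposes(Transpose(Transpose(x, (1, 0)), (1, 0)))
rotated_twice = fold_transposes(Transpose(Transpose(x, (1, 2, 0)), (1, 2, 0)))
print(round_trip)
print(rotated_twice)
```

Whether a rewrite like this runs implicitly as a canonicalization or only as an explicitly requested pass is exactly the kind of choice that has to be made again when it is ported from one dialect to another.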

Device Optimizer: the point where the partitioning is complete and the code has a “single device” view of the program and is optimized accordingly. This is a level where some linalg fusions may happen, for example (in cases where linalg is being used). It is composed of generic and reusable transformations, but it is likely invoked and customized by a particular “execution environment compiler” (like the IREE compiler), since there is a dance that starts to take place with lowering towards a particular environment (and possibly a particular platform).
Pluggable HW specific codegen: the point where a single “fusion” or “dispatch” is handed over to the vendor plugin to generate the executable for that fusion (e.g. to generate ptx/cubin). We can plug in the Triton compiler as a codegen for specific kinds of “fusions” here.

The current plan for library integrations (cuDNN and Triton) is to intercept the input IR at the stablehlo level and map subgraphs into cuDNN/Triton calls (before "fusions"/"dispatches" are formed by the builtin IREE pipeline). Longer term this will require a cost model, especially once we are able to target multiple libraries doing similar things.

However, "custom flow.region / flow.dispatch codegen" is also a viable option. I think we'll end up with multiple extension points in the IREE compilation pipeline that allow plugins to intercept IR after it gets lowered from one level of abstraction to the next one (1. flow 2. stream 3. hal 4. vm?).
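
The two options above (claiming subgraphs at the input level, or intercepting IR after each lowering step) can be sketched as a toy hook registry. This is a minimal Python sketch under stated assumptions: `ToyPipeline`, `claim`, the `"input"` hook point and the `extern:` prefix are all hypothetical and are not part of the IREE plugin API; only the level names flow, stream, hal and vm come from the message above.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Lowering levels named in the message above; "input" stands for the
# stablehlo-level IR before any lowering. Everything else is hypothetical.
LOWERING_PHASES = ["flow", "stream", "hal", "vm"]
HOOK_POINTS = ["input"] + LOWERING_PHASES

Hook = Callable[[List[str]], List[str]]


class ToyPipeline:
    def __init__(self) -> None:
        self._hooks: Dict[str, List[Hook]] = defaultdict(list)

    def register(self, point: str, hook: Hook) -> None:
        """Attach a plugin hook to one of the named extension points."""
        if point not in HOOK_POINTS:
            raise ValueError(f"unknown hook point: {point}")
        self._hooks[point].append(hook)

    def _run_hooks(self, point: str, ops: List[str]) -> List[str]:
        for hook in self._hooks[point]:
            ops = hook(ops)
        return ops

    def run(self, ops: List[str]) -> List[str]:
        # Plugins see the input IR first, before any dispatches are formed.
        ops = self._run_hooks("input", list(ops))
        for phase in LOWERING_PHASES:
            # Ops already claimed by a plugin ("extern:") are left alone.
            ops = [op if op.startswith("extern:") else f"{phase}<{op}>" for op in ops]
            # Plugins may intercept the IR right after each lowering step.
            ops = self._run_hooks(phase, ops)
        return ops


def claim(pattern: str, replacement: str) -> Hook:
    """Build a hook that replaces every op equal to `pattern`."""
    return lambda ops: [replacement if op == pattern else op for op in ops]


pipeline = ToyPipeline()
pipeline.register("input", claim("conv", "extern:cudnn.conv"))
pipeline.register("flow", claim("flow<matmul>", "extern:triton.matmul"))
lowered = pipeline.run(["conv", "matmul", "add"])
print(lowered)
```

A cost model, as mentioned above, would sit inside the hooks: instead of claiming every matching op, a hook would decide per subgraph whether a library call is expected to beat the builtin pipeline.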


Oscar Hernandez

May 16, 2023, 2:20:46 PM

to Vinod Grover US, OpenXLA Discuss, Stella Laurenzo, Mehdi AMINI

Another term to define is "Classic" OpenXLA. I heard of it today for the first time ;-). Oscar


Geoffrey Martin-Noble

May 22, 2023, 5:23:53 PM

to Oscar Hernandez, Vinod Grover US, OpenXLA Discuss, Stella Laurenzo, Mehdi AMINI

On Tue, May 16, 2023 at 11:20 AM Oscar Hernandez <keyle...@gmail.com> wrote:

Another term to define is "Classic" OpenXLA. I heard of it today for the first time ;-). Oscar

"Classic XLA" (note: not "Classic OpenXLA") refers to https://github.com/openxla/xla, which was pulled out of https://github.com/tensorflow/tensorflow/tree/master/tensorflow/compiler/xla. I think it specifically refers to the compiler based on passes on the HLO protobuf that predates the creation of MLIR. I think people also use it to refer to the compiler as a whole, including the incorporation of MLIR passes and codegen, when contrasting with the next generation of the OpenXLA compiler, which will be developed outside that repo (with some bits extracted from it). In that sense, it's the current ("classic") production compiler, as opposed to the next generation that we're building. Some people refer to the latter as "The OpenXLA Compiler", but I think we should give it some fun name so that we can refer to it as an entity separate from the overall project and avoid the confusion of XLA and IREE both being compilers in the OpenXLA project, but not being "The OpenXLA Compiler".

For maximal clarity, I propose XhLA with the Xh being the click consonant from Xhosa :-P


Stella Laurenzo

May 25, 2023, 9:25:40 PM

to OpenXLA Discuss, Mehdi AMINI

Hi Mehdi - you beat me to the comment, but as I was reflecting on your thread, I realized that my brain works entirely differently and needs to have a model of the physical organization of the project in order to project APIs and logical designs. I realize that this is not how a lot of folks' brains work, but based on a bit of feedback from folks who believe I have the physical organization in my head and would love to see it, I wrote the RFC for what it looks like to release this thing. I expect that this is a complementary artifact to the logical description you are pushing, and maybe it is helpful in finding the center lane and getting some terminology set (at least from the "what's on the filesystem" perspective): https://github.com/openxla/community/pull/82

Will respond separately with some comments inline.

Stella Laurenzo

May 25, 2023, 9:29:41 PM

to OpenXLA Discuss, Vinod Grover US, Stella Laurenzo, Mehdi AMINI

On Tuesday, May 16, 2023 at 9:24:06 AM UTC-7 Vinod Grover US wrote:

My understanding was that both HLO and the device optimizer should accept StableHLO. In Mehdi's new picture, the device optimizer only recognizes linalg.

Yes, I think that is the state we want to aim at for 2023: we have substantial assets in OpenXLA at the HLO level of abstraction, and investing there to make them accessible and connected to other top-level components via stablehlo gives us a path to re-use and de-risk. That also provides us an incremental path to port more to a native MLIR implementation, and it does not preclude future investments to enable/move the optimizations at different levels of the stack.

Stella Laurenzo

May 25, 2023, 9:46:00 PM

to OpenXLA Discuss, Eugene Zhulenev, OpenXLA Discuss, Mehdi AMINI

On Tuesday, May 16, 2023 at 10:31:24 AM UTC-7 Eugene Zhulenev wrote:

On Mon, May 15, 2023 at 5:13 PM Mehdi AMINI <joke...@gmail.com> wrote:
I’m trying to build some shared vocabulary and understanding here, and get to an overall architecture diagram, a description of the modular components in OpenXLA and their role and interactions. I defined some on the diagram above, here is a first attempt to describe these:

This is very close to how I think of OpenXLA.

OpenXLA: the whole project. I would describe it as “an ecosystem of ML Infrastructure modular components that can be assembled to form e2e stacks targeting CPU/GPU with extension points enabling xPU targets”.
StableHLO: “a portability layer between ML frameworks and ML compilers, is an operation set for high-level operations (HLO) that supports dynamism, quantization, and sparsity. Furthermore, it can be serialized into MLIR bytecode to provide compatibility guarantees.” It is a stable format that is intentionally decoupled from MHLO, which in turn is positioned as a compiler IR (no stability and different ergonomic goals).

This doesn't actually match reality. The StableHLO *project* carries those goals, but the `stablehlo` *dialect* is actually defined in terms of an evolution process that is much closer to, say, LLVM IR than it is to a serialization format (i.e. the `vhlo` dialect and corresponding passes/utilities for serialization are what arbitrate the wire-compatibility guarantees). Recognizing that distinction causes a different evaluation of the placement of MHLO. I'm not claiming that the various descriptions or names are entirely clear, but I believe that they function as I just described.
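
The distinction drawn above, between an op set that evolves and a separate versioned layer that carries the wire-compatibility guarantee, can be pictured with a toy Python sketch. Everything in it is an assumption made for illustration: the JSON format, the version numbers, the `old_add` rename and the `serialize`/`deserialize` helpers are invented here and are not the `vhlo` dialect or any StableHLO API.

```python
import json
from typing import Callable, Dict, List

Op = Dict[str, object]

# Version of the (toy) in-memory op set understood by this reader.
CURRENT_VERSION = 2


def _v1_to_v2(ops: List[Op]) -> List[Op]:
    # Pretend version 2 renamed the op "old_add" to "add" (made-up change).
    return [{**op, "name": "add"} if op["name"] == "old_add" else op for op in ops]


# Upgrade steps live in the versioned layer: version N payload -> version N + 1.
UPGRADES: Dict[int, Callable[[List[Op]], List[Op]]] = {1: _v1_to_v2}


def serialize(ops: List[Op]) -> str:
    """Write ops together with the version they were produced at."""
    return json.dumps({"version": CURRENT_VERSION, "ops": ops})


def deserialize(blob: str) -> List[Op]:
    """Read an artifact, upgrading older payloads to the current op set."""
    payload = json.loads(blob)
    version, ops = payload["version"], payload["ops"]
    if version > CURRENT_VERSION:
        raise ValueError("artifact is newer than this reader")
    while version < CURRENT_VERSION:
        ops = UPGRADES[version](ops)
        version += 1
    return ops


old_artifact = json.dumps(
    {"version": 1, "ops": [{"name": "old_add", "operands": ["a", "b"]}]}
)
upgraded = deserialize(old_artifact)
print(upgraded)
```

In this picture the in-memory op set is free to change between versions, and only the table of upgrade steps has to be kept stable for old artifacts to remain readable.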


High-level optimization and device mapping: the component in OpenXLA that operates at the full-graph level and performs transformations accordingly. It also considers the topology of the target (for multi-device environments), performs all necessary sharding/partitioning, and optimizes the scheduling of cross-device communication to overlap it with computation. It is parameterized and customized for a given execution environment (platform/runtime/…).

With mhlo going away, I'm not sure at what level this will happen. For example, in openxla-nvgpu we had to write a custom stablehlo transpose folding (https://github.com/openxla/openxla-nvgpu/commit/52e88c15f7c660badf7119210420f8920ddd048b), because previously it was in mhlo and we relied on it.

My personal opinion is that mhlo has served as far as it can on this as a "mirror dialect". It has a number of practical flaws in how it was evolved (it is important to remember that if memory serves, even before MLIR was released publicly, this was *the very first dialect* and it suffered through a lot of our early years of learning how to use the toolkit). Because it has been added to incrementally over the years, we've never had a proper distinction between canonicalizations and lowerings, and there are a lot of things that it would be good to roll back history on there (i.e. I've reviewed many patches that were effectively "canonicalizing" into a given form as part of driving a specific, unpublished lowering strategy for a device). On the IREE side, we've found that while it is work to recover these simplifiers and port them to the stablehlo *dialect* level (see my comments above about the distinction), the result lets us reset the history and know what we are getting. Without taking a firm opinion on how much built-in folding and canonicalization such an opset should have enabled implicitly by default, I much prefer the mode where we have these optimizations held separate with a quite high bar in place to making things implicit/automatic again.

I further believe that we don't actually need/want an effective mirror dialect at the HLO level to model orthogonal concepts, and we would be better served (as we have done in IREE and elsewhere) to create dialects that represent the different levels of abstraction as needed and make them compose with the canonical "hlo-level" dialect (as opposed to just building them into mhlo or the stablehlo dialect).

I anticipate that my thoughts on the role of the stablehlo *project*, the stablehlo *dialect*, the mhlo *project* and the mhlo *dialect* probably have a lot of room for a more detailed discussion than I know how to have on a mailing list thread like this. I'd be happy to fork that to some other forum if there are counter thoughts or discussion that needs to happen.

Vinod Grover US

May 30, 2023, 7:29:09 PM

to OpenXLA Discuss, Stella Laurenzo, Vinod Grover US, Stella Laurenzo, Mehdi AMINI

Not just for 2023. One of the goals we had at the initial OpenXLA meetings was to allow a pluggable backend to OpenXLA's front-end passes. The current front-end pass is based on XLA's HLO optimizer. Eventually it will be replaced. However, the pluggable backend needs an IR. I suggest that we fix that to be based on StableHLO for the foreseeable future. Otherwise OpenXLA is not an attractive platform for 3rd party compiler backends.
