[llvm-dev] [RFC] Upstreaming a proper SPIR-V backend


Trifunovic, Konrad via llvm-dev

Mar 2, 2021, 4:36:46 AM
to llvm...@lists.llvm.org, Paszkowski, Michal, Bezzubikov, Aleksandr, Tretyakov, Andrey1
Hi all,

We would like to propose this RFC for upstreaming a proper SPIR-V backend to LLVM:

Abstract
========

We at Intel are interested in contributing a proper LLVM backend that targets the Khronos SPIR-V portable IR format [7]. It would be built on a proper backend architecture (GlobalISel) and would target the compute flavour of SPIR-V, with the possibility of later extending it to the 3D shader flavour (Vulkan). What we are asking for is the LLVM community's blessing for the proposal and help in addressing the open questions. (Many of you are already familiar with the topic, so you may want to skip directly to 'Open questions' and 'Objective' without going through all the paragraphs.)

We would be extremely grateful for all comments, questions and guidance on further direction.

Intro
====

There have been several attempts to properly integrate SPIR-V generators into LLVM, but, to the best of our knowledge, none of them made enough progress to eventually land in the LLVM.org trunk.

One of the reasons for this state is the lack of consensus on the fundamental design: whether it should be a translator library (the Khronos LLVM - SPIR-V translator) wrapped within a target, a 'proper' LLVM target using SelectionDAG/GlobalISel, or 'just' a binary emission layer (to name a few of the ideas discussed in previous mailing list threads) [1][2][3].

We at Intel want to give it another try by implementing a 'true' backend. Most importantly, we want to land the prototype code in the LLVM trunk as a SPIR-V target and continue its development there as a prototype LLVM target. The starting point for the project is the code base on the Khronos GitHub [4].

note: the backend is not meant to be a replacement for the bidirectional SPIRV-LLVM translator developed by Khronos members [5] (including Intel). This proposal does not address SPIR-V to LLVM IR translation (what could be considered a SPIR-V front-end for LLVM).

Design
======

Without starting a new debate on implementation choices, we took into account the following important design points from previous discussions:

* The overall goal of this effort is to implement a proper LLVM backend for SPIR-V. That is, it registers itself as a regular target and implements the Target* interfaces (similarly to the NVPTX or AMDGPU backends). The backend uses the GlobalISel infrastructure, starting from the Khronos prototype [4] (big thanks go to ARM for contributing this code), and we are committed to keeping it that way (i.e. no fallback to SelectionDAG is planned). This addresses some of the concerns raised in the first proposal [1].

* Support the OpenCL (compute) flavour of SPIR-V. The infrastructure is flexible, so adding Vulkan-specific opcodes/capabilities should not be a big effort (but is not planned in the near term).

* For non-clang-based frontends it is desirable to expose intrinsics through a target-specific .td file (currently not done; the backend still relies on well-known names and mangling, as the sketch below illustrates). The direction needs discussion.
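
To make the trade-off concrete, here is a minimal C++ sketch of the two ways a backend can recognize an OpenCL builtin such as get_global_id. This is illustrative only: the helper name and the commented-out intrinsic ID are assumptions, not the prototype's actual code.

  #include "llvm/IR/Function.h"
  #include "llvm/IR/Instructions.h"

  using namespace llvm;

  // Route 1 is what the prototype does today: match the well-known
  // Itanium-mangled name of the OpenCL builtin. Route 2 is what a
  // target-specific .td file would enable: a dedicated intrinsic ID.
  static bool isGetGlobalId(const CallInst &CI) {
    const Function *Callee = CI.getCalledFunction();
    if (!Callee)
      return false;
    // Route 1: well-known mangled name of get_global_id(uint).
    if (Callee->getName() == "_Z13get_global_idj")
      return true;
    // Route 2 (hypothetical, not in today's LLVM): a SPIR-V target
    // intrinsic declared via an Intrinsics*.td file, e.g.
    //   return Callee->getIntrinsicID() == Intrinsic::spv_global_id;
    return false;
  }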

Implementation
=============

* Since SPIR-V is a virtual ISA, many of the regular backend passes, such as register allocation or scheduling, are disabled. This is quite similar to what the NVPTX backend does. Still, most of the logic is concentrated in the canonical GlobalISel passes: IRTranslator, CallLowering, Legalization, and InstructionSelection. RegBankSelect is not needed in our backend.

* One of the major differences between the SPIR-V ISA and LLVM IR is the way type information is stored. In order to link gMIR instructions to the SPIR-V type they produce, we use pseudo instructions, which are easy to fold into the actual instruction at the selection stage while still providing all the necessary information to the earlier passes (a rough sketch is given after this list).

* At the moment, some SPIR-V instructions (e.g. OpAccessChain) are generated right at the IRTranslation stage. This goes back to the original prototype; we are not yet sure whether we should get rid of it, so advice would be helpful. Moreover, calls to OpenCL builtins are lowered into actual SPIR-V code at the CallLowering stage, i.e. they are not properly integrated into selection yet.

* Due to the aforementioned difference in how LLVM IR and SPIR-V describe values and their types, the backend legalizer performs some custom transformations on top of the existing ones to ensure the types comply with the selector's expectations, without disabling pre-ISel legality checks.

* Instruction selection patterns are split between TableGen and plain C++ (thanks to GlobalISel for allowing that). For example, most binary operators are described in .td files, while casts are selected with C++ code.

- note: Code generation is achieved with no (or minimal) changes to the general GlobalISel infrastructure. Some modifications to the existing GlobalISel implementation may happen, but at the moment we are trying to avoid them unless they are absolutely necessary or we are sure the changes would benefit the whole LLVM project.

* There are a couple of custom passes in the backend, e.g. for generating the required capabilities, decorations and extensions. There is also a pass that ensures the SPIR-V basic-block layout requirements are met.
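
Referring back to the type-information point above, the following is a rough C++ sketch of how a pseudo instruction can tie a gMIR value to the virtual register holding its SPIR-V type, to be folded away at selection. The helper and the choice to pass the pseudo opcode as a parameter are assumptions for illustration, not the prototype's actual API.

  #include "llvm/CodeGen/GlobalISel/MachineIRBuilder.h"
  #include "llvm/CodeGen/MachineRegisterInfo.h"

  using namespace llvm;

  // Emit a pseudo recording "Val has the SPIR-V type defined by TypeVReg".
  // The pseudo is folded into the real SPIR-V instruction during selection;
  // until then, earlier passes can still query the association. The opcode
  // (e.g. an ASSIGN_TYPE pseudo) is passed in to keep the sketch generic.
  static Register assignSPIRVType(MachineIRBuilder &MIRBuilder, Register Val,
                                  Register TypeVReg, unsigned AssignTypeOpc) {
    MachineRegisterInfo &MRI = *MIRBuilder.getMRI();
    Register Typed = MRI.createGenericVirtualRegister(MRI.getType(Val));
    MIRBuilder.buildInstr(AssignTypeOpc)
        .addDef(Typed)
        .addUse(Val)
        .addUse(TypeVReg); // vreg defined by the corresponding OpType... instr
    return Typed;
  }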

Current state & open problems
=========================

The current code is based on LLVM 12 and is now published on the Khronos GitHub [4]. It includes the original code contributed by ARM and some additions developed at Intel (both being active Khronos members).
We are working on overall refactoring, implementing the missing features and improving the pass rate (see 'Testing' below), but there are a number of problems on our TODO list:

* Remove selection logic from the IR translation stage (this problem is inherited from the original prototype)
* Proper handling of extensions (planned to be similar to the translator's approach, which is to enable them explicitly via an option)
* Binary file versioning - the output version numbers (and the header structure in general) are largely hardcoded in the current codebase
* Implement some of the currently missing OpenCL builtins
* .td descriptions for Capabilities/Decorations/etc. - already a work in progress

Testing
======

A dozen LIT tests have been contributed to facilitate offline testing. Nevertheless, there is still a lack of 'runtime testing', where a produced SPIR-V binary is actually executed on a target platform (be it a CPU/GPU/FPGA). Intel plans to provide testing on a reference GPU platform, and other OpenCL platform providers are encouraged to do the same.

The current test suite mostly consists of LIT tests taken from the LLVM-SPIRV translator. We have not achieved a 100% pass rate on it yet, and the test suite itself is not yet complete.

Open questions
=============

There are also a number of problems for which we have not yet come to a final solution, so any input from the community would be greatly appreciated. Here we list some of them:

* Exposing compute intrinsics: mangling or Intrinsics.td? It seems that non-clang front-ends would prefer a library of SPIR-V (GPU-centric) intrinsics exposed by the target. The current clang approach for OpenCL is to use well-known names for OpenCL builtin functions together with name mangling (which is also the approach supported by the LLVM-SPIRV translator). The bidirectional SPIRV-LLVM translator also supports a 'SPIR-V friendly' LLVM IR convention [6].

* Development model - in-trunk or out-of-trunk?

1) We could land the code as-is in the llvm.org trunk (residing in lib/Target/SPIRV) and continue development from there, keeping it as a prototype target. That would be preferable for us, since we think that contributing the code to trunk gives better community visibility and continuous guidance from the LLVM community.

2) Development continues on an external GitHub repository (based on the most recent LLVM codebase) until some agreed-upon milestone is reached. We are open to this option, though it is less preferable to us, since we would remain out of sync with mainline LLVM development and would not have an opportunity to contribute generic improvements back to the codegen infrastructure.

* Selection dilemma: .td vs. C++ selection patterns - maybe there is already a best known method for that? One of the problems with moving everything to TableGen is the increased number of variants for the same opcode (due to the generality built into the SPIR-V design; e.g. OpSelect supports integers, floats, vectors of both, etc.). That in turn worsens the code in some places, e.g. some checks regarding those opcodes (see the C++ sketch at the end of this section).

* Promotion criteria: whichever development model is chosen, the backend will start in an experimental state. We need to set up quality criteria for promoting it to a regular backend. We propose tracking the quality of the current Khronos LLVM-SPIRV translator [5] and switching to the production-quality SPIR-V backend once its quality/functionality is on par. Any other suggestions would be appreciated.

* Testing and maintenance: currently testing is performed through LIT tests, but that only facilitates 'offline' testing. Ultimately, the SPIR-V code needs to be executed on at least one OpenCL-conformant platform that can execute SPIR-V kernels. This is a work in progress and will initially proceed outside of the LLVM buildbot infrastructure (i.e. it will run on in-house Intel infrastructure). We want to discuss how this flow could be upstreamed to the LLVM community. Of course, other vendors are encouraged to support this effort by providing their reference platforms.

This is not a closed list of open questions; please feel free to add your opinions and points for discussion.
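
To illustrate the C++ side of the OpSelect dilemma mentioned above, a hedged sketch follows. The opcode is passed in as a parameter and the SPIR-V result-type operand is omitted, so this is a simplified illustration of the approach rather than the prototype's actual selector code.

  #include "llvm/CodeGen/GlobalISel/Utils.h"
  #include "llvm/CodeGen/MachineInstrBuilder.h"
  #include "llvm/CodeGen/TargetInstrInfo.h"

  using namespace llvm;

  // A single C++ routine covers every G_SELECT variant (i8..i64, half/
  // float/double and vectors thereof) that per-type TableGen patterns
  // would otherwise have to enumerate one by one.
  static bool selectOpSelect(MachineInstr &I, const TargetInstrInfo &TII,
                             const TargetRegisterInfo &TRI,
                             const RegisterBankInfo &RBI,
                             unsigned OpSelectOpc) {
    // G_SELECT operands: dst, condition, true value, false value.
    auto MIB =
        BuildMI(*I.getParent(), I, I.getDebugLoc(), TII.get(OpSelectOpc))
            .addDef(I.getOperand(0).getReg())
            .addUse(I.getOperand(1).getReg())
            .addUse(I.getOperand(2).getReg())
            .addUse(I.getOperand(3).getReg());
    // A real selector would also attach the SPIR-V result type here.
    I.eraseFromParent();
    return constrainSelectedInstRegOperands(*MIB, TII, TRI, RBI);
  }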

Objective
========

Our ultimate objective is to upstream the backend to the trunk LLVM repository. Since our changes are too significant for a general code review on Phabricator or the mailing list, we would like to encourage you to comment on the backend's original repository on GitHub [4]. Eventually (in the next couple of months), we plan to commit the experimental backend to the LLVM repository and ask for post-commit review. The backend could land either in the main branch as an experimental backend, or possibly on a new branch allowing for easier review and further work. Right now we would like to ask for general discussion and comments, and we are happy to answer any questions you might have as well.

Numbered references
===================

[1] https://lists.llvm.org/pipermail/llvm-dev/2015-June/086848.html

[2] https://lists.llvm.org/pipermail/llvm-dev/2017-May/112538.html

[3] https://lists.llvm.org/pipermail/llvm-dev/2018-September/125948.html

[4] https://github.com/KhronosGroup/LLVM-SPIRV-Backend

[5] https://github.com/KhronosGroup/SPIRV-LLVM-Translator

[6] https://github.com/KhronosGroup/SPIRV-LLVM-Translator/blob/master/docs/SPIRVRepresentationInLLVM.rst

[7] https://github.com/KhronosGroup/SPIRV-Guide

regards,
konrad
--------------------------------------------------------------------------------------------------------------------------------------------
Intel Technology Poland sp. z o.o. - ul. Slowackiego 173, 80-298 Gdansk - KRS 101882 - NIP 957-07-52-316

Renato Golin via llvm-dev

Mar 2, 2021, 5:11:48 AM
to Trifunovic, Konrad, llvm...@lists.llvm.org, Paszkowski, Michal, Bezzubikov, Aleksandr, Tretyakov, Andrey1
On Tue, 2 Mar 2021 at 09:36, Trifunovic, Konrad via llvm-dev <llvm...@lists.llvm.org> wrote:
Hi all,

We would like to propose this RFC for upstreaming a proper SPIR-V backend to LLVM:

Hi,

Perhaps a parallel question: how does that integrate with MLIR's SPIRV back-end?

If this proposal goes through and we have a production-quality SPIRV back-end in LLVM, do we remove MLIR's own version and lower to LLVM, then to SPIRV? Or do we still need the MLIR version?

In a perfect world, translating to LLVM IR and then to SPIRV shouldn't make a difference, but there could be some impedance mismatch in the MLIR->LLVM lowering that isn't compatible with SPIRV?

But as a final goal, if SPIRV becomes an official LLVM target, it would be better if we could iron out the impedance problems and keep only one SPIRV backend.

cheers,
--renato

Trifunovic, Konrad via llvm-dev

Mar 2, 2021, 6:07:47 AM
to Renato Golin, llvm...@lists.llvm.org
Hi,

A very good question. I was actually expecting it 😊

So, at the moment, it does not integrate with the MLIR SPIR-V backend, and we have not thought about it. I guess You are referring to having the SPV dialect in MLIR and using a 'serialize' option to produce a SPIR-V binary?

I agree that developing two backends in parallel is a bit redundant. If the SPIR-V LLVM backend reaches production quality, it means it could actually consume any LLVM IR (provided it conforms to some SPIR-V restrictions).
By any LLVM IR input I mean: it should be irrelevant whether it is produced by clang, by MLIR to LLVM IR lowering, or by some other front-end that produces LLVM IR.

The biggest 'impedance mismatch' that I currently see is that the SPV MLIR dialect is now targeted mostly at Vulkan, while the LLVM SPIR-V backend targets compute. Besides the instruction set, the fundamental difference is the memory model.
So if we want to unify those, we should actually make the SPIR-V LLVM backend able to produce the Vulkan flavour of SPIR-V as well.

My answer is a bit elusive, but I totally agree with Your proposal: we should work towards having one solution, and the LLVM SPIR-V backend seems like the more universal one (since it sits lower in the compiler stack).
My proposal would be to include some MLIR -> LLVM-IR translated code in the testing, so as to keep this final goal in mind.

PS: one more thought: SPIR-V does come with a set of builtin/intrinsic functions that expose the full capabilities of the target architecture (mostly GPUs). This set of intrinsics is actually a dialect in its own right. So it is LLVM IR plus the SPIR-V-specific intrinsics and their semantics that fully define the SPIR-V dialect at the LLVM IR level. I believe this idea could be used in the MLIR path: MLIR -> LLVM IR with SPIR-V intrinsics (let's call it an LLVM IR SPIR-V dialect) -> SPIR-V binary (generated by the backend). So the idea of a 'SPIR-V dialect' still exists; it is just now expressed at the LLVM IR level.

regards,
konrad

> From: Renato Golin <reng...@gmail.com>
> Sent: Tuesday, March 2, 2021 11:12 AM
> To: Trifunovic, Konrad <konrad.t...@intel.com>
> Cc: llvm...@lists.llvm.org; Paszkowski, Michal <michal.p...@intel.com>; Bezzubikov, Aleksandr <aleksandr....@intel.com>; Tretyakov, Andrey1 <andrey1....@intel.com>
> Subject: Re: [llvm-dev] [RFC] Upstreaming a proper SPIR-V backend

Aleksandr Bezzubikov via llvm-dev

Mar 2, 2021, 7:16:27 AM
to Trifunovic, Konrad, llvm...@lists.llvm.org
Hi,

Perhaps an obvious addition to Konrad's answer: a proper LLVM backend for SPIR-V can make it much easier for people who are already using LLVM for codegen purposes (targeting e.g. AArch64 or x86 CPUs) to simply retarget their flow with (ideally) one command-line option changed.


Thanks,
Alex

On Tue, 2 Mar 2021 at 14:07, Trifunovic, Konrad via llvm-dev <llvm...@lists.llvm.org> wrote:


--
Aleksandr Bezzubikov

Renato Golin via llvm-dev

Mar 2, 2021, 7:58:05 AM
to Trifunovic, Konrad, llvm...@lists.llvm.org
On Tue, 2 Mar 2021 at 11:07, Trifunovic, Konrad <konrad.t...@intel.com> wrote:
I agree that developing two backends in parallel is a bit redundant. If SPIR-V LLVM backend becomes a production quality it means actually it could consume any LLVM IR (provided it does conform to some SPIR-V restrictions).
By any LLVM IR input I mean: it should be irrelevant whether it is produced by a clang, MLIR to LLVM IR lowering or just some other front-end that produces LLVM IR.

Excellent.

The biggest 'impedance mismatch' that I currently see is that SPV MLIR dialect is now targeted mostly at Vulkan, while LLVM SPIR-V backend targets compute. Besides instruction set, the fundamental difference is a memory model.
So if we want to unify those, we should actually make SPIR-V LLVM backend able to produce Vulkan dialect of SPIR-V as well.

I see. It is unfortunate that we have a shader-focused backend on one side and a compute-focused one on the other. As you say, it means we can only move the SPV MLIR lowering to LLVM once the LLVM side also supports it.

I'm guessing the support is not a trivial addition to compute and that it will probably take place after the current proposal is mostly done.

It is unfortunate, but not altogether bad. I think it would be fine for them to co-exist until the time we unify.
 
So the idea of 'SPIR-V dialect' still exists, it is just now expressed at the LLVM IR level.

Indeed, that's what I meant.

Thanks!
--renato

Johannes Doerfert via llvm-dev

Mar 2, 2021, 10:56:44 AM
to Trifunovic, Konrad, llvm...@lists.llvm.org, Paszkowski, Michal, Bezzubikov, Aleksandr, Tretyakov, Andrey1
Little expertise to help but looking forward to it happening.

~ Johannes

Lei Zhang via llvm-dev

Mar 2, 2021, 2:02:28 PM
to Trifunovic, Konrad, llvm...@lists.llvm.org
Chiming in mostly from the perspective of MLIR SPIR-V support. More comments inlined, but first some general comments. :)

As I understand it, SPIR-V is actually a mix of multiple things. It is first and foremost 1) a binary format for encoding GPU executables that cross the toolchain and hardware driver boundaries. Then it's 2) an intermediate-level language for expressing such GPU executables. It is also 3) a flexible and extensible spec with all sorts of capability and extension mechanisms in order to support the needs of multiple APIs and hardware features. It's unclear to me what a production-quality SPIR-V LLVM backend would entail; but to actually support the various use cases SPIR-V can serve (OpenCL, OpenGL, Vulkan; shader/kernel; various levels of extensions; etc.), it looks to me like we need a story for all of the above points, where the IR aspect (2) is actually just one facet.

My understanding of LLVM is that it mostly focuses on 2): we have a very coherent single IR threading through the majority of the layers of the compiler stack, and the IR serves very much as a means for compiler transformations (i.e., no instruction versioning, etc.). There isn't much native modelling for points 1) and 3) (which makes sense, as LLVM IR is a compiler IR). So to make it work, one would need to shoehorn these through existing LLVM mechanisms (e.g., using intrinsics for various GPU-related builtins, using metadata for SPIR-V decorations?, etc.), unless we want to evolve the LLVM infrastructure to have native support for the missing SPIR-V mechanisms, which I think might be too much to take on. And this is just about the general mechanisms, not to mention the different semantics between different SPIR-V consumers (e.g., shader vs. kernel and what that means for the memory/execution model, etc.) that need to be sorted out too. Just supporting a certain use case of what SPIR-V supports is certainly simpler, though, as we can bake in assumptions and avoid some of the infrastructure needed for full generality.

That's why I think using MLIR as the infrastructure to build general support for SPIR-V is preferable: we control everything there and are free to model all SPIR-V concepts in the most native way. For example, we can define all SPIR-V ops natively, including all ops introduced by SPIR-V extensions and extended instruction sets. We can support versions/extensions/capabilities natively and integrate that with the target environment to automatically filter out CodeGen patterns that generate ops not available on the target, etc. To me, MLIR's open dialect/op/type/etc. system is a perfect fit for the open SPIR-V spec with its many capabilities/extensions/etc. For example, we can even make the SPIR-V dialect itself open to allow out-of-tree extensions and development and such.

With that said, I understand that software development has many practical concerns (like existing codebases, familiarity with different components, etc.) and that we have many different use cases, which may mean that different paths make sense. So please don't take this as negative feedback in general. It's just that to me it's unclear how we can unify here right now. Even when the time for unification arrives, I'd believe going through MLIR is better for general SPIR-V support. :)

On Tue, Mar 2, 2021 at 6:07 AM Trifunovic, Konrad via llvm-dev <llvm...@lists.llvm.org> wrote:
Hi,

A very good question. I was actually expecting it 😊

So, at the moment, it does not integrate into MLIR SPIRV backend and we have not thought about it. I guess You are referring to having a SPV dialect in MLIR and using a 'serialize' option to produce a SPIR-V binary?

I agree that developing two backends in parallel is a bit redundant. If SPIR-V LLVM backend becomes a production quality it means actually it could consume any LLVM IR (provided it does conform to some SPIR-V restrictions, e.g. in Vulkan we have logical adressing mode).

By any LLVM IR input I mean: it should be irrelevant whether it is produced by a clang, MLIR to LLVM IR lowering or just some other front-end that produces LLVM IR.

I think there is an assumption here: LLVM itself should support all the mechanisms and use cases SPIR-V can support, if we want to make LLVM a layer before SPIR-V. I think there is a huge gap here.
 

The biggest 'impedance mismatch' that I currently see is that SPV MLIR dialect is now targeted mostly at Vulkan, while LLVM SPIR-V backend targets compute.

To be specific, Vulkan compute is the best supported use case right now. But there is interest from the community in pushing on Vulkan graphics and OpenCL.
 
Besides instruction set, the fundamental difference is a memory model.
So if we want to unify those, we should actually make SPIR-V LLVM backend able to produce Vulkan dialect of SPIR-V as well.

My answer is a bit elusive, but I totally agree with Your proposal: we should work towards having a one solution, and, LLVM SPIR-V backend seems like a more universal one (since it sits lower in the compiler stack).

I actually believe the opposite, for the reasons I listed at the very beginning. To me, SPIR-V also sits at a higher level than LLVM. (But again, it depends on what subset we are talking about.)

My proposal would be to include some MLIR -> LLVM-IR translated code in the testing so to have this final goal in mind.

PS: one more thought: SPIR-V does come with a set of builtin/intrinsic functions that expose the full capabilities of target architecture (mostly GPU). This set of intrinsics is actually a dialect in its own. So this is LLVM IR + SPIR-V specific intrinsics and their semantics that fully define the SPIR-V dialect at LLVM IR level. I believe this idea could be used in MLIR path: MLIR -> LLVM-IR with SPIR-V intrinsics (let's call it a LLVM IR SPIR-V dialect) -> SPIR-V binary (generated by a backend). So the idea of 'SPIR-V dialect' still exists, it is just now expressed at the LLVM IR level.
 
I'm not sure this is the preferred way, given that we can easily define SPIR-V ops in MLIR in its own dialect, with native support for the various aspects.

Renato Golin via llvm-dev

Mar 2, 2021, 3:22:29 PM
to Lei Zhang, llvm...@lists.llvm.org, Trifunovic, Konrad
On Tue, 2 Mar 2021 at 19:02, Lei Zhang <antia...@google.com> wrote:
With that said, I understand that software development has many reality concerns (like existing codebase, familiarity with different components, etc.) and we have many different use cases, which may mean that different paths make sense. So please don't take this as a negative feedback in general. It's just that to me it's unclear how we can unify here right now. Even when the time arrives for unification, I'd believe going through MLIR is better to have general SPIR-V support. :)

Thank you for such a detailed response!

Honestly, I don't know much about SPIRV, so my comments were without context. If there are reasons to keep the back-end on both sides, I'm not against it.

I just proposed unifying things in case they're duplicated. If we can make that case, then it should definitely be part of the plan.

cheers,
--renato

Mehdi AMINI via llvm-dev

Mar 2, 2021, 4:40:02 PM
to Trifunovic, Konrad, llvm...@lists.llvm.org
On Tue, Mar 2, 2021 at 3:07 AM Trifunovic, Konrad via llvm-dev <llvm...@lists.llvm.org> wrote:
Hi,

A very good question. I was actually expecting it 😊

So, at the moment, it does not integrate into MLIR SPIRV backend and we have not thought about it. I guess You are referring to having a SPV dialect in MLIR and using a 'serialize' option to produce a SPIR-V binary?

I agree that developing two backends in parallel is a bit redundant. If SPIR-V LLVM backend becomes a production quality it means actually it could consume any LLVM IR (provided it does conform to some SPIR-V restrictions).
By any LLVM IR input I mean: it should be irrelevant whether it is produced by a clang, MLIR to LLVM IR lowering or just some other front-end that produces LLVM IR.
The biggest 'impedance mismatch' that I currently see is that SPV MLIR dialect is now targeted mostly at Vulkan, while LLVM SPIR-V backend targets compute. Besides instruction set, the fundamental difference is a memory model.
So if we want to unify those, we should actually make SPIR-V LLVM backend able to produce Vulkan dialect of SPIR-V as well.

My answer is a bit elusive, but I totally agree with Your proposal: we should work towards having a one solution, and, LLVM SPIR-V backend seems like a more universal one (since it sits lower in the compiler stack).
My proposal would be to include some MLIR -> LLVM-IR translated code in the testing so to have this final goal in mind.

Something you're missing here, and maybe Lei clarified but I'll reiterate: the SPIRV dialect in MLIR is equivalent to what your GlobalISel pass will produce. It can actually round-trip to/from the SPIRV binary format. So it is sitting lower than your backend in my view.
I can't figure out a situation where it would make sense to go from MLIR SPIRV dialect to LLVM to use this new backend, but I may miss something here...

It would be really great to find a common path here before duplicating a lot of the same things in the llvm-project monorepo; for example, being able to target the MLIR dialect from GlobalISel, or alternatively converting the MIR to it right after, would be an interesting thing to explore.
I haven't seen it, but there was a talk last Sunday on this topic: https://llvm.org/devmtg/2021-02-28/#vm1

James Y Knight via llvm-dev

Mar 2, 2021, 6:19:01 PM
to Mehdi AMINI, llvm...@lists.llvm.org, Trifunovic, Konrad
On Tue, Mar 2, 2021 at 4:40 PM Mehdi AMINI via llvm-dev <llvm...@lists.llvm.org> wrote:
On Tue, Mar 2, 2021 at 3:07 AM Trifunovic, Konrad via llvm-dev <llvm...@lists.llvm.org> wrote:
Hi,

A very good question. I was actually expecting it 😊

So, at the moment, it does not integrate into MLIR SPIRV backend and we have not thought about it. I guess You are referring to having a SPV dialect in MLIR and using a 'serialize' option to produce a SPIR-V binary?

I agree that developing two backends in parallel is a bit redundant. If SPIR-V LLVM backend becomes a production quality it means actually it could consume any LLVM IR (provided it does conform to some SPIR-V restrictions).
By any LLVM IR input I mean: it should be irrelevant whether it is produced by a clang, MLIR to LLVM IR lowering or just some other front-end that produces LLVM IR.
The biggest 'impedance mismatch' that I currently see is that SPV MLIR dialect is now targeted mostly at Vulkan, while LLVM SPIR-V backend targets compute. Besides instruction set, the fundamental difference is a memory model.
So if we want to unify those, we should actually make SPIR-V LLVM backend able to produce Vulkan dialect of SPIR-V as well.

My answer is a bit elusive, but I totally agree with Your proposal: we should work towards having a one solution, and, LLVM SPIR-V backend seems like a more universal one (since it sits lower in the compiler stack).
My proposal would be to include some MLIR -> LLVM-IR translated code in the testing so to have this final goal in mind.

Something you're missing here, and maybe Lei clarified but I'll reiterate: the SPIRV dialect in MLIR is equivalent to what your GlobalISel pass will produce. It can actually round-trip to/from the SPIRV binary format. So it is sitting lower than your backend in my view.
I can't figure out a situation where it would make sense to go from MLIR SPIRV dialect to LLVM to use this new backend, but I may miss something here...

It would be really great to find a common path here before duplicating a lot of the same thing in the lllvm-project monorepo, for example being able to target the MLIR dialect from GlobalISel, or alternatively converting the MIR to it right after would be an interesting thing to explore.
I haven't seen it, but there was a talk last Sunday on this topic: https://llvm.org/devmtg/2021-02-28/#vm1

This sort of problem seems like just one of those unfortunate consequences of MLIR being effectively an "LLVM IR 2.0 -- Generic Edition", but not yet actually layered underneath LLVM where it really wants to be. I think it doesn't really make sense to tie this project to those long-term goals of layering MLIR under LLVM-IR, given the extremely long timescale on which that is likely to happen. The "proper" solution probably won't be possible any time soon.

So, in the meantime, we could implement a special-case hack just for SPIRV, to enable lowering it to MLIR-SPIRV dialect. But, what's the purpose? It wouldn't really help move towards the longer term goal, I don't think? And if someone does need that at the moment, they can just feed the SPIRV binary format back into the existing MLIR SPIRV dialect, right?

Aleksandr Bezzubikov via llvm-dev

Mar 2, 2021, 6:45:35 PM
to James Y Knight, aleksandr....@intel.com, llvm...@lists.llvm.org, Trifunovic, Konrad
Please see some comments inlined

On Wed, Mar 3, 2021 at 2:19 AM James Y Knight via llvm-dev <llvm...@lists.llvm.org> wrote:
On Tue, Mar 2, 2021 at 4:40 PM Mehdi AMINI via llvm-dev <llvm...@lists.llvm.org> wrote:
On Tue, Mar 2, 2021 at 3:07 AM Trifunovic, Konrad via llvm-dev <llvm...@lists.llvm.org> wrote:
Hi,

A very good question. I was actually expecting it 😊

So, at the moment, it does not integrate into MLIR SPIRV backend and we have not thought about it. I guess You are referring to having a SPV dialect in MLIR and using a 'serialize' option to produce a SPIR-V binary?

I agree that developing two backends in parallel is a bit redundant. If SPIR-V LLVM backend becomes a production quality it means actually it could consume any LLVM IR (provided it does conform to some SPIR-V restrictions).
By any LLVM IR input I mean: it should be irrelevant whether it is produced by a clang, MLIR to LLVM IR lowering or just some other front-end that produces LLVM IR.
The biggest 'impedance mismatch' that I currently see is that SPV MLIR dialect is now targeted mostly at Vulkan, while LLVM SPIR-V backend targets compute. Besides instruction set, the fundamental difference is a memory model.
So if we want to unify those, we should actually make SPIR-V LLVM backend able to produce Vulkan dialect of SPIR-V as well.

My answer is a bit elusive, but I totally agree with Your proposal: we should work towards having a one solution, and, LLVM SPIR-V backend seems like a more universal one (since it sits lower in the compiler stack).
My proposal would be to include some MLIR -> LLVM-IR translated code in the testing so to have this final goal in mind.

Something you're missing here, and maybe Lei clarified but I'll reiterate: the SPIRV dialect in MLIR is equivalent to what your GlobalISel pass will produce. It can actually round-trip to/from the SPIRV binary format. So it is sitting lower than your backend in my view.

What do you mean by lower? I'm not that familiar with the way MLIR deals with SPIR-V binaries, but isn't it still necessary to convert SPIR-V dialect to LLVM and then use some hardware-tied codegen to be able to _run_ a SPIR-V binary? 
 
I can't figure out a situation where it would make sense to go from MLIR SPIRV dialect to LLVM to use this new backend, but I may miss something here...

It would be really great to find a common path here before duplicating a lot of the same thing in the lllvm-project monorepo, for example being able to target the MLIR dialect from GlobalISel, or alternatively converting the MIR to it right after would be an interesting thing to explore.
I haven't seen it, but there was a talk last Sunday on this topic: https://llvm.org/devmtg/2021-02-28/#vm1

Oh, this actually sounds interesting. It would be nice if someone has any materials or code to share on the topic.
 

This sort of problem seems like just one of those unfortunate consequences of MLIR being effectively an "LLVM IR 2.0 -- Generic Edition", but not yet actually layered underneath LLVM where it really wants to be. I think it doesn't really make sense to tie this project to those long-term goals of layering MLIR under LLVM-IR, given the extremely long timescale that is likely to occur in. The "proper" solution probably won't be possible any time soon.


There's definitely some consensus, or even a roadmap/timeline, missing on this transition IMO :) And please forgive my possibly stupid question, but is there any way now to _conveniently_ incorporate an MLIR flow into projects that are based on the good old clang->llvm->mir->machinecode way? I understand we have the 'llvm' dialect, and I recall that last year there was a talk about a common C/C++ dialect, but it isn't public yet, is it?


Thanks,
Alex

Mehdi AMINI via llvm-dev

Mar 2, 2021, 6:46:23 PM
to James Y Knight, llvm...@lists.llvm.org, Trifunovic, Konrad
On Tue, Mar 2, 2021 at 3:18 PM James Y Knight <jykn...@google.com> wrote:
On Tue, Mar 2, 2021 at 4:40 PM Mehdi AMINI via llvm-dev <llvm...@lists.llvm.org> wrote:
On Tue, Mar 2, 2021 at 3:07 AM Trifunovic, Konrad via llvm-dev <llvm...@lists.llvm.org> wrote:
Hi,

A very good question. I was actually expecting it 😊

So, at the moment, it does not integrate into MLIR SPIRV backend and we have not thought about it. I guess You are referring to having a SPV dialect in MLIR and using a 'serialize' option to produce a SPIR-V binary?

I agree that developing two backends in parallel is a bit redundant. If SPIR-V LLVM backend becomes a production quality it means actually it could consume any LLVM IR (provided it does conform to some SPIR-V restrictions).
By any LLVM IR input I mean: it should be irrelevant whether it is produced by a clang, MLIR to LLVM IR lowering or just some other front-end that produces LLVM IR.
The biggest 'impedance mismatch' that I currently see is that SPV MLIR dialect is now targeted mostly at Vulkan, while LLVM SPIR-V backend targets compute. Besides instruction set, the fundamental difference is a memory model.
So if we want to unify those, we should actually make SPIR-V LLVM backend able to produce Vulkan dialect of SPIR-V as well.

My answer is a bit elusive, but I totally agree with Your proposal: we should work towards having a one solution, and, LLVM SPIR-V backend seems like a more universal one (since it sits lower in the compiler stack).
My proposal would be to include some MLIR -> LLVM-IR translated code in the testing so to have this final goal in mind.

Something you're missing here, and maybe Lei clarified but I'll reiterate: the SPIRV dialect in MLIR is equivalent to what your GlobalISel pass will produce. It can actually round-trip to/from the SPIRV binary format. So it is sitting lower than your backend in my view.
I can't figure out a situation where it would make sense to go from MLIR SPIRV dialect to LLVM to use this new backend, but I may miss something here...

It would be really great to find a common path here before duplicating a lot of the same thing in the lllvm-project monorepo, for example being able to target the MLIR dialect from GlobalISel, or alternatively converting the MIR to it right after would be an interesting thing to explore.
I haven't seen it, but there was a talk last Sunday on this topic: https://llvm.org/devmtg/2021-02-28/#vm1

This sort of problem seems like just one of those unfortunate consequences of MLIR being effectively an "LLVM IR 2.0 -- Generic Edition", but not yet actually layered underneath LLVM where it really wants to be.

I don't understand what you mean here by "layered underneath LLVM". Can you elaborate on this?

 
I think it doesn't really make sense to tie this project to those long-term goals of layering MLIR under LLVM-IR, given the extremely long timescale that is likely to occur in. The "proper" solution probably won't be possible any time soon.

I'm not sure we're talking about the same thing here: nothing I am suggesting would operate at the level of LLVM IR. And nothing requires a "long timescale"; it seems quite easily in scope to me here.
 

So, in the meantime, we could implement a special-case hack just for SPIRV, to enable lowering it to MLIR-SPIRV dialect. But, what's the purpose? It wouldn't really help move towards the longer term goal, I don't think? And if someone does need that at the moment, they can just feed the SPIRV binary format back into the existing MLIR SPIRV dialect, right?

Do we want to maintain, in the LLVM monorepo, *two* different implementations of a SPIRV IR and the associated serialization (and potential deserialization)? All the tools needed to manipulate it? I assume the backend may even want to implement optimization passes; are we going to duplicate these as well?
(Note that this isn't at the LLVM IR level, but post-instruction selection, so it is very ad hoc to the backend anyway.)

-- 
Mehdi

Mehdi AMINI via llvm-dev

Mar 2, 2021, 6:49:20 PM
to Aleksandr Bezzubikov, llvm...@lists.llvm.org, Trifunovic, Konrad, aleksandr....@intel.com
On Tue, Mar 2, 2021 at 3:45 PM Aleksandr Bezzubikov <zuba...@gmail.com> wrote:
Please some some comments inlined

On Wed, Mar 3, 2021 at 2:19 AM James Y Knight via llvm-dev <llvm...@lists.llvm.org> wrote:
On Tue, Mar 2, 2021 at 4:40 PM Mehdi AMINI via llvm-dev <llvm...@lists.llvm.org> wrote:
On Tue, Mar 2, 2021 at 3:07 AM Trifunovic, Konrad via llvm-dev <llvm...@lists.llvm.org> wrote:
Hi,

A very good question. I was actually expecting it 😊

So, at the moment, it does not integrate into MLIR SPIRV backend and we have not thought about it. I guess You are referring to having a SPV dialect in MLIR and using a 'serialize' option to produce a SPIR-V binary?

I agree that developing two backends in parallel is a bit redundant. If SPIR-V LLVM backend becomes a production quality it means actually it could consume any LLVM IR (provided it does conform to some SPIR-V restrictions).
By any LLVM IR input I mean: it should be irrelevant whether it is produced by a clang, MLIR to LLVM IR lowering or just some other front-end that produces LLVM IR.
The biggest 'impedance mismatch' that I currently see is that SPV MLIR dialect is now targeted mostly at Vulkan, while LLVM SPIR-V backend targets compute. Besides instruction set, the fundamental difference is a memory model.
So if we want to unify those, we should actually make SPIR-V LLVM backend able to produce Vulkan dialect of SPIR-V as well.

My answer is a bit elusive, but I totally agree with Your proposal: we should work towards having a one solution, and, LLVM SPIR-V backend seems like a more universal one (since it sits lower in the compiler stack).
My proposal would be to include some MLIR -> LLVM-IR translated code in the testing so to have this final goal in mind.

Something you're missing here, and maybe Lei clarified but I'll reiterate: the SPIRV dialect in MLIR is equivalent to what your GlobalISel pass will produce. It can actually round-trip to/from the SPIRV binary format. So it is sitting lower than your backend in my view.

What do you mean by lower? I'm not that familiar with the way MLIR deals with SPIR-V binaries, but isn't it still necessary to convert SPIR-V dialect to LLVM and then use some hardware-tied codegen to be able to _run_ a SPIR-V binary? 

What you're describing seems a bit orthogonal to the SPIRV backend: you're asking "how would someone run a SPIRV binary". That is up to the SPIRV runtime implementation (it may or may not use LLVM to JIT the SPIRV to the native platform).
From what I understand, the proposal about a backend here is exclusively about an "LLVM -> SPIRV" flow, i.e. SPIRV is the abstract ISA (like NVPTX) and the final target of the workflow.

 
 
I can't figure out a situation where it would make sense to go from MLIR SPIRV dialect to LLVM to use this new backend, but I may miss something here...

It would be really great to find a common path here before duplicating a lot of the same thing in the lllvm-project monorepo, for example being able to target the MLIR dialect from GlobalISel, or alternatively converting the MIR to it right after would be an interesting thing to explore.
I haven't seen it, but there was a talk last Sunday on this topic: https://llvm.org/devmtg/2021-02-28/#vm1

Oh, this sounds interesting actually. Would be nice if someone has any materials or code to share on the topic.
 

This sort of problem seems like just one of those unfortunate consequences of MLIR being effectively an "LLVM IR 2.0 -- Generic Edition", but not yet actually layered underneath LLVM where it really wants to be. I think it doesn't really make sense to tie this project to those long-term goals of layering MLIR under LLVM-IR, given the extremely long timescale that is likely to occur in. The "proper" solution probably won't be possible any time soon.


There's definitely some consensus or even roadmap/timeline on this transition missing IMO :)  And pls forgive me my possibly stupid question, but is there any way now to _conveniently_ incorporate MLIR flow for projects which are based on a good old clang->llvm->mir->machinecode way? I understand we have 'llvm' dialect and may recall last year there was a talk about the common C/C++ dialect, but it isn't public yet, is it?

Not that I am aware of, but I haven't followed the most recent developments either! We're very interested in looking into this, though :)

Ronan KERYELL via llvm-dev

Mar 2, 2021, 8:07:25 PM
to Trifunovic, Konrad via llvm-dev, Trifunovic, Konrad, Bezzubikov, Aleksandr, Tretyakov, Andrey1, Paszkowski, Michal
>>>>> On Tue, 2 Mar 2021 09:36:35 +0000, "Trifunovic, Konrad via llvm-dev" <llvm...@lists.llvm.org> said:

Konrad> Hi all, We would like to propose this RFC for upstreaming a
Konrad> proper SPIR-V backend to LLVM:

+1

It would be nice to have this for real after 6+ years of various projects
flying around and diluting the effort among various SPIR-V consumers
and producers...
--
Ronan KERYELL

Trifunovic, Konrad via llvm-dev

Mar 3, 2021, 5:11:27 AM
to Lei Zhang, llvm...@lists.llvm.org
> As I understand it, SPIR-V is actually a mix of multiple things. It is first and foremost 1) a binary format for encoding GPU executables that cross the toolchain and hardware driver boundaries. Then it's 2) an intermediate level language for expressing such GPU executables. It is also 3) a flexible and extensible spec with all sorts of capability and extension mechanisms in order to support the needs of multiple APIs and hardware features. It's unclear to me what a production-quality SPIR-V LLVM backend would entail; but to actually support various use cases SPIR-V can support (OpenCL, OpenGL, Vulkan; shader/kernel; various levels of extensions; etc.), it looks to me that we need a story for all the above points, where the IR aspect (2) is actually just facet.

Agreed. Indeed, 'production-quality SPIR-V backend' is vaguely defined here, and we proposed a discussion point on this. For the needs of this proposal, we should focus on one subset of SPIR-V and one use case (OpenCL). By production quality I mean that we can correctly produce code for that subset of SPIR-V. I totally agree that 'full SPIR-V' coverage is something very broad and probably not achievable at all - but we are not aiming at that. I take the perspective of a classical 'CPU backend' here: we have to generate ISA code for the input LLVM-IR code. Now, besides instructions, our backend needs to deduce the proper capabilities and extensions based on which subset of instructions is selected (a rough sketch of such a deduction pass follows below). As You pointed out later, plain LLVM-IR is not capable of describing the full SPIR-V. Some of the decorations/extensions/capabilities might be deduced by the backend, while some need to be declared using various LLVM-IR concepts, such as metadata, attributes and intrinsics - and that needs a clear definition.
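
As a rough illustration of that deduction, here is a C++ sketch of a post-selection analysis pass. The pass name, the opcode-to-capability mapping and its exact placement in the pipeline are assumptions for illustration, not the prototype's actual code.

  #include "llvm/CodeGen/MachineFunction.h"
  #include "llvm/CodeGen/MachineFunctionPass.h"
  #include <set>
  #include <string>

  namespace {
  // Walks the selected machine code and records which SPIR-V capabilities
  // the chosen opcodes imply, so the module header can declare them.
  class SPIRVCapabilityDeduction : public llvm::MachineFunctionPass {
  public:
    static char ID;
    SPIRVCapabilityDeduction() : MachineFunctionPass(ID) {}

    bool runOnMachineFunction(llvm::MachineFunction &MF) override {
      for (const llvm::MachineBasicBlock &MBB : MF)
        for (const llvm::MachineInstr &MI : MBB)
          recordCapability(MI.getOpcode());
      return false; // analysis only; the MIR is not modified
    }

  private:
    std::set<std::string> RequiredCaps;

    void recordCapability(unsigned Opcode) {
      // Hypothetical mapping, e.g. a 64-bit OpTypeFloat implies the
      // Float64 capability, group operations imply Groups, and so on.
      (void)Opcode;
    }
  };
  } // end anonymous namespace

  char SPIRVCapabilityDeduction::ID = 0;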


> My understanding over LLVM is it's mostly focusing on 2): we have a very coherent single IR threading through the majority layers of the compiler stack and the IR focuses very much as a means for compiler transformations (i.e., no instruction versioning etc.). There isn't much native modelling for most points for 1) and 3) (which makes sense as LLVM IR is a compiler IR). So to make it work, one would need to shorehore through existing LLVM mechanisms (e.g., using intrinsics for various GPU related builtins, using metadata for SPIR-V decorations?, etc.), unless we want to evolve LLVM infrastructure to have native support for the missing SPIR-V mechanisms, which I think might be too much to take on.

I also agree. I do believe, though, that LLVM-IR is still worth the effort, and that we can take an incremental approach to adopting GPU concepts into LLVM-IR (e.g. the 'convergent' attribute was added mainly for GPU-like targets). The first step is defining the metadata and intrinsics that are target-specific for SPIR-V, but most of them could be generalized as GPU concepts and even introduced into the core LLVM-IR spec. I agree this is a great effort - and not really the main focus of this proposal - yet we should give LLVM-IR its own place in the world of GPUs.

> This is just general mechanisms, not mentioning the different semantics between different SPIR-V consumers (e.g., shader vs. kernel and what that means over memory/execution model, etc.) that needs to be sorted out too.. Just supporting a certain use case of what SPIR-V supports is certainly simpler though as we can bake in assumptions and avoid some infrastructure needs for the full generality.

I would focus on just a subset and clearly define what input LLVM-IR GPU dialect would look like for that subset.

>
> That's why I think using MLIR as the infrastructure to build general support for SPIR-V is more preferable as we control everything there and can feel free to model all SPIR-V concepts in the most native way. For example we can feel free to define all SPIR-V ops natively, including all ops introduced by SPIR-V extensions and extended instruction sets. We can support versions/extensions/capabilities natively and integrate it with the target environment to automatically filter out CodeGen patterns generating ops not available on the target, etc. To me, MLIR's open dialect/op/type/etc. system is a perfect fit for the open SPIR-V spec with many capabilities/extensions/etc. For example we can even make the SPIR-V dialect itself open to allow out-of-tree extensions and development and such.

Right. For 'general SPIR-V' support, MLIR is the right abstraction level to use, and I would keep it that way. For the 'specific'/legacy uses, the backend is the way to fill that gap.

> With that said, I understand that software development has many reality concerns (like existing codebase, familiarity with different components, etc.) and we have many different use cases, which may mean that different paths make sense. So please don't take this as a negative feedback in general. It's just that to me it's unclear how we can unify here right now. Even when the time arrives for unification, I'd believe going through MLIR is better to have general SPIR-V support. :)

A very good discussion! I was overly optimistic in the first place about unifying those two approaches. Now I believe that we should actually have two paths, for the reasons You have just explained and in order to support 'legacy' paths/compilers that rely on the classical, years-old approach: front-end -> LLVM-IR (opt) -> backend (llc). For that legacy path, a plain old 'backend' approach is still (in my view) the way to go. On the other hand, as MLIR evolves and gains wider adoption, it will become the way to go. From a semantic point of view, MLIR is much better suited to representing the structured and extensible nature of SPIR-V. But for the MLIR approach to be adopted, new languages/front-ends need to be aware of that structure so as to take full advantage of it. If Clang C/C++ starts to use MLIR as its native generation format, that would be a big case for the MLIR approach, but until that happens we need to have some intermediate solution.

The most important use case for the backend is systems/languages that have been targeting x86, ARM, NVPTX or AMDGPU, where you want to replace that particular backend with some other GPU backend (e.g. an Intel GPU :) ). So the solution is to use SPIR-V as the backend target, and then consume the SPIR-V with a proprietary or open-source GPU finalizer that eventually produces GPU assembly. (We did not want to come up with yet another GPU intermediate language like PTX, HSAIL, etc., and want to use the Khronos standard SPIR-V for that purpose.) In this use case the stress is on points 1) and 2) and less on point 3), as You stated earlier.

So my proposal is to keep the two paths; they are complementary to each other. I know about the maintenance cost concerns, but as long as there are use cases for both, it is still worth it. When the world stops using LLVM-IR, it will die silently, and so will the SPIR-V backend - but that is a natural software lifecycle ;)

As for unifying, it seems that little of the implementation could be unified (unless we make GlobalISel produce the MLIR SPV dialect 😊). Nevertheless, we could collaborate on a conceptual level, especially on defining the subset of SPIR-V that we want to support and which use cases (OpenCL, Vulkan compute, OGL?) are relevant for the LLVM community, so as to have a common vision there.

BTW: Intel is also interested in MLIR path and there is a group actively contributing in that direction too.
...

>> My answer is a bit elusive, but I totally agree with Your proposal: we should work towards having a one solution, and, LLVM SPIR-V backend seems like a more universal one (since it sits lower in the compiler stack).

> I actually believe the opposite, because of the reasons I listed at the very beginning. To me SPIR-V also stays at a higher level than LLVM. (But again, depending on what subset we are talking about.) 

Here, by 'lower in the compiler stack', I did not mean a higher/lower semantic level, but the place in the compiler pipeline (front-end -> optimizer -> back-end), where I assumed MLIR sits at the front-end level, the optimizer is LLVM-IR, and the back-end comes last.
I agree that semantically SPIR-V is higher level than LLVM-IR, especially when it comes to metadata that LLVM-IR does not support natively (extensions/extended instruction sets/capabilities/execution model, etc.).

>> My proposal would be to include some MLIR -> LLVM-IR translated code in the testing so to have this final goal in mind.

>> PS: one more thought: SPIR-V does come with a set of builtin/intrinsic functions that expose the full capabilities of target architecture (mostly GPU). This set of intrinsics is actually a dialect in its own. So this is LLVM IR + SPIR-V specific intrinsics and their semantics that fully define the SPIR-V dialect at LLVM IR level. I believe this idea could be used in MLIR path: MLIR -> LLVM-IR with SPIR-V intrinsics (let's call it a LLVM IR SPIR-V dialect) -> SPIR-V binary (generated by a backend). So the idea of 'SPIR-V dialect' still exists, it is just now expressed at the LLVM IR level.
 
> Not sure this is the prefered way, given that we can define SPIR-V ops easily in MLIR in its own dialect with native support for various aspects. 

Agreed. Bearing in mind that we should actually keep both paths, I believe the path of going MLIR -> LLVM-IR -> SPIR-V does not make sense then, as it might lose some information.

regards,
konrad

> From: Renato Golin <reng...@gmail.com>
> Sent: Tuesday, March 2, 2021 11:12 AM
> To: Trifunovic, Konrad <konrad.t...@intel.com>
> Cc: llvm...@lists.llvm.org; Paszkowski, Michal <michal.p...@intel.com>; Bezzubikov, Aleksandr <aleksandr....@intel.com>; Tretyakov, Andrey1 <andrey1....@intel.com>
> Subject: Re: [llvm-dev] [RFC] Upstreaming a proper SPIR-V backend
>
> On Tue, 2 Mar 2021 at 09:36, Trifunovic, Konrad via llvm-dev <llvm...@lists.llvm.org> wrote:
> Hi all,
>
> We would like to propose this RFC for upstreaming a proper SPIR-V backend to LLVM:
>
> Hi,
>
> Perhaps a parallel question: how does that integrate with MLIR's SPIRV back-end?
>
> If this proposal goes through and we have a production-quality SPIRV back-end in LLVM, do we remove MLIR's own version and lower to LLVM, then to SPIRV? Or do we still need the MLIR version?
>
> In a perfect world, translating to LLVM IR then to SPIRV shouldn't make a difference, but there could be some impedance mismatch between MLIR->LLVM lowering that isn't compatible with SPIRV?
>
> But as a final goal, if SPIRV becomes an official LLVM target, it would be better if we could iron out the impedance problems and keep only one SPIRV backend.
>
> cheers,
> --renato
>

Renato Golin via llvm-dev

Mar 3, 2021, 6:25:03 AM
to Trifunovic, Konrad, llvm...@lists.llvm.org
On Wed, 3 Mar 2021 at 10:11, Trifunovic, Konrad <konrad.t...@intel.com> wrote:
> With that said, I understand that software development has many reality concerns (like existing codebase, familiarity with different components, etc.) and we have many different use cases, which may mean that different paths make sense. So please don't take this as a negative feedback in general. It's just that to me it's unclear how we can unify here right now. Even when the time arrives for unification, I'd believe going through MLIR is better to have general SPIR-V support. :)

A very good discussion! I seem to be overly optimistic at the first place at unifying those two approaches. Now I believe that we actually should have two paths, for the reasons You have just explained and for the reasons of supporting 'legacy' paths/compilers that rely on a classical, years old approach: Front-End -> LLVM-IR (opt) -> backend (llc). For that legacy path, a plain old 'backend' approach is still (in my view) the way to go. On the other hand, when MLIR evolves and gets wider adoption, it will be the way to go. From the semantic point of view, MLIR is much better suited for representing structured and extensible nature of SPIR-V. But for MLIR approach to be adopted, new languages/front-ends need to be aware of that structure, so to take most of the advantage of it. If Clang C/C++ start to use MLIR as its native generation format - that would be a big case for MLIR approach, but until that happens, we need to have some intermediate solution.

I think there are two points here:

1. How many SPIRV end-points we have

This is mostly about software engineering concerns of duplication, maintenance, etc. But it's also about IR support, with MLIR having an upper hand here because of the existing implementation and its inherent flexibility with dialects.

It's perfectly fine to have two back-ends for a while, but since we moved MLIR to the monorepo, we need to treat it as part of the LLVM family, not a side project. 

LLVM IR has some "flexibility" through intrinsics, which we could use, for the purpose of lowering only, to translate MLIR concepts that can't otherwise be represented in LLVM IR. Optimisations on these intrinsics would bring the usual problems.

2. Where do the optimisations happen in code lowering to SPIRV

I think Ronan's points are a good basis for keeping that in MLIR, at least for the current code. Now, if that precludes optimising in LLVM IR, then this could conflict with this proposal.

Whether the code passes through MLIR or not will be a decision of the toolchain, which will pick the best path for each workload. This allows us to have concurrent approaches in-tree, but it also makes testing harder and creates corner cases that are difficult to cover.

So, while I appreciate this is a large proposal, that will likely take a year or more to get into shape, I think the ultimate goal (after the current proposal) should be that we end up with one back-end.

I'm a big fan of MLIR, and I think we should keep developing the SPIRV dialect and possibly this could be the entry point of all SPIRV toolchains. 

While Clang will take a long time (if ever) to generate MLIR for C/C++, it could very well generate MLIR for non-C++ (OpenCL, OpenMP, SYCL, etc) which is then optimised, compiled into LLVM IR and linked to the main module (or not, for multi-targets) after high-level optimisations.

This would answer both questions above and create a pipeline that is consistent, easier to test and with lower overall maintenance costs.

cheers,
--renato

Trifunovic, Konrad via llvm-dev

unread,
Mar 3, 2021, 8:25:19 AM3/3/21
to Mehdi AMINI, llvm...@lists.llvm.org
>> So, at the moment, it does not integrate into MLIR SPIRV backend and we have not thought about it. I guess You are referring to having a SPV dialect in MLIR and using a 'serialize' option to produce a SPIR-V binary?
>>
>> I agree that developing two backends in parallel is a bit redundant. If SPIR-V LLVM backend becomes a production quality it means actually it could consume any LLVM IR (provided it does conform to some SPIR-V restrictions).
>> By any LLVM IR input I mean: it should be irrelevant whether it is produced by a clang, MLIR to LLVM IR lowering or just some other front-end that produces LLVM IR.
>> The biggest 'impedance mismatch' that I currently see is that SPV MLIR dialect is now targeted mostly at Vulkan, while LLVM SPIR-V backend targets compute. Besides instruction set, the fundamental difference is a memory model.
>> So if we want to unify those, we should actually make SPIR-V LLVM backend able to produce Vulkan dialect of SPIR-V as well.
>>
>> My answer is a bit elusive, but I totally agree with Your proposal: we should work towards having a one solution, and, LLVM SPIR-V backend seems like a more universal one (since it sits lower in the compiler stack).
>> My proposal would be to include some MLIR -> LLVM-IR translated code in the testing so to have this final goal in mind.
>>
> Something you're missing here, and maybe Lei clarified but I'll reiterate: the SPIRV dialect in MLIR is equivalent to what your GlobalISel pass will produce. It can actually round-trip to/from the SPIRV binary format. So it is sitting lower than your backend in my view.
> I can't figure out a situation where it would make sense to go from MLIR SPIRV dialect to LLVM to use this new backend, but I may miss something here...

By 'lower' I was referring to the place of the backend in a typical compiler flow that I could imagine: MLIR -> LLVM-IR (opt) -> Backend (llc).
And yes, I agree, if we treat MLIR SPV dialect as a final result of what this backend would produce, then MLIR SPV could be the lowest-level representation (before streaming into SPIR-V binary).

> It would be really great to find a common path here before duplicating a lot of the same thing in the llvm-project monorepo, for example being able to target the MLIR dialect from GlobalISel, or alternatively converting the MIR to it right after would be an interesting thing to explore.
> I haven't seen it, but there was a talk last Sunday on this topic: https://llvm.org/devmtg/2021-02-28/#vm1

We should investigate that. I believe though that GlobalISel is not really that flexible to produce MLIR (or dialects) - but that is something we might want to change 😊 That path would open us a door to have a great deal of unification:
We can support two 'entry points' :
1) Directly through MLIR. It gets translated to SPV dialect, and then streamed to SPIR-V binary. (without even going into LLVM-IR)
2) Start with LLVM-IR (with some augmented metadata and intrinsics). Feed that into proposed SPIR-V backend. Backend will produce MLIR SPV dialect and make use of whatever legalization/binary emission/etc. it provides.
This way, SPIR-V LLVM backend will be (a probably tiny) wrapper around MLIR SPV. Then, the majority of work would focus on MLIR SPV (e.g. adding support for OpenCL environment in addition to existing Vulkan compute).

From the implementation point of view, that would bring us huge re-use. Still, from the design point of view, we need to maintain two 'GPU centric' representations': LLVM-IR with SPIR-V intrinsics/metadata/attributes + MLIR SPV dialect.
Still that would be a much better situation from the community point of view.

--
konrad

James Y Knight via llvm-dev

unread,
Mar 3, 2021, 10:05:11 AM3/3/21
to Mehdi AMINI, llvm...@lists.llvm.org, Trifunovic, Konrad
On Tue, Mar 2, 2021 at 6:46 PM Mehdi AMINI <joke...@gmail.com> wrote:


On Tue, Mar 2, 2021 at 3:18 PM James Y Knight <jykn...@google.com> wrote:
On Tue, Mar 2, 2021 at 4:40 PM Mehdi AMINI via llvm-dev <llvm...@lists.llvm.org> wrote:
On Tue, Mar 2, 2021 at 3:07 AM Trifunovic, Konrad via llvm-dev <llvm...@lists.llvm.org> wrote:
Hi,

A very good question. I was actually expecting it 😊

So, at the moment, it does not integrate into MLIR SPIRV backend and we have not thought about it. I guess You are referring to having a SPV dialect in MLIR and using a 'serialize' option to produce a SPIR-V binary?

I agree that developing two backends in parallel is a bit redundant. If SPIR-V LLVM backend becomes a production quality it means actually it could consume any LLVM IR (provided it does conform to some SPIR-V restrictions).
By any LLVM IR input I mean: it should be irrelevant whether it is produced by a clang, MLIR to LLVM IR lowering or just some other front-end that produces LLVM IR.
The biggest 'impedance mismatch' that I currently see is that SPV MLIR dialect is now targeted mostly at Vulkan, while LLVM SPIR-V backend targets compute. Besides instruction set, the fundamental difference is a memory model.
So if we want to unify those, we should actually make SPIR-V LLVM backend able to produce Vulkan dialect of SPIR-V as well.

My answer is a bit elusive, but I totally agree with Your proposal: we should work towards having a one solution, and, LLVM SPIR-V backend seems like a more universal one (since it sits lower in the compiler stack).
My proposal would be to include some MLIR -> LLVM-IR translated code in the testing so to have this final goal in mind.

Something you're missing here, and maybe Lei clarified but I'll reiterate: the SPIRV dialect in MLIR is equivalent to what your GlobalISel pass will produce. It can actually round-trip to/from the SPIRV binary format. So it is sitting lower than your backend in my view.
I can't figure out a situation where it would make sense to go from MLIR SPIRV dialect to LLVM to use this new backend, but I may miss something here...

It would be really great to find a common path here before duplicating a lot of the same thing in the llvm-project monorepo, for example being able to target the MLIR dialect from GlobalISel, or alternatively converting the MIR to it right after would be an interesting thing to explore.
I haven't seen it, but there was a talk last Sunday on this topic: https://llvm.org/devmtg/2021-02-28/#vm1

This sort of problem seems like just one of those unfortunate consequences of MLIR being effectively an "LLVM IR 2.0 -- Generic Edition", but not yet actually layered underneath LLVM where it really wants to be.

I don't understand what you mean here with "layered underneath LLVM"? Can you elaborate on this?

That ultimately the goal should be for LLVM IR to be a dialect of MLIR, and for much of the optimization and codegen processes in LLVM to be implemented as MLIR dialect lowering. Then, MLIR is foundational -- "layered" underneath LLVM's core -- LLVM would have a hard dependency on MLIR.

At that point, SPIR-V as an MLIR dialect, and the SPIR-V backend doing MLIR dialect lowering would be effectively no different from how every target works -- just with a different output dialect.

I think it doesn't really make sense to tie this project to those long-term goals of layering MLIR under LLVM-IR, given the extremely long timescale on which that is likely to occur. The "proper" solution probably won't be possible any time soon.

I'm not sure if we're talking about the same thing here: there is nothing that I suggest that would operate at the level of LLVM IR. And nothing that requires a "long timescale", it seems quite easily in scope to me here.

So, in the meantime, we could implement a special-case hack just for SPIRV, to enable lowering it to MLIR-SPIRV dialect. But, what's the purpose? It wouldn't really help move towards the longer term goal, I don't think? And if someone does need that at the moment, they can just feed the SPIRV binary format back into the existing MLIR SPIRV dialect, right?

Do we want to maintain, in the LLVM monorepo, *two* different implementations of a SPIRV IR and associated serialization (and potential deserialization)? All the tools associated to manipulate it? I assume the backend may even want to implement optimization passes, are we gonna duplicate these as well?
(note that this isn't at the LLVM IR level, but post-instruction selection, so very ad-hoc to the backend anyway).

Quite possibly yes. It's unfortunate to have duplication, but given the current state of things, I think it should not be ruled out.

My inclination is that the following factors are likely to be true:
- The amount of code for SPIRV binary format serialization is not particularly large or tricky.
- The work to emit SPIR-V MLIR dialect from the LLVM SPIR-V backend will not be simpler than serializing to SPIR-V directly.
- Writing this custom code to emit SPIR-V MLIR dialect from the SPIR-V backend will not noticeably further the longer-term goals of having LLVM core be implemented as MLIR dialect lowering.

It seems to me that the choice here is either writing new code in LLVM to emit the SPIR-V MLIR dialect in the GlobalISel SPIR-V backend, or new code in LLVM to emit SPIR-V directly. And while I find the long-term prospects of MLIR integration into LLVM extremely promising, using MLIR just as a stepping stone to MLIR SPIR-V serialization does not seem particularly interesting.

So, to me the interesting question is whether we'd expect to be doing something interesting after converting to the SPIR-V MLIR dialect form besides simply serializing to SPIR-V binary format. Something that would make the added complexity of serializing through MLIR seem more worthwhile. I guess I'm not immediately seeing this as likely to be the case, but it seems well worth further discussion.

A possibility you've mentioned is post-instruction-selection optimizations. Do you have something in particular in mind there?

Stella Laurenzo via llvm-dev

unread,
Mar 3, 2021, 10:29:57 AM3/3/21
to Trifunovic, Konrad, llvm-dev


On Wed, Mar 3, 2021, 2:11 AM Trifunovic, Konrad via llvm-dev <llvm...@lists.llvm.org> wrote:
> As I understand it, SPIR-V is actually a mix of multiple things. It is first and foremost 1) a binary format for encoding GPU executables that cross the toolchain and hardware driver boundaries. Then it's 2) an intermediate level language for expressing such GPU executables. It is also 3) a flexible and extensible spec with all sorts of capability and extension mechanisms in order to support the needs of multiple APIs and hardware features. It's unclear to me what a production-quality SPIR-V LLVM backend would entail; but to actually support various use cases SPIR-V can support (OpenCL, OpenGL, Vulkan; shader/kernel; various levels of extensions; etc.), it looks to me that we need a story for all the above points, where the IR aspect (2) is actually just facet.

Agreed. Indeed, 'production quality SPIR-V backend' is vaguely defined here and we proposed one discussion point on this. For this proposal needs, we should focus on one subset of SPIR-V and one use-case (OpenCL). By production quality I mean that we can correctly produce the code for that subset of SPIR-V. I totally agree that having the 'Full SPIR-V' coverage is something very broad and probably not achievable at all - but we are not aiming at that. I do take the perspective of classical 'CPU backend' here: we do have to generate the ISA code for the input LLVM-IR code. Now, besides instructions, our backend needs to deduce proper capabilities and extensions, based on what subset of instructions is selected. As You pointed out later, plain LLVM-IR is not capable of describing the full SPIR-V. Some of decorations/extensions/capabilities might be deduced by the backend, while some need to be declared using various LLVM-IR concepts, such as metadata, attributes, intrinsics - and that needs a clear definition.


>  My understanding of LLVM is that it's mostly focusing on 2): we have a very coherent single IR threading through the majority of the layers of the compiler stack, and the IR serves very much as a means for compiler transformations (i.e., no instruction versioning etc.). There isn't much native modelling for most points of 1) and 3) (which makes sense as LLVM IR is a compiler IR). So to make it work, one would need to shoehorn things through existing LLVM mechanisms (e.g., using intrinsics for various GPU related builtins, using metadata for SPIR-V decorations?, etc.), unless we want to evolve LLVM infrastructure to have native support for the missing SPIR-V mechanisms, which I think might be too much to take on.

Also agree. I do believe, though, that LLVM-IR is still worth the effort, and we can take an incremental approach to adopting some GPU concepts into LLVM-IR (e.g. the 'convergent' attribute has been added mainly for GPU kinds of targets). The first step is defining the metadata and intrinsics that are target specific for SPIR-V, but most of them could be generalized as GPU concepts and even introduced into the core LLVM-IR spec. I agree this is a great effort - and not really the main focus point of this proposal - yet we should give LLVM-IR its own place in the world of GPUs.

> This is just general mechanisms, not mentioning the different semantics between different SPIR-V consumers (e.g., shader vs. kernel and what that means over memory/execution model, etc.) that needs to be sorted out too.. Just supporting a certain use case of what SPIR-V supports is certainly simpler though as we can bake in assumptions and avoid some infrastructure needs for the full generality.

I would focus on just a subset and clearly define what input LLVM-IR GPU dialect would look like for that subset.

>
> That's why I think using MLIR as the infrastructure to build general support for SPIR-V is more preferable as we control everything there and can feel free to model all SPIR-V concepts in the most native way. For example we can feel free to define all SPIR-V ops natively, including all ops introduced by SPIR-V extensions and extended instruction sets. We can support versions/extensions/capabilities natively and integrate it with the target environment to automatically filter out CodeGen patterns generating ops not available on the target, etc. To me, MLIR's open dialect/op/type/etc. system is a perfect fit for the open SPIR-V spec with many capabilities/extensions/etc. For example we can even make the SPIR-V dialect itself open to allow out-of-tree extensions and development and such.

Right. For the 'General SPIR-V' support, MLIR is the right abstraction level to use. And I would keep it that way. For the 'specific'/legacy uses, backend is the way to fill that gap.

> With that said, I understand that software development has many reality concerns (like existing codebase, familiarity with different components, etc.) and we have many different use cases, which may mean that different paths make sense. So please don't take this as a negative feedback in general. It's just that to me it's unclear how we can unify here right now. Even when the time arrives for unification, I'd believe going through MLIR is better to have general SPIR-V support. :)

A very good discussion! It seems I was overly optimistic at first about unifying those two approaches. Now I believe that we actually should have two paths, for the reasons You have just explained and to support 'legacy' paths/compilers that rely on the classical, years-old approach: Front-End -> LLVM-IR (opt) -> backend (llc). For that legacy path, a plain old 'backend' approach is still (in my view) the way to go. On the other hand, when MLIR evolves and gets wider adoption, it will be the way to go. From the semantic point of view, MLIR is much better suited for representing the structured and extensible nature of SPIR-V. But for the MLIR approach to be adopted, new languages/front-ends need to be aware of that structure, so as to take full advantage of it. If Clang C/C++ starts to use MLIR as its native code generation format - that would be a big case for the MLIR approach, but until that happens, we need an intermediate solution.

The most important use-case for the backend is the systems/languages that have been targeting x86, ARM, NVPTX or AMDGPU. You want to replace that particular backend with some other GPU backend (e.g. Intel GPU :) ). So the solution is to use SPIR-V as the backend target, and then consume SPIR-V with a proprietary/open-source GPU finalizer that eventually produces GPU assembly. (We did not want to come up with yet another GPU intermediate language like PTX, HSAIL etc. and want to use the Khronos standard SPIR-V for that purpose). In this use case, the stress is on points 1) and 2) and less on point 3), as You stated earlier.

So my proposal is to keep two paths. They are complementary to each other. I know about the maintenance cost concerns - but while there are use-cases for both, it is still worth it. When the world stops using LLVM-IR, it will die silently, and so will the SPIR-V backend - but that is a natural software lifecycle ;)

I agree with this viewpoint. Having a backend like this (I hesitate to just call it a "Spir-v backend" because it is a more specific thing than that) would bring a level of utility and parity between GPU targets that seems like it has a lot of value. There may be a future convergence between the LLVM-IR and MLIR approaches, but I think we need to see and do more to get there. So long as there is enough community support to build and maintain this backend, it seems like a good thing that we want in repo, and having it developed there will help us see more options for unification over time.

I'm also personally supportive of "sweating it out" on this RFC and subsequent discussions, and exploring technical options for better unification that we may have missed earlier, because these two parts of the community have been fairly isolated. Those are good conversations that we should have and continue. Unless an obvious simplification emerges as part of that, by default I would support moving forward with this proposal once the details are converged.

It'd be quite a good outcome, imo, to exit these discussions with a sketch of a plan for how these things could converge in the future, but I think we will find that future to be a ways out and require more practical first steps.

Anastasia Stulova via llvm-dev

unread,
Mar 3, 2021, 10:37:09 AM3/3/21
to Mehdi AMINI, Simone Atzeni via llvm-dev, Trifunovic, Konrad, aleksandr....@intel.com
There's definitely some consensus, or even a roadmap/timeline, missing on this transition IMO :)  And please forgive my possibly stupid question, but is there any way now to _conveniently_ incorporate an MLIR flow for projects which are based on the good old clang->llvm->mir->machinecode way? I understand we have the 'llvm' dialect, and I recall last year there was a talk about a common C/C++ dialect, but it isn't public yet, is it?

Not that I am aware of, but I haven't followed the most recent development either! We're very interested into looking into this though :)

FYI there was this thread about CIL last month
It didn’t get a lot of traction yet but I think
this topic could become more interesting to the community in the future. I am not
sure the IR generation in Clang can be completely replaced but CIL could certainly
start as an experimental alternative format that Clang could emit. I do believe it will
take a while until it will catch up with the quality and functionality that IR generation
provides.

Speaking on behalf of the OpenCL community that will greatly benefit from the
SPIR-V generation for OpenCL kernel languages in the LLVM project, having an
LLVM backend would improve the user experience and facilitate many other
improvements in Clang and LLVM for the OpenCL devices that are now blocked on
the unavailability of a common target that vendors without an in-tree LLVM backend can
contribute to. The backend will also be the easiest way to integrate with the Clang
frontend because it is a conventional route.

I do acknowledge that MLIR can provide many benefits to the OpenCL community,
however, on the Clang side a different approach than what is available right now
would probably make more sense. It would be better to go via CIL i.e. Clang
would emit CIL flavor of MLIR that would then get converted to the SPIR-V flavor
of MLIR bypassing the LLVM IR. This would be a preferable route but in order to
support CIL for OpenCL we would need to have the support for C/C++ first as it
is just a thin layer on top of the core languages that Clang supports. So this
is not something we can target at the moment because we will likely depend on
the will and the time investment from the C/C++ community for that.


Cheers,
Anastasia


Johannes Doerfert via llvm-dev

unread,
Mar 3, 2021, 11:09:18 AM3/3/21
to Mehdi AMINI, Trifunovic, Konrad, llvm...@lists.llvm.org
I would prefer the non-MLIR route first, as proposed by the original
RFC. MLIR is great and all but it certainly opens up the possibility
for unrelated problems to slow down and derail the effort. My 2c.

~ Johannes

Mehdi AMINI via llvm-dev

unread,
Mar 3, 2021, 12:18:54 PM3/3/21
to James Y Knight, llvm...@lists.llvm.org, Trifunovic, Konrad
On Wed, Mar 3, 2021 at 7:05 AM James Y Knight <jykn...@google.com> wrote:


On Tue, Mar 2, 2021 at 6:46 PM Mehdi AMINI <joke...@gmail.com> wrote:


On Tue, Mar 2, 2021 at 3:18 PM James Y Knight <jykn...@google.com> wrote:
On Tue, Mar 2, 2021 at 4:40 PM Mehdi AMINI via llvm-dev <llvm...@lists.llvm.org> wrote:
On Tue, Mar 2, 2021 at 3:07 AM Trifunovic, Konrad via llvm-dev <llvm...@lists.llvm.org> wrote:
Hi,

A very good question. I was actually expecting it 😊

So, at the moment, it does not integrate into MLIR SPIRV backend and we have not thought about it. I guess You are referring to having a SPV dialect in MLIR and using a 'serialize' option to produce a SPIR-V binary?

I agree that developing two backends in parallel is a bit redundant. If SPIR-V LLVM backend becomes a production quality it means actually it could consume any LLVM IR (provided it does conform to some SPIR-V restrictions).
By any LLVM IR input I mean: it should be irrelevant whether it is produced by a clang, MLIR to LLVM IR lowering or just some other front-end that produces LLVM IR.
The biggest 'impedance mismatch' that I currently see is that SPV MLIR dialect is now targeted mostly at Vulkan, while LLVM SPIR-V backend targets compute. Besides instruction set, the fundamental difference is a memory model.
So if we want to unify those, we should actually make SPIR-V LLVM backend able to produce Vulkan dialect of SPIR-V as well.

My answer is a bit elusive, but I totally agree with Your proposal: we should work towards having a one solution, and, LLVM SPIR-V backend seems like a more universal one (since it sits lower in the compiler stack).
My proposal would be to include some MLIR -> LLVM-IR translated code in the testing so to have this final goal in mind.

Something you're missing here, and maybe Lei clarified but I'll reiterate: the SPIRV dialect in MLIR is equivalent to what your GlobalISel pass will produce. It can actually round-trip to/from the SPIRV binary format. So it is sitting lower than your backend in my view.
I can't figure out a situation where it would make sense to go from MLIR SPIRV dialect to LLVM to use this new backend, but I may miss something here...

It would be really great to find a common path here before duplicating a lot of the same thing in the llvm-project monorepo, for example being able to target the MLIR dialect from GlobalISel, or alternatively converting the MIR to it right after would be an interesting thing to explore.
I haven't seen it, but there was a talk last Sunday on this topic: https://llvm.org/devmtg/2021-02-28/#vm1

This sort of problem seems like just one of those unfortunate consequences of MLIR being effectively an "LLVM IR 2.0 -- Generic Edition", but not yet actually layered underneath LLVM where it really wants to be.

I don't understand what you mean here with "layered underneath LLVM"? Can you elaborate on this?

That ultimately the goal should be for LLVM IR to be a dialect of MLIR, and for much of the optimization and codegen processes in LLVM to be implemented as MLIR dialect lowering. Then, MLIR is foundational -- "layered" underneath LLVM's core -- LLVM would have a hard dependency on MLIR.

OK I see what you mean now, I didn't connect to this because I think it is an open question whether we see this happen in this decade ;)

So my assumption coming here is that:
1) LLVM IR as it is now is "granted" (at least in the context of this thread).
2) A SPIRV backend that takes LLVMIR and use GlobalISel is desirable.

Considering this, my angle is mainly one of library, software engineering, and reuse / avoiding duplication.
So adding intrinsics to LLVM IR and improving the GPU support in LLVM IR is something I see as "obviously good" and necessary for this project. The only opportunity for sharing and avoiding duplication appears to me right after GlobalISel for the rest of the pipeline.

At that point, SPIR-V as an MLIR dialect, and the SPIR-V backend doing MLIR dialect lowering would be effectively no different from how every target works -- just with a different output dialect.

I think it doesn't really make sense to tie this project to those long-term goals of layering MLIR under LLVM-IR, given the extremely long timescale on which that is likely to occur. The "proper" solution probably won't be possible any time soon.

I'm not sure if we're talking about the same thing here: there is nothing that I suggest that would operate at the level of LLVM IR. And nothing that requires a "long timescale", it seems quite easily in scope to me here.

So, in the meantime, we could implement a special-case hack just for SPIRV, to enable lowering it to MLIR-SPIRV dialect. But, what's the purpose? It wouldn't really help move towards the longer term goal, I don't think? And if someone does need that at the moment, they can just feed the SPIRV binary format back into the existing MLIR SPIRV dialect, right?

Do we want to maintain, in the LLVM monorepo, *two* different implementations of a SPIRV IR and associated serialization (and potential deserialization)? All the tools associated to manipulate it? I assume the backend may even want to implement optimization passes, are we gonna duplicate these as well?
(note that this isn't at the LLVM IR level, but post-instruction selection, so very ad-hoc to the backend anyway).

Quite possibly yes. It's unfortunate to have duplication, but given the current state of things, I think it should not be ruled out.

My inclination is that the following factors are likely to be true:
- The amount of code for SPIRV binary format serialization is not particularly large or tricky.
- The work to emit SPIR-V MLIR dialect from the LLVM SPIR-V backend will not be simpler than serializing to SPIR-V directly.
- Writing this custom code to emit SPIR-V MLIR dialect from the SPIR-V backend will not noticeably further the longer-term goals of having LLVM core be implemented as MLIR dialect lowering.

These are great considerations, I subscribe entirely :)
 

It seems to me that the choice here is either writing new code in LLVM to emit the SPIR-V MLIR dialect in the GlobalISel SPIR-V backend, or new code in LLVM to emit SPIR-V directly. And while I find the long-term prospects of MLIR integration into LLVM extremely promising, using MLIR just as a stepping stone to MLIR SPIR-V serialization does not seem particularly interesting.

So, to me the interesting question is whether we'd expect to be doing something interesting after converting to the SPIR-V MLIR dialect form besides simply serializing to SPIR-V binary format. Something that would make the added complexity of serializing through MLIR seem more worthwhile. I guess I'm not immediately seeing this as likely to be the case, but it seems well worth further discussion.

A possibility you've mentioned is post-instruction-selection optimizations. Do you have something in particular in mind there?

I suspect that post-GlobalISel there is a bit more to it than "taking the MIR as-is and emitting the SPIRV serialization in a single traversal". So converting MIR to MLIR means that everything that we want to happen at this point will be shared.
Note that this is different from other backends, because I don't expect SPIRV to share passes (RA, Scheduling, ...) on MIR or to take advantage of the MC layer in the same way (if I'm wrong here then my point is less strong though).

Cheers,

-- 
Mehdi

Trifunovic, Konrad via llvm-dev

unread,
Mar 3, 2021, 12:48:32 PM3/3/21
to Renato Golin, llvm...@lists.llvm.org
Answering to Renato's points:

> I think there are two points here:
>
> 1. How many SPIRV end-points we have

I would rather call this 'two entry points' for accessing SPIR-V: either through LLVM-IR with augmentation (metadata/intrinsics), or through a proper MLIR 'SPV' dialect.
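
For the LLVM-IR entry point, a rough sketch of what the augmented input could look like for an OpenCL kernel on a SPIR-style target (simplified and only illustrative; the exact metadata, mangling and triple conventions are whatever the backend ends up specifying):

  target triple = "spir64-unknown-unknown"

  ; Well-known mangled builtin name, as used by the current translator flow.
  declare spir_func i64 @_Z13get_global_idj(i32)

  define spir_kernel void @vadd(float addrspace(1)* %a,
                                float addrspace(1)* %b,
                                float addrspace(1)* %c) !kernel_arg_addr_space !0 {
  entry:
    %gid = call spir_func i64 @_Z13get_global_idj(i32 0)
    %pa = getelementptr inbounds float, float addrspace(1)* %a, i64 %gid
    %pb = getelementptr inbounds float, float addrspace(1)* %b, i64 %gid
    %pc = getelementptr inbounds float, float addrspace(1)* %c, i64 %gid
    %va = load float, float addrspace(1)* %pa, align 4
    %vb = load float, float addrspace(1)* %pb, align 4
    %s = fadd float %va, %vb
    store float %s, float addrspace(1)* %pc, align 4
    ret void
  }

  !0 = !{i32 1, i32 1, i32 1} ; all arguments in the global address space

The open question is precisely which of these pieces (calling convention, address spaces, well-known builtin names vs. proper target intrinsics, metadata) form the backend's input contract.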

> This is mostly about software engineering concerns of duplication, maintenance, etc. But it's also about IR support, with MLIR having an upper hand here because of the existing implementation and its inherent flexibility with dialects.
>
> It's perfectly fine to have two back-ends for a while, but since we moved MLIR to the monorepo, we need to treat it as part of the LLVM family, not a side project.

Agreed. We are not treating MLIR as a side project 😊

>
> LLVM IR has some "flexibility" through intrinsics, which we could use to translate MLIR concepts that can't be represented in LLVM IR for the purpose of lowering only. Optimisations on these intrinsics would bring the usual problems.
>
> 2. Where do the optimisations happen in code lowering to SPIRV
>
> I think Ronan's points are a good basis for keeping that in MLIR, at least for the current code. Now, if that precludes optimising in LLVM IR, then this could be a conflict with this proposal.

I think You are referring to points made by Lei, not by Ronan 😉
I believe that the idea of the MLIR path is to completely skip the LLVM-IR optimization passes, so having just the MLIR 'entry point' would preclude optimizing in LLVM IR. (Though it would be possible to do FE -> LLVM-IR (optimize here) -> LLVM-IR to MLIR -> 'spv' MLIR dialect -> SPIR-V, that seems like overkill...)

> Whether the code passes through MLIR or not will be a decision of the toolchain, which will pick the best path for each workload. This allows us to have concurrent approaches in tree, but it also makes testing harder and creates corner cases that are easy to miss.

If we provide two 'entry points' for accessing SPIR-V, it is up to the toolchain to decide the most convenient way. I'm not sure whether this would be a runtime decision though. I believe that all future front-ends would like to target MLIR directly (and skip LLVM-IR altogether).

> So, while I appreciate this is a large proposal, that will likely take a year or more to get into shape, I think the ultimate goal (after the current proposal) should be that we end up with one back-end.

Agree. Though I would say: one back-end, but two 'entry points'. As I wrote in a reply to Mehdi, having the LLVM SPIR-V backend produce the 'spv' MLIR dialect would be a good way to ultimately unify the implementation. Though that seems like a longer-term effort and needs some research, since at the moment I'm not sure how to tackle having GlobalISel produce MLIR as its output.

> I'm a big fan of MLIR, and I think we should keep developing the SPIRV dialect and possibly this could be the entry point of all SPIRV toolchains.

MLIR should be an entry point for all future SPIRV toolchains - that is the future. There will still be legacy toolchains that cannot be rewritten to use MLIR.


> While Clang will take a long time (if ever) to generate MLIR for C/C++, it could very well generate MLIR for non-C++ (OpenCL, OpenMP, SYCL, etc) which is then optimised, compiled into LLVM IR and linked to the main module (or not, for multi-targets) after high-level optimisations.

I'm not sure about Clang OpenCL support. I believe that OpenCL C/C++ cannot produce MLIR directly. For OpenMP, I know that flang (Fortran) does have an MLIR-based 'codegen'. Not sure about SYCL either. Someone from Intel should know: does the clang-based SYCL compiler have a path to produce MLIR?

> This would answer both questions above and create a pipeline that is consistent, easier to test and with lower overall maintenance costs.

We should definitely aim at SPIR-V support to be less fragmented if possible (at the moment, we also have SPIR-V <-> LLVM bidirectional translator which is an external project).

> cheers,
> --renato

Lei Zhang via llvm-dev

unread,
Mar 3, 2021, 1:25:57 PM3/3/21
to Trifunovic, Konrad, llvm...@lists.llvm.org
On Wed, Mar 3, 2021 at 8:25 AM Trifunovic, Konrad via llvm-dev <llvm...@lists.llvm.org> wrote:
>> So, at the moment, it does not integrate into MLIR SPIRV backend and we have not thought about it. I guess You are referring to having a SPV dialect in MLIR and using a 'serialize' option to produce a SPIR-V binary?
>>
>> I agree that developing two backends in parallel is a bit redundant. If SPIR-V LLVM backend becomes a production quality it means actually it could consume any LLVM IR (provided it does conform to some SPIR-V restrictions).
>> By any LLVM IR input I mean: it should be irrelevant whether it is produced by a clang, MLIR to LLVM IR lowering or just some other front-end that produces LLVM IR.
>> The biggest 'impedance mismatch' that I currently see is that SPV MLIR dialect is now targeted mostly at Vulkan, while LLVM SPIR-V backend targets compute. Besides instruction set, the fundamental difference is a memory model.
>> So if we want to unify those, we should actually make SPIR-V LLVM backend able to produce Vulkan dialect of SPIR-V as well.
>>
>> My answer is a bit elusive, but I totally agree with Your proposal: we should work towards having a one solution, and, LLVM SPIR-V backend seems like a more universal one (since it sits lower in the compiler stack).
>> My proposal would be to include some MLIR -> LLVM-IR translated code in the testing so to have this final goal in mind.
>>
> Something you're missing here, and maybe Lei clarified but I'll reiterate: the SPIRV dialect in MLIR is equivalent to what your GlobalISel pass will produce. It can actually round-trip to/from the SPIRV binary format. So it is sitting lower than your backend in my view.
> I can't figure out a situation where it would make sense to go from MLIR SPIRV dialect to LLVM to use this new backend, but I may miss something here...

By 'lower' I was referring to the place of the backend in a typical compiler flow that I could imagine: MLIR -> LLVM-IR (opt) -> Backend (llc).
And yes, I agree, if we treat MLIR SPV dialect as a final result of what this backend would produce, then MLIR SPV could be the lowest-level representation (before streaming into SPIR-V binary).

> It would be really great to find a common path here before duplicating a lot of the same thing in the llvm-project monorepo, for example being able to target the MLIR dialect from GlobalISel, or alternatively converting the MIR to it right after would be an interesting thing to explore.
> I haven't seen it, but there was a talk last Sunday on this topic: https://llvm.org/devmtg/2021-02-28/#vm1

We should investigate that. I believe though that GlobalISel is not really that flexible to produce MLIR (or dialects) - but that is something we might want to change 😊 That path would open us a door to have a great deal of unification:

+1. This sounds quite interesting and worth exploration. Starting with some initial feasibility/blockers investigation would be awesome. It's also not scoped to SPIR-V per se; done right it actually can have the ability to connect any MLIR dialect I guess? (So later we can design other dialects for LLVM CodeGen probably.) Connecting LLVM and MLIR is a huge effort. And it needs to start somewhere. This might be a nice first attempt? (Please pardon me if this is obviously not; I'm not super familiar with GlobalISel infrastructure.)

Lei Zhang via llvm-dev

unread,
Mar 3, 2021, 1:41:50 PM3/3/21
to James Y Knight, llvm...@lists.llvm.org, Trifunovic, Konrad
Serialization itself is not a large amount of work, but I think there are also other considerations worth calling out. In the development of the SPIR-V MLIR dialect, we have been following the spec verbatim and thoroughly adding validation rules when introducing new operations. Going through the SPIR-V MLIR dialect we get all the validation on the operations and against the target environment to guarantee the generated binary is valid. I'd assume you'll need that validation too before serialization in the LLVM SPIR-V backend?

 
- Writing this custom code to emit SPIR-V MLIR dialect from the SPIR-V backend will not noticeably further the longer-term goals of having LLVM core be implemented as MLIR dialect lowering.

It seems to me that the choice here is either writing new code in LLVM to emit the SPIR-V MLIR dialect in the GlobalISel SPIR-V backend, or new code in LLVM to emit SPIR-V directly. And while I find the long-term prospects of MLIR integration into LLVM extremely promising, using MLIR just as a stepping stone to MLIR SPIR-V serialization does not seem particularly interesting.

So, to me the interesting question is whether we'd expect to be doing something interesting after converting to the SPIR-V MLIR dialect form besides simply serializing to SPIR-V binary format. Something that would make the added complexity of serializing through MLIR seem more worthwhile. I guess I'm not immediately seeing this as likely to be the case, but it seems well worth further discussion.

A possibility you've mentioned is post-instruction-selection optimizations. Do you have something in particular in mind there? 

Ronan KERYELL via llvm-dev

unread,
Mar 3, 2021, 1:42:07 PM3/3/21
to Renato Golin via llvm-dev, Trifunovic, Konrad
>>>>> On Wed, 3 Mar 2021 11:24:45 +0000, Renato Golin via llvm-dev <llvm...@lists.llvm.org> said:

Renato> While Clang will take a long time (if ever) to generate MLIR
Renato> for C/C++, it could very well generate MLIR for non-C++
Renato> (OpenCL, OpenMP, SYCL, etc) which is then optimised,
Renato> compiled into LLVM IR and linked to the main module (or not,
Renato> for multi-targets) after high-level optimisations.

Actually SYCL is pure C++, just with a few special C++ classes similar
to some other special things like std::thread or setjmp()/longjmp().

OpenMP, when associated to C++, is also pure C++.

In your list OpenCL is a language based on C/C++ to program accelerators
while SYCL & OpenMP are single-source frameworks to program full
applications using a host and some accelerators, with both parts in the
same source program in a seamless and type-safe way.

So the MLIR approach is quite compelling with its "Multi-Level"
representation of both the host and the device code to enable
multi-level inter-procedural or inter-module optimizations which cannot
be done today when compiling single-source OpenMP/SYCL/CUDA/HIP/OpenACC
because most implementations use early outlining of the device code,
thus it is super hard to do inter-module optimization later without a
multi-level view.

As you and most other people said, it looks we are stuck with plain LLVM
for a while.

But perhaps you were considering in your sentence the case where with
OpenMP/SYCL/CUDA/HIP you generate LLVM for the host code part and MLIR
just for the hardware accelerator parts?

While it would obviously allow to recycle more easily the MLIR SPIR-V
generator, it would still require somehow to generate MLIR from the C++
accelerator parts. At least the C++ code allowed in accelerator parts is
often restricted, so it is easier to do than with full-fledged host part
C++ and actually there are a few hacks trying to do this (for example
leveraging Polly, PPCG...). But it seems we are far from a
production-quality infrastructure yet.

So it looks like, while we do not have a robust C++ to MLIR path, we need
an LLVM IR to SPIR-V path somehow.

At least, as others like Mehdi said, let's do good software
engineering and factorize out as much as we can between the LLVM IR and
MLIR paths.
--
Ronan KERYELL

Aleksandr Bezzubikov via llvm-dev

unread,
Mar 3, 2021, 1:45:22 PM3/3/21
to Lei Zhang, llvm...@lists.llvm.org, Trifunovic, Konrad
On Wed, Mar 3, 2021 at 9:25 PM Lei Zhang via llvm-dev <llvm...@lists.llvm.org> wrote:


On Wed, Mar 3, 2021 at 8:25 AM Trifunovic, Konrad via llvm-dev <llvm...@lists.llvm.org> wrote:
>> So, at the moment, it does not integrate into MLIR SPIRV backend and we have not thought about it. I guess You are referring to having a SPV dialect in MLIR and using a 'serialize' option to produce a SPIR-V binary?
>>
>> I agree that developing two backends in parallel is a bit redundant. If SPIR-V LLVM backend becomes a production quality it means actually it could consume any LLVM IR (provided it does conform to some SPIR-V restrictions).
>> By any LLVM IR input I mean: it should be irrelevant whether it is produced by a clang, MLIR to LLVM IR lowering or just some other front-end that produces LLVM IR.
>> The biggest 'impedance mismatch' that I currently see is that SPV MLIR dialect is now targeted mostly at Vulkan, while LLVM SPIR-V backend targets compute. Besides instruction set, the fundamental difference is a memory model.
>> So if we want to unify those, we should actually make SPIR-V LLVM backend able to produce Vulkan dialect of SPIR-V as well.
>>
>> My answer is a bit elusive, but I totally agree with Your proposal: we should work towards having a one solution, and, LLVM SPIR-V backend seems like a more universal one (since it sits lower in the compiler stack).
>> My proposal would be to include some MLIR -> LLVM-IR translated code in the testing so to have this final goal in mind.
>>
> Something you're missing here, and maybe Lei clarified but I'll reiterate: the SPIRV dialect in MLIR is equivalent to what your GlobalISel pass will produce. It can actually round-trip to/from the SPIRV binary format. So it is sitting lower than your backend in my view.
> I can't figure out a situation where it would make sense to go from MLIR SPIRV dialect to LLVM to use this new backend, but I may miss something here...

By 'lower' I was referring to the place of the backend in a typical compiler flow that I could imagine: MLIR -> LLVM-IR (opt) -> Backend (llc).
And yes, I agree, if we treat MLIR SPV dialect as a final result of what this backend would produce, then MLIR SPV could be the lowest-level representation (before streaming into SPIR-V binary).

> It would be really great to find a common path here before duplicating a lot of the same thing in the llvm-project monorepo, for example being able to target the MLIR dialect from GlobalISel, or alternatively converting the MIR to it right after would be an interesting thing to explore.
> I haven't seen it, but there was a talk last Sunday on this topic: https://llvm.org/devmtg/2021-02-28/#vm1

We should investigate that. I believe though that GlobalISel is not really that flexible to produce MLIR (or dialects) - but that is something we might want to change 😊 That path would open us a door to have a great deal of unification:

+1. This sounds quite interesting and worth exploration. Starting with some initial feasibility/blockers investigation would be awesome. It's also not scoped to SPIR-V per se; done right it actually can have the ability to connect any MLIR dialect I guess? (So later we can design other dialects for LLVM CodeGen probably.) Connecting LLVM and MLIR is a huge effort. And it needs to start somewhere. This might be a nice first attempt? (Please pardon me if this is obviously not; I'm not super familiar with GlobalISel infrastructure.)

It would be great to hear backend experts' opinions on that, but it seems that it's more of a problem of producing MLIR with standard LLVM codegen infra, no matter whether it's SelectionDAG or GlobalISel.
 

 
We can support two 'entry points' :
1) Directly through MLIR. It gets translated to SPV dialect, and then streamed to SPIR-V binary. (without even going into LLVM-IR)
2) Start with LLVM-IR (with some augmented metadata and intrinsics). Feed that into proposed SPIR-V backend. Backend will produce MLIR SPV dialect and make use of whatever legalization/binary emission/etc. it provides.
This way, SPIR-V LLVM backend will be (a probably tiny) wrapper around MLIR SPV. Then, the majority of work would focus on MLIR SPV (e.g. adding support for OpenCL environment in addition to existing Vulkan compute).


I'm actually wondering what the point is of having the backend target the 'spv' dialect of MLIR rather than some generic 'mir' dialect? The latter may help in retargeting other backends eventually, which may be perceived as a huge plus.
 
From the implementation point of view, that would bring us huge re-use. Still, from the design point of view, we need to maintain two 'GPU centric' representations': LLVM-IR with SPIR-V intrinsics/metadata/attributes + MLIR SPV dialect.
Still that would be a much better situation from the community point of view.

--
konrad







Thanks,
Alex

Renato Golin via llvm-dev

unread,
Mar 3, 2021, 1:57:52 PM3/3/21
to Ronan KERYELL, Renato Golin via llvm-dev, Trifunovic, Konrad
On Wed, 3 Mar 2021 at 18:41, Ronan KERYELL <ronan...@keryell.fr> wrote:
But perhaps you were considering in your sentence the case where with
OpenMP/SYCL/CUDA/HIP you generate LLVM for the host code part and MLIR
just for the hardware accelerator parts?

Just thinking out loud if clang couldn't be a hybrid front-end, emitting LLVM IR and MLIR for different parts of the program (for example, accelerators), and either use SPIRV (for supported accelerators) or lower to LLVM IR (for the rest). This would allow us to use MLIR directly in hybrid programming models (like OpenMP, OpenCL) and make real use of the high-level optimisations in MLIR. Perhaps SYCL wouldn't fit here.

I'm just going back to the overall goal of MLIR and trying to see, in our current approach, what we lower too soon to LLVM IR, so that it could lower to MLIR instead, piecewise, and then to LLVM IR (or not) later.

Johannes Doerfert via llvm-dev

unread,
Mar 3, 2021, 2:02:36 PM3/3/21
to Ronan KERYELL, Renato Golin via llvm-dev, Trifunovic, Konrad

On 3/3/21 12:41 PM, Ronan KERYELL via llvm-dev wrote:
>>>>>> On Wed, 3 Mar 2021 11:24:45 +0000, Renato Golin via llvm-dev <llvm...@lists.llvm.org> said:
> Renato> While Clang will take a long time (if ever) to generate MLIR
> Renato> for C/C++, it could very well generate MLIR for non-C++
> Renato> (OpenCL, OpenMP, SYCL, etc) which is then optimised,
> Renato> compiled into LLVM IR and linked to the main module (or not,
> Renato> for multi-targets) after high-level optimisations.
>
> Actually SYCL is pure C++, just with a few special C++ classes similar
> to some other special things like std::thread or setjmp()/longjmp().
>
> OpenMP, when associated to C++, is also pure C++.
>
> In your list OpenCL is a language based on C/C++ to program accelerators
> while SYCL & OpenMP are single-source frameworks to program full
> applications using a host and some accelerators, with both parts in the
> same source program in a seamless and type-safe way.
>
> So the MLIR approach is quite compelling with its "Multi-Level"
> representation of both the host and the device code to enable
> multi-level inter-procedural or inter-module optimizations which cannot
> be done today when compiling single-source OpenMP/SYCL/CUDA/HIP/OpenACC
> because most implementations use early outlining of the device code,
> thus it is super hard to do inter-module optimization later without a
> multi-level view.

FWIW, the indirection "caused by" early outlining is already covered
with callbacks in LLVM-IR. The separation of host and device code
will be covered with heterogeneous LLVM-IR modules.

"Super hard" is definitively not how I would describe it given our
current state, we mainly need to put the things together.

~ Johannes


>
> As you and most other people said, it looks we are stuck with plain LLVM
> for a while.
>
> But perhaps you were considering in your sentence the case where with
> OpenMP/SYCL/CUDA/HIP you generate LLVM for the host code part and MLIR
> just for the hardware accelerator parts?
>
> While it would obviously allow to recycle more easily the MLIR SPIR-V
> generator, it would still require somehow to generate MLIR from the C++
> accelerator parts. At least the C++ code allowed in accelerator parts is
> often restricted, so it is easier to do than with full-fledged host part
> C++ and actually there are a few hacks trying to do this (for example
> leveraging Polly, PPCG...). But it seems we are far from a
> production-quality infrastructure yet.
>
> So it looks like, while we do not have a robust C++ to MLIR path, we need
> an LLVM IR to SPIR-V path somehow.
>
> At least, as others like Mehdi said, let's do good software
> engineering and factorize out as much as we can between the LLVM IR and
> MLIR paths.

Johannes Doerfert via llvm-dev

unread,
Mar 3, 2021, 2:06:22 PM3/3/21
to Renato Golin, Ronan KERYELL, Renato Golin via llvm-dev, Trifunovic, Konrad

On 3/3/21 12:57 PM, Renato Golin via llvm-dev wrote:
> On Wed, 3 Mar 2021 at 18:41, Ronan KERYELL <ronan...@keryell.fr> wrote:
>
>> But perhaps you were considering in your sentence the case where with
>> OpenMP/SYCL/CUDA/HIP you generate LLVM for the host code part and MLIR
>> just for the hardware accelerator parts?
>
> Just thinking out loud if clang couldn't be a hybrid front-end, emitting
> LLVM IR and MLIR for different parts of the program (for example,
> accelerators), and either use SPIRV (for supported accelerators) or lower
> to LLVM IR (for the rest). This would allow us to use MLIR directly in
> hybrid programming models (like OpenMP, OpenCL) and make real use of the
> high-level optimisations in MLIR. Perhaps SYCL wouldn't fit here.

As Ronan said, OpenMP (and others) is C/C++/Fortran + some additions.
Without having a full fledged C/C++ MLIR route we won't be able to build
a toolchain I would trust, not to mention the lack of evidence we need
to go to MLIR in the first place ;)

~ Johannes


>
> I'm just going back to the overall goal of MLIR and trying to see on our
> current approach, what do we lower too soon to LLVM IR, so it could lower
> to MLIR instead, piece wise, then lower to LLVM IR (or not) later.
>
>

Ronan KERYELL via llvm-dev

unread,
Mar 3, 2021, 2:55:47 PM3/3/21
to Renato Golin, Renato Golin via llvm-dev, Trifunovic, Konrad
>>>>> On Wed, 3 Mar 2021 18:57:33 +0000, Renato Golin <reng...@gmail.com> said:

Renato> Just thinking out loud if clang couldn't be a hybrid
Renato> front-end, emitting LLVM IR and MLIR for different parts of
Renato> the program (for example, accelerators), and either use
Renato> SPIRV (for supported accelerators) or lower to LLVM IR (for
Renato> the rest).

At some point everything is possible.

Renato> This would allow us to use MLIR directly in hybrid
Renato> programming models (like OpenMP, OpenCL) and make real use
Renato> of the high-level optimisations in MLIR. Perhaps SYCL
Renato> wouldn't fit here.

I guess you have exchanged the words OpenCL and SYCL in this sentence.
OpenCL is like graphics shader languages: a foreign language to the
host. From the host point-of-view it is just a host foreign API managing
memory allocation on the device and controlling kernel execution on some
devices (think RPC). The advantage is that you can use OpenCL from a
COBOL host program if you want but this is another story... :-)

Trifunovic, Konrad via llvm-dev

unread,
Mar 12, 2021, 5:30:14 AM3/12/21
to llvm...@lists.llvm.org
Hi,

Thank you all for a very in-depth discussion so far.
Besides the major topic of MLIR and LLVM-IR coexistence, are there any other comments, especially regarding 'Open questions' section that we proposed?

My recap so far is:
* There is a good reception from the community that is interested in LLVM-IR path (a classical FE/opt/code-generation path, clang community)
* There is a concern on maintenance cost if we have two solutions in parallel: MLIR based, and LLVM-IR based. We will look for the ways to address this, one investigation point would be generating MLIR 'spv' dialect from the target backend infrastructure (GlobalISel)
* We also need to iron out the details of the semantics and capabilities of SPIR-V that we would like to expose: 1) which exact subset of LLVM-IR is acceptable by the backend, 2) how do we expose the extensions and core builtins, 3) how do we map a memory model to LLVM-IR (especially if we think about adding Vulkan compute memory model) etc.
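
On the memory-model point specifically, one option (an assumption at this stage, not something the backend defines yet) is to lean on LLVM's existing atomic orderings plus target-defined syncscopes and let the backend map them onto SPIR-V memory scopes and semantics; the scope names below are placeholders for illustration:

  ; Sketch only: "device" and "workgroup" are assumed scope names, and the
  ; address-space numbering follows the usual SPIR convention (1 = global,
  ; 3 = local).
  define spir_func i32 @bump(i32 addrspace(1)* %counter, i32 addrspace(3)* %slot) {
  entry:
    ; device-wide atomic increment -> e.g. OpAtomicIAdd with Device scope
    %old = atomicrmw add i32 addrspace(1)* %counter, i32 1 syncscope("device") monotonic
    ; work-group scoped release store -> Workgroup scope, Release semantics
    store atomic i32 %old, i32 addrspace(3)* %slot syncscope("workgroup") release, align 4
    ret i32 %old
  }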

Based on the feedback so far, that would be roughly my plan:
* Go ahead with a SPIR-V backend in LLVM-IR, as planned. Look for clang integration.
* Midterm: investigate MLIR 'spv' dialect generation from GlobalISel (or other means) as a unifying solution
* Long-term: come up with a single, MLIR based backend (which is going to take care of serialization, deserialization, automatic specification updates and all the rest of the infrastructure work). This still means we will have two entry points: directly through MLIR 'spv' dialect or through LLVM-IR with intrinsics/metadata. Will support both Vulkan compute and OpenCL compute models.
* When/if clang eventually switches to MLIR code generation, that would be the end of the LLVM-IR path :)

For the first point, we need a close collaboration with GlobalISel and clang communities. For the last three points, we will work closely with MLIR community on engineering and IR specification.

What do You think? Does this sound like a reasonable plan?

thanks,
konrad


Johannes Doerfert via llvm-dev

unread,
Mar 12, 2021, 1:58:07 PM3/12/21
to Trifunovic, Konrad, llvm...@lists.llvm.org

On 3/12/21 4:30 AM, Trifunovic, Konrad via llvm-dev wrote:
> [...]
>
> What do you think? Does this sound like a reasonable plan?

The first point sounds good; let's start there and revisit the rest as it becomes
interesting. There is no need to finalize anything, as the context is arguably fluid.

----

FWIW, the other three points sound very much like "MLIR is obviously the solution,
let's switch everything over so our problems are somehow solved". If I were you,
I would not want to be the first to do the LLVM-IR -> MLIR dialect code path.
If there is one, adding SPIR-V support to it is something one can look at;
I'd expect pros, cons, and a lot of unforeseen complications.

My second point, if any, would be to lower some parts of the SPIR-V dialect to
LLVM-IR (plus the extensions you are adding anyway for point one). That seems
like a low-cost way to determine whether a single SPIR-V backend (in LLVM core)
would make sense or not.

~ Johannes

Mehdi AMINI via llvm-dev

unread,
Mar 13, 2021, 10:32:24 PM3/13/21
to Trifunovic, Konrad, llvm...@lists.llvm.org
On Fri, Mar 12, 2021 at 2:30 AM Trifunovic, Konrad via llvm-dev <llvm...@lists.llvm.org> wrote:
> Hi,
>
> [...]
>
> Based on the feedback so far, here is roughly my plan:
> * Go ahead with the SPIR-V backend on the LLVM-IR path, as planned, and look for clang integration.
> * Midterm: investigate MLIR 'spv' dialect generation from GlobalISel (or other means) as a unifying solution.
To me this is something to look at in the very short term: I'm wary of investing in duplicated infrastructure for SPIR-V binary serialization/deserialization in-tree, i.e. the path from MIR (post-GlobalISel) to SPIR-V (and potentially any MIR-level transformations).
To be clear: this does not block landing parts of the backend (all the GlobalISel specification, for instance), but I suspect you'll want to get something working end-to-end.

The two remaining points (the long-term MLIR-based backend and the eventual clang switch to MLIR code generation) are indeed more long-term / hypothetical in my opinion: good to keep in mind, but clearly off the table right now.
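
To give a flavour of what that duplicated serialization work amounts to, here is a tiny, hypothetical sketch (not the code of either the backend prototype or the MLIR serializer) that emits just the five-word SPIR-V module header defined by the specification:

  #include <cstdint>
  #include <vector>

  // Append the SPIR-V module header to a word stream.
  // `Bound` must be strictly greater than every result <id> used in the module.
  static void emitSPIRVHeader(std::vector<uint32_t> &Words, uint32_t Bound) {
    Words.push_back(0x07230203u); // magic number
    Words.push_back(0x00010000u); // version 1.0, encoded as 0 | major | minor | 0
    Words.push_back(0u);          // generator magic number (0 = unregistered tool)
    Words.push_back(Bound);       // id bound
    Words.push_back(0u);          // reserved instruction schema, must be 0
  }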