Polly for GPU

Xin Tong

Mar 5, 2012, 12:08:51 PM

to poll...@googlegroups.com

Hello

I am investigating the possibility of a project that uses Polly to
parallelize loops and generate code that runs on a GPU. One option is
to generate CUDA/OpenCL code, but LLVM currently does not have an
OpenCL/CUDA code generator. Another option is to generate PTX code,
for which LLVM currently does have a backend.

Tobias, have you thought about or evaluated both approaches? Any suggestions?

Thanks

Xin

Tobias Grosser

Mar 5, 2012, 2:02:44 PM

to poll...@googlegroups.com, xerox.t...@gmail.com

Hi Xin,

I have thought about it. ;-)

If you want to translate from LLVM-IR to GPU code there are, as you
pointed out, two possible approaches:

1. Generating C-like CUDA/OpenCL code

To generate CUDA/OpenCL, a backend like the LLVM C backend
needs to be written, or the LLVM C backend needs to be extended. For
OpenCL this was already done in a bachelor's thesis [1]. This approach
has both advantages and disadvantages:

Advantages:
- The result can be compiled by any OpenCL compiler

(Though this only works if we solve the hard problem of
generating correct OpenCL code from LLVM-IR.)

Disadvantages:
- Unreliable and buggy

The C backend is known to be buggy (and a rewrite is needed to
fix it). An OpenCL backend based on the C backend will also
be unreliable if the root problems are not fixed.

- Overhead
Going from LLVM-IR back to OpenCL is actually just overhead.
All OpenCL compilers I know of (AMD, Intel, NVIDIA)
lower OpenCL to LLVM-IR again.

2. Using compiler backends for the GPU low-level IRs

Here we go directly from LLVM-IR to PTX/AMD-IL/..
Again, there are some positive and negative points:

Disadvantages:
- You need specific compiler backends
- Need to generate different meta-data for different GPUs

At the moment, the different backends expect different
meta-data. This means your code would, for the moment, be
backend-specific.

Advantages:
- Reliable

The existence of proprietary GPU backends shows that this
approach can yield production quality software.

- Existing open source backends

LLVM includes a PTX backend. AMD has also open-sourced its AMD-IL
backend [2].

- Upcoming standard ?!

I heard rumors that people are thinking about standardizing LLVM-IR
as an alternative OpenCL input format.

Personally, I would recommend going with approach two. I think time is
better invested in working on advanced transformations that enable GPU
code generation than in fixing the C backend. If you plan to work on
optimizations, you may want to have a look at Muthu's work [3]. In my
group we are currently working on improving this. The relevant code is
available in ppcg [4]. For the moment, this is still a source-to-source
tool, but adapting/using these algorithms for Polly would be great.
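
As a rough illustration of what GPU code generation has to produce for a loop that has been proven parallel: the loop is replaced by a kernel in which every thread derives its own iteration from a block index and a thread index. The sketch below only simulates this mapping in plain Python. All function names and the block size are hypothetical, and it is not the algorithm used by Polly, ppcg, or the work in [3].

```python
def vector_add_loop(a, b):
    """Original sequential loop: one iteration per element."""
    c = [0.0] * len(a)
    for i in range(len(a)):
        c[i] = a[i] + b[i]
    return c


def vector_add_kernel(block_idx, thread_idx, block_dim, a, b, c):
    """Loop body as executed by one simulated GPU thread."""
    i = block_idx * block_dim + thread_idx  # global iteration index
    if i < len(c):  # guard: the grid may cover more threads than iterations
        c[i] = a[i] + b[i]


def launch(kernel, n, block_dim, *args):
    """Simulated launch: enough blocks of block_dim threads to cover n."""
    num_blocks = (n + block_dim - 1) // block_dim  # ceiling division
    for block_idx in range(num_blocks):
        for thread_idx in range(block_dim):
            kernel(block_idx, thread_idx, block_dim, *args)


def vector_add_gpu_style(a, b, block_dim=4):
    """Same result as vector_add_loop, computed through the kernel form."""
    c = [0.0] * len(a)
    launch(vector_add_kernel, len(a), block_dim, a, b, c)
    return c
```

On a real GPU the two loops in launch disappear, because the hardware runs the kernel instances concurrently; the index computation and the bounds guard are the parts a code generator would have to emit.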

Keep me up to date on what is happening here.

Cheers
Tobi

P.S.: Google Summer of Code is starting soon. If you know
any student interested in this, I believe it would be a great project.

[1] http://www.cdl.uni-saarland.de/publications/theses/moll_bsc.pdf
[2] http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-December/046136.html
[3] Automatic C-to-CUDA Code Generation for Affine Programs
Muthu Manikandan Baskaran, J. Ramanujam and P. Sadayappan
CC 2010
[4] http://repo.or.cz/w/ppcg.git

Xin Tong

Mar 5, 2012, 4:42:14 PM

to Tobias Grosser, poll...@googlegroups.com

comments below.

Thanks

I have not looked into the backends yet, but as far as I can see,
there is only one PTX backend that generates code for GPUs. Isn't the
whole idea of the IR to abstract away the heterogeneity of the backends?

Tobias Grosser

Mar 6, 2012, 2:48:36 AM

to Xin Tong, poll...@googlegroups.com

Sure. It mostly works. However, the AMD-IL backend, for example, stores
additional information in meta-data, which is not standardized. I
don't think it's a big deal, but you should be aware that you might need
to generate slightly different annotations/meta-data to mark certain
constructs. I think the differences are rather small, though.
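
One way to keep such differences contained is to put the annotation step behind a small per-backend table, so the rest of the code generator stays backend-independent. The sketch below is only an illustration under assumptions: the "nvptx" entry follows the kernel marker documented for LLVM's later NVPTX backend (not necessarily what the 2012 PTX backend expected), the "example" entry is an invented placeholder rather than the AMD-IL format, and all function names are hypothetical.

```python
def nvptx_kernel_annotation(kernel_name):
    """Kernel marker in the style documented for LLVM's later NVPTX backend."""
    return (
        "!nvvm.annotations = !{!0}\n"
        f'!0 = !{{ptr @{kernel_name}, !"kernel", i32 1}}'
    )


def placeholder_kernel_annotation(kernel_name):
    """Hypothetical marker for some other backend; the names are invented."""
    return (
        "!example.kernels = !{!0}\n"
        f"!0 = !{{ptr @{kernel_name}}}"
    )


# Backend name -> function producing the meta-data that marks a kernel.
ANNOTATORS = {
    "nvptx": nvptx_kernel_annotation,
    "example": placeholder_kernel_annotation,
}


def annotate_kernel(backend, kernel_name):
    """Return the backend-specific meta-data lines that mark a kernel."""
    if backend not in ANNOTATORS:
        raise ValueError(f"no annotation scheme known for backend {backend!r}")
    return ANNOTATORS[backend](kernel_name)
```

Supporting another backend then means adding one entry to the table, which matches the expectation that the differences are small.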

Cheers
Tobi

Tobias Grosser

Mar 6, 2012, 12:19:12 PM

to poll...@googlegroups.com, xerox.t...@gmail.com

On 03/05/2012 08:02 PM, Tobias Grosser wrote:
> On 03/05/2012 06:08 PM, Xin Tong wrote:
>> Hello
>>
>> I am investigating the possibilities of a project which makes uses of
>> Polly to parallelize loops and code-gen into code that makes use of
>> GPU. One of the options is to generate CUDA/OpenCL code - LLVM
>> currently does not have a OpenCL/CUDA code generator. Another
>> option is to generate PTX code - LLVM currently does have PTX
>> backend.
>>
>> Tobias, have you thought/evaluated both approaches ? Any suggestions
>> ?
>>
>> Thanks
>
> Hi Xin,
>
> I have thought about it. ;-)
>
> If you want to translate from LLVM-IR to GPU code there are, as you
> pointed out, two possible approaches:
>
> 1. Generating C like CUDA/OpenCL code
>
> To generate CUDA/OpenCL a backend like the LLVM C backend
> needs to be written or the LLVM C backend needs to be extended. For
> OpenCL this was already done in a bachelors thesis [1]

The relevant tool was just posted on the LLVM mailing list:

https://bitbucket.org/gnarf/axtor/

You may want to give it a look.

Tobi
