[llvm-dev] AMDGPU mimics JIT?

6 views

Skip to first unread message

Frank Winter via llvm-dev

unread,

Feb 25, 2020, 12:17:48 PM2/25/20

to LLVM Dev

Hi!

I'm looking into ways how to port our application to machines using AMD
GPUs.

The way the application is currently set up for accelerated computing is
the following:

1) LLVM IR modules are being built in memory during runtime

2) In case of NVIDIA GPUs each module is compiled with the NVPTX backend
and the assembly file (PTX) extracted

3) The PTX is loaded with the NVIDIA kernel driver which JIT-compiles it
to the actual GPU installed, and the kernel is launched on the GPU for
execution.

Now, AMD doesn't seem to use an intermediate IR level comparable to PTX.
As far as I understand the AMDGPU backend generates binary code (AMGCN)
for the GPU kernels directly. This makes me wonder if there is any way
to execute (launch) such a kernel after it has been compiled by the
AMDGPU backend.

It will certainly not work in the typical HIP way using the ROCM
utilities where a kernel is specified with the __global__ attribute like

__global__ void kernel(const T* in, T* out) {}

From the AMDGPU kernel's compilation at most a raw pointer is returned
(and that is only in the case if the backend supports JIT, which it
probably doesn't!?) otherwise it produces a library. Is it possible to
'dlload' such a (kernel) library into the address space and launch it?

Anyone has any idea how to launch such a kernel within the same program
context/execution?

Thanks,

Frank

_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Dmitry Mikushin via llvm-dev

unread,

Feb 25, 2020, 12:31:14 PM2/25/20

to Frank Winter, LLVM Dev

The closest more or less existing design to your definition is PGI OpenACC. Radeon target no longer exists in it for obvious reasons. But as long as it deploys an LLVM-based GPU pipeline, you might have non-zero luck to intercept LLVM IR before lowering to PTX and redirect it to AMDGPU. This of course shall involve writing quite a bit of glueing code to translate nvvm intrinsics to red equivalents. As a result, you will get OpenACC support, which might not that generic thing as you want. But if I were you, I'd take this approach as the one potentially most successful in reasonable time.

Kind regards,

- Dmitry.

вт, 25 февр. 2020 г. в 18:17, Frank Winter via llvm-dev <llvm...@lists.llvm.org>:

Reply all

Reply to author

Forward

0 new messages