Initialising NVPTX backend within Polly

SANJAY SRIVALLABH SINGAPURAM

Apr 6, 2017, 8:58:42 AM

to Polly Development

Hello,

Polly-ACC uses the NVPTX back-end to enable GPU offload, and the back-end has to be initialised before it is used.

Currently, the onus is on the front-end to initialise the NVPTX passes for Polly. This creates challenges when Polly is extended to front-ends that don't initialise GPU backends, e.g. Julia.

I'm looking at moving this initialisation into Polly so that it no longer depends on the front-end, and I have the following questions:

1. How to tell if the backend is already initialised?
   - Unconditional LLVMInitializeNVPTX*() calls are redundant if the back-end is being reinitialised, and detrimental if the front-end is compile-time sensitive, e.g. JIT-capable languages like Julia.
   - TargetRegistry::lookupTarget seems to be a possible way.

2. Which LLVMInitializeNVPTXTarget*() functions are necessary for Polly's usage of the backend?

3. How can Polly access these functions?
   - These functions are defined and declared in <llvm_src>/lib/Target/NVPTX and hence cannot be accessed by #including header files.
   - A dirty workaround would be to #include "llvm/../../lib/Target/NVPTX/file.h"

Please share your thoughts.

Thank You,

Sanjay

Tobias Grosser

Apr 6, 2017, 9:41:46 AM

to poll...@googlegroups.com

On Thu, Apr 6, 2017, at 02:58 PM, 'SANJAY SRIVALLABH SINGAPURAM' via
Polly Development wrote:
> Hello,
>
> Polly-ACC uses the NVPTX back-end to enable GPU offload, which has to be
> initialised before being used.
>
> Currently, the onus is on the front-end to initialise the NVPTX passes
> for Polly. This would create challenges when Polly's being extended to
> front-ends that don't initialise GPU backends, e.g. Julia
> <https://github.com/JuliaLang/julia/pull/21142>.
>
> I'm looking at bringing this action into Polly and not depend on the
> front-end and have the following questions,
>
> 1. How to tell if backend's already initialised ?
>    - Unconditional LLVMInitializeNVPTX*() calls are redundant if the
>      back-end is being reinitialised and detrimental if the front-end is
>      compile-time sensitive e.g. JIT capable languages like Julia.
>    - TargetRegistry::lookupTarget seems to be a possible way.

Did you measure the overhead? I would assume it is close to zero.

> 2. Which LLVMInitializeNVPTXTarget*() functions are necessary for
>    Polly's usage of the backend ?

Not sure.

> 3. How can Polly access these functions ?
>    - These functions are defined and declared at
>      <llvm_src>/lib/Target/NVPTX and hence cannot be accessed by
>      #including header files.
>    - A *dirty* workaround would be to #include
>      "llvm/../../lib/Target/NVPTX/file.h"

What I am mostly concerned about is what happens if the NVPTX backend is
not linked into LLVM. Does LLVM have a function 'initialize all
targets'?

Best,
Tobias

>
> Please share your thoughts.
>
> Thank You,
> Sanjay
>

> --
> You received this message because you are subscribed to the Google Groups
> "Polly Development" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to polly-dev+...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

llvmres...@iith.ac.in

Apr 7, 2017, 2:51:19 AM

to Polly Development, Tobias Grosser

Hello Tobias,

On Thursday, April 6, 2017 at 7:11:46 PM UTC+5:30, Tobias wrote:

> On Thu, Apr 6, 2017, at 02:58 PM, 'SANJAY SRIVALLABH SINGAPURAM' via
> Polly Development wrote:
> > Hello,
> >
> > Polly-ACC uses the NVPTX back-end to enable GPU offload, which has to be
> > initialised before being used.
> >
> > Currently, the onus is on the front-end to initialise the NVPTX passes
> > for Polly. This would create challenges when Polly's being extended to
> > front-ends that don't initialise GPU backends, e.g. Julia
> > <https://github.com/JuliaLang/julia/pull/21142>.
> >
> > I'm looking at bringing this action into Polly and not depend on the
> > front-end and have the following questions,
> >
> > 1. How to tell if backend's already initialised ?
> >    - Unconditional LLVMInitializeNVPTX*() calls are redundant if the
> >      back-end is being reinitialised and detrimental if the front-end is
> >      compile-time sensitive e.g. JIT capable languages like Julia.
> >    - TargetRegistry::lookupTarget seems to be a possible way.
>
> Did you measure the overhead?

I have not.

> I would assume it is close to zero.

I too thought so. I had proposed changing Julia's calls to initialise passes here <https://github.com/JuliaLang/julia/pull/21142>, to InitializeAllTargets() instead of just InitializeNativeTarget(). According to a committer <https://github.com/JuliaLang/julia/pull/21142/commits/3371d635da12444f81034ca02a668fe9cefd7388#r107757559>, unconditionally initialising backends that aren't needed is detrimental since LLVM isn't the state of the art in fast JIT compilers.

> > 2. Which LLVMInitializeNVPTXTarget*() functions are necessary for
> >    Polly's usage of the backend ?
>
> Not sure.
>
> > 3. How can Polly access these functions ?
> >    - These functions are defined and declared at
> >      <llvm_src>/lib/Target/NVPTX and hence cannot be accessed by
> >      #including header files.
> >    - A *dirty* workaround would be to #include
> >      "llvm/../../lib/Target/NVPTX/file.h"
>
> What I am mostly concerned about is what happens if the NVPTX backend is
> not linked into LLVM.

libLLVM.so always had definitions of the NVPTX backend routines, even when they weren't used by Julia. Are there scenarios in which the backend won't be included within LLVM itself?

> Does LLVM have a function 'initialize all targets'?

Yes, InitializeAllTargets().

Tobias Grosser

Apr 7, 2017, 3:01:30 AM

to llvmres...@iith.ac.in, Polly Development

On Fri, Apr 7, 2017, at 08:51 AM, llvmresch_int01 via Polly Development
wrote:

> passes here <https://github.com/JuliaLang/julia/pull/21142>, to
> InitializeAllTargets() instead of just InitializeNativeTarget().
> According to a committer
> <https://github.com/JuliaLang/julia/pull/21142/commits/3371d635da12444f81034ca02a668fe9cefd7388#r107757559>,
> unconditionally initialising backends that aren't needed is detrimental
> since LLVM isn't the state of the art in fast JIT compilers.

If that works, it makes sense to initialize the LLVM backend from Polly.
Can you add the necessary initialization calls to:

#ifdef GPU_CODEGEN
initializePPCGCodeGenerationPass(Registry);
#endif

in lib/Support/RegisterPasses.cpp, test that it works without the NVPTX
backend, and then propose a merge request for Polly.

Best,
Tobias

SANJAY SRIVALLABH SINGAPURAM

Apr 7, 2017, 3:21:11 AM

to Tobias Grosser, Polly Development

Sure.

I just need to know how I could be sure that I'm not reinitialising the backend. Would TargetRegistry::lookupTarget be enough ?
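The lookup-before-initialise idea asked about here can be sketched with a toy registry. This is a minimal, self-contained model under stated assumptions, not LLVM's TargetRegistry: `ToyRegistry`, `lookup`, `registerTarget` and `ensureRegistered` are hypothetical names introduced only for illustration. In Polly the lookup would presumably be TargetRegistry::lookupTarget and the initialiser would be the LLVMInitializeNVPTX*() calls.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>

// Toy stand-in for a target registry (hypothetical; not LLVM's class).
class ToyRegistry {
public:
  // Returns true if a target with this name has been registered.
  bool lookup(const std::string &Name) const {
    return Targets.count(Name) != 0;
  }

  // Registers a target; registering the same name again is a no-op.
  void registerTarget(const std::string &Name) { Targets.insert(Name); }

private:
  std::set<std::string> Targets;
};

// Runs Init only when Name is not registered yet.
// Returns true if Init was run, false if the target was already present.
inline bool ensureRegistered(ToyRegistry &Registry, const std::string &Name,
                             const std::function<void(ToyRegistry &)> &Init) {
  if (Registry.lookup(Name))
    return false; // already initialised, skip the work
  Init(Registry);
  assert(Registry.lookup(Name) && "Init must register the target");
  return true;
}
```

Calling `ensureRegistered` twice with the same name runs the initialiser only the first time, which is the behaviour a lookup guard is meant to provide.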


Tobias Grosser

Apr 7, 2017, 3:21:50 AM

to SANJAY SRIVALLABH SINGAPURAM, Polly Development

Is there a problem with reinitializing?

Best,
Tobias

SANJAY SRIVALLABH SINGAPURAM

Apr 8, 2017, 2:53:38 AM

to Tobias Grosser, Polly Development

1. It is detrimental for front-ends which are time-sensitive and have already initialised the backend.
2. I'm not sure if we're allowed to call the LLVMInitialize* functions more than once.
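If repeated initialisation did turn out to be a concern, one option would be to make the call site run at most once per process. A minimal sketch, assuming a hypothetical `initializeBackend()` that stands in for the LLVMInitializeNVPTX*() calls; `InitCount` exists only so the behaviour can be observed and is not part of any real API.

```cpp
#include <cassert>

// Counts how often the stand-in initialiser body actually ran.
static int InitCount = 0;

// Hypothetical stand-in for the LLVMInitializeNVPTX*() calls.
static void initializeBackend() { ++InitCount; }

// Safe to call any number of times: a function-local static is
// initialised exactly once (and thread-safely since C++11), so
// initializeBackend() runs only on the first call.
static void initializeBackendOnce() {
  static const bool Done = [] {
    initializeBackend();
    return true;
  }();
  assert(Done && "backend initialiser did not complete");
  (void)Done; // keeps release builds free of unused-variable warnings
}
```

This only guards Polly's own call site; it cannot tell whether the front-end has already initialised the backend by other means.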

SANJAY SRIVALLABH SINGAPURAM

Apr 8, 2017, 3:16:35 AM

to Tobias Grosser, Polly Development

I made the following changes to RegisterPasses.cpp

[...]
#include "llvm/Transforms/Vectorize.h"
+#include "llvm/../../lib/Target/NVPTX/NVPTXTargetMachine.cpp"
[...]
#ifdef GPU_CODEGEN
  initializePPCGCodeGenerationPass(Registry);
+ LLVMInitializeNVPTXTarget();
#endif

The build stopped because the file "NVPTXGenRegisterInfo.inc" could not be found:

In file included from /home/sanjay/Software/polly_julia/llvm_src/include/llvm/../../lib/Target/NVPTX/NVPTXTargetMachine.cpp:14:0,
                 from /home/sanjay/Software/polly_julia/llvm_src/tools/polly/lib/Support/RegisterPasses.cpp:42:
/home/sanjay/Software/polly_julia/llvm_src/include/llvm/../../lib/Target/NVPTX/NVPTX.h:171:36: fatal error: NVPTXGenRegisterInfo.inc: No such file or directory
 #include "NVPTXGenRegisterInfo.inc"

Tobias Grosser

Apr 8, 2017, 3:21:54 AM

to SANJAY SRIVALLABH SINGAPURAM, Polly Development

Perfect. Sounds like a great problem to fix and debug. Can you give
this a shot yourself? You know most of the infrastructure, so it is time
to go through some of these problems yourself. What would be a patch
that you could propose to Polly, which would resolve your problem?

> > 1. It is detrimental for front-ends which are time sensitive and have
> > already initialised the backend.
> > 2. I'm not sure if we're allowed to call the LLVMInitialize* functions

Tobias Grosser

Apr 8, 2017, 3:27:02 AM

to SANJAY SRIVALLABH SINGAPURAM, Polly Development

> 1. It is detrimental for front-ends which are time sensitive and have
> already initialised the backend.

How do you know this if you did not measure the overhead? My guess is
that you would not even be able to measure it.

> 2. I'm not sure if we're allowed to call the LLVMInitialize* functions
> more than once.

I believe we are. Would be good to verify this, though.

SANJAY SRIVALLABH SINGAPURAM

Apr 8, 2017, 7:32:38 AM

to Tobias Grosser, Polly Development

I made the following changes to check for both the extra time taken and possible issues with reinitialisation.

+ clock_t start, stop;
+ start = clock();
+ for( int i=10 ; i ; i-- ) {
  /*
  InitializeNativeTarget();
  InitializeNativeTargetAsmPrinter();
  InitializeNativeTargetAsmParser();
  InitializeNativeTargetDisassembler();
  */
  InitializeAllTargets();
  InitializeAllTargetMCs();
  InitializeAllAsmPrinters();
  InitializeAllAsmParsers();
  InitializeAllDisassemblers();
+ }
+ stop = clock();
+ printf("%Lf\n",((long double)((long long)stop - start))/CLOCKS_PER_SEC);

Initialising all targets took 0.008 to 0.01 ms longer than initialising only the native target (for a single iteration of the loop). There were also no problems due to reinitialisation: Julia was still able to run PTX kernels.
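A measurement like this can be wrapped in a small reusable helper. The sketch below is a variant under stated assumptions, not the code used above: it swaps `clock()` for `std::chrono::steady_clock` (wall-clock rather than CPU time), and `averageSeconds` is a hypothetical helper name. The callable passed in would wrap whichever InitializeAll*() or InitializeNative*() calls are being compared.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Runs Work Iterations times and returns the mean wall-clock time of one
// run, in seconds. Iterations must be positive.
inline double averageSeconds(const std::function<void()> &Work,
                             int Iterations) {
  assert(Iterations > 0 && "need at least one iteration");
  using Clock = std::chrono::steady_clock;
  const Clock::time_point Start = Clock::now();
  for (int I = 0; I < Iterations; ++I)
    Work();
  const std::chrono::duration<double> Elapsed = Clock::now() - Start;
  return Elapsed.count() / Iterations;
}
```

Timing the first call separately from later calls may also be worthwhile, since repeated calls to the same initialisers likely exercise a cheaper, already-initialised path.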

Tobias Grosser

Apr 8, 2017, 10:52:52 AM

to SANJAY SRIVALLABH SINGAPURAM, Polly Development

Sounds perfect. Can you send a patch that enables these functions under:

if (Target == TARGET_GPU)

Even though I don't expect a lot of performance overhead, I prefer to
get some more testing before we do this in the standard compilation
flow.

Best,
Tobias

SANJAY SRIVALLABH SINGAPURAM

Apr 9, 2017, 1:49:54 AM

to Tobias Grosser, Polly Development

Hello Tobias,

I've made changes to CMakeLists.txt and RegisterPasses.cpp:

@@ -90,6 +90,14 @@ if (BUILD_SHARED_LIBS)
     LLVMTarget
     LLVMVectorize
   )
+  if (GPU_CODEGEN)
+    target_link_libraries(Polly
+      LLVMNVPTXCodeGen
+      LLVMNVPTXInfo
+      LLVMNVPTXDesc
+      LLVMNVPTXAsmPrinter
+    )
+  endif (GPU_CODEGEN)
   link_directories(
     ${LLVM_LIBRARY_DIR}
   )

-----------------------------------------

@@ -39,6 +39,7 @@
 #include "llvm/Transforms/Vectorize.h"
+#include "llvm/Support/TargetSelect.h"

@@ -204,7 +205,13 @@ void initializePollyPasses(PassRegistry &Registry) {
   initializeCodeGenerationPass(Registry);
 #ifdef GPU_CODEGEN
-  initializePPCGCodeGenerationPass(Registry);
+  if( Target == TARGET_GPU ) {
+    initializePPCGCodeGenerationPass(Registry);
+    LLVMInitializeNVPTXTarget();
+    LLVMInitializeNVPTXTargetInfo();
+    LLVMInitializeNVPTXTargetMC();
+    LLVMInitializeNVPTXAsmPrinter();
+  }
 #endif

Using InitializeAllTarget*() would have required linking every target library that was built, which can't be specified statically in CMakeLists.txt. The build succeeds when Polly is built as libPolly.so, since the NVPTX backend is linked into libPolly.so. It fails when Polly is built as libPolly.a, since no linking takes place when the archive is created.

[Creating libPolly.a]

/home/sanjay/Software/cmake-3.6.2/bin/cmake -E remove lib/libPolly.a && /usr/bin/ar qc lib/libPolly.a tools/polly/lib/CMakeFiles/Polly.dir/Analysis/DependenceInfo.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Analysis/PolyhedralInfo.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Analysis/ScopDetection.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Analysis/ScopDetectionDiagnostic.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Analysis/ScopInfo.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Analysis/ScopBuilder.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Analysis/ScopGraphPrinter.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Analysis/ScopPass.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Analysis/PruneUnprofitable.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/CodeGen/BlockGenerators.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/CodeGen/IslAst.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/CodeGen/IslExprBuilder.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/CodeGen/IslNodeBuilder.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/CodeGen/CodeGeneration.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/CodeGen/LoopGenerators.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/CodeGen/IRBuilder.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/CodeGen/Utils.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/CodeGen/RuntimeDebugBuilder.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/CodeGen/CodegenCleanup.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/CodeGen/PerfMonitor.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/CodeGen/PPCGCodeGeneration.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Exchange/JSONExporter.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Support/GICHelper.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Support/SCEVAffinator.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Support/SCEVValidator.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Support/RegisterPasses.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Support/ScopHelper.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Support/ScopLocation.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Support/ISLTools.cpp.o 
tools/polly/lib/CMakeFiles/Polly.dir/Support/DumpModulePass.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/External/JSON/json_reader.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/External/JSON/json_value.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/External/JSON/json_writer.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Transform/Canonicalization.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Transform/CodePreparation.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Transform/DeadCodeElimination.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Transform/ScheduleOptimizer.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Transform/FlattenSchedule.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Transform/FlattenAlgo.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Transform/DeLICM.cpp.o tools/polly/lib/CMakeFiles/Polly.dir/Transform/Simplify.cpp.o && /usr/bin/ranlib lib/libPolly.a

[Linking CXX executable bin/bugpoint]

FAILED: : && /usr/bin/c++ -fPIC -fvisibility-inlines-hidden -std=c++11 -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment -ffunction-sections -fdata-sections -O3 -DNDEBUG -Wl,-allow-shlib-undefined -Wl,--export-dynamic -Wl,-O3 -Wl,--gc-sections tools/bugpoint/CMakeFiles/bugpoint.dir/BugDriver.cpp.o tools/bugpoint/CMakeFiles/bugpoint.dir/CrashDebugger.cpp.o tools/bugpoint/CMakeFiles/bugpoint.dir/ExecutionDriver.cpp.o tools/bugpoint/CMakeFiles/bugpoint.dir/ExtractFunction.cpp.o tools/bugpoint/CMakeFiles/bugpoint.dir/FindBugs.cpp.o tools/bugpoint/CMakeFiles/bugpoint.dir/Miscompilation.cpp.o tools/bugpoint/CMakeFiles/bugpoint.dir/OptimizerDriver.cpp.o tools/bugpoint/CMakeFiles/bugpoint.dir/ToolRunner.cpp.o tools/bugpoint/CMakeFiles/bugpoint.dir/bugpoint.cpp.o -o bin/bugpoint lib/libLLVMAnalysis.a lib/libLLVMBitWriter.a lib/libLLVMCodeGen.a lib/libLLVMCore.a lib/libLLVMipo.a lib/libLLVMIRReader.a lib/libLLVMInstCombine.a lib/libLLVMInstrumentation.a lib/libLLVMLinker.a lib/libLLVMObjCARCOpts.a lib/libLLVMScalarOpts.a lib/libLLVMSupport.a lib/libLLVMTarget.a lib/libLLVMTransformUtils.a lib/libLLVMVectorize.a -lpthread lib/libPolly.a lib/libLLVMTarget.a lib/libLLVMBitWriter.a lib/libLLVMAsmParser.a lib/libLLVMInstCombine.a lib/libLLVMTransformUtils.a lib/libLLVMAnalysis.a lib/libLLVMObject.a lib/libLLVMBitReader.a lib/libLLVMMCParser.a lib/libLLVMMC.a lib/libLLVMProfileData.a lib/libLLVMCore.a lib/libLLVMSupport.a -lrt -ldl -ltinfo -lpthread -lz -lm lib/libLLVMDemangle.a lib/libPollyPPCG.a lib/libPollyISL.a -Wl,-rpath,"\$ORIGIN/../lib" && :

lib/libPolly.a(RegisterPasses.cpp.o):RegisterPasses.cpp:function polly::initializePollyPasses(llvm::PassRegistry&): error: undefined reference to 'LLVMInitializeNVPTXTarget'

lib/libPolly.a(RegisterPasses.cpp.o):RegisterPasses.cpp:function polly::initializePollyPasses(llvm::PassRegistry&): error: undefined reference to 'LLVMInitializeNVPTXTargetInfo'

lib/libPolly.a(RegisterPasses.cpp.o):RegisterPasses.cpp:function polly::initializePollyPasses(llvm::PassRegistry&): error: undefined reference to 'LLVMInitializeNVPTXTargetMC'

lib/libPolly.a(RegisterPasses.cpp.o):RegisterPasses.cpp:function polly::initializePollyPasses(llvm::PassRegistry&): error: undefined reference to 'LLVMInitializeNVPTXAsmPrinter'

collect2: error: ld returned 1 exit status

We need a way to directly link the NVPTX backend to RegisterPasses.cpp.o, since this would ensure that the backend is available to initializePollyPasses in any build scenario. I thought of using add_executable(RegisterPasses.cpp.o ...) and then target_link_libraries(RegisterPasses.cpp.o LLVMNVPTX... ) but that would,