You need to load the Polly.so object file. After the file is loaded, all
Polly passes are automatically available. To schedule them you have two options:
1) Add them to the pass list
This is a rather long list of additional passes. The passes we add can
be seen in lib/RegisterPasses.cpp (you also need the preparatory
transformations)
2) Use the pass manager builder
Look at llvm/Transforms/IPO/PassManagerBuilder.h with
PassManagerBuilder::populateFunctionPassManager(). At -O3 and with the
-polly command line option enabled (no idea how that would work),
the Polly passes are part of the normal -O3 passes.
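
For comparison, here is a rough sketch of the command-line route. It assumes
that opt's -load option is used to load the object file and that the textual
IR has been written to a file; the helper names, file names and the path to
Polly.so are placeholders, not part of your project or of Polly.

```python
import subprocess


def polly_opt_command(polly_lib, input_file, output_file):
    """Build an 'opt' command line that loads Polly and runs -O3 with -polly."""
    return [
        "opt",
        "-load", polly_lib,   # assumption: load Polly.so as an opt plugin
        "-O3", "-polly",      # -O3 pipeline with the Polly passes enabled
        "-S",                 # emit textual LLVM-IR instead of bitcode
        input_file,
        "-o", output_file,
    ]


def run_polly(polly_lib, input_file, output_file):
    """Run the command; raises CalledProcessError if opt fails."""
    subprocess.run(polly_opt_command(polly_lib, input_file, output_file),
                   check=True)
```

Calling run_polly("/path/to/Polly.so", "matrixmul.ll", "matrixmul.polly.ll")
would then run opt once with Polly enabled.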
>> Is there a way I can dump the content of the entire LLVM-IR module
>> generated in the demo/matrixmul/matrixmul.py example?
> You can do so by printing the default_module:
>
> print default_module
Perfect, that's what I was looking for.
> You may want to do so before optimizing with "default_module.optimize()"
> to see what my codegen is doing.
>
> I will be adding more documentation to the project wiki.
Great.
I just looked at the generated code. Polly cannot directly optimize it,
but I don't see any fundamental problems. In fact the code looks really
nice. The main issues I have seen here are:
1. The array references could alias
The arguments of the function matrixmul_naive can alias, which not only
blocks Polly from working right now, but will also make other LLVM
transformations less effective. If you can guarantee that the arguments
do not alias, the best would be to add the parameter attribute [1]
'noalias' to those parameters.
2. No target data set
The LLVM-IR module you are generating does not have any target data
string set. When trying my optimizations I manually set something like:
target datalayout =
"e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
target triple = "x86_64-unknown-linux-gnu"
This again is something that will help both generic optimizations and
Polly.
3. Variable size arrays
You are using variable size arrays in the code you generate. This is
perfectly fine, but currently not supported by Polly. As a workaround,
setting n = 1024 at the beginning of the code is enough to make Polly
work. The right solution is obviously to add variable length array
support to Polly.
4. Pass ordering issue
The pass order 'opt -O3 -polly' uses is not good enough to detect your
code. Using 'opt -O3 | opt -O3 -polly' works. This means we probably
need to schedule one or two additional canonicalization passes. One
reason for this may be that you 'alloca' data elements in the body of a
function. Many LLVM passes always put the alloca instructions in the very
first basic block. You may consider doing the same when doing code
generation.
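
As a stopgap, points 1, 2 and 4 can be tried on the printed module text
without touching the code generator. The following is only a sketch under
several assumptions: the IR is available as a string (for example from
printing default_module), pointer arguments are written as "type* %name",
and the argument names in the example are made up. All helper names are
hypothetical, and the -load and -S flags are additions to the pipeline
quoted above.

```python
import re
import subprocess

# Values copied from the message above; adjust them for your own target.
DATALAYOUT = ("e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-"
              "f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-"
              "s0:64:64-f80:128:128-n8:16:32:64")
TRIPLE = "x86_64-unknown-linux-gnu"


def add_target_info(ir_text, datalayout=DATALAYOUT, triple=TRIPLE):
    """Prepend datalayout/triple lines unless the module already has them."""
    header = []
    if not re.search(r"^target datalayout\s*=", ir_text, re.MULTILINE):
        header.append('target datalayout = "%s"' % datalayout)
    if not re.search(r"^target triple\s*=", ir_text, re.MULTILINE):
        header.append('target triple = "%s"' % triple)
    if not header:
        return ir_text
    return "\n".join(header) + "\n" + ir_text


def mark_pointer_args_noalias(ir_text, function_name):
    """Add 'noalias' to the pointer parameters of one function definition.

    Only safe if the caller can guarantee the arguments never alias.
    """
    out = []
    for line in ir_text.splitlines():
        if line.startswith("define") and ("@%s(" % function_name) in line:
            # "double* %A" becomes "double* noalias %A". Arguments that are
            # already marked do not match, so the rewrite is idempotent.
            line = re.sub(r"(\*+) (%[\w.]+)", r"\1 noalias \2", line)
        out.append(line)
    return "\n".join(out)


def two_stage_commands(polly_lib):
    """Command lines for 'opt -O3 | opt -O3 -polly' (plus -load and -S)."""
    return (["opt", "-O3"],
            ["opt", "-load", polly_lib, "-O3", "-polly", "-S"])


def run_two_stage(ir_text, polly_lib):
    """Pipe the IR through both opt invocations and return the result."""
    first, second = two_stage_commands(polly_lib)
    stage1 = subprocess.run(first, input=ir_text.encode(),
                            stdout=subprocess.PIPE, check=True)
    stage2 = subprocess.run(second, input=stage1.stdout,
                            stdout=subprocess.PIPE, check=True)
    return stage2.stdout.decode()
```

Setting the attribute, data layout and triple directly in the code
generator is still the cleaner fix; the text rewriting above is just a
quick way to experiment.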
Again, thanks for this very nice tool. I am looking forward to playing
more with it.
Cheers
Tobi
[1] http://llvm.org/docs/LangRef.html#paramattrs