Polly-Parallel with OpenMP for Rust

Josh Ward

Mar 8, 2021, 4:48:01 PM

to Polly Development

I am attempting to compile Rust IR with Polly's OpenMP support but am having trouble getting any parallelization. Passing the appropriate flags to LLVM doesn't produce multithreaded code, so I have resorted to running LLVM on the bitcode as per the examples, which throws malformed JSCOP errors after the analyses.

Is there any information on Polly interoperability with Rust? Rust has added a Polly feature, but judging from my investigation it must only be for the vectorizer.

Thanks.

Michael Kruse

Mar 9, 2021, 7:55:19 AM

to Josh Ward, Polly Development

I am not involved with Polly support for Rust; I suggest asking the
Rust community.

In principle, optimizing Rust code with Polly requires nothing more
than adding Polly to the pass manager pipeline. JSCOP import/export is
intended for developers and should not be encountered by end users.
Usually it can only be triggered (at least in opt/clang) by using
internal, hidden command line options.

Michael

> --
> You received this message because you are subscribed to the Google Groups "Polly Development" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to polly-dev+...@googlegroups.com.
> To view this discussion on the web, visit https://groups.google.com/d/msgid/polly-dev/f3aa3034-7a5e-424a-be0e-7ef0a9499a39n%40googlegroups.com.

--
Tardyzentrismus verboten!

Josh Ward

Mar 9, 2021, 3:40:44 PM

to Polly Development

I have only ended up in JSCOP calls because (1) I am seriously interested in developing LLVM and Polly and (2) nothing works.

I can call all passes with opt and clang except for the passes that import JSCOP (in opt) to perform interchange. In clang I can call all the passes except for a few, such as set-position early (admittedly deprecated), for which I get an arcane error: "error in backend: Option {polly-show, position-early etc.} not supported with NPM". When I look that up in my LLVM build, I get hit with layers of abstraction and diagnostic (DIAG) entries, which are a dead end. Am I on the wrong build? I built LLVM with LLVM_ENABLE_PROJECTS="clang;clang-tools-extra;libunwind;lldb;compiler-rt;lld;openmp;polly;debuginfo-tests;parallel-libs"

Rust with Polly flags, as well as clang with polly-parallel-force, seems to fail quietly: the code runs but is not optimized. Yet when I call the opt passes manually (again as per the example documentation on the website), I can display a Graphviz graph confirming that there are green SCoPs ready for optimization. I am using LLVM 13 and the associated Clang.

Also, the example code on the website is nonexistent; it just shows the resulting SCoPs of a mystery .c file, which is initially confusing.

I'd like to develop code at some point, and I can at least contribute documentation from this experience. I've also requested membership to submit bugs.

Josh Ward

Michael Kruse

Mar 10, 2021, 1:51:30 AM

to Josh Ward, Polly Development

On Tue, Mar 9, 2021 at 2:40 PM Josh Ward <joshawa...@gmail.com> wrote:
> I have only ended up in JSCOP calls because (1) I am seriously interested in developing LLVM and Polly and (2) nothing works.
>
> I can call all passes with opt and clang except for the passes that import JSCOP (in opt) to perform interchange. In clang I can call all the passes except for a few, such as set-position early (admittedly deprecated), for which I get an arcane error: "error in backend: Option {polly-show, position-early etc.} not supported with NPM". When I look that up in my LLVM build, I get hit with layers of abstraction and diagnostic (DIAG) entries, which are a dead end. Am I on the wrong build? I built LLVM with LLVM_ENABLE_PROJECTS="clang;clang-tools-extra;libunwind;lldb;compiler-rt;lld;openmp;polly;debuginfo-tests;parallel-libs"

LLVM recently switched to the new pass manager (NPM) and some
developer options have not been ported yet. You can build LLVM with
`cmake -DENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=OFF`. Even without that
flag, you can use `opt -enable-new-pm=0` or
`clang -flegacy-pass-manager`. I don't know of a corresponding option
for rustc. I have a not-yet-committed patch to support
-polly-position=early (and -polly-dump-before) using the NPM. I can
port more options to the NPM, but I work primarily with the legacy
pass manager, and for some things the NPM does not provide support for
passes outside its core libraries (specifically
`clang -fdebug-pass-manager`).

If you are interested in contributing improved support of Polly and
the NPM, I can point to the relevant directions.

> Rust with Polly flags, as well as clang with polly-parallel-force, seems to fail quietly: the code runs but is not optimized. Yet when I call the opt passes manually (again as per the example documentation on the website), I can display a Graphviz graph confirming that there are green SCoPs ready for optimization. I am using LLVM 13 and the associated Clang.

Whether SCoPs are recognized may depend on other passes in the pass
pipeline, such as loop normalization and alias analysis.

> Also, the example code on the website is nonexistent; it just shows the resulting SCoPs of a mystery .c file, which is initially confusing.

The documentation is unfortunately quite out of date. Improving it
would certainly be appreciated. There are more options that give more
interesting output, such as -debug-only=polly-scops,
-debug-only=polly-opt-isl and -debug-only=polly-ast. Interpreting most
of Polly's output usually requires some knowledge of its internal
modeling.

> I'd like to develop code at some point, and I can at least contribute documentation from this experience. I've also requested membership to submit bugs.

Looking forward to your contributions. Please don't hesitate to start
more conversations on this mailing list.

I've created a bugs.llvm.org account for you. I will send the password
in a separate email.

Michael

Josh Ward

Mar 12, 2021, 3:05:01 AM

to Polly Development

Polly is fully functioning in Rust. I don't get the performance I would hope for when using Iter (presumably because of the memory-functional implementation of .next()), even with Vec<Vec<>>, but that is a separate matter. The NPM still doesn't support the aforementioned passes, and Rust passes view-scops through to the LLVM pass configuration but doesn't render .dot SCoPs in /tmp/ the way Clang does.

The example code can be found in llvm-project/polly/docs/ for anyone who lands here from the webpage (furiously googling, as I was).

Thank you. I hope to submit a PR one day.

Michael Kruse

Mar 12, 2021, 11:53:18 AM

to Josh Ward, Polly Development

On Fri, Mar 12, 2021 at 2:05 AM Josh Ward <joshawa...@gmail.com> wrote:
> Polly is fully functioning in Rust. I don't get the performance I would hope for when using Iter (presumably because of the memory-functional implementation of .next()), even with Vec<Vec<>>, but that is a separate matter. The NPM still doesn't support the aforementioned passes, and Rust passes view-scops through to the LLVM pass configuration but doesn't render .dot SCoPs in /tmp/ the way Clang does.

For C++ iterator loops, Polly depends on them being inlined to
understand what they are doing and that any memory access they make
can be hoisted before the loop. It might indeed be the culprit.
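
As a rough illustration of the point about iterators, here is a minimal Rust sketch contrasting an iterator-style kernel over nested vectors with an explicitly indexed loop nest over flat slices. The function names are made up for this example. Whether Polly actually detects either loop nest as a SCoP is an open assumption: it depends on inlining, on what happens to Rust's bounds checks, and on the rest of the pass pipeline, so nothing here is guaranteed to be optimized or parallelized.

```rust
/// Iterator-style product C = A * B over nested vectors.
/// Each row is a separate heap allocation, and every access goes through
/// iterator adaptors that must be inlined before the loop structure is visible.
pub fn matmul_iter(a: &[Vec<f64>], b: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let cols = b.first().map_or(0, |row| row.len());
    a.iter()
        .map(|a_row| {
            (0..cols)
                .map(|j| {
                    a_row
                        .iter()
                        .zip(b.iter())
                        .map(|(x, b_row)| x * b_row[j])
                        .sum::<f64>()
                })
                .collect::<Vec<f64>>()
        })
        .collect()
}

/// Indexed loop nest over flat row-major slices.
/// `a` is n x m, `b` is m x p, and the result is n x p.
pub fn matmul_indexed(a: &[f64], b: &[f64], n: usize, m: usize, p: usize) -> Vec<f64> {
    assert_eq!(a.len(), n * m);
    assert_eq!(b.len(), m * p);
    let mut c = vec![0.0; n * p];
    for i in 0..n {
        for j in 0..p {
            // Accumulate in a local so the innermost loop only reads memory.
            let mut acc = 0.0;
            for k in 0..m {
                acc += a[i * m + k] * b[k * p + j];
            }
            c[i * p + j] = acc;
        }
    }
    c
}
```

Both functions compute the same product; the second keeps all data in one contiguous allocation per matrix and spells out the loop bounds and index arithmetic, which is closer to the C-style loops in Polly's own examples.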

Michael

Josh Ward

Mar 22, 2021, 12:16:05 AM

to Polly Development

Where in the code does Polly build the constraints for the optimization problem, such as registers, L1 cache, memory hierarchy profile and number of cores? I would like to understand how LLVM communicates from the SCoP/AST formats to the backend target architecture. Any file references in the polly/ folder would be greatly appreciated. I notice things like 64-bit integers being used as the addressable size regardless of the target architecture; does LLVM use constant constraints for all target architectures?

I am tracing the OpenMP pragma labeling for C code and trying to find the equivalent in Rust.

Michael Kruse

Mar 24, 2021, 10:49:51 PM

to Josh Ward, Polly Development

On Sun, Mar 21, 2021 at 11:16 PM Josh Ward <joshawa...@gmail.com> wrote:
> Where in the code does Polly build the constraints for the optimization problem, such as registers, L1 cache, memory hierarchy profile and number of cores?

Constraints are set here:
https://github.com/llvm/llvm-project/blob/97d8972c9cd1295fe838b0d0d1be4cefe2dd0b1c/polly/lib/Transform/ScheduleOptimizer.cpp#L1957

Cache hierarchy properties are set by command line parameter:
https://github.com/llvm/llvm-project/blob/97d8972c9cd1295fe838b0d0d1be4cefe2dd0b1c/polly/lib/Transform/ScheduleOptimizer.cpp#L156
or from TargetTransformInfo if available. Note that this is only
really used by the Generalized Matrix-Matrix Multiplication detection.

Number of threads can also be set by command line parameter:
https://github.com/llvm/llvm-project/blob/97d8972c9cd1295fe838b0d0d1be4cefe2dd0b1c/polly/lib/CodeGen/LoopGenerators.cpp#L31
However, this is typically managed by the OpenMP runtime giving it a
schedule to use.

> I would like to understand how LLVM communicates from the SCoP/AST formats to the backend target architecture. Any file references in the polly/ folder would be greatly appreciated.

It's done by the chain of passes.

ScopInfo creates the polly::Scop structure from LLVM-IR.
polly::Scop also contains an isl_schedule_tree derived from the
sequential execution order.

IslScheduleOptimizer transforms the isl_schedule_tree. As mentioned,
it may use hardware details from TargetTransformInfo.

IslAstInfo converts an isl_schedule_tree into an isl_ast.

Finally, the CodeGeneration pass emits the isl_ast back into LLVM-IR.
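
To make the role of the schedule in this chain a little more concrete, here is a toy Rust model, not Polly or isl code. It treats a schedule as a function from statement instances to time stamps, and "code generation" as sorting the instances by that time stamp. All names (`domain`, `execution_order`, `original`, `interchanged`) are invented for the illustration, and the real passes work on symbolic isl objects rather than enumerated points.

```rust
/// A statement instance in a two-deep loop nest, identified by its (i, j) iteration.
type Instance = (i64, i64);

/// Enumerate the iteration domain { (i, j) : 0 <= i < n, 0 <= j < m }.
pub fn domain(n: i64, m: i64) -> Vec<Instance> {
    (0..n).flat_map(|i| (0..m).map(move |j| (i, j))).collect()
}

/// Order the instances by the time stamp the schedule assigns to them.
/// This stands in for AST generation plus code generation: the result is the
/// execution order that a loop nest generated from the schedule would follow.
pub fn execution_order(dom: &[Instance], schedule: fn(Instance) -> (i64, i64)) -> Vec<Instance> {
    let mut order = dom.to_vec();
    order.sort_by_key(|&inst| schedule(inst));
    order
}

/// Schedule derived from the original sequential order: (i, j) -> (i, j).
pub fn original(inst: Instance) -> (i64, i64) {
    inst
}

/// A transformed schedule modeling loop interchange: (i, j) -> (j, i).
pub fn interchanged((i, j): Instance) -> (i64, i64) {
    (j, i)
}
```

With `domain(2, 3)`, the `original` schedule visits the instances row by row, while `interchanged` visits them column by column. The set of instances is the same in both cases; only the order changes, which is the sense in which a schedule optimizer transforms a program without changing what it computes.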

> I notice things like 64-bit integers being used as the addressable size regardless of the target architecture; does LLVM use constant constraints for all target architectures?

ISL's integers are arbitrary-precision (e.g. using GNU MP), i.e. they
don't have a native bit-width. 64 bits was used at the beginning and
has been fine so far. We worked on determining the maximum value of an
ISL expression so we know what integer precision to use, but we never
got to a result.
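
For readers wondering why the chosen width matters at all, here is a small Rust sketch, unrelated to Polly's actual code, that evaluates the same linearized index expression in 32-bit and 64-bit arithmetic. The function names are hypothetical. It only illustrates the general concern that a generated expression must fit the integer type it is emitted in.

```rust
/// Row-major offset i * cols + j in 32-bit arithmetic; `None` on overflow.
pub fn offset_i32(i: i32, j: i32, cols: i32) -> Option<i32> {
    i.checked_mul(cols)?.checked_add(j)
}

/// The same expression in 64-bit arithmetic; `None` on overflow.
pub fn offset_i64(i: i64, j: i64, cols: i64) -> Option<i64> {
    i.checked_mul(cols)?.checked_add(j)
}
```

For a 50,000 x 50,000 array, the offset of row 50,000 is 2,500,000,000, which exceeds `i32::MAX` (2,147,483,647) but is far below the 64-bit limit, so `offset_i32` reports an overflow while `offset_i64` does not.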

> I am tracing the OpenMP pragma labeling for C code and trying to find the equivalent in Rust.

Clang lowers OpenMP pragmas in its code-generation phase. That is, the
LLVM-IR contains function calls to the OpenMP runtime. E.g. #pragma
omp parallel becomes a call to __kmpc_fork_call with a pointer to a
function containing the code of the parallel region. LLVM's OpenMPOpt
pass, which is currently being worked on, has to recognize these
runtime calls. Unfortunately, Polly does not have such a detection
that would make it possible to optimize OpenMP-annotated regions.
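
The outlining idea can be sketched in Rust using only the standard library. This is an analogy and not the OpenMP runtime: `fork_call` and `partial_sum` are hypothetical names, the real __kmpc_fork_call has a different signature and uses the runtime's own thread team, and the sketch assumes Rust 1.63 or later for `std::thread::scope`.

```rust
use std::thread;

/// Hypothetical stand-in for an OpenMP-style fork call: run `outlined` once
/// per thread, passing the thread id and the thread count, and collect the
/// per-thread results in thread-id order.
pub fn fork_call<F>(num_threads: usize, outlined: F) -> Vec<u64>
where
    F: Fn(usize, usize) -> u64 + Sync,
{
    let outlined = &outlined;
    thread::scope(|s| {
        // Spawn every worker first, then join them all.
        let handles: Vec<_> = (0..num_threads)
            .map(|tid| s.spawn(move || outlined(tid, num_threads)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}

/// The body of the "parallel region", outlined into its own function:
/// each thread sums a strided share of 0..n. Requires num_threads >= 1.
pub fn partial_sum(n: u64, tid: usize, num_threads: usize) -> u64 {
    (tid as u64..n).step_by(num_threads).sum()
}
```

A call such as `fork_call(4, |tid, nt| partial_sum(1000, tid, nt))` returns four partial sums whose total equals the sequential sum of 0..1000. The point of the analogy is that after lowering, the parallel region is no longer a loop in the caller but a separate function handed to a runtime, which is why a pass has to recognize the runtime call to see the region at all.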

Hope this helps.

Michael

Josh Ward

Mar 25, 2021, 9:26:08 PM

to Polly Development

This helps greatly; I will continue my code reading.

I meant where PPCG labels loops with OpenMP, here:
https://github.com/llvm/llvm-project/blob/ed8d76ec60745f88b1dfd28876dd2d1143c04279/polly/lib/External/ppcg/cpu.c#L302

and here:
https://github.com/llvm/llvm-project/blob/ed8d76ec60745f88b1dfd28876dd2d1143c04279/polly/lib/External/ppcg/cpu.c#L353

I am very early in my understanding of the codebase. Thanks, this will help exponentially.

Michael Kruse

Mar 30, 2021, 1:10:13 AM

to Josh Ward, Polly Development

The code you are referencing has been copied from PPCG
(https://repo.or.cz/ppcg.git). PPCG supports the switches `--target=c
--openmp` (https://github.com/Meinersbur/ppcg/blob/a8387001bc35046176766f842bf37921a8ea104d/ppcg_options.c#L108),
but Polly uses this source only for GPGPU code generation
("Polly-ACC"). Using the ppcg OpenMP mode would require setting the
option to true (https://github.com/llvm/llvm-project/blob/ed8d76ec60745f88b1dfd28876dd2d1143c04279/polly/lib/CodeGen/PPCGCodeGeneration.cpp#L2616).

Polly-ACC replaces the IslScheduleOptimizer, IslAstInfo and
CodeGeneration with a single PPCGCodeGeneration pass. Use the
non-GPGPU code generation path for generating OpenMP programs.

Michael

Josh Ward

Apr 5, 2021, 7:39:06 PM

to Polly Development

Can Polly emit OpenCL for GPGPU targeting like PPCG but from IR and without the C pragma annotations?
This is just as important to me as properly hoisting and identifying SCOPs from Rustc IR.

Josh Ward

Apr 5, 2021, 8:50:21 PM

to Polly Development

Where can I find a thorough list of the arguments I can pass to Polly? I am trying to call the Polly-ACC GPGPU commands and am guessing from the GPGPU tests and configuration.

This would expedite my learning.

Michael Kruse

Apr 5, 2021, 9:41:05 PM

to Josh Ward, Polly Development

On Mon, Apr 5, 2021 at 6:39 PM Josh Ward <joshawa...@gmail.com> wrote:
> Can Polly emit OpenCL for GPGPU targeting like PPCG but from IR and without the C pragma annotations?

PollyACC emits code that either links to libcudart or libopencl
without requiring pragma annotations (neither does PPCG with its
autodetect option). In contrast to PPCG, it does not emit source code
(CUDA or OpenCL C) but a binary representation (PTX or SPIR; _not_
SPIR-V).

Michael
