Handle external function call during loop analysis

14 views
Skip to first unread message

Dmitry N. Mikushin

unread,
Sep 3, 2012, 6:24:43 PM9/3/12
to polly-dev
Dear colleagues,

The version of Polly we work with always treats loop as non-parallel,
when its body contains external function call defined in the same
module, unless it is defined with some nice attribute, like readnone.
In order to workaround this effect, we performed aggressive inlining.
But with complex realistic apps this is unsuitable.

Is it possible to recursively walk through the called functions and
analyze them as well?
I can't find it now, but I remember something related is already being
discussed/developed, is it?

Thanks,
- Dima on behalf of kernelgen.org

Tobias Grosser

unread,
Sep 4, 2012, 4:48:51 AM9/4/12
to Dmitry N. Mikushin, polly-dev
Hi Dimitry,

you are right with spotting that the two current options to handle
function calls is either to prove they are readnone or to inline them.
We have been discussing other options, e.g. to add metadata to a
function that allows to specify its overall effect. Something like

int foo(int n) WRITES (A[i] : 0 <= i < n) {

// some crazy not-analyzable function

}

This would allow users to hide difficult, non-static code in some kind
of black box. Instead of only allowing read-none function calls, we
could now allow all function calls where we know the side effects.

In your use case, you want functions that are statically analyzable and
that can theoretically be inlined, to not be inlined. I am currently not
sure, why you want to do this. However, the way to do it would probably
be to automatically create the meta-data that specifies the behavior of
the function and to extend Polly such that it
accepts and understands functions with such meta-data.

Cheers
Tobi

Dmitry N. Mikushin

unread,
Sep 5, 2012, 10:58:14 AM9/5/12
to Tobias Grosser, polly-dev
Dear Tobi,

Thanks for reply,

Our typical use case looks like a very deep cascade of nested loops
distributed across call graph. Every subgroup of loops is considered
for parallelism upon execution (in our case - in runtime). Before
using Polly, we need to inline entire part of graph, that could be
very large. Since model codes normally have a global time step loop on
top of everything, one of subgraphs will contain almost the whole
program, and a lot others will be significantly big. In such
circumstances inlining is impossible.

At our end we are more thinking about the possibility of fair and
explicit handling of external functions, rather than hiding anything.
Instead of inlining it would be perfect to have called function bodies
to be processed on-the-fly during analysis and SCEV. If a function not
satisfying valid scope conditions is met, then analysis can be shut
down very early. Is it possible, how do you think?

- D.

2012/9/4 Tobias Grosser <tob...@grosser.es>:

Hongbin Zheng

unread,
Sep 5, 2012, 11:53:17 AM9/5/12
to Dmitry N. Mikushin, Tobias Grosser, polly-dev
On Wed, Sep 5, 2012 at 10:58 PM, Dmitry N. Mikushin <maem...@gmail.com> wrote:
> Dear Tobi,
>
> Thanks for reply,
>
> Our typical use case looks like a very deep cascade of nested loops
> distributed across call graph. Every subgroup of loops is considered
> for parallelism upon execution (in our case - in runtime). Before
> using Polly, we need to inline entire part of graph, that could be
> very large. Since model codes normally have a global time step loop on
> top of everything, one of subgraphs will contain almost the whole
> program, and a lot others will be significantly big. In such
> circumstances inlining is impossible.
>
> At our end we are more thinking about the possibility of fair and
> explicit handling of external functions, rather than hiding anything.
> Instead of inlining it would be perfect to have called function bodies
> to be processed on-the-fly during analysis and SCEV. If a function not
> satisfying valid scope conditions is met, then analysis can be shut
> down very early. Is it possible, how do you think?
>
> - D.
>

Hi Dimitry,

I am also interested in the "Interprocedural" approach, like what you
said, we can implement an Interprocedural Pre-SCoP analysis (like
nowaday's TempSCoP pass) as an CallGraphSCC pass to provides the
information about the called functions. And then build the SCoPs
according to the Interprocedural Pre-SCoP analysis.

best regards
ether

Dmitry N. Mikushin

unread,
Sep 5, 2012, 5:56:05 PM9/5/12
to Hongbin Zheng, Tobias Grosser, polly-dev, kernelg...@lists.hpcforge.org
Dear Ether,

Thank you for supportive comment, I can see now this problem is of
major interest! Do you already have a detailed plan of this
functionality? I'm abridging our small devel group, let's discuss it
all together. We will be able to work more actively in 2 weeks (next
week there are two project talks scheduled, need to prepare).

The version of polly we use is slightly customized r157485. TempSCoP
there is only mentioned in one comment. I'm guessing it was introduced
in newer versions?

Best,
- D.

2012/9/5 Hongbin Zheng <ethe...@gmail.com>:

Hongbin Zheng

unread,
Sep 6, 2012, 12:40:54 PM9/6/12
to Dmitry N. Mikushin, Tobias Grosser, polly-dev, kernelg...@lists.hpcforge.org
Hi Dmitry,

On Thu, Sep 6, 2012 at 5:56 AM, Dmitry N. Mikushin <maem...@gmail.com> wrote:
> Dear Ether,
>
> Thank you for supportive comment, I can see now this problem is of
> major interest! Do you already have a detailed plan of this
> functionality? I'm abridging our small devel group, let's discuss it
> all together. We will be able to work more actively in 2 weeks (next
> week there are two project talks scheduled, need to prepare).
I afraid I do not have a detailed plan for this. I am going to pick up
my last codegen related patch first after this week I finished my
personal issue.

In order to allow subfunction in a SCoP, we can treat the subfunction
as a single SCoP statement at the first step, which means Polly is not
going to change the code in the subfunction, otherwise the SCoP
becomes "Interprocedural".
Handling an interprocedural SCoP may out of Polly's capability right
now, because Polly is heavily depends on the ScalarEvoluation pass,
which is not interprocedural.

Another possible solution is, inline all subfunction before Polly run,
and then extract the inlined code back to a Function after Polly.

>
> The version of polly we use is slightly customized r157485. TempSCoP
> there is only mentioned in one comment. I'm guessing it was introduced
Sorry for not making myself clean. I means the TempScopInfo pass,
which is located in the lib/Analysis/TempScopInfo.cpp.
This pass perform some early analysis for building a SCoP.
> in newer versions?
>
> Best,
> - D.
>

best regards
ether
Reply all
Reply to author
Forward
0 new messages