New codegen framework for Polly


ether zhhb

Apr 30, 2011, 10:52:21 AM
to poll...@googlegroups.com, polly-commits
hi,

I just implemented a new (experimental) codegen framework for Polly,
based on the code in CodeGeneration.cpp.

In brief: the new codegen framework is more modular and makes it
easier to add new codegen passes to Polly, but it is more complicated
than the original light-weight codegen framework. The sequential
codegen pass passes all regression tests that the original codegen
pass passed.

The codegen framework is designed to provide a modular approach to
implementing code generation support for various software/hardware
platforms in Polly. To achieve this, each codegen pass is implemented
as a member of the "Codegen" analysis group[1], and should inherit
from the "Codegen" class and some LLVM pass class (hopefully ScopPass)
at the same time.
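
To make the double inheritance concrete, here is a minimal sketch that
does not use LLVM at all. Pass, Codegen and CodegenGroup are toy
stand-ins I made up for illustration (they are not the classes from the
patches), and the group is just a list of registered members:

```cpp
#include <string>
#include <vector>

// Toy stand-in for an LLVM pass base class (hypothetical, not the LLVM API).
struct Pass {
  virtual ~Pass() = default;
  virtual std::string getPassName() const = 0;
};

// Toy stand-in for the "Codegen" interface shared by all group members.
struct Codegen {
  virtual ~Codegen() = default;
  virtual bool canHandle(const std::string &StmtKind) const = 0;
};

// Toy stand-in for the analysis group: it only records its members.
struct CodegenGroup {
  std::vector<Codegen *> Members;
  void add(Codegen *C) { Members.push_back(C); }
};

// A codegen pass is both a pass and a member of the Codegen group.
struct SequentialCodegen : Pass, Codegen {
  std::string getPassName() const override { return "SequentialCodegen"; }
  // The default implementation accepts every statement kind.
  bool canHandle(const std::string &) const override { return true; }
};
```

The same object can then be used through either base class, which is
what lets the pass manager and the codegen chain both see it.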

In the new codegen framework, all enabled codegen passes are chained
together. The codegen passes do not generate any code in their
runOnXXXX functions. Instead, they generate code when they handle a
codegen request sent from the codegen driver pass. This means you can
implement a codegen pass as an ImmutablePass. In fact, the default
codegen implementation "SequentialCodegen", which always generates
sequential code, is an ImmutablePass. The codegen driver pass sends
each codegen request to the last codegen pass in the chain (the first
codegen pass in the chain is always SequentialCodegen). The request
then passes through the chain until one codegen pass decides to handle
it and stops passing it to the previous one.
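
The request flow described above can be sketched as a plain chain of
responsibility. This is only an illustration under my own assumptions:
RequestKind, CodegenPass, handle() and buildChain() are hypothetical
names, and the real passes work on clast statements rather than on an
enum:

```cpp
#include <string>
#include <vector>

// Hypothetical request kinds; the real framework sends clast statements.
enum class RequestKind { InnermostParallelLoop, OtherLoop, UserStmt };

// Hypothetical base class of all passes in the chain.
struct CodegenPass {
  CodegenPass *Prev = nullptr; // previous pass in the chain
  virtual ~CodegenPass() = default;
  // Returns the name of the pass that handled the request.
  virtual std::string handle(RequestKind K) {
    // Default: do not handle, pass the request to the previous pass.
    return Prev ? Prev->handle(K) : "unhandled";
  }
};

// First pass in the chain: handles every request with sequential code.
struct SequentialCodegen : CodegenPass {
  std::string handle(RequestKind) override { return "SequentialCodegen"; }
};

// Only interested in innermost parallel loops.
struct VectorCodegen : CodegenPass {
  std::string handle(RequestKind K) override {
    if (K == RequestKind::InnermostParallelLoop)
      return "VectorCodegen";
    return CodegenPass::handle(K);
  }
};

// Links the passes in order and returns the last one, which is the pass
// the driver talks to.
inline CodegenPass *buildChain(const std::vector<CodegenPass *> &Passes) {
  CodegenPass *Prev = nullptr;
  for (CodegenPass *P : Passes) {
    P->Prev = Prev;
    Prev = P;
  }
  return Prev;
}
```

With the chain {SequentialCodegen, VectorCodegen}, the driver always
calls handle() on VectorCodegen, and anything it ignores ends up at
SequentialCodegen.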

Because the codegen passes are chained together, a codegen
implementation may generate code for only some specific clast
statements or expressions. For example, a vectorization codegen pass
can generate code for only the innermost parallel loop. The rest of
the clast that is not handled by the current codegen pass is forwarded
to the next codegen pass, for example the default codegen
implementation "SequentialCodegen". This means we can separate the
codegen support code for different platforms into different codegen
passes, which allows us to add much more codegen support targeting
different platforms to Polly. The chained codegen passes can also
generate hybrid code that exploits parallelism at several levels at
the same time, as the original light-weight codegen approach does.

To support codegen backends implemented within the framework, two new
data structures and some corresponding concepts are introduced:

1. Codegen location (the Loc struct). A codegen location provides
information including:
*Where the codegen passes should place the newly generated code, and
*What kind of clast statement the codegen passes should generate code for.

To ensure that a codegen request is always sent to the last codegen
pass in the codegen chain, a codegen pass that finishes generating
code for the current codegen location should return the next codegen
location to the codegen driver. The codegen driver then sends the next
codegen location to the last pass in the chain. You should not call
the codegen functions recursively; otherwise the codegen request is
not sent to the codegen passes after your codegen pass in the chain.

2. Codegen region (the Reg struct). A codegen region carries
information about a region in the newly generated code, including:
*The clast statement that corresponds to this codegen region.
*The entry BasicBlock and the exiting BasicBlock of this codegen
region. This is useful when we generate code for clast statements that
lead to a nonlinear CFG, such as clast_for and clast_guard.
*A symbol table that maps clast names to LLVM Values (CharMap in
CodeGeneration.cpp). The symbol table is shared by all codegen
regions.
*A vector of "ValueMap"s that map LLVM values in the old scop to
values in the newly generated code. We need more than one map because
we need to support code generation for unrolled loops (the same as the
code in CodeGeneration.cpp). Each ValueMap corresponds to a "symbol
space": looking up the same old-scop value in different symbol spaces
gives different results.

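The two data structures and the non-recursive driver loop can be
sketched roughly as follows. This is a simplified illustration, not the
code from the patches: strings stand in for llvm::Value and BasicBlock,
and the field names, Stmt, Handler, runDriver() and emitAndAdvance()
are all names I assume here for the example:

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Strings stand in for llvm::Value and BasicBlock (assumption).
using Value = std::string;
using ValueMap = std::map<Value, Value>; // old scop value -> new value

// Toy clast statement: a kind plus the statement that follows it.
struct Stmt {
  std::string Kind; // e.g. "for", "guard", "user"
  const Stmt *Next; // following statement, or nullptr at the end
};

// Codegen location: where to place new code and what to generate.
struct Loc {
  std::string InsertBlock;
  const Stmt *S;
};

// Codegen region: bookkeeping for one region of the generated code.
struct Reg {
  const Stmt *S = nullptr;                         // corresponding statement
  std::string EntryBB, ExitingBB;                  // region boundaries
  std::map<std::string, Value> *Symbols = nullptr; // shared clast-name table
  std::vector<ValueMap> SymbolSpaces;              // one map per unrolled copy

  // Returns the new value for OldV in one symbol space, or "" if unmapped.
  Value lookup(std::size_t Space, const Value &OldV) const {
    const ValueMap &M = SymbolSpaces.at(Space);
    ValueMap::const_iterator It = M.find(OldV);
    return It == M.end() ? Value() : It->second;
  }
};

// A pass handles one location and returns the next one instead of recursing.
using Handler = std::function<Loc(const Loc &)>;

// Driver loop: every returned location re-enters the chain at the last pass.
inline std::vector<std::string> runDriver(Loc Start, const Handler &LastPass) {
  std::vector<std::string> Order;
  for (Loc Cur = Start; Cur.S != nullptr; Cur = LastPass(Cur))
    Order.push_back(Cur.S->Kind);
  return Order;
}

// Minimal handler: pretend to emit code, then advance to the next statement.
inline Loc emitAndAdvance(const Loc &L) {
  return Loc{L.InsertBlock, L.S->Next};
}
```

Because the handler returns a Loc instead of recursing, the loop in
runDriver() is the only place where the next statement is dispatched,
so every statement is offered to the last pass in the chain first.
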
As the clast has a tree structure, codegen regions are nested, and
because we delete a codegen region after we leave it, the nested
codegen regions form a stack. During the codegen process there is only
one codegen region stack, shared by all codegen passes in the chain.
Pushing and popping codegen regions is managed by the Codegen class
internally, so a codegen pass does not need to worry about codegen
region management. All it needs to do is some preparation when
entering a codegen region and some finalization when exiting a codegen
region:

For example, the "enterLoop" function of the SequentialCodegen pass
receives the codegen region for a clast_for statement and the bounds
of the loop. What the function does is:
1. Generate CFG for the loop, excluding the back edge.
2. Generate the induction variable for the loop, and insert it and its
name into the symbol table of the codegen region.
3. Set the entry block and exiting block of the codegen region to the
right BasicBlock.

The "exitLoop" function of the SequentialCodegen pass receives the
same codegen region that was passed to "enterLoop", together with the
last BasicBlock of the loop body. All the function does is:
1. Generate the back edge, branching from the last BasicBlock of the
loop to the entry of the codegen region (which is also the loop header).
2. Add the incoming value to the induction variable.
3. Erase the induction variable from the symbol table.
4. Return the codegen location for the clast statement that follows
the "for" statement of the codegen region. The BasicBlock of the
returned location is the exit block of the loop.
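
The enterLoop/exitLoop steps above can be sketched on a toy CFG. Again
this is only an illustration and not the LLVM API: Block, LoopRegion,
the block naming scheme and the string-valued PHI entries are
assumptions made for the example:

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy CFG node: a name plus successor names (hypothetical).
struct Block {
  std::string Name;
  std::vector<std::string> Succs;
};

// Toy codegen region for a clast_for statement.
struct LoopRegion {
  std::string IVName;       // clast name of the induction variable
  std::string ExitCond;     // toy loop condition built from the upper bound
  Block Header, Body, Exit; // Header is the entry of the region
  // Incoming (value, predecessor block) pairs of the induction variable.
  std::vector<std::pair<std::string, std::string>> PhiIncoming;
  std::map<std::string, std::string> *Symbols = nullptr; // shared table
};

inline void enterLoop(LoopRegion &R, const std::string &Preheader,
                      const std::string &LB, const std::string &UB) {
  // 1. Build the loop CFG without the back edge.
  R.Header = Block{R.IVName + ".header",
                   {R.IVName + ".body", R.IVName + ".exit"}};
  R.Body = Block{R.IVName + ".body", {}};
  R.Exit = Block{R.IVName + ".exit", {}};
  // 2. Create the induction variable and publish it in the symbol table.
  R.PhiIncoming.push_back({LB, Preheader});
  R.ExitCond = "%" + R.IVName + " <= " + UB;
  (*R.Symbols)[R.IVName] = "%" + R.IVName;
  // 3. Entry and exiting blocks are R.Header and R.Body in this toy model.
}

inline std::string exitLoop(LoopRegion &R, Block &LastBodyBlock) {
  // 1. Back edge from the last body block to the loop header.
  LastBodyBlock.Succs.push_back(R.Header.Name);
  // 2. Second incoming value of the induction variable.
  R.PhiIncoming.push_back({"%" + R.IVName + ".next", LastBodyBlock.Name});
  // 3. The induction variable is no longer visible after the loop.
  R.Symbols->erase(R.IVName);
  // 4. Code for the following statement goes into the exit block.
  return R.Exit.Name;
}
```

The induction variable is only in the symbol table between the two
calls, so statements after the loop cannot refer to it by accident.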


This is all I can write down at the moment. I will explain in detail
if anything is confusing, and you can find more examples of codegen
region entering/exiting in the source code.


I do not propose to replace the original codegen with the new
codegen right now, because I have not yet moved all the code in
CodeGeneration.cpp to the new codegen framework, but I suggest we
replace it later, after we have moved the necessary code from
"CodeGeneration.cpp" to the new codegen framework and are sure that
the new codegen framework works correctly.

Yes, the new codegen framework will have some (or a lot of) overhead
compared to the existing approach, but we can always improve it
together, so comments and suggestions are appreciated! (It seems all
you guys are very busy at the moment, so I will wait :) )

--
best regards
ether


[1]http://llvm.org/docs/WritingAnLLVMPass.html#analysisgroup

0001-Codegen-Add-the-generic-code-generation-interface-an.patch
0002-Codegen-Add-the-default-implementation-of-the-codege.patch
0003-Codegen-Test-the-SequentialCodegen-in-regression-tes.patch
0004-Codegen-Add-the-initial-implementation-of-the-vector.patch

Tobias Grosser

May 6, 2011, 11:43:17 AM
to ether zhhb, poll...@googlegroups.com, polly-commits
On 04/30/2011 04:52 PM, ether zhhb wrote:
> hi,
>
> I just implement a new (experimental) codegen framework for polly,
> base on the code in CodeGeneration.cpp.

I just took a look and I think this is definitely an interesting
approach. I am strongly for a more modular approach and your code looks
very nice. After the first read, I did not get all details, but this
will hopefully change over time.

As this is a complete rewrite I believe we should make sure that we
do not introduce any regressions. Not only in make polly-test, but also
in the llvm test-suite. So your suggestion to not switch everything at
once sounds great. Still, we should make sure to evaluate your changes
fast to not get them outdated.

I propose to integrate these patches piecewise and would like to
progress steadily towards more modularity - always keeping the code as
well (or even better) tested as before. I also would like to ensure that
the new codegen passes are as simple as possible.

I propose to start by simplifying the existing codegen pass, such that
the test cases are simpler and the new codegen pass needs to do
less work. One of the first points that comes to my mind is that the old
pass was supposed to work on non-simple regions. I would like to remove
the non-simple region stuff (e.g. createSeSeEdges()) from it and adapt
all test cases to only provide simple regions. If this is tested and
works, we do not need to worry at all about non-simple region support in the new
codegen framework. What do you think?

> A brief description of the new codegen framework is: The new codegen
> framework is more modular and allow us add new codegen pass to polly
> more easier, but the framework is more complicated than the original
> light-weight codegen framework. And the sequential codegen pass passes
> all regression test that the original codegen pass passed.

Nice. This is a good first step.

> The codegen framework is designed to provide a modular approach to
> implement code generation support for various software/hardware
> platforms in Polly. To achieve this, the codegen passes is implemented
> as a member of the "Codegen" analysis group[1], and should inherited
> from the "Codegen" class and some llvm pass class(hopefully ScopPass)
> at the same time.

This is interesting. Are you sure we can use an analysis group to
perform something that is not an analysis? In case analysis groups do
not need to be analyses, we may want to contribute a patch to LLVM to
rename them to 'pass groups' or something similar. We should definitely
make sure the LLVM developers know that analysis groups are used in
passes that change the code.

> In the new codegen framework, all enabled codegen passes will be
> chained together, and the codegen passes do not generate any code in
> their runOnXXXX function, instead they generate code when they handle
> the codegen request sent from the codegen driver pass. This means you
> can implement a codegen pass as an ImmutablePass, and in fact, the
> default codegen implementation "SequentialCodegen", which always
> generate sequential code, is a ImmutablePass. The codegen request send
> from the codegen driver pass to the last codegen pass in the chain
> (the first codegen pass in the chain will always be
> SequentialCodegen). And the codegen request will pass through the
> chain until one codegen pass decide to handle the request and stop
> passing it to the previous one.

Sounds good.

> Because the codegen passes are chained together, a codegen
> implementation can only generate code for some specific clast
> statement or expression, for example, a vectorization codegen pass can
> only generate code for the inner most parallel loop. and the rest
> clast stuffs that not handled by the current codegen pass will be
> forward to the next codegen pass, for example, the default codegen
> implementation "SequentialCodegen". This means we can separate the
> codegen support code for difference platforms in difference codegen
> passes, this allow us to add much more codegen support targeting
> difference platforms to Polly. And the chained codegen passes can
> also generate hybrid code that exploits parallelism at several level
> at the same time, as the original light-weight codegen approach.

Nice.

> To support codegen backends implemented within the framework, two new
> data structures and some corresponding concepts are introduced:
>
> 1. Codegen location (the Loc struct), a codegen location provide some
> information including:
> *Where should the codegen passes place the new generated code, and
> *What kind of clast statement should the codegen passes to generate code for.
>
> To ensure the codegen request can always be sent to the last codegen
> pass in the codegen chain, when the codegen pass finish generate code
> for the current codegen location, they should return the next codegen
> location and codegen driver, and the codegen driver will sent the next
> codegen location to the last pass in the chain, and you should not
> call function recursively, otherwise the codegen request will not been
> sent to the codegen passes after you codegen pass in the chain.

Sounds OK. I need to look a little bit more into this.

> 2. Codegen region (the Reg struct), a codegen region carries
> information about a region in the new generated code, including:
> *The clast statement correspond to this codegen region,
> *The entry BasicBlock and the exiting BasicBlock of this codegen
> region, this is useful when we generate code for some clast statements
> will lead to nonlinear CFG such as clast_for, clast_guard.
> *A symbol table that mapping clast names to LLVM Value(CharMap in
> CodeGeneration.cpp), the symbol table is shared by all codegen
> regions.
> *A vector of "ValueMap"s that mapping LLVM value in the old scop to
> the new generated code, we need more than one map because we need to
> support generate code for unrolled loop(The same as code in
> CodeGeneration.cpp), and we say each ValueMap correspond to a "symbol
> space", and you look up the same value in the old scop in difference
> "symbol space" will get difference result.

General idea looks also good. I will dig into this a little bit deeper
later on.

Nice design.

> These are all i can write down at the moment, i will explain you
> question in detail if you have any confuse, and you can find more
> example for codegen region entering/exiting in the source code.

> I not propose to replace the original codegen stuffs by the new
> codegen right now because i do not move the code in
> CodeGenerateion.cpp to the new codegen framework completely, but i
> suggest we replace it later. After we move necessary code in the
> "CodeGeneration.cpp" to the new codegen framework, and we are sure
> that the new codegen framework works correctly.

Sounds fine. I have not fully understood it, but I do not see a big show
stopper. Though I have not yet fully reviewed the patches.

> Yes, the new codegen framework will have some(or lots) of overhead to
> the existing approach, but we can always improve it together, so,
> comment and suggestion is appreciate!(It seems all you guys are very
> busy at the moment, so i will wait:) )

I am not concerned about compile time performance. We can optimize this
later. As long as there are no conceptual problems, I am fine with this
approach.

Another thing I would like to make sure is that our automatic testers
work, before we start to integrate bigger changes.

Tobi

ether zhhb

May 6, 2011, 11:10:39 PM
to Tobias Grosser, poll...@googlegroups.com, polly-commits
hi tobi,

On Fri, May 6, 2011 at 11:43 PM, Tobias Grosser <tob...@grosser.es> wrote:
> On 04/30/2011 04:52 PM, ether zhhb wrote:
>>
>> hi,
>>
>> I just implement a new (experimental) codegen framework for polly,
>> base on the code in CodeGeneration.cpp.
>
> I just took a look and I think this is definitely an interesting approach. I
> am strongly for a more modular approach and your code looks very nice. After
> the first read, I did not get all details, but this will hopefully change
> over time.
>
> As this is a completely rewrite I believe we should make sure, that we do

In fact I copied a lot of code from CodeGeneration.cpp.

> not introduce any regressions.
It does not introduce any regressions at the moment.


>Not only in make polly-test, but also in the llvm test-suite.

yes, i agree.


> So your suggestion to not switch everything at once sounds great.
> Still, we should make sure to evaluate your changes fast to not get them outdated.
>
> I propose to integrate the this patches piecewise and would like to progress
> steadily towards more modularity - always keeping the code as well (or even
> better) tested as before. I also would like to ensure that the new codegen
> passes are as simple as possible.
>
> I propose to start by simplifying the existing codegen pass, such that the

Before introducing the new codegen passes, I tried to do this, but
after removing createSeSeEdges, I got a segfault (or something, I
forgot).


> test cases are simpler and we the new codegen pass does need to do less
> work. One of the first points that comes to my mind is that the old pass was
> supposed to work on non-simple regions. I would like to remove the

And it also seems to be supposed to work on non-independent blocks?


> non-simple region stuff (e.g. createSeSeEdges()) from it and adapt all test
> cases to only provide simple regions. If this is tested and works, we do not
> worry at all about non-simple region support in the new codegen framework.
> What do you think?

The new codegen framework asserts that every incoming scop is in a
simple region, so the CFG update and dominator tree update are simpler :)


>
>
>> The codegen framework is designed to provide a modular approach to
>> implement code generation support for various software/hardware
>> platforms in Polly. To achieve this, the codegen passes is implemented
>> as a member of the "Codegen" analysis group[1], and should inherited
>> from the "Codegen" class and some llvm pass class(hopefully ScopPass)
>> at the same time.
>
> This is interesting. Are you sure we can use an analysis group to perform
> something that is not an analysis? In case analysis groups do not need to be
> analysis, we may want to contribute a patch to LLVM to rename them to 'pass
> groups' or something similar. We should definitely make sure the LLVM
> developers know that analysis groups are used in passes that change the
> code.

The Codegen passes do not generate code in their "runOnXXX" functions;
they only use the LLVM analysis group framework to build the codegen
chain.
Code is generated when the codegen driver pass runs: the codegen driver
sends the codegen request to the codegen chain, and some pass in the
chain responds to the request and generates some code.

And to be clear, this approach was first proposed by Andreas in our
older discussion on the polly-dev list.

>

>> Yes, the new codegen framework will have some(or lots) of overhead to
>> the existing approach, but we can always improve it together, so,
>> comment and suggestion is appreciate!(It seems all you guys are very
>> busy at the moment, so i will wait:) )
>
> I am not concerned about compile time performance. We can optimize this
> later. As long as there are no conceptual problems, I am fine with this
> approach.another context.
>
> Another thing I would like to make sure is that our automatic testers work,
> before we start to integrate bigger changes.

ok, and thanks for the review.
>
> Tobi
>

best regards
ether

Tobias Grosser

May 7, 2011, 7:54:59 AM
to ether zhhb, poll...@googlegroups.com, polly-commits
On 05/07/2011 05:10 AM, ether zhhb wrote:
> hi tobi,
>
> On Fri, May 6, 2011 at 11:43 PM, Tobias Grosser<tob...@grosser.es> wrote:
>> On 04/30/2011 04:52 PM, ether zhhb wrote:
>>>
>>> hi,
>>>
>>> I just implement a new (experimental) codegen framework for polly,
>>> base on the code in CodeGeneration.cpp.
>>
>> I just took a look and I think this is definitely an interesting approach. I
>> am strongly for a more modular approach and your code looks very nice. After
>> the first read, I did not get all details, but this will hopefully change
>> over time.
>>
>> As this is a completely rewrite I believe we should make sure, that we do
> In fact i copy a lots of code from CodeGenerateion.cpp.
Sure. In any case there is a lot of new code.

>> not introduce any regressions.
> it do not introduce any regressions at the moment

Did you test it on more than 'make polly-test'? Did you run it on
polybench? The llvm test-suite? I really would appreciate to have wider
testing.

>> Not only in make polly-test, but also in the llvm test-suite.
> yes, i agree.
>> So your suggestion to not switch everything at once sounds great.
>> Still, we should make sure to evaluate your changes fast to not get them outdated.
>>
>> I propose to integrate the this patches piecewise and would like to progress
>> steadily towards more modularity - always keeping the code as well (or even
>> better) tested as before. I also would like to ensure that the new codegen
>> passes are as simple as possible.
>>
>> I propose to start by simplifying the existing codegen pass, such that the
> Before introduce the new codegen passes, i had tried to do this, but
> after remove createSeSeEdges, i got a segfault (or something, i
> forgot).

I will try it today.

>> test cases are simpler and we the new codegen pass does need to do less
>> work. One of the first points that comes to my mind is that the old pass was
>> supposed to work on non-simple regions. I would like to remove the
> And it seems also supposed to work on non-independent blocks?
>> non-simple region stuff (e.g. createSeSeEdges()) from it and adapt all test
>> cases to only provide simple regions. If this is tested and works, we do not
>> worry at all about non-simple region support in the new codegen framework.
>> What do you think?
> New codegen framework asserts every incoming scop is in a simple
> region, and the CFG update and Dominator tree update is simpler :)

Nice.

>>> The codegen framework is designed to provide a modular approach to
>>> implement code generation support for various software/hardware
>>> platforms in Polly. To achieve this, the codegen passes is implemented
>>> as a member of the "Codegen" analysis group[1], and should inherited
>>> from the "Codegen" class and some llvm pass class(hopefully ScopPass)
>>> at the same time.
>>
>> This is interesting. Are you sure we can use an analysis group to perform
>> something that is not an analysis? In case analysis groups do not need to be
>> analysis, we may want to contribute a patch to LLVM to rename them to 'pass
>> groups' or something similar. We should definitely make sure the LLVM
>> developers know that analysis groups are used in passes that change the
>> code.
> The Codegen passes do not generate code on their "runOnXXX" function,
> their only use the llvm analysis group framework to build the codegen
> chain.
> And code is generated when the codegen driver pass run, codegen driver
> sends the codegen request to the codegen chain, and some pass in the
> chain will respond to the request and generates some code.
>
> And to be clear, this approach is first proposed by Andreas in our
> older discussion in polly-dev list.

OK. I definitely need to take a deeper look.

>>> Yes, the new codegen framework will have some(or lots) of overhead to
>>> the existing approach, but we can always improve it together, so,
>>> comment and suggestion is appreciate!(It seems all you guys are very
>>> busy at the moment, so i will wait:) )
>>
>> I am not concerned about compile time performance. We can optimize this
>> later. As long as there are no conceptual problems, I am fine with this
>> approach.another context.
>>
>> Another thing I would like to make sure is that our automatic testers work,
>> before we start to integrate bigger changes.
> ok, and thanks for the review.

Another possibility to increase the speed of integrating your new
codegen framework is to add a new directory + a PollyCodegen library,
that is available in parallel to the existing framework. Like this we
have both available and can easily compare them, work on bugs ...
After the automatic testers are available, and some more tests were run
on polybench and the llvm test-suite we may switch over.

What do you think?

Tobi

ether zhhb

May 7, 2011, 8:40:57 AM
to Tobias Grosser, poll...@googlegroups.com, polly-commits
hi tobi,

On Sat, May 7, 2011 at 7:54 PM, Tobias Grosser <tob...@grosser.es> wrote:

>>
>> The Codegen passes do not generate code on their "runOnXXX" function,
>> their only use the llvm analysis group framework to build the codegen
>> chain.
>> And code is generated when the codegen driver pass run, codegen driver
>> sends the codegen request to the codegen chain, and some pass in the
>> chain will respond to the request and generates some code.
>>
>> And to be clear, this approach is first proposed by Andreas in our
>> older discussion in polly-dev list.
>
> OK. I definitely need to take a deeper look.

I am writing the HTML document, and I can already send you a picture right now.
>
> Tobi
>

best regards
ether

codegen-arch.png

Tobias Grosser

May 7, 2011, 8:42:12 AM
to ether zhhb, poll...@googlegroups.com, polly-commits
Nice. Thanks a lot. Let me know when the html document is ready.

Tobi

ether zhhb

May 7, 2011, 8:43:22 AM
to Tobias Grosser, poll...@googlegroups.com, polly-commits
sure
>
> Tobi
>

ether zhhb

May 7, 2011, 8:56:15 AM
to Tobias Grosser, poll...@googlegroups.com, polly-commits
hi tobi,

If you want to read the code, you can find the up-to-date version here:
http://repo.or.cz/w/polly.git/shortlog/refs/heads/flexiblecodegen

best regards
ether

ether zhhb

May 9, 2011, 10:25:37 AM
to Tobias Grosser, poll...@googlegroups.com, polly-commits
hi,

On Sat, May 7, 2011 at 8:42 PM, Tobias Grosser <tob...@grosser.es> wrote:
> On 05/07/2011 02:40 PM, ether zhhb wrote:
>>
>> hi tobi,

>


> Nice. Thanks a lot. Let me know when the html document is ready.

Initial version done, patch attached.
>
> Tobi
>

best regards
ether

0003-Codegen-Add-initial-version-of-codegenframework.html.patch

Tobias Grosser

May 9, 2011, 10:26:19 AM
to ether zhhb, poll...@googlegroups.com, polly-commits

Thanks. I will have a look.

Tobi

ether zhhb

May 9, 2011, 10:32:23 AM
to Tobias Grosser, poll...@googlegroups.com, polly-commits
hi tobi,

>> it do not introduce any regressions at the moment
>
> Did you test it on more than 'make polly-test'? Did you run it on polybench?
> The llvm test-suite? I really would appreciate to have wider testing.

Not yet, but I will do this later.
>

> Another possibility to increase the speed of integrating your new codegen
> framework is to add a new directory + a PollyCodegen library, that is
> available in parallel to the existing framework. Like this we have both

Yes, I am also going to do this. Do you want me to add a separate patch
to move them, or to edit the previous patches?

> available and can easily compare them, work on bugs ...

Yes, we can use the code generated by the light-weight codegen pass as
some kind of reference answer.
And users do not need to switch to the new untested codegen
framework immediately; they can use the old well-tested codegen pass.

> After the automatic tester are available, and some more tests were run on
> polybench and the the llvm test-suite we may switch over.

Sounds great. So what is the status of the automatic tester?


>
> What do you think?
>
> Tobi
>

best regards
ether

Tobias Grosser

May 9, 2011, 3:10:47 PM
to polly-...@googlegroups.com, ether zhhb, poll...@googlegroups.com
On 05/09/2011 04:25 PM, ether zhhb wrote:
> From 4402851816945d50d4b0a6503344ce2997d90f35 Mon Sep 17 00:00:00 2001
> From: Hongbin Zheng<ethe...@gmail.com>
> Date: Mon, 9 May 2011 22:20:07 +0800
> Subject: [PATCH 3/3] Codegen: Add initial version of codegenframework.html

Hi ether,

I just went through this patch. It is also a nice description of the
framework which will be helpful to review the patches.

I highlighted some sentences and grammatical structures I would write
differently. Maybe you could have a look at my suggestions and see if
they are helpful.

Some points I have seen pretty often and which I thought may be helpful
to highlight are:

-----------------
1. The third person 's'

The framework implementS a modular approach.
                       ^
As 'framework' is a single item that we talk about and it is not the
first (I implement) or second (you implement) but the third person
(something implements), you need to append an 's'.

2. Unneeded words

To let you do something -> To do something

Keep the sentences as short as possible. No decoration.

3. Unfinished sentences.

"A symbol table maps clast names to LLVM Values, the symbol table is
shared by all codegen regions."

These are two sentences. Use a point and start again.

"A symbol table maps clast names to LLVM Values. It is shared by all
codegen regions."

As stated earlier, shorter sentences are easier to understand for readers.

4. Future

We are going to start ... -> We start

Always use the present tense (the time for now/at the moment)
-----------------

Except for these remarks, I think the document is already nice, though
it is still not completely finished.

Cheers
Tobi

> ---
> www/codegenframework.html | 459 +++++++++++++++++++++++++++++++++++++++++++
> www/images/codegen-arch.png | Bin 0 -> 11052 bytes
> 2 files changed, 459 insertions(+), 0 deletions(-)
> create mode 100755 www/codegenframework.html
> create mode 100755 www/images/codegen-arch.png
>
> diff --git a/www/codegenframework.html b/www/codegenframework.html
> new file mode 100755
> index 0000000..991513e
> --- /dev/null
> +++ b/www/codegenframework.html
> @@ -0,0 +1,459 @@
> +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
> +"http://www.w3.org/TR/html4/strict.dtd">
> +<!-- Material used from: HTML 4.01 specs:http://www.w3.org/TR/html401/ -->
> +<html>
> +<head>
> +<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
> +<title>The Code Generation Framework of Polly</title>
> +<link type="text/css" rel="stylesheet" href="menu.css">
> +<link type="text/css" rel="stylesheet" href="content.css">
> +</head>
> +<body>
> +<!--#include virtual="menu.html.incl"-->
> +<div id="content">
> +<!--*********************************************************************-->
> +<h1>The Code Generation Framework of Polly</h1>
> +<!--*********************************************************************-->
> +<!--=====================================================================-->
> +<h2>Introduction</h2>
> +<!--=====================================================================-->
> +
> +<p>The Code Generation Framework is a part of Polly that translate
in Polly translates
> +<a href="http://www.bastoul.net/cloog/manual.php#CLooG-Output">CLAst</a> back
the CLooG AST (Clast)

> + to<a href="http://llvm.org/docs/LangRef.html">LLVM IR</a>, and it provides a
LLVM-IR . It
provides a platform to implement code generation for different platforms
in a modular way.

> + modular approach to adding code generation support for various
> + platforms.
> +</p>
> +
> +<p>The framework can be divided in to two parts: The code generation chain and
consists of two parts
> + the code generation driver:
> +</p>
> +<!--=====================================================================-->
> +<img src='images/codegen-arch.png'/>
> +
> +<!--=====================================================================-->
> +<h3>Code Generation Driver</h3>
> +<!--=====================================================================-->
> +<p>The Code Generation Driver is a ScopPass, it will get the CLAst from the
. It gets the ...

> + CloogInfo pass and call the main code generation function of the Code
calls
> + Generation Chain. The main code generation function will iterate over the
iterates
> + CLAst, and generate code for the clast statement/expression.
generates ^^ I like 'clast' actually better
than CLAst.

> + Note that only non-trivial CLAst statements/expressions(e.g. reduction
> + expression, for statement and user statement) will be sent to the chain and
expressions, for statements and user statements) are sent to
> + handled by the passes in the chain, while trivial CLAst statements/expressions
are handled by
> + will be handled immediately without passed to the chain.
are handled without being passed to
> +</p>
> +<!--=====================================================================-->
> +<h3>Code Generation Chain</h3>
> +<!--=====================================================================-->
> +<p>The Code Generation Chain consist of a set of Passes that also inherited
consists passes
> + from the "Codegen" class, their are chained together by register them as a
. They are registering
> + member of the "Codegen"
> +<a href="http://llvm.org/docs/WritingAnLLVMPass.html#analysisgroup">
> + Analysis Group</a>, and when we say "Polly Code Generation Pass", it means
. The expression "Polly [...] Pass" means a pass
in the code generation chain of Polly.
> + "A Pass in the Code Generation Chain of Polly".
> +</p>
> +<p>The code generation request will pass through chain until some pass
passes through the chain
> + in the chain respond to the request, generate code and stop passing the
responds , generates stops
> + request to the next pass in the chain. If the code generation request reach

request reaches

> + the end of the chain - the "SequentialCodegen" pass, sequential code will be
, the "SequentialCodegen" pass,
> + generated for the request. This means passes apart from SequentialCodegen pass
> + in the chain do not need to respond to ALL code generation request, instead,
requests.
Instead,

> + their only need to respond to the code generation request that their are
they only requests that they

> + interested in. For example, the vector code generation pass only generates
> + code for inner most loop and the ScopStmt inside the loop, the rest code generation
loops ScopStmts these loops. The
remaining code generation requests are ignored and passed to ...
> + request will be ignored and passed to the next pass in the chain.
> +</p>
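
To check my understanding of the chain, here is a minimal standalone sketch of the forwarding behaviour described above. It is only a model under my own assumptions: ChainLink, SequentialLink, VectorLink and Request are hypothetical names, not the classes in the patch, and strings stand in for generated code.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a code generation request; not Polly's real type.
struct Request {
  bool IsInnermostLoop;
};

// Simplified model of a pass in the chain; the real "Codegen" class differs.
class ChainLink {
  ChainLink *Next; // next pass in the chain, null for the last one
public:
  explicit ChainLink(ChainLink *Next) : Next(Next) {}
  virtual ~ChainLink() = default;

  // Default behaviour: do not handle the request, forward it down the chain.
  virtual std::string handle(const Request &R) {
    assert(Next && "the last pass in the chain must handle every request");
    return Next->handle(R);
  }
};

// End of the chain: always responds, so every request gets handled.
class SequentialLink : public ChainLink {
public:
  SequentialLink() : ChainLink(nullptr) {}
  std::string handle(const Request &) override { return "sequential"; }
};

// Responds only to the requests it is interested in, forwards the rest.
class VectorLink : public ChainLink {
public:
  using ChainLink::ChainLink;
  std::string handle(const Request &R) override {
    if (!R.IsInnermostLoop)
      return ChainLink::handle(R);
    return "vectorized";
  }
};
```

With a VectorLink placed in front of a SequentialLink, a request for an innermost loop stops at the first pass, and everything else falls through to the sequential one.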
> +
> +<!--=====================================================================-->
> +<h2>Writing a Code Generation Pass for Polly</h2>
> +<!--=====================================================================-->
> +<p>We will disussed how to write a code generation pass in this section.
discuss how
> + We are going to start by listing the basic steps to write a code generation
start by listing
> + pass, and then disussed important data sturctures and function of the Codegen
. Then we discuss structures and functions ...
> + Class, at last we will show you how to use those data structures and how to
. Finally, we show how to use these
> + reimplement those functions by taking the existing code in the framework for
reimplement these ?? I do not get the end of this sentence ?


> + Example.
> +</p>
> +<!--=====================================================================-->
> +<h3>Basic Steps</h3>
> +<!--=====================================================================-->
> + To write a code generation pass for Polly, first you need to have basic
> + knowledge about
> +<a href="http://llvm.org/docs/WritingAnLLVMPass.html">how to write a LLVM
> + Pass</a> and
> +<a href="http://llvm.org/docs/WritingAnLLVMPass.html#analysisgroup">Analysis
> + Groups</a>, then you can do it by follow these steps:
following
> +<ol>
> +<li>Add a c++ source file with the name of your code generation pass under
C++ pass
in the lib/CodeGen
> + lib/CodeGen/ directory, and add the file to the CMakelists.txt's source list.
directory and add the
> +</li>
> +<li>Include "Codegen.h" and other necessary header files in you c++ file.
your C++
file.
> +</li>
> +<li>Define a class that inherited from the "Codegen" class and the "ScopPass"
inherits
> + class, and register you pass as an member of the "Codegen" Analysis Group. For
class and register your pass as a member
> + example, the VectorizedCodegen pass is regestered like this:
example the Vect.. is registered as follows:

> +<pre class="code">
> +static RegisterPass<VectorizedCodegen>
> + D("polly-vectorized-codegen",
> + "Polly - Vectorized Codegen implementation that generate vectorized code");
generates

> +
> +static RegisterAnalysisGroup<Codegen> E(D);</pre>
> +</li>
> +<li>Reimplement the "runOnScop" function of ScopPass, and call the "initializeChain
"
ScopPass and call
> + function there to set up the Code Generation Chain correctly like this:
function to set up the Code Generation Chain:


> +<pre class="code">
> + bool runOnScop(Scop&S) {
> + initializeChain(this);</pre>
> +</li>
> +<li> Implement the "getAdjustedAnalysisPointer" so your pass can work with the
> + LLVM Analysis Group framework. You can just copy the following code:
copy the following code:
> +<pre class="code">
> + /// getAdjustedAnalysisPointer - This method is used when a pass implements
> + /// an analysis interface through multiple inheritance. If needed, it
> + /// should override this to adjust the this pointer as needed for the
> + /// specified pass info.
> + void *getAdjustedAnalysisPointer(const void *ID) {
> + if (ID ==&Codegen::ID)
> + return static_cast&lt;Codegen*&gt;(this);
> + return this;
> + }</pre>
> +</li>
> +<li>Reimplement the virtual functions of "Codegen" class to generate code for
of the "Codegen"

> + the CLAst statements or/and expressions that you are interested in.
and expressions you are interested in.
> +</li>
> +</ol>
> +<p>
> +</p>
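
It may help readers to see why step 5 is needed at all. The following standalone sketch uses no LLVM headers; PassBase, CodegenBase and MyCodegen are hypothetical stand-ins for the pass class, the "Codegen" interface and a user pass. It only illustrates that a pointer handed out as void* has to point at the right base subobject when a class inherits from two bases.

```cpp
#include <cassert>

// Hypothetical stand-in for the LLVM pass base class.
struct PassBase {
  virtual ~PassBase() = default;
  int PassData = 0;
};

// Hypothetical stand-in for the analysis group interface.
struct CodegenBase {
  virtual ~CodegenBase() = default;
  static char ID; // only its address is used, as an identifier
  int ChainData = 0;
};
char CodegenBase::ID = 0;

struct MyCodegen : PassBase, CodegenBase {
  // Same shape as the function in the patch: return a pointer to the
  // CodegenBase subobject when the caller asks for that interface.
  void *getAdjustedAnalysisPointer(const void *ID) {
    if (ID == &CodegenBase::ID)
      return static_cast<CodegenBase *>(this);
    return this;
  }
};
```

A caller that only holds the void* can cast it back to CodegenBase* safely, because the static_cast inside the function already selected the correct subobject.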
> +<!--=====================================================================-->
> +<h3>Important Data Structures in the Code Generation Framework</h3>
> +<!--=====================================================================-->
> +<p>Passes in the code generation chain are communicate with each others via
chain communicate with
> + two kinds of data structure:<tt>Code Generation Location</tt> and<tt>Code
two data structures:
> + Generation Region</tt>. Several overridable code generation functions of the

> + Codegen class take a<tt>Code Generation Region</tt> as argument and returns

return

> + a<tt>Code Generation Location</tt>.
> +</p>
> +<!--=====================================================================-->
> +<h4>Code Generation Location(The Loc struct)</h4>
> +<!--=====================================================================-->
> +<p>A code generation location provides some information including:</p>

> +<ul>
> +<li>What kind of clast statement should the codegen passes to generate code
For what kind of clast statement should code be generated?</li>

> + for, and</li>
> +<li>Where should the codegen passes place the new generated code.</li>
Where should the newly generated code be placed?
> +</ul>
> +<p>To ensure a code generation location can always possible to reaches all
I do not get this sentence.

> + passes in the chain, when a code generation pass finishes generate code for
. When a code generation pass finishes
generating code ...

> + the current code generation location, they should return the next location to
it should return
> + driver. Then the driver will send the next location to the first pass in the
Then, the driver sends the next
> + chain. So instead of return the next location to the driver, if you call
In case you call recursively functions to handle the next
code generation location, instead of returning the next location to the
driver, the location will not be able to reach the passes earlier in the
pass chain. This is probably not what you want.

> + functions to handle next code generation location recursively, the location
> + will not able to reaches the passes before you pass in the chain, and this is
> + probably not the way you want.
> +</p>
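
As I read this paragraph, the driver runs a loop and every handler returns the next location instead of recursing. A minimal sketch of that control flow, assuming a much simpler Loc than the one in the patch (Stmt, Loc, Pass and drive are hypothetical names, and the location carries no insertion point):

```cpp
#include <cassert>
#include <vector>

// Hypothetical statement node, standing in for a clast statement.
struct Stmt {
  int Id;
  const Stmt *Next;
};

// Hypothetical location: only the statement, no insertion point.
struct Loc {
  const Stmt *S;
};

// Stand-in for the first pass of the chain.
struct Pass {
  std::vector<int> Seen; // ids of the statements offered to this pass

  // "Generate code" for L, then return the next location to the driver
  // instead of handling it recursively.
  Loc handle(Loc L) {
    Seen.push_back(L.S->Id);
    return Loc{L.S->Next};
  }
};

// Driver loop: every location re-enters the chain at its entry point.
inline void drive(Pass &Entry, const Stmt *First) {
  for (Loc L{First}; L.S;)
    L = Entry.handle(L);
}
```

Because each returned location goes back through drive, the entry pass is offered every statement. A handler that recursed into the next statement itself would bypass that.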
> +<!--=====================================================================-->
> +<h4>Code Generation Region(The Reg struct)</h4>
> +<!--=====================================================================-->
> +<p>A code generation region carries information about a region in the new
in the newly

> + generated code, including:
code:
> +</p>
> +<ul>
> +<li>The clast statement correspond to this codegen region,</li>
corresponding ... region.</li>
> +<li>The entry BasicBlock and the exiting BasicBlock of this codegen region,

region.

> + this is useful when we generate code for some clast statements will lead to
This for clast statements which
will lead to
> + nonlinear CFG such as clast_for, clast_guard.</li>
a non-linear
> +<li>A symbol table that mapping clast names to LLVM Value, the symbol table is
maps Values. The symbol
> + shared by all codegen regions.</li>
> +<li>A parameter table that mapping parameter in the original scop to new
maps
> + parameter in the new generated code, the parameter table is also shared by all
parameters in the newly generated code. The parameter table is
shared by all code regions.
> + codegen regions.</li>
> +<p>Parameter map is designed to help the code generation pass to avoid cross
> + function references when generate code in a function that is different from
> + the parent function of the original scop.
> +</p>
> +<li>A Scop statement level vector of ValueMaps that mapping LLVM value in
maps values
> + the old scop to the new generated code, the ValueMaps are only available in
to new ones in the newly generated code. The ValueMaps
> + the code generation region coresponding to a Scop statement, and will be
corresponding statement. They are
> + cleared after the region of the Scop statement is exited.</li>
cleared is left.

> +<p>We need more than one value map because we need to support generate code
support code
generation for unrolled loops.
> + for unrolled loop, and we say each ValueMap correspond to a "symbol space",
. We say each corresponds to a
"symbol space".
> + and you look up the same value in the old scop in difference "symbol space"
If you look up different
"symbol spaces"
> + will get difference result.</p>
you will get different results.

> +</ul>
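
A small model of the "symbol space" idea might make this easier to follow. This is a sketch under my own assumptions, not the data structure in the patch: LocalMaps is a hypothetical class, and values are named by strings instead of llvm::Value pointers.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of the per-statement vector of value maps.
class LocalMaps {
  typedef std::map<std::string, std::string> ValueMap;
  std::vector<ValueMap> Spaces; // one map per symbol space
public:
  // Called when entering a statement region: one map per unrolled iteration.
  void allocate(unsigned N) { Spaces.assign(N, ValueMap()); }

  // Called when leaving the statement region.
  void release() { Spaces.clear(); }

  void insert(const std::string &Old, const std::string &New, unsigned Space) {
    Spaces.at(Space)[Old] = New;
  }

  // Returns the empty string if Old is unmapped in the given symbol space.
  std::string lookup(const std::string &Old, unsigned Space) const {
    const ValueMap &M = Spaces.at(Space);
    ValueMap::const_iterator It = M.find(Old);
    return It == M.end() ? std::string() : It->second;
  }
};
```

Looking up the same old value in two different symbol spaces yields the copy that belongs to the respective unrolled iteration.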
> +<p>Because the clast have some tree structure, we will have nested code
has a tree structure, the code generation
regions are nested.
> + generation regions during the code generation process, and because we will

> + delete the codegen region after we finish it, so the nested codegen regions
> + have a stack structure.</p>

> +<p>During the code generation process, we have only one codegen region stack,
process there is only ... stack
> + and the stack is shared by all passes in the chain. Region pushing and poping
which is shared by all passes in the chain.
> + are managed by the Codegen class internally, and code generation passes do not
is managed ... internally and code ...


> + need to worry about the region management, all their need to do about codegen
. All they need to do is
> + region are do some preparation when entering a region that the pass is
the preparation when entering ...
> + interested in, and do finalization when exiting a region that just entered.
in and the finalization when exiting a region.
> +</p>
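
The stack discipline described here could be illustrated roughly as follows. Region and RegionStack are hypothetical names for this sketch. In the patch the pushing and popping happens inside the Codegen class, and the passes only see the enter and exit hooks.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical region: only remembers which kind of statement it belongs to.
struct Region {
  std::string Kind;
};

// Hypothetical model of the single region stack shared by all passes.
class RegionStack {
  std::vector<Region> Stack;
public:
  std::vector<std::string> Log; // order in which the hooks would run

  void enter(const std::string &Kind) {
    Stack.push_back(Region{Kind});
    Log.push_back("enter " + Kind); // preparation hook would run here
  }

  void exit() {
    assert(!Stack.empty() && "exit without a matching enter");
    Log.push_back("exit " + Stack.back().Kind); // finalization hook
    Stack.pop_back();
  }

  std::size_t depth() const { return Stack.size(); }
};
```

Since the clast is a tree, a user statement nested in a for statement is always left before the enclosing loop region, which is why a plain stack is enough.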
> +<!--=====================================================================-->
> +<h3>Overridable Code Generation Functions in Codegen Class</h3>
> +<!--=====================================================================-->
> +<p>There are several overridable(virtual) functions in Codegen class, you can
> + reimplement them to allow your pass generate code for some insteresting CLAst
to generate code for clast
> + statements or/and expressions.
statements and expressions it is interested in.

> +</p>
> +<!--=====================================================================-->
> +<h4>Functions to Handle CLAst For Statements</h4>
> +<!--=====================================================================-->
> +<p>There are two functions to handle the tasks about code generation of CLAst
the code generation of clast
> + For Statements, you can reimplement them to handle loop-related code
for-statements. You can reimplement them to handle loop-related code
> + generation tasks such as generate the CFG for the loop or unroll the loop.
generating the CFG ...
unrolling the loop.
> +</p>
> +<!--=====================================================================-->
> +<h5>The enterLoop(Reg *, Value *, Value *) function</h5>
> +<!--=====================================================================-->
> +<pre class="code">
> + virtual Loc enterLoop(Reg *L, Value *LB, Value *UB);</pre>
> +<p>The enterLoop function is designed to let you do necessary preparation for
to do the necessary preparation
> + generating code for the region of a loop. It takes the code generation region
> + of the loop, the lower bound of the loop and the upper bound of the loop as
> + arguments, and returns the code generation location of the loop body.
arguments and returns ...
> +</p>
> +<p>What you need to do in this function usually includes:</p>
> +<ul>
> +<li>Generate the CFG of the loop, excluding the back edge.</li>
> +<li>Generate the induction variable of the loop, and insert it to the symble
into
the symbol
> + table of the code generation region.</li>
> +<li>Adjust the entry block and exit block of the code generation region.</li>
> +</ul>
> +<p>Instead of handle all incoming loops, you can just handle the loops that
> + you are insterated in, for example, the corresponding code in VectorizedCodegen
interested in. For example,
> + looks like this:
> +</p>
> +<pre class="code">
> +Codegen::Loc VectorizedCodegen::enterLoop(Reg *L, Value *LB, Value *UB) {
> + // Check if we can to vectorized the loop, and call the next pass in the chain
> + // if we cannot vectorized the loop.
> + if (...If we cannot vectorized the loop...)
> + return Codegen::enterLoop(L, LB, UB);
> +
> + // Tell the code generation framework that we are going to unroll the loop.
The comment is redundant. The code already explains itself.

> + L->UnrollTimes = NumIts;
> +
> + ...Set up the context variables of the pass to remember which loop we are
> + going to unroll...
> +
> + ...Generate the induction variable in every iteration of the loop...
> +
> + // Return the code generation location of the loop body.
> + return Loc(ForStmt->body, Header);
> +}
> +</pre>
> +<!--=====================================================================-->
> +<h5>The exitLoop(Reg *, BasicBlock *) function</h5>
> +<!--=====================================================================-->
> +<pre class="code">
> + virtual Loc exitLoop(Reg *L, BasicBlock *LastBB);</pre>
> +<p>The exitLoop function is designed to let you do necessary finalization
to do the necessary
> + after code generation for the region of a loop is done. It takes the code
> + generation region of the loop, and the last basic block of the loop body as
> + arguments, and returns the code generation location of the CLAst statement that
arguments and returns clast
> + next to the for statement.
is next to the for-statement.


> +</p>
> +<p>Depending on what you had done in the enterLoop function, what you need to
you have done
> + do in this function usually includes:</p>
> +<ul>
> +<li>Generate the back edge, branching from the last BasicBlock of the loop
> + body.</li>
> +<li>Add the incoming value to the induction variable if it is a PHI
> + instruction.</li>
> +<li>Erase the the induction variable in the symbol table.</li>
> +</ul>
> +<p>Note that you should only finaliazed the loop that handle by the enterLoop
finalize
> + function of your pass, otherwise broken code may generated. For example, the
> + corresponding code in VectorizedCodegen looks like this:
> +</p>
> +<pre class="code">
> +Codegen::Loc VectorizedCodegen::exitLoop(Reg *L, BasicBlock *LastBB) {
> + // Did we unroll the loop in this pass?
> + if (...we had never unrolled the loop...)
> + return Codegen::exitLoop(L, LastBB);
> +
> + // Simply forget the symbol of the induction variable.
> + const clast_for *ForStmt = (const clast_for *)L->stmt;
> + forgetSymbol(ForStmt->iterator);
> +
> + ...Clear up the context variables....
> +
> + // If we unrolled loop, just generate code for next clast_stmt on LastBB,
> + // because we have linear CFG.
> + return Loc(L->stmt->next, LastBB);
> +}</pre>
> +<!--=====================================================================-->
> +<h4>Functions to Handle Scop Statements(CLAst User Statements)</h4>
> +<!--=====================================================================-->
> +<p>There are five functions to handle the tasks about code generation of Scop
> + Statements, you can reimplement them to handle code generation tasks related
> + to Scop statements such as copy the code in the original BasicBlock of
copying
> + statement to the new BasicBlock of the statement, or copy and vectorized the
copying and
vectorizing
> + code.
> +</p>
> +<!--=====================================================================-->
> +<h5>The enterScopStmt(Reg *) function</h5>
> +<!--=====================================================================-->
> +<pre class="code">
> + virtual Loc enterScopStmt(Reg *Stmt);</pre>
> +<p>The enterScopStmt function is similar to the enterLoop function, it is
.
It is
> + designed to let you do necessary preparation for generating code for the Scop
to do the necessary
> + statement. It takes the code generation region of the scop statement as
> + argument, and returns the code generation location of the body of the statement,
argument and returns
> + i.e. the location to place the code for the CLAst assign statements.
> +</p>
> +<p>What you need to do in this function is rather simple: allocate enough local
> + maps to map the value in the old BasicBlock of the statement to the value in
> + the new BasicBlock of the statement. For example, the enterScopStmt function
> + in VectorizedCodegen allocate several maps to map the value in the old
allocates
> + BasicBlock to new value in difference iteration of the loop:
to the new values in different iterations of the loop:
> +</p>
> +<pre class="code">
> +Codegen::Loc VectorizedCodegen::enterScopStmt(Reg *Stmt) {
> + // Should us take care of this ScopStmt?
Should we
> + if (...we had never unrolled the loop...)
we never unrolled the loop
> + return Codegen::enterScopStmt(Stmt);
> +
> + // Allocate Scalar map for each iteration and a vector map.
a scalar map
> + Stmt->allocateLocalMaps(getUnrolledTimes() + 1);
> +
> + // Return the code generation location of the substitutions
> + const clast_user_stmt*u = (const clast_user_stmt*)Stmt->stmt;
> + return Loc(u->substitutions, Stmt->Entry);
> +}</pre>
> +<!--=====================================================================-->
> +<h5>The exitScopStmt(Reg *, BasicBlock *) function</h5>
> +<!--=====================================================================-->
> +<pre class="code">
> + virtual Loc exitScopStmt(Reg *Stmt, BasicBlock *LastBB);</pre>
> +<p>The exitScopStmt function is similar to the exitLoop function, it is
> + designed to let you necessary finalization after code generation for
to do the necessary
> + Scop statement is done. It takes the code generation region of the Scop
> + Statement, and the last basic block of the statement as arguments, and returns
Statement and the last basic block of the statement as arguments,
and returns
> + the code generation location of the CLAst statement that next to the Scop
that is next
> + Statement.
> +</p>
> +<p>What you need to do in this function is rather simple also: release the
is also rather simple:
> + maps that allocated in the enterScopStmt function. For exmaple the
> + corresponding code in VectorizedCodegen looks like this:
> +</p>
> +<pre class="code">
> +Codegen::Loc VectorizedCodegen::exitScopStmt(Reg *Stmt, BasicBlock *LastBB) {
> + // Should us take care of this ScopStmt?
> + if (...we had never unrolled the loop...)
we never unrolled the loop
> + return Codegen::exitScopStmt(Stmt, LastBB);
> +
> + // Release the local maps allocated for ScopStmt.
> + Stmt->freeLocalMaps();
> +
> + // Return the code generation location of next statement.
> + return Loc(Stmt->stmt->next, LastBB);
> +}</pre>
> +<!--=====================================================================-->
> +<h5>The initIVSubstCodegen(unsigned) function</h5>
> +<!--=====================================================================-->
> +<pre class="code">
> + virtual void initIVSubstCodegen(unsigned Iteration);</pre>
> +<p>This function is called before the code generation framework generate

generates code
> + code for the substitution of the CLAst user statement(Scop statement) at a
clast user statements
> + certain iteration of the loop. The current iteration number of the loop is
> + passed in as function argument.
> +</p>
> +<p>Your do not need to reimplement this function if you are not unrolling the
> + loop or doing something similar, otherwise you can change the value induction
. Otherwise, you can
> + variable in the symbol map of the code generation region to the right one in
> + this function. For example, in VectorizedCodegen, we do it like this:</p>
For example in Vector..
> +<pre class="code">
> +void VectorizedCodegen::initIVSubstCodegen(unsigned Iteration) {
> + // Have we touch this loop?
> + if (...we had never unrolled the loop...) return;
> +
> + // Map the right value to the induction variable at a certain iteration.
> + insertSymbol(UnrolledLoop->iterator, UnrolledIVs[Iteration]);
> +}</pre>
> +<!--=====================================================================-->
> +<h5>The substIVs(ScopStmt *, ArrayRef<Value*> , unsigned) function</h5>
> +<!--=====================================================================-->
> +<pre class="code">
> + virtual void substIVs(ScopStmt*S, ArrayRef<Value*> NewIVs, unsigned Iteration);</pre>
> +<p>This function is called after all value to substitute the induction
values that substitute
> + variables in the old BasicBlock of the scop statement at a certain iteration
> + of the loop generated. The function takes the current scop statement, the
are generated.
> + array containing the generated value to substitute induction variables and the
> + current iteration number of the loop as arguments.
> +</p>
> +<p>Your also do not need to reimplement this function if you are not unrolling
> + the loop, because the Sequential code generation pass do the job for you in
because, in this case, the sequential code generation
pass does the job for you.
> + this case. Otherwise, you can simply insert the value to value mapping into
> + the right local map(This implies your need to allocate enough local maps in
> + the exitScopStmt function). You can also have a look at this function in the
> + VectorizedCodegen for example:
> +</p>
> +<pre class="code">
> +void VectorizedCodegen::substIVs(ScopStmt*S, ArrayRef<Value*> NewIVs,
> + unsigned Iteration) {
> + // Have we touch this loop?
> + if (...we had never unrolled the loop...)
> + return Codegen::substIVs(S, NewIVs, Iteration);
> +
> + for (unsigned i = 0, e = NewIVs.size(); i != e; ++i) {
> + const PHINode *PN = S->getInductionVariableForDimension(i);
> + assert(PN->getNumOperands() != 2&& "Broken IV found!");
> + // Create the IV mapping in the local map of the current iteration number.
> + insertLocalValue(PN, NewIVs[i], Iteration);
> + }
> +}</pre>
> +<!--=====================================================================-->
> +<h5>The copyInstTree(ScopStmt *, const Instruction&, BasicBlock *) function</h5>
> +<!--=====================================================================-->
> +<pre class="code">
> + virtual Instruction *copyInstTree(ScopStmt *S, const Instruction&I,
> + BasicBlock *NewBB);</pre>
> +<p>The code generation call this function to copy the computation in the old
calls
> + BasicBlock of the Scop statement to the new BasicBlock. This function takes
> + the current Scop statement, the instruction in the old BasicBlock and the
> + BasicBlock to place the copied computation as arguments.
> +</p>
> +<p>Note that only the instruction that accessed memory or has side effect
only instructions that access memory or that have side
effects
> + (in this case the instruction will also be treated as if it accessed memory.)
> + will passed in, so you should not only copy the incoming instruction to the
. You should not only copy
> + new BasicBlock, but also copy all instruction contribute to the computation of
instructions contributing
> + the incomming instruction. Besides simply copy the instruction to the new
> + BasicBlock which is already implemented in the copyInstTree function in the
> + SequentialCodegen pass, you can also do something else, for example the
> + copyInstTree function in VectorizedCodegen vectorize the instructions.
> +</p>
> +<!--=====================================================================-->
> +<h4>Functions to Generate code for CLAst expression</h4>
> +<!--=====================================================================-->
> +<p>You can also generate your own code for some CLAst expression by
> + reimplement the functions listed below.
reimplementing
> +</p>
> +<!--=====================================================================-->
> +<h5>The codegen(enum clast_red_type, ArrayRef<Value*>, BasicBlock *) function</h5>
> +<!--=====================================================================-->
> +<pre class="code">
> + virtual Value*codegen(enum clast_red_type red_ty, ArrayRef<Value*> SubExprs,
> + BasicBlock *BB);</pre>
> +<p>This function is called when the code generation framework try to generate
framework generates
> + code for the CLAst reduction expression, the function takes the type of the
. The function
> + redution, the subexpressions of the reduction and the BasicBlock to place the
reduction,
> + generated code as arguments, and return the result value of the reduction.
. It returns the result value of the
reduction.
> +</p>
> +<p>You can reimplement this function if your platform have special support for
> + the reduce operation.
> +</p>
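
For readers who do not know the clast reduction kinds: as far as I know CLooG distinguishes sum, min and max reductions. The following sketch shows the fold such a hook has to express, using plain integers instead of LLVM values. RedType and reduce are hypothetical names for this illustration, not the function in the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical mirror of the three reduction kinds.
enum RedType { RedSum, RedMin, RedMax };

// Folds the subexpressions left to right, as sequential code would.
inline long reduce(RedType Ty, const std::vector<long> &SubExprs) {
  assert(!SubExprs.empty() && "a reduction needs at least one operand");
  long Result = SubExprs.front();
  for (std::size_t I = 1; I < SubExprs.size(); ++I) {
    switch (Ty) {
    case RedSum:
      Result += SubExprs[I];
      break;
    case RedMin:
      Result = std::min(Result, SubExprs[I]);
      break;
    case RedMax:
      Result = std::max(Result, SubExprs[I]);
      break;
    }
  }
  return Result;
}
```

A platform with a native reduce operation could replace the loop with a single instruction, which seems to be the point of making this function overridable.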
> +<!--=====================================================================-->
> +<h2>Other details about the Code Generation Framework</h2>
> +<!--=====================================================================-->
> +<p>This document do not cover all detials about the code generation
This document does not cover all details
> + framework of polly, you can read the code or ask on the list to get other
Polly. You
> + detials.
details.
> +</p>
> +</div>
> +</body>
> +</html>
>

Tobias Grosser

May 9, 2011, 3:26:21 PM5/9/11
to ether zhhb, poll...@googlegroups.com, polly-commits
On 05/09/2011 04:32 PM, ether zhhb wrote:
> hi tobi,
>
>>> it do not introduce any regressions at the moment
>>
>> Did you test it on more than 'make polly-test'? Did you run it on polybench?
>> The llvm test-suite? I really would appreciate to have wider testing.
> not yet, but i will do this later.
OK. No need for you to do all the work.

I am fine committing the patches not fully tested. As they are
independent it is better to have them early in the version control
system and fix existing bugs as we go. Otherwise they become too big to
review them conveniently. If your code is used by a wider audience it
will be tested automatically more thoroughly.

>> Another possibility to increase the speed of integrating your new codegen
>> framework is to add a new directory + a PollyCodegen library, that is
>> available in parallel to the existing framework. Like this we have both
> Yes, i am also going to do this, do you want i add a separated patch
> to move them or edit the previous patches?

I believe it is best to edit the previous patches. You can do this as
you go and submit one patch after the other for review. I believe we
mainly need to fix bugs, typos and style issues. The only possible
architectural problem I see is the use of analysis groups.

As stated earlier we should understand if your use of analysis groups is
OK. I will dig a little bit deeper into it, but believe it should be OK.

>> available and can easily compare them, work on bugs ...
> Yes, we can use the code generated by the light weight codegen pass as
> somekind of reference answer.
> And the user do not need to switch to the new untested codegen
> framework immediately, they can use the old well tested codegen pass.
>> After the automatic tester are available, and some more tests were run on
>> polybench and the the llvm test-suite we may switch over.
> Sounds great, so how is the status of the automatic tester?

Andreas and Raghesh planned to work on this. I do not know about recent
progress. (Andreas/Raghesh: Can we get an update?)

I tried to set up a buildbot earlier and the main problem was that the
buildbot did not support two different version control systems. As we
moved to LLVM svn, this problem disappeared. We now only need to adapt
the builder scripts in zorg/buildbot/builders. Then we need to set up the
buildbot on one of our machines (Andreas has a machine for this in
Passau) and ask to add it to the llvm buildbots.

Cheers
Tobi

ether zhhb

May 10, 2011, 12:36:29 PM5/10/11
to Tobias Grosser, polly-...@googlegroups.com, poll...@googlegroups.com
hi tobi,

thanks for fixing these for me.

>>
>> +  reimplement those functions by taking the existing code in the
>> framework for
>
>     reimplement these ?? I do not get the end of this sentence ?

I mean overriding these virtual functions.

>> +  LLVM Analysis Group framework. You can just copy the following code:
>
>                                            copy the following code:

Do you mean "LLVM Analysis Group framework. Copy the following code:"?

>>
>> +</ul>
>> +<p>To ensure a code generation location can always possible to reaches
>> all
>
> I do not get this sentence.

It means: "To ensure all code generation locations can always be
handled by all passes in the chain"
>

>
>            In case you call recursively functions to handle the next code
> generation location, instead of returning the next location to the driver,
> the location will not be able to reach the passes earlier in the pass chain.

So if the above sentence is changed, I think this should also be changed:
the location will not be handled by the passes earlier ....


> This is probably not what you want.
>

The differences between the up-to-date and the original
codegenframework.html are pasted below:
diff --git a/www/codegenframework.html b/www/codegenframework.html
index 991513e..4adf107 100755
--- a/www/codegenframework.html
+++ b/www/codegenframework.html
@@ -18,11 +18,11 @@
<h2>Introduction</h2>
<!--=====================================================================-->

- <p>The Code Generation Framework is a part of Polly that translate
- <a href="http://www.bastoul.net/cloog/manual.php#CLooG-Output">CLAst</a> back
- to <a href="http://llvm.org/docs/LangRef.html">LLVM IR</a>, and it provides a
- modular approach to adding code generation support for various
- platforms.
+ <p>The Code Generation Framework is a part of Polly that translate the
+ <a href="http://www.bastoul.net/cloog/manual.php#CLooG-Output">CLooG AST</a>
+ (clast) back to <a href="http://llvm.org/docs/LangRef.html">LLVM-IR</a>. It
+ provides a platform to implement code generation for different platforms in a
+ modular way.
</p>

<p>The framework can be divided in to two parts: The code generation chain and

@@ -34,46 +34,45 @@
<!--=====================================================================-->


<h3>Code Generation Driver</h3>

<!--=====================================================================-->
- <p>The Code Generation Driver is a ScopPass, it will get the CLAst from the
- CloogInfo pass and call the main code generation function of the Code
- Generation Chain. The main code generation function will iterate over the
- CLAst, and generate code for the clast statement/expression.
- Note that only non-trivial CLAst statements/expressions(e.g. reduction
- expression, for statement and user statement) will be sent to the chain and
- handled by the passes in the chain, while trivial CLAst statements/expressions
- will be handled immediately without passed to the chain.
+ <p>The Code Generation Driver is a ScopPass. It gets the clast from the
+ CloogInfo pass and calls the main code generation function of the Code
+ Generation Chain. The main code generation function iterates over the clast
+ and generates code for each clast statement/expression. Note that only
+ non-trivial clast statements/expressions (e.g. reduction expressions, for
+ statements and user statements) are sent to the chain and handled by the passes
+ in the chain, while trivial clast statements/expressions are handled
+ immediately without being passed to the chain.
</p>
<!--=====================================================================-->


<h3>Code Generation Chain</h3>

<!--=====================================================================-->
- <p>The Code Generation Chain consist of a set of Passes that also inherited
- from the "Codegen" class, their are chained together by register them as a
- member of the "Codegen"
+ <p>The Code Generation Chain consists of a set of passes that also inherit
+ from the "CodeGen" class. They are chained together by registering them as
+ members of the "CodeGen"
<a href="http://llvm.org/docs/WritingAnLLVMPass.html#analysisgroup">
- Analysis Group</a>, and when we say "Polly Code Generation Pass", it means
- "A Pass in the Code Generation Chain of Polly".
+ Analysis Group</a>. The expression "Polly Code Generation Pass" means a pass
+ in the code generation chain of Polly.
</p>


<p>The code generation request will pass through chain until some pass

- in the chain respond to the request, generate code and stop passing the
- request to the next pass in the chain. If the code generation request reach
- the end of the chain - the "SequentialCodegen" pass, sequential code will be
- generated for the request. This means passes apart from SequentialCodegen pass
- in the chain do not need to respond to ALL code generation request, instead,
- their only need to respond to the code generation request that their are
+ in the chain responds to the request, generates code and stops passing the
+ request to the next pass in the chain. If the code generation request reaches
+ the end of the chain, the "SequentialCodeGen" pass, sequential code will be
+ generated for the request. This means passes apart from SequentialCodeGen pass
+ in the chain do not need to respond to ALL code generation requests. Instead,
+ they only need to respond to the code generation requests that they are

interested in. For example, the vector code generation pass only generates

- code for inner most loop and the ScopStmt inside the loop, the rest code generation
- request will be ignored and passed to the next pass in the chain.
+ code for innermost loops and the ScopStmts inside these loops. The remaining
+ code generation requests are ignored and passed to the next pass in the chain.
</p>

<!--=====================================================================-->


<h2>Writing a Code Generation Pass for Polly</h2>

<!--=====================================================================-->
- <p>We will disussed how to write a code generation pass in this section.
- We are going to start by listing the basic steps to write a code generation
- pass, and then disussed important data sturctures and function of the Codegen
- Class, at last we will show you how to use those data structures and how to
- reimplement those functions by taking the existing code in the framework for
- Example.
+ <p>We discuss how to write a code generation pass in this section. We start by
+ listing the basic steps to write a code generation pass, and then discuss
+ important data structures and functions of the CodeGen class. Finally, we show
+ how to use those data structures and how to override those functions, taking
+ the existing code in the framework as an example.
</p>
<!--=====================================================================-->
<h3>Basic Steps</h3>
@@ -83,44 +82,44 @@


<a href="http://llvm.org/docs/WritingAnLLVMPass.html">how to write a LLVM

Pass</a> and

- Groups</a>, then you can do it by follow these steps:
+ Groups</a>, then you can do it by following these steps:
<ol>
- <li>Add a c++ source file with the name of your code generation pass under
- lib/CodeGen/ directory, and add the file to the CMakelists.txt's source list.
+ <li>Add a C++ source file with the name of your code generation pass in the
+ lib/CodeGen/ directory and add the file to the source list in CMakeLists.txt.
</li>
- <li>Include "Codegen.h" and other necessary header files in you c++ file.
+ <li>Include "CodeGen.h" and other necessary header files in your C++ file.
</li>
- <li>Define a class that inherited from the "Codegen" class and the "ScopPass"
- class, and register you pass as an member of the "Codegen" Analysis Group. For
- example, the VectorizedCodegen pass is regestered like this:
+ <li>Define a class that inherits from the "CodeGen" class and the "ScopPass"
+ class and register your pass as a member of the "CodeGen" Analysis Group. For
+ example, the VectorizedCodeGen pass is registered as follows:
<pre class="code">
-static RegisterPass<VectorizedCodegen>
+static RegisterPass<VectorizedCodeGen>
D("polly-vectorized-codegen",
- "Polly - Vectorized Codegen implementation that generate vectorized code");
+ "Polly - Vectorized CodeGen implementation that generates vectorized code");

-static RegisterAnalysisGroup<Codegen> E(D);</pre>
+static RegisterAnalysisGroup<CodeGen> E(D);</pre>
</li>
- <li>Reimplement the "runOnScop" function of ScopPass, and call the "initializeChain"
- function there to set up the Code Generation Chain correctly like this:
+ <li>Override the "runOnScop" function of ScopPass and call the
+ "initializeChain" function there to set up the Code Generation Chain:
<pre class="code">
bool runOnScop(Scop &S) {
initializeChain(this);</pre>
</li>


<li> Implement the "getAdjustedAnalysisPointer" so your pass can work with the

- LLVM Analysis Group framework. You can just copy the following code:
+ LLVM Analysis Group framework. Copy the following code:
<pre class="code">


/// getAdjustedAnalysisPointer - This method is used when a pass implements

/// an analysis interface through multiple inheritance. If needed, it

/// should override this to adjust the this pointer as needed for the

/// specified pass info.
void *getAdjustedAnalysisPointer(const void *ID) {
- if (ID == &Codegen::ID)
- return static_cast&lt;Codegen*&gt;(this);
+ if (ID == &CodeGen::ID)
+ return static_cast&lt;CodeGen*&gt;(this);
return this;
}</pre>
</li>
- <li>Reimplement the virtual functions of "Codegen" class to generate code for
- the CLAst statements or/and expressions that you are interested in.
+ <li>Override the virtual functions of the "CodeGen" class to generate code
+ for the clast statements and/or expressions you are interested in.
</li>
</ol>
<p>
@@ -128,113 +127,112 @@ static RegisterAnalysisGroup<Codegen> E(D);</pre>
<!--=====================================================================-->


<h3>Important Data Structures in the Code Generation Framework</h3>

<!--=====================================================================-->
- <p>Passes in the code generation chain are communicate with each others via
- two kinds of data structure: <tt>Code Generation Location</tt> and <tt>Code
- Generation Region</tt>. Several overridable code generation functions of the
- Codegen class take a <tt>Code Generation Region</tt> as argument and returns
+ <p>Passes in the code generation chain communicate with each other via two
+ data structures: <tt>Code Generation Location</tt> and <tt>Code Generation
+ Region</tt>. Several overridable code generation functions of the
+ CodeGen class take a <tt>Code Generation Region</tt> as argument and return


a <tt>Code Generation Location</tt>.

</p>
<!--=====================================================================-->


<h4>Code Generation Location(The Loc struct)</h4>

<!--=====================================================================-->
- <p>A code generation location provides some information including:</p>
+ <p>A code generation location, or code location for short, provides some
+ information including:</p>
<ul>
- <li>What kind of clast statement should the codegen passes to generate code
- for, and</li>
- <li>Where should the codegen passes place the new generated code.</li>
+ <li>For what kind of clast statement should code be generated? and</li>
+ <li>Where should the newly generated code be placed?</li>
</ul>
- <p>To ensure a code generation location can always possible to reaches all
- passes in the chain, when a code generation pass finishes generate code for
- the current code generation location, they should return the next location to
- driver. Then the driver will send the next location to the first pass in the
- chain. So instead of return the next location to the driver, if you call
- functions to handle next code generation location recursively, the location
- will not able to reaches the passes before you pass in the chain, and this is
- probably not the way you want.
+ <p>To ensure that every code location can be handled by all passes in the
+ chain, a code generation pass that has finished generating code for the
+ current code location should return the next location to the driver.
+ The driver then sends the next location to the first pass in the chain. If
+ you instead call functions recursively to handle the next code generation
+ location, the location will not be able to reach the passes earlier in the
+ chain. This is probably not what you want.
</p>
<!--=====================================================================-->


<h4>Code Generation Region(The Reg struct)</h4>

<!--=====================================================================-->
- <p>A code generation region carries information about a region in the new
- generated code, including:
+ <p>A code generation region, or code region for short, carries information
+ about a region in the newly generated code:
</p>
<ul>
- <li>The clast statement correspond to this codegen region,</li>
- <li>The entry BasicBlock and the exiting BasicBlock of this codegen region,
- this is useful when we generate code for some clast statements will lead to
- nonlinear CFG such as clast_for, clast_guard.</li>
- <li>A symbol table that mapping clast names to LLVM Value, the symbol table is
- shared by all codegen regions.</li>
- <li>A parameter table that mapping parameter in the original scop to new
- parameter in the new generated code, the parameter table is also shared by all
- codegen regions.</li>
+ <li>The clast statement corresponding to this code region.</li>
+ <li>The entry BasicBlock and the exiting BasicBlock of this code region.
+ This is useful when we generate code for clast statements which lead to
+ non-linear CFG such as clast_for, clast_guard.</li>
+ <li>A symbol table that maps clast names to LLVM Values. The symbol table is
+ shared by all code regions.</li>
+ <li>A parameter table that maps parameters in the original scop to new
+ parameters in the newly generated code. The parameter table is shared by all
+ code regions.</li>


<p>Parameter map is designed to help the code generation pass to avoid cross

function references when generate code in a function that is different from

the parent function of the original scop.

</p>
- <li>A Scop statement level vector of ValueMaps that mapping LLVM value in
- the old scop to the new generated code, the ValueMaps are only available in
- the code generation region coresponding to a Scop statement, and will be
- cleared after the region of the Scop statement is exited.</li>
- <p>We need more than one value map because we need to support generate code
- for unrolled loop, and we say each ValueMap correspond to a "symbol space",
- and you look up the same value in the old scop in difference "symbol space"
- will get difference result.</p>
+ <li>A Scop statement level vector of ValueMaps that maps LLVM Values in
+ the old scop to new ones in the newly generated code. The ValueMaps are only
+ available in the code region corresponding to a Scop statement. They are
+ cleared after the code region of the Scop statement is left.</li>
+ <p>We need more than one value map because we need to support code generation
+ for unrolled loops. We say each ValueMap corresponds to a "symbol space". If you
+ look up the same value in the old scop in different "symbol spaces", you will
+ get different results.</p>
</ul>
- <p>Because the clast have some tree structure, we will have nested code
- generation regions during the code generation process, and because we will
- delete the codegen region after we finish it, so the nested codegen regions
- have a stack structure.</p>
- <p>During the code generation process, we have only one codegen region stack,
- and the stack is shared by all passes in the chain. Region pushing and poping
- are managed by the Codegen class internally, and code generation passes do not
- need to worry about the region management, all their need to do about codegen
- region are do some preparation when entering a region that the pass is
- interested in, and do finalization when exiting a region that just entered.
- </p>
+ <p>Because the clast has a tree structure, the code generation regions are
+ nested during the code generation process. Because we delete a code region
+ after we finish it, the nested code regions can be managed by a stack.</p>
+ <p>During the code generation process there is only one code region stack, which
+ is shared by all passes in the chain. Region pushing and popping is managed by
+ the CodeGen class internally and code generation passes do not need to worry
+ about the region management. All they need to do is the preparation when
+ entering a region that the pass is interested in and the finalization when
+ exiting that region.</p>
<!--=====================================================================-->
- <h3>Overridable Code Generation Functions in Codegen Class</h3>
+ <h3>Overridable Code Generation Functions in CodeGen Class</h3>
<!--=====================================================================-->
- <p>There are several overridable(virtual) functions in Codegen class, you can
- reimplement them to allow your pass generate code for some insteresting CLAst
- statements or/and expressions.
+ <p>There are several overridable (virtual) functions in the CodeGen class. You
+ can override them to allow your pass to generate code for the clast statements
+ and expressions it is interested in.
</p>
<!--=====================================================================-->
- <h4>Functions to Handle CLAst For Statements</h4>
+ <h4>Functions to Handle clast For Statements</h4>
<!--=====================================================================-->
- <p>There are two functions to handle the tasks about code generation of CLAst
- For Statements, you can reimplement them to handle loop-related code
- generation tasks such as generate the CFG for the loop or unroll the loop.
+ <p>There are two functions to handle the tasks about the code generation of
+ clast for-statements. You can override them to handle loop-related code
+ generation tasks such as generating the CFG for the loop or unrolling the
+ loop.
</p>
<!--=====================================================================-->


<h5>The enterLoop(Reg *, Value *, Value *) function</h5>

<!--=====================================================================-->
<pre class="code">


virtual Loc enterLoop(Reg *L, Value *LB, Value *UB);</pre>

- <p>The enterLoop function is designed to let you do necessary preparation for
- generating code for the region of a loop. It takes the code generation region
- of the loop, the lower bound of the loop and the upper bound of the loop as
- arguments, and returns the code generation location of the loop body.
+ <p>The enterLoop function is designed to let you do the necessary
+ preparation for generating code for the region of a loop. It takes the code
+ region of the loop, the lower bound of the loop and the upper bound of the
+ loop as arguments and returns the code location of the loop body.
</p>


<p>What you need to do in this function usually includes:</p>

<ul>


<li>Generate the CFG of the loop, excluding the back edge.</li>

- <li>Generate the induction variable of the loop, and insert it to the symble
- table of the code generation region.</li>
- <li>Adjust the entry block and exit block of the code generation region.</li>
+ <li>Generate the induction variable of the loop, and insert it into the symbol
+ table of the code region.</li>
+ <li>Adjust the entry block and exit block of the code region.</li>
</ul>


<p>Instead of handle all incoming loops, you can just handle the loops that

- you are insterated in, for example, the corresponding code in VectorizedCodegen
+ you are interested in. For example, the corresponding code in VectorizedCodeGen
looks like this:
</p>
<pre class="code">
-Codegen::Loc VectorizedCodegen::enterLoop(Reg *L, Value *LB, Value *UB) {
+CodeGen::Loc VectorizedCodeGen::enterLoop(Reg *L, Value *LB, Value *UB) {


// Check if we can to vectorized the loop, and call the next pass in the chain

// if we cannot vectorized the loop.

if (...If we cannot vectorized the loop...)

- return Codegen::enterLoop(L, LB, UB);
+ return CodeGen::enterLoop(L, LB, UB);

- // Tell the code generation framework that we are going to unroll the loop.
L->UnrollTimes = NumIts;

...Set up the context variables of the pass to remember which loop we are

@@ -242,7 +240,7 @@ Codegen::Loc VectorizedCodegen::enterLoop(Reg *L, Value *LB, Value *UB) {

...Generate the induction variable in every iteration of the loop...

- // Return the code generation location of the loop body.
+ // Return the code location of the loop body.
return Loc(ForStmt->body, Header);
}
</pre>
@@ -251,13 +249,13 @@ Codegen::Loc VectorizedCodegen::enterLoop(Reg *L, Value *LB, Value *UB) {

<!--=====================================================================-->
<pre class="code">


virtual Loc exitLoop(Reg *L, BasicBlock *LastBB);</pre>

- <p>The exitLoop function is designed to let you do necessary finalization
- after code generation for the region of a loop is done. It takes the code
- generation region of the loop, and the last basic block of the loop body as
- arguments, and returns the code generation location of the CLAst statement that
+ <p>The exitLoop function is designed to let you do the necessary
+ finalization after code generation for the region of a loop is done. It takes
+ the code generation region of the loop, and the last basic block of the loop
+ body as arguments and returns the code location of the clast statement that is


next to the for statement.

</p>
- <p>Depending on what you had done in the enterLoop function, what you need to
+ <p>Depending on what you have done in the enterLoop function, what you need to


do in this function usually includes:</p>

<ul>


<li>Generate the back edge, branching from the last BasicBlock of the loop

@@ -266,15 +264,15 @@ Codegen::Loc VectorizedCodegen::enterLoop(Reg *L, Value *LB, Value *UB) {

instruction.</li>


<li>Erase the the induction variable in the symbol table.</li>

</ul>
- <p>Note that you should only finaliazed the loop that handle by the enterLoop
+ <p>Note that you should only finalize loops that were handled by the enterLoop


function of your pass, otherwise broken code may generated. For example, the

- corresponding code in VectorizedCodegen looks like this:
+ corresponding code in VectorizedCodeGen looks like this:
</p>
<pre class="code">
-Codegen::Loc VectorizedCodegen::exitLoop(Reg *L, BasicBlock *LastBB) {
+CodeGen::Loc VectorizedCodeGen::exitLoop(Reg *L, BasicBlock *LastBB) {


// Did we unroll the loop in this pass?

- if (...we had never unrolled the loop...)
- return Codegen::exitLoop(L, LastBB);
+ if (...we never unrolled the loop...)
+ return CodeGen::exitLoop(L, LastBB);

// Simply forget the symbol of the induction variable.

const clast_for *ForStmt = (const clast_for *)L->stmt;

@@ -287,41 +285,41 @@ Codegen::Loc VectorizedCodegen::exitLoop(Reg *L, BasicBlock *LastBB) {


return Loc(L->stmt->next, LastBB);

}</pre>
<!--=====================================================================-->
- <h4>Functions to Handle Scop Statements(CLAst User Statements)</h4>
+ <h4>Functions to Handle Scop Statements(clast User Statements)</h4>
<!--=====================================================================-->


<p>There are five functions to handle the tasks about code generation of Scop

- Statements, you can reimplement them to handle code generation tasks related
- to Scop statements such as copy the code in the original BasicBlock of
- statement to the new BasicBlock of the statement, or copy and vectorized the
- code.
+ Statements. You can override them to handle code generation tasks related to
+ Scop statements such as copying the code in the original BasicBlock of
+ statement to the new BasicBlock of the statement, or copying and vectorizing
+ the code.
</p>
<!--=====================================================================-->


<h5>The enterScopStmt(Reg *) function</h5>

<!--=====================================================================-->
<pre class="code">


virtual Loc enterScopStmt(Reg *Stmt);</pre>

- <p>The enterScopStmt function is similar to the enterLoop function, it is
- designed to let you do necessary preparation for generating code for the Scop
- statement. It takes the code generation region of the scop statement as
- argument, and returns the code generation location of the body of the statement,
- i.e. the location to place the code for the CLAst assign statements.
- </p>
- <p>What you need to do in this function is rather simple: allocate enough local
- maps to map the value in the old BasicBlock of the statement to the value in
- the new BasicBlock of the statement. For example, the enterScopStmt function
- in VectorizedCodegen allocate several maps to map the value in the old
- BasicBlock to new value in difference iteration of the loop:
+ <p>The enterScopStmt function is similar to the enterLoop function. It is
+ designed to let you do the necessary preparation for generating code for
+ the Scop statement. It takes the code region of the scop statement as argument
+ and returns the code location of the body of the statement, i.e. the location
+ to place the code for the clast assign statements.
+ </p>
+ <p>What you need to do in this function is rather simple: allocate enough
+ local maps to map the values in the old BasicBlock of the statement to the values
+ in the new BasicBlock of the statement. For example, the enterScopStmt function
+ in VectorizedCodeGen allocates several maps to map the values in the old
+ BasicBlock to the new values in different iterations of the loop:
</p>
<pre class="code">
-Codegen::Loc VectorizedCodegen::enterScopStmt(Reg *Stmt) {
- // Should us take care of this ScopStmt?
- if (...we had never unrolled the loop...)
- return Codegen::enterScopStmt(Stmt);
+CodeGen::Loc VectorizedCodeGen::enterScopStmt(Reg *Stmt) {
+ // Should we take care of this ScopStmt?
+ if (...we never unrolled the loop...)
+ return CodeGen::enterScopStmt(Stmt);

- // Allocate Scalar map for each iteration and a vector map.
+ // Allocate a scalar map for each iteration and a vector map.
Stmt->allocateLocalMaps(getUnrolledTimes() + 1);

- // Return the code generation location of the substitutions
+ // Return the code location of the substitutions
const clast_user_stmt *u = (const clast_user_stmt*)Stmt->stmt;


return Loc(u->substitutions, Stmt->Entry);

}</pre>
@@ -331,46 +329,45 @@ Codegen::Loc VectorizedCodegen::enterScopStmt(Reg *Stmt) {
<pre class="code">


virtual Loc exitScopStmt(Reg *Stmt, BasicBlock *LastBB);</pre>

<p>The exitScopStmt function is similar to the exitLoop function, it is

- designed to let you necessary finalization after code generation for the
- Scop statement is done. It takes the code generation region of the Scop
- Statement, and the last basic block of the statement as arguments, and returns
- the code generation location of the CLAst statement that next to the Scop
- Statement.
+ designed to let you do the necessary finalization after code generation for
+ the Scop statement is done. It takes the code region of the Scop Statement and
+ the last basic block of the statement as arguments, and returns the code
+ location of the clast statement that is next to the Scop Statement.
</p>
- <p>What you need to do in this function is rather simple also: release the
+ <p>What you need to do in this function is also rather simple: release the


maps that allocated in the enterScopStmt function. For exmaple the

- corresponding code in VectorizedCodegen looks like this:
+ corresponding code in VectorizedCodeGen looks like this:
</p>
<pre class="code">
-Codegen::Loc VectorizedCodegen::exitScopStmt(Reg *Stmt, BasicBlock *LastBB) {
- // Should us take care of this ScopStmt?
- if (...we had never unrolled the loop...)
- return Codegen::exitScopStmt(Stmt, LastBB);
+CodeGen::Loc VectorizedCodeGen::exitScopStmt(Reg *Stmt, BasicBlock *LastBB) {
+ // Should we take care of this ScopStmt?
+ if (...we never unrolled the loop...)
+ return CodeGen::exitScopStmt(Stmt, LastBB);

// Release the local maps allocated for ScopStmt.

Stmt->freeLocalMaps();

- // Return the code generation location of next statement.
+ // Return the code location of next statement.


return Loc(Stmt->stmt->next, LastBB);

}</pre>
<!--=====================================================================-->
- <h5>The initIVSubstCodegen(unsigned) function</h5>
+ <h5>The initIVSubstCodeGen(unsigned) function</h5>
<!--=====================================================================-->
<pre class="code">
- virtual void initIVSubstCodegen(unsigned Iteration);</pre>
- <p>This function is called before the code generation framework generate
- code for the substitution of the CLAst user statement(Scop statement) at a
- certain iteration of the loop. The current iteration number of the loop is
- passed in as function argument.
- </p>
- <p>Your do not need to reimplement this function if you are not unrolling the
- loop or doing something similar, otherwise you can change the value induction
- variable in the symbol map of the code generation region to the right one in
- this function. For example, in VectorizedCodegen, we do it like this:</p>
+ virtual void initIVSubstCodeGen(unsigned Iteration);</pre>


+ <p>This function is called before the code generation framework generates code
+ for the substitution of the clast user statements (Scop statements) at a certain
+ iteration of the loop. The current iteration number of the loop is passed in
+ as function argument.
+ </p>
+ <p>You do not need to override this function if you are not unrolling the loop
+ or doing something similar. Otherwise, you can change the value of the induction
+ variable in the symbol map of the code region to the right one in this function.
+ For example, in VectorizedCodeGen, we do it like this:</p>
<pre class="code">
-void VectorizedCodegen::initIVSubstCodegen(unsigned Iteration) {
+void VectorizedCodeGen::initIVSubstCodeGen(unsigned Iteration) {


// Have we touch this loop?

- if (...we had never unrolled the loop...) return;
+ if (...we never unrolled the loop...) return;

// Map the right value to the induction variable at a certain iteration.

insertSymbol(UnrolledLoop->iterator, UnrolledIVs[Iteration]);

@@ -380,25 +377,25 @@ void VectorizedCodegen::initIVSubstCodegen(unsigned Iteration) {
<!--=====================================================================-->
<pre class="code">
virtual void substIVs(ScopStmt *S, ArrayRef<Value*> NewIVs,
unsigned Iteration);</pre>
- <p>This function is called after all value to substitute the induction
+ <p>This function is called after all values that substitute the induction


variables in the old BasicBlock of the scop statement at a certain iteration

- of the loop generated. The function takes the current scop statement, the
+ of the loop are generated. The function takes the current scop statement, the


array containing the generated value to substitute induction variables and the

current iteration number of the loop as arguments.

</p>
- <p>Your also do not need to reimplement this function if you are not unrolling
- the loop, because the Sequential code generation pass do the job for you in
- this case. Otherwise, you can simply insert the value to value mapping into
+ <p>You also do not need to override this function if you are not unrolling
+ the loop, because, in this case, the sequential code generation pass does the
+ job for you. Otherwise, you can simply insert the value to value mapping into


the right local map(This implies your need to allocate enough local maps in

the exitScopStmt function). You can also have a look at this function in the

- VectorizedCodegen for example:
+ VectorizedCodeGen for example:
</p>
<pre class="code">
-void VectorizedCodegen::substIVs(ScopStmt *S, ArrayRef<Value*> NewIVs,
+void VectorizedCodeGen::substIVs(ScopStmt *S, ArrayRef<Value*> NewIVs,
unsigned Iteration) {


// Have we touch this loop?

- if (...we had never unrolled the loop...)
- return Codegen::substIVs(S, NewIVs, Iteration);
+ if (...we never unrolled the loop...)
+ return CodeGen::substIVs(S, NewIVs, Iteration);

for (unsigned i = 0, e = NewIVs.size(); i != e; ++i) {

const PHINode *PN = S->getInductionVariableForDimension(i);

@@ -413,25 +410,25 @@ void VectorizedCodegen::substIVs(ScopStmt *S, ArrayRef<Value*> NewIVs,
<pre class="code">


virtual Instruction *copyInstTree(ScopStmt *S, const Instruction &I,

BasicBlock *NewBB);</pre>
- <p>The code generation call this function to copy the computation in the old
- BasicBlock of the Scop statement to the new BasicBlock. This function takes
- the current Scop statement, the instruction in the old BasicBlock and the
- BasicBlock to place the copied computation as arguments.
+ <p>The code generation framework calls this function to copy the computation
+ in the old BasicBlock of the Scop statement to the new BasicBlock. This
+ function takes the current Scop statement, the instruction in the old
+ BasicBlock and the BasicBlock to place the copied computation as arguments.
</p>
- <p>Note that only the instruction that accessed memory or has side effect
+ <p>Note that only instructions that access memory or that have side effects


(in this case the instruction will also be treated as if it accessed memory.)

- will passed in, so you should not only copy the incoming instruction to the
- new BasicBlock, but also copy all instruction contribute to the computation of
+ will be passed in. You should not only copy the incoming instruction to the new
+ BasicBlock, but also copy all instructions contributing to the computation of


the incomming instruction. Besides simply copy the instruction to the new

BasicBlock which is already implemented in the copyInstTree function in the

- SequentialCodegen pass, you can also do something else, for example the
- copyInstTree function in VectorizedCodegen vectorize the instructions.
+ SequentialCodeGen pass, you can also do something else. For example, the
+ copyInstTree function in VectorizedCodeGen vectorizes the instructions.
</p>
<!--=====================================================================-->
- <h4>Functions to Generate code for CLAst expression</h4>
+ <h4>Functions to Generate code for clast expression</h4>
<!--=====================================================================-->
- <p>You can also generate your own code for some CLAst expression by
- reimplement the functions listed below.
+ <p>You can also generate your own code for some clast expressions by
+ overriding the functions listed below.
</p>
<!--=====================================================================-->


<h5>The codegen(enum clast_red_type, ArrayRef<Value*>, BasicBlock *) function</h5>

@@ -439,20 +436,20 @@ void VectorizedCodegen::substIVs(ScopStmt *S, ArrayRef<Value*> NewIVs,
<pre class="code">
virtual Value *codegen(enum clast_red_type red_ty, ArrayRef<Value*> SubExprs,
BasicBlock *BB);</pre>
- <p>This function is called when the code generation framework try to generate
- code for the CLAst reduction expression, the function takes the type of the
- redution, the subexpressions of the reduction and the BasicBlock to place the
- generated code as arguments, and return the result value of the reduction.
+ <p>This function is called when the code generation framework generates code
+ for the clast reduction expression. The function takes the type of the
+ redution, the subexpressions of the reduction, and the BasicBlock to place the
+ generated code as arguments. It returns the result value of the reduction.
</p>
- <p>You can reimplement this function if your platform have special support for
+ <p>You can override this function if your platform have special support for
the reduce operation.
</p>
<!--=====================================================================-->


<h2>Other details about the Code Generation Framework</h2>

<!--=====================================================================-->
- <p>This document do not cover all detials about the code generation
- framework of polly, you can read the code or ask on the list to get other
- detials.
+ <p>This document do not cover all detials about the code generation framework
+ of Polly, you can read the code or ask on the list to get other
+ details.
</p>
</div>
</body>

ether zhhb

May 10, 2011, 12:45:37 PM5/10/11
to polly-...@googlegroups.com, poll...@googlegroups.com
hi tobi,

On Tue, May 10, 2011 at 3:26 AM, Tobias Grosser <tob...@grosser.es> wrote:
> On 05/09/2011 04:32 PM, ether zhhb wrote:
>>
>> hi tobi,
>>
>>>> it do not introduce any regressions at the moment
>>>
>>> Did you test it on more than 'make polly-test'? Did you run it on
>>> polybench?
>>> The llvm test-suite? I really would appreciate to have wider testing.
>>
>> not yet, but i will do this later.
>
> OK. No need for you to do all the work.

We can also commit the Makefiles for Polly in the LLVM test-suite
upstream, so the build bots can check them out and run them.


>
> I am fine committing the patches not fully tested. As they are independent
> it is better to have them early in the version control system and fix
> existing bugs as we go. Otherwise they become too big to review them
> conveniently. If your code is used by a wider audience it will be tested
> automatically more throughly.

Yes, so when you think it is ok to commit, please drop me a mail


>
>>> Another possibility to increase the speed of integrating your new codegen
>>> framework is to add a new directory + a PollyCodegen library, that is
>>> available in parallel to the existing framework. Like this we have both
>>
>> Yes, i am also going to do this, do you want i add a separated patch
>> to move them or edit the previous patches?
>
> I believe it is best to edit the previous patches.

Done.


> You can do this as you go and submit one patch after the other for review.
> I believe we mainly need to
> fix bugs, typos and style issues. The only possible architectural problem I
> see is the use of analysis groups.
>
> As stated earlier we should understand if your use of analysis groups is OK.
> I will dig a little bit deeper into it, but believe it should be OK.
>

best regards
ether

Tobias Grosser

May 14, 2011, 3:16:07 PM5/14/11
to ether zhhb, polly-...@googlegroups.com, poll...@googlegroups.com
On 05/10/2011 01:45 PM, ether zhhb wrote:
> hi tobi,
>
> On Tue, May 10, 2011 at 3:26 AM, Tobias Grosser<tob...@grosser.es> wrote:
>> On 05/09/2011 04:32 PM, ether zhhb wrote:
>>>
>>> hi tobi,
>>>
>>>>> it do not introduce any regressions at the moment
>>>>
>>>> Did you test it on more than 'make polly-test'? Did you run it on
>>>> polybench?
>>>> The llvm test-suite? I really would appreciate to have wider testing.
>>>
>>> not yet, but i will do this later.
>>
>> OK. No need for you to do all the work.
> We can also commit the Makefile for Polly in the llvm test-suite to
> main stream, so the build bots can check them out and run.

Yes, I would love to do this. Can you go ahead and ask if we are allowed
to do this?

>> I am fine committing the patches not fully tested. As they are independent
>> it is better to have them early in the version control system and fix
>> existing bugs as we go. Otherwise they become too big to review them
>> conveniently. If your code is used by a wider audience it will be tested
>> automatically more throughly.
> Yes, so when you think it is ok to commit, please drop me a mail

I am currently looking into your new patches.

I also committed a set of changes to the existing code generation, that
simplified the setup of our existing code generation. The main change is
that we do not delete the old SCoP, but keep it in a conditional-branch.
Furthermore, I removed unneeded code and incorrect addPreserves.

>>>> Another possibility to increase the speed of integrating your new codegen
>>>> framework is to add a new directory + a PollyCodegen library, that is
>>>> available in parallel to the existing framework. Like this we have both
>>>
>>> Yes, i am also going to do this, do you want i add a separated patch
>>> to move them or edit the previous patches?
>>
>> I believe it is best to edit the previous patches.
> Done.

Thanks.

Cheers
Tobi

ether zhhb

May 20, 2011, 11:58:55 AM5/20/11
to Tobias Grosser, polly-...@googlegroups.com, poll...@googlegroups.com
hi tobi,

>>
>> We can also commit the Makefile for Polly in the llvm test-suite to
>> main stream, so the build bots can check them out and run.
>
> Yes, I would love to do this. Can you go ahead and ask if we are allowed to
> do this?

Yes, but I will first confirm that those patches work with LLVM trunk. I
think we also need to patch "configure" so that it copies the
Makefiles for Polly only when Polly is enabled.

>
> I also committed a set of changes to the existing code generation, that
> simplified the setup of our existing code generation. The main change is
> that we do not delete the old SCoP, but keep it in a conditional-branch.
> Furthermore, I removed unneeded code and incorrect addPreserves.

I will have a look at this.

best regards
ether

PS: Sorry for the late reply. I just set up an IPv6 connection to access
Gmail, and I am not able to access https://llvm.org now.

Tobias Grosser

May 20, 2011, 4:47:33 PM5/20/11
to ether zhhb, polly-...@googlegroups.com, poll...@googlegroups.com
On 05/20/2011 12:58 PM, ether zhhb wrote:
> hi tobi,
>
>>>
>>> We can also commit the Makefile for Polly in the llvm test-suite to
>>> main stream, so the build bots can check them out and run.
>>
>> Yes, I would love to do this. Can you go ahead and ask if we are allowed to
>> do this?
> Yes, but i will confirm if those patches work with llvm trunk first. I
> think we also need to patch the "configure", so it copies the
> Makefiles for Polly only when Polly is enabled.
OK.

>> I also committed a set of changes to the existing code generation, that
>> simplified the setup of our existing code generation. The main change is
>> that we do not delete the old SCoP, but keep it in a conditional-branch.
>> Furthermore, I removed unneeded code and incorrect addPreserves.
> I will have a look at this.

Thanks

Tobi

Tobias Grosser

May 26, 2011, 5:42:26 PM5/26/11
to ether zhhb, polly-...@googlegroups.com, poll...@googlegroups.com
On 05/10/2011 01:36 PM, ether zhhb wrote:
> hi tobi,
>
> thanks for fixing these for me.

Hi ether,

sorry for the delay. I went over these changes again and still have a
couple of suggestions. They also include several suggestions that were
already mentioned in my previous mail, but that were not changed in your
new patch. I re-added them as I was not sure whether you overlooked them
(some are just typos) or intentionally did not apply them. Not following
my suggestions is perfectly fine. I would just like to ask you to add, in
the next mail, a short comment for every change, like 'OK' or 'Not
adapted, because ...'. That way I can easily check whether you just forgot
to add something or decided to go a different way.

Also, I propose to start with this documentation as the first patch to
commit upstream. I added a new page 'documentation.html', in which you
may add a link to this document. Also, I think it would be good to add a
header that states this is still work in progress. Something like:

"WARNING: The framework documented here is new, not entirely committed
and only slightly tested. It is not yet enabled by default, but may
replace the existing code generation pass if it has proven itself useful
and robust."

Can you submit the final document one last time, so that I can go over
the whole document again?

Also, can you resubmit the first patch of the code generation framework?
I will review it then.

Cheers and thanks for all your work
Tobi

P.S.: More comments inline

Framework in Polly translates
- 'is a part' is not needed
- 'translate' -> 'translates': Third person 's'
(This was already part of my last review. I think it would be best if
you could state for every suggestion whether you applied it or not.
Otherwise, I have to compare everything myself.)

> +<a href="http://www.bastoul.net/cloog/manual.php#CLooG-Output">CLooG AST</a>
> + (clast) back to<a href="http://llvm.org/docs/LangRef.html">LLVM-IR</a>. It
> + provides a platform to implement code generation for different platforms in a
> + modular way.
> </p>
>
> <p>The framework can be divided in to two parts: The code

consists of two parts:

> generation chain and
> @@ -34,46 +34,45 @@
> <!--=====================================================================-->
> <h3>Code Generation Driver</h3>
> <!--=====================================================================-->
> -<p>The Code Generation Driver is a ScopPass, it will get the CLAst from the
> - CloogInfo pass and call the main code generation function of the Code
> - Generation Chain. The main code generation function will iterate over the
> - CLAst, and generate code for the clast statement/expression.
> - Note that only non-trivial CLAst statements/expressions(e.g. reduction
> - expression, for statement and user statement) will be sent to the chain and
> - handled by the passes in the chain, while trivial CLAst
> statements/expressions
> - will be handled immediately without passed to the chain.
> +<p>The Code Generation Driver is a ScopPass, it gets the clast from the

. It gets
- Start a new sentence. This is easier to understand.
(Also part of my last review)

> + CloogInfo pass and calls the main code generation function of the Code
> + Generation Chain. The main code generation function iterates over the clast,

clast and generates
- no comma

> + and generates code for the clast statement/expression. Note that only
> + non-trivial clast statements/expressions(e.g. reduction expression, for

statements or expressions (e.g. reduction expressions, for statements, user statements)
- replace '/' with 'or'
- add space before '('
- use plural in the parenthesis (we used it before the parenthesis too)

(The last change was already part of my last review)

> + statement and user statement) are sent to the chain and handled by the passes
> + in the chain, while trivial clast statements/expressions are handled

statements and expressions


> + immediately without being passed to the chain.
> </p>
> <!--=====================================================================-->
> <h3>Code Generation Chain</h3>
> <!--=====================================================================-->
> -<p>The Code Generation Chain consist of a set of Passes that also inherited
> - from the "Codegen" class, their are chained together by register them as a
> - member of the "Codegen"
> +<p>The Code Generation Chain consists of a set of passes that also inherited

that inherit
- remove 'also'
- use present tense


> + from the "CodeGen" class, they are chained together by registering them as a

. They are
- Start new sentence

(The change was already part of my last review)


> + member of the "CodeGen"
> <a href="http://llvm.org/docs/WritingAnLLVMPass.html#analysisgroup">
> - Analysis Group</a>, and when we say "Polly Code Generation Pass", it means
> - "A Pass in the Code Generation Chain of Polly".
> + Analysis Group</a>. The expression "Polly Code Generation Pass" means a pass
> + in the code generation chain of Polly.
> </p>
> <p>The code generation request will pass through chain until some pass

request passes through the chain until
- 'will pass' -> 'passes': use present tense
- add 'the'

(The change was already part of my last review)

> - in the chain respond to the request, generate code and stop passing the
> - request to the next pass in the chain. If the code generation request reach
> - the end of the chain - the "SequentialCodegen" pass, sequential code will be
> - generated for the request. This means passes apart from
> SequentialCodegen pass
> - in the chain do not need to respond to ALL code generation request, instead,
> - their only need to respond to the code generation request that their are
> + in the chain responds to the request, generates code and stops passing the
> + request to the next pass in the chain. If the code generation request reaches
> + the end of the chain, the "SequentialCodeGen" pass, sequential code will be
> + generated for the request. This means passes apart from
> SequentialCodeGen pass
> + in the chain do not need to respond to ALL code generation requests. Instead,
> + they only need to respond to the code generation requests that their are

that they are
- 'their' -> 'they'

> interested in. For example, the vector code generation pass only generates
> - code for inner most loop and the ScopStmt inside the loop, the rest
> code generation
> - request will be ignored and passed to the next pass in the chain.
> + code for inner most loops and the ScopStmts inside these loops. The rest code

The remaining code generation
- 'rest' -> 'remaining'

(The change was already part of my last review)


> + generation requests are ignored and passed to the next pass in the chain.
> </p>
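
The request passing described in this paragraph can be sketched in a few lines. This is only a minimal illustration, assuming hypothetical names (RequestKind, CodeGenSketch, SequentialSketch, VectorizedSketch) and strings in place of generated code; it is not the interface from the patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical request kinds; the real framework passes clast nodes instead.
enum class RequestKind { InnermostLoop, OuterLoop, UserStmt };

// Stand-in for the "CodeGen" base class: by default a pass does not handle
// the request and forwards it to the next pass in the chain.
class CodeGenSketch {
public:
  explicit CodeGenSketch(CodeGenSketch *Next = nullptr) : Next(Next) {}
  virtual ~CodeGenSketch() = default;

  virtual std::string handle(RequestKind K) {
    assert(Next && "the last pass in the chain must handle every request");
    return Next->handle(K);
  }

private:
  CodeGenSketch *Next;
};

// End of the chain: responds to every request with sequential code.
class SequentialSketch : public CodeGenSketch {
public:
  std::string handle(RequestKind) override { return "sequential"; }
};

// Responds only to innermost loops and forwards everything else.
class VectorizedSketch : public CodeGenSketch {
public:
  using CodeGenSketch::CodeGenSketch;

  std::string handle(RequestKind K) override {
    if (K == RequestKind::InnermostLoop)
      return "vectorized";
    return CodeGenSketch::handle(K);
  }
};
```

With a VectorizedSketch placed in front of a SequentialSketch, an innermost-loop request stops at the first pass, while any other request falls through to the sequential one.
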
>
> <!--=====================================================================-->
> <h2>Writing a Code Generation Pass for Polly</h2>
> <!--=====================================================================-->
> -<p>We will disussed how to write a code generation pass in this section.
> - We are going to start by listing the basic steps to write a code generation
> - pass, and then disussed important data sturctures and function of the Codegen
> - Class, at last we will show you how to use those data structures and how to
> - reimplement those functions by taking the existing code in the framework for
> - Example.
> +<p>We disuss how to write a code generation pass in this section. We start by

discuss


> + listing the basic steps to write a code generation pass, and then disuss

. Then we discuss
- New sentence
- 'disuss' -> 'discuss'

(The change was already part of my last review)


> + important data structures and functions of the CodeGen Class.
> Finally, we show
> + how to use those data structures and how to override those

these these
- 'those' -> 'these'

> functions by taking
by using

> + the existing code in the framework for example.

framework as an example.

> </p>
> <!--=====================================================================-->
> <h3>Basic Steps</h3>
> @@ -83,44 +82,44 @@
> <a href="http://llvm.org/docs/WritingAnLLVMPass.html">how to write a LLVM
> Pass</a> and
> <a href="http://llvm.org/docs/WritingAnLLVMPass.html#analysisgroup">Analysis
> - Groups</a>, then you can do it by follow these steps:
> + Groups</a>, then you can do it by following these steps:
> <ol>
> -<li>Add a c++ source file with the name of your code generation pass under
> - lib/CodeGen/ directory, and add the file to the CMakelists.txt's source list.
> +<li>Add a C++ source file with the name of your code generation pass in

pass in the lib/CodeGen
- missing 'the'

> + lib/CodeGen/ directory and add the file to the CMakelists.txt's source list.
> </li>
> -<li>Include "Codegen.h" and other necessary header files in you c++ file.
> +<li>Include "CodeGen.h" and other necessary header files in your C++ file.
> </li>
> -<li>Define a class that inherited from the "Codegen" class and the "ScopPass"
> - class, and register you pass as an member of the "Codegen" Analysis
> Group. For
> - example, the VectorizedCodegen pass is regestered like this:
> +<li>Define a class that inherits from the "CodeGen" class and the "ScopPass"
> + class and register your pass as a member of the "CodeGen" Analysis Group. For
> + example, the VectorizedCodeGen pass is regestered as follows:

registered


> <pre class="code">
> -static RegisterPass<VectorizedCodegen>
> +static RegisterPass<VectorizedCodeGen>
> D("polly-vectorized-codegen",
> - "Polly - Vectorized Codegen implementation that generate vectorized code");
> + "Polly - Vectorized CodeGen implementation that generates
> vectorized code");
>
> -static RegisterAnalysisGroup<Codegen> E(D);</pre>
> +static RegisterAnalysisGroup<CodeGen> E(D);</pre>
> </li>
> -<li>Reimplement the "runOnScop" function of ScopPass, and call the
> "initializeChain"
> - function there to set up the Code Generation Chain correctly like this:
> +<li>Override the "runOnScop" function of ScopPass and call the
> + "initializeChain" function there to set up the Code Generation Chain:

function to set up
> <pre class="code">
> bool runOnScop(Scop&S) {


> initializeChain(this);</pre>
> </li>
> <li> Implement the "getAdjustedAnalysisPointer" so your pass can
> work with the
> - LLVM Analysis Group framework. You can just copy the following code:
> + LLVM Analysis Group framework. Copy the following code:
> <pre class="code">
> /// getAdjustedAnalysisPointer - This method is used when a pass implements
> /// an analysis interface through multiple inheritance. If needed, it
> /// should override this to adjust the this pointer as needed for the
> /// specified pass info.
> void *getAdjustedAnalysisPointer(const void *ID) {

> - if (ID ==&Codegen::ID)


> - return static_cast&lt;Codegen*&gt;(this);

> + if (ID ==&CodeGen::ID)


> + return static_cast&lt;CodeGen*&gt;(this);
> return this;
> }</pre>
> </li>
> -<li>Reimplement the virtual functions of "Codegen" class to generate code for
> - the CLAst statements or/and expressions that you are interested in.
> +<li>Reimplement the virtual functions of the "CodeGen" class to generate code
> + for the clast statements or/and expressions you are interested in.

statements and expressions
- I think the 'or' is not needed here.

generated?</li>
- No need for the 'and'

> +<li>Where should the newly generated code be placed?</li>
> </ul>
> -<p>To ensure a code generation location can always possible to reaches all
> - passes in the chain, when a code generation pass finishes generate code for
> - the current code generation location, they should return the next location to
> - driver. Then the driver will send the next location to the first pass in the
> - chain. So instead of return the next location to the driver, if you call
> - functions to handle next code generation location recursively, the location
> - will not able to reaches the passes before you pass in the chain, and this is
> - probably not the way you want.
> +<p>To ensure all code location can always be handled by all passes

locations
> + in the chain.
^ here the sentence is still incomplete.

^
Chinese? ;-)

> + non-linear CFG such as clast_for, clast_guard.</li>

non-linear control flow constructs like clast_for ...

> +<li>A symbol table that maps clast names to LLVM Values. The symbol table is
> + shared by all code regions.</li>
> +<li>A parameter table that maps parameters in the original scop to new
> + parameters in the newly generated code. The parameter table is shared by all
> + code regions.</li>
> <p>Parameter map is designed to help the code generation pass to avoid cross
> function references when generate code in a function that is different from
> the parent function of the original scop.
> </p>
> -<li>A Scop statement level vector of ValueMaps that mapping LLVM value in
> - the old scop to the new generated code, the ValueMaps are only available in
> - the code generation region coresponding to a Scop statement, and will be
> - cleared after the region of the Scop statement is exited.</li>
> -<p>We need more than one value map because we need to support generate code
> - for unrolled loop, and we say each ValueMap correspond to a "symbol space",
> - and you look up the same value in the old scop in difference "symbol space"
> - will get difference result.</p>
> +<li>A Scop statement level vector of ValueMaps that maps LLVM Values in
> + the old scop to new ones in the newly generated code. The ValueMaps are only
> + available in the code region corresponding to a Scop statement. They are
> + cleared after the code region of the Scop statement is left.</li>
> +<p>We need more than one value map because we need to support code generation
> + for unrolled loops. We say each ValueMap correspond to a "symbol

corresponds


> space". If you
> + look up the same value in the old scop in different "symbol space", you will
> + get different results.</p>
> </ul>
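
To make the "symbol space" idea concrete, here is a rough sketch of one value map per unrolled iteration. It assumes strings in place of llvm::Value pointers, and StmtRegionSketch and its member names are hypothetical (only allocateLocalMaps appears in the quoted example code).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in: strings replace llvm::Value* to keep this
// self-contained. One map corresponds to one "symbol space".
using ValueMapSketch = std::map<std::string, std::string>;

class StmtRegionSketch {
public:
  // Allocate one empty map per symbol space (e.g. per unrolled iteration).
  void allocateLocalMaps(unsigned N) { Maps.assign(N, ValueMapSketch()); }

  // Record that Old (a value in the old scop) is New in the given space.
  void map(unsigned Space, const std::string &Old, const std::string &New) {
    Maps.at(Space)[Old] = New;
  }

  // Look up the new value for Old in the given symbol space.
  const std::string &lookup(unsigned Space, const std::string &Old) const {
    return Maps.at(Space).at(Old);
  }

  // Drop all maps when the statement's code region is left.
  void clear() { Maps.clear(); }

  std::size_t numSpaces() const { return Maps.size(); }

private:
  std::vector<ValueMapSketch> Maps;
};
```

Looking up the same old value in space 0 and in space 1 yields two different new values, which is the behaviour the paragraph describes for unrolled loops.
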
> -<p>Because the clast have some tree structure, we will have nested code
> - generation regions during the code generation process, and because we will
> - delete the codegen region after we finish it, so the nested codegen regions
> - have a stack structure.</p>
> -<p>During the code generation process, we have only one codegen region stack,
> - and the stack is shared by all passes in the chain. Region pushing and poping
> - are managed by the Codegen class internally, and code generation
> passes do not
> - need to worry about the region management, all their need to do about codegen
> - region are do some preparation when entering a region that the pass is
> - interested in, and do finalization when exiting a region that just entered.
> -</p>
> +<p>Because the clast has a tree structure, the code generation regions are
> + nested during the code generation process. Because we will delete the code
> + region after we finish it, so the nested code regions can be managed by a

, the ..


> + stack.</p>
> +<p>During the code generation process there is only one code region
> stack which
> + is shared by all passes in the chain. Region pushing and poping is managed by

popping


> + the CodeGen class internally and code generation passes do not need to worry
> + about the region management. All their need to do is the preperation when

the preparation


> + entering a region that the pass is interested in and the finalization when
> + exiting a region.</p>
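
The nesting described here maps naturally onto a stack. The following sketch assumes hypothetical names (RegionSketch, RegionStackSketch, enter, leave) and only tracks region names; in the framework the pushing and popping is done internally by the CodeGen class.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical code region: only a name, no basic blocks or symbol tables.
struct RegionSketch {
  std::string Name;
};

class RegionStackSketch {
public:
  // Entering a clast statement pushes its region.
  void enter(const std::string &Name) { Stack.push_back(RegionSketch{Name}); }

  // Leaving pops the innermost region, which is deleted once it is finished.
  std::string leave() {
    assert(!Stack.empty() && "no region to leave");
    std::string Name = Stack.back().Name;
    Stack.pop_back();
    return Name;
  }

  std::size_t depth() const { return Stack.size(); }

private:
  std::vector<RegionSketch> Stack;
};
```

Because the clast is a tree, regions are always left in the reverse order in which they were entered, so a single shared stack is enough.
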
> <!--=====================================================================-->
> -<h3>Overridable Code Generation Functions in Codegen Class</h3>
> +<h3>Overridable Code Generation Functions in CodeGen Class</h3>
> <!--=====================================================================-->
> -<p>There are several overridable(virtual) functions in Codegen class, you can
> - reimplement them to allow your pass generate code for some insteresting CLAst
> - statements or/and expressions.
> +<p>There are several overridable(virtual) functions in CodeGen class, you can

class. You


> + override them to allow your pass to generate code for clast statements and

> + expressions it is interested in.
> </p>
> <!--=====================================================================-->
> -<h4>Functions to Handle CLAst For Statements</h4>
> +<h4>Functions to Handle clast For Statements</h4>
> <!--=====================================================================-->
> -<p>There are two functions to handle the tasks about code generation of CLAst
> - For Statements, you can reimplement them to handle loop-related code
> - generation tasks such as generate the CFG for the loop or unroll the loop.
> +<p>There are two functions to handle the tasks about the code generation of

to handle the code generation of clast
for-statements.


> + clast for-statements. You can override them to handle loop-related code
> + generation tasks such as generating the CFG for the loop or unrolling the
> + loop.
> </p>
> <!--=====================================================================-->
> <h5>The enterLoop(Reg *, Value *, Value *) function</h5>
> <!--=====================================================================-->
> <pre class="code">
> virtual Loc enterLoop(Reg *L, Value *LB, Value *UB);</pre>
> -<p>The enterLoop function is designed to let you do necessary preparation for

is designed to prepare generating code for
a loop region.


> - generating code for the region of a loop. It takes the code generation region
> - of the loop, the lower bound of the loop and the upper bound of the loop as
> - arguments, and returns the code generation location of the loop body.
> +<p>The enterLoop function is designed to let you to do the necessary
> + preparation for generating code for the region of a loop. It takes the code
> + region of the loop, the lower bound of the loop and the upper bound of the
> + loop as arguments and returns the code location of the loop body.
> </p>
> <p>What you need to do in this function usually includes:</p>
> <ul>
> <li>Generate the CFG of the loop, excluding the back edge.</li>
> -<li>Generate the induction variable of the loop, and insert it to the symble
> - table of the code generation region.</li>
> -<li>Adjust the entry block and exit block of the code generation region.</li>
> +<li>Generate the induction variable of the loop, and insert it into
> the symbol
> + table of the code region.</li>
> +<li>Adjust the entry block and exit block of the code region.</li>
> </ul>
> <p>Instead of handle all incoming loops, you can just handle the loops that
> - you are insterated in, for example, the corresponding code in
> VectorizedCodegen
> + you are insterated in. For example, the corresponding code in

interested

It prepares the code generation of the Scop statement.
- 'It is designed to let you': This part seems unnecessary. I have
the feeling saying directly what the function does/is used for
simplifies the text. (Probably also adaptable to the other
functions)

> + the Scop statement. It takes the code region of the scop statement
> as argument
> + and returns the code location of the body of the statement, i.e. the location
> + to place the code for the clast assign statements.
> +</p>
> +<p>What you need to do in this function is rather simple: allocates enough
> + local maps to map the value in the old BasicBlock of the statement
> to the value
> + in the new BasicBlock of the statement. For example, the
> enterScopStmt function
> + in VectorizedCodeGen allocate several maps to map the value in the old

allocates

> + BasicBlock to the new values in different iterations of the loop:
> </p>
> <pre class="code">
> -Codegen::Loc VectorizedCodegen::enterScopStmt(Reg *Stmt) {
> - // Should us take care of this ScopStmt?
> - if (...we had never unrolled the loop...)
> - return Codegen::enterScopStmt(Stmt);
> +CodeGen::Loc VectorizedCodeGen::enterScopStmt(Reg *Stmt) {
> + // Should we take care of this ScopStmt?
> + if (...we never unrolled the loop...)
> + return CodeGen::enterScopStmt(Stmt);
>
> - // Allocate Scalar map for each iteration and a vector map.
> + // Allocate a scalar map for each iteration and a vector map.
> Stmt->allocateLocalMaps(getUnrolledTimes() + 1);
>
> - // Return the code generation location of the substitutions
> + // Return the code location of the substitutions
> const clast_user_stmt *u = (const clast_user_stmt*)Stmt->stmt;
> return Loc(u->substitutions, Stmt->Entry);
> }</pre>
> @@ -331,46 +329,45 @@ Codegen::Loc VectorizedCodegen::enterScopStmt(Reg *Stmt) {
> <pre class="code">
> virtual Loc exitScopStmt(Reg *Stmt, BasicBlock *LastBB);</pre>
> <p>The exitScopStmt function is similar to the exitLoop function, it is

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Needed?

> - designed to let you necessary finalization after code generation for the
> - Scop statement is done. It takes the code generation region of the Scop
> - Statement, and the last basic block of the statement as arguments,
> and returns
> - the code generation location of the CLAst statement that next to the Scop
> - Statement.
> + designed to let you to do the necessary finalization after code

Here the same. 'it is designed to let you do': This seems to be very
redundant.

What about:

The exitScopStmt() function is called after processing a Scop statement.
It should contain code to finalize the generation of this ScopStatement.

overriding


> </p>
> <!--=====================================================================-->
> <h5>The codegen(enum clast_red_type, ArrayRef<Value*>, BasicBlock
> *) function</h5>
> @@ -439,20 +436,20 @@ void VectorizedCodegen::substIVs(ScopStmt *S,
> ArrayRef<Value*> NewIVs,
> <pre class="code">
> virtual Value *codegen(enum clast_red_type red_ty, ArrayRef<Value*> SubExprs,
> BasicBlock *BB);</pre>
> -<p>This function is called when the code generation framework try to generate
> - code for the CLAst reduction expression, the function takes the type of the
> - redution, the subexpressions of the reduction and the BasicBlock to place the
> - generated code as arguments, and return the result value of the reduction.
> +<p>This function is called when the code generation framework generates code
> + for the clast reduction expression. The function takes the type of the
> + redution, the subexpressions of the reduction, and the BasicBlock

reduction,

> to place the
> + generated code as arguments. It returns the result value of the reduction.
> </p>
> -<p>You can reimplement this function if your platform have special
> support for
> +<p>You can override this function if your platform have special support for
> the reduce operation.
> </p>
> <!--=====================================================================-->
> <h2>Other details about the Code Generation Framework</h2>
> <!--=====================================================================-->
> -<p>This document do not cover all detials about the code generation
> - framework of polly, you can read the code or ask on the list to get other
> - detials.
> +<p>This document do not cover all detials about the code generation framework

does details


> + of Polly, you can read the code or ask on the list to get other

. You

ether zhhb

May 28, 2011, 12:23:57 PM5/28/11
to Tobias Grosser, polly-...@googlegroups.com, poll...@googlegroups.com
hi tobi,

On Fri, May 27, 2011 at 5:42 AM, Tobias Grosser <tob...@grosser.es> wrote:

> Hi ether,
>
> sorry for the delay. I went again over these changes and have still a couple

Thanks for taking the time again.


> of suggestions. They also include several suggestions that were already
> mentioned in my previous mail, but that were not changed in your new patch.
> I readded them as I was not sure if you overlooked them (some are just
> typos) or you intentionally did not apply them. Not following my suggestions
> is perfectly fine. I would just like to ask you to add in the next mail for

Well, since I am not very good at English, I think I missed them in the last mail ;)


> every change a short comment like 'OK' or 'Not adapted, because ...'.  Like
> this I can easily check if you just forgot to add something or you decided
> to go a different way.

Ok


>
> Also, I propose to start with this documentation as the first patch to
> commit upstream. I added a new page 'documentation.html', in which you may
> add a link to this document. Also, I think it would be good to add a header

Ok.


> that states this is still work in progress. Something like:

Yes, I remember the LLVM documents have such a header. I will have a look
at those headers.


>
> "WARNING: The framework documented here is new, not entirely committed and
> only slightly tested. It is not yet enabled by default, but may replace the
> existing code generation pass if it has proven itself useful and robust."
>
> Can you submit the final document a last time, such that I can go again over
> the whole document.

OK, I will send you the mail on Monday; I am currently out of time...


>
> Also, can you resubmit the first patch of the code generation framework? I
> will review it then.

sure


>
> Cheers and thanks for all your work

you are welcome
> Tobi
best regards
ether

Tobias Grosser

Jun 8, 2011, 12:42:02 AM6/8/11
to ether zhhb, polly-...@googlegroups.com, poll...@googlegroups.com
Any news from this?

Tobi
