OpenScop RFC


Cédric Bastoul

Jun 24, 2011, 9:56:27 PM
to openscop-d...@googlegroups.com, Louis-Noel Pouchet, Tobias Grosser, B Uday Kumar Reddy, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris
Hi @ all,
it's one year late, but here is the "request for comments" version of OpenScop !

As a reminder, this is an attempt to provide a single input and/or output format for polyhedral compilation tools. The goal is to ease interaction between our tools. Ultimately, plugging Pluto/LeTSeE/Whatever into Graphite/Polly/WRAP-IT/Whatever would become pretty easy.

OpenScop has already been discussed for a while, so I don't plan major changes. However, I welcome very much your comments and suggestions. I think you will feel familiar with the representation instantly. For a quick tour, I suggest reading the manual (attached) Section 1 (intro), 3.1.1 and 3.1.2 (examples of the file format), 3.2 (extensions), 4.5 (development) and especially 4.5.4 for extension development.

Please also find attached the OpenScop Library corresponding to this RFC. This is an example implementation for importing/exporting OpenScop files. It's in development but it works (I deactivated fancy stuff which may be unstable, but the core seems OK).

Next week, you will receive an announcement about the new OpenScop-based Clan, so it will be possible to translate code to OpenScop very easily. The next targets will be Candl and CLooG.

I hope you will like it !
Cheers,

Cedric

OpenScop repository :
very soon in :


PS: Tobi, "may" accesses are in the core ;-) !
openscop-0.5.0.tar.gz
openscop.pdf

Uday K Bondhugula

Jun 26, 2011, 10:25:22 AM
to Cédric Bastoul, openscop-d...@googlegroups.com, Louis-Noel Pouchet, Tobias Grosser, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris
Cedric,

1. On page 30/31, why do we have '-1's in the first row following the
number of rows and columns -- for all the relations?

2. It would be good to add an example of a relation with an
existentially quantified dimension -- on page 23/24 (27/28 ext #).

Thanks,
Uday

Cédric Bastoul

Jun 26, 2011, 3:55:26 PM
to ud...@csa.iisc.ernet.in, openscop-d...@googlegroups.com, Louis-Noel Pouchet, Tobias Grosser, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris
On Sun, Jun 26, 2011 at 4:25 PM, Uday K Bondhugula <uday...@gmail.com> wrote:
Cedric,

1. On page 30/31, why do we have '-1's in the first row following the number of rows and columns -- for all the relations?

In matrix representation, the last four numbers (input, output, local, params) can be "undefined" (i.e., -1). If this is the case, they are not printed by openscop_structure_print() functions. However I chose to print them with openscop_structure_dump() functions since their primary usage is for debugging (so we may want to see the values).
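The print/dump distinction described above can be sketched as follows. This is only an illustration under stated assumptions: `sketch_header`, `SKETCH_UNDEFINED`, and the `sketch_*` functions are hypothetical names, not the OpenScop Library API, and it assumes a relation counts as being in matrix representation when all four attributes are -1.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical marker for an "undefined" attribute (the thread only says -1). */
#define SKETCH_UNDEFINED (-1)

/* Hypothetical, simplified relation header; not the real OpenScop type. */
struct sketch_header {
  int nb_rows, nb_columns;
  int nb_input, nb_output, nb_local, nb_params; /* the "last four numbers" */
};

/* Assumption: matrix representation means all four attributes are undefined. */
static int sketch_is_matrix(const struct sketch_header *h) {
  assert(h != NULL);
  return h->nb_input == SKETCH_UNDEFINED && h->nb_output == SKETCH_UNDEFINED &&
         h->nb_local == SKETCH_UNDEFINED && h->nb_params == SKETCH_UNDEFINED;
}

/* print-style output: undefined attributes are left out of the header line.
   Returns the number of characters snprintf would write. */
static int sketch_print_header(char *buf, size_t size,
                               const struct sketch_header *h) {
  if (sketch_is_matrix(h))
    return snprintf(buf, size, "%d %d", h->nb_rows, h->nb_columns);
  return snprintf(buf, size, "%d %d %d %d %d %d", h->nb_rows, h->nb_columns,
                  h->nb_input, h->nb_output, h->nb_local, h->nb_params);
}

/* dump-style output: every field is shown, including the -1 values,
   which is what a debugging view would want. */
static int sketch_dump_header(char *buf, size_t size,
                              const struct sketch_header *h) {
  assert(h != NULL);
  return snprintf(buf, size, "%d %d %d %d %d %d", h->nb_rows, h->nb_columns,
                  h->nb_input, h->nb_output, h->nb_local, h->nb_params);
}
```

With a 3x5 matrix whose four attributes are all -1, the print-style function would emit only the row and column counts, while the dump-style function would also show the four -1 values.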
 
2. It would be good to add an example of a relation with an existentially quantified dimension -- on page 23/24 (27/28 ext #).

Right, I'll add one.

Thanks for your feedback !

Ced.

胡士文

Jun 26, 2011, 11:25:41 PM
to Cédric Bastoul, openscop-development@googlegroup, Louis-Noel Pouchet, Tobias Grosser, B Uday Kumar Reddy, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, Thomas Legris, 李昕
The installation instructions mention autogen.sh, but there is no autogen.sh in the root directory. Is it necessary to run ./autogen.sh?
I typed the other three commands and got the library.
 

胡士文
2011-06-27

From: Cédric Bastoul
Sent: 2011-06-25 09:56:54
To: openscop-d...@googlegroups.com; Louis-Noel Pouchet; Tobias Grosser; B Uday Kumar Reddy; Sven Verdoolaege; Pop, Sebastian; Tristan Vanderbruggen; 胡士文; Thomas Legris
Cc:
Subject: OpenScop RFC

Louis-Noel Pouchet

Jun 27, 2011, 2:58:56 AM
to Cédric Bastoul, openscop-d...@googlegroups.com, Tobias Grosser, B Uday Kumar Reddy, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris
Hi Cedric,

I had a short look, and here are a few comments (some are typos, some are closer to existential questions ;) )

- In the preliminary example w/ matrix representation: why sometimes using "i/e", sometimes "e", sometimes "ineq" for the comment of the first matrix column?

- I am not a big fan of encoding the array reference number on the scalar column. This is making a double use of this column, one to describe the scalar value and one to describe an id, two different things in my view. I like that we have an explicit list of references, and not anymore the big matrix, but I would prefer an attribute in the openscop representation, to get something like:
WRITE  # type
1 # array id
3 5 # row/col
 0 1 0 0 0 # matrix of the access
 0 0 1 0 0 #

- a general comment: I don't think a reader which knows only the prerequisite (affine, matrix, and for loop) can understand how to specify the 6 attributes of a relation ;) You definitely need a few examples, taking the reader by the hand to explain what are input, output, exst. quantified variables, etc.

- The names are char**. It has proven to be a life-easier for name structures to allow pointers (void*) for the data. Yes of course the user can always write wrappers and map some string to the pointer he wants, but that's not convenient... Contradicting this spirit, it is a requirement for the data to be char* in some tools (cloog for instance, which uses strcmp at a couple of places). So, either you replace char* for names by void*, and add a flag to specify if the data is a char* string or not; or you write in bold triple size that data are strings and cannot be substituted by pointers on random data, because of how the data will be interpreted by other tools (cloning, comparing, etc.).
FYI, the structure I use in PAST looks like that:
struct symbol_t
{
  e_symbol_type_t type;       // data type
  int is_char_data;           // 1 if void* data is indeed char* data
                              // (implies strdup), 0 otherwise
  void* data;
  int num_refs;               // structure reference counter
  int is_attached_to_table;   // linked to symbol table?
  struct symbol_t* next;
  struct symbol_t* prev;
};
typedef struct symbol_t s_symbol_t;

I'm not going that far as suggesting adding a lightweight symbol table to openscop (even though you must feel I'd like to ;) ), but at least the ability to use void* is very useful. The tuple (pointer, type) where type is a Boolean {ptr,char}, proves to be good enough for PoCC and all the tools involved, when it comes to managing char* or void* data in PolyOpt.
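To make the (pointer, type) tuple concrete, here is a minimal sketch of how cloning, comparing, and freeing could branch on the flag. The names `sketch_name` and `sketch_name_*` are hypothetical and are not taken from PAST, PoCC, or OpenScop. The sketch assumes string data is owned by the clone (hence the copy), while opaque pointers are shared as-is.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical (pointer, type) tuple; not the PAST or OpenScop structure. */
struct sketch_name {
  int is_char_data; /* 1: data is a NUL-terminated string, 0: opaque pointer */
  void *data;
};

/* Clone: strings are duplicated, opaque pointers are copied as-is.
   On allocation failure the clone carries a NULL data pointer. */
static struct sketch_name sketch_name_clone(struct sketch_name n) {
  struct sketch_name c = n;
  if (n.is_char_data && n.data != NULL) {
    size_t len = strlen((const char *)n.data) + 1;
    c.data = malloc(len);
    if (c.data != NULL)
      memcpy(c.data, n.data, len);
  }
  return c;
}

/* Compare: strcmp is only legal when both sides are known to be strings. */
static int sketch_name_equal(struct sketch_name a, struct sketch_name b) {
  if (a.is_char_data != b.is_char_data)
    return 0;
  if (a.is_char_data && a.data != NULL && b.data != NULL)
    return strcmp((const char *)a.data, (const char *)b.data) == 0;
  return a.data == b.data;
}

/* Free: only string data produced by sketch_name_clone is released here;
   opaque pointers stay under the user's control. */
static void sketch_name_free(struct sketch_name *n) {
  assert(n != NULL);
  if (n->is_char_data)
    free(n->data);
  n->data = NULL;
}
```

The point of the flag is visible in `sketch_name_equal`: a tool that calls strcmp unconditionally (as the message says CLooG does in a couple of places) would misbehave on opaque data, so the flag has to be checked first.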

- You have a few typos, a nice one is speaking about clan_statement_t in the middle of an openscop_statement_t description.

- I'm not sure I understand how extensions are managed from your explanations. All parsing/printing functions are going to be implemented by the user, and we'll have only pointer copies when we clone an openscop_scop_t, right? You may wish to explain the user he should encapsulate scop functions (like, printing, reading, cloning, and freeing) into its own interface when he uses extensions. Another possibility is to have a 'char' extension and a 'generic' extension, the generic being fully unmanaged while the char would be managed by the openscop library. This way, the 'char' one will seamlessly support currently implemented extensions (variable names array, embedding of dependence polyhedra into the scoplib, etc.)

Thanks for the hard work!!


++



-- 
Louis-Noel Pouchet

Uday K Bondhugula

Jun 27, 2011, 3:14:36 AM
to Louis-Noel Pouchet, Cédric Bastoul, openscop-d...@googlegroups.com, Tobias Grosser, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris

On 06/27/2011 12:28 PM, Louis-Noel Pouchet wrote:
> Hi Cedric,
>
> I had a short look, and here are a few comments (some are typos, some are closer to existential questions ;) )
>
> - In the preliminary example w/ matrix representation: why sometimes using "i/e", sometimes "e", sometimes "ineq" for the comment of the first matrix column?
>
> - I am not a big fan of encoding the array reference number on the scalar column. This is making a double use of this column, one to describe the scalar value and one to describe an id, two different things in my view. I like that we have an explicit list of references, and not anymore the big matrix, but I would prefer an attribute in the openscop representation, to get something like:
> WRITE # type
> 1 # array id
> 3 5 # row/col
> 0 1 0 0 0 # matrix of the access
> 0 0 1 0 0 #

This is also exactly my opinion. Encoding symbol id in the matrix is
messy; the parameter and iterator columns anyway don't apply. It would
be straightforward for library users to just read symbol id and then
treat the matrix as access matrix (which is what optimizers would store).

-- Uday


Louis-Noel Pouchet

Jun 27, 2011, 3:24:00 AM
to Louis-Noel Pouchet, Cédric Bastoul, openscop-d...@googlegroups.com, Tobias Grosser, B Uday Kumar Reddy, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris
Also, you should add a usr pointer to the openscop_scop_t structure, as you did with the statement structure.

++

-- 
Louis-Noel Pouchet

Cédric Bastoul

Jun 27, 2011, 5:12:29 AM
to Louis-Noel Pouchet, openscop-d...@googlegroups.com, Tobias Grosser, B Uday Kumar Reddy, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris
Hi Louis-Noël,
thanks for your feedback,

On Mon, Jun 27, 2011 at 8:58 AM, Louis-Noel Pouchet <pou...@cse.ohio-state.edu> wrote:
Hi Cedric,

I had a short look, and here are a few comments (some are typos, some are closer to existential questions ;) )

- In the preliminary example w/ matrix representation: why sometimes using "i/e", sometimes "e", sometimes "ineq" for the comment of the first matrix column?

i/e means it can be either 0 (eq) or 1 (ineq). In matrix representation for scattering and access it can just be 0 (eq). I wondered whether I should just put "0", but I thought "e" would be more consistent. It's true there are inconsistencies in the manual, I'll correct them and explain better the i/e stuff.
 
- I am not a big fan of encoding the array reference number on the scalar column. This is making a double use of this column, one to describe the scalar value and one to describe an id, two different things in my view. I like that we have an explicit list of references, and not anymore the big matrix, but I would prefer an attribute in the openscop representation, to get something like:
WRITE  # type
1 # array id
3 5 # row/col
 0 1 0 0 0 # matrix of the access
 0 0 1 0 0 #

Here, I really prefer the way I suggested for a number of reasons:

- I don't think there is a double use of the scalar column: I see the memory as a big array "Mem" and the array name is just one dimension of this array. For instance let's consider A[i] and suppose that the array id of A is 42. This corresponds to an access Mem[42][i]. The array id is one dimension of a memory access, not a particular case. It happens to be scalar, but it's not a problem at all.

- Using scalar dimensions is a beautiful solution to accept very complex memory accesses (not only simple "array accesses"). For instance foo->b[i]->bar.toto[j] is analyzed by Clan by counting the different fields for each reference. Let's suppose "foo" has id 43. "b" has id 1 (it's relative to "foo": maybe clan read, e.g., "foo->a" before and decided that "a" has id 0 relatively to "foo"), "bar" has id 0 (here it's relative to foo->b[]) and "toto" has id "2" (here it's relative to foo->b[]->bar), then it corresponds to an access to Mem[43][1][i][0][2][j]. It's nicely unified and ready for data dependence analysis :-).

- It allows using a relation structure the way it is. The solution you suggested has a problem because the array id would have the place used by the number of unions in the relation (it is optional if there is only one part in the union, see pp 23). Accesses can be unions too. Including an array id external to the constraint matrix would mean adding a new field which would not be useful for anything else.

What do you think ? And Uday ? That's really about "memory" accesses, not "array" accesses, I think it's pretty cool :-) !
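The unified Mem[...] view can be sketched in code as follows. The column layout used here ([eq/ineq flag | one column per access dimension | one column per iterator | constant]) and all `sketch_*` names are assumptions made for illustration; they are not claimed to match the exact OpenScop file format or library.

```c
#include <assert.h>
#include <string.h>

enum { SKETCH_MAX_COLS = 16 };

/* One dimension of the unified Mem[...] access (hypothetical type). */
struct sketch_dim {
  int is_scalar; /* 1: constant identifier (array or field id), 0: iterator */
  int value;     /* the identifier, or the index of the iterator */
};

/* Build one equality row per access dimension.
   Row k encodes  -dim_k + (iterator or constant) = 0.
   Returns the number of columns used, or -1 if the input does not fit. */
static int sketch_build_access(const struct sketch_dim *dims, int nb_dims,
                               int nb_iters, int rows[][SKETCH_MAX_COLS]) {
  int nb_cols = 1 + nb_dims + nb_iters + 1;
  assert(dims != NULL && rows != NULL);
  if (nb_cols > SKETCH_MAX_COLS)
    return -1;
  for (int k = 0; k < nb_dims; k++) {
    memset(rows[k], 0, sizeof rows[k]); /* flag 0 means "equality" */
    rows[k][1 + k] = -1;
    if (dims[k].is_scalar) {
      rows[k][nb_cols - 1] = dims[k].value;
    } else {
      if (dims[k].value < 0 || dims[k].value >= nb_iters)
        return -1;
      rows[k][1 + nb_dims + dims[k].value] = 1;
    }
  }
  return nb_cols;
}
```

For foo->b[i]->bar.toto[j], i.e. Mem[43][1][i][0][2][j] with iterators i and j, the dimension list would be {43, 1, i, 0, 2, j}: scalar identifiers and true subscripts go through exactly the same code path, which is the uniformity argued for above.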
 
- a general comment: I don't think a reader which knows only the prerequisite (affine, matrix, and for loop) can understand how to specify the 6 attributes of a relation ;) You definitely need a few examples, taking the reader by the hand to explain what are input, output, exst. quantified variables, etc.

OK.
 
- The names are char**. It has proven to be a life-easier for name structures to allow pointers (void*) for the data. Yes of course the user can always write wrappers and map some string to the pointer he wants, but that's not convenient... Contradicting this spirit, it is a requirement for the data to be char* in some tools (cloog for instance, which uses strcmp at a couple of places). So, either you replace char* for names by void*, and add a flag to specify if the data is a char* string or not; or you write in bold triple size that data are strings and cannot be substituted by pointers on random data, because of how the data will be interpreted by other tools (cloning, comparing, etc.).

Well, I did put such a flag. See sec. 3.1.4.3, p. 28, on openscop_names_t: there is the "textual" flag (1 if names are character strings, 0 otherwise). And in the general comment I specified:

---
The term "name" is generic and corresponds to a pointer to the information necessary to generate the code. A name may be a string of characters (char *) or a pointer to anything else. For textual tools convenience, the default type is (char *), but it may be casted to your preferred type iff the textual field is 0.
---

But, see below... =>

FYI, the structure I use in PAST looks like that:
struct symbol_t
{
  e_symbol_type_t       type; // data type
  int is_char_data; // 1 if void* data is indeed char*
     // data (imply strdup), 0 otherwise
  void* data;
  int num_refs; // structure reference counter
  int is_attached_to_table; // linked to symbol table?
  struct symbol_t* next;
  struct symbol_t* prev;
};
typedef struct symbol_t s_symbol_t;

I'm not going that far as suggesting adding a lightweight symbol table to openscop (even though you must feel I'd like to ;) ), but at least the ability to use void* is very useful. The tuple (pointer, type) where type is a Boolean {ptr,char}, proves to be good enough for PoCC and all the tools involved, when it comes to managing char* or void* data in PolyOpt.

... => Yes. I really like it and I would be happy to replace the "char **" in openscop_names_t with "openscop_symbol_p *". Let me see what would be necessary for Clan to use this symbol table; maybe just a "void * usr" would be good enough. I leave the discussion on a symbol table totally open. I'll come back to this in a few days with a proposal.
 
- You have a few typos, a nice one is speaking about clan_statement_t in the middle of an openscop_statement_t description.

Hehe, yes, that's still a draft... Do not hesitate to write comments/corrections on a paper copy, then scan it and send me the annotated version. Obviously if I get contributions, I'll change the manual authorship accordingly.
 
- I'm not sure I understand how extensions are managed from your explanations. All parsing/printing functions are going to be implemented by the user, and we'll have only pointer copies when we clone an openscop_scop_t, right?

No :-). All the eight base functions created by the extension writers are actually called. E.g., to copy, we call *your* openscop_extension_copy() function. They are linked to the core openscop code with a script (launched during configure), that's why naming conventions are so strict. Just ask for a unique number and a name, write your extension.h and your extension.c respecting the naming conventions for the eight base functions and it should compile and work out of the box :-).
 
You may wish to explain the user he should encapsulate scop functions (like, printing, reading, cloning, and freeing) into its own interface when he uses extensions. Another possibility is to have a 'char' extension and a 'generic' extension, the generic being fully unmanaged while the char would be managed by the openscop library. This way, the 'char' one will seamlessly support currently implemented extensions (variable names array, embedding of dependence polyhedra into the scoplib, etc.)

Thanks for the hard work!!

Well, there's a bunch of lines of code from you ;-) !
Thanks,

Ced

Louis-Noel Pouchet

Jun 27, 2011, 5:48:07 AM
to Cédric Bastoul, openscop-d...@googlegroups.com, Tobias Grosser, B Uday Kumar Reddy, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris
On Jun 27, 2011, at 5:12 AM, Cédric Bastoul wrote:

Hi Louis-Noël,
thanks for your feedback,

On Mon, Jun 27, 2011 at 8:58 AM, Louis-Noel Pouchet <pou...@cse.ohio-state.edu> wrote:
Hi Cedric,

I had a short look, and here are a few comments (some are typos, some are closer to existential questions ;) )

- In the preliminary example w/ matrix representation: why sometimes using "i/e", sometimes "e", sometimes "ineq" for the comment of the first matrix column?

i/e means it can be either 0 (eq) or 1 (ineq). In matrix representation for scattering and access it can just be 0 (eq).

For the scattering? We can no longer put inequalities in the scattering if we use the matrix representation??
For the access, since eq means the row is interpreted as ax+by+c = 0, I'm not sure I understand what it means to say the access A[i][j] is {i = 0, j = 0}. I suggest to write 'fixed' or 'unused' in the doc.


I wondered whether I should just put "0", but I thought "e" would be more consistent. It's true there are inconsistencies in the manual, I'll correct them and explain better the i/e stuff.
 
- I am not a big fan of encoding the array reference number on the scalar column. This is making a double use of this column, one to describe the scalar value and one to describe an id, two different things in my view. I like that we have an explicit list of references, and not anymore the big matrix, but I would prefer an attribute in the openscop representation, to get something like:
WRITE  # type
1 # array id
3 5 # row/col
 0 1 0 0 0 # matrix of the access
 0 0 1 0 0 #

Here, I really prefer the way I suggested for a number of reasons:

- I don't think there is a double use of the scalar column: I see the memory as a big array "Mem" and the array name is just one dimension of this array. For instance let's consider A[i] and suppose that the array id of A is 42. This corresponds to an access Mem[42][i]. The array id is one dimension of a memory access, not a particular case. It happens to be scalar, but it's not a problem at all.
Bwarf. Yes it's true, but really, bwarf... and you can achieve strictly the same with the id number before the access function.


- Using scalar dimensions is a beautiful solution to accept very complex memory accesses (not only simple "array accesses"). For instance foo->b[i]->bar.toto[j] is analyzed by Clan by counting the different fields for each reference. Let's suppose "foo" has id 43. "b" has id 1 (it's relative to "foo": maybe clan read, e.g., "foo->a" before and decided that "a" has id 0 relatively to "foo"), "bar" has id 0 (here it's relative to foo->b[]) and "toto" has id "2" (here it's relative to foo->b[]->bar), then it corresponds to an access to Mem[43][1][i][0][2][j]. It's nicely unified and ready for data dependence analysis :-).

Same remark as above.


- It allows using a relation structure the way it is. The solution you suggested has a problem because the array id would have the place used by the number of unions in the relation (it is optional if there is only one part in the union, see pp 23). Accesses can be unions too. Including an array id external to the constraint matrix would mean adding a new field which would not be useful for anything else.

That seems to be an implementation limitation; if it makes your life easier, let's put the id after the matrix then. But I don't see how reading a value immediately after the access type would be any problem...


What do you think ? And Uday ? That's really about "memory" accesses, not "array" accesses, I think it's pretty cool :-) !

I am clearly unconvinced, because it is confusing. My polyhedral self says I love it, but my compiler self says it's counter-intuitive. A memory reference has an attribute, which is the associated variable symbol. Gluing together the description of the reference and the symbol is overloading the scalar column, which creates the 'confusion'.

If you insist I will live with it, because it's a cosmetic issue and not a semantics issue, but please consider my point carefully.


 
- a general comment: I don't think a reader which knows only the prerequisite (affine, matrix, and for loop) can understand how to specify the 6 attributes of a relation ;) You definitely need a few examples, taking the reader by the hand to explain what are input, output, exst. quantified variables, etc.

OK.
 
- The names are char**. It has proven to be a life-easier for name structures to allow pointers (void*) for the data. Yes of course the user can always write wrappers and map some string to the pointer he wants, but that's not convenient... Contradicting this spirit, it is a requirement for the data to be char* in some tools (cloog for instance, which uses strcmp at a couple of places). So, either you replace char* for names by void*, and add a flag to specify if the data is a char* string or not; or you write in bold triple size that data are strings and cannot be substituted by pointers on random data, because of how the data will be interpreted by other tools (cloning, comparing, etc.).

Well, I did put such a flag. See sec. 3.1.4.3 pp 28 on openscop_names_t: there is the "textual" flag (1 if names are character strings, 0 otherwise). And in the general comment I precised:

---
The term "name" is generic and corresponds to a pointer to the information necessary to generate the code. A name may be a string of characters (char *) or a pointer to anything else. For textual tools convenience, the default type is (char *), but it may be casted to your preferred type iff the textual field is 0.
---


What about the 'iterators' and 'body' pointers in the statement? You must be homogeneous. I have honestly missed the part (hey, I did a very quick read ;) ) about the 'textual' flag in the openscop_names_t, because I focused on openscop_statement_t, but that's not complete as again you must be homogeneous. Anyhow, yes a symbol table would be great :) FYI, you can have a look at it on pocc-svn, see ir/past/past/symbol.c




But, see below... =>

FYI, the structure I use in PAST looks like that:
struct symbol_t
{
  e_symbol_type_t       type; // data type
  int is_char_data; // 1 if void* data is indeed char*
     // data (imply strdup), 0 otherwise
  void* data;
  int num_refs; // structure reference counter
  int is_attached_to_table; // linked to symbol table?
  struct symbol_t* next;
  struct symbol_t* prev;
};
typedef struct symbol_t s_symbol_t;

I'm not going that far as suggesting adding a lightweight symbol table to openscop (even though you must feel I'd like to ;) ), but at least the ability to use void* is very useful. The tuple (pointer, type) where type is a Boolean {ptr,char}, proves to be good enough for PoCC and all the tools involved, when it comes to managing char* or void* data in PolyOpt.

... => Yes. I really like it and I would be happy to change the "char **" in openscop_names_t by "openscop_symbol_p *". Let me see what would be necessary for Clan to use this symbol table, maybe just a "void * usr" would be good enough. I let the discussion on a symbol table totally open. I'll come back on this in few days with a proposal.
 
- You have a few typos, a nice one is speaking about clan_statement_t in the middle of an openscop_statement_t description.

Hehe, yes, that's still a draft... Do not hesitate to write comments/corrections to a paper version then to scan and to send me the annotated version. Obviously if I get contributions, I'll change the manual authorship accordingly.
 
- I'm not sure I understand how extensions are managed from your explanations. All parsing/printing functions are going to be implemented by the user, and we'll have only pointer copies when we clone an openscop_scop_t, right?

No :-). All the eight base functions created by the extension writers are actually called. E.g., to copy, we call *your* openscop_extension_copy() function. They are linked to the core openscop code with a script (launched during configure), that's why naming conventions are so strict. Just ask for a unique number and a name, write your extension.h and your extension.c respecting the naming conventions for the eight base functions and it should compile and work out of the box :-).

Mmm. Ok. I don't understand how you get the name and register it such that the driver calls it, etc. What about a plain good old function pointer? Also, it looks strange to have the extension implementation be part of the openscop lib (unless I missed something again). That will mean many people playing with different libscoplib files, which will make scoplib harder to maintain when bug reports are filed.
I would suggest something more along these lines:
provide a function pointer structure, like this:
typedef void (*ptrfun)(void*);
typedef void (*ptrfunprint)(FILE*, void*);
typedef void (*ptrfunread)(FILE*, void*);
typedef void* (*ptrfunclone)(void*);
struct openscop_extension_functions {
   int id;
   ptrfun freefun;       // free function, frees 1 instance of openscop_extension_t
   ptrfunprint printfun; // FILE print function called
   ptrfunread readfun;   // FILE read function called
   ptrfunclone clonefun; // cloning function called
};


Then, you have a "register" function in openscop that stores an instance of this struct in its memory, and each time an extension of id 'id' is seen, those functions are used.
The user can then wrap openscop in their own library, or link against both modules: genuine openscop and their personal extension module.
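A minimal sketch of such a "register" function, assuming a small fixed-size table keyed by extension id. The function-pointer struct follows the proposal in this message; everything prefixed with `sketch_` (the table, the lookup, the dispatch helper, and the toy clone function) is hypothetical and not part of any existing OpenScop API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef void (*ptrfun)(void *);
typedef void (*ptrfunprint)(FILE *, void *);
typedef void (*ptrfunread)(FILE *, void *);
typedef void *(*ptrfunclone)(void *);

struct openscop_extension_functions {
  int id;
  ptrfun freefun;
  ptrfunprint printfun;
  ptrfunread readfun;
  ptrfunclone clonefun;
};

/* Hypothetical registry: a fixed-size table owned by the library. */
enum { SKETCH_MAX_EXTENSIONS = 32 };
static struct openscop_extension_functions sketch_table[SKETCH_MAX_EXTENSIONS];
static int sketch_table_size = 0;

static const struct openscop_extension_functions *sketch_lookup(int id) {
  for (int i = 0; i < sketch_table_size; i++)
    if (sketch_table[i].id == id)
      return &sketch_table[i];
  return NULL;
}

/* Store a copy of the function table; reject duplicates and overflow.
   Returns 0 on success, -1 otherwise. */
static int sketch_register(const struct openscop_extension_functions *f) {
  if (f == NULL || sketch_table_size == SKETCH_MAX_EXTENSIONS ||
      sketch_lookup(f->id) != NULL)
    return -1;
  sketch_table[sketch_table_size++] = *f;
  return 0;
}

/* Dispatch: cloning an extension of a given id calls the registered
   function, or returns NULL when nothing was registered for that id. */
static void *sketch_clone(int id, void *extension) {
  const struct openscop_extension_functions *f = sketch_lookup(id);
  return (f != NULL && f->clonefun != NULL) ? f->clonefun(extension) : NULL;
}

/* Toy clone function, only here to exercise the registry. */
static void *sketch_identity_clone(void *extension) { return extension; }
```

A user module would fill one `openscop_extension_functions` instance with its own free/print/read/clone functions and call the register function once at start-up; the core library would then only ever go through the lookup.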

++
 
 
You may wish to explain the user he should encapsulate scop functions (like, printing, reading, cloning, and freeing) into its own interface when he uses extensions. Another possibility is to have a 'char' extension and a 'generic' extension, the generic being fully unmanaged while the char would be managed by the openscop library. This way, the 'char' one will seamlessly support currently implemented extensions (variable names array, embedding of dependence polyhedra into the scoplib, etc.)

Thanks for the hard work!!

Well, there's a bunch of lines of code from you ;-) !
Thanks,

Ced




-- 
Louis-Noel Pouchet

Cédric Bastoul

Jun 27, 2011, 7:35:26 AM
to Louis-Noel Pouchet, openscop-d...@googlegroups.com, Tobias Grosser, B Uday Kumar Reddy, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris
On Mon, Jun 27, 2011 at 11:48 AM, Louis-Noel Pouchet <pou...@cse.ohio-state.edu> wrote:

On Jun 27, 2011, at 5:12 AM, Cédric Bastoul wrote:

Hi Louis-Noël,
thanks for your feedback,

On Mon, Jun 27, 2011 at 8:58 AM, Louis-Noel Pouchet <pou...@cse.ohio-state.edu> wrote:
Hi Cedric,

I had a short look, and here are a few comments (some are typos, some are closer to existential questions ;) )

- In the preliminary example w/ matrix representation: why sometimes using "i/e", sometimes "e", sometimes "ineq" for the comment of the first matrix column?

i/e means it can be either 0 (eq) or 1 (ineq). In matrix representation for scattering and access it can just be 0 (eq).

For the scattering? We cannot put anymore inequalities in scattering if we use the matrix representation??

Mmm, from the start it was not supposed to be possible, as there was a correspondence 1 row <=> 1 scattering dimension (scattering "function"). We can state that the ith row with a 0 in the first column corresponds to the ith scattering dimension (but if we want to do an index-set splitting with i == 0, we must use two inequalities). In my opinion this makes the matrix representation more complex to understand, but I would not fight against it.
 
For the access, since eq means the row is interpreted as ax+by+c = 0, I'm not sure I understand what it means to say the access A[i][j] is {i = 0, j = 0}. I suggest to write 'fixed' or 'unused' in the doc.

OK.
 
I wondered whether I should just put "0", but I thought "e" would be more consistent. It's true there are inconsistencies in the manual, I'll correct them and explain better the i/e stuff.
 
- I am not a big fan of encoding the array reference number on the scalar column. This is making a double use of this column, one to describe the scalar value and one to describe an id, two different things in my view. I like that we have an explicit list of references, and not anymore the big matrix, but I would prefer an attribute in the openscop representation, to get something like:
WRITE  # type
1 # array id
3 5 # row/col
 0 1 0 0 0 # matrix of the access
 0 0 1 0 0 #

Here, I really prefer the way I suggested for a number of reasons:

- I don't think there is a double use of the scalar column: I see the memory as a big array "Mem" and the array name is just one dimension of this array. For instance let's consider A[i] and suppose that the array id of A is 42. This corresponds to an access Mem[42][i]. The array id is one dimension of a memory access, not a particular case. It happens to be scalar, but it's not a problem at all.
Bwarf. Yes it's true, but really, bwarf... and you can achieve strictly the same with the id number before the access function.


- Using scalar dimensions is a beautiful solution to accept very complex memory accesses (not only simple "array accesses"). For instance foo->b[i]->bar.toto[j] is analyzed by Clan by counting the different fields for each reference. Let's suppose "foo" has id 43. "b" has id 1 (it's relative to "foo": maybe clan read, e.g., "foo->a" before and decided that "a" has id 0 relatively to "foo"), "bar" has id 0 (here it's relative to foo->b[]) and "toto" has id "2" (here it's relative to foo->b[]->bar), then it corresponds to an access to Mem[43][1][i][0][2][j]. It's nicely unified and ready for data dependence analysis :-).

Same remark as above.

No. Would you prefer to maintain a list of references (for foo->b[i]->bar.toto[j], foo, b, bar and  toto) and how they interleave with "true" array dimensions ([i][j], but put [i] after b and put [j] after toto) ? Probably not. If you're OK with internal scalar dimensions, there is no difference with the external array identifier. Except that this is consistent and beautiful: Mem[43][1][i][0][2][j] :-) !
- It allows using a relation structure the way it is. The solution you suggested has a problem because the array id would have the place used by the number of unions in the relation (it is optional if there is only one part in the union, see pp 23). Accesses can be unions too. Including an array id external to the constraint matrix would mean adding a new field which would not be useful for anything else.

That seems to be an implementation limitation; if it makes your life easier, let's put the id after the matrix then. But I don't see how reading a value immediately after the access type would be any problem...
 
Sure it would not be a problem. The way I suggest is not a trick to avoid an additional field in the relation structure, I'm really convinced it is more elegant !
What do you think ? And Uday ? That's really about "memory" accesses, not "array" accesses, I think it's pretty cool :-) !

I am clearly unconvinced, because it is confusing. My polyhedral self says I love it, but my compiler self says it's counter-intuitive. A memory reference has an attribute, which is the associated variable symbol. Gluing together the description of the reference and the symbol is overloading the scalar column, which creates the 'confusion'.

Your polyhedral self is right: for data dependence analysis, there is no more need for a special case about the reference identifier. Every dimension is processed the same way, it's really elegant. When you need your symbol id, you use openscop_relation_get_array_id() and you're all set !

If you insist I will live with it, because it's a cosmetic issue and not a semantics issue, but please consider my point carefully.

I do insist. But let's be clear : if I'm the only one to be convinced it is a better way, I'll change it.
- a general comment: I don't think a reader who knows only the prerequisites (affine, matrix, and for loop) can understand how to specify the 6 attributes of a relation ;) You definitely need a few examples, taking the reader by the hand to explain what input, output, existentially quantified variables, etc. are.

OK.
 
- The names are char**. It has proven to be a life-easier for name structures to allow pointers (void*) for the data. Yes of course the user can always write wrappers and map some string to the pointer he wants, but that's not convenient... Contradicting this spirit, it is a requirement for the data to be char* in some tools (cloog for instance, which uses strcmp at a couple of places). So, either you replace char* for names by void*, and add a flag to specify if the data is a char* string or not; or you write in bold triple size that data are strings and cannot be substituted by pointers on random data, because of how the data will be interpreted by other tools (cloning, comparing, etc.).

Well, I did put such a flag. See sec. 3.1.4.3 pp 28 on openscop_names_t: there is the "textual" flag (1 if names are character strings, 0 otherwise). And in the general comment I specified:

---
The term "name" is generic and corresponds to a pointer to the information necessary to generate the code. A name may be a string of characters (char *) or a pointer to anything else. For textual tools convenience, the default type is (char *), but it may be casted to your preferred type iff the textual field is 0.
---


What about the 'iterators' and 'body' pointers in the statement? You must be homogeneous. I have honestly missed the part (hey, I did a very quick read ;) ) about the 'textual' flag in the openscop_names_t, because I focused on openscop_statement_t, but that's not complete as again you must be homogeneous.

Right, I saw that too late but it was planned. I was about to create an openscop_body_t (extracted from openscop_statement_t, plus the textual flag) to put what is related to the body (I have a proof in the "body" branch of OpenScop :-D).
Now, the name comes from the "structure".c file (the script lists the files in the extensions directory and get the names from that, that's why I'm asking for exactly one .c per extension), the registration comes when you ask for a unique id and name. I thought a lot about a plain good old function pointer-based solution. I think the solutions are exactly the same except that mine avoids tons of painful casts when you are using extensions. For unknown extensions, they are just ignored with a warning message.
 
Ced. 

Uday K Bondhugula

Jun 27, 2011, 8:28:11 AM6/27/11
to Cédric Bastoul, Louis-Noel Pouchet, Tobias Grosser, openscop-d...@googlegroups.com, Tristan Vanderbruggen, Thomas Legris, 胡士文, Pop, Sebastian, Sven Verdoolaege

On 06/27/2011 02:42 PM, Cédric Bastoul wrote:
> Hi Louis-Noël,


> Here, I really prefer the way I suggested for a number of reasons:

> - I don't think there is a double use of the scalar column: I see the memory
> as a big array "Mem" and the array name is just one dimension of this array.
> For instance let's consider A[i] and suppose that the array id of A is 42.
> This corresponds to an access Mem[42][i]. The array id is one dimension of a
> memory access, not a particular case. It happens to be scalar, but it's not
> a problem at all.
>
> - Using scalar dimensions is a beautiful solution to accept very complex
> memory accesses (not only simple "array accesses"). For instance
> foo->b[i]->bar.toto[j] is analyzed by Clan by counting the different fields
> for each reference. Let's suppose "foo" has id 43. "b" has id 1 (it's
> relative to "foo": maybe clan read, e.g., "foo->a" before and decided that
> "a" has id 0 relatively to "foo"), "bar" has id 0 (here it's relative to
> foo->b[]) and "toto" has id "2" (here it's relative to foo->b[]->bar), then
> it corresponds to an access to Mem[43][1][i][0][2][j]. It's nicely unified
> and ready for data dependence analysis :-).
>
> - It allows using a relation structure the way it is. The solution you
> suggested has a problem because the array id would have the place used by
> the number of unions in the relation (it is optional if there is only one
> part in the union, see pp 23). Accesses can be unions too. Including an
> array id external to the constraint matrix would mean adding a new field
> which would not be useful for anything else.
>
> What do you think ? And Uday ? That's really about "memory" accesses, not
> "array" accesses, I think it's pretty cool :-) !
>

Yes, it looks very good, but not sufficient/complete. I see the
following issues when it comes to storing information about symbols
used. If you have a hierarchical access such as

s.a[j]

your id's at a lower level could collide with those at upper levels,
given the way you have it. So, in the above, s & a could get the same
id, or id of 'a' could be the same as that of an 'x' in 'x[k]'. Since a
and x are different symbols, we'll need to uniquely identify them,
e.g., to store information in the symbol table. So, one would have to
use the entire vector as key, say, (0) for s, (0,1) for a, and (1) for
x, but this would mean we have a problem for s.a[j] and t.a[j]: symbol
'a' should have a unique entry -- (0,1) for s.a[j] and (2,1) for t.a[j]
will not help.

We have this issue because your memory doesn't map one-to-one to
symbols. So, overall, what you have really helps dependence analysis
but is not sufficient/convenient for storing associated symbol information.
Adding separate symbol information for each access will solve the
problem. Another way (ugly) would be to map symbols to have distinct
id's across all levels.
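
To make the last alternative concrete, here is a minimal sketch of a flat symbol table that gives each distinct name one id across all levels. The names `symtab_intern`, `symtab_names` and `SYMTAB_MAX` are hypothetical and not part of OpenScop or Clan; the sketch also assumes that two fields with the same name are meant to share one entry, which is the behaviour asked for above.

```c
#include <assert.h>
#include <string.h>

#define SYMTAB_MAX 64  /* hypothetical capacity */

/* Hypothetical flat symbol table: one entry per distinct symbol name. */
static const char *symtab_names[SYMTAB_MAX];
static int symtab_count = 0;

/* Return the global id of `name`, creating an entry the first time the
   name is seen. Returns -1 when the table is full. The table stores the
   pointer, so `name` must outlive the table. */
int symtab_intern(const char *name) {
    assert(name != NULL);
    for (int i = 0; i < symtab_count; i++)
        if (strcmp(symtab_names[i], name) == 0)
            return i;               /* already known: reuse its id */
    if (symtab_count == SYMTAB_MAX)
        return -1;
    symtab_names[symtab_count] = name;
    return symtab_count++;
}
```

With ids handed out this way, the 'a' in s.a[j] and the 'a' in t.a[j] resolve to the same entry, so the id found in the access vector can be used directly as the symbol-table key. The price is that ids are no longer relative to the enclosing structure.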

Cédric Bastoul

Jun 27, 2011, 10:17:38 AM6/27/11
to Uday K Bondhugula, Louis-Noel Pouchet, Tobias Grosser, openscop-d...@googlegroups.com, Tristan Vanderbruggen, Thomas Legris, 胡士文, Pop, Sebastian, Sven Verdoolaege


2011/6/27 Uday K Bondhugula <ud...@csa.iisc.ernet.in>
Right, I agree. Here is my proposal :
- base the representation on the one I suggested,
- create a new possibility for the first column of memory accesses : "2" for identifiers. "2" is the same as "0" (it is an equality) but it denotes the dimension corresponds to a symbol (so we encode the difference between A[42] and A->B when the id of B is 42),
- the symbol id will correspond to a unique id in the symbol table.
Does it sound OK ?

Ced.

Louis-Noel Pouchet

Jun 27, 2011, 3:19:58 PM6/27/11
to Cédric Bastoul, openscop-d...@googlegroups.com, Tobias Grosser, B Uday Kumar Reddy, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris
On Jun 27, 2011, at 7:35 AM, Cédric Bastoul wrote:



On Mon, Jun 27, 2011 at 11:48 AM, Louis-Noel Pouchet <pou...@cse.ohio-state.edu> wrote:

On Jun 27, 2011, at 5:12 AM, Cédric Bastoul wrote:

Hi Louis-Noël,
thanks for your feedback,

On Mon, Jun 27, 2011 at 8:58 AM, Louis-Noel Pouchet <pou...@cse.ohio-state.edu> wrote:
Hi Cedric,

I had a short look, and here are a few comments (some are typos, some are closer to existential questions ;) )

- In the preliminary example w/ matrix representation: why sometimes using "i/e", sometimes "e", sometimes "ineq" for the comment of the first matrix column?

i/e means it can be either 0 (eq) or 1 (ineq). In matrix representation for scattering and access it can just be 0 (eq).

For the scattering? We cannot put anymore inequalities in scattering if we use the matrix representation??

Mmm, from the start it was not supposed to be possible as there was a correspondence 1 row <=> 1 scattering dimension (scattering "function"). We can state that the ith row with a 0 in the first column corresponds to the ith scattering dimension (but if we want to do an index-set splitting with i == 0, we must use two inequalities). According to me this makes the matrix representation more complex to understand, but I would not fight against it.

My view is simple: with the current scoplib/matrix representation, I can represent tiling in the scattering. If I am to lose this functionality in openscop/matrix, I want to get a good motivation of why. The fact that the convention differs between matrix and relation looks strange. Why do I have the s columns explicit in one, and implicit in the other?
I suggest that, for both, you look at the number of columns of the scattering dimension. Either it's equal to the domain, and then only equalities are used and the "s1 = ..." part is implicit (this would be true for both matrix and relation representation), or it is different and you are merely unconstrained provided the input dimensions match the domain dimensions.

 
For the access, since eq means the row is interpreted as ax+by+c = 0, I'm not sure I understand what it means to say the access A[i][j] is {i = 0, j = 0}. I suggest to write 'fixed' or 'unused' in the doc.

OK.
 
I wondered whether I should just put "0", but I thought "e" would be more consistent. It's true there are inconsistencies in the manual, I'll correct them and explain better the i/e stuff.
 
- I am not a big fan of encoding the array reference number on the scalar column. This is making a double use of this column, one to describe the scalar value and one to describe an id, two different things in my view. I like that we have an explicit list of references, and not anymore the big matrix, but I would prefer an attribute in the openscop representation, to get something like:
WRITE  # type
1 # array id
3 5 # row/col
 0 1 0 0 0 # matrix of the access
 0 0 1 0 0 #

Here, I really prefer the way I suggested for a number of reasons:

- I don't think there is a double use of the scalar column: I see the memory as a big array "Mem" and the array name is just one dimension of this array. For instance let's consider A[i] and suppose that the array id of A is 42. This corresponds to an access Mem[42][i]. The array id is one dimension of a memory access, not a particular case. It happens to be scalar, but it's not a problem at all.
Bwarf. Yes it's true, but really, bwarf... and you can achieve strictly the same with the id number before the access function.


- Using scalar dimensions is a beautiful solution to accept very complex memory accesses (not only simple "array accesses"). For instance foo->b[i]->bar.toto[j] is analyzed by Clan by counting the different fields for each reference. Let's suppose "foo" has id 43. "b" has id 1 (it's relative to "foo": maybe clan read, e.g., "foo->a" before and decided that "a" has id 0 relatively to "foo"), "bar" has id 0 (here it's relative to foo->b[]) and "toto" has id "2" (here it's relative to foo->b[]->bar), then it corresponds to an access to Mem[43][1][i][0][2][j]. It's nicely unified and ready for data dependence analysis :-).

Same remark as above.

No. Would you prefer to maintain a list of references (for foo->b[i]->bar.toto[j], foo, b, bar and  toto) and how they interleave with "true" array dimensions ([i][j], but put [i] after b and put [j] after toto) ? Probably not.

If you're OK with internal scalar dimensions, there is no difference with the external array identifier.

To be precise, I'm ok with the external id and yes as I said there's no expressiveness difference.

Except that this is consistent and beautiful: Mem[43][1][i][0][2][j] :-) !

Many will just find that heavier.

Anyway, as said I can live with that, it's only cosmetic, so I've just expressed my disagreement on the syntax, but I'm fine with the semantics of the access function part of openscop.
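
For readers following the Mem[42][i] argument, here is a small sketch of what the unified encoding could look like for A[i] with array id 42. The column layout (eq/ineq flag, Arr, dim1, i, constant) and the sign convention (-Arr + 42 = 0) are assumptions made for illustration, and `build_access_A_i` and `access_array_id` are hypothetical helpers, not the library's functions.

```c
#include <assert.h>

/* Fill a 2 x 5 access relation for A[i]: one iterator i, no parameters.
   Assumed columns: eq/ineq | Arr | dim1 | i | constant. */
void build_access_A_i(int array_id, int rel[2][5]) {
    const int rows[2][5] = {
        {0, -1,  0, 0, array_id},  /* -Arr  + array_id = 0, i.e. Arr  = array_id */
        {0,  0, -1, 1, 0},         /* -dim1 + i        = 0, i.e. dim1 = i        */
    };
    for (int r = 0; r < 2; r++)
        for (int c = 0; c < 5; c++)
            rel[r][c] = rows[r][c];
}

/* Recover the array id from the first row, which under the assumed layout
   is always the equality fixing the Arr dimension. */
int access_array_id(int rel[2][5]) {
    assert(rel[0][0] == 0 && rel[0][1] == -1);
    return rel[0][4];
}
```

The id is just the constant of one more equality, so a dependence test can treat it like any other dimension, while a tool that only wants the symbol reads it back from the first row.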

- I'm not sure I understand how extensions are managed from your explanations. All parsing/printing functions are going to be implemented by the user, and we'll have only pointer copies when we clone an openscop_scop_t, right?

No :-). All the eight base functions created by the extension writers are actually called. E.g., to copy, we call *your* openscop_extension_copy() function. They are linked to the core openscop code with a script (launched during configure), that's why naming conventions are so strict. Just ask for a unique number and a name, write your extension.h and your extension.c respecting the naming conventions for the eight base functions and it should compile and work out of the box :-).

Mmm. Ok. I don't understand how you get the name, and register it such that the driver calls it, etc.
What about a plain good old function pointer?


Also, it looks strange to have the extension implementation part of the openscop lib (unless I missed something again). That will mean many people playing with different libscoplib files, it will add difficulty to maintaining scoplib when bug reports will be filed.
I would suggest more something along those lines then:
provide a function ptr structure, like this:
typedef void (*ptrfun)(void*);
typedef void (*ptrfunprint)(FILE*, void*);
typedef void (*ptrfunread)(FILE*, void*);
typedef void* (*ptrfunclone)(void*);
struct openscop_extension_functions {
   int id;
   ptrfun freefun; // free function, free 1 instance of openscop_extension_t
   ptrfunprint printfun; // FILE print function called
   ptrfunread readfun; // FILE read function called
   ptrfunclone clonefun; // cloning function called
};


Then, you have a "register" function in openscop, that will store an instance of this struct to its memory, and each time an extension of id 'id' is seen those functions are used.
The user can then wrap openscop in its own library, or link against both modules, genuine openscop and his personal extension module.
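
A minimal sketch of the register/lookup pair described above, to show that nothing needs to be recompiled on the library side. It reuses the typedefs and the struct from this mail; `ext_register`, `ext_lookup`, `ext_table` and `EXT_MAX` are hypothetical names, and the fixed-size table is only there to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef void (*ptrfun)(void*);
typedef void (*ptrfunprint)(FILE*, void*);
typedef void (*ptrfunread)(FILE*, void*);
typedef void* (*ptrfunclone)(void*);

struct openscop_extension_functions {
    int id;
    ptrfun freefun;        /* frees one extension instance */
    ptrfunprint printfun;  /* prints an instance to a FILE */
    ptrfunread readfun;    /* reads an instance from a FILE */
    ptrfunclone clonefun;  /* deep-copies an instance */
};

#define EXT_MAX 32  /* hypothetical capacity */
static struct openscop_extension_functions ext_table[EXT_MAX];
static size_t ext_count = 0;

/* Return the callbacks registered for `id`, or NULL if the extension is
   unknown (the caller would then emit a warning and skip it). */
const struct openscop_extension_functions *ext_lookup(int id) {
    for (size_t i = 0; i < ext_count; i++)
        if (ext_table[i].id == id)
            return &ext_table[i];
    return NULL;
}

/* Store a copy of the callbacks. Returns 0 on success, -1 if the table
   is full or the id is already taken. */
int ext_register(const struct openscop_extension_functions *f) {
    assert(f != NULL);
    if (ext_count == EXT_MAX || ext_lookup(f->id) != NULL)
        return -1;
    ext_table[ext_count++] = *f;
    return 0;
}
```

A user module would fill one struct per extension and call `ext_register` once at start-up; the parser then calls `ext_lookup` each time it meets an extension id in a file.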

Now, the name comes from the "structure".c file (the script lists the files in the extensions directory and get the names from that, that's why I'm asking for exactly one .c per extension), the registration comes when you ask for a unique id and name.

So, it implies recompiling the openscop library if we add an extension, right? Again, that will lead to numerous versions of the library; I don't think this is the clean approach.

I thought a lot about a plain good old function pointer-based solution. I think the solutions are exactly the same

I clearly disagree: function pointers don't require recompiling openscop when we add an extension...

except that mine avoids tons of painful casts when you are using extensions.

The only casts I see are to promote the void* argument for clone, print and free functions into the type that the _user_ defined for himself in its extension. 3 casts total, I think it's quite ok.

For unknown extensions, they are just ignored with a warning message.

Equivalently, for extensions which have not been registered (there's no  openscop_extension_functions instance with the matching id), there's a warning.

Also, I do strongly believe you should provide a generic 'char' extension, which stores a string for the data, and that is managed. This way, we'll be able to seamlessly integrate existing text-based extensions.


++

-- 
Louis-Noel Pouchet

Louis-Noel Pouchet

Jun 27, 2011, 3:27:28 PM6/27/11
to Cédric Bastoul, Uday K Bondhugula, Tobias Grosser, openscop-d...@googlegroups.com, Tristan Vanderbruggen, Thomas Legris, 胡士文, Pop, Sebastian, Sven Verdoolaege
You're this time putting new semantics to a reserved column, in a clearly not intuitive way... If we now need to read 2 elements just to know what the reference is about, that clearly becomes worse than id before the matrix.
For me, the difference between a[42] and a->b is given by the type of a in the symbol table.

++


-- 
Louis-Noel Pouchet

Cédric Bastoul

Jun 27, 2011, 5:36:44 PM6/27/11
to Louis-Noel Pouchet, openscop-d...@googlegroups.com, Tobias Grosser, B Uday Kumar Reddy, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris
On Mon, Jun 27, 2011 at 9:19 PM, Louis-Noel Pouchet <pou...@cse.ohio-state.edu> wrote:

On Jun 27, 2011, at 7:35 AM, Cédric Bastoul wrote:



On Mon, Jun 27, 2011 at 11:48 AM, Louis-Noel Pouchet <pou...@cse.ohio-state.edu> wrote:

On Jun 27, 2011, at 5:12 AM, Cédric Bastoul wrote:

Hi Louis-Noël,
thanks for your feedback,

On Mon, Jun 27, 2011 at 8:58 AM, Louis-Noel Pouchet <pou...@cse.ohio-state.edu> wrote:
Hi Cedric,

I had a short look, and here are a few comments (some are typos, some are closer to existential questions ;) )

- In the preliminary example w/ matrix representation: why sometimes using "i/e", sometimes "e", sometimes "ineq" for the comment of the first matrix column?

i/e means it can be either 0 (eq) or 1 (ineq). In matrix representation for scattering and access it can just be 0 (eq).

For the scattering? We cannot put anymore inequalities in scattering if we use the matrix representation??

Mmm, from the start it was not supposed to be possible as there was a correspondence 1 row <=> 1 scattering dimension (scattering "function"). We can state that the ith row with a 0 in the first column corresponds to the ith scattering dimension (but if we want to do an index-set splitting with i == 0, we must use two inequalities). According to me this makes the matrix representation more complex to understand, but I would not fight against it.

My view is simple: with the current scoplib/matrix representation, I can represent tiling in the scattering. If I am to lose this functionality in openscop/matrix, I want to get a good motivation of why. The fact that the convention differs between matrix and relation looks strange. Why do I have the s columns explicit in one, and implicit in the other?
 
I suggest that, for both, you look at the number of columns of the scattering dimension. Either it's equal to the domain, and then only equalities are used and the "s1 = ..." part is implicit (this would be true for both matrix and relation representation), or it is different and you are merely unconstrained provided the input dimensions match the domain dimensions.

Well, the implicit/explicit s columns for the scattering (and "array" dimensions for the accesses) are the point of having two representations (along with existential quantifiers) : a basic one and a complete one.

I don't understand your view of a matrix representation... Can you explain what are your rules in the scoplib/matrix representation ?

My rules on matrix representation are based on Clan's documentation (still provided with ScopLib), and when I did it, it was clearly stated in the documentation that the first column of scattering functions must be set to zero and that the "useless and error-prone" (sic) identity matrix disappeared.

Counting the number of columns in matrix representation (no explicit s columns) does not work : it's not possible to distinguish between "i = 0" and "s = i" if you accept non-scattering-function rows. To achieve tiling we should use a relation representation. Maybe it's too confusing to embed two representations in OpenScop, probably we should just support relations...
Yes. I think I'll save some time by asking you directly for some details. Anyway I'm open to this solution.
 
Again, that will lead to numerous versions of the library, I don't think this is the clean approach.
I thought a lot about a plain good old function pointer-based solution. I think the solutions are exactly the same

I clearly disagree: function pointers don't require recompiling openscop when we add an extension...

except that mine avoids tons of painful casts when you are using extensions.

The only casts I see are to promote the void* argument for clone, print and free functions into the type that the _user_ defined for himself in its extension. 3 casts total, I think it's quite ok.

I see one cast every time the user wants to use his own functions (there are eight functions in the interface). I agree it's not a big deal (except for 80 columns programming...).
For unknown extensions, they are just ignored with a warning message.

Equivalently, for extensions which have not been registered (there's no  openscop_extension_functions instance with the matching id), there's a warning.

Also, I do strongly believe you should provide a generic 'char' extension, which stores a string for the data, and that is managed. This way, we'll be able to seamlessly integrate existing text-based extensions.

Yes, I planned to do it, there's even a proof in the code as a reserved type OPENSCOP_EXTENSION_STRING !


Right, I agree. Here is my proposal :
- base the representation on the one I suggested,
- create a new possibility for the first column of memory accesses : "2" for identifiers. "2" is the same as "0" (it is an equality) but it denotes the dimension corresponds to a symbol (so we encode the difference between A[42] and A->B when the id of B is 42),
- the symbol id will correspond to a unique id in the symbol table.
Does it sound OK ?

You're this time putting new semantics to a reserved column, in a clearly not intuitive way... If we now need to read 2 elements just to know what the reference is about, that clearly becomes worse than id before the matrix.
For me, the difference between a[42] and a->b is given by the type of a in the symbol table.
 
I can obviously remove the second point.

Ced.

Tobias Grosser

Jun 27, 2011, 10:59:24 PM6/27/11
to Cédric Bastoul, Louis-Noel Pouchet, openscop-d...@googlegroups.com, B Uday Kumar Reddy, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris
On 06/27/2011 05:36 PM, Cédric Bastoul wrote:
>
>
> On Mon, Jun 27, 2011 at 9:19 PM, Louis-Noel Pouchet
> <pou...@cse.ohio-state.edu> wrote:
>
>
> On Jun 27, 2011, at 7:35 AM, Cédric Bastoul wrote:
>
>>
>>
>> On Mon, Jun 27, 2011 at 11:48 AM, Louis-Noel Pouchet
>> <pou...@cse.ohio-state.edu>

>> wrote:
>>
>>
>> On Jun 27, 2011, at 5:12 AM, Cédric Bastoul wrote:
>>
>>> Hi Louis-Noël,
>>> thanks for your feedback,
>>>
>>> On Mon, Jun 27, 2011 at 8:58 AM, Louis-Noel Pouchet
>>> <pou...@cse.ohio-state.edu

I am still working on my reply, but this was basically my main concern.
I believe the two representations complicate the understanding of OpenScop
and also require additional complexity on the tools supporting openscop.
I strongly support the removal of the matrix representation from the
core specification (Especially as even files that use the matrix
representation are incompatible with the scoplib files).
If needed, we may add a tool and library support that translates an old
scoplib file to an openscop file.

Cheers
Tobi

Tobias Grosser

Jun 27, 2011, 11:00:05 PM6/27/11
to Cédric Bastoul, openscop-d...@googlegroups.com, Louis-Noel Pouchet, B Uday Kumar Reddy, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris
On 06/24/2011 09:56 PM, Cédric Bastoul wrote:
> Hi @ all,
> it's one year late, but here is the "request for comments" version of
> OpenScop !

Hi Cedric,

thanks for getting this out. I looked into this and believe it will
definitely be a big step ahead.

Some comments:

- There are some typos in the documentation (Patch sent in separate mail).

- 3.1.3.1 Relation Representation

> • CONTEXT: for context information,
> • DOMAIN: for iteration domains,
> • SCATTERING: for scattering relation,
> • READ: for read accesses,
> • WRITE: for write accesses,
> • RDWR: for read/write accesses,
> • MAY_READ: for may-read accesses,
> • MAY_WRITE: for may-write accesses,
> • MAY_RDWR: for may-read/write accesses.

I obviously like the introduction of MAY_WRITE accesses, but wonder if
we need RDWR, MAY_READ, MAY_RDWR accesses? Those seem redundant to me.

RDWR can be represented as a READ followed by a WRITE
MAY_READ can be represented by a READ (that is thrown away)
MAY_RDWR can be represented by a READ (that is thrown away) followed by
a MAY_WRITE.


- 3.1.4.6 openscop scop t
> struct openscop_scop {
> int version; /* Version of the data structure */
> char * language; /* Target language (backend) */

Do we really need to encode an option that is only needed for code
generation in the OpenScop specification? I believe this should be a
CLooG flag or an extension.

- Matrix/Relation representation?

As far as I understand the proposal for OpenScop defines a coexistence
of matrix and relation representation. I have some problems with this:

1. The textual representation is unclear

> 3. A line with 2 (matrix representation) or 6 (relation
> representation) numbers, possibly followed by comments:
> 1. the number of rows of the constraint matrix,
> 2. the number of columns of the constraint matrix,
> 3. the number of output dimensions,
> 4. the number of input dimension,
> 5. the number of local dimensions (existentially quantified
> dimensions),
> 6. the number of parameters.
> The first two numbers are mandatory. They can be provided alone when
> using the simplified matrix representation. The last four numbers are
> optional in the matrix representation (however, the number of local
> dimensions must be -1) and mandatory when using the
> relation representation.

Setting the local dimensions to -1 to specify that a relation should be
interpreted as a matrix is a hack for me. The number of local dimensions
is 0 and anything else is wrong (or at least confusing). I believe, if
we want to distinguish between matrix and relation representation, we
should define that a matrix uses a line with two numbers and a relation
a line with 6 numbers. (This still leaves the data structure side as a
problem.)

To solve the data structure problem we may e.g. use a boolean to define
if a relation should be interpreted and printed as a relation. However,
this is just part of a bigger problem. As mentioned in the other email
the two representations lead to confusion at several places and
complicate the integration of openscop in new tools and libraries. What is
the main reason for having the matrix representation? Backward
compatibility or to simplify user input? Can this be solved in a way
that is transparent to the openscop library users and that does not
require two representations in openscop?

Cheers
Tobi

Louis-Noel Pouchet

Jun 27, 2011, 11:07:56 PM6/27/11
to Tobias Grosser, Cédric Bastoul, openscop-d...@googlegroups.com, B Uday Kumar Reddy, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris
Tobias et al: I would be fine with removing the matrix support from openscop, and having a tool that converts openscop to scoplib when possible. I think Tobias has one already. For me, the only valid argument for keeping the matrix representation is teaching, since teaching relations is a little bit harder than plain old affine relations, trust my experience on that ;)

However, if at no cost we can support both, it's better I agree.

So, please proceed as you guys wish, I don't feel strongly on that.

++

--
Louis-Noel Pouchet
pou...@cse.ohio-state.edu

Louis-Noel Pouchet

Jun 27, 2011, 11:23:47 PM6/27/11
to Cédric Bastoul, openscop-d...@googlegroups.com, Tobias Grosser, B Uday Kumar Reddy, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris
On Jun 27, 2011, at 5:36 PM, Cédric Bastoul wrote:



On Mon, Jun 27, 2011 at 9:19 PM, Louis-Noel Pouchet <pou...@cse.ohio-state.edu> wrote:

On Jun 27, 2011, at 7:35 AM, Cédric Bastoul wrote:



On Mon, Jun 27, 2011 at 11:48 AM, Louis-Noel Pouchet <pou...@cse.ohio-state.edu> wrote:

On Jun 27, 2011, at 5:12 AM, Cédric Bastoul wrote:

Hi Louis-Noël,
thanks for your feedback,

On Mon, Jun 27, 2011 at 8:58 AM, Louis-Noel Pouchet <pou...@cse.ohio-state.edu> wrote:
Hi Cedric,

I had a short look, and here are a few comments (some are typos, some are closer to existential questions ;) )

- In the preliminary example w/ matrix representation: why sometimes using "i/e", sometimes "e", sometimes "ineq" for the comment of the first matrix column?

i/e means it can be either 0 (eq) or 1 (ineq). In matrix representation for scattering and access it can just be 0 (eq).

For the scattering? We cannot put anymore inequalities in scattering if we use the matrix representation??

Mmm, from the start it was not supposed to be possible as there was a correspondence 1 row <=> 1 scattering dimension (scattering "function"). We can state that the ith row with a 0 in the first column corresponds to the ith scattering dimension (but if we want to do an index-set splitting with i == 0, we must use two inequalities). According to me this makes the matrix representation more complex to understand, but I would not fight against it.

My view is simple: with the current scoplib/matrix representation, I can represent tiling in the scattering. If I am to lose this functionality in openscop/matrix, I want to get a good motivation of why. The fact that the convention differs between matrix and relation looks strange. Why do I have the s columns explicit in one, and implicit in the other?
 
I suggest that, for both, you look at the number of columns of the scattering dimension. Either it's equal to the domain, and then only equalities are used and the "s1 = ..." part is implicit (this would be true for both matrix and relation representation), or it is different and you are merely unconstrained provided the input dimensions match the domain dimensions.

Well, the implicit/explicit s columns for the scattering (and "array" dimensions for the accesses) are the point of having two representations (along with existential quantifiers) : a basic one and a complete one.

I don't understand your view of a matrix representation... Can you explain what are your rules in the scoplib/matrix representation ?

If the scattering column number matches the domain column number, AND all eq/ineq bits are set to 0, then the scattering 
# eq/ineq i j 1
0 0 0 0
0 1 0 0
0 0 0 0
0 0 1 0
is interpreted as c1 = 0, c2 = i, c3 = 0, c4 = j. By "interpreted", I mean I automatically add the extra columns for c1...c4 in the matrix before passing it to cloog.
There is no constraint on the number of rows.

If however any of the above constraints is not met, the scattering is fully defined by the user, thus there must be explicit equalities/inequalities between extra columns for the scattering dimensions and the "domain" (i.e., input) dimensions.
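
To make the rule above concrete, here is a minimal C sketch of the "interpretation" step, i.e. adding the implicit c1...cn columns to a scattering given in the short form. The function name and the sign convention (-c_k + f_k = 0) are assumptions made for illustration; the thread does not say what the actual tool emits, and the caller is assumed to have already checked that the column count matches the domain.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not part of ScopLib or the OpenScop Library.
 * `in` is a row-major rows x cols matrix laid out as
 *   [eq/ineq | iterators (and parameters) | constant].
 * Returns a freshly allocated row-major rows x (cols + rows) matrix:
 *   [eq/ineq | c1..c_rows | iterators (and parameters) | constant],
 * where row k encodes the equality  -c_k + f_k(iterators) = 0.
 * Returns NULL if some eq/ineq entry is non-zero (the scattering is then
 * already fully user-defined) or if allocation fails. Caller frees. */
long *expand_implicit_scattering(const long *in, size_t rows, size_t cols)
{
    assert(in != NULL && rows > 0 && cols >= 2);

    for (size_t r = 0; r < rows; r++)
        if (in[r * cols] != 0)
            return NULL;

    size_t out_cols = cols + rows;
    long *out = calloc(rows * out_cols, sizeof *out);
    if (out == NULL)
        return NULL;

    for (size_t r = 0; r < rows; r++) {
        long *row = out + r * out_cols;
        row[0] = 0;        /* equality */
        row[1 + r] = -1;   /* coefficient of the (r+1)-th scattering dimension */
        for (size_t c = 1; c < cols; c++)
            row[rows + c] = in[r * cols + c];   /* copy f_k unchanged */
    }
    return out;
}
```

With the 4 x 4 example above, the result is a 4 x 8 matrix whose second row reads 0 0 -1 0 0 1 0 0, i.e. -c2 + i = 0.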



My rules on matrix representation are based on Clan's documentation (still provided with ScopLib), and when I did it, it was clearly stated in the documentation that the first column of scattering functions must be set to zero

I have departed from that, to allow tiling in scatterings.

and that the "useless and error-prone" (sic) identity matrix disappeared.

I don't know what it means.


Counting the number of columns in matrix representation (no explicit s columns) does not work : it's not possible to distinguish between "i = 0" and "s = i" if you accept non-scattering-function rows.

See above.

To achieve tiling we should use a relation representation.

It is not mandatory: I've lived without that for a long time now...


Maybe it's too confusing to embed two representations in OpenScop, probably we should just support relations...

I think Tobias seconds that ;) and as I said, I don't really mind. If however we keep the matrix representation, I do want us to be able to encode inequalities in the scattering matrix.
Can you tell me what are the 8 ones? I see reading from a file and printing to a file, cloning, allocating and freeing. That's 5.

Also, void* is "magical": if you pass a struct* type to a function with a void* argument, no warning as the cast is not needed. Same thing for a function that returns void* and of which you assign the result to a struct*.

So, when implementing those functions, yes the user needs a cast to convert the argument to the actual type. But when he invokes those functions he doesn't.

Another approach is to let the user declare functions like "struct myextension* blabla()" and have an openscop macro to cast "blabla" into a "void* fun" when registering the method.

The bottom line is that we're trying to do C++ in C (exactly inheritance and virtual functions here). So, it will never be totally clean. At least, with a function pointer based mechanism and a registering mechanism, you achieve exactly those functionalities, at little syntactic cost.
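
As a rough sketch of the function-pointer plus registration idea (all names here are hypothetical and this is not the OpenScop Library's actual interface), an extension could register a small table of void*-based callbacks and be looked up by its identifier:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical table of per-extension callbacks; illustrative only,
 * not the layout used by the OpenScop Library. */
struct ext_functions {
    const char *id;                    /* identifier found in the file */
    void *(*clone)(const void *ext);   /* deep copy of an extension */
    void (*release)(void *ext);        /* free an extension */
};

#define EXT_MAX 16
static const struct ext_functions *ext_registry[EXT_MAX];
static size_t ext_count = 0;

/* Returns 0 on success, -1 if the registry is full. */
int ext_register(const struct ext_functions *f)
{
    assert(f != NULL && f->id != NULL);
    if (ext_count == EXT_MAX)
        return -1;
    ext_registry[ext_count++] = f;
    return 0;
}

/* Returns NULL for an unregistered id; a reader would warn and skip it. */
const struct ext_functions *ext_lookup(const char *id)
{
    assert(id != NULL);
    for (size_t i = 0; i < ext_count; i++)
        if (strcmp(ext_registry[i]->id, id) == 0)
            return ext_registry[i];
    return NULL;
}

/* A user-defined extension: a plain comment string. */
struct comment_ext { char *text; };

void *comment_clone(const void *ext)
{
    const struct comment_ext *src = ext;   /* void* converts implicitly in C */
    struct comment_ext *dst = malloc(sizeof *dst);
    if (dst == NULL)
        return NULL;
    dst->text = malloc(strlen(src->text) + 1);
    if (dst->text == NULL) {
        free(dst);
        return NULL;
    }
    strcpy(dst->text, src->text);
    return dst;
}

void comment_release(void *ext)
{
    struct comment_ext *c = ext;
    if (c != NULL) {
        free(c->text);
        free(c);
    }
}

const struct ext_functions comment_functions = {
    "comment", comment_clone, comment_release
};
```

On the caller side, `struct comment_ext *copy = f->clone(&original);` compiles without a cast, which is the point made above about void*; only the callback bodies convert the argument to the concrete type.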


I agree it's not a big deal (except for 80 columns programming...).
For unknown extensions, they are just ignored with a warning message.

Equivalently, for extensions which have not been registered (there's no  openscop_extension_functions instance with the matching id), there's a warning.

Also, I do strongly believe you should provide a generic 'char' extension, which stores a string for the data, and that is managed. This way, we'll be able to seamlessly integrate existing text-based extensions.

Yes, I planned to do it, there's even a proof in the code as a reserved type OPENSCOP_EXTENSION_STRING !

Ok, great.

++

-- 
Louis-Noel Pouchet

Uday K Bondhugula

Jun 28, 2011, 3:22:37 AM6/28/11
to Cédric Bastoul, Louis-Noel Pouchet, Tobias Grosser, openscop-d...@googlegroups.com, Tristan Vanderbruggen, Thomas Legris, 胡士文, Pop, Sebastian, Sven Verdoolaege

>> the way you have it. So, in the above, s & a could get the same id, or id of
>> 'a' could be the same as that of an 'x' in 'x[k]'. Since a and x are
>> different symbols, we'll need to uniquely identify them, e.g., to store
>> information in the symbol table. So, one would have to use the entire vector
>> as key, say, (0) for s, (0,1) for a, and (1) for x, but this would mean we
>> have a problem for s.a[j] and t.a[j]: symbol 'a' should have a unique entry
>> -- (0,1) for s.a[j] and (2,1) for t.a[j] will not help.
>>
>> We have this issue because your memory doesn't map one-to-one to symbols.
>> So, overall, whatever you have really helps dependence analysis but is not
>> sufficient/convenient to store associated symbol information. Adding
>> separate symbol information for each access will solve the problem. Another
>> way (ugly) would be to map symbols to have distinct id's across all levels.
>
>
> Right, I agree. Here is my proposal :
> - base the representation on the one I suggested,
> - create a new possibility for the first column of memory accesses : "2" for
> identifiers. "2" is the same as "0" (it is an equality) but it denotes the
> dimension corresponds to a symbol (so we encode the difference between A[42]
> and A->B when the id of B is 42),
> - the symbol id will correspond to a unique id in the symbol table.
> Does it sound OK ?

Cédric,

Can you elaborate? This doesn't look clean (also as L-N says). If we
take a fresh view on what 'data access' info should be provided, there
are two things: (1) information about memory being accessed (you have
this very well), (2) symbol being accessed. The latter is so that one
can store and look up things like data type, extents of array
dimensions, etc. If we have a[i].b, the symbol involved is actually b.
And if we have c[i].b (a and c of the same type), it's the same symbol
and the same id will be put. So, can we have a separate row to put the
symbol id instead of encoding it in the memory matrix? ... so that the
access info has two components: memory matrix and the symbol id.

# Read access
3 5 # read matrix
... # a
... # [i]
... # b
# symbol id
44 # symbol id of b
...
# Read access
3 5 # read matrix
... # c
... # [i]
... # b
# symbol id
44 # symbol id of b
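
A minimal C sketch of this two-component access (memory matrix plus a separate symbol id), with a linear symbol-table lookup. The struct and function names are hypothetical, and the fields stored per symbol are only examples of the "data type, extents" information mentioned above:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical symbol table entry; fields are illustrative. */
struct symbol {
    int id;             /* unique id in the symbol table */
    const char *name;   /* e.g. the field name b */
    const char *type;   /* e.g. a data type string */
};

/* An access as proposed: the memory matrix is kept as is for dependence
 * analysis, and the symbol id is stored next to it, not inside it. */
struct access {
    const long *matrix;   /* row-major memory matrix (may be shared) */
    size_t rows, cols;
    int symbol_id;        /* id of the symbol actually accessed */
};

/* Returns the entry with the given id, or NULL if it is unknown. */
const struct symbol *symbol_lookup(const struct symbol *table, size_t n, int id)
{
    assert(table != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (table[i].id == id)
            return &table[i];
    return NULL;
}
```

With this layout, the accesses a[i].b and c[i].b have different memory matrices but both carry symbol_id 44, so they resolve to the same table entry.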


Thanks,

Uday K Bondhugula

Jun 28, 2011, 6:47:43 AM6/28/11
to Louis-Noel Pouchet, Cédric Bastoul, openscop-d...@googlegroups.com, Tobias Grosser, Thomas Legris

On 06/28/2011 08:53 AM, Louis-Noel Pouchet wrote:
>
> On Jun 27, 2011, at 5:36 PM, Cédric Bastoul wrote:
>
>>
>>
>> On Mon, Jun 27, 2011 at 9:19 PM, Louis-Noel Pouchet<pou...@cse.ohio-state.edu> wrote:
>>
>> On Jun 27, 2011, at 7:35 AM, Cédric Bastoul wrote:
>>
>>>
>>>
>>> On Mon, Jun 27, 2011 at 11:48 AM, Louis-Noel Pouchet<pou...@cse.ohio-state.edu> wrote:
>>>
>>> On Jun 27, 2011, at 5:12 AM, Cédric Bastoul wrote:
>>>
>>>> Hi Louis-Noël,
>>>> thanks for your feedback,
>>>>
>>>> On Mon, Jun 27, 2011 at 8:58 AM, Louis-Noel Pouchet<pou...@cse.ohio-state.edu> wrote:
>>>> Hi Cedric,
>>>>
>>>> I had a short look, and here are a few comments (some are typos, some are closer to existential questions ;) )
>>>>
>>>> - In the preliminary example w/ matrix representation: why sometimes using "i/e", sometimes "e", sometimes "ineq" for the comment of the first matrix column?
>>>>
>>>> i/e means it can be either 0 (eq) or 1 (ineq). In matrix representation for scattering and access it can just be 0 (eq).
>>>
>>> For the scattering? We cannot put anymore inequalities in scattering if we use the matrix representation??
>>>
>>> Mmm, from the start it was not supposed to be possible as there was a correspondence 1 row<=> 1 scattering dimension (scattering "function"). We can state that the ith row with a 0 in the first column corresponds to the ith scattering dimension (but if we want to do an index-set splitting with i == 0, we must use two inequalities). According to me this makes the matrix representation more complex to understand, but I would not fight against it.
>>
>> My view is simple: with the current scoplib/matrix representation, I can represent tiling in the scattering. If I am to lose this functionality in openscop/matrix, I want to get a good motivation of why. The fact that the convention differs between matrix and relation looks strange. Why do I have the s columns explicit in one, and implicit in the other?
>>
>> I suggest that, for both, you look at the number of columns of the scattering dimension. Either it's equal to the domain, and then only equalities are used and the "s1 = ..." part is implicit (this would be true for both matrix and relation representation), or it is different and you are merely unconstrained provided the input dimensions match the domain dimensions.
>>
>> Well, the implicit/explicit s columns for the scattering (and "array" dimensions for the accesses) are the point of having two representations (along with existential quantifiers): a basic one and a complete one.
>>
>> I don't understand your view of a matrix representation... Can you explain what are your rules in the scoplib/matrix representation ?


The manual shows data structures only for relation representation. So,
the library users have to only deal with the relational representation?

Thanks,
-- Uday

Tobias Grosser

Jun 28, 2011, 7:34:01 AM6/28/11
to Uday K Bondhugula, Louis-Noel Pouchet, Cédric Bastoul, openscop-d...@googlegroups.com, Thomas Legris

That's what I hoped before, but as far as I understood the library user
needs to check if all 'local dimensions' of a union are set to -1. In
this case the library user has to interpret the content of the
openscop_relation_t as a matrix representation and must e.g. interpret
the scattering following the semantics described for scattering in the
matrix representation. The printing also relies completely on the
content of openscop_relation_t. This means the library user must put the
content following the semantics of either a relation or a function and
must set 'local dimensions' to -1 (OPENSCOP_UNDEFINED) to make the
printing print a matrix style matrix.

I believe this complicates the use a lot. The library user should always
handle an openscop_relation_t as a relation. We may add syntactic sugar
to simplify the writing of matrices, but this should be completely
transparent for the library user.

Cheers
Tobi

Tobias Grosser

Jun 28, 2011, 7:34:37 AM6/28/11
to Louis-Noel Pouchet, Cédric Bastoul, openscop-d...@googlegroups.com, B Uday Kumar Reddy, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris
On 06/27/2011 11:23 PM, Louis-Noel Pouchet wrote:
>
> If the scattering column number matches the domain column number, AND
> all eq/ineq bits are set to 0, then the scattering
> # eq/ineq i j 1
> 0 0 0 0
> 0 1 0 0
> 0 0 0 0
> 0 0 1 0
> is interpreted as c1 = 0, c2 = i, c3 = 0, c4 = j. By "interpreted", I
> mean I automatically add the extra columns for c1...c4 in the matrix
> before passing it to cloog.
> There is no constraint on the number of rows.
>
> If however any of the above constraint is not met, the scattering is
> fully defined by the user, thus there must be explicit
> equalities/inequalities between extra columns for the scattering
> dimensions and the "domain" (ie, input) dimensions.

We may integrate this as some kind of syntactic sugar in the openscop
reader, but I would strongly prefer to automatically translate it into a
relational representation so that this is transparent for the openscop
library user (just as Louis-Noel automatically translates it for CLooG).

Cédric Bastoul

Jul 6, 2011, 11:44:02 AM7/6/11
to Tobias Grosser, Louis-Noel Pouchet, openscop-d...@googlegroups.com, B Uday Kumar Reddy, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris
Hi @all,
I was out of development for one week but now I can spend some time on it. Let me do a recap and let you know about my plans (still open for discussion):

------------------
Big things
------------------
- There is a consensus to remove the "matrix representation" because it is confusing.

=> My plan: I'll remove it. As suggested by Tobias, we may leave some syntactic sugar in the openscop textual reader to allow a simplified input, but internally everything would be about relations.

- There has been some discussion about the access representation : (1) encoding the array id as a dimension is debatable and (2) the representation is not enough to store associated symbol information.

=> My plan: leave it as it is in my first proposal. For (1) because I do really think it's better: (i) there is no need for a special relation for accesses (the OpenScop format is simpler), (ii) it is directly usable for data dependence analysis (polyhedrically sound), (iii) it is consistent for complex memory accesses (which would require scalar dimensions anyway), and (iv) extracting the array id is still pretty easy (and there's a function for this). For (2) I don't think there is a simple way to make everyone happy (in our discussions Louis-Noël wants the external array id, e.g., A in A[i]->B[j], while Uday wants the internal one, e.g., B in A[i]->B[j]), plus it is not necessary information for a basic parallelizer, so an extension should be used (forget about my cryptic suggestion in a previous mail).

- It may be better to use void* for extensions (Louis-Noël's proposal, see his mails in the thread) rather than some generative programming (my proposal, see section 4.5.4 in the documentation, it's just one page).

=> My plan: I will evaluate this. Actually I did not know that there was no warning when assigning void* to struct* and conversely. Other opinions are welcome.

- We want a symbol table.

=> My plan: it will be an extension anyway so it's not as urgent as the first points (they are blocking for OpenScop and Clan releases). We may discuss it later or separately.
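
To illustrate point (iv) above, here is a hedged C sketch of what "extracting the array id" could look like when the id is encoded as a dimension of the access relation. The column layout and the function name are assumptions made for this example (the id is taken to be the first output dimension, fixed by an equality such as -Arr + id = 0); this is not the library function mentioned in the mail.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout (illustrative): row-major rows x cols matrix with columns
 *   [eq/ineq | Arr | other dimensions and parameters ... | constant],
 * where Arr is the dimension carrying the array identifier.
 * Looks for an equality row that involves only Arr and the constant and
 * solves it for Arr. Returns -1 if no such row exists (identifiers are
 * assumed to be positive). */
long access_array_id(const long *m, size_t rows, size_t cols)
{
    assert(m != NULL && cols >= 3);

    for (size_t r = 0; r < rows; r++) {
        const long *row = m + r * cols;

        /* Need an equality with a unit coefficient on Arr. */
        if (row[0] != 0 || (row[1] != 1 && row[1] != -1))
            continue;

        int only_constant = 1;
        for (size_t c = 2; c + 1 < cols; c++) {
            if (row[c] != 0) {
                only_constant = 0;
                break;
            }
        }
        if (only_constant)   /* row[1] * Arr + constant = 0 */
            return row[1] == -1 ? row[cols - 1] : -row[cols - 1];
    }
    return -1;
}
```

For an access A[i+1] with id 3, iterator i and parameter N, the rows "0 -1 0 0 0 3" and "0 0 -1 1 0 1" would give 3.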

------------------
Small things (I just agree, no discussion here)
------------------
- Better explain each relation attribute with examples, especially for existentially quantified dimensions.
- Add a void* usr; field in the openscop_scop_t structure.
- Create a "body" structure referring explicitly to either void* or char* elements.
- Restrict to READ, WRITE and MAY_WRITE access.
- Add an extension containing the complete extension textual string.

Let me know if you have some strong feeling about this !
Corrected proposal for the end of the week or before.
Cheers,

Cedric

Cédric Bastoul

Jul 6, 2011, 11:58:43 AM7/6/11
to Tobias Grosser, openscop-d...@googlegroups.com, Louis-Noel Pouchet, B Uday Kumar Reddy, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris
On Tue, Jun 28, 2011 at 5:00 AM, Tobias Grosser <gro...@fim.uni-passau.de> wrote:

- 3.1.4.6 openscop_scop_t
struct openscop_scop {
int version; /* Version of the data structure */
char * language; /* Target language (backend) */

Do we really need to encode an option that is only needed for code generation in the OpenScop specification? I believe this should be a CLooG flag or an extension.

Yes. That's because of rule 1 of OpenScop, i.e., "embed the minimum information to build a complete polyhedral compilation framework in the core part". The idea is, from a complete core part, we should be able to regenerate something (at least source to source with identity transformation). So we need to know the target language.

I realized at some point that I was not strict enough with this rule as I allowed to provide the scattering and the iteration domain dimension names (which could be generated by the code generator). Only the parameters names are necessary. I should probably remove them. Should I ?

Sebastian Pop

Jul 6, 2011, 12:05:33 PM7/6/11
to Cédric Bastoul, Tobias Grosser, openscop-d...@googlegroups.com, Louis-Noel Pouchet, B Uday Kumar Reddy, Sven Verdoolaege, Tristan Vanderbruggen, 胡士文, Thomas Legris
On Wed, Jul 6, 2011 at 10:58, Cédric Bastoul <cedric....@u-psud.fr> wrote:
> On Tue, Jun 28, 2011 at 5:00 AM, Tobias Grosser <gro...@fim.uni-passau.de>
> wrote:
>>
>> - 3.1.4.6 openscop_scop_t
>>>
>>> struct openscop_scop {
>>> int version; /* Version of the data structure */
>>> char * language; /* Target language (backend) */
>>
>> Do we really need to encode an option that is only needed for code
>> generation in the OpenScop specification? I believe this should be a CLooG
>> flag or an extension.
>
> Yes.

I'm fine with your decision here.

> That's because of the rule 1 of Openscop, i.e., "embed the minimum
> information to build a complete polyhedral compilation framework in the core
> part". The idea is, from a complete core part, we should be able to
> regenerate something (at least source to source with identity
> transformation). So we need to know the target language.
> I realized at some point that I was not strict enough with this rule as I
> allowed to provide the scattering and the iteration domain dimension names
> (which could be generated by the code generator). Only the parameters names
> are necessary. I should probably remove them. Should I ?

Yes.
Also, please make the parameter names void* and not char*.

Thanks,
Sebastian

Louis-Noel Pouchet

Jul 6, 2011, 10:04:26 PM7/6/11
to Cédric Bastoul, Tobias Grosser, openscop-d...@googlegroups.com, B Uday Kumar Reddy, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris

On Jul 6, 2011, at 11:44 AM, Cédric Bastoul wrote:

> Hi @all,
> I was out of development for one week but now I can spend some time on it. Let me do a recap and let you know about my plans (still open for discussion):
>
> ------------------
> Big things
> ------------------
> - There is a consensus to remove the "matrix representation" because it is confusing.
>
> => My plan: I'll remove it. As suggested by Tobias, we may let some syntactic sugar in the openscop textual reader to allow a simplified input, but internally everything would be about relations.

ok. I also suggest providing information (i.e., a pointer to some software) for converting scoplib to openscop, alongside the openscop software documentation. Tobias, do you have such software at hand?

>
> - There has been some discussion about the access representation : (1) encoding the array id as a dimension is debatable and (2) the representation is not enough to store associated symbol information.
>
> => My plan: leave it as it is in my first proposal. For (1) because I do really think it's better: (i) there is no need for a special relation for accesses (the OpenScop format is simpler), (ii) it is directly usable for data dependence analysis (polyhedrically sound), (iii) it is consistent for complex memory accesses (which would require scalar dimensions anyway), and (iv) extracting the array id is still pretty easy (and there's a function for this). For (2) I don't think there is a simple way to make everyone happy (in our discussions Louis-Noël wants the external array id, e.g., A in A[i]->B[j], while Uday wants the internal one, e.g., B in A[i]->B[j]), plus it is not a necessary information for a basic parallelizer, so an extension should be used (forget about my cryptic suggestion in a previous mail).

Ok, but provide in the api of the openscop software a method to get the variable ids, and a method to get the "subset" of the relation (sorry, I know it's not the good english) corresponding to a given variable id.


>
> - It may be better to use void* for extensions (Louis-Noël's proposal, see his mails in the thread) rather than some generative programming (my proposal, see section 4.5.4 in the documentation, it's just one page).
>
> => My plan: I will evaluate this. Actually I did not know that there was no warning when assigning void* to struct* and conversely. Other opinions are welcome.

ok. Feel free to contact me off-line for a prototype.


> - We want a symbol table.
>
> => My plan: it will be an extension anyway so it's not as urgent as the first points (they are blocking for OpenScop and Clan releases). We may discuss it later or separately.

I disagree with this point: a symbol table is a quite fundamental object in the description of openscop. It should be considered from the start. It may be seen as an extension, but I do believe it should be part of the main description of the format. If you insist on an extension, you should provide it along with the openscop spec. You have the ear of many people right now, let's solve the symbol table issue also :)


>
> ------------------
> Small things (I just agree, no discussion here)
> ------------------
> - Better explain each relation attribute with examples, especially for existentially quantified dimensions.
> - Add a void* usr; field in the openscop_scop_t structure.
> - Create a "body" structure referring explicitly to either void* or char* elements.
> - Restrict to READ, WRITE and MAY_WRITE access.

Can you remind me why you're discarding may_read?


Thanks,

Louis-Noel Pouchet

Jul 6, 2011, 9:54:54 PM7/6/11
to Cédric Bastoul, Tobias Grosser, openscop-d...@googlegroups.com, B Uday Kumar Reddy, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris
>
> I realized at some point that I was not strict enough with this rule as I allowed to provide the scattering and the iteration domain dimension names (which could be generated by the code generator). Only the parameters names are necessary. I should probably remove them. Should I ?

I'm not sure I understand what you want to remove. Can you (re-)explain please?

Cédric Bastoul

Jul 7, 2011, 12:59:21 PM7/7/11
to Louis-Noel Pouchet, Tobias Grosser, openscop-d...@googlegroups.com, B Uday Kumar Reddy, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris
On Thu, Jul 7, 2011 at 3:54 AM, Louis-Noel Pouchet <pou...@cse.ohio-state.edu> wrote:
>
> I realized at some point that I was not strict enough with this rule as I allowed to provide the scattering and the iteration domain dimension names (which could be generated by the code generator). Only the parameters names are necessary. I should probably remove them. Should I ?

I'm not sure I understand what you want to remove. Can you (re-)explain please?

We can remove the (global) iterators and the scattering dimension names from the core part. They can be generated by the code generator, hence they are not necessary (and they contradict the first rule of OpenScop, i.e., "embed the minimum information to build a complete polyhedral compilation framework in the core part"). I remember the first versions of Clan/ScopLib did not include them, but we agreed to add them because it was convenient. I think the situation is different since OpenScop is very modular. Obviously I'll create an extension to add those names.

Cédric Bastoul

Jul 7, 2011, 1:27:39 PM7/7/11
to Louis-Noel Pouchet, Tobias Grosser, openscop-d...@googlegroups.com, B Uday Kumar Reddy, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris
On Thu, Jul 7, 2011 at 4:04 AM, Louis-Noel Pouchet <pou...@cse.ohio-state.edu> wrote:

On Jul 6, 2011, at 11:44 AM, Cédric Bastoul wrote:

> Hi @all,
> I was out of development for one week but now I can spend some time on it. Let me do a recap and let you know about my plans (still open for discussion):
>
> ------------------
> Big things
> ------------------
> - There is a consensus to remove the "matrix representation" because it is confusing.
>
> => My plan: I'll remove it. As suggested by Tobias, we may let some syntactic sugar in the openscop textual reader to allow a simplified input, but internally everything would be about relations.

ok. I also suggest to give information (ie, a pointer to a software) for converting scoplib to openscop, alongside the openscop software documentation. Tobias, do you have such software at hand?

OK. 

>
> - There has been some discussion about the access representation : (1) encoding the array id as a dimension is debatable and (2) the representation is not enough to store associated symbol information.
>
> => My plan: leave it as it is in my first proposal. For (1) because I do really think it's better: (i) there is no need for a special relation for accesses (the OpenScop format is simpler), (ii) it is directly usable for data dependence analysis (polyhedrically sound), (iii) it is consistent for complex memory accesses (which would require scalar dimensions anyway), and (iv) extracting the array id is still pretty easy (and there's a function for this). For (2) I don't think there is a simple way to make everyone happy (in our discussions Louis-Noël wants the external array id, e.g., A in A[i]->B[j], while Uday wants the internal one, e.g., B in A[i]->B[j]), plus it is not a necessary information for a basic parallelizer, so an extension should be used (forget about my cryptic suggestion in a previous mail).

Ok, but provide in the api of the openscop software a method to get the variable ids, and a method to get the "subset" of the relation (sorry, I know it's not the good english) corresponding to a given variable id.

No problem.

> - It may be better to use void* for extensions (Louis-Noël's proposal, see his mails in the thread) rather than some generative programming (my proposal, see section 4.5.4 in the documentation, it's just one page).
>
> => My plan: I will evaluate this. Actually I did not know that there was no warning when assigning void* to struct* and conversely. Other opinions are welcome.

ok. Feel free to contact me off-line for a prototype.

OK.
 
> - We want a symbol table.
>
> => My plan: it will be an extension anyway so it's not as urgent as the first points (they are blocking for OpenScop and Clan releases). We may discuss it later or separately.

I disagree with this point: a symbol table is a quite fundamental object in the description of openscop. It should be considered from the start. It may be seen as an extension, but I do believe it should be part of the main description of the format. If you insist on an extension, you should provide it along with the openscop spec. You have the ear of many people right now, let's solve the symbol table issue also :)

I agree we should discuss it now. Again a symbol table is not necessary to achieve, say, a Feautrier scheduler. That's why it should be considered as an extension (the infamous "rule 1"). From my point of view there is not much difference between the main part and the extension part, except that the extension part can be ignored by software which claims to support OpenScop. Any extension should be documented (see section 3.2 and 3.2.1 for an example, please !!!).

> ------------------
> Small things (I just agree, no discussion here)
> ------------------
> - Better explain each relation attribute with examples, especially for existentially quantified dimensions.
> - Add a void* usr; field in the openscop_scop_t structure.
> - Create a "body" structure referring explicitly to either void* or char* elements.
> - Restrict to READ, WRITE and MAY_WRITE access.

Can you remind me why you're discarding may_read?

If I'm not wrong, because from a data dependence point of view, it's the same as a read.

Ced.

Louis-Noel Pouchet

Jul 7, 2011, 1:56:43 PM7/7/11
to Cédric Bastoul, Tobias Grosser, openscop-d...@googlegroups.com, B Uday Kumar Reddy, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris
Ok. As soon as there's the extension along with openscop I'm fine with that.

++

-- 
Louis-Noel Pouchet

Louis-Noel Pouchet

Jul 7, 2011, 2:47:06 PM7/7/11
to Cédric Bastoul, Tobias Grosser, openscop-d...@googlegroups.com, B Uday Kumar Reddy, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris
> - Restrict to READ, WRITE and MAY_WRITE access.

Can you remind me why you're discarding may_read?

If I'm not wrong, because from a data dependence point of view, it's the same as a read.

Call me stupid, but why would it be any different than writes? I don't get the distinction.

++

-- 
Louis-Noel Pouchet

Uday K Bondhugula

Jul 9, 2011, 5:35:14 AM7/9/11
to Cédric Bastoul, Tobias Grosser, Louis-Noel Pouchet, openscop-d...@googlegroups.com, Sven Verdoolaege, Pop, Sebastian, Tristan Vanderbruggen, 胡士文, Thomas Legris

On Wed, 6 Jul 2011, Cédric Bastoul wrote:

> Hi @all,
> I was out of development for one week but now I can spend some time on it.
> Let me do a recap and let you know about my plans (still open for
> discussion):
>
> ------------------
> Big things
> ------------------

> [...]


> - There has been some discussion about the access representation : (1)
> encoding the array id as a dimension is debatable and (2) the representation
> is not enough to store associated symbol information.
>
> => My plan: leave it as it is in my first proposal. For (1) because I do
> really think it's better: (i) there is no need for a special relation for
> accesses (the OpenScop format is simpler), (ii) it is directly usable for
> data dependence analysis (polyhedrically sound), (iii) it is consistent for
> complex memory accesses (which would require scalar dimensions anyway), and
> (iv) extracting the array id is still pretty easy (and there's a function
> for this). For (2) I don't think there is a simple way to make everyone
> happy (in our discussions Louis-Noël wants the external array id, e.g., A in
> A[i]->B[j], while Uday wants the internal one, e.g., B in A[i]->B[j]), plus

Ideally, we would like the entire hierarchy to be available -- number of
symbols followed by the ids of all symbols from root to leaf. Just
putting the parent won't be sufficient. Consider for example performing
scalar replacement on that access; it's B[j] that's replaced and B's
symbol information is needed to determine the type of new scalar;
similarly, if we want to create a copy buffer for it.

> it is not a necessary information for a basic parallelizer, so an extension
> should be used (forget about my cryptic suggestion in a previous mail).

I feel this should be part of the core instead of being an extension
since the parser (Clan) has to be modified to put in this information.

Cédric Bastoul

Jul 21, 2011, 7:10:11 PM7/21/11
to Tobias Grosser, Louis-Noel Pouchet, openscop-d...@googlegroups.com, B Uday Kumar Reddy, Sven Verdoolaege, Pop, Sebastian
Hi @all,
just to let you know: I've implemented and documented everything I planned, including void* extensions. I also added a "next" field to the scop structure and introduced openscop delimiters, so we can have several scops per file. The current state is here in the master branch: http://repo.or.cz/w/openscop/bastoul.git

I may have three points to discuss (but if you don't want to discuss them it's just for your information):

-----------------
openscop_int_t

There is a problem with the type of the elements of the constraint matrix. Now it is "openscop_int_t", i.e. it is not defined (kind of classic way: the type is chosen at configure time; nice for a library, not nice for a specification). I see two solutions:

1 - Set it to GMP's mpz_t. Pros: it's simple and it's the greatest possible precision. Cons: supporting OpenScop would require an external library.

2 - Have a hybrid precision (requires a new "precision" field in the relation and the constraint matrix reference to be void**). Pros: OpenScop would be hybrid and free of external libraries. Cons: working on the constraint matrix directly would require a cast for long int and long long elements.

I chose the second way because it's the most generic and open. I started implementing it but it's not finished yet. If you disagree on this choice, let's discuss.
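
For readers who want to picture option 2, here is a minimal C sketch of a matrix with a "precision" field and a void** row table. The type and function names are hypothetical, only long and long long are covered (a GMP variant is left out to keep the sketch free of external libraries), and this is not the implementation in progress.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical names; the mail only mentions a precision field and void**. */
enum hp_precision { HP_LONG, HP_LONG_LONG };

struct hp_matrix {
    enum hp_precision precision;   /* selects the element type */
    size_t rows, cols;
    void **m;                      /* m[r]: `cols` elements of that type */
};

void hp_free(struct hp_matrix *mat)
{
    if (mat == NULL)
        return;
    if (mat->m != NULL)
        for (size_t r = 0; r < mat->rows; r++)
            free(mat->m[r]);
    free(mat->m);
    free(mat);
}

/* Returns a zero-filled matrix, or NULL on allocation failure. */
struct hp_matrix *hp_alloc(enum hp_precision p, size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);
    size_t elem = (p == HP_LONG) ? sizeof(long) : sizeof(long long);
    struct hp_matrix *mat = malloc(sizeof *mat);
    if (mat == NULL)
        return NULL;
    mat->precision = p;
    mat->rows = rows;
    mat->cols = cols;
    mat->m = malloc(rows * sizeof *mat->m);
    if (mat->m == NULL) {
        free(mat);
        return NULL;
    }
    for (size_t r = 0; r < rows; r++)
        mat->m[r] = NULL;          /* keeps hp_free safe on partial failure */
    for (size_t r = 0; r < rows; r++) {
        mat->m[r] = calloc(cols, elem);
        if (mat->m[r] == NULL) {
            hp_free(mat);
            return NULL;
        }
    }
    return mat;
}

/* The casts listed as the drawback are confined to these two accessors. */
void hp_set(struct hp_matrix *mat, size_t r, size_t c, long long v)
{
    assert(mat != NULL && r < mat->rows && c < mat->cols);
    if (mat->precision == HP_LONG)
        ((long *)mat->m[r])[c] = (long)v;
    else
        ((long long *)mat->m[r])[c] = v;
}

long long hp_get(const struct hp_matrix *mat, size_t r, size_t c)
{
    assert(mat != NULL && r < mat->rows && c < mat->cols);
    if (mat->precision == HP_LONG)
        return ((const long *)mat->m[r])[c];
    return ((const long long *)mat->m[r])[c];
}
```

Code that goes through hp_get/hp_set never sees the cast; code that walks the rows directly has to switch on the precision field first.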

-----------------
openscop_context_t

There are several fields in the scop which are devoted to context (output language, context domain, parameter type and parameters). Looking at openscop_scop_t, it may be cleaner to put them in a dedicated structure. But it's another level of indirection to access them. I'm still thinking about it, if you have a preference, let me know.

-----------------
openscop_symbol_t

To answer Uday, Clan _will_ provide the symbol information anyway. I need it for my own work too. So no worries it will be in the extension part (this extension amongst others will be provided with the OpenScop Library distribution directly: no pain, we just don't require other tools to support it). We will port the effort of your intern. The next Clan will also support "#pragma outlined_scop" which will accept an outlined function (where the type and size information will be provided as the outlined function parameters), this would be enough for me.


I'll be out for the next two weeks, I'll try to hide a laptop somewhere, but I may not be reactive to mails ;-).
Best,

Cedric