Anyone used Graphite of Gentoo recently?

436 views
Skip to first unread message

de.t...@gmail.com

unread,
Jan 22, 2014, 11:35:59 PM1/22/14
to gcc-gr...@googlegroups.com
I've been using Graphite for quiet a while now and reporting about a few packages that didn't compile, but recently (I dont remember if GCC was updated or not), things have been turning VERY bad.

bzip2 got broken, dbus turned buggy and kde failed to start.

As a result I was forced to remove graphite.

Anyone with similar experiences?

Tobias Grosser

unread,
Jan 24, 2014, 2:39:22 AM1/24/14
to de.t...@gmail.com, gcc-gr...@googlegroups.com
CC: gcc-graphite
Not on my side.

Is there a way to reproduce this problems. E.g. which version of gcc are
you using. What are the versions of its dependences? Do you happen to
have a reduced test case?

Cheers,
Tobias

dE

unread,
Jan 25, 2014, 11:09:48 AM1/25/14
to gcc-gr...@googlegroups.com
Can you get dbus to compile properly for starters?

I'm using GCC 4.8.2.

Tobias Grosser

unread,
Jan 25, 2014, 2:15:50 PM1/25/14
to dE, gcc-gr...@googlegroups.com
dbus is a big test case. Can you reduce it to a single function that
miscompiles?

Tobias

dE

unread,
Jan 25, 2014, 10:43:37 PM1/25/14
to gcc-gr...@googlegroups.com
No, I'm not a dev.

You can use quickpkg to make a binary off the working dbus?

dbus is not required for system booting, mostly the GUI requires it. You
can use the ttys to emerge the working dbus's binary.

Tobias Grosser

unread,
Feb 3, 2014, 7:36:31 AM2/3/14
to dE, gcc-gr...@googlegroups.com
Sorry. I do not have quickpkg and also do not have emerge (I am not
usign debian).

What compile flags have you been using that enabled graphite (it is
disabled by default)?

Also, gzip seems to be a easier start to reproduce the problem. Can you
post to gzip source version for which you see the failures as well as
a test file and the command line that shows the problems?

Tobias



dE

unread,
Feb 5, 2014, 11:47:54 PM2/5/14
to gcc-gr...@googlegroups.com
Emerge and quickpkg are Gentoo commands.

https://bugs.gentoo.org/show_bug.cgi?id=476682

Tobias Grosser

unread,
Feb 6, 2014, 2:41:01 AM2/6/14
to dE, gcc-gr...@googlegroups.com, Mircea Namolaru
Sorry.

> https://bugs.gentoo.org/show_bug.cgi?id=476682

This bug report gives a lot more details. I will copy Mircae, who just
started to take over the graphite maintainership.

Tobias

Andreas Simbuerger

unread,
Feb 14, 2014, 8:34:04 AM2/14/14
to gcc-gr...@googlegroups.com
I'm running gentoo and just had a quick look at it.
gcc-4.8.2 and bzip2-1.0.6-r{3,6}.

Can confirm, CFLAGS="-O3 -fgraphite-identity"

bunzip2: Data integrity error when decompressing.
Input file = snapper-0.1.8.tar.bz2, output file = snapper-0.1.8.tar

Cheers,
Andreas

Tobias Grosser

unread,
Feb 14, 2014, 8:48:44 PM2/14/14
to Andreas Simbuerger, gcc-gr...@googlegroups.com
Alright. Any chance we can get a reduced test case for this?

Tobias

Vladimir Kargov

unread,
Mar 24, 2014, 12:23:47 PM3/24/14
to Tobias Grosser, Andreas Simbuerger, GCC GRAPHITE
I can also confirm that -fgraphite-identity produces invalid code on
bzip2 that fails their built-in compress/decompress check. Tested on
GCC 4.9.0(svn trunk) and GCC 4.8.1(Ubuntu).

Attaching a reduced test case.

-O3 -fgraphite-identity produces "0 0 0 0"
while -O3 (and -O0) produces "-1 -1 0 0".

I'll try taking a more in-depth look at the problem.
> --
>
> --- You received this message because you are subscribed to the Google
> Groups "GCC GRAPHITE" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to gcc-graphite...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.



--
Vladimir
reduced.c

Tobias Grosser

unread,
Mar 24, 2014, 1:45:40 PM3/24/14
to Vladimir Kargov, Andreas Simbuerger, GCC GRAPHITE, Mircea Namolaru
On 03/24/2014 05:23 PM, Vladimir Kargov wrote:
> I can also confirm that -fgraphite-identity produces invalid code on
> bzip2 that fails their built-in compress/decompress check. Tested on
> GCC 4.9.0(svn trunk) and GCC 4.8.1(Ubuntu).
>
> Attaching a reduced test case.
>
> -O3 -fgraphite-identity produces "0 0 0 0"
> while -O3 (and -O0) produces "-1 -1 0 0".
>
> I'll try taking a more in-depth look at the problem.

Hi Vladimir,

great to hear that you are able to reproduce the problem. It seems
really important to track down this issue. If you could help us with
this, that would be great.

Cheers,
Tobias

Mircea Namolaru

unread,
Mar 26, 2014, 6:15:48 PM3/26/14
to gcc-gr...@googlegroups.com, de.t...@gmail.com
Hi,

As I work at some possible related Graphite problems and get a run to the Vladimir test (great job ! - such a small test).

Almost sure that the problem is with the loop  "for (i = minLen; i <= maxLen; i++) {
Graphite performs wrapping of the bounds and the AST of the code generated by CLOOG looks as:

CLAST generated by CLooG:
for (scat_1=0;scat_1<=-4294967296*floord(maxLen_6-minLen_4,4294967296)+maxLen_6-minLen_4;scat_1++) {
  (scat_1);
}

I disabled the wrapping and get the following CLAST structure - and in this case got the correct result.

CLAST generated by CLooG:
for (scat_1=0;scat_1<=maxLen_6-minLen_4;scat_1++) {
  (scat_1);
}

The only difference between the two CLAST is the floord and suspect that somehow the problem is due to it.

Hope that it helps.  Regards, Mircea

Sven Verdoolaege

unread,
Mar 26, 2014, 7:16:32 PM3/26/14
to Mircea Namolaru, gcc-gr...@googlegroups.com, de.t...@gmail.com
On Wed, Mar 26, 2014 at 03:15:48PM -0700, Mircea Namolaru wrote:
> Hi,
>
> As I work at some possible related Graphite problems and get a run to the
> Vladimir test (great job ! - such a small test).
>
> Almost sure that the problem is with the loop "for (i = minLen; i <=
> maxLen; i++) {
> Graphite performs wrapping of the bounds and the AST of the code generated
> by CLOOG looks as:
>
> CLAST generated by CLooG:
> for
> (scat_1=0;scat_1<=-4294967296*floord(maxLen_6-minLen_4,4294967296)+maxLen_6-minLen_4;scat_1++)
> {
> (scat_1);
> }

Can you dump the CLooG input (with wrapping) and post it here?

skimo

Vladimir Kargov

unread,
Mar 26, 2014, 8:37:19 PM3/26/14
to Mircea Namolaru, Sven Verdoolaege, GCC GRAPHITE, de.t...@gmail.com
Hi all,

> On Wed, Mar 26, 2014 at 03:15:48PM -0700, Mircea Namolaru wrote:
Thanks for looking at it.
>> Almost sure that the problem is with the loop "for (i = minLen; i <=
>> maxLen; i++) {
>> Graphite performs wrapping of the bounds and the AST of the code generated
>> by CLOOG looks as:
>
> CLAST generated by CLooG:
> for
> (scat_1=0;scat_1<=-4294967296*floord(maxLen_6-minLen_4,4294967296)+maxLen_6-minLen_4;scat_1++)
> {
> (scat_1);
> }
> I disabled the wrapping and get the following CLAST structure - and in this case got the correct result.
Interesting, though it appears to me that the loop end expression you
provided should always evaluate to "scat_1<=maxLen_6-minLen_4" for
this test case because maxLen > minLen and they both are 32-bit and
non-negative (thus, maxLen_6-minLen_4 will be >=0 and <4294967296,
making (max_Len_6-minLen_4)/4294967296 == 0).
Moreover, if you declare i, j, minLen, maxLen and alphaSize as "long
long int"(which are 64-bit on my machine) the result becomes correct,
even though the loop generated by cloog looks very similar:
for (scat_1=0;scat_1<=-18446744073709551616*floord(maxLen_34-minLen_33+13835058055282163713,18446744073709551616)+maxLen_34-minLen_33;scat_1++)
{
S1(scat_1);
}
Unfortunately, I don't know Graphite well enough to tell whether the
loop exit expression we're seeing is expected or not, I assumed that
it would always evaluate to 0 and not cause a problem.

On 27 March 2014 03:16, Sven Verdoolaege <sk...@kotnet.org> wrote:
> Can you dump the CLooG input (with wrapping) and post it here?
I'll attach mine since they produce the same result. reduced.int.cloog
is the input for the code Mircea posted, and reduced.ll.cloog is the
version for 64-bit(long long int) variables that produces correct code
with Graphite.

--
Vladimir
reduced.int.cloog
reduced.ll.cloog

Sven Verdoolaege

unread,
Mar 27, 2014, 4:29:28 AM3/27/14
to Vladimir Kargov, Tobias Grosser, Andreas Simbuerger, GCC GRAPHITE, Mircea Namolaru, de.t...@gmail.com
Quoting the source code for reference.

On Mon, Mar 24, 2014 at 08:23:47PM +0400, Vladimir Kargov wrote:
> {
> int i, j, vec = 0;
>
> for (i = 0; i < alphaSize; i++) base[length[i]+1]++;
> for (i = 1; i < BZ_MAX_CODE_LEN; i++) base[i] += base[i-1];
>
> for (i = minLen; i <= maxLen; i++) {
> vec += (base[i+1] - base[i]);
> limit[i] = vec-1;
> }
> }


On Thu, Mar 27, 2014 at 04:37:19AM +0400, Vladimir Kargov wrote:
> # Iteration domain of statement 1 ().
> 1
>
> 5 6 1 0 1 2
> 1 1 0 0 0 0
> 1 0 -4294967296 1 -1 0
> 1 0 4294967296 -1 1 4294967295
> 1 -1 -4294967296 1 -1 0
> 1 -1 0 0 0 4294967295

In isl notation (which I find easier to read), this looks like

[maxLen_34, minLen_33] -> { S1[i0] -> [0, i0, 0] : exists (e0 = floor((maxLen_34 - minLen_33)/4294967296): i0 >= 0 and 4294967296e0 <= maxLen_34 - minLen_33 and 4294967296e0 >= -4294967295 + maxLen_34 - minLen_33 and 4294967296e0 <= maxLen_34 - minLen_33 - i0 and i0 <= 4294967295) }

So, essentially, the iteration domain is

0 <= i <= (maxLen_34 - minLen_33) mod 4294967296

Now, apparently the code generated by CLooG has some issues, but
the input is already wrong. In particular, if maxLen is smaller
than minLen, then the iteration domain should be empty,
while in the input it is not.

skimo

Tobias Grosser

unread,
Mar 27, 2014, 6:03:32 AM3/27/14
to Sven Verdoolaege, Vladimir Kargov, Andreas Simbuerger, GCC GRAPHITE, Mircea Namolaru, de.t...@gmail.com
Right. However, this could both be a bug in graphite or just information
that got lost during previous transformations.

Cheers,
Tobias

Mircea Namolaru

unread,
Mar 27, 2014, 10:39:57 AM3/27/14
to Tobias Grosser, Sven Verdoolaege, Vladimir Kargov, Andreas Simbuerger, GCC GRAPHITE, de.t...@gmail.com
Hi,

The domain is computed on basis of the information provided by number_of_latch_execution that returns the tree expression

(unsigned int) maxLen_6(D) - (unsigned int) minLen_4(D)

For signed integers (without the cast to unsigned int) this seems to be the correct value. As an unsigned int expression it leads to incorrect domain.

Btw, if you have an iteration less (i < maxLen instead of i<= maxLen) you get something as:

((unsigned int) maxLen_6(D) - (unsigned int) minLen_4(D)) + 4294967295

and you get correct results.

Mircea

Tobias Grosser

unread,
Mar 28, 2014, 9:29:32 AM3/28/14
to Vladimir Kargov, Andreas Simbuerger, GCC GRAPHITE
On 03/24/2014 05:23 PM, Vladimir Kargov wrote:
> I can also confirm that -fgraphite-identity produces invalid code on
> bzip2 that fails their built-in compress/decompress check. Tested on
> GCC 4.9.0(svn trunk) and GCC 4.8.1(Ubuntu).
>
> Attaching a reduced test case.
>
> -O3 -fgraphite-identity produces "0 0 0 0"
> while -O3 (and -O0) produces "-1 -1 0 0".

I just looked into this with mircea.

> #include <stdio.h>
>
> #define ALPHASIZE 0
> #define BZ_MAX_CODE_LEN 4
>
> void BZ2_hbCreateDecodeTables ( int *limit,
> int *base,
> unsigned char *length,
> int minLen,
> int maxLen,
> int alphaSize )
> {
> int i, j, vec = 0;
>
> for (i = 0; i < alphaSize; i++) base[length[i]+1]++;
> for (i = 1; i < BZ_MAX_CODE_LEN; i++) base[i] += base[i-1];
>
> for (i = minLen; i <= maxLen; i++) {
> vec += (base[i+1] - base[i]);
> limit[i] = vec-1;

We simplified this to

limit [i] = i + 1;

> }
> }

The interesting point is that commenting out any of the first two loops
make the problem disappear, even though they should have no effect on
the last loop.

Cheers,
Tobias

Mircea Namolaru

unread,
Mar 28, 2014, 9:05:48 PM3/28/14
to gcc-gr...@googlegroups.com, de.t...@gmail.com
Hi,

Just an update. I've checked again the tests that Tobias and me looked (see Tobias note).
and tried to further simplify it - strange at some point began to get different results.It was
possible to remove a loop and the the other one.

In fact now I have a function:

int BZ2_hbCreateDecodeTables (int minLen,
                              int maxLen,
                              int vec)
{
  int i, j;

   printf("%d %d %d\n", minLen, maxLen, vec);

   for (i = minLen; i <= maxLen; i++) {

     vec += i;
   }

   printf("%d %d %d\n", minLen, maxLen, vec);

   return vec;
}

that returns the wrong result (and the single invocation to Graphite is for this loop). It seems that
the loop is not entered (no matter the values of minLen and maxLen). But as said
not sure that the results are deterministic (except the Vladimir original test that always fail).


Mircea

Tobias Grosser

unread,
Mar 29, 2014, 3:27:42 AM3/29/14
to Mircea Namolaru, gcc-gr...@googlegroups.com, de.t...@gmail.com
Great. Mircea, could you post the full test file such that others could
start from that one right away.

Cheers,
Tobias

Vladimir Kargov

unread,
Mar 30, 2014, 8:26:33 PM3/30/14
to Tobias Grosser, GCC GRAPHITE, Mircea Namolaru
On 28 March 2014 17:29, Tobias Grosser <tob...@grosser.es> wrote:

> The interesting point is that commenting out any of the first two loops make the problem disappear, even though they should have no effect on the last loop.

Turns out this is due to early inlining. When you comment one of the
first two loops, GCC treats the function as "lightweight" and inlines
it into main() where it fully unrolls it (since the iteration number
is known) which eliminates the need in Graphite. You can compare the
compiler dumps after the einline phase to see it. Adding the noinline
attribute solves the problem and reduces the test further to the form
similar to the one Mircea has posted.

#include <stdio.h>

#define ALPHASIZE 0
#define BZ_MAX_CODE_LEN 4

void __attribute__ ((noinline))
BZ2_hbCreateDecodeTables( int *limit, int minLen, int maxLen)
{
int i, j;

for (i = minLen; i <= maxLen; i++) {
limit[i] = i + 1;
}
}

void main()
{
int limit[BZ_MAX_CODE_LEN] = {0, 0, 0, 0};
int pos;

BZ2_hbCreateDecodeTables (limit, 0, 1);

for ( pos = 0; pos < BZ_MAX_CODE_LEN; pos++ )
printf("%d ", limit[pos]);
printf("\n");
}


--
Vladimir

Vladimir Kargov

unread,
Mar 31, 2014, 12:25:50 AM3/31/14
to Mircea Namolaru, Tobias Grosser, GCC GRAPHITE
On 27 March 2014 18:39, Mircea Namolaru <mircea....@gmail.com> wrote:
> The domain is computed on basis of the information provided by
> number_of_latch_execution that returns the tree expression
>
> (unsigned int) maxLen_6(D) - (unsigned int) minLen_4(D)
>
> For signed integers (without the cast to unsigned int) this seems to be the
> correct value. As an unsigned int expression it leads to incorrect domain.

I've got a couple of questions with regards to wrapping. Pasting the
relevant code from graphite-sese-to-poly.c:

static isl_pw_aff *
extract_affine (scop_p s, tree e, __isl_take isl_space *space)
{
...
// e comes from number_of_latch_executions()
type = TREE_TYPE (e);
if (TYPE_UNSIGNED (type))
res = wrap (res, TYPE_PRECISION (type));

1) Any idea why wrapping occurs only for unsigned expressions?
2) What exactly is wrapping needed for? I can't find it in the old PPL
implementation (though it could have been called differently).

Also, no matter what the logic behind wrapping is, it does seem
suspicious that this code checks the signedness of the result of
number_of_latch_execution() (which may be arbitrary, as far as I
understood), and not of the loop induction variable, for example.

Best regrads,
--
Vladimir

Sven Verdoolaege

unread,
Mar 31, 2014, 1:36:54 AM3/31/14
to Vladimir Kargov, Mircea Namolaru, Tobias Grosser, GCC GRAPHITE
On Mon, Mar 31, 2014 at 08:25:50AM +0400, Vladimir Kargov wrote:
> On 27 March 2014 18:39, Mircea Namolaru <mircea....@gmail.com> wrote:
> > The domain is computed on basis of the information provided by
> > number_of_latch_execution that returns the tree expression
> >
> > (unsigned int) maxLen_6(D) - (unsigned int) minLen_4(D)
> >
> > For signed integers (without the cast to unsigned int) this seems to be the
> > correct value. As an unsigned int expression it leads to incorrect domain.
>
> I've got a couple of questions with regards to wrapping. Pasting the
> relevant code from graphite-sese-to-poly.c:
>
> static isl_pw_aff *
> extract_affine (scop_p s, tree e, __isl_take isl_space *space)
> {
> ...
> // e comes from number_of_latch_executions()
> type = TREE_TYPE (e);
> if (TYPE_UNSIGNED (type))
> res = wrap (res, TYPE_PRECISION (type));
>
> 1) Any idea why wrapping occurs only for unsigned expressions?

In C, an overflow in an operation on unsigned integers results
in wrapping, while an overflow on signed integers results in
undefined behavior.

> 2) What exactly is wrapping needed for? I can't find it in the old PPL
> implementation (though it could have been called differently).
>
> Also, no matter what the logic behind wrapping is, it does seem
> suspicious that this code checks the signedness of the result of
> number_of_latch_execution() (which may be arbitrary, as far as I
> understood), and not of the loop induction variable, for example.

I would think that it doesn't make sense to perform wrapping
on the result of number_of_latch_executions. However, if your
loop looks like

for (i = a + b; i < c + d; ++i)

and a, b, c and d are unsigned, then you do need to perform
wrapping on (a + b) and (c + d). Not being familiar with the
internals of number_of_latch_executions, I don't know if it already
performs this wrapping.

skimo

Tobias Grosser

unread,
Mar 31, 2014, 2:23:55 AM3/31/14
to Vladimir Kargov, Mircea Namolaru, GCC GRAPHITE, GCC Mailing List
On 03/31/2014 06:25 AM, Vladimir Kargov wrote:
> On 27 March 2014 18:39, Mircea Namolaru <mircea....@gmail.com> wrote:
>> The domain is computed on basis of the information provided by
>> number_of_latch_execution that returns the tree expression
>>
>> (unsigned int) maxLen_6(D) - (unsigned int) minLen_4(D)
>>
>> For signed integers (without the cast to unsigned int) this seems to be the
>> correct value. As an unsigned int expression it leads to incorrect domain.
>
> I've got a couple of questions with regards to wrapping. Pasting the
> relevant code from graphite-sese-to-poly.c:
>
> static isl_pw_aff *
> extract_affine (scop_p s, tree e, __isl_take isl_space *space)
> {
> ...
> // e comes from number_of_latch_executions()
> type = TREE_TYPE (e);
> if (TYPE_UNSIGNED (type))
> res = wrap (res, TYPE_PRECISION (type));
>
> 1) Any idea why wrapping occurs only for unsigned expressions?

In the C standard, unsinged operations have defined overflow semantics
in form of wrapping. Signed operations instead have undefined behavior,
which means we can assume that no wrapping occurs and consequently do
not need to model it.

> 2) What exactly is wrapping needed for? I can't find it in the old PPL
> implementation (though it could have been called differently).

it is necessary to model the defined overflow behavior of unsigned integers.

> Also, no matter what the logic behind wrapping is, it does seem
> suspicious that this code checks the signedness of the result of
> number_of_latch_execution() (which may be arbitrary, as far as I
> understood), and not of the loop induction variable, for example.

It is very suspicious. It is a rather difficult topic and I doubt it is
implemented correctly. Also, I do not think that implementing wrapping
this way is the right approach. Instead, we should just model it to
extract a run-time check that verifiers that no wrapping occurs and then
go one without wrapping.

Regarding this bug there are two directions to go:

1) It is necessary to understand if we need to do wrapping on the result
of number_of_latch_executions. Meaning, we should understand the
semantics of this function and the expression it returns.

2) There is something else happening?? From my observations it seems
that the generated code at least for this test case should behave
correctly and the relevant loop should be executed exactly twice.
Surprisingly this is not the case. It would be interesting to understand
what exactly prevents this loop from being executed. (From the clast,
this is not obvious to me).

Tobi

Tobias Grosser

unread,
Mar 31, 2014, 2:24:13 AM3/31/14
to Vladimir Kargov, GCC GRAPHITE, Mircea Namolaru
On 03/31/2014 02:26 AM, Vladimir Kargov wrote:
> On 28 March 2014 17:29, Tobias Grosser <tob...@grosser.es> wrote:
>
>> The interesting point is that commenting out any of the first two loops make the problem disappear, even though they should have no effect on the last loop.
>
> Turns out this is due to early inlining. When you comment one of the
> first two loops, GCC treats the function as "lightweight" and inlines
> it into main() where it fully unrolls it (since the iteration number
> is known) which eliminates the need in Graphite. You can compare the
> compiler dumps after the einline phase to see it. Adding the noinline
> attribute solves the problem and reduces the test further to the form
> similar to the one Mircea has posted.

Nice observation. Great test case.

Richard Biener

unread,
Mar 31, 2014, 4:10:18 AM3/31/14
to Tobias Grosser, Vladimir Kargov, Mircea Namolaru, GCC GRAPHITE, GCC Mailing List
Note that there are always two sides of the "undefined overflow" thing

1. in the input IL whether an operation may overflow or not (check
TYPE_OVERFLOW_UNDEFINED, _not_ !TYPE_UNSIGNED)
2. in the GIMPLE IL generated from clast - all operations that satisfy
TYPE_OVERFLOW_UNDEFINED may not have different overflow behavior
than in the original untransformed program (no intermediate overflows
unless they were already present). Usually that's not very easy to guarantee,
thus re-writing everything into unsigned arithmetic may be easiest.

Richard.

> Tobi

Mircea Namolaru

unread,
Mar 31, 2014, 7:24:29 AM3/31/14
to Tobias Grosser, Vladimir Kargov, GCC GRAPHITE
Hi,

Thanks to Vladimir I could explain part of the results (I was not consistently disabling the inlining).

The most plausible hypothesis for this problem seems related to wrapping or some side-effect due do to it (the floord operator etc)..First I would check the the loop generated by Graphite in the tree-SSA code and see if indeed there is a  problem.

Regarding, the number_of_latch_executions its semantics seems rather confusing to me. Looking at the results  - sometimes adds UMAX_INT to the difference between init_val and final_val. It depends on the expression returned (it is variable or an arithmetic expression, and if it is an addition if val_init is a constant or not).  Sometimes scalar evolution invoked introduces UMAX_INT - for this case the function returns val_max - val_min -1 and scalar evolution transforms it to val_max - val_min + UMAX_INT - 1. But not sure that the problem is caused by the domain, I think that the code for the loop begins by testing if val_min <= val_max (and if not the loop is not entered) and with this addition the domain seems correct.

Mircea.


Tobias Grosser

unread,
Mar 31, 2014, 7:53:03 AM3/31/14
to Richard Biener, Vladimir Kargov, Mircea Namolaru, GCC GRAPHITE, GCC Mailing List
On 03/31/2014 10:10 AM, Richard Biener wrote:
> On Mon, Mar 31, 2014 at 8:23 AM, Tobias Grosser <tob...@grosser.es> wrote:
>> On 03/31/2014 06:25 AM, Vladimir Kargov wrote:
>> Regarding this bug there are two directions to go:
>>
>> 1) It is necessary to understand if we need to do wrapping on the result of
>> number_of_latch_executions. Meaning, we should understand the semantics of
>> this function and the expression it returns.
>>
>> 2) There is something else happening?? From my observations it seems
>> that the generated code at least for this test case should behave correctly
>> and the relevant loop should be executed exactly twice. Surprisingly this is
>> not the case. It would be interesting to understand what exactly prevents
>> this loop from being executed. (From the clast, this is not obvious to me).
>
> Note that there are always two sides of the "undefined overflow" thing
>
> 1. in the input IL whether an operation may overflow or not (check
> TYPE_OVERFLOW_UNDEFINED, _not_ !TYPE_UNSIGNED)

Thanks Richi. I was not aware of this, but we really should check for
TYPE_OVERFLOW_UNDEFINED instead.

I think for now we may want to limit ourselves to
TYPE_OVERFLOW_UNDEFINED, as the wrapping code is not very reliable and
will cause very ugly code.

> 2. in the GIMPLE IL generated from clast - all operations that satisfy
> TYPE_OVERFLOW_UNDEFINED may not have different overflow behavior
> than in the original untransformed program (no intermediate overflows
> unless they were already present). Usually that's not very easy to guarantee,
> thus re-writing everything into unsigned arithmetic may be easiest.

The code we generate should not have any overflows. In case there are
earlier defined overflows we should at best bail out. This is the safest
approach. Anything else requires some more investigations.

Cheers,
Tobias

Vladimir Kargov

unread,
Apr 2, 2014, 5:24:59 AM4/2/14
to Tobias Grosser, Sven Verdoolaege, Mircea Namolaru, GCC GRAPHITE
On Mon, Mar 31, 2014 at 08:25:50AM +0400, Vladimir Kargov wrote:
> 1) Any idea why wrapping occurs only for unsigned expressions?
On 31 March 2014 09:36, Sven Verdoolaege <sk...@kotnet.org> wrote:
> In C, an overflow in an operation on unsigned integers results
> in wrapping, while an overflow on signed integers results in
> undefined behavior.
On 31 March 2014 10:23, Tobias Grosser <tob...@grosser.es> wrote:
> In the C standard, unsinged operations have defined overflow semantics in
> form of wrapping. Signed operations instead have undefined behavior, which
> means we can assume that no wrapping occurs and consequently do not need to
> model it.
Thanks, forgot about that.

By the way, could someone clarify why the cloog representation of the
domain that Graphite built for our loop:

for (i = minLen; i <= maxLen; i++) {
limit[i] = i;

contains 6 columns?

5 6 1 0 1 2
1 1 0 0 0 0
1 0 -4294967296 1 -1 0
1 0 4294967296 -1 1 4294967295
1 -1 -4294967296 1 -1 0
1 -1 0 0 0 4294967295

I thought there should be five ("in/eq, i, maxLen, minLen, 1", i.e. 1
column to specify equality/inequality, 1 column for induction variable
i, 2 columns for parameters minLen and maxLen and 1 row for a
constant).

Also, is there an easy way to convert the Cloog notation into the ISL notation?

Best regards,
--
Vladimir

Tobias Grosser

unread,
Apr 2, 2014, 5:30:58 AM4/2/14
to Vladimir Kargov, Sven Verdoolaege, Mircea Namolaru, GCC GRAPHITE
My guess is that this is due to the modulo wrapping, which expressed as
'expr % 4294967295' introduces an additional existentially quantified
dimension.

> Also, is there an easy way to convert the Cloog notation into the ISL notation?

isl as a library and consequently the tools iscc and islpy can read such
domains and can then dump them directly.

If you go to http://compsys-tools.ens-lyon.fr/iscc/index.php

and enter (the ';' is important):

==================
5 6 1 0 1 2
1 1 0 0 0 0
1 0 -4294967296 1 -1 0
1 0 4294967296 -1 1 4294967295
1 -1 -4294967296 1 -1 0
1 -1 0 0 0 4294967295
;
==================

you get:

[p0, p1] -> {
[i0] : exists (e0: i0 >= 0 and 4294967296e0 <= p0 - p1 and
4294967296e0 >= -4294967295 + p0 - p1 and 4294967296e0 <= p0 - p1 - i0
and i0 <= 4294967295);
}

Generally I would try to not even create the CLooG format, but to just
dump isl strings directly. If we still dump the cloog format somewhere,
patches to dump the isl format are more than welcome.

Cheers,
Tobias

Mircea Namolaru

unread,
Apr 2, 2014, 9:03:45 AM4/2/14
to Sven Verdoolaege, Vladimir Kargov, Tobias Grosser, GCC GRAPHITE
Vladimir,

There are already some dump functions in gcc in graphite-poly,c (debug_isl_map, debug_isl_set etc) with output in the isl format.


Mircea


On Wed, Apr 2, 2014 at 11:32 AM, Sven Verdoolaege <sk...@kotnet.org> wrote:
On Wed, Apr 02, 2014 at 01:24:59PM +0400, Vladimir Kargov wrote:
> By the way, could someone clarify why the cloog representation of the
> domain that Graphite built for our loop:
>
> for (i = minLen; i <= maxLen; i++) {
>   limit[i] = i;
>
> contains 6 columns?
>
> 5 6 1 0 1 2
> 1     1     0     0     0     0
> 1     0 -4294967296     1    -1     0
> 1     0 4294967296    -1     1 4294967295
> 1    -1 -4294967296     1    -1     0
> 1    -1     0     0     0 4294967295
>
> I thought there should be five ("in/eq, i, maxLen, minLen, 1", i.e. 1
> column to specify equality/inequality, 1 column for induction variable
> i, 2 columns for parameters minLen and maxLen and 1 row for a
> constant).

There's an extra column for the "local" (i.e., existentially quantified)
variable.  Look for "extended PolyLib" format in the isl manual.


> Also, is there an easy way to convert the Cloog notation into the ISL notation?

The above notation is not a "CLooG" notation, but the "extended PolyLib" format,
which AFAIK is only supported by isl.
You can use isl_cat to convert between different formats.

    skimo@purples:~/obj/isl$ ./isl_cat --format=isl

    5 6 1 0 1 2
    1     1     0     0     0     0
    1     0 -4294967296     1    -1     0
    1     0 4294967296    -1     1 4294967295
    1    -1 -4294967296     1    -1     0
    1    -1     0     0     0 4294967295
    [p0, p1] -> { [i0] : exists (e0 = floor((p0 - p1)/4294967296): i0 >= 0 and 4294967296e0 <= p0 - p1 and 4294967296e0 >= -4294967295 + p0 - p1 and 4294967296e0 <= p0 - p1 - i0 and i0 <= 4294967295) }

Btw, you should post questions about isl on the isl group.

skimo

Mircea Namolaru

unread,
Apr 2, 2014, 9:36:23 AM4/2/14
to Sven Verdoolaege, Vladimir Kargov, Tobias Grosser, GCC GRAPHITE
For a loop as in our case "for (i = minLen; i < MaxLen; i++)" where minLen and maxLen are signed an overflow not present in the original program may occur when estimating the number of iterations as maxLen - maxInt. But it seems to me that performing this computation in a some larger type as unsigned. long etc is  enough for correctness. Why wrapping is needed ?

What it is the status of this problem ? If the tree-SSA code seems correct did someone take a look at the RTL code (assembly for sure is not good, executed it via the debugger) ?

Mircea

Sven Verdoolaege

unread,
Apr 2, 2014, 5:32:47 AM4/2/14
to Vladimir Kargov, Tobias Grosser, Mircea Namolaru, GCC GRAPHITE
On Wed, Apr 02, 2014 at 01:24:59PM +0400, Vladimir Kargov wrote:
> By the way, could someone clarify why the cloog representation of the
> domain that Graphite built for our loop:
>
> for (i = minLen; i <= maxLen; i++) {
> limit[i] = i;
>
> contains 6 columns?
>
> 5 6 1 0 1 2
> 1 1 0 0 0 0
> 1 0 -4294967296 1 -1 0
> 1 0 4294967296 -1 1 4294967295
> 1 -1 -4294967296 1 -1 0
> 1 -1 0 0 0 4294967295
>
> I thought there should be five ("in/eq, i, maxLen, minLen, 1", i.e. 1
> column to specify equality/inequality, 1 column for induction variable
> i, 2 columns for parameters minLen and maxLen and 1 row for a
> constant).

There's an extra column for the "local" (i.e., existentially quantified)
variable. Look for "extended PolyLib" format in the isl manual.

> Also, is there an easy way to convert the Cloog notation into the ISL notation?

The above notation is not a "CLooG" notation, but the "extended PolyLib" format,
which AFAIK is only supported by isl.
You can use isl_cat to convert between different formats.

skimo@purples:~/obj/isl$ ./isl_cat --format=isl
5 6 1 0 1 2
1 1 0 0 0 0
1 0 -4294967296 1 -1 0
1 0 4294967296 -1 1 4294967295
1 -1 -4294967296 1 -1 0
1 -1 0 0 0 4294967295

Tobias Grosser

unread,
Apr 2, 2014, 12:08:00 PM4/2/14
to Mircea Namolaru, Sven Verdoolaege, Vladimir Kargov, GCC GRAPHITE
On 04/02/2014 03:36 PM, Mircea Namolaru wrote:
> For a loop as in our case "for (i = minLen; i < MaxLen; i++)" where minLen
> and maxLen are signed an overflow not present in the original program may
> occur when estimating the number of iterations as maxLen - maxInt.

Right.

> But it
> seems to me that performing this computation in a some larger type as
> unsigned. long etc is enough for correctness. Why wrapping is needed ?

That depends.

1) In some cases a user or some earlier gimple passes may have created
code that relies on wrapping to work. For this cases, just modeling the
wrapping as a + b instead of a + b % size is incorrect.

Assuming the expressions is in fact a + b, we can in some cases use a
larger type. However, if a and b are already 64 bit types, we can not
really go a lot larger.

There are different solutions for the problems

o Add run-time checks that ensure that no wrapping is needed

o Derive the minimal valid type automatically

We can later look into this solutions. However, when we looked into this
very problem at least the clast that has been generated looked correct.
So I would first understand what the actual issue is. This leads us to
the next question.


> What it is the status of this problem ? If the tree-SSA code seems correct
> did someone take a look at the RTL code (assembly for sure is not good,
> executed it via the debugger) ?

I am surprised you are asking this? The last time we discussed, you
planned to look into the generated gimple code for this bug. Did you
find time do this?

Also, how do you get to the conclusion that the generated tree-SSA code
is correct? Did someone else report this already? Did I miss a mail?

Tobias

Mircea Namolaru

unread,
Apr 3, 2014, 8:29:57 AM4/3/14
to Tobias Grosser, Sven Verdoolaege, Vladimir Kargov, GCC GRAPHITE
Hi,

> 1) In some cases a user or some earlier gimple passes may have created code that relies on wrapping to work. For this cases, just modeling the wrapping as a + b instead of a + b % size is incorrect.

Yes, but my question was wrapping of signed integers (which excludes the user code relying on wrapping). Anyway the topic of how
signed behaviour is modelled by the compiler (to avoid introducing overflows) is not a simple one, and indeed not one of the Graphite
priorities at this time.

> I am surprised you are asking this? The last time we discussed, you planned to look
into the generated gimple code for this bug. Did you find time do this?

The last time that we talked it seemed that somehow the problem is related
to other loops in the functions. Investigates this hypothesis -  generated a
simpler test case that excludes this possibility. At the time also took a short look at the tree-SSA and RTL code -
I could tell that the tree-SSA code seems rather close to the source (and CLAST loop) while RTL
code greatly modifies it. As I have now some (rather limited) time to return to this problem, wondered
if meantime someone has advanced, not to repeat the efforts done.


> Also, how do you get to the conclusion that the generated tree-SSA code is correct? Did someone else report this already? Did I miss a mail?

You missed the word "If" -  "If the tree-SSA code ..."  :-).

Mircea




Vladimir Kargov

unread,
Apr 4, 2014, 3:01:50 AM4/4/14
to Mircea Namolaru, Tobias Grosser, GCC GRAPHITE
On 3 April 2014 16:29, Mircea Namolaru <mircea....@gmail.com> wrote:
> As I have now some (rather limited) time to return to this problem, wondered if meantime someone has advanced, not to repeat the efforts done.
Just for the record, I have nothing new to write about.
I've got some gaps in the understanding of corner cases in gimple/rtl
and their dumps which prevent me from telling whether the intermediate
representation produced by Graphite will work on the input data or not
and when exactly the semantics breaks. So I'm spending some time on
covering the gaps I have instead.

(Well, this mail was absolutely pointless!)
--
Vladimir

Albert Cohen

unread,
Apr 4, 2014, 3:58:38 AM4/4/14
to gcc-gr...@googlegroups.com
It was not :-).

Thanks a lot Vladimir.

Albert

Mircea Namolaru

unread,
Apr 4, 2014, 10:20:27 AM4/4/14
to Vladimir Kargov, Tobias Grosser, GCC GRAPHITE
I think that the problem is already in the tree SSA code:

For CLAST generated by CLooG:
for (scat_1=0;scat_1<=-4294967296*floord(maxLen_5-minLen_4,4294967296)+maxLen_5-minLen_4;scat_1++) {
  (scat_1);
}

In tree SSA the computation of the bounds of the loop is done on long signed integers and looks as"
for (scat_1=0;scat_1<=4294967296*floord(maxLen_5-minLen_4,-4294967296)+maxLen_5-minLen_4;scat_1++) {
  (scat_1);
}

The expression
-4294967296*floord(maxLen_5-minLen_4,4294967296)
is translated to
4294967296*floord(maxLen_5-minLen_4,-4294967296)
In the second case the result of the floord is -1 instead of 0 and the loop is not entered..




Tobias Grosser

unread,
Apr 4, 2014, 4:06:30 PM4/4/14
to Mircea Namolaru, Vladimir Kargov, GCC GRAPHITE
On 04/04/2014 04:20 PM, Mircea Namolaru wrote:
> I think that the problem is already in the tree SSA code:
>
> For CLAST generated by CLooG:
> for
> (scat_1=0;scat_1<=-4294967296*floord(maxLen_5-minLen_4,4294967296)+maxLen_5-minLen_4;scat_1++)
> {
> (scat_1);
> }
>
> In tree SSA the computation of the bounds of the loop is done on long
> signed integers and looks as"
> for
> (scat_1=0;scat_1<=4294967296*floord(maxLen_5-minLen_4,-4294967296)+maxLen_5-minLen_4;scat_1++)
> {
> (scat_1);
> }
>
> The expression
> -4294967296*floord(maxLen_5-minLen_4,4294967296)
> is translated to
> 4294967296*floord(maxLen_5-minLen_4,-4294967296)
> In the second case the result of the floord is -1 instead of 0 and the loop
> is not entered..

Great finding! Any idea why this is the case?

Tobias

Vladimir Kargov

unread,
Apr 7, 2014, 3:54:51 AM4/7/14
to Mircea Namolaru, Tobias Grosser, GCC GRAPHITE
On 4 April 2014 18:20, Mircea Namolaru <mircea....@gmail.com> wrote:
> I think that the problem is already in the tree SSA code:
>
> For CLAST generated by CLooG:
> for
> (scat_1=0;scat_1<=-4294967296*floord(maxLen_5-minLen_4,4294967296)+maxLen_5-minLen_4;scat_1++)
> {
> (scat_1);
> }
>
> In tree SSA the computation of the bounds of the loop is done on long signed
> integers and looks as"
> for
> (scat_1=0;scat_1<=4294967296*floord(maxLen_5-minLen_4,-4294967296)+maxLen_5-minLen_4;scat_1++)
> {
> (scat_1);
> }
> The expression
> -4294967296*floord(maxLen_5-minLen_4,4294967296)
> is translated to
> 4294967296*floord(maxLen_5-minLen_4,-4294967296)
> In the second case the result of the floord is -1 instead of 0 and the loop is not entered..

Nice find, I thought /[fl] worked in a different way.

On 5 April 2014 00:06, Tobias Grosser <tob...@grosser.es> wrote:
> Great finding! Any idea why this is the case?

I looked more into the problem and found something very curious during
clast-to-gimple conversion in Graphite. The explanation turns out to
be kind of lenghthy, please bear with me.

Consider the stacktrace taken during clast-to-gimple conversion:
...
#4 0x0000000000e41b6f in make_node_stat (code=FLOOR_DIV_EXPR) at
../../gcc/tree.c:847
#5 0x0000000000e4e1c0 in build2_stat (code=FLOOR_DIV_EXPR,
tt=0x7ffff6ebce70, arg0=0x7ffff6eca168,
arg1=0x7ffff6ec98a0) at ../../gcc/tree.c:4162
#6 0x000000000087d1ec in build2_stat_loc (loc=0, code=FLOOR_DIV_EXPR,
type=0x7ffff6ebce70, arg0=0x7ffff6eca168,
arg1=0x7ffff6ec98a0) at ../../gcc/tree.h:3476
#7 0x00000000008e3df9 in fold_build2_stat_loc (loc=0,
code=FLOOR_DIV_EXPR, type=0x7ffff6ebce70,
op0=0x7ffff6eca168, op1=0x7ffff6ec98a0) at ../../gcc/fold-const.c:15196
#8 0x0000000000882c56 in fold_negate_expr (loc=0, t=0x7ffff6eca190)
at ../../gcc/fold-const.c:704
#9 0x0000000000883143 in negate_expr (t=0x7ffff6eca190) at
../../gcc/fold-const.c:790
#10 0x00000000008bcacf in fold_binary_loc (loc=0, code=MULT_EXPR,
type=0x7ffff6ebce70, op0=0x7ffff6eca190,
op1=0x7ffff6ec98a0) at ../../gcc/fold-const.c:11116
#11 0x00000000008e3dd2 in fold_build2_stat_loc (loc=0, code=MULT_EXPR,
type=0x7ffff6ebce70, op0=0x7ffff6eca190,
op1=0x7ffff6ec98a0) at ../../gcc/fold-const.c:15194
#12 0x00000000008b3ca5 in fold_binary_loc (loc=0, code=MULT_EXPR,
type=0x7ffff6ebce70, op0=0x7ffff6ec98a0,
op1=0x7ffff6eca190) at ../../gcc/fold-const.c:10181
#13 0x00000000008e3dd2 in fold_build2_stat_loc (loc=0, code=MULT_EXPR,
type=0x7ffff6ebce70, op0=0x7ffff6ec98a0,
op1=0x7ffff6eca190) at ../../gcc/fold-const.c:15194
#14 0x00000000012955c3 in clast_to_gcc_expression
(type=0x7ffff6ebce70, e=0x1f71870, ip=0x7fffffffd890)
at ../../gcc/graphite-clast-to-gimple.c:458
#15 0x0000000001295166 in clast_to_gcc_expression_red
(type=0x7ffff6ebce70, op=PLUS_EXPR, r=0x1f74ef0,
ip=0x7fffffffd890) at ../../gcc/graphite-clast-to-gimple.c:395
#16 0x000000000129564a in clast_to_gcc_expression
(type=0x7ffff6ebce70, e=0x1f74ef0, ip=0x7fffffffd890)
at ../../gcc/graphite-clast-to-gimple.c:477
#17 0x0000000001295166 in clast_to_gcc_expression_red
(type=0x7ffff6ebce70, op=MIN_EXPR, r=0x1f752d0,
ip=0x7fffffffd890) at ../../gcc/graphite-clast-to-gimple.c:395
#18 0x0000000001295668 in clast_to_gcc_expression
(type=0x7ffff6ebce70, e=0x1f752d0, ip=0x7fffffffd890)
at ../../gcc/graphite-clast-to-gimple.c:480
#19 0x0000000001296c01 in graphite_create_new_loop_guard
(entry_edge=0x7ffff6e57d90, stmt=0x1f94a00,
type=0x7fffffffd768, lb=0x7fffffffd770, ub=0x7fffffffd778,
ip=0x7fffffffd890)
at ../../gcc/graphite-clast-to-gimple.c:1140
#20 0x0000000001296ef5 in translate_clast_for
(context_loop=0x7ffff6ec1000, stmt=0x1f94a00, next_e=0x7ffff6e57d90,
bb_pbb_mapping=..., level=0, ip=0x7fffffffd890) at
../../gcc/graphite-clast-to-gimple.c:1220
#21 0x00000000012971c0 in translate_clast
(context_loop=0x7ffff6ec1000, stmt=0x1f94a00, next_e=0x7ffff6e57d90,
bb_pbb_mapping=..., level=0, ip=0x7fffffffd890) at
../../gcc/graphite-clast-to-gimple.c:1307
#22 0x00000000012972b4 in translate_clast
(context_loop=0x7ffff6ec1000, stmt=0x1f7b180, next_e=0x7ffff6e57d90,
bb_pbb_mapping=..., level=0, ip=0x7fffffffd890) at
../../gcc/graphite-clast-to-gimple.c:1327
#23 0x0000000001297dee in gloog (scop=0x1f73cc0, bb_pbb_mapping=...)
at ../../gcc/graphite-clast-to-gimple.c:1705
#24 0x0000000001293dbb in graphite_transform_loops () at
../../gcc/graphite.c:309
#25 0x0000000001293e50 in graphite_transforms () at ../../gcc/graphite.c:345
...

Graphite is building a new loop guard. For this purpose it needs to
convert the clast expression with the Graphite-optimized code back
into Gimple (fr #18, clast_to_gcc_expression())

In frame #13 it is making an expression which is the product of
"((signed long) maxLen_5(D) - (signed long) minLen_3(D)) /[fl]
4294967296" and "-4294967296" using the constant sub-tree folding
facilities. The result should be "(((signed long) maxLen_5(D) -
(signed long) minLen_3(D)) /[fl] 4294967296) * -4294967296" which is
the correct expression that stand-alone cloog produced on our clast
data.

However, in frame #10 you can see that folding tries to optimize the
expression along the way by negating both of its parts (ok, no problem
yet):

/* Transform x * -C into -x * C if x is easily negatable. */
if (TREE_CODE (arg1) == INTEGER_CST
&& tree_int_cst_sgn (arg1) == -1
&& negate_expr_p (arg0)
&& (tem = negate_expr (arg1)) != arg1
&& !TREE_OVERFLOW (tem))
return fold_build2_loc (loc, MULT_EXPR, type,
fold_convert_loc (loc, type,
negate_expr (arg0)),
tem);

In frame 9 (negate_expr) it tries to find a simplified negation for
"((signed long) maxLen_5(D) - (signed long) minLen_3(D)) /[fl]
4294967296"

In frame 8 (fold_negate_expr) it finally picks the "appropriate"
negation. Which is (gasp)
"((signed long) maxLen_5(D) - (signed long) minLen_3(D)) /[fl] -4294967296".

This negation looks WRONG to me, since, as Mircea noted,
(1/[fl]4294967296)*-4294967296 = 0 BUT (1/[fl]-4294967296)*4294967296
= -1.

Looking at the code in fold_negate_expr() you can see that only
problems with overflow are taken into consideration for /[fl], and our
case where its denominator changes the sign is NOT checked:

case TRUNC_DIV_EXPR:
case ROUND_DIV_EXPR:
case FLOOR_DIV_EXPR:
case CEIL_DIV_EXPR:
case EXACT_DIV_EXPR:
/* In general we can't negate A / B, because if A is INT_MIN and
B is 1, we may turn this into INT_MIN / -1 which is undefined
and actually traps on some architectures. But if overflow is
undefined, we can negate, because - (INT_MIN / 1) is an
overflow. */
if (!INTEGRAL_TYPE_P (type) || TYPE_OVERFLOW_UNDEFINED (type))
{
const char * const warnmsg = G_("assuming signed overflow does not "
"occur when negating a division");
tem = TREE_OPERAND (t, 1);
if (negate_expr_p (tem))
{
if (INTEGRAL_TYPE_P (type)
&& (TREE_CODE (tem) != INTEGER_CST
|| integer_onep (tem)))
fold_overflow_warning (warnmsg, WARN_STRICT_OVERFLOW_MISC);
return fold_build2_loc (loc, TREE_CODE (t), type, // <==
yep, we execute this!
TREE_OPERAND (t, 0), negate_expr (tem));
}

The same logical fallacy can be seen in negate_expr_p() which checks
if an expression should be negated:

case TRUNC_DIV_EXPR:
case ROUND_DIV_EXPR:
case FLOOR_DIV_EXPR:
case CEIL_DIV_EXPR:
case EXACT_DIV_EXPR:
/* In general we can't negate A / B, because if A is INT_MIN and
B is 1, we may turn this into INT_MIN / -1 which is undefined
and actually traps on some architectures. But if overflow is
undefined, we can negate, because - (INT_MIN / 1) is an
overflow. */
if (INTEGRAL_TYPE_P (TREE_TYPE (t)))
{
if (!TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (t)))
break;
/* If overflow is undefined then we have to be careful because
we ask whether it's ok to associate the negate with the
division which is not ok for example for
-((a - b) / c) where (-(a - b)) / c may invoke undefined
overflow because of negating INT_MIN. So do not use
negate_expr_p here but open-code the two important cases. */
if (TREE_CODE (TREE_OPERAND (t, 0)) == NEGATE_EXPR
|| (TREE_CODE (TREE_OPERAND (t, 0)) == INTEGER_CST
&& may_negate_without_overflow_p (TREE_OPERAND (t, 0))))
return true;
}

I commented the line "case FLOOR_DIV_EXPR:" in negate_expr_p(), thus,
disabling incorrect negation optimization for /[fl]. And, rightfully
so, the test is now giving the correct result.

Bug in constant sub-tree folding?

Peace,
--
Vladimir

Vladimir Kargov

unread,
Apr 7, 2014, 5:20:09 AM4/7/14
to Mircea Namolaru, Tobias Grosser, GCC GRAPHITE
Small correction:

> This negation looks WRONG to me, since, as Mircea noted,
(1/[fl]4294967296)*-4294967296 = 0 BUT (1/[fl]-4294967296)*4294967296
= -1.
I said that (1/[fl]-4294967296)*4294967296 = -1, while it's clearly
-4294967296. This correction does not invalidate anything in my last
post.

--
Vladimir

Mircea Namolaru

unread,
Apr 10, 2014, 11:07:58 AM4/10/14
to Vladimir Kargov, Tobias Grosser, GCC GRAPHITE
Great ! Indeed the folding of floord is responsible for this problem.

It is not a Graphite problem - Graphite only exposed incorrect folding of the floord  operator.

I suggest sending  a patch with your fix to gcc-patches - I am not sure that it covers other possible wrong folding of FLOORD beside this example (and maybe also the folding of CEIL should be reviewed). But enough to fix this problem and another open Graphite bug on bugzilla:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55022
I've checked it, same problem - wrong code caused by folding of floord at tree-SSA level. And with Vladimir fix the code returned to work.

Mircea



 

Vladimir Kargov

unread,
Apr 11, 2014, 12:06:52 AM4/11/14
to Mircea Namolaru, Tobias Grosser, GCC GRAPHITE
On 10 April 2014 19:07, Mircea Namolaru <mircea....@gmail.com> wrote:
> Great ! Indeed the folding of floord is responsible for this problem.
> It is not a Graphite problem - Graphite only exposed incorrect folding of
> the floord operator.

Thanks for reconfirming.

> I suggest sending a patch with your fix to gcc-patches - I am not sure that
> it covers other possible wrong folding of FLOORD beside this example (and
> maybe also the folding of CEIL should be reviewed). But enough to fix this
> problem and another open Graphite bug on bugzilla:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55022

CEIL seems to be equally affected by the problem. I'll post the patch
after some thinking when floor and ceil _can_ be optimized the old way
so that the opimization would still work when possible. (Any ideas how
to check that, by the way, other than proving that both values that
are being interchanged are known to be of equal sign OR abs(numerator)
>= abs(denominator)?)

> I've checked it, same problem - wrong code caused by folding of floord at
> tree-SSA level. And with Vladimir fix the code returned to work.

It also fixed bzip2 which was initially reported in the thread, so
looks like a pretty universal cure.

--
Vladimir

Vladimir Kargov

unread,
Apr 11, 2014, 12:58:06 AM4/11/14
to Mircea Namolaru, Tobias Grosser, GCC GRAPHITE
On 11 April 2014 08:06, Vladimir Kargov <kar...@gmail.com> wrote:
> when floor and ceil _can_ be optimized the old way so that the opimization would still work when possible
> other than proving that both values that are being interchanged are known to be of equal sign OR abs(numerator) >= abs(denominator)
Scratch that, I think the only case when -(a/[fl/cl]b) ==
(a/[fl/cl]-b) can happen is when (a mod b) == 0 (i.e. there is no
remainder), since otherwise the result will be rounded to the other
integer after negation than it was before. This condition seems to be
hard to check at compile time for anything other than constants and I
don't think there are many uses which would benefit from this.

Sorry for double posting again.
--
Vladimir

Albert Cohen

unread,
Apr 11, 2014, 1:26:44 AM4/11/14
to Vladimir Kargov, Tobias Grosser, gcc-graphite, Mircea Namolaru
Thanks a lot for ironing this out, Vladimir, Mircea, and for the patch. A conservative patch may be just fine for now. The general problem is slightly combinatorial because of the many ways to lower integer division.

Albert

Le 11 avr. 2014 06:06, Vladimir Kargov <kar...@gmail.com> a écrit :
>
> On 10 April 2014 19:07, Mircea Namolaru <mircea....@gmail.com> wrote:
> > Great ! Indeed the folding of floord is responsible for this problem.
> > It is not a Graphite problem - Graphite only exposed incorrect folding of
> > the floord  operator.
>
> Thanks for reconfirming.
>
> > I suggest sending  a patch with your fix to gcc-patches - I am not sure that
> > it covers other possible wrong folding of FLOORD beside this example (and
> > maybe also the folding of CEIL should be reviewed). But enough to fix this
> > problem and another open Graphite bug on bugzilla:
> >
> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55022
>
> CEIL seems to be equally affected by the problem. I'll post the patch
> after some thinking when floor and ceil _can_ be optimized the old way
> so that the opimization would still work when possible. (Any ideas how

> to check that, by the way, other than proving that both values that

> are being interchanged are known to be of equal sign OR abs(numerator)

> >= abs(denominator)?)
>
> > I've checked it, same problem - wrong code caused by folding of floord at
> > tree-SSA level. And with Vladimir fix the code returned to work.
>
> It also fixed bzip2 which was initially reported in the thread, so
> looks like a pretty universal cure.
>
> --
> Vladimir
>

> --
>
> ---
> You received this message because you are subscribed to the Google Groups "GCC GRAPHITE" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to gcc-graphite...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Sven Verdoolaege

unread,
Apr 11, 2014, 5:01:19 AM4/11/14
to Vladimir Kargov, Mircea Namolaru, Tobias Grosser, GCC GRAPHITE
Indeed, I believe at that point it's too late to check for this
condition. Note that the isl AST generator does try to check
for this condition and will generate different operations depending
on the outcome.

skimo

Vladimir Kargov

unread,
Apr 11, 2014, 7:55:25 AM4/11/14
to Albert Cohen, Sven Verdoolaege, Tobias Grosser, gcc-graphite, Mircea Namolaru
On 11 April 2014 09:26, Albert Cohen <Albert...@inria.fr> wrote:
> A conservative patch may be just fine for now.
Richard Biener has just assigned bug 55022 to himself and wrote he'll
be testing the conservative version, so I guess this makes him the one
to commit the fix.

On 11 April 2014 13:01, Sven Verdoolaege <sk...@kotnet.org> wrote:
> Note that the isl AST generator does try to check for this condition and will generate different operations depending on the outcome.
Interesting, though I think there must be some other uses of floor
division outside Graphite which may not do such smart checking.

--
Vladimir

Vladimir Kargov

unread,
Mar 2, 2015, 9:41:15 PM3/2/15
to aeonio...@gmail.com, GCC GRAPHITE
On 3 March 2015 at 04:46, <aeonio...@gmail.com> wrote:

>"Fixing" graphite for that is probably uncalled for unless
> errors in gcc proper can be ruled out.

It should be very easy to check that. If compiling the program with
-O3 produces a running executable and using "-O3 -fgraphite-identity"
breaks it, then either Graphite itself, or some other compiler
component internally used by Graphite are to blame, like it was in
this case (which was fixed almost a year ago, by the way).
If compiling the code with -O3 alone breaks the executable, then
you've got a problem with some other optimization, most likely
unrelated to Graphite.

--
Vladimir

aeonio...@gmail.com

unread,
Mar 2, 2015, 11:22:49 PM3/2/15
to gcc-gr...@googlegroups.com, kar...@gmail.com, tob...@grosser.es, mircea....@gmail.com
Err, I don't know if this is relevant anymore, but I'd like to point out that -O3 is not supported on gentoo in general due to the fact that it breaks programs. "Fixing" graphite for that is probably uncalled for unless errors in gcc proper can be ruled out.

On the other hand the last time I tried compiling my system with graphite that didn't work out so well. Now that I know more about what the options do things might be different, but unfortunately I'm no longer running gentoo so I can't really test it. :(
Reply all
Reply to author
Forward
0 new messages