
Mixed arithmetic


Bart

Aug 9, 2018, 11:32:07 AM
If I run this code, with or without the LL, then I get the same
result with gcc on Windows or Linux as a 32- or 64-bit program:

#include <stdio.h>

int main(void) {
long long int value=0;

if (!(-2147483648 <= value && value <= 4294967295LL)) {
printf("outside range: %lld\n",value);
}
}


That is, it should print nothing. But on gcc 4.4.4 for Linux on 32-bit
ARM, it triggers the message.

Is the program at fault or is it a buggy gcc? It does give the warning:

'this decimal constant is unsigned only in ISO C90'.

So is that second compare done as signed or unsigned? It needs to be
signed 64-bit.

(The purpose of the code is to detect values that are within the range
of both 32-bit signed ints and unsigned ints, so within approx -2
billion to +4 billion.)

(This is the same gcc that didn't like repeated typedef names. And if I
try -std=c99, it goes crazy and throws out everything.)

--
bart

jacobnavia

Aug 9, 2018, 11:52:13 AM
Le 09/08/2018 à 17:32, Bart a écrit :
> #include <stdio.h>
>
> int main(void) {
> long long int value=0;
>
>     if (!(-2147483648 <= value && value <= 4294967295LL)) {
>         printf("outside range: %lld\n",value);
>     }
> }

lcc-win prints nothing
clang doesn't print anything either.

Barry Schwarz

Aug 9, 2018, 12:11:38 PM
What are the sizes of int, long int, and long long int?
Remove del for email

Tim Rentsch

Aug 9, 2018, 3:43:18 PM
What happens when you compile using 'gcc -pedantic-errors' ?

Bart

Aug 9, 2018, 5:43:26 PM
On 09/08/2018 20:43, Tim Rentsch wrote:
> Bart <b...@freeuk.com> writes:

>> int main(void) {
>> long long int value=0;
>>
>> if (!(-2147483648 <= value && value <= 4294967295LL)) {
>> printf("outside range: %lld\n",value);
>> }
>> }

>> (This is the same gcc that didn't like repeated typedef names. And if
>> I try -std=c99, it goes crazy and throws out everything.)

[That's when I used -std=c99 on the main applications, which are 10-25Kloc]

> What happens when you compile using 'gcc -pedantic-errors' ?

It says C90 doesn't support long long. And with just -pedantic, it gives
a warning rather than an error. But it's extraordinary that in the
absence of any such options, it chooses to say nothing about such a
significant shortcoming.

Using -std=c99, this short program works.

But then...

On 09/08/2018 17:11, Barry Schwarz wrote:
> What are the sizes of int, long int, and long long int?

4, 4 and 8 respectively.

So even though it doesn't support long long, it appears to have a size
for it!

(I think I'll have to investigate the use of -std=c99 here, although
with my actual applications, that gives hundreds of struct/member errors
on code that compiles fine with gcc on other systems. Maybe it doesn't
like my anonymous structs and unions? I'd have to see.

Or maybe just forget this particular implementation. Unfortunately this
Ubuntu is too old to update and I don't think I can update gcc by itself.)

--
bart


james...@alumni.caltech.edu

Aug 9, 2018, 6:49:04 PM
On Thursday, August 9, 2018 at 11:32:07 AM UTC-4, Bart wrote:
> If I run this code, with or without the LL, then I get the same
> result with gcc on Windows or Linux as a 32- or 64-bit program:
>
> #include <stdio.h>
>
> int main(void) {
> long long int value=0;
>
> if (!(-2147483648 <= value && value <= 4294967295LL)) {
> printf("outside range: %lld\n",value);
> }
> }
>
>
> That is, it should print nothing. But on gcc 4.4.4 for Linux on 32-bit
> ARM, it triggers the message.
>
> Is the program at fault or is it a buggy gcc? It does give the warning:
>
> 'this decimal warning is unsigned only in ISO C90'.
>
> So is that second compare done as signed or unsigned? It needs to be
> signed 64-bit.
>
> (The purpose of the code is to detect values that are within the range
> of both 32-bit signed ints and unsigned ints, so within approx -2
> billion to +4 billion.)

What output do you get from the following program, when compiled the same way as above:

#include <stdio.h>
#include <limits.h>

int main(void)
{
long long int value=0;

if (!(-2147483648 <= value && value <= 4294967295LL)) {
printf("outside range: %lld\n",value);
}
printf("%d %ld %lld\n", INT_MAX, LONG_MAX, LLONG_MAX);
printf("2147483648:%lld\n", (long long)2147483648);
printf("-2147483648:%lld\n", (long long)-2147483648);
printf("4294967295:%lld\n", (long long)4294967295);
return 0;
}

Tim Rentsch

Aug 9, 2018, 8:10:09 PM
Bart <b...@freeuk.com> writes:

> On 09/08/2018 20:43, Tim Rentsch wrote:
>
>> Bart <b...@freeuk.com> writes:
>>
>>> int main(void) {
>>> long long int value=0;
>>>
>>> if (!(-2147483648 <= value && value <= 4294967295LL)) {
>>> printf("outside range: %lld\n",value);
>>> }
>>> }
>>>
>>> (This is the same gcc that didn't like repeated typedef names. And if
>>> I try -std=c99, it goes crazy and throws out everything.)
>
> [That's when I used -std=c99 on the main applications, which are 10-25Kloc]
>
>> What happens when you compile using 'gcc -pedantic-errors' ?
>
> It says C90 doesn't support long long. And with just -pedantic, it
> gives a warning rather than an error. But it's extraordinary that in
> the absence of any such options, it will choose to say nothing over
> such a significant shortcoming.

If you're compiling with no options, that could explain both the
acceptance of 'long long' and the result you're getting, i.e.,
that the message is printed.

> Using -std=c99, this short program works.
>
> But then...
>
> On 09/08/2018 17:11, Barry Schwarz wrote:
>
>> What are the sizes of int, long int, and long long int?
>
> 4, 4 and 8 respectively.
>
> So even though it doesn't support long long, it appears to have a size
> for it!

Again that could be explained by compiling with no options
given.

Try this test:

    if( -2147483647L - 1 <= value && value <= 4294967295LL ){
        printf( "value is within range\n" );
    } else {
        printf( "value is outside range\n" );
    }

There's a fair chance that will work for you.

> (I think I'll have to investigate the use of -std=c99 here, although
> with my actual applications, that gives hundreds of struct/member
> errors on code that compiles fine with gcc on other systems. Maybe it
> doesn't like my anonymous structs and unions? I'd have to see.

The C standard added support for anonymous structs and unions in
C11. So you could try -std=c11. If the compiler you have
doesn't support that option, you might try -std=gnu99 (which I
have used only in short tests but I think it may be what you're
looking for).

Bart

Aug 9, 2018, 9:00:22 PM
On 09/08/2018 23:48, james...@alumni.caltech.edu wrote:
> On Thursday, August 9, 2018 at 11:32:07 AM UTC-4, Bart wrote:

>> int main(void) {
>> long long int value=0;
>>
>> if (!(-2147483648 <= value && value <= 4294967295LL)) {
>> printf("outside range: %lld\n",value);
>> }
>> }
>>
>>
>> That is, it should print nothing. But on gcc [4.4.3] for Linux on 32-bit
>> ARM, it triggers the message.

> What output do you get from the following program, when compiled the same way as above?:
>
> #include <stdio.h>
> #include <limits.h>
>
> int main(void)
> {
> long long int value=0;
>
> if (!(-2147483648 <= value && value <= 4294967295LL)) {
> printf("outside range: %lld\n",value);
> }
> printf("%d %ld %lld\n", INT_MAX, LONG_MAX, LLONG_MAX);
> printf("2147483648:%lld\n", (long long)2147483648);
> printf("-2147483648:%lld\n", (long long)-2147483648);
> printf("4294967295:%lld\n", (long long)4294967295);
> return 0;
> }
>

Output is:

....:~$ ./a.out
outside range: 0
2147483647 2147483647 9223372036854775807
2147483648:2147483648
-2147483648:2147483648
4294967295:4294967295

Same with a gcc 4.8.x, both running on 32-bit ARM Linux.

But on 32/64-bit programs under Windows 64 and Linux 64, both on x64,
that third line of numbers shows both negative.

--
bart

anti...@math.uni.wroc.pl

Aug 9, 2018, 9:35:22 PM
Hmm, IIUC in C89 (without long long) on 32-bit machines your program
has undefined behaviour: the largest "official" type is long and
2147483648 does not fit: it leads to overflow. It is probably
wise to use -2147483648LL to make sure that compiler either
rejects it (if there is no support for long long) or handles
it correctly without overflow.

--
Waldek Hebisch

David Brown

Aug 10, 2018, 4:36:12 AM
You have hit a somewhat interesting corner case here. The details
depend on the particular standard version being used, the variant being
used, and the size of "long" on the target. It does not depend on the
version of gcc, as far as I have tested - and (shock! horror! Who would
have guessed?) gcc is generating the correct code as well as giving you a
(somewhat obscure) warning that your code might not do what you think it
does.

First, note that without a specific "-std" option, gcc 4.4 uses the
"gnu89" "standard". gcc 5 onwards defaults to "gnu11".

In this code, there is no relevant difference between "gnu11" and
standard C11. Indeed, here there is no difference from standard C99.

The key relevant difference here between "gnu89" and "c89 -pedantic" is
that "gnu89" supports "long long int" even though that type does not
exist in C89.


Secondly, note that "long" is 32-bit on the ARM, gcc for Windows, and
gcc for Linux with the "-m32" switch. "Long" is 64-bit on 64-bit linux.
"Long long" is 64-bit on all relevant platforms.


Armed with this, it is easy to see why you would get compile failures
compiling in stricter C89/C90 modes - the type "long long int" does not
exist, and using it is a constraint violation. So we move on to "gnu89"
compared to C99.

The critical point is the integer constant 2147483648 - equal to 2^31.
Note that in C, something that looks like a negative integer constant is
actually a /positive/ integer constant with the unary minus operator
applied. So the constant is /not/ -2147483648, which would fit in a
32-bit signed int - it is 2147483648 and then the minus is applied.
2147483648 does not fit in a signed int. So what happens to it? The
rules have changed between C89 and C99.

In C99, the table in 6.4.4.1 shows that if a decimal constant does not
fit in "int", the next one tried is "long int", and then if necessary
"long long int". At no point are unsigned types used (unless you use a
"u" suffix, or octal or hexadecimal). So the constant here will be of
type "long int" on 64-bit Linux, and "long long int" on the other
platforms discussed. The minus is applied without changing the type,
everything is in range, everyone is happy, and no "outside range"
message is printed.

In C89, the order for applying types to a decimal constant is "int",
"long int", "unsigned long int". On 64-bit Linux, that would mean "long
int" - and again, everything is in range and everyone is happy (assuming
you are using "gnu89" to allow "long long int value;").

For the other targets here, in C89, 2147483648 is of type "unsigned long
int". Applying the unary minus to it gives the same value - 2147483648
of type "unsigned long int". And from there, it is clear that the
program will print "outside range".

gcc - in C89 mode - gives the warning "this decimal constant is unsigned
only in ISO C90". That's not /quite/ accurate, since it also applies to
"gnu89" mode, but it's close enough.


In summary, in C99 modes, the code is fine and no "outside range" should
be printed. In C89 modes, the code is in error due to the "long long
int" type (and the "LL" suffix). In "C89 with a touch of C99" modes,
you could argue that the compiler could have borrowed the handling of
large integer constants from C99 when it borrowed the large integer
types. However, a principle of gcc's "extended" modes is that if you
give the compiler correct, standard C89 code then it will interpret it
the same way in "c89" and "gnu89" modes. The extensions are added
features, not changes to existing features, and C89 says that 2147483648
is an unsigned long int constant (when "long int" is 32-bit).


Since I don't know what standards, or modified standards, your other
compilers use by default, I can't say if they are doing the right thing
or not.



Tim Rentsch

Aug 10, 2018, 6:33:36 AM
anti...@math.uni.wroc.pl writes:

> Bart <b...@freeuk.com> wrote:
>> On 09/08/2018 23:48, james...@alumni.caltech.edu wrote:
>>> On Thursday, August 9, 2018 at 11:32:07 AM UTC-4, Bart wrote:
>>>
>>>> int main(void) {
>>>> long long int value=0;
>>>>
>>>> if (!(-2147483648 <= value && value <= 4294967295LL)) {
>>>> printf("outside range: %lld\n",value);
>>>> }
>>>> }

[...]

> Hmm, IIUC in C89 (without long long) on 32-bit machines your
> program has undefined behaviour: the largest "official" type is
> long and 2147483648 does not fit: it leads to overflow. It is
> probably wise to use -2147483648LL to make sure that compiler
> either rejects it (if there is no support for long long) or
> handles it correctly without overflow.

The rules for integer constants in C89/C90 are not the same as
those in C99 and later. For unsuffixed decimal constants, the
type is the first of 'int', 'long int', and 'unsigned long int',
that will hold the constant's value. So 2147483648 (and also
4294967295, without any suffix) do not result in overflow but
are legitimate constants of type unsigned long int (ie, if the
implementation has 32-bit wide longs).

To get the desired value as type long rather than unsigned long,
the expression

(-2147483647L - 1)

can be written, and this will work for all implementations except
the oddball ones that don't use 2's complement or do use 2's
complement but use the 0x80000000 bit pattern to be (what is
later called) a trap representation. And of course it will work
on any implementation where long is wider than 32 bits.

To be completely safe, the left hand comparison could be written

value >= 0 || -(value+1) <= 2147483647

which sidesteps the problem of needing to represent -2147483648.
Also, as an additional benefit, it works and avoids surprises
in cases where 'value' has an unsigned type.

james...@alumni.caltech.edu

Aug 10, 2018, 8:04:59 AM
OK - that makes it clear what's happening. The key point is that integer
constants in C cannot be negative. 2147483648 is an integer constant,
but -2147483648 is not an integer constant, it's an expression, applying
the unary minus operator to the integer constant 2147483648.

On the system you're compiling for, 2147483648 is too big to be
represented as an int. It's also too big to be represented as a long
int. Under C99 rules, that would mean that 2147483648 would have the
type long long. However, since you insist on using gcc with no options
selected, what it implements is not C, but GnuC. GnuC supports long
long, even in GnuC90. However, it would appear that the version of GnuC
you're using implements a variant of the C90 rules. Under those rules,
if an integer constant is too big to be represented by a long int, but
is small enough to be represented as an unsigned long int, then it has
the type unsigned long int.

The expression -2147483648 is evaluated by first taking the negative of
2147483648, which is a value that cannot be represented as an unsigned
long int. Therefore, the number that is 1 more than ULONG_MAX is added
to it as many times as needed to bring it into the range that is
representable. ULONG_MAX is, presumably, 4294967295, so a single addition
is enough: (-2147483648) + 4294967296 == 2147483648, so that's why you
got that result.

This is why INT_MIN is probably defined for that machine as something
like (-2147483647-1).

Bart

Aug 10, 2018, 10:51:35 AM
On 10/08/2018 09:36, David Brown wrote:
> On 09/08/18 17:32, Bart wrote:
>> If I run this code, with or without the LL, then I get the same
>> result with gcc on Windows or Linux as a 32- or 64-bit program:

> You have hit a somewhat interesting corner case here. The details
> depend on the particular standard version being used, the variant being
> used, and the size of "long" on the target. It does not depend on the
> version of gcc, as far as I have tested - and (shock! horror! Who would
> have guess?) gcc is generating the correct code as well as giving you a
> (somewhat obscure) warning that your code might not do what you think it
> does.

Here's a more interesting test:

#include <stdio.h>

int main(void){

    long long int value = 0;

    printf("value %lld:\n",value);

    if (-2147483648 <= value && value <= 2147483647) {
        puts("in int range");
    }
    if (0 <= value && value <= 4294967296) {
        puts("in word range");
    }
    if (-2147483648 <= value && value <= 4294967296) {
        puts("in intword range");
    }
    if (-2147483648 <= value && value <= 4294967295) {
        puts("in intword range2");
    }
}

Most Windows compilers generate this output ('word' means unsigned int):

value 0:
in int range
in word range
in intword range
in intword range2

Except for Tiny C and MSVC, which generate

value 0:
in word range

So only the second if-condition is true. This is the same output on
32-bit Linux with that funny gcc version.

>
> The critical point is the integer constant 2147483648 - equal to 2^31.
> Note that in C, something that looks like a negative integer constant is
> actually a /positive/ integer constant with the unary minus operator
> applied. So the constant is /not/ -2147483648, which would fit in a
> 32-bit signed int - it is 2147483648 and then the minus is applied.
> 2147483648 does not fit in a signed int. So what happens to it? The
> rules have changed between C89 and C99.

I see. On my C compiler 2147483648 has long long type, so subsequent
operations like NEG will have a long long result too.

OK, so the challenge then is how to generate code that will work on any
C compiler, one that ought to support 64-bit operations. You wouldn't
have thought that would be a problem in 2018.

The original source for that first test uses "int.minvalue", which has
the internal value -2147483648, but the representation in C involves
"2147483648", which is what gives the problems.

I could reduce my range by 1 to get round it, and mostly that will work
(it is usually to decide whether something will fit into a 32-bit
field), but sometimes that int.minvalue figure has to be exact.

I will try outputting such boundary values as (-2147483647-1), as I
think I saw suggested, and see what happens. Testing on the above code
shows it might work.

(It is still odd how a major compiler like MSVC can produce such a
dramatically different result with the same code.)

--
bart

David Brown

Aug 10, 2018, 11:22:44 AM
Did you read my post?

None of your results here are surprising.

Nor is it surprising that you don't appreciate the difference between
C90 and C99 in these cases, even though I and a couple of others have
explained it. Nor is it surprising that you don't know what C standards
the various compilers support (or claim to support), nor that you
haven't posted the flags used to affect such support.

>
>>
>> The critical point is the integer constant 2147483648 - equal to 2^31.
>> Note that in C, something that looks like a negative integer constant is
>> actually a /positive/ integer constant with the unary minus operator
>> applied. So the constant is /not/ -2147483648, which would fit in a
>> 32-bit signed int - it is 2147483648 and then the minus is applied.
>> 2147483648 does not fit in a signed int. So what happens to it? The
>> rules have changed between C89 and C99.
>
> I see. On my C compiler 2147483648 has long long type so subsequent
> operations like NEG will have a long long result too.

That's fine for C99. It is wrong for C90. Which standard do you aim to
support? (If the answer is neither, and you just aim to support
something roughly C-like, then that's fine - but you can't then suggest
that your compiler is "better" than other tools in how it treats code.)

>
> OK, so the challenge then is how to generate code that will work on any
> C compiler, one that ought to support 64-bit operations. You wouldn't
> have thought that would be a problem in 2018.

Nor is it a problem if you are willing to write correct code. If you
want support for 64-bit operations, use a compiler that supports C99 or
later (and /tell/ the compiler to use that mode!), or use a compiler
that has 64-bit "long".

If you use a compiler and standard that does not have types bigger than
32-bit, don't be surprised that you have problems with code that relies
on bigger values.

>
> The original source for that first test uses "int.minvalue", which has
> the internal value -2147583648, but the representation in C involves
> "2147583648" which is what gives the problems.
>
> I could reduce by range by 1 to get round it, and mostly that will work
> (it is usually to decide whether something will fit into a 32-bit
> field), but sometimes that int.minvalue figure has to be exact.
>
> I will try outputting such boundary values as (-2147483647-1), as I
> think I saw suggested, and see what happens. Testing on the above code
> shows it might work.

(-2147483647 - 1) should give you the value you want (on all realistic
targets, anyway). The order of evaluation here is: 2147483647 is an
integer constant of type "int" (on 32-bit int platforms), the expression
-2147483647 is a constant integer expression of type int, and then
(-2147483647 - 1) is also a constant integer expression of type int and
value -2147483648.

Of course, since you are using type "long long int", you are really
asking for C99 anyway. Ask for it /properly/ - with the right compiler
flags as needed - and you can write your code clearly and simply as
-2147483648.

>
> (It is still odd how a major compiler like MSVC can produce such a
> dramatically different result with the same code.)
>

C99 and C90 differ in how this code is supposed to be interpreted. The
compilers are doing their job.

Bart

Aug 10, 2018, 11:56:24 AM
On 10/08/2018 16:22, David Brown wrote:

>> I will try outputting such boundary values as (-2147483647-1), as I
>> think I saw suggested, and see what happens. Testing on the above code
>> shows it might work.
>
> (-2147483647 - 1) should give you the value you want (on all realistic
> targets, anyway). The order of evaluation here is: 2147483647 is an
> integer constant of type "int" (on 32-bit int platforms), the expression
> -2147483647 is a constant integer expression of type int, and then
> (-2147483647 - 1) is also a constant integer expression of type int and
> value -2147483648.
>
> Of course, since you are using type "long long int", you are really
> asking for C99 anyway. Ask for it /properly/ - with the right compiler
> flags as needed - and you can write your code clearly and simply as
> -2147483648.

I tried -std=c99 on that funny gcc 4.4.3 compiler, and while it fixed
this INT_MIN problem, it proceeded to throw out all my anonymous structs
and unions which some of my applications absolutely rely on.

So it's a minefield.

From my point of view, I'm coming to the end of this C targetting
project, as my apps now do generally work on multiple platforms via C.
Maybe look at a few lingering issues with some apps on ARM32.

But that's with gcc. To make them work across multiple compilers is yet
another minefield which I'm not interested in getting into again.

Generating mere assembly code seems so simple in comparison! Wasn't
targetting a HLL meant to make things easier?

--
bart

Keith Thompson

Aug 10, 2018, 2:42:55 PM
Bart <b...@freeuk.com> writes:
[...]
> I tried -std=c99 on that funny gcc 4.4.3 compiler, and while it fixed
> this INT_MIN problem, it proceeded to throw out all my anonymous structs
> and unions which some of my applications absolutely rely on.

And you're surprised? Anonymous structs and unions were introduced
in C11.

It looks like gcc 4.4.3 doesn't support "-std=c11". Are you unable to
use a more modern compiler?

[...]

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Ben Bacarisse

Aug 10, 2018, 3:45:55 PM
Bart <b...@freeuk.com> writes:

> Generating mere assembly code seems so simple in comparison! Wasn't
> targetting a HLL meant to make things easier?

No. In general, language translation has always been considered harder
than targeting assembler. When it's done, it's usually to get the
benefit of an existing optimiser, or to make it easier to move between
systems (OSs and architectures).

--
Ben.

Bart

Aug 10, 2018, 3:51:30 PM
On 10/08/2018 19:42, Keith Thompson wrote:
> Bart <b...@freeuk.com> writes:
> [...]
>> I tried -std=c99 on that funny gcc 4.4.3 compiler, and while it fixed
>> this INT_MIN problem, it proceeded to throw out all my anonymous structs
>> and unions which some of my applications absolutely rely on.
>
> And you're surprised? Anonymous structs and unions were introduced
> in C11.
>
> It looks like gcc 4.4.3 doesn't support "-std=c11". Are you unable to
> use a more modern compiler?

Not on that machine.

Yet it apparently does support those features if you don't specify
'-std=c99'. It seems that adding -std=c99 enables some c99 features, but
disables others.

Anyway with the fix I did with INT_MIN, 4 out of 5 of my applications,
expressed as a single C file, compile and run on that machine (old Linux
netbook).

5 out of 5 run on the original RPi also with ARM32 but a 4.8 gcc, and
all five run on 4 other platforms via C (x86 and x64 with Windows and
Linux). (I don't have an ARM64 machine to test on.)

So I need to find out what the problem is with that app on that one machine.

--
bart

james...@alumni.caltech.edu

Aug 10, 2018, 4:09:49 PM
On Friday, August 10, 2018 at 3:51:30 PM UTC-4, Bart wrote:
> On 10/08/2018 19:42, Keith Thompson wrote:
> > Bart <b...@freeuk.com> writes:
> > [...]
> >> I tried -std=c99 on that funny gcc 4.4.3 compiler, and while it fixed
> >> this INT_MIN problem, it proceeded to throw out all my anonymous structs
> >> and unions which some of my applications absolutely rely on.
> >
> > And you're surprised? Anonymous structs and unions were introduced
> > in C11.
> >
> > It looks like gcc 4.4.3 doesn't support "-std=c11". Are you unable to
> > use a more modern compiler?
>
> Not on that machine.
>
> Yet it apparently does support those features if you don't specify
> '-std=c99'. It seems that adding -std=c99 enables some c99 features, but
> disables others.

More accurately, it disables many GnuC features, some of which happen to
also be C99 features. If you're going to insist on using gcc without
command line options, you should make an effort to learn more about the
language it implements when no options are selected. GnuC has numerous
substantial differences from C.

Keith Thompson

Aug 10, 2018, 4:17:34 PM
Bart <b...@freeuk.com> writes:
> On 10/08/2018 19:42, Keith Thompson wrote:
>> Bart <b...@freeuk.com> writes:
>> [...]
>>> I tried -std=c99 on that funny gcc 4.4.3 compiler, and while it fixed
>>> this INT_MIN problem, it proceeded to throw out all my anonymous structs
>>> and unions which some of my applications absolutely rely on.
>>
>> And you're surprised? Anonymous structs and unions were introduced
>> in C11.
>>
>> It looks like gcc 4.4.3 doesn't support "-std=c11". Are you unable to
>> use a more modern compiler?
>
> Not on that machine.
>
> Yet it apparently does support those features if you don't specify
> '-std=c99'. It seems that adding -std=c99 enables some c99 features, but
> disables others.

What C99 features does it disable? (Anonymous structs and unions are
not a C99 feature.)

Have you tried '-std=gnu99'? Using gcc 4.4.7 on godbolt.org, that seems
to accept anonymous structures without complaint.

David Brown

Aug 10, 2018, 7:15:59 PM
On 10/08/18 17:56, Bart wrote:
> On 10/08/2018 16:22, David Brown wrote:
>
>>> I will try outputting such boundary values as (-2147483647-1), as I
>>> think I saw suggested, and see what happens. Testing on the above code
>>> shows it might work.
>>
>> (-2147483647 - 1) should give you the value you want (on all realistic
>> targets, anyway).  The order of evaluation here is: 2147483647 is an
>> integer constant of type "int" (on 32-bit int platforms), the expression
>> -2147483647 is a constant integer expression of type int, and then
>> (-2147483647 - 1) is also a constant integer expression of type int and
>> value -2147483648.
>>
>> Of course, since you are using type "long long int", you are really
>> asking for C99 anyway.  Ask for it /properly/ - with the right compiler
>> flags as needed - and you can write your code clearly and simply as
>> -2147483648.
>
> I tried -std=c99 on that funny gcc 4.4.3 compiler, and while it fixed
> this INT_MIN problem, it proceeded to throw out all my anonymous structs
> and unions which some of my applications absolutely rely on.
>
> So it's a minefield.

C99 does not have anonymous structs and unions - it is not surprising
that the compiler complained about the code. Nor does C90. They were
not part of the C standards until C11. "gnu89" and "gnu99" both have
them as extensions.

So if you want to program in "C99 with gnu extensions", use "gnu99" as
your standard of choice with gcc. That is absolutely fine - it is the
standard I always used (specified explicitly as a compiler flag), until
"gnu11" was supported.

(gcc 4.4.3 is quite old, by the way. gcc has come a long way in the
last 8 years.)

>
> From my point of view, I'm coming to the end of this C targetting
> project, as my apps now do generally work on multiple platforms via C.
> Maybe look at a few lingering issues with some apps on ARM32.
>
> But that's with gcc. To make them work across multiple compilers is yet
> another minefield which I'm not interested in getting into again.
>
> Generating mere assembly code seems so simple in comparison! Wasn't
> targetting a HLL meant to make things easier?
>

It /is/ easier, but you have to know what language you are targetting.
You can't tell the details by testing random compilers with jumbled
standards support (including your own, supporting a mash of different
standards and extensions and modifications to C).

David Brown

Aug 10, 2018, 7:20:56 PM
On 10/08/18 21:51, Bart wrote:
> On 10/08/2018 19:42, Keith Thompson wrote:
>> Bart <b...@freeuk.com> writes:
>> [...]
>>> I tried -std=c99 on that funny gcc 4.4.3 compiler, and while it fixed
>>> this INT_MIN problem, it proceeded to throw out all my anonymous structs
>>> and unions which some of my applications absolutely rely on.
>>
>> And you're surprised?  Anonymous structs and unions were introduced
>> in C11.
>>
>> It looks like gcc 4.4.3 doesn't support "-std=c11".  Are you unable to
>> use a more modern compiler?
>
> Not on that machine.
>
> Yet it apparently does support those features if you don't specify
> '-std=c99'. It seems that adding -std=c99 enables some c99 features, but
> disables others.
>

No, specifying "-std=c99" enables /all/ C99 features and disables most
non-C99 features (it still allows a few extras - use "-pedantic" to weed
them out if you want). In particular, it disables some of the language
extensions that the default "-std=gnu89" supported.

The features you are wanting here - anonymous structs and unions - are
not C99 features. They are C11 features, but have been supported in gcc
(and some other compilers) as an extension for a great many years. So
that means you need either C11 or a "gnuXX" standard.

You also want "long long int" support - that means C99 or a "gnuXX"
standard. And you want "long long int" integer constants - that means
C99 or "gnu99" onwards.

And you don't want C11, since you want to use gcc compilers older than
gcc 5.

This all boils down to "-std=gnu99" being the right choice for you. Use
that, and I think many of your issues here will go away.

Bart

unread,
Aug 10, 2018, 9:14:50 PM8/10/18
to
On 10/08/2018 20:51, Bart wrote:

> So I need to find out what the problem is with that app on that one
> machine.

That problem is illustrated here:

    char data[100];

//  typedef double T;
    typedef long long int T;

    int main(void)
    {
        T x;
        T *p,*q;

        p = &x;
        q = (T*)(&data[1]);

        *p = *q;
    }

As it is, this does an assignment of a 64-bit int from an unaligned
source. And it works (if it didn't, I would have done something about it
long ago.)

But change the typedef of T to double, and it says 'Illegal
instruction'. Presumably this processor (some sort of 32-bit ARM)
doesn't like unaligned accesses for floating point data (but the one in
the RPi is OK with it).

Odd. I don't do much with unaligned accesses (the data[] array
represents byte-code data from a file which may contain packed floating
point values at any odd index), but now I can fix that knowing the problem.

I don't think it's a C issue. Certainly enabling a bunch of extra
warnings didn't throw up anything. (But then they usually don't detect
real problems.)

--
bart

Ian Collins

unread,
Aug 10, 2018, 10:07:47 PM8/10/18
to
On 11/08/18 13:14, Bart wrote:
> On 10/08/2018 20:51, Bart wrote:
>
>> So I need to find out what the problem is with that app on that one
>> machine.
>
> That problem is illustrated here:
>
> char data[100];
>
> // typedef double T;
> typedef long long int T;
>
> int main(void)
> {
> T x;
> T *p,*q;
>
> p = &x;
> q = (T*)(&data[1]);

When you lie to the compiler, weird shit happens.

If you know your tools and your platform, you know why weird shit happens.

--
Ian.

Bart

unread,
Aug 11, 2018, 7:13:01 AM8/11/18
to
On 11/08/2018 03:07, Ian Collins wrote:
> On 11/08/18 13:14, Bart wrote:
>> On 10/08/2018 20:51, Bart wrote:
>>
>>> So I need to find out what the problem is with that app on that one
>>> machine.
>>
>> That problem is illustrated here:
>>
>>       char data[100];
>>
>> //  typedef double T;
>>       typedef long long int T;
>>
>>       int main(void)
>>       {
>>           T x;
>>           T *p,*q;
>>
>>           p = &x;
>>           q = (T*)(&data[1]);
>
> When you lie to the compiler, weird shit happens.

This is a hardware issue. Nothing to do with C or gcc other than, in
this case, it chooses to use fldd and fstd via d-registers to do a
64-bit move even though no floating operation is being performed.

And apparently, on this machine, floating point accesses to and from the
d-registers need to be aligned, as unaligned access is either not
possible or not enabled.

--
bart

Richard Damon

unread,
Aug 11, 2018, 7:38:09 AM8/11/18
to
Yes, many processors have requirements that certain multi-byte data
types need to be properly aligned to be accessed in the native manner.
Some handle unaligned access by falling back automatically to a slower
access (their rules for alignment are more advisory), some will trap
(giving the user the option to write a trap routine to 'fix' the issue),
and some just silently get the wrong value. Floating point units are
much more apt to have this sort of restriction.

I am a bit surprised that you say 'it works', as that probably means you
really have only done this with a very few different families of
processors to avoid having run into the issue.

The C language, because the issue is well known for a long time, has
implementation dependent rules about pointer alignment (implementation
dependent because it really does depend on the target machine). You tend
to not get warnings about it, because you have an explicit cast, which
tends to mean to the compiler that you know what you are doing so please
do this, and the conversion of a char buffer in this manner is not
uncommon, it just requires that you know you have made a properly
aligned pointer.

Ben Bacarisse

unread,
Aug 11, 2018, 9:27:37 AM8/11/18
to
Bart <b...@freeuk.com> writes:

> On 11/08/2018 03:07, Ian Collins wrote:
>> On 11/08/18 13:14, Bart wrote:
>>> On 10/08/2018 20:51, Bart wrote:
>>>
>>>> So I need to find out what the problem is with that app on that one
>>>> machine.
>>>
>>> That problem is illustrated here:
>>>
>>>       char data[100];
>>>
>>> //  typedef double T;
>>>       typedef long long int T;
>>>
>>>       int main(void)
>>>       {
>>>           T x;
>>>           T *p,*q;
>>>
>>>           p = &x;
>>>           q = (T*)(&data[1]);
>>
>> When you lie to the compiler, weird shit happens.
>
> This is a hardware issue. Nothing to do with C or gcc other than, in
> this case, it chooses to use fldd and fstd via d-registers to do a
> 64-bit move even though no floating operation is being performed.

Well it has something to do with C in that C does not say that this code
will work, thereby allowing an implementation to use a naive hardware
access. And it has something to do with gcc in that gcc appears to take
advantage of that permission.

memcpy(&x, data + 1, sizeof x);

should work and might even be optimised away in cases where the hardware
permits an unaligned load.

--
Ben.

Bart

unread,
Aug 11, 2018, 9:35:05 AM8/11/18
to
On 11/08/2018 12:37, Richard Damon wrote:
> On 8/10/18 9:14 PM, Bart wrote:

> I am a bit surprised that you say 'it works', as that probably means you
> really have only done this with a very few different families of
> processors to avoid having run into the issue.

I'm only interested in x86 and ARM families at the minute, as they are
the ones likely to be inside the machines I can get my hands on.

x86 family has allowed unaligned access for 40 years. And even the most
recent versions only require alignment for certain instructions.

So if someone is developing a product that will run on either of those
two families, they might not need to consider it. (But ARM I think has
rather more variations.)

It would anyway not be something you would routinely do - nearly all
accesses /will/ be aligned. In this case input was coming from a packed
file format where alignment wasn't observed.

But it's really not a big deal - if it ever causes a problem, then you
fix it, as I have.

> The C language, because the issue is well known for a long time, has
> implementation dependent rules about pointer alignment (implementation
> dependent because it really does depend on the target machine). You tend
> to not get warnings about it, because you have an explicit cast, which
> tends to mean to the compiler that you know what you are doing so please
> do this, an the conversion of a char buffer in this manner is not
> uncommon, it just requires that you know you have made a properly
> aligned pointer.

The only other problem I've had on ARM was the opposite: gcc thought I
was doing misaligned pointer accesses, and proceeded to do
byte-at-a-time transfers, even though the processor /could/ do
unaligned. And even though the actual transfer /was/ aligned!

In this case it just slowed down my program by a factor of three.

--
bart

Bart

unread,
Aug 11, 2018, 9:55:36 AM8/11/18
to
On 11/08/2018 14:27, Ben Bacarisse wrote:
> Bart <b...@freeuk.com> writes:

>> This is a hardware issue. Nothing to do with C or gcc other than, in
>> this case, it chooses to use fldd and fstd via d-registers to do a
>> 64-bit move even though no floating operation is being performed.
>
> Well it has something to do with C in that C does not say that this code
> will work, thereby allowing an implementation to use a naive hardware
> access. And it has something to do with gcc in that gcc appears to take
> advantage of that permission.
>
> memcpy(&x, data + 1, sizeof x);
>
> should work and might even be optimised away in cases where the hardware
> permits an unaligned load.

I fixed it by changing two lines of original source (to use 64-bit int
not 64-bit float). That was then turned into C without changing that
translator.

(This is unlike the INT_MIN problem where I had to change the translator
to make use of the workaround within C; the original still used
'int.minvalue').

And it was compiled from C without changing any options either:

gcc pc_lin32.c -opc -lm -ldl

Two of these are necessary to make it work. Otherwise there is
nothing of the raft of extra options that people sometimes advocate.

This part should just work.

This is why I said it was nothing to do with C. I would have had to make
the same mod if targeting native ARM code (if I'd been silly enough to
use a floating point load and store in the first place).


--
bart

Ben Bacarisse

unread,
Aug 11, 2018, 10:00:02 AM8/11/18
to
Bart <b...@freeuk.com> writes:

> On 11/08/2018 14:27, Ben Bacarisse wrote:
>> Bart <b...@freeuk.com> writes:
>
>>> This is a hardware issue. Nothing to do with C or gcc other than, in
>>> this case, it chooses to use fldd and fstd via d-registers to do a
>>> 64-bit move even though no floating operation is being performed.
>>
>> Well it has something to do with C in that C does not say that this code
>> will work, thereby allowing an implementation to use a naive hardware
>> access. And it has something to do with gcc in that gcc appears to take
>> advantage of that permission.
>>
>> memcpy(&x, data + 1, sizeof x);
>>
>> should work and might even be optimised away in cases where the hardware
>> permits an unaligned load.
>
> I fixed it by changing two lines of original source (to use 64-bit int
> not 64-bit float). That was then turned into C without changing that
> translator.

That's the sort of fix I used to be paid to fix. One day, you'll use a
system where the fixed code does not work either.

> This part should just work.
>
> This why I said it was nothing to do with C.

I still don't understand why you say that. C has something to say about
the code regardless of the 64-bit type used.

--
Ben.

james...@alumni.caltech.edu

unread,
Aug 11, 2018, 10:25:47 AM8/11/18
to
On Saturday, August 11, 2018 at 9:55:36 AM UTC-4, Bart wrote:
...
> And it was compiled from C without changing any options either:
>
> gcc pc_lin32.c -opc -lm -ldl

No, that's not compiled from C. gcc can compile many different languages, including Pascal and Fortran. With the options you've chosen, it is NOT compiling C, it's compiling GnuC.

> Two are of these are necessary to make it work. Otherwise there is
> nothing of the raft of extra options that people sometimes advocate.
>
> This part should just work.
>
> This why I said it was nothing to do with C.

Well, that's true enough. You're not using a C compiler, so the rules of C are irrelevant. If you had been compiling with C, those rules would have been very relevant.

Bart

unread,
Aug 11, 2018, 1:03:33 PM8/11/18
to
On 11/08/2018 15:25, james...@alumni.caltech.edu wrote:
> On Saturday, August 11, 2018 at 9:55:36 AM UTC-4, Bart wrote:
> ...
>> And it was compiled from C without changing any options either:
>>
>> gcc pc_lin32.c -opc -lm -ldl
>
> No, that's not compiled from C. gcc can compile many different languages, including pascal and fortran. With the options you've chosen, it is NOT compiling C, it's compiling GnuC.

I know you guys like to show off your knowledge of C and its tools and
what counts as 'C' and what doesn't. But here you are in danger of
talking rubbish.

Let's take a marginally different version of that program (which leaves
out some OS-specific calls), and compile it the same way on that ARM
machine:

gcc pc_nos32.c -opc -lm -ldl

It now compiles and runs as before (well, without being able to use
dlopen/dlsym).

This EXACT SAME FILE can also be compiled under Windows (-m32 is needed
as this gcc defaults to -m64):

gcc -m32 pc_nos32.c -opc.exe

And it also runs the same way. But apparently that's still not C, so
let's try one more option:

gcc -std=c11 -m32 pc_nos32.c -opc.exe

And it still compiles (this time with no warnings at all) and still
runs. As does this:

gcc -std=c99 -m32 pc_nos32.c -opc.exe

And this, although with a handful of warnings:

gcc -std=c90 -m32 pc_nos32.c -opc.exe

Actually, on Windows, if I need to run a C version, I might use this:

gcc -std=c11 -O3 pc_win32.c -opc.exe

There are two warnings this time (in 21,000 lines of code), but it now
runs at maximum speed and with OS-specific functions available.

You still think I'm not compiling C? Well if I'm not, then, does it
matter? Whatever the language is, then gcc can compile it, and so can
MSVC. It's not Fortran or Pascal anyway.

> Well, that's true enough. You're not using a C compiler


So it's compiling a language called gnuC that appears to be a subset or
superset of C99 and C11 that also miraculously compiles with MSVC.

Don't forget that most people have to use actual compilers to compile
actual subsets, supersets or dialects of C.

I would guess then that somewhere between 90 and 99% of C programmers
aren't really C programmers at all, if your yardstick is whether they
are using 100% conforming compilers and writing 100% pure C code
according to the C standard.

Which is a nonsensical conclusion.

--
bart

Bart

unread,
Aug 11, 2018, 1:15:15 PM8/11/18
to
On 11/08/2018 18:03, Bart wrote:

>
> And this, although with a handful of warnings:
>
>    gcc -std=c90 -m32 pc_nos32.c -opc.exe
>
> Actually, on Windows, if I need to run a C version, I might use this:
>
>   gcc -std=c11 -O3 pc_win32.c -opc.exe

That last one was pc_win64.c.

The pc_nos32.c file is here:
https://github.com/sal55/mlang/blob/master/dist/pc_nos32.c as a piece of
generated C code, not as a program to run.

--
bart

james...@alumni.caltech.edu

unread,
Aug 11, 2018, 3:12:26 PM8/11/18
to
No, it's GnuC - a language sufficiently similar to C for GnuC code to
often compile as C code, with the same behavior, and sufficiently
different from C to provide you with an endless stream of examples of
code not behaving the way the C standard requires it to behave. Which
is, in fact, what you've done over the past several years. You
constantly show code that is supposed to work one way when a conforming
implementation of C is used to compile it, and then you compile it using
gcc with no options, and then complain about the fact that the
behavior is different. It's different because gcc with no options
selected is NOT a conforming implementation of C, it is an
implementation of GnuC, and every single discrepancy you've noted is a
fresh example of that fact.
Sooner or later you need to get this into your head, particularly if
you're going to continue insisting on using gcc without selecting the
options that need to be selected in order for it to conform to the C
standard.

> > Well, that's true enough. You're not using a C compiler
>
>
> So it's compiling a language called gnuC that appears to be a subset or
> superset of C99 and C11 that also miraculously compiles with MSVC.

It's neither a superset nor a subset, but simply a very similar language
with lots and lots of differences that can trip you up if you don't pay
enough attention to the fact that it is, in fact, a different language.

> Don't forget that most people have to use actual compilers to compile
> actual subsets, supersets or dialects of C.

Most developers have a lot less trouble than you do figuring out how to
configure their systems to compile the particular version of C that they
want to compile.

> I would guess then that somewhere between 90 and 99% of C programmers
> aren't really C programmers at all, if your yardstick is whether they
> are using 100% conforming compilers and writing 100% pure C code
> according to the C standard.

A fair number of people go out of their way to deliberately put their
compilers into modes that fully conform with a particular version of the
C standard. I have no solid figures on how many, but I'd be surprised if
it was as small as 10%, and I simply don't believe your claim that it might
be as small as 1%.

> Which is a nonsensical conclusion.

GnuC exists for a reason - some people prefer it to C, so that doesn't
strike me as particularly nonsensical. What does strike me as
nonsensical is using GnuC unintentionally, simply because you can't be
bothered to choose the appropriate command line arguments to make gcc
compile something else. I still strongly recommend that if you're going
to continue doing this, you should get rid of your C text books and your
copy of the C standard, and start reading up on how GnuC works, because
that's the language you're actually using.

Bart

unread,
Aug 11, 2018, 4:13:24 PM8/11/18
to
On 11/08/2018 20:12, james...@alumni.caltech.edu wrote:
> On Saturday, August 11, 2018 at 1:03:33 PM UTC-4, Bart wrote:

>> Don't forget that most people have to use actual compilers to compile
>> actual subsets, supersets or dialects of C.
>
> Most developers have a lot less trouble than you do figuring out how to
> configure their systems to compile the particular version of C that they
> want to compile.

OK, so what are the options do I use to get gcc to compile actual C?

Since apparently using -std=c11 is not enough.


> GnuC exists for a reason - some people prefer it to C, so that doesn't
> strike me as particularly nonsensical. What does strike me as
> nonsensical is using GnuC unintentionally, simply because you can't be
> bothered to choose the appropriate command line arguments to make gcc
> compile something else. I still strongly recommend that if you're going
> to continue doing this, you should get rid of your C text books and your
> copy of the C standard, and start reading up on how GnuC works, because
> that's the language you're actually using.


If I type in C code online via rextester.com, I will choose one of the C
compilers they offer. That's a *C* compiler, not C++, Ada, Go, Fortran or
any of the other alternatives.

They don't make a big deal of the fact that these might be slightly
different dialects - of C. The default standard for C (gcc) is set (via
options) to gnu99, when I bother display them, which would be unusual.
That's the same for C (clang). For C (vc), there is no option to specify
the standard. Presumably "C (vc)" is enough.

Funnily there is no option to just use "C" without a specific
implementation.

So some people take what is strictly C and what isn't less seriously
than you do. rextester.com could have chosen to set up default
options to be the same as what you will hopefully tell me in
response to my query, but for reasons of their own, they didn't. Maybe
most people are happy to write gnu99 code, or they don't know or don't
care about the subtle distinctions between the C variations.

One more question: if someone's job is writing code using gnu C, are
they allowed to call themselves a C programmer or not?

--
bart

Bart

unread,
Aug 11, 2018, 4:31:57 PM8/11/18
to
On 11/08/2018 14:59, Ben Bacarisse wrote:
> Bart <b...@freeuk.com> writes:

>> I fixed it by changing two lines of original source (to use 64-bit int
>> not 64-bit float). That was then turned into C without changing that
>> translator.
>
> That's the sort of fix I used to be paid to fix. One day, you'll use a
> system where the fixed code does not work either.

That's just ordinary porting. You try an application on a new machine,
and either it works or it doesn't. If it doesn't, you fix it.

Sometimes, the fix might be incompatible with making it work with an
existing systems. But that's just programming which is made up of such
routine problems.

(This application happened to be an interpreter - special dispensation
applies. You port the one application just once to each platform, so one
lot of effort, and a thousand applications that it runs will
automatically work on the machine [in theory]. That's better than
porting 1000 applications.)

>> This part should just work.
>>
>> This why I said it was nothing to do with C.
>
> I still don't understand why you say that. C has something to say about
> the code regardless of the 64-bit type used.

Because the issue exists regardless of which language might be used.

I know people are trying to pin the fault on me and my alleged
misunderstanding of how C compilers are used. But in this case they're
wrong.

--
bart

anti...@math.uni.wroc.pl

unread,
Aug 11, 2018, 4:45:47 PM8/11/18
to
Bart <b...@freeuk.com> wrote:
>
> I know people are trying to pin the fault on me and my alleged
> misunderstanding of how C compilers are used. But in this case they're
> wrong.
>

You told the compiler that your pointer is a pointer to double.
On this ARM doubles have to be properly aligned, but your
pointer was unaligned. So you lied to the compiler. On
x86 your lie was harmless (the pointer was not distinguishable
from a pointer to double). On ARM you ran into trouble.
Correct code would not lie to the compiler, so there would be
no problem at all. If a type lie is essential for performance,
people at least put in some safeguards to make sure it causes
no trouble. And yes, things that you despise, like
conditional compilation and configure, are helpful there.

--
Waldek Hebisch

Bart

unread,
Aug 11, 2018, 5:01:50 PM8/11/18
to
By that token then every cast is a lie to the compiler.

--
bart


Keith Thompson

unread,
Aug 11, 2018, 5:10:50 PM8/11/18
to
Bart <b...@freeuk.com> writes:
> On 11/08/2018 20:12, james...@alumni.caltech.edu wrote:
>> On Saturday, August 11, 2018 at 1:03:33 PM UTC-4, Bart wrote:
>>> Don't forget that most people have to use actual compilers to compile
>>> actual subsets, supersets or dialects of C.
>>
>> Most developers have a lot less trouble than you do figuring out how to
>> configure their systems to compile the particular version of C that they
>> want to compile.
>
> OK, so what are the options do I use to get gcc to compile actual C?
>
> Since apparently using -std=c11 is not enough.

"gcc -std=c11" should correctly compile correct C11 code.

"gcc -std=c11 -pedantic" should additionally produce all diagnostics
required by the C11 standard. (Some diagnostics are non-fatal
warnings, and there's no clear distinction between required
diagnostics and non-required warnings. Using "-pedantic-errors"
should partially alleviate that issue.)

I don't know what you mean by "actual C", so this likely doesn't
answer your question. It certainly isn't new information, and
you still have questions, so what you meant to ask is probably
different from what I understand your question to be. But in the
future, if the compiler doesn't behave in the way you expect it to
for some particular C code, telling us that you compiled it with
"gcc -std=c11 -pedantic" (or "-pedantic-errors") may make it easier
for us to help you -- if you want help.

Bart

unread,
Aug 11, 2018, 6:11:53 PM8/11/18
to
On 11/08/2018 22:10, Keith Thompson wrote:
> Bart <b...@freeuk.com> writes:
>> On 11/08/2018 20:12, james...@alumni.caltech.edu wrote:
>>> On Saturday, August 11, 2018 at 1:03:33 PM UTC-4, Bart wrote:
>>>> Don't forget that most people have to use actual compilers to compile
>>>> actual subsets, supersets or dialects of C.
>>>
>>> Most developers have a lot less trouble than you do figuring out how to
>>> configure their systems to compile the particular version of C that they
>>> want to compile.
>>
>> OK, so what are the options do I use to get gcc to compile actual C?
>>
>> Since apparently using -std=c11 is not enough.
>
> "gcc -std=c11" should correctly compile correct C11 code.
>
> "gcc -std=c11 -pedantic" should additionally produce all diagnostics
> required by the C11 standard. (Some diagnostics are non-fatal
> warnings, and there's no clear distinction between required
> diagnostics and non-required warnings. Using "-pedantic-errors"
> should partially alleviate that issue.)
>
> I don't know what you mean by "actual C",

I mean whatever James means by "C", as he clearly thinks the language I
am using and generating is not C.

But I'm puzzled because I did use exactly those options, and my code
compiled OK in that there were no errors, no warnings, and it ran
perfectly well, but he was certain that it was not C anyway.

What does he want, blood?

Of course if I use -pedantic, then I get many thousands of lines which
are largely repetitions of these:

pc_nos32.c:1252:5: warning: ISO C forbids conversion of function
pointer

pc_nos32.c:2379:5: warning: pointer targets in initialization differ
in signedness [-Wpointer-sign]

pc_nos32.c:4819:34: warning: pointer targets in passing argument 2 of
'strcpy' differ in signedness [-Wpointer-sign]

So pedantic is right.

If getting such warnings means I am not writing code in the C language,
according to JK, then so be it.

But I would suggest that HE is the one being pedantic and that he is
wrong. Or he's just desperate to make out that I don't know and don't
use C, no matter WHAT my actual code is or how it is compiled.

Or maybe he genuinely has such a narrow, abstract view of what "C" is.

(By the same measure, very few people have used any language properly
because they necessarily have to use actual implementations, not
abstract ones. That's why I said 90-99% may not be C programmers.)

> so this likely doesn't
> answer your question. It certainly isn't new information, and
> you still have questions, so what you meant to ask is probably
> different from what I understand your question to be. But in the
> future, if the compiler doesn't behave in the way you expect it to
> for some particular C code, telling us that you compiled it with
> "gcc -std=c11 -pedantic" (or "-pedantic-errors") may make it easier
> for us to help you -- if you want help.

OK, so if my complaint is that a compiler doesn't take a fault seriously
(ie as a fatal error), it's because I haven't used -pedantic-errors.

Which makes pretty much everything an error, including things one should
reasonably be able to want or need to do.

That sounds a bit of a cop-out to me, but OK.

--
bart

Chris M. Thomasson

unread,
Aug 11, 2018, 6:20:48 PM8/11/18
to
What about the following casts:
______________________
#include <stdio.h>


struct child
{
    int a;
};

struct parent
{
    struct child child;
    int b;
};


int main(int argc, char *argv[])
{
    struct parent parent = { { 1 }, 2 };

    struct child* child = (struct child*)&parent.child;

    printf(
        "(%p):child->a = %d\n"
        "(%p):parent.child.a = %d\n"
        "(%p):parent.b = %d\n\n",
        (void*)child, child->a,
        (void*)&parent, parent.child.a,
        (void*)&parent, parent.b
    );

    child->a = 41;
    parent.child.a += 1;
    parent.b = 43;

    printf(
        "(%p):child->a = %d\n"
        "(%p):parent.child.a = %d\n"
        "(%p):parent.b = %d\n\n",
        (void*)child, child->a,
        (void*)&parent, parent.child.a,
        (void*)&parent, parent.b
    );

    return 0;
}
______________________

?

Ian Collins

unread,
Aug 11, 2018, 6:22:54 PM8/11/18
to
On 11/08/18 23:12, Bart wrote:
> On 11/08/2018 03:07, Ian Collins wrote:
>> On 11/08/18 13:14, Bart wrote:
>>> On 10/08/2018 20:51, Bart wrote:
>>>
>>>> So I need to find out what the problem is with that app on that one
>>>> machine.
>>>
>>> That problem is illustrated here:
>>>
>>>       char data[100];
>>>
>>> //  typedef double T;
>>>       typedef long long int T;
>>>
>>>       int main(void)
>>>       {
>>>           T x;
>>>           T *p,*q;
>>>
>>>           p = &x;
>>>           q = (T*)(&data[1]);
>>
>> When you lie to the compiler, weird shit happens.
>
> This is a hardware issue. Nothing to do with C or gcc other than, in
> this case, it chooses to use fldd and fstd via d-registers to do a
> 64-bit move even though no floating operation is being performed.

Still, you told the compiler the conversion was safe by using a cast. It
wasn't, so you (albeit inadvertently) lied to the compiler. I see
this kind of bug all too often when porting between systems. As Ben pointed
out, memcpy() would have been a safe option. It may not be the most
efficient, but it would have been safe. If you want fast and safe,
you'll have to go beyond what standard C has to offer.

--
Ian.

Ian Collins

unread,
Aug 11, 2018, 6:25:46 PM8/11/18
to
On 12/08/18 10:11, Bart wrote:
> On 11/08/2018 22:10, Keith Thompson wrote:
>> Bart <b...@freeuk.com> writes:
>>> On 11/08/2018 20:12, james...@alumni.caltech.edu wrote:
>>>> On Saturday, August 11, 2018 at 1:03:33 PM UTC-4, Bart wrote:
>>>>> Don't forget that most people have to use actual compilers to compile
>>>>> actual subsets, supersets or dialects of C.
>>>>
>>>> Most developers have a lot less trouble than you do figuring out how to
>>>> configure their systems to compile the particular version of C that they
>>>> want to compile.
>>>
>>> OK, so what are the options do I use to get gcc to compile actual C?
>>>
>>> Since apparently using -std=c11 is not enough.
>>
>> "gcc -std=c11" should correctly compile correct C11 code.
>>
>> "gcc -std=c11 -pedantic" should additionally produce all diagnostics
>> required by the C11 standard. (Some diagnostics are non-fatal
>> warnings, and there's no clear distinction between required
>> diagnostics and non-required warnings. Using "-pedantic-errors"
>> should partially alleviate that issue.)
>>
>> I don't know what you mean by "actual C",
>
> I mean whatever James means by "C", as he clearly thinks the language I
> am using and generating is not C.
>
> But I'm puzzled because I did use exactly those options, and my code
> compiled OK in that there were no errors, no warnings, and it ran
> perfectly well, but he was certain that it was not C anyway.

Because you lied to the compiler. You told it you knew better and the
conversion was safe.

--
Ian.

Bart

unread,
Aug 11, 2018, 6:37:46 PM8/11/18
to
No. The pc_nos32.c file in my post had been fixed.

It compiled fine with -std=c90, -std=c99 and -std=c11.

But it still wasn't C. He was sure of it even without looking at the code!

I suspect he was sure of it because it might actually be impossible to
write a large body of practical code in absolutely correct, pure
abstract C. So he might have been right insofar as NO real program is in
'proper' C, including mine. No need to actually see it.

--
bart

Ben Bacarisse

unread,
Aug 11, 2018, 7:40:17 PM8/11/18
to
Bart <b...@freeuk.com> writes:

> On 11/08/2018 14:59, Ben Bacarisse wrote:
>> Bart <b...@freeuk.com> writes:

>>> This why I said it was nothing to do with C.
>>
>> I still don't understand why you say that. C has something to say about
>> the code regardless of the 64-bit type used.
>
> Because the issue exists regardless of which language might be used.

Other languages might have defined semantics for such accesses. If you
targeted such a language, instead of C, your would never have even seen
the issue. Also, C has a way to do what you want in a defined way so it
seems plain that the issue is, in part, a C one.

> I know people are trying to pin the fault on me and my alleged
> misunderstanding of how C compilers are used. But in this case they're
> wrong.

I'm not trying to pin anything on you. I'm simply pointing out that the
issue you had is, in part, a C one.

--
Ben.

Bart

unread,
Aug 11, 2018, 7:52:07 PM8/11/18
to
On 11/08/2018 23:20, Chris M. Thomasson wrote:
> On 8/11/2018 2:01 PM, Bart wrote:

>> By that token then every cast is a lie to the compiler.
>>
>
> What about the following casts:

>     struct child* child = (struct child*)&parent.child;

>         (void*)child, child->a,
>         (void*)&parent, parent.child.a,
>         (void*)&parent, parent.b

>         (void*)child, child->a,
>         (void*)&parent, parent.child.a,
>         (void*)&parent, parent.b

I'm not sure what you're getting at there.

I sort of meant casts which are needed to change behaviour or to get a
program to compile.

These ones don't really do much. The first casts type T* to type T*,
which is not that controversial. Maybe the lie is because a cast is
normally used to specify a different type, but here it's the same one.

The casts from U* to void* aren't that much more exciting, and are
usually an implicit conversion when needed, but you are still sort of
saying that you want your U* type to be treated as a void*, so it is a
kind of lie.

BTW it has been pointed out - by experts - that the language I'm talking
about is not and never has been C. Exactly what that language is if it's
not C, I don't know and apparently neither do they. So make of that what
you will.


--
bart

anti...@math.uni.wroc.pl

unread,
Aug 11, 2018, 9:09:20 PM8/11/18
to
Well, C has rather restrictive rules about the validity of
pointer casts. But in this case the cast alone is not the
problem; the main thing is that you dereferenced
the pointer, and that is the main lie.

If you want examples of legal use of casts, there is one
important case. Namely, consider a graph, that is, a set of
nodes connected by links, with nodes storing some extra
information. In C the natural way to represent graphs is to
store node data in a structure and represent links as
pointers. If nodes are inhomogeneous, that is, the data
stored at various nodes have different structure, then
there is a problem, because pointers to nodes will
have different types. I am not sure if it is possible
to write a program working on such a graph without using
casts. At least the compiler would reject a naive version
of such a program, reporting type errors. But if you
create a union of node types, use pointers to the union, and
add casts between pointers to the union and pointers to the
member structs, then your program will work.

Another classic is storing pointers in integer variables.

If you work with a specific implementation, then there
are more examples. On ARM microcontrollers, when
running on raw hardware (without an OS), you access devices
by casting addresses (integers) to pointers. But
that is code which is "undefined C": it works because
GNU C dereferences such pointers with "memory" reads/writes
and the hardware reacts to those accesses. The point is
that there are specific rules, and if you break the
rules you may easily pass incorrect info (a lie) to
the compiler.

Even on x86 you may have alignment problems: several SSE
instructions work only with aligned data, and GNU C
may generate them. Incorrect pointer casts that ignore
alignment may easily give you wrong code.

--
Waldek Hebisch

james...@alumni.caltech.edu

unread,
Aug 11, 2018, 11:15:23 PM8/11/18
to
On Saturday, August 11, 2018 at 6:11:53 PM UTC-4, Bart wrote:
> On 11/08/2018 22:10, Keith Thompson wrote:
> > Bart <b...@freeuk.com> writes:
> >> On 11/08/2018 20:12, james...@alumni.caltech.edu wrote:
> >>> On Saturday, August 11, 2018 at 1:03:33 PM UTC-4, Bart wrote:
> >>>> Don't forget that most people have to use actual compilers to compile
> >>>> actual subsets, supersets or dialects of C.
> >>>
> >>> Most developers have a lot less trouble than you do figuring out how to
> >>> configure their systems to compile the particular version of C that they
> >>> want to compile.
> >>
> >> OK, so what are the options do I use to get gcc to compile actual C?
> >>
> >> Since apparently using -std=c11 is not enough.
> >
> > "gcc -std=c11" should correctly compile correct C11 code.
> >
> > "gcc -std=c11 -pedantic" should additionally produce all diagnostics
> > required by the C11 standard. (Some diagnostics are non-fatal
> > warnings, and there's no clear distinction between required
> > diagnostics and non-required warnings. Using "-pedantic-errors"
> > should partially alleviate that issue.)
> >
> > I don't know what you mean by "actual C",
>
> I mean whatever James means by "C", as he clearly thinks the language I
> am using and generating is not C.

No, I said you weren't compiling it as C code. And that comment was only
in reference to your compilations that failed to set the options needed
to put gcc into a standard-conforming mode. You did show one compilation
using -std=c11, and that compilation was in fact compilation as C code.
Strictly speaking, it needs "-pedantic" to be fully conforming, but
-std=c11 is a good start towards standards conformance.
I made no comments about that particular compilation; all my comments in
response to the message were about the other compilations you did of
that code.

> But I'm puzzled because I did use exactly those options, and my code
> compiled OK in that there were no errors, no warnings, and it ran
> perfectly well, but he was certain that it was not C anyway.

No, I made no comments about that compilation. None whatsoever, all of
my comments about not compiling it as C code were about the other
compilations you did of that code.

> What does he want, blood?

What I want you to do (I know it's far too much to ask, but you did ask
what I want) is for you to stop complaining about the fact that when you
insist on compiling your code as GnuC code, the compiler behaves in
accordance with the rules of GnuC, which are often significantly
different from the rules of C. This would, of course, require you to
learn what those rules are, which is why I don't expect you to ever do
it.

> Of course if I use -pedantic, then I get many thousands of lines which
> are largely repetitions of these:
>
> pc_nos32.c:1252:5: warning: ISO C forbids conversion of function
> pointer
>
> pc_nos32.c:2379:5: warning: pointer targets in initialization differ
> in signedness [-Wpointer-sign]
>
> pc_nos32.c:4819:34: warning: pointer targets in passing argument 2 of
> 'strcpy' differ in signedness [-Wpointer-sign]
>
> So pedantic is right.
>
> If getting such warnings means I am not writing code in the C language,
> according to JK, then so be it.

It's not about writing code in the C language. It's about compiling it
as C code. When you use gcc without -std=c90, or c99, or c11, you're
not compiling it as C code, regardless of what you were thinking when
you wrote it.

> But I would suggest that HE is the one being pedantic and that he is
> wrong. Or he's just desperate to make out that I don't know and don't
> use C, no matter WHAT my actual code is or how it is compiled.

Look - it's you who makes it a big deal. You repeatedly show us some
code that is not compiling the way the C standard says it should
compile, and complain about that fact, and invariably it turns out that
the reason it's not compiling that way is that you're compiling it as
GnuC. If you would stop acting shocked by the fact that gcc behaves as
a GnuC compiler, it wouldn't matter to me much what language you thought
you were compiling your code as.

> Or maybe he genuinely has such a narrow, abstract view of what "C" is.

It's your complaints I'm talking about - you expect gcc to conform when
invoked with no options, and it doesn't. You can blame gcc for that, if
you want - but they want it compile GnuC by default - they think GnuC is
a better language than C, so you're not likely to convince them to
change.

> so this likely doesn't
> > answer your question. It certainly isn't new information, and
> > you still have questions, so what you meant to ask is probably
> > different from what I understand your question to be. But in the
> > future, if the compiler doesn't behave in the way you expect it to
> > for some particular C code, telling us that you compiled it with
> > "gcc -std=c11 -pedantic" (or "-pedantic-errors") may make it easier
> > for us to help you -- if you want help.
>
> OK, so if my complaint is that a compiler doesn't take a fault seriously
> (ie as a fatal error), it's because I haven't used -pedantic-errors.

It's also because you haven't selected a -std= option. GnuC allows a lot
of things that standard C does not.

> Which makes pretty much everything an error, including things one should
> reasonably be able to want or need to do.

Actually, it doesn't - you're unreasonable about what you want to do.
Properly written C code does NOT trigger thousands of warnings when
using -pedantic. I could show you how to re-write your code to avoid
those warnings, but I doubt that you have any sincere interest in
avoiding them - you just want to complain about them.

Scott

unread,
Aug 12, 2018, 2:01:16 AM8/12/18
to
On Sun, 12 Aug 2018 01:09:12 +0000 (UTC), anti...@math.uni.wroc.pl
wrote:

>Another classic is storing pointers in integer variables.

The only way that's classic is as a classic example of a bad idea. C
has never promised that pointers and integers are compatible. "Don't
store pointers in integers" has been good advice for decades.

Chris M. Thomasson

unread,
Aug 12, 2018, 2:37:27 AM8/12/18
to
Fwiw, "sometimes", one can "benefit" from some undefined behavior in C.
Think of something along the lines of implementation specific behavior
wrt storing pointers in unsigned integers; uintptr_t if you will. Think
of stealing a bit or two from a pointer... Say, we simply know we are
running on archs that have unused bits in a void* pointer. Well,
sometimes we can really put those unused bits to work. Been there, done
that. We can use the stolen bits for all sorts of fun activities...

;^)

Bart

unread,
Aug 12, 2018, 6:28:24 AM8/12/18
to
On 12/08/2018 02:09, anti...@math.uni.wroc.pl wrote:
> Bart <b...@freeuk.com> wrote:

>> By that token then every cast is a lie to the compiler.

> Even on x86 you may have alignment problems: several SSE
> instructions work only with aligned data and GNU C
> may generate them. And ignoring alignment incorrect
> pointer casts may easily give you wrong code.
>

I remember a program where a callback function, when compiled by gcc
with -O3 (which switched to using certain XMM instructions that need
special alignment), caused a crash because gcc assumed the stack was
16-byte aligned and it wasn't.

Whether compiled with -std=c11 or not, that was a bug. Since it was a
callback invoked from code that gcc couldn't know was even written
in C, the compiler should have been more careful.

Now, I didn't make any lies in my C code and it was clearly marked as a
callback.

It requires, in fact, this attribute to make it work properly:

#define gcc_callback __attribute__ ((force_align_arg_pointer))

Maybe you can argue that the lie is in not writing this line; that is
likely what JK would argue.


--
bart

Bart

unread,
Aug 12, 2018, 6:57:02 AM8/12/18
to
On 12/08/2018 04:15, james...@alumni.caltech.edu wrote:
> On Saturday, August 11, 2018 at 6:11:53 PM UTC-4, Bart wrote:

>> I mean whatever James means by "C", as he clearly thinks the language I
>> am using and generating is not C.
>
> No, I said you weren't compiling it as C code. And that comment was only
> in reference to your compilations that failed to set the options needed
> to put gcc into a standard-conforming mode. You did show one compilation
> using -std=c11, and that compilation was in fact compilation as C code.

No. I showed four successive compilations using -std, followed by the
comment 'You still think I'm not compiling C?'.

You said 'No, it's GnuC'.

> What I want you to do (I know it's far too much to ask, but you did ask
> what I want) is for you to stop complaining about the fact that when you
> insist on compiling your code as GnuC code, the compiler behaves in
> accordance with the rules of GnuC, which are often significantly
> different from the rules of C.

If I take this program (c.c):

a,b,c; // implicit int

fred(); // implicit int and () params

main()
{
puts("hi there"); // implicit func decl
fred(10);
fred("twenty","thirty"); // contradictory args to fred
}

and compile as:

gcc -c c.c

then I do get a bunch of warnings, not errors. But if I follow your
advice and compile it like this:

gcc -c -std=c11 -pedantic c.c

Then I get exactly the same warnings. It MAKES NO DIFFERENCE!

So it does rather sound like you don't know what you're talking about.

I expect you will instead move on to the /version/ of gcc in an attempt
to make it my fault still.

Of course I can make that compilation fail by using -pedantic-errors,
but I can do that without using -std. Actually, without -std, I get
multiple errors; using -std=c11 I get only one, and not the first one
(it complains about puts not being declared).

> It's not about writing code in the C language. It's about compiling it
> as C code. When you use gcc without -std=c90, or c99, or c11, you're
> not compiling it as C code, regardless of what you were thinking when
> you wrote it.

And my little test shows it makes no difference. Do you really think
gnuC is so much more lax about things that matter? Seems silly to add so
many features to make the language better, but then say, 'Let's be more
forgiving than standard C about implicit ints'!

> Actually, it doesn't - you're unreasonable about what you want to do.
> Properly written C code does NOT trigger thousands of warnings when
> using -pedantic.

On my intended platforms, void*, T* and T(*)() pointers can be cast to
each other and they all share the same 32- or 64-bit representation. So
here the compiler is being unreasonable. Avoiding casts to/from function
pointers seems especially fiddly.

How would YOU rewrite a table like this of mixed function pointers:

void *table[] = {malloc, realloc, free, printf, puts};

How would you use such a pointer afterwards for calling it?

Say, calling the pointer at table[2] knowing that it takes a void* argument.



--
bart

fir

unread,
Aug 12, 2018, 7:14:53 AM8/12/18
to
On Sunday, August 12, 2018 at 12:57:02 PM UTC+2, Bart wrote:
>
> If I take this program (c.c):
>
> a,b,c; # implicit int
>
> fred(); # implicit int and () params
>
> main()
> {
> puts("hi there"); # implicit func decl
> fred(10);
> fred("twenty","thirty"); # contradictory args to fred
> }
>

it seems to me that in C this is not implicit int parameters but an implicit
"...", which means more like 'any number'; the implicit int is probably the return value

Ben Bacarisse

unread,
Aug 12, 2018, 7:54:57 AM8/12/18
to
Bart <b...@freeuk.com> writes:

> How would YOU rewrite a table like this of mixed function pointers:
>
> void *table[] = {malloc, realloc, free, printf, puts};

typedef void func(void);

func *table[] = {
(func *)malloc, (func *)realloc, (func *)free, (func *)printf, (func *)puts
};

> How would you use such a pointer afterwards for calling it?
>
> Say, calling the pointer at table[2] knowing that it takes a void*
> argument.

((void (*)(void *))table[2])(p);

Ghastly, but C is not geared to working with generic function pointers.
C's rules are designed so as to accommodate a wide range of
implementations, some of which will use calling conventions that depend
on the exact types.

There is no way to write a single call expression that is correct for
table[i] (where i is in range of course) because the functions all have
different types. Your get a little flexibility if your generic pointer
type has no prototype (i.e. void (*)()) but not much. The presence of
printf (with a ... in the prototype) means you must call at least
table[3] with the correct function type.

If you need to be both truly portable, and you need to be able to call
table[i] without writing the correct type for the ith function at the
call site, you will probably have to use "shim" functions -- little
functions (sometimes automatically generated) of some suitable universal
type that call the functions in question.

Because that's complicated, I've seen code that just crosses its
fingers and calls functions through call expressions with the wrong
type, hoping for the best. It used to work because of the way
pre-standard C functions were defined, but it's been getting riskier and
riskier as calling conventions have got more and more sophisticated.

--
Ben.

Ike Naar

unread,
Aug 12, 2018, 8:26:09 AM8/12/18
to
On 2018-08-10, Bart <b...@freeuk.com> wrote:
> On 10/08/2018 09:36, David Brown wrote:
>> On 09/08/18 17:32, Bart wrote:
>>> If I run this code, without or without the LL, then I get the same
>>> result with gcc on Windows or Linux as a 32- or 64- bit program:
>
>> You have hit a somewhat interesting corner case here. The details
>> depend on the particular standard version being used, the variant being
>> used, and the size of "long" on the target. It does not depend on the
>> version of gcc, as far as I have tested - and (shock! horror! Who would
>> have guess?) gcc is generating the correct code as well as giving you a
>> (somewhat obscure) warning that your code might not do what you think it
>> does.
>
> Here's a more interesting test:
>
> #include <stdio.h>
>
> int main(void){
>
> long long int value = 0;
>
> printf("value %lld:\n",value);
>
> if (-2147483648 <= value && value <= 2147483647) {
> puts("in int range");
> }
> if (0 <= value && value <= 4294967296) {
> puts("in word range");

Is 4294967296 (33 bits wide) in word (32 bits wide) range?

Bart

unread,
Aug 12, 2018, 8:37:48 AM8/12/18
to
No, that's a mistake inside my translator, where I tried to be cool and
used 4294... whatever instead of 0xFFFF'FFFF to define 'word.maxvalue'.

(This C code is a much tidied version of that output, but I didn't touch
the numbers.)

Thanks...

--
bart

Ike Naar

unread,
Aug 12, 2018, 9:41:41 AM8/12/18
to
On 2018-08-11, Bart <b...@freeuk.com> wrote:
> On 11/08/2018 18:03, Bart wrote:
>
>>
>> And this, although with a handful of warnings:
>>
>> ?? gcc -std=c90 -m32 pc_nos32.c -opc.exe
>>
>> Actually, on Windows, if I need to run a C version, I might use this:
>>
>> ? gcc -std=c11 -O3 pc_win32.c -opc.exe
>
> That last one was pc_win64.c.
>
> The pc_nos32.c file is here:
> https://github.com/sal55/mlang/blob/master/dist/pc_nos32.c as a piece of
> generated C code, not as a program to run.

Here's a function from that URL:

> int64 pc_support$ipower(int64 a,int32 n) {
> if ((n <= 0) ) {

This looks like it should have been n < 0

> return 0;
> }
> else if ((n == 0)) {

otherwise this branch could never be selected

> return 1;
> }
> else if ((n == 1)) {
> return a;
> }
> else if (((n & 1) == 0)) {
> return pc_support$ipower((a * a),(n / 2));
> }
> else {
> return (a * pc_support$ipower((a * a),((n - 1) / 2)));
> }
> }

Bart

unread,
Aug 12, 2018, 9:49:59 AM8/12/18
to
On 12/08/2018 12:54, Ben Bacarisse wrote:
> Bart <b...@freeuk.com> writes:
>
>> How would YOU rewrite a table like this of mixed function pointers:
>>
>> void *table[] = {malloc, realloc, free, printf, puts};
>
> typedef void func(void);
>
> func *table[] = {
> (func *)malloc, (func *)realloc, (func *)free, (func *)printf, (func *)puts
> };

OK. I may have misremembered Tim's solution to the same problem a few
years back as being more elaborate.

(I think this type func* corresponds to 'ref proc' in the original
source, which I would have used, but there was a bug in initialising
such a table even with casts, as it didn't recognise it as constant, so
'ref void' or void* was a quick workaround and didn't need casts. And
that normally works in C if you lay off -pedantic.)

>> How would you use such a pointer afterwards for calling it?
>>
>> Say, calling the pointer at table[2] knowing that it takes a void*
>> argument.
>
> ((void (*)(void *))table[2])(p);
>
> Ghastly,

The actual code (where the details of each functions depend on various
other tables) is a lot worse. And some of it necessarily involves
'lying' to C.

--
bart

Bart

unread,
Aug 12, 2018, 9:52:17 AM8/12/18
to
OK, I will need to look at that in a few days.

Feel free to debug the rest!

--
bart

anti...@math.uni.wroc.pl

unread,
Aug 12, 2018, 10:13:52 AM8/12/18
to
Hmm,

| 7.20.1.4 Integer types capable of holding object pointers
|
| 1 The following type designates a signed integer type with the
| property that any valid pointer to void can be converted to
| this type, then converted back to pointer to void, and the
| result will compare equal to the original pointer:
|
| intptr_t

Sure, this type is optional, but if it is defined then we have
the promise that we can store pointers to void in such variables.

--
Waldek Hebisch

David Brown

unread,
Aug 12, 2018, 12:30:27 PM8/12/18
to
Every cast is a message to the compiler saying "I'm doing something odd
here, but I know it is correct". When you don't know it is correct, you
are lying to the compiler.

Richard Damon

unread,
Aug 12, 2018, 2:12:43 PM8/12/18
to
On 8/12/18 10:13 AM, anti...@math.uni.wroc.pl wrote:
>
> Hmm,
>
> | 7.20.1.4 Integer types capable of holding object pointers
> |
> | 1 The following type designates a signed integer type with the
> | property that any valid pointer to void can be converted to
> | this type, then converted back to pointer to void, and the
> | result will compare equal to the original pointer:
> |
> | intptr_t
>
> Sure, this type is optional, but if it is defined then we have
> the promise that we can store pointers to void in such variables.
>

Not only can we store a pointer to void in such a variable, but we can
convert any pointer to an object into a pointer to void, so ultimately
we can store any pointer to an object in one. We do need to somehow keep
track of the original type, as only by converting back to that type (or
something compatible) can we use it.

If we need to store a pointer to a function into an integer type, we need
something else to provide the method (in many cases the same type will
work, but the C Standard doesn't require it).

Richard Damon

unread,
Aug 12, 2018, 2:26:51 PM8/12/18
to
On 8/12/18 6:56 AM, Bart wrote:
> On my intended platforms, void*, T* and T(*)() pointers can be cast to
> each other and they all share the same 32- or 64-bit representation. So
> here the compiler is being unreasonable. Avoiding casts to/from function
> pointers sees especially fiddly.
>
> How would YOU rewrite a table like this of mixed function pointers:
>
>  void *table[] = {malloc, realloc, free, printf, puts};
>
> How would you use such a pointer afterwards for calling it?
>
> Say, calling the pointer at table[2] knowing that it takes a void*
> argument.
>
>
>
> -

The C Standard says that for ALL function pointers, you are allowed to
cast them to a different type of function pointer, and then back, and
get a usable pointer to the original function.

Thus you can cast each function (which when used like that decays into a
pointer to that function) to some 'common' type, a common one is

void (*funp)(void)

and store that into the array. Then to use it, you cast it back to the
right function pointer type, and use it to call the function. Something like

((void (*)(void*))(table[2]))(ptr);

Note that changing between object pointers and function pointers is not
something defined by the C standard (and the pointers might even be of
different sizes), but for many machines it just happens to work.

The two main cases where it doesn't work are for Harvard Architecture
machines, where 'Program' space and 'Data' space are two very different
things (and may be very different sizes) and some models of segmented
architectures where one type of address may have an assumed segment
register while the other has an explicitly provided one (traditionally
the medium and compact memory models).

Scott

unread,
Aug 12, 2018, 2:41:45 PM8/12/18
to
Oh, I could tell some stories....

One of the earliest computers I owned had a base configuration with
16KB of RAM. That meant you only needed the bottom 14 bits of an
address register; since the top two bits didn't go anywhere and
addresses would just wrap around, you could appropriate those bits for
other purposes.

Now, this little computer had an option for upgrading to a full 64KB
of RAM. It wasn't supposed to be a user serviceable thing, but of
course in those halcyon days we did all kinds of things that started
with breaking the warranty seal, and adding a few DRAM chips was easy
enough if you were half handy with a soldering iron. (Socketed RAM?
What a luxury that would have been!)

Of course, once you upped the RAM from 16KB to 64, well, now you need
all 16 bits to address it. And all those clever things you wrote that
used those "unused" address bits? Yeah. You know what that looks like.

But, yes, WRT implementation-specific behavior, on my current x86
32-bit platform (XP/Cygwin), uint and all * are 32 bits with no hidden
thunks, and so indeed interchangeable. The box next to it, with BSD on
AMD64, has uint with 32 bits and all * with 64 bits, so slightly less
interchangeable.

bart...@gmail.com

unread,
Aug 12, 2018, 5:06:39 PM8/12/18
to
PC_lin32.c

Yes, that's a typo in the int-power function. There is another such function (for bigints) and that uses < not <=.

Kudos for spotting that in such hard-to-read C. Several people have said generated code doesn't need to be readable, and I took them at their word.

--
Bart

David Brown

unread,
Aug 12, 2018, 5:56:30 PM8/12/18
to
The issue here is that there are two separate implementations - gcc and
MSVC (used for the Windows OS and relevant system dll's) that have
differing opinions on implementation-defined behaviour.

It is unfortunate that in the Windows world, there are no consistent
standards for this sort of thing - no system ABI, no standard calling
convention, not even consistent sizes and alignments of all basic types.
This means that different implementations can make different decisions
about things. In this case, gcc by default has a 16-byte stack
alignment for efficiency, while MSVC has a 4-byte stack alignment.

So who is to blame?

Not C, anyway - the language has nothing to do with it.

You could say that gcc should default to assuming less about the stack
pointer alignment. That would mean less efficient code, but work more
safely - that could have been a better default.

You could say that Windows is at fault - it should ensure stronger stack
alignment before calling callback functions.

You could say the user is at fault for not using the
"force_align_arg_pointer" function attribute on the callback function.

I would say that gcc for Windows should probably be configured to have
the "-mstackrealign" option set by default, and let advanced users
change it if they know what they are doing. That would be an issue for
the MinGW builds of gcc.

I would also say that Windows should be using more conservative stack
alignment for callbacks.

And I would say the user could have done a bit of searching online, and
figured out the issue pretty quickly.

Keith Thompson

unread,
Aug 12, 2018, 6:18:17 PM8/12/18
to
David Brown <david...@hesbynett.no> writes:
> On 11/08/18 23:01, Bart wrote:
[...]
>> By that token then every cast is a lie to the compiler.
>
> Every cast is a message to the compiler saying "I'm doing something odd
> here, but I know it is correct". When you don't know it is correct, you
> are lying to the compiler.

Certainly not *every* cast.

A cast is just an explicit type conversion. Many numeric casts are
perfectly ordinary, for example:

int a = ...;
int b = ...;
double ratio = (double)a / (double)b;

The language's implicit type conversions usually do the right thing,
but not always. This is a case where you need to override them.
(Or perhaps a and b should have been defined as double in the
first place.)

Pointer conversions (most of which require casts) are usually "odd"
and require a bit more care.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

james...@alumni.caltech.edu

unread,
Aug 12, 2018, 6:27:51 PM8/12/18
to
On Sunday, August 12, 2018 at 6:57:02 AM UTC-4, Bart wrote:
> On 12/08/2018 04:15, james...@alumni.caltech.edu wrote:
> > On Saturday, August 11, 2018 at 6:11:53 PM UTC-4, Bart wrote:
>
> >> I mean whatever James means by "C", as he clearly thinks the language I
> >> am using and generating is not C.
> >
> > No, I said you weren't compiling it as C code. And that comment was only
> > in reference to your compilations that failed to set the options needed
> > to put gcc into a standard-conforming mode. You did show one compilation
> > using -std=c11, and that compilation was in fact compilation as C code.
>
> No. I showed four successive compilations using -std, followed by the
> comment 'You still think I'm not compiling C?'.
>
> You said 'No, it's GnuC'.

Yes, and when I said, "No, it's GnuC", I was referring to the two
compilations that did not use -std. That was not, in any sense, a
comment about the four compilations that used -std. Looking back, I see
that I should have placed the comment earlier, to make that distinction
clear. I apologize for placing it incorrectly.

> > What I want you to do (I know it's far too much to ask, but you did ask
> > what I want) is for you to stop complaining about the fact that when you
> > insist on compiling your code as GnuC code, the compiler behaves in
> > accordance with the rules of GnuC, which are often significantly
> > different from the rules of C.
>
> If I take this program (c.c):
>
> a,b,c; // implicit int
>
> fred(); // implicit int and () params
>
> main()
> {
> puts("hi there"); // implicit func decl
> fred(10);
> fred("twenty","thirty"); // contradictory args to fred
> }
>
> and compile as:
>
> gcc -c c.c
>
> then I do get a bunch of warnings, not errors. But if I follow your
> advice and compile it like this:
>
> gcc -c -std=c11 -pedantic c.c
>
> Then I get exactly the same warnings. It MAKES NO DIFFERENCE!

I'll take your word for it. GnuC is a very different language than C,
but it is also very similar to C, and this is an example of how it's
similar. I pay attention to both warnings and errors - and I treat a
warning message that is valid the same as a valid error message, so the
fact that it generates warning messages rather than error messages
doesn't matter to me.

> > It's not about writing code in the C language. It's about compiling it
> > as C code. When you use gcc without -std=c90, or c99, or c11, you're
> > not compiling it as C code, regardless of what you were thinking when
> > you wrote it.
>
> And my little test shows it makes no difference. Do you really think
> gnuC is so much more lax about things that matter? Seems silly to add so

No, the key point is not that it's lax, but that it's different.

> many features to make the language better, but then say, 'Let's be more
> forgiving than standard C about implicit ints'!

Why? The version you're using implements GnuC90 by default, which shares
with C90 the fact that implicit int was considered perfectly normal. Why
shouldn't it be forgiving of implicit int?

> > Actually, it doesn't - you're unreasonable about what you want to do.
> > Properly written C code does NOT trigger thousands of warnings when
> > using -pedantic.
>
> On my intended platforms, void*, T* and T(*)() pointers can be cast to
> each other and they all share the same 32- or 64-bit representation. So
> here the compiler is being unreasonable. Avoiding casts to/from function
> pointers seems especially fiddly.
>
> How would YOU rewrite a table like this of mixed function pointers:
>
> void *table[] = {malloc, realloc, free, printf, puts};

I'd use void (*)(void), rather than void*, as Ben showed you.

> How would you use such a pointer afterwards for calling it?

> Say, calling the pointer at table[2] knowing that it takes a void* argument.


(*(void (*)(void*))table[2])(ptr);

That isn't quite correct, however. The truth is, I would never write code
like that. Using a function pointer to call a function, when the type the
pointer points at is incompatible with the definition of the function, is
undefined behavior of the worst kind, so I would not use a generic
function pointer unless I combined it with strong measures to make sure
I kept proper track of the actual type of the function it points at. In
older C code, I would put the function pointer inside a struct, with
another member of the struct devoted to keeping track of the type of the
function pointed at. I would do my best to make sure that the type
identifier was kept synchronized with the pointer. In modern C, I'd use
a _Generic expression in the process of setting the type code. In C++,
I'd use templates for the same purpose.

Now, a lot of the C code you work with is computer generated. If your
code generator makes sure that the second element of the array always
points at a function with the same interface as free(), and if your code
generator makes sure that the second pointer is never used to call a
function in any way other than using that same interface, then it's safe
- because of your code generator, not because the generated code is
itself safe. Since I don't write code generators, but actual source
code, it's vitally important to me that the compiler be able to catch
type errors like that, which would be impossible with your code.

A scheme similar to the one you describe was used by C++ compilers that
generated C code as an intermediate step, for purposes of implementing
virtual function tables. But it is properly done using an array of
function pointers, not object pointers.
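
For illustration, the C vtable pattern those compilers emitted looks roughly like this (the shape/rectangle names are invented here):

```c
#include <assert.h>

/* Each "class" gets one table of properly typed function pointers;
   each object carries a pointer to its class's table. */
struct shape;

struct shape_vtbl {
    double (*area)(const struct shape *);
};

struct shape {
    const struct shape_vtbl *vtbl;
    double w, h;
};

static double rect_area(const struct shape *s) { return s->w * s->h; }

const struct shape_vtbl rect_vtbl = { rect_area };

/* "Virtual" dispatch: an indirect call through the object's table. */
double shape_area(const struct shape *s) {
    return s->vtbl->area(s);
}
```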

David Brown

Aug 13, 2018, 4:40:00 AM
On 11/08/18 03:14, Bart wrote:
> On 10/08/2018 20:51, Bart wrote:
>
>> So I need to find out what the problem is with that app on that one
>> machine.
>
> That problem is illustrated here:
>
>     char data[100];
>
> //  typedef double T;
>     typedef long long int T;
>
>     int main(void)
>     {
>         T x;
>         T *p,*q;
>
>         p = &x;
>         q = (T*)(&data[1]);
>
>         *p = *q;
>     }
>
> As it is, this does an assignment of a 64-bit int from an unaligned
> source. And it works (if it didn't, I would have done something about it
> long ago.)

It is not valid standard C. It might work on some platforms and
compilers, and fail on others. There are two problems:

1. You are accessing data of one type (char) via a pointer to an
incompatible type (T). That is undefined behaviour. Many compilers let
you do that, at least in some circumstances (optimisation disabled, or
-fno-strict-aliasing) - others will not guarantee to give you the code
you expect. It's unlikely to be a problem in this particular example,
but it certainly can be in other cases.

2. You are accessing data via a pointer that is (possibly) not properly
aligned. Whether that works or not depends on the target platform, the
types involved, the kind of instructions generated, etc. And since it
is undefined behaviour in the standards, compilers can assume it won't
happen. (Note that as far as the compiler is concerned, it only knows
that data is "char" aligned. It /could/ be that &data[1] is properly
aligned for accessing with type T.)


Some compilers and target processors may guarantee that this code will
work on that particular implementation. Most will not.

>
> But change the typedef of T to double, and it says 'Illegal
> instruction'. Presumably this processor (some sort of 32-bit ARM)
> doesn't like unaligned accesses for floating point data (but the one in
> the RPi is OK with it).
>
> Odd. I don't do much with unaligned accesses (the data[] array
> represents byte-code data from a file which may contain packed floating
> point values at any odd index), but now I can fix that knowing the problem.
>
> I don't think it's a C issue. Certainly enabling a bunch of extra
> warnings didn't throw up anything. (But then they usually don't detect
> real problems.)
>

A cast tells the compiler "I really know what I am doing here" - you get
far fewer warnings when you have casts involved.

This kind of code is a case where you have an awkward choice. The
correct C solution would be to use a memcpy(). On compilers that have
quality optimisation and good knowledge of their targets - the kind
where you are going to be in trouble for using casts like your code -
the memcpy() will be optimised to minimal instructions and give you the
best generated code. On compilers with lower quality optimisation, a
memcpy() might mean a slow library call while accessing via incompatible
types will probably do exactly what you want.

So do you write correct code with memcpy()? That will always be correct
and valid, but very inefficient on some compilers. Do you write
incorrect code with pointer casts? That will be efficient on many
compilers, but broken on others.

I count correctness as much more important than efficiency. And if
efficiency on poorer implementations is still important, I'd use an
inline function (or pre-C99, a macro) with conditional compilation to
generate faster code on implementations where I know memcpy() is slow
and I know casting works.


David Brown

Aug 13, 2018, 4:50:55 AM
On 13/08/18 00:18, Keith Thompson wrote:
> David Brown <david...@hesbynett.no> writes:
>> On 11/08/18 23:01, Bart wrote:
> [...]
>>> By that token then every cast is a lie to the compiler.
>>
>> Every cast is a message to the compiler saying "I'm doing something odd
>> here, but I know it is correct". When you don't know it is correct, you
>> are lying to the compiler.
>
> Certainly not *every* cast.
>
> A cast is just an explicit type conversion. Many numeric casts are
> perfectly ordinary, for example:
>
> int a = ...;
> int b = ...;
> double ratio = (double)a / (double)b;
>
> The language's implicit type conversions usually do the right thing,
> but not always. This is a case where you need to override them.
> (Or perhaps a and b should have been defined as double in the
> first place.)

Fair enough - they can be "ordinary" as well as "odd". But they still
tell the compiler you know what you are doing. If you cast a double to
an int, you are telling the compiler that you know everything is in
range and there are no overflows.

In some styles of code, it is not unusual to see casts written out
explicitly even though the same conversions would have happened
implicitly, on at least some targets. It can make it clearer to the
reader exactly which types are involved. But it can also mean that some
warnings that might have been issued if there are mistakes in the code,
are effectively disabled by the cast.

>
> Pointer conversions (most of which require casts) are usually "odd"
> and require a bit more care.
>

Agreed.

David Brown

Aug 13, 2018, 4:54:10 AM
It only needs to be readable from the point of view of debugging the
generator. That is a very different requirement from being readable as
standard C code, which again is very different from looking like
manually written code or being convenient for manual modification.

People who write code generators generally get quite familiar with their
output, and can read the generated code much more easily than other people.

bart...@gmail.com

Aug 13, 2018, 7:43:46 AM
David Brown:"It only needs to be readable from the point of view of debugging the
generator. That is a very different requirement from being readable as
standard C code, which again is very different from looking like
manually written code or being convenient for manual modification.

People who write code generators generally get quite familiar with their
output, and can read the generated code much more easily than other people."

For debugging, things like name mangling can be turned off, if it is not necessary to compile the C. Although that is not needed when the original program has just one module.

Excess brackets can also be removed, but my algorithm for that ran into problems due to add and subtract having the same precedence. It needs fixing.

Most debugging of generated C however is done with the smallest test program possible.

So the readability is mainly an issue for anybody who wants to browse the source code. Then, the original source would be much easier to read in all cases.

--
Bart

David Brown

Aug 13, 2018, 8:27:39 AM
On 11/08/18 15:34, Bart wrote:
> On 11/08/2018 12:37, Richard Damon wrote:
>> On 8/10/18 9:14 PM, Bart wrote:
>
>> I am a bit surprised that you say 'it works', as that probably means you
>> really have only done this with a very few different families of
>> processors to avoid having run into the issue.
>
> I'm only interested in x86 and ARM families at the minute, as they are
> the ones likely to be inside the machines I can get my hands on.
>

That's a fairly short-sighted view. It might be appropriate, but there
are other architectures around. RISC-V is the optimistic new kid on the
block, and could be very popular. It does not support misaligned
accesses in most of its versions. And misaligned access support varies
significantly from ARM to ARM.

> x86 family has allowed unaligned access for 40 years. And even the most
> recent versions only require alignment for certain instructions.

An alternative viewpoint is that since many x86 devices don't support
misaligned access on every instruction, they don't support general
misaligned access - they only support it on certain instructions. Make
your default assumptions the ones that are /always/ correct, not the
ones that are merely /usually/ correct.

>
> So if someone is developing a product that will run on either of those
> two families, they might not need to consider it. (But ARM I think has
> rather more variations.)

Program accurately in C and use a decent compiler, and you don't need to
consider it at all - the /compiler/ can handle it. The compiler will
generate instructions that work for the alignment access requirements of
the target processor. Then you don't need to mess around trying to use
broken pointer casts and misaligned accesses - use legitimate techniques
like unions, packed structs (implementation-defined, but fine on the
platforms that support them), and memcpy. The compiler will happily
generate optimal code, using misaligned accesses if it works and is
efficient.
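
A sketch of the packed-struct variant (a GCC/Clang extension, so implementation-defined behaviour - the names here are invented):

```c
#include <assert.h>

/* The packed attribute tells the compiler this member may sit at any
   byte offset, so it emits whatever access sequence the target needs:
   a single misaligned load where that works, byte accesses otherwise. */
struct unaligned_ll {
    long long v;
} __attribute__((packed));

long long read_ll_packed(const void *p) {
    const struct unaligned_ll *u = p;
    return u->v;
}

void write_ll_packed(void *p, long long v) {
    struct unaligned_ll *u = p;
    u->v = v;
}
```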

>
> It would anyway not be something you would routinely do - nearly all
> accesses /will/ be aligned. In this case input was coming from a packed
> file format where alignment wasn't observed.
>
> But it's really not a big deal - if it ever causes a problem, then you
> fix it, as I have.

You haven't fixed it - you have just used a different broken solution
that happened, by luck, to give the hoped-for answer on a particular
processor and particular compiler. Fixing it /properly/ is not hard,
especially if you are not overly concerned about the performance on
weaker compilers (and if you /were/ concerned with performance, you
wouldn't be using such compilers).

>
>> The C language, because the issue is well known for a long time, has
>> implementation dependent rules about pointer alignment (implementation
>> dependent because it really does depend on the target machine). You tend
>> to not get warnings about it, because you have an explicit cast, which
>> tends to mean to the compiler that you know what you are doing so please
>> do this, an the conversion of a char buffer in this manner is not
>> uncommon, it just requires that you know you have made a properly
>> aligned pointer.
>
> The only other problem I've had on ARM was the opposite: gcc thought I
> was doing misaligned pointer accesses, and proceeded to do
> byte-at-a-time transfers, even though the processor /could/ do
> unaligned. And even though the actual transfer /was/ aligned!
>
> In this case it just slowed down my program by a factor of three.
>

Correct is more important than fast.

If gcc thought that your pointers were misaligned, it was probably due
to how you had manipulated these pointers - like converting them from
void*, char* or uintptr_t types that lost the alignment information.
You can restore alignment information using the __builtin_assume_aligned
function.
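
A sketch of how that builtin is used (a GCC/Clang extension; the function name here is invented):

```c
#include <assert.h>
#include <stdint.h>

/* Round-tripping a pointer through uintptr_t loses the compiler's
   alignment knowledge; __builtin_assume_aligned restores it, letting
   the compiler emit full-width accesses instead of byte-at-a-time
   transfers. The programmer must guarantee the claim is true. */
double load_double_aligned(uintptr_t addr) {
    const double *p = __builtin_assume_aligned((const void *)addr,
                                               _Alignof(double));
    return *p;
}
```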

Scott Lurndal

Aug 13, 2018, 9:45:37 AM
David Brown <david...@hesbynett.no> writes:
>On 12/08/18 12:28, Bart wrote:
>> On 12/08/2018 02:09, anti...@math.uni.wroc.pl wrote:
>>> Bart <b...@freeuk.com> wrote:
>>

>> Whether compiled with -std=c11 or not, that was a bug. Since that was a
>> callback called from code where it doesn't know if it was even written
>> in C, it should have been more careful.

Yes, a microsoft bug.

>>
>> Now, I didn't make any lies in my C code and it was clearly marked as a
>> callback.
>>
>> It requires, in fact, this attribute to make it work properly:
>>
>>   #define gcc_callback __attribute__ ((force_align_arg_pointer))
>>
>> Maybe you can argue that the lie is in not writing this line; that is
>> likely what JK would argue.
>>
>
>The issue here is that there are two separate implementations - gcc and
>MSVC (used for the Windows OS and relevant system dll's) that have
>differing opinions on implementation-defined behaviour.
>
>It is unfortunate that in the Windows world, there are no consistent
>standards for this sort of thing - no system ABI, no standard calling
>convention, not even consistent sizes and alignments of all basic types.
> This means that different implementations can make different decisions
>about things. In this case, gcc by default has a 16-byte stack
>alignment for efficiency, while MSVC has a 4-byte stack alignment.
>
>So who is to blame?
>
>Not C, anyway - the language has nothing to do with it.
>
>You could say that gcc should default to assuming less about the stack
>pointer alignment. That would mean less efficient code, but work more
>safely - that could have been a better default.

In this case, gcc defaulted to the System V processor specific ABI (psABI)
for x86-64, which requires stack alignment sufficient to store the largest
machine data type naturally aligned.

MS compilers have been and always will be crappy.

bart...@gmail.com

Aug 13, 2018, 2:07:50 PM
Scott: "Yes, a microsoft bug.
....
In this case, gcc defaulted to the System V processor specific ABI (psABI)
for x86-64, which requires stack alignment sufficient to store the largest
machine data type naturally aligned.

MS compilers have been and always will be crappy"

Is MS really at fault? Don't they obey their own Win64 ABI? That says the stack should be 16 byte aligned just before CALL is executed, therefore the stack will never be 16 byte aligned at the entry point of the function, which is where GCC takes over.

Unless the stack is anything other than odd-8-byte aligned, then I would say it's up to GCC.

--
Bart

bart...@gmail.com

Aug 13, 2018, 2:30:05 PM
David Brown: "Correct is more important than fast."

Not with a 3x slowdown due to a misunderstanding, on a machine already considered very slow.

"If gcc thought that your pointers were misaligned, it was probably due
to how you had manipulated these pointers - like converting them from
void*, char* or uintptr_t types that lost the alignment information.
You can restore alignment information using the __builtin_assume_aligned
function. ".

(Replying to more pertinent parts but selective quoting is not easy on this device.)

The OP and alignment issue is just one of hundreds of things to be tackled to get this particular program working. This is an interpreter - you sometimes have to do underhand things. Even more so if there was JIT tacked onto it.

And the work covers the design of the language to be run, the design of the language used to be implement it, and the design of the program used to translate the latter to C, as these are all changing.

As it is, this version for C excludes 6000 lines of inline assembly used for performance (which actually makes it 2-3 times as fast as pure C with GCC O3).

The alignment thing covers 2 lines out of 20,000.

--
Bart

David Brown

Aug 13, 2018, 3:43:54 PM
On 13/08/18 20:29, bart...@gmail.com wrote:
> David Brown: "Correct is more important than fast."
>
> Not with a 3x slowdown due to a misunderstanding, on a machine already considered very slow.
>

Yes, regardless of the slowdown, correct is /always/ more important than
fast. If "correct" means too slow to be useful, then you have a
problem. But incorrect is /always/ bad, regardless of speed.

If this is all due to a bug in gcc, then it is a performance bug in gcc
and should be reported and hopefully fixed. That does not change the
fact that it is invariably better to err on the side of slow but correct
than fast but incorrect.

> "If gcc thought that your pointers were misaligned, it was probably due
> to how you had manipulated these pointers - like converting them from
> void*, char* or uintptr_t types that lost the alignment information.
> You can restore alignment information using the __builtin_assume_aligned
> function. ".
>
> (Replying to more pertinent parts but selective quoting is not easy on this device.)
>

(Snipping is great - we should do it more often in these threads. The
line endings are jumbled by your device or newsreader, however.)

> The OP and alignment issue is just one of hundreds of things to be tackled to get this particular program working. This is an interpreter - you sometimes have to do underhand things. Even more so if there was JIT tacked onto it.
>

Sometimes when you do that sort of thing you have to use additional
tricks to keep the compiler informed of the details if you want fast code.

bart...@gmail.com

Aug 13, 2018, 6:52:02 PM
David Brown: "Yes, regardless of the slowdown, correct is /always/ more important than
fast. If "correct" means too slow to be useful, then you have a
problem. But incorrect is /always bad, regardless of speed. "

Why do you always seem to think I write incorrect programs? My apps are not little exercises that print one result that may be verified as being right or wrong.

In the case of an interpreter, inputs can be complex and so can outputs. Actually the input is a program, and it is that that determines the output. If it is wrong, it might be an interpreter bug, or a bug in the input program (or a bug in another tool).

An interpreter has to deal with many thousands of combinations of byte-codes and data types and operations, and data ranging from 1 byte to 1GB. Not all will be supported, some may be buggy, or have limitations. Or the input program is os-specific.

So your idea of a program being either correct or incorrect is ludicrous.

(My measure is whether an input program known to work on one system, still works on this port. Then you try more programs to achieve some degree of confidence that most inputs will work and the port was successful.

For this app, that means that a /binary/ input program from 64-bit Windows works unchanged in 32-bit Linux.)

--
Bart

David Brown

Aug 13, 2018, 7:10:43 PM
On 14/08/18 00:51, bart...@gmail.com wrote:
> David Brown: "Yes, regardless of the slowdown, correct is /always/ more important than
> fast. If "correct" means too slow to be useful, then you have a
> problem. But incorrect is /always bad, regardless of speed. "
>
> Why do you always seem to think I write incorrect programs? My apps are not little exercises that print one result that may be verified as being right or wrong.
>

I wrote "Correct is more important than fast." You wrote "Not with a 3x
slowdown on a machine already considered very slow".

What conclusion should I draw from that, other than that you think it is
sometimes fine for programs to be incorrect as long as they are not too
slow?


> In the case of an interpreter, inputs can be complex and so can outputs. Actually the input is a program, and it is that that determines the output. If it is wrong, it might be an interpreter bug, or a bug in the input program (or a bug in another tool).
>
> An interpreter has to deal with many thousands of combinations of byte-codes and data types and operations, and data ranging from 1 byte to 1GB. Not all will be supported, some may be buggy, or have limitations. Or the input program is os-specific.
>
> So your idea of a program being either correct or incorrect is ludicrous.

No, it is not.

A program has a specification (whether you write it down or not is
irrelevant). Given this input, it should give that output. A program
is correct if it fulfils that specification - incorrect if it does not.

An interpreter is correct if it handles a correct source code in the
expected way. You might additionally have requirements about how it
handles particular errors - in which case it has to get those right too.

If it is known to handle some specific test cases correctly, then it
might be correct - or might not be. If it is known to fail on test
cases that should work, then it is incorrect.

(It might still be useful in a limited fashion even if it is incorrect.)

bart...@gmail.com

Aug 13, 2018, 7:52:44 PM
David Brown: 'I wrote "Correct is more important than fast."'

And you wrote that as though incorrect was an option! And since you seemed to consider a dodgy but working cast as incorrect, maybe you intended it to be one.

I already mentioned that I 'withdrew' my C compiler because of problems I wasn't happy with, but those are real ones not a workaround that might not work on some compilers that may not even exist.

(As a turnaround on that project, I'm considering changing the backend to output code in my language, as a change from going the other way. But there are some difficulties and it might only work for single file C programs.)

--
Bart

David Brown

Aug 14, 2018, 3:00:26 AM
On 14/08/18 01:52, bart...@gmail.com wrote:
> David Brown: 'I wrote "Correct is more important than fast."'
>
> And you wrote that as though incorrect was an option! And since you
> seemed to consider a dodgy but working cast as incorrect, maybe you
> intended it to be one.
>

People /do/ write incorrect code. Most programmers don't consider it an
option - they write incorrect code only by accident. Some, however, are
happy to write knowingly incorrect code.

Code can be correct for a particular implementation of C - a particular
compiler, flag combination or target - while still being incorrect as C.
It can also be working code, while still being incorrect - it happens
to work, but you can't be sure of it.

Your horrible cast code is certainly incorrect C - it breaks a variety
of C rules. It happens to work on particular examples on some of the
tools you have used with some targets and some flag combinations. It
fails to work in other cases, and even when it /does/ work, you can't be
sure of it - perhaps the same code sequences would fail when mixed with
different source code. It is /incorrect/ code.

As I say, people write incorrect code sometimes - and sometimes they
don't realise it, because the code happens to work during their tests.
This can be due to ignorance - C can be a complex language. Usually it
is due to simply making a mistake. But what is special about your code,
such as your dodgy casts, is that you /know/ the code is incorrect. You
/know/ it breaks rules, and will fail to work in many circumstances.
You /know/ how to write it correctly, so that it is valid on any C
compiler. You /know/ how to write it so that it is correct and
efficient on all the tools you use, with a fall-back to correct but
possibly inefficient on unknown tools.

So you are intentionally writing incorrect code. Why? I try to look on
it from the best light, and think you are doing so for efficiency or
simplicity. An alternative explanation is that you write code this way
so that you can feed your paranoia about gcc and use it to claim that
the compiler is evil.

bart...@gmail.com

Aug 14, 2018, 5:29:58 AM
DB:"Your horrible cast code is certainly incorrect C - it breaks a variety
of C rules. It happens to work on particular examples on some of the ..."

[What is incorrect code]

This is probably where your thinking differs from mine. In your model C is far too dominant and influences how you write programs too much.

My programming model is different. In my case C is just used as an intermediate representation, so I don't care about its stinkin' rules.

My code snippet copies an 8-byte block via pointers. Any mistake was in attributing a float type, which here I didn't need. My own code generator knows that and ignores it. It will use a generic word type.

If there is a problem moving a word as one block, then on the first target where that is a problem, it can be tweaked. As I said, such unaligned accesses are very rare.

And I can tweak it by changing the code in original source, or by allowing the unaligned-ness to be specified in the language.

C meanwhile, ought to do what I tell it.

--
Bart

David Brown

Aug 14, 2018, 6:20:16 AM
On 14/08/18 11:29, bart...@gmail.com wrote:
> DB:"Your horrible cast code is certainly incorrect C - it breaks a
> variety of C rules. It happens to work on particular examples on
> some of the ..."
>
> [What is incorrect code]
>
> This is probably where your thinking differs from mine. In your model
> C is far too dominant and influences how you write programs too
> much.

I write in various programming languages. If I write in Python, I write
code that is correct Python - using Pythonic coding that I would not
consider in C. If I write in assembly, I write correct assembly, using
styles that I would not use in C. If I write in C, I write correct C
code. (Barring mistakes, of course.)

I don't view C as a "portable assembly". I don't view Python as a "high
level C". I don't view assembly as a "low level Javascript".

Pick an appropriate language for the job, and use that language
correctly for the task.

It just boggles my mind why someone would knowingly and intentionally
write bad code that they /know/ doesn't work in all the cases they need,
when they know how to do it correctly.

For a motoring analogy, I can understand why someone might see a speed
limit sign saying "50 mph" and choose to drive at 60 mph - it is a
calculated risk with some benefit. That's like relying on your "int"
being 32-bit - it can be more convenient in the code, and for many
people you are not going to see any problems.

But screwing around with casting pointers to incompatible types with
unsupported alignments, that sometimes works on limited compilers? It's
like seeing a sign saying "low bridge 3m" and thinking "My van is 3.5m -
but since my tyres are below proper pressure I'll probably scrape through".

>
> My programming model is different. In my case C is just used as an
> intermediate representation, so I don't care about it's stinkin'
> rules.
>

You /do/ care about the rules - you care enough to explicitly ignore
them and them blame them when things go wrong.

> My code snippet copies an 8-byte block via pointers. Any mistake was
> in attributing a float type which here I'd not needed. My own code
> generator knows that and ignores it. It will use a genetic word
> type.

Following the rules for C here is /easy/. It is not remotely
challenging. You choose actively to write things in a knowingly
incorrect way. And then you "fix" it by using different but equally
incorrect code.

>
> If there is a problem moving a word as one block, then on the first
> target that will be a problem, that can be tweaked. As I said, such
> unaligned accessed are very rare.
>
> And I can tweak it by changing the code in original source, or by
> allowing the unaligned-ness to be specified in the language.
>
> C meanwhile, ought to do what I tell it.
>

C /does/ do what you tell it - you just have to learn the language. You
tell it "I don't care about the rules. Do whatever the *beep* you
want", and then complain when it doesn't read your mind.

mark.b...@gmail.com

Aug 14, 2018, 6:59:04 AM
On Tuesday, 14 August 2018 10:29:58 UTC+1, bart...@gmail.com wrote:

> My programming model is different. In my case C is just used as an
> intermediate representation, so I don't care about its stinkin' rules.

Well, you clearly should - if you don't know C and its rules adequately,
then you have no basis to assert that the C you've generated accurately
reflects the semantics of the code from which it was generated, surely.

The recent example of undefined behaviour from signed overflow is a case
in point. Your lack of understanding C's "stinkin' rules" meant that you
incorrectly assumed that the behaviour you wanted in your original language
would be automatically provided by C.

It seems to me that anyone choosing to use C as an intermediate language
needs to understand the language and its rules at least sufficiently well to
understand how to implement the semantics of the original language in C,
not just assume/hope that it will be a "Do What I Meant" language.


bart...@gmail.com

Aug 14, 2018, 7:07:17 AM
DB: "
It just boggles my mind why someone would knowingly and intentionally
write bad code that they /know/ doesn't work in all the cases they need,
when they know how to do it correctly. "

It DOES work in all my current test cases, which is 5 platforms and probably 6. I know it won't work on some like 64kb Z80 because the program is too bleeding big, and requires that ints are 32 bits.

Note that interpreters are again a special case. It's OK to need to overhaul or rewrite in a new platform, because you only do it once. Then each of the million apps you run under it will work unchanged. If there is a bug that manifests itself in 100,000 of those apps, then it is only fixed once in the interpreter.

Also, such programs are also different in possibly having an accelerator written specially for each platform, which will be highly non portable.

So I'm not going to be bothered about one unaligned assignment out of many 1000s of lines, which is allowed by the hardware.

This is a non-issue. A bug was found, which was the hard part in doing it in crappy hardware with crappy tools. The fix was then trivial.

The only interesting thing was that all the lauded features of GCC didn't pick it up when in C form.

You might also bear in mind that in C form, not a single macro nor conditional is used. And the pc_nos32.c version runs on either OS. Now go and take another look at sources for any other interpreter.


--
Bart

Scott

Aug 14, 2018, 8:31:44 PM
On Tue, 14 Aug 2018 02:29:47 -0700 (PDT), bart...@gmail.com wrote:

>C meanwhile, ought to do what I tell it.

Oh, C does what you tell it, that's for certain.

The problem you're having is that what you're telling it, and what you
*think* you're telling it, are turning out to be different things.
This is not much of a surprise coming from someone who doesn't
understand what they're doing.

blmblm.m...@gmail.com

Aug 14, 2018, 8:33:09 PM
In article <5b7077c9...@news.xmission.com>,
I'm reminded of the long-ago days when I did systems work on
IBM mainframes. My years doing that overlapped with the move from
24-bit addresses (in 32-bit words) to an uneasy mix of 24-bit and
31-bit addresses. In the 24-bit-address days, programmers (and
IIRC some of the machine instructions?) made use of those 8 extra
bits. That "uneasy mix" was MVS/XA, which I remember as a massive
kludge designed to allow running both the older 24-bit programs
and newer 31-bit ones.

> But, yes, WRT implementation-specific behavior, on my current x86
> 32-bit platform (XP/Cygwin), uint and all * are 32 bits with no hidden
> thunks, and so indeed interchangeable. The box next to it, with BSD on
> AMD64, has uint with 32 bits and all * with 64 bits, so slightly less
> interchangeable.

--
B. L. Massingill
ObDisclaimer: I don't speak for my employers; they return the favor.

bart...@gmail.com

Aug 15, 2018, 4:10:59 AM
Scott wrote: "
Oh, C does what you tell it, that's for certain. "

Not GCC -O3. It'll look for every opportunity to /avoid/ doing what I told it.

This is in the interests of making the program faster.

No, that's cheating. I want it to do what I say, but efficiently.

A trivial example is a 100-million loop which may be eliminated if the results it calculates are not used.

Another is the recursive Fibonacci benchmark where fib(n) is expected to perform X million calls, but manages to get the right answer using a fraction of those.

We're not interested in the answer! But in how well it can do millions of function calls.

So benchmarks have to use -O1. -O3 is only used for real programs, for which any improvements will be less dramatic.
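
The loop-elimination effect can be sketched like this (illustrative, not one of the actual benchmarks):

```c
#include <assert.h>

/* At -O2/-O3, gcc typically replaces this loop with the closed-form
   result (or deletes it entirely if the return value is unused), so
   timing it says nothing about loop throughput. At -O0 the hundred
   million iterations actually execute. */
long long sum_first_100m(void) {
    long long sum = 0;
    for (long long i = 0; i < 100000000; i++)
        sum += i;
    return sum;
}
```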

--
Bart

Reinhardt Behm

Aug 15, 2018, 5:31:42 AM
AT Wednesday 15 August 2018 16:10, bart...@gmail.com wrote:

> Scott wrote: "
> Oh, C does what you tell it, that's for certain. "
>
> Not GCC -O3. It'll look for every opportunity to /avoid/ doing what I told
> it.

What you mean is what you /think/ you told it - but you did not.

> This is in the interests of making the program faster.
>
> No, that's cheating. I want it to do what I say, but efficiently.

Then tell it what you want, and don't tell it something different. if you
want to tell someone what to do, you have to use the language of that
someone, not some language with a different meaning.
>
> A trivial example is a 100-million loop which may be eliminated if the
> results it calculates are not used.

Which is absolutely correct but probably beyond your understanding.

> Another is the recursive Fibonacci benchmark where fib(n) is expected to
> perform X million calls, but manages to get the right answer using a
> fraction of those.
>
> We're not interested in the answer! But in how well it can do millions of
> function calls.
>
> So benchmarks have to use -O1. -O3 is only used for real programs, for
> which any improvements will be less dramatic.

Real programmers write real programs not toy benchmarks.

--
Reinhardt

Andrew Smallshaw

Aug 15, 2018, 5:58:09 AM
On 2018-08-13, David Brown <david...@hesbynett.no> wrote:
>
> Yes, regardless of the slowdown, correct is /always/ more important than
> fast. If "correct" means too slow to be useful, then you have a
> problem. But incorrect is /always bad, regardless of speed.

To some extent this comes down to definitions (what counts as
"correct"?) but I would be very careful about blindly asserting
that, especially as /always/ being true. Quick and roughly right
is often perfectly acceptable over something more mathematically
rigorous, or in realtime scenarios a late result is as much use as no
result at all.

Or consider many real world applications where the consequences of
an error are less inconvenient than the hassle of doing the job
properly to begin with. For example, how many times does a bar
code fail to scan first time? You could eliminate a lot of that
with more controlled scanning conditions, e.g. a "scanning bench"
with a mounted scanner, moveable table that is adjusted to the
optimal focus, controlled lighting etc. It isn't worth it - the
whole point of something like that is to be quick and convenient.
Certainly you don't want a mismatch - one code being matched for
another - so you have reliable error detection on the result - but
if the initial scan fails, well you try it again. It is the
trade-off to have a system that generally works in a second or less
as opposed to taking five or ten minutes.

--
Andrew Smallshaw
and...@sdf.org

David Brown

Aug 15, 2018, 6:39:34 AM
On 15/08/18 10:10, bart...@gmail.com wrote:
> Scott wrote: " Oh, C does what you tell it, that's for certain. "
>
> Not GCC -O3. It'll look for every opportunity to /avoid/ doing what I
> told it.
>

Sorry, you are wrong. Totally, and completely wrong. Scott is correct -
you are not telling the compiler to do what you think you are telling it.

Until you understand the language, and understand the importance of
speaking the same language as your tools, you will /always/ get this
wrong, you will write incorrect programs, and you will get problems
depending on details of the code, the compiler, the flags, the targets.

> This is in the interests of making the program faster.

Yes, gcc -O3 tries to make /correct/ programs run faster. For those of
us who know how to program in C, that's great - for the same effort in
the source code, we get better results. And more importantly IMHO, good
optimisation means I can write my source code with a focus on clarity
and let the compiler worry about the details of efficiency.

>
> No, that's cheating. I want it to do what I say, but efficiently.

Ask it in C, not in a sort-of-C dialect that works by luck on a few test
systems.

>
> A trivial example is a 100-million loop which may be eliminated if
> the results it calculates are not used.
>

And that is a marvellous thing. It is not uncommon that people have
some code that does not do anything useful in a given build - and it is
great that the compiler can figure it out and remove it. This lets
people write their code in more general ways, or clearer ways - and the
compiler skips over code that might be needed for one build, and is not
needed in another build.

Of course, if you have a 100-million loop whose result is not used, you
may have to question the point of having it in the first place. But you
don't have to question the benefits of the compiler skipping it.

(If you are trying to do some timing or other testing, then you might
want to learn about "observable behaviour" in C.)

> Another is the recursive Fibonacci benchmark where fib(n) is expected
> to perform X million calls, but manages to get the right answer using
> a fraction of those.

Again, that is a marvellous optimisation.

>
> We're not interested in the answer! But in how well it can do
> millions of function calls.

/You/ might not be interested in the answer. Usually when I write code,
I /am/ interested in the results. And I am happy if the compiler can
find shortcuts to give me those results faster.

If I want to do testing or benchmarking, then I do testing and
benchmarking. C provides the key features needed (observable behaviour)
- it is not gcc's fault that you don't understand the fundamentals of
how the C language works.

(You are not the first, or the only, programmer to fail to understand
this - but you may be the only programmer who has so repeatedly failed
to understand despite being told innumerable times.)

>
> So benchmarks have to use -O1. -O3 is only used for real programs,
> for which any improvements will be less dramatic.
>

As usual, complete rubbish based on a stubborn and determined ignorance.

If you want to learn what you need to do for benchmarking like this, ask
and I (or others) will happily tell you. The same applies to anything
else about C programming and compilers. Otherwise, whenever you have a
complaint to make about compilers or optimisations it would be safer if
you simply assumed that you are wrong and kept quiet - the group does
not need this continuous supply of ignorant FUD and nonsense.

David Brown

Aug 15, 2018, 6:49:38 AM
On 15/08/18 11:57, Andrew Smallshaw wrote:
> On 2018-08-13, David Brown <david...@hesbynett.no> wrote:
>>
>> Yes, regardless of the slowdown, correct is /always/ more important than
>> fast. If "correct" means too slow to be useful, then you have a
>> problem. But incorrect is /always/ bad, regardless of speed.
>
> To some extent this comes down to definitions (what counts as
> "correct"?) but I would be very careful about blindly asserting
> that, especially as /always/ being true. Quick and roughly right
> is often perfectly acceptable over something more mathematically
> rigorous, or in realtime scenarios a late result is as much use as no
> result at all.
>

As you say, it comes down to the meaning of "correct". If a rough
approximation to something is good enough, then a rough approximation is
correct. And if your results need to be available within a certain
time, then a program that is too slow is incorrect even if the results
are accurate.

"Correct" basically means "fulfils the specification". Getting the
right specification - that's the hard bit. (The answer is 42. Now what
is the question?)

> Or consider many real world applications where the consequences of
> an error are less inconvenient than the hassle of doing the job
> properly to begin with. For example, how many times does a bar
> code fail to scan first time? You could eliminate a lot of that
> with more controlled scanning conditions, e.g. a "scanning bench"
> with a mounted scanner, moveable table that is adjusted to the
> optimal focus, controlled lighting etc. It isn't worth it - the
> whole point of something like that is to be quick and convenient.
> Certainly you don't want a mismatch - one code being matched for
> another - so you have reliable error detection on the result - but
> if the initial scan fails, well you try it again. It is the
> trade-off to have a system that generally works in a second or less
> as opposed to taking five or ten minutes.
>

Again, that's a matter of specification. "It should scan 99% of the bar
codes on first attempt in typical lighting" might be part of the
specification, and occasional second attempts are fine. In that case, a
system that needs a second try once in a while would still be correct.
But a system that gave an invalid output, or crashed, or sometimes hung
depending on the timing of the "read the barcode" and the "check the
button" threads would be incorrect.

And if the specification includes "the C code only needs to be compiled
and tested on gcc 4.3.2 for x86 with -std=gnu99 and no optimisation",
then a program that faffs around with invalid pointer casts might still
be correct. If the specs say "it needs to work with any C compiler",
the code would be incorrect.






Malcolm McLean

Aug 15, 2018, 7:50:46 AM
On Wednesday, August 15, 2018 at 11:39:34 AM UTC+1, David Brown wrote:
> On 15/08/18 10:10, bart...@gmail.com wrote:
>
> (If you are trying to do some timing or other testing, then you might
> want to learn about "observable behaviour" in C.)
>
> > Another is the recursive Fibonacci benchmark where fib(n) is expected
> > to perform X million calls, but manages to get the right answer using
> > a fraction of those.
>
> Again, that is a marvellous optimisation.
>
Not necessarily.
One issue is when the program is intended to be run only once, and the
optimisation takes longer than the program to execute. That might be a
program with a runtime of two weeks, for example the programs I wrote
for my doctoral studies.

James Kuyper

Aug 15, 2018, 8:09:04 AM
On 08/15/2018 06:39 AM, David Brown wrote:
> On 15/08/18 10:10, bart...@gmail.com wrote:
>> Scott wrote: " Oh, C does what you tell it, that's for certain. "
>>
>> Not GCC -O3. It'll look for every opportunity to /avoid/ doing what I
>> told it.
...
>> This is in the interests of making the program faster.
...
>> No, that's cheating. I want it to do what I say, but efficiently.
...
>> A trivial example is a 100-million loop which may be eliminated if
>> the results it calculates are not used.
...
> (If you are trying to do some timing or other testing, then you might
> want to learn about "observable behaviour" in C.)
>
>> Another is the recursive Fibonacci benchmark where fib(n) is expected
>> to perform X million calls, but manages to get the right answer using
>> a fraction of those.
...
>> We're not interested in the answer! But in how well it can do
>> millions of function calls.
...
> If I want to do testing or benchmarking, then I do testing and
> benchmarking. C provides the key features needed (observable behaviour)
> - it is not gcc's fault that you don't understand the fundamentals of
> how the C language works.

Just in case David's hints about "observable behavior" have gone over
your head, here's what he means:

"The least requirements on a conforming implementation are:
— Accesses to volatile objects are evaluated strictly according to the
rules of the abstract machine.
— At program termination, all data written into files shall be identical
to the result that execution of the program according to the abstract
semantics would have produced.
— The input and output dynamics of interactive devices shall take place
as specified in 7.21.3. The intent of these requirements is that
unbuffered or line-buffered output appear as soon as possible, to ensure
that prompting messages actually appear prior to a program waiting for
input.
This is the _observable behavior_ of the program."
(5.1.2.3p6)
Note that the phrase "observable behavior" is in italics, an ISO
convention indicating that the sentence containing that phrase
constitutes the official definition of the meaning of that phrase.

What this clause means is that things which qualify as observable
behavior must occur exactly as specified by your code. Optimizations are
not allowed to change those things; they are allowed to change ANYTHING
else. Therefore, all you need to do in order to ensure that a given
block of code actually gets executed, regardless of optimizations, is to
place a piece of code in that block that has observable behavior. The
entire rest of that block might be removed by optimization, but that
particular piece cannot be, and therefore the block cannot be skipped.
If the rest of your code ensures that, in the abstract machine, that
block should be executed 1 million times, then that piece of code will
be executed 1 million times. If it is written in a way that guarantees
that, in the abstract machine, the executions must occur in a certain
order, and if the observable behavior of that piece of code takes a form
(such as printing the value of a counter) that allows you to tell what
order those executions occurred in, then they must occur in the
specified order.
The disadvantage, for timing purposes, is that what you're timing
necessarily includes the time required to produce the observable behavior.
Now, "What constitutes an access to an object that has
volatile-qualified type is implementation-defined." (6.7.3p7), but
incrementing a volatile-qualified counter, for instance, is pretty
likely to qualify.

How does this connect back to the idea of "C does what you tell it"?
Very simple: any code construct that does NOT have observable behavior,
does NOT say what you mistakenly think it says. All such constructs
implicitly say "feel free to optimize me away if you can do so without
affecting the observable behavior". If that's not what you want to say,
you need to replace it with code that does have (or at least, affects
code that has) observable behavior.

David Brown

Aug 15, 2018, 9:08:23 AM
If you can find a program that takes two weeks to run, and /longer/ than
two weeks to compile due to optimisation, then I would love to see the code.

It is certainly possible for a compilation to take longer than a run
time. If that is a problem, because you are compiling often and only
need to run the code once, you've probably got the wrong language - and
you've certainly got the wrong compiler options.

David Brown

Aug 15, 2018, 9:24:57 AM
On 15/08/18 14:08, James Kuyper wrote:

<snipping for space>

> The disadvantage, for timing purposes, is that what you're timing
> necessarily includes the time required to produce the observable behavior.
> Now, "What constitutes an access to an object that has
> volatile-qualified type is implementation-defined." (6.7.3p7), but
> incrementing a volatile-qualified counter, for instance, is pretty
> likely to qualify.
>

Incrementing a volatile counter will necessarily involve access - I
don't /think/ that one is open to interpretation.

A key point about "what constitutes an access" is when you use a
"pointer to volatile" to access a non-volatile object:

#define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))

This is a common idiom (though the "typeof" is a gcc'ism) used in OS and
embedded programming. But the C standards don't actually say if it
works. However, at one meeting the committee members were apparently
asked how their tools handled such expressions - they all said that they
would give you a "volatile access". It's a pity that has not been
codified in the standards (not even in C17, AFAICS).

Bart

Aug 15, 2018, 9:33:57 AM
Yes, exactly. That's why pulling out all the stops to optimise a 10-line
benchmark is a waste of time.

Benchmarks are necessary to compare the cost of functions calls, where
you /do/ need to make function calls, across implementations. Then a
compiler that is showing off how good it is at eliminating them in a
tiny program is not helpful.

The point is however that the compiler is also not doing exactly what I
said.

--
bart



David Brown

Aug 15, 2018, 9:38:48 AM
On 15/08/18 15:07, Stefan Ram wrote:
> James Kuyper <james...@alumni.caltech.edu> writes:
>> does NOT say what you mistakenly think it says. All such constructs
>> implicitly say "feel free to optimize me away if you can do so without
>> affecting the observable behavior". If that's not what you want to say,
>> you need to replace it with code that does have (or at least, affects
>> code that has) observable behavior.
>
> 6.8.5 says,
>
> |An iteration statement whose controlling expression is not a
> |constant expression, that performs no input/output
> |operations, does not access volatile objects, and performs no
> |synchronization or atomic operations in its body, controlling
> |expression, or (in the case of a for statement) its
> |expression-3, may be assumed by the implementation to terminate.
>
> . So, in this sense it can be "optimized away".

That is typically not the reason for "optimising out" loops. 6.8.5p6
does allow the compiler to remove loops that do no work without having
to prove that they terminate, but in many cases the compiler can already
see that they terminate and thus doesn't need this extra permission.
(6.8.5p6 also allows some additional movement of code in optimising.)
Usually loops get removed because the compiler can see they don't do
anything useful.

>
> But they introduced an exception for /constant expressions/!
>
> So, »while( 0 )« starts a loop that /cannot/ be optimized
> away by this rule. So, one can still write an endless loop
> (for sole calculation purposes, with no "visible behavior")
> if one wishes to do so and is aware of 6.8.5 or lucky.
>

A "while (0)" loop can be removed without reference to the rule.

A "while (1)" loop does not terminate (without a "break", "goto",
"return", "longjmp", "exit", "abort", or some implementation-specific
function or method). Unending loops are absolutely critical to a lot of
programming - embedded systems almost invariably have a "while (1)" or
equivalent after initialisation.


Reinhardt Behm

Aug 15, 2018, 9:47:42 AM
It is doing what you said. You think you speak C but you speak Bart-C (a
strange language that barely resembles C), but the compiler only speaks and
understands C. And it is doing what it understands.

--
Reinhardt

james...@alumni.caltech.edu

Aug 15, 2018, 9:54:45 AM
On Wednesday, August 15, 2018 at 9:33:57 AM UTC-4, Bart wrote:
> On 15/08/2018 10:31, Reinhardt Behm wrote:
...
> > Real programmers write real programs not toy benchmarks.
>
>
> Yes, exactly. That's why pulling out all the stops to optimise a 10-line
> benchmark is a waste of time.
>
> Benchmarks are necessary to compare the cost of functions calls, where
> you /do/ need to make function calls, across implementations. Then a
> compiler that is showing off how good it is at eliminating them in a
> tiny program is not helpful.
>
> The point is however that the compiler is also not doing exactly what I
> said.

You were using C. In C, writing a function call does not tell the
compiler to call that function. It tells the compiler to produce the
same "observable behavior" (as the C standard defines that term) as
calling the function would produce, by whatever means the compiler
thinks is best, which may or may not involve calling a function - which
might or might not bear any visible resemblance to the function you
wrote. C is, very fundamentally, about observable behavior, and
everything else is important only insofar as it affects the observable
behavior. If there's anything other than the observable behavior that
matters to you, C is not an appropriate language for requesting that
those other things occur.
It's very odd that you're interested in the speed of function calls in
and of themselves. The only reason you should be interested in the
execution time of function calls is because the time spent executing
them causes delays in the occurrence of observable behavior - and those
delays are easily benchmarked in C.