Knowing the implementation, are all undefined behaviours become implementation-defined behaviours?

Michael Tsang

Feb 14, 2010, 1:03:38 AM

Dereferencing a NULL pointer is undefined behaviour, but, on Linux, the
program crashes with SIGSEGV. So, the behaviour of dereferencing a NULL
pointer is defined to "crash the program with SIGSEGV".

Signed integer overflow is undefined behaviour, but, on x86 CPUs, the number
simply wraps around so we can say that the behaviour is defined to round on
x86 CPUs.

Alf P. Steinbach

Feb 14, 2010, 1:20:28 AM
* Michael Tsang:

Your question, from the subject line, is

"Knowing the implementation, are all undefined behaviours become
implementation-defined behaviours?"

And it's cross-posted to [comp.lang.c] and [comp.lang.c++].

At least for C++ the answer is a definite maybe: theoretically it depends on the
implementation.

In practice the answer is a more clear "no", because it's practically impossible
for an implementation to clearly define all behaviors, in particular pointer
operations and use of external libraries.

Cheers & hth.,

- Alf

Seebs

Feb 14, 2010, 2:03:57 AM
On 2010-02-14, Michael Tsang <mik...@gmail.com> wrote:
> Dereferencing a NULL pointer is undefined behaviour, but, on Linux, the
> program crashes with SIGSEGV. So, the behaviour of dereferencing a NULL
> pointer is defined to "crash the program with SIGSEGV".

Not necessarily.

> Signed integer overflow is undefined behaviour, but, on x86 CPUs, the number
> simply wrap around so we can say that the behaviour is defined to round on
> x86 CPUs.

That's not rounding, that's wrapping.

But no, it's not the case. These are not necessarily *defined* -- they may
merely be typical side-effects that are not guaranteed or supported.
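
To make the overflow case concrete, here is a minimal sketch in C of code
that does not depend on what the hardware happens to do. The names
checked_add and wrapping_add are made up for illustration. The first tests
for overflow before adding, so the signed addition itself never overflows;
the second uses unsigned arithmetic, which the standard does define to wrap.

```c
#include <assert.h>
#include <limits.h>

/* Adds a and b only if the mathematical result fits in an int.
   Returns 1 and stores the result in *sum on success, 0 otherwise.
   The range test happens before the addition, so no signed overflow
   (and hence no undefined behaviour) can occur. */
int checked_add(int a, int b, int *sum)
{
    assert(sum != 0);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 0;
    *sum = a + b;
    return 1;
}

/* Unsigned arithmetic is defined to wrap modulo UINT_MAX + 1. */
unsigned wrapping_add(unsigned a, unsigned b)
{
    return a + b;
}
```

With these, checked_add(INT_MAX, 1, &s) reports failure instead of
overflowing, and wrapping_add(UINT_MAX, 1u) is 0 on every conforming
implementation, not just on x86.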

Modern gcc can do some VERY strange things if you write code which might
dereference a null pointer. (For instance, loops which check whether a
pointer is null may have the test removed because, if it were null, it
would have invoked undefined behavior to dereference it...)

-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / usenet...@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!

Malcolm McLean

Feb 14, 2010, 2:16:49 AM
On Feb 14, 8:03 am, Michael Tsang <mikl...@gmail.com> wrote:
>
"Undefined behaviour" doesn't mean "exists in some metaphysical state
of indefiniteness" but "the C standard imposes no requirements on the
program's behaviour (and therefore the program is incorrect)". There
was a huge thread about this a few years back on gets.

So typically dereferencing null will have the same effect each time any
particular program is run, and probably the same effect on any particular
platform. Dereferencing a wild pointer may have different effects,
particularly on a multi-tasking machine where exact pointer values
vary from run to run.

Robert Fendt

Feb 14, 2010, 3:16:10 AM
And thus spake Seebs <usenet...@seebs.net>
14 Feb 2010 07:03:57 GMT:

> dereference a null pointer. (For instance, loops which check whether a
> pointer is null may have the test removed because, if it were null, it
> would have invoked undefined behavior to dereference it...)

Sorry to interrupt, but since when is checking a pointer value
for 0 the same as dereferencing it? Checking a pointer treats the
pointer itself as a value, and comparison against 0 is one of
the few things that are _guaranteed_ to work with a pointer
value. So if GCC really would remove a check of the form

    if(!pointer)
        do_something(*pointer);

or even

    if(pointer == 0)
        throw NullPointerException;

then GCC would be very much in violation of the standard. And
produce absolutely useless code, as well. What's the point of
having pointers in a language if you wouldn't even be able to
perform basic operations on them?

Regards,
Robert

Richard Heathfield

Feb 14, 2010, 3:21:53 AM
Michael Tsang wrote:
> Dereferencing a NULL pointer is undefined behaviour, but, on Linux, the
> program crashes with SIGSEGV. So, the behaviour of dereferencing a NULL
> pointer is defined to "crash the program with SIGSEGV".

Thread's subject line: Knowing the implementation, are all undefined
behaviours become implementation-defined behaviours?

No. For example, consider a stack exploit on gets(). There are systems
on which the behaviour could be absolutely anything at all, depending on
user input!6\b$10be5c39no carrier

Alf P. Steinbach

Feb 14, 2010, 4:12:24 AM
* Richard Heathfield:

:-)


Cheers,

- Alf

Bo Persson

Feb 14, 2010, 5:24:48 AM

Yes, but there are cases where the compiler can determine that the
pointer is ALWAYS null or not-null, and remove code that would execute
otherwise. For example:

    *pointer = 42;

    if(pointer == 0)
        throw NullPointerException;

is known never to throw the exception!


Bo Persson


Ersek, Laszlo

Feb 14, 2010, 6:17:13 AM
In article <20100214091...@vulcan.local>, Robert Fendt <rob...@fendt.net> writes:

> Checking a pointer treats the
> pointer itself as a value, and comparison against 0 is one of
> the few things that are _guaranteed_ to work with a pointer
> value.

No, evaluating an invalid pointer is undefined behavior.

{
    void *p;

    p = malloc(1);
    free(p);
    p;      /* UB */
    !p;     /* UB */
    0 != p; /* UB */
}

See the C99 Rationale 6.3.2.3 Pointers for an informative (not
normative) description.

I believe that in this paragraph:

----v----
Regardless how an invalid pointer is created, any use of it yields
undefined behavior. Even assignment, comparison with a null pointer
constant, or comparison with itself, might on some systems result in an
exception.
----^----

"any use" denotes "any evaluation", and "assignment" means "assignment
FROM the invalid pointer". I'm fairly sure the following is valid:

{
    int *ip;

    ip = malloc(sizeof *ip);
    free(ip);
    sizeof ip;
    sizeof *ip;
    ip = 0;
    ip;
    !ip;
    0 != ip;
}

Cheers,
lacos

Richard Tobin

Feb 14, 2010, 6:23:17 AM
In article <853c9a09-8911-48c5...@z11g2000yqz.googlegroups.com>,
Malcolm McLean <malcolm...@btinternet.com> wrote:

>Dereferencing a wild pointer may have different effects,
>particularly on a multi-tasking machine where exact pointer values
>vary from run to run.

It's not a general characteristic of multi-tasking systems that
pointer values vary from run to run. Virtual memory has traditionally
been used to give all instances of a program indistinguishable address
spaces, and addresses will usually be the same.

Recently for security reasons some operating systems have started to
deliberately randomise the locations of, for example, shared
libraries, so pointers are now more likely to vary. (Fortunately this
can usually be disabled for debugging.)

-- Richard
--
Please remember to mention me / in tapes you leave behind.

Robert Fendt

Feb 14, 2010, 6:54:43 AM
And thus spake "Bo Persson" <b...@gmb.dk>
Sun, 14 Feb 2010 11:24:48 +0100:

> Yes, but there are cases where the compiler can determine that the
> pointer is ALWAYS null or not-null, and remove code that would execute
> otherwise. For example:
>
> *pointer = 42;
> if(pointer == 0)
> throw NullPointerException;
>
> is known never to throw the exception!

Yes, that's static optimisation. Nothing wrong with that.
However, the posting I was commenting on explicitly described
something different:

>> dereference a null pointer. (For instance, loops which check
>> whether a pointer is null may have the test removed because, if it
>> were null, it would have invoked undefined behavior to dereference
>> it...)

This would mean nothing else than the compiler removing
nullpointer checks solely on the grounds that a nullpointer
cannot be de-referenced legally. So the compiler would see a
pointer dereference, and decide "then it can't be null anyway,
since it's used later". And that's just bull, sorry.

Yes, if there's an unconditional pointer dereference and
_afterwards_ a check for null, the compiler could take this as a
hint that said pointer has been checked for null before the first
dereference and thus remove the superfluous check. So if you had
something like this:

    MyType& obj = *pointer;
    if (!pointer)
        throw NullPointerException;

Since the dereference happens _before_ the check, the program
has already entered the domain of undefined behaviour, and the
check is moot (even if one has not 'used' the object reference
in any other way). If the author of the previous posting meant
that, then I agree (though I have doubts whether GCC really
optimises this aggressively). But in that case his comment was at
least not very clear.

Regards,
Robert

pete

Feb 14, 2010, 8:23:56 AM
Michael Tsang wrote:

> undefined behaviour

The way that term is used in the standard,
is to describe programs outside of any context.

The question is,
"Does the standard place any limitations
on the behavior of this program?"
If the answer is "No", then you have undefined behavior.

I think it is simplest to consider
the behavior of an otherwise correct
program which executes this statement
return (1 / (CHAR_BIT - 7));
as being implementation defined

and the behavior of an otherwise correct
program which executes this statement
return (1 / (CHAR_BIT - 9));
as being undefined.
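
A minimal sketch in C of the same distinction, for a program that wants
neither outcome left to chance. The helper name guarded_quotient is made up
for illustration. CHAR_BIT is required to be at least 8, so CHAR_BIT - 7 is
never zero and only the value of the quotient varies between
implementations; CHAR_BIT - 9 is zero exactly where bytes have 9 bits, and
there the division is undefined unless it is guarded.

```c
#include <limits.h>

/* Returns numerator / denominator, or fallback in the two cases where
   integer division is undefined: a zero divisor, and INT_MIN / -1
   (whose result does not fit in an int on two's complement machines). */
int guarded_quotient(int numerator, int denominator, int fallback)
{
    if (denominator == 0)
        return fallback;
    if (numerator == INT_MIN && denominator == -1)
        return fallback;
    return numerator / denominator;
}
```

A call such as guarded_quotient(1, CHAR_BIT - 9, 0) is then well-defined
everywhere, including on a 9-bit-byte machine.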

--
pete

Ben Bacarisse

Feb 14, 2010, 8:41:23 AM
Robert Fendt <rob...@fendt.net> writes:
<snip>

> Yes, if there's an unconditional pointer dereference and
> _afterwards_ a check for null, the compiler could take this as a
> hint that said pointer has been checked for null before the first
> dereference and thus remove the superfluous check. So if you had
> something like this:
>
>     MyType& obj = *pointer;
>     if (!pointer)
>         throw NullPointerException;
>
> Since the dereference happens _before_ the check, the program
> has already entered the domain of undefined behaviour, and the
> check is moot (even if one has not 'used' the object reference
> in any other way). If the author of the previous posting meant
> that, then I agree (though I have doubts whether GCC really
> optimises this aggressively).

gcc does exactly that (with certain options). I think this is the
nature of a recent Linux kernel bug: http://lkml.org/lkml/2009/7/6/19

The pointer use was ever so slightly less obvious but it led gcc to
conclude that the following test could be removed.

Given the cross-post, I should say that I have no idea if gcc does
this for the exact case you cite (which is C++) but I wanted to point
out that similar things are done.

<snip>
--
Ben.

Robert Fendt

Feb 14, 2010, 8:54:26 AM
And thus spake Ben Bacarisse <ben.u...@bsb.me.uk>
Sun, 14 Feb 2010 13:41:23 +0000:

> gcc does exactly that (with certain options). I think this is the
> nature of a recent Linux kernel bug: http://lkml.org/lkml/2009/7/6/19

It certainly looks that way. That's a nasty bugger to spot.

> Given the cross-post, I should say that I have no idea if gcc does
> this for the exact case you cite (which is C++) but I wanted to point
> out that similar things are done.

Yes, I did not notice this whole thread had been crossposted to
comp.lang.c; a more appropriate example would then have been a
sizeof(*pointer) or something. Since sizeof in that case relies
only on static type information, one could assume it should work
whether the pointer is null or not. But the dereference itself
already makes the whole program ill-formed (in case of a
null pointer).

Regards,
Robert

James Kanze

Feb 14, 2010, 9:11:20 AM
On Feb 14, 1:54 pm, Robert Fendt <rob...@fendt.net> wrote:
> And thus spake Ben Bacarisse <ben.use...@bsb.me.uk>

> Sun, 14 Feb 2010 13:41:23 +0000:

> > gcc does exactly that (with certain options). I think this
> > is the nature of a recent Linux kernel
> > bug: http://lkml.org/lkml/2009/7/6/19

> It certainly looks that way. That's a nasty bugger to spot.

Either the pointer can be null, or it cannot. If it can be
null, the first unit test which tests it with null should cause
a crash. If it cannot, then the test the g++ would have
removed is superfluous, and removing it shouldn't change
anything.

There are many other cases of undefined behavior which do affect
optimizations, however. Consider an expression like: f((*p)++,
(*q)++). Given this, the compiler "knows" that p and q do not
reference the same memory (since if they did, it would be
undefined behavior), which means that in other code in the
function, the compiler might have cached *p, and knows that it
doesn't have to update or purge its cached value if there is a
write through *q.

> > Given the cross-post, I should say that I have no idea if
> > gcc does this for the exact case you cite (which is C++) but
> > I wanted to point out that similar things are done.

> Yes, I did not notice this whole thread had been crossposted
> to comp.lang.c; a more appropriate example would then have
> been a sizeof(*pointer) or something. Since sizeof in that
> case relies only on static type information, one could assume
> it should work whether the pointer is null or not. But the
> dereference itself already makes the whole program ill-formed
> (in case of a nullpointer).

Dereferencing a null pointer is only undefined behavior if the
code is actually executed. Something like sizeof(
f(*(MyType*)0) ) is perfectly legal, and widely used in some
template idioms (although I can't think of a reasonable use for
it in C).

--
James Kanze

Malcolm McLean

Feb 14, 2010, 9:35:40 AM
On Feb 14, 4:11 pm, James Kanze <james.ka...@gmail.com> wrote:
>
> Dereferencing a null pointer is only undefined behavior if the
> code is actually executed.  Something like sizeof(
> f(*(MyType*)0) ) is perfectly legal, and widely used in some
> template idioms (although I can't think of a reasonable use for
> it in C).
>
Nulls are dereferenced to produce the offsetof macro hack in C.

Ersek, Laszlo

Feb 14, 2010, 10:26:28 AM
In article <ac7734c3-682b-4b37...@q29g2000yqn.googlegroups.com>, Malcolm McLean <malcolm...@btinternet.com> writes:

> On Feb 14, 4:11 pm, James Kanze <james.ka...@gmail.com> wrote:
>>
>> Dereferencing a null pointer is only undefined behavior if the
>> code is actually executed. Something like sizeof(
>> f(*(MyType*)0) ) is perfectly legal, and widely used in some
>> template idioms (although I can't think of a reasonable use for
>> it in C).
>>
> Nulls are dereferenced to produce the offsetof macro hack in C.

No, they are not.

I guess you mean something like this:

#define offsetof(type, member_designator) \
    ((size_t)&((type *)0)->member_designator)

Let's deal first with the conversion of the final pointer to size_t:

C99 6.3.2.3 Pointers, p6: "Any pointer type may be converted to an
integer type. Except as previously specified, the result is
implementation-defined. If the result cannot be represented in the
integer type, the behavior is undefined. The result need not be in the
range of values of any integer type."

Then wrt. dereferencing the null pointer:

C99 6.6 Constant expressions, p9: "An address constant is a null
pointer, [...]; it shall be created explicitly using the unary &
operator or an integer constant cast to pointer type, or [...]. The
[...] member-access . and -> operators, the address & and indirection *
unary operators, and pointer casts may be used in the creation of an
address constant, but the value of an object shall not be accessed by
use of these operators."

Perhaps this is relevant too:

C99 6.5.3.2 Address and indirection operators, p3: "[...] If the operand
is the result of a unary * operator, neither that operator nor the &
operator is evaluated and the result is as if both were omitted, except
that the constraints on the operators still apply and the result is not
an lvalue. [...]"

Cheers,
lacos

Ben Bacarisse

Feb 14, 2010, 11:18:10 AM
James Kanze <james...@gmail.com> writes:

> On Feb 14, 1:54 pm, Robert Fendt <rob...@fendt.net> wrote:

<snip>


>> Yes, I did not notice this whole thread had been crossposted
>> to comp.lang.c; a more appropriate example would then have
>> been a sizeof(*pointer) or something. Since sizeof in that
>> case relies only on static type information, one could assume
>> it should work whether the pointer is null or not. But the
>> dereference itself already makes the whole program ill-formed
>> (in case of a nullpointer).
>
> Dereferencing a null pointer is only undefined behavior if the
> code is actually executed. Something like sizeof(
> f(*(MyType*)0) ) is perfectly legal, and widely used in some
> template idioms (although I can't think of a reasonable use for
> it in C).

For a non-literal null, it is quite common:

new_ptr = realloc(old_ptr, new_length * sizeof *new_ptr);

will work regardless of the state of new_ptr (null, well-defined or
indeterminate).

[I know you know this: I am simply illustrating the point with a
common idiom.]
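
Spelled out as a small C sketch (struct record and the two function names
are made up for illustration), the operand of sizeof is only inspected for
its type, so the null pointer below is never read through:

```c
#include <stddef.h>

struct record {
    int id;
    double value;
};

/* p is null, but sizeof *p does not evaluate *p (it is not a
   variable length array), so nothing is dereferenced at run time. */
size_t record_size_via_null(void)
{
    struct record *p = NULL;
    return sizeof *p;
}

/* The same idiom as in the realloc call: size a buffer from the
   pointer's target type without naming the type again. */
size_t bytes_for(size_t count)
{
    struct record *p = NULL;
    return count * sizeof *p;
}
```

Both functions give the same answers as writing sizeof(struct record)
directly.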

--
Ben.

Seebs

Feb 14, 2010, 11:35:21 AM
On 2010-02-14, Robert Fendt <rob...@fendt.net> wrote:
> And thus spake Seebs <usenet...@seebs.net>
> 14 Feb 2010 07:03:57 GMT:
>> dereference a null pointer. (For instance, loops which check whether a
>> pointer is null may have the test removed because, if it were null, it
>> would have invoked undefined behavior to dereference it...)

> Sorry to interrupt, but since when is checking a pointer value
> for 0 the same as dereferencing it?

It's not.

But if you dereference a pointer at some point, a check against it can
be omitted. If, that is, that dereference can happen without the check.

So imagine something like:

    ptr = get_ptr();

    while (ptr != 0) {
        /* blah blah blah */
        ptr = get_ptr();
        x = *ptr;
    }

gcc might turn the while into an if followed by an infinite loop, because
it *knows* that ptr can't become null during the loop, because if it did,
that would have invoked undefined behavior.

And there are contexts where you can actually dereference a null and not
get a crash, which means that some hunks of kernel code can become infinite
loops unexpectedly with modern gcc. Until the kernel is fixed, which I
believe it has been.
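
For comparison, a minimal sketch in C of the same kind of loop with every
pointer tested before it is dereferenced, which leaves the compiler nothing
to deduce from undefined behaviour. struct source, this get_ptr and sum_all
are hypothetical stand-ins, not code from the kernel or anywhere else.

```c
#include <stddef.h>

/* Hypothetical pointer source: hands out the elements of an array
   one at a time, then NULL once they are used up. */
struct source {
    const int *items;
    size_t count;
    size_t next;
};

const int *get_ptr(struct source *s)
{
    if (s->next >= s->count)
        return NULL;
    return &s->items[s->next++];
}

/* Sums the values until get_ptr returns NULL. The test sits between
   fetching the pointer and using it, so *ptr is never evaluated for
   a null ptr and the test cannot be optimised away. */
long sum_all(struct source *s)
{
    long total = 0;
    const int *ptr;

    while ((ptr = get_ptr(s)) != NULL)
        total += *ptr;
    return total;
}
```

The point is only the ordering: fetch, test, then dereference.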

Seebs

Feb 14, 2010, 11:37:22 AM
On 2010-02-14, James Kanze <james...@gmail.com> wrote:
> Either the pointer can be null, or it cannot. If it can be
> null, the first unit test which tests it with null should cause
> a crash. If it cannot, then the test the g++ would have
> removed is superfluous, and removing it shouldn't change
> anything.

Unless you're in a context where dereferencing null exhibits the undefined
behavior of giving you access to a block of memory.

> Dereferencing a null pointer is only undefined behavior if the
> code is actually executed. Something like sizeof(
> f(*(MyType*)0) ) is perfectly legal, and widely used in some
> template idioms (although I can't think of a reasonable use for
> it in C).

Implementation of offsetof(), too, although that's not exactly safe.

Pete Becker

Feb 14, 2010, 11:48:52 AM

In some compilers the offsetof macro is implemented that way. But that
only works with compilers that don't trap on such a dereference. Which
is why it's in the standard library: the library that comes with the
compiler can take advantage of compiler-specific behavior while portable
code can't.
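
In portable code that means using the macro from <stddef.h> and not
re-creating it. A minimal sketch, with struct packet and the function names
made up for illustration:

```c
#include <stddef.h>

struct packet {
    char tag;
    int length;
    double payload;
};

/* offsetof comes from <stddef.h>; how the implementation computes it
   is its own business, and user code never touches a null pointer. */
size_t tag_offset(void)
{
    return offsetof(struct packet, tag);
}

size_t payload_offset(void)
{
    return offsetof(struct packet, payload);
}
```

The first member is guaranteed to sit at offset 0; the other offsets depend
on the implementation's padding rules.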

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of
"The Standard C++ Library Extensions: a Tutorial and Reference"
(www.petebecker.com/tr1book)

Ben Bacarisse

Feb 14, 2010, 12:07:51 PM
Malcolm McLean <malcolm...@btinternet.com> writes:

Then I would say that it is not an example of what James was talking
about. In his C++ example, no null pointer is dereferenced.

Obviously there is a terminology issue here in that you might want to
say that sizeof *(int *)0 is a dereference of a null pointer because,
structurally, it applies * to such a pointer; but I would rather
reserve the word dereference for an /evaluated/ application of * (or []
or ->). I'd go so far as to say that any other use is wrong.

--
Ben.

Thad Smith

Feb 14, 2010, 12:30:55 PM
Michael Tsang wrote:
>
> Dereferencing a NULL pointer is undefined behaviour,

Actually, dereferencing a null pointer _results in_ behavior undefined by
Standard C.

In answer to your subject line question "Knowing the implementation, are all
undefined behaviours become implementation-defined behaviours?", no.

In Standard C "implementation-defined behavior" means that the implementation
documents the behavior. Even if the behavior is consistent for a particular
implementation, it may not be documented.

--
Thad

Richard Tobin

Feb 14, 2010, 12:38:38 PM
In article <slrnhng9pl.2mq...@guild.seebs.net>,
Seebs <usenet...@seebs.net> wrote:

> while (ptr != 0) {
> /* blah blah blah */
> ptr = get_ptr();
> x = *ptr;
> }
>
>gcc might turn the while into an if followed by an infinite loop, because
>it *knows* that ptr can't become null during the loop, because if it did,
>that would have invoked undefined behavior.

As I've said before, the fact that the compiler can do this sort of
optimisation is often an indication of an error in the code. Why
would the programmer repeatedly test the pointer if it couldn't be
null? I would much rather that the compiler warned about this, instead
of just treating it as an opportunity to remove some code.

Seebs

Feb 14, 2010, 12:46:27 PM
On 2010-02-14, Richard Tobin <ric...@cogsci.ed.ac.uk> wrote:
> As I've said before, the fact that the compiler can do this sort of
> optimisation is often an indication of an error in the code. Why
> would the programmer repeatedly test the pointer if it couldn't be
> null? I would much rather that the compiler warned about this, instead
> of just treating it as an opportunity to remove some code.

That's an interesting point, and I think I'd agree. Maybe. Do we
want a warning for while(1), which we know definitely loops forever?

It could be that the loop was written because the programmer wasn't *sure*
it couldn't be null, but the compiler has proven it and thus feels safe
optimizing.

Richard Tobin

Feb 14, 2010, 1:00:57 PM
In article <slrnhngdv0.8ao...@guild.seebs.net>,
Seebs <usenet...@seebs.net> wrote:


>That's an interesting point, and I think I'd agree. Maybe. Do we
>want a warning for while(1), which we know definitely loops forever?

No, but only because it's a common idiom.

>It could be that the loop was written because the programmer wasn't *sure*
>it couldn't be null, but the compiler has proven it and thus feels safe
>optimizing.

All the more reason for a warning. Then the programmer can sleep
soundly, and perhaps modify the code accordingly. (And modify the
comment about it that he no doubt wrote.)

Seebs

Feb 14, 2010, 1:02:29 PM
On 2010-02-14, Richard Tobin <ric...@cogsci.ed.ac.uk> wrote:
> In article <slrnhngdv0.8ao...@guild.seebs.net>,
> Seebs <usenet...@seebs.net> wrote:
>>It could be that the loop was written because the programmer wasn't *sure*
>>it couldn't be null, but the compiler has proven it and thus feels safe
>>optimizing.

> All the more reason for a warning. Then the programmer can sleep
> soundly, and perhaps modify the code accordingly. (And modify the
> comment about it that he no doubt wrote.)

Hmm.

The problem is, this becomes a halting problem case, effectively. It's like
the warnings for possible use of uninitialized variables. We *know* that
those warnings are going to sometimes be wrong, or sometimes be omitted when
they were appropriate, so the compiler has to accept some risk of error. I
think this optimization is in the same category -- there's too many boundary
cases to make it a behavior that people rely on or expect.

Pete Becker

Feb 14, 2010, 2:02:55 PM
Thad Smith wrote:
>
> In Standard C "implementation-defined behavior" means that the
> implementation documents the behavior.

It's actually a little more than that: it means that the standard
*requires* the implementation to document the behavior. Providing
documentation does not turn undefined behavior into
implementation-defined behavior.
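
For contrast, a small C sketch of two things the standard does label
implementation-defined, and which an implementation must therefore
document: whether plain char is signed, and how many bits a byte has. Both
are visible through <limits.h>. The function names are made up for
illustration.

```c
#include <limits.h>

/* Plain char behaves like either signed char or unsigned char; which
   one is implementation-defined, and CHAR_MIN reports the choice. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* The number of bits in a byte is implementation-defined too, but the
   standard requires it to be at least 8. */
int bits_per_byte(void)
{
    return CHAR_BIT;
}
```

A program may rely on these answers for a given implementation because the
standard obliges that implementation to state them.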

Even if the behavior is
> consistent for a particular implementation, it may not be documented.
>

Even if it's documented, it's still undefined behavior according to the
standard. It may well be well-behaved and consistent, but calling that
"implementation-defined" muddles meanings because
"implementation-defined" has a specific meaning within the standard.

ThosRTanner

Feb 15, 2010, 3:49:41 AM
On Feb 14, 6:03 am, Michael Tsang <mikl...@gmail.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1

>
> Dereferencing a NULL pointer is undefined behaviour, but, on Linux, the
> program crashes with SIGSEGV. So, the behaviour of dereferencing a NULL
> pointer is defined to "crash the program with SIGSEGV".
>
> Signed integer overflow is undefined behaviour, but, on x86 CPUs, the number
> simply wrap around so we can say that the behaviour is defined to round on
> x86 CPUs.

The results of accessing uninitialised memory, however, are wildly
unpredictable, as are those of going past the end of an array. It may
always crash, but it's far more likely to occasionally do something
strange.

#include <iostream>

void func()
{
    int i;                  // never initialised
    std::cout << i << "\n"; // reads an indeterminate value
}

is going to depend, among other things, on what has been called before
func.

Arguably it's predictable but so is the weather. It just takes 3 days
of work on a massive supercomputer to get a 24 hours forecast...

Phil Carmody

Feb 15, 2010, 4:18:39 AM

One high profile instance was fixed. I think I fixed another one or two
recently. Independent eyes, and all that, imply that there are probably
hundreds left.

I wonder if sparse can be tweaked to detect it. I'm a tad rusty at
sparse internals, but might be able to add it if I have some spare
time.

Phil
--
Any true emperor never needs to wear clothes. -- Devany on r.a.s.f1

Phil Carmody

Feb 15, 2010, 4:26:16 AM
Seebs <usenet...@seebs.net> writes:
> On 2010-02-14, Richard Tobin <ric...@cogsci.ed.ac.uk> wrote:
>> As I've said before, the fact that the compiler can do this sort of
>> optimisation is often an indication of an error in the code. Why
>> would the programmer repeatedly test the pointer if it couldn't be
>> null? I would much rather that the compiler warned about this, instead
>> of just treating it as an opportunity to remove some code.
>
> That's an interesting point, and I think I'd agree. Maybe. Do we
> want a warning for while(1), which we know definitely loops forever?
>
> It could be that the loop was written because the programmer wasn't *sure*
> it couldn't be null, but the compiler has proven it and thus feels safe
> optimizing.

There's a difference between deducing that an expression has or
doesn't have a particular value based on code elsewhere, and
detecting a purely constant value. So the two can be treated as
separate cases and {en,dis}abled independently.

However, in a C context, this would be a mid-way case between those two:

static const int debugging = MAYBE_A_HASHDEFINE;

foo() {
    //...
    if(debugging) { frob_stuff(); }
    //...
}

I think I wouldn't want a compiler warning for that, were debugging
to be zero.

Nobody

Feb 15, 2010, 11:18:48 AM
On Sun, 14 Feb 2010 09:16:10 +0100, Robert Fendt wrote:

>> dereference a null pointer. (For instance, loops which check whether a
>> pointer is null may have the test removed because, if it were null, it
>> would have invoked undefined behavior to dereference it...)
>
> Sorry to interrupt, but since when is checking a pointer value
> for 0 the same as dereferencing it? Checking a pointer treats the
> pointer itself as a value, and comparison against 0 is one of
> the few things that are _guaranteed_ to work with a pointer
> value. So if GCC really would remove a check of the form
>
> if(!pointer)
> do_something(*pointer);
>
> or even
>
> if(pointer == 0)
> throw NullPointerException;

It won't remove those checks, but the first one may well be converted to:

    if(!pointer)
        do_something_completely_different();

And that's entirely legal, as the original code invokes UB when the
test succeeds. [Or did you get the test the wrong way around?]

Also, if you do e.g.:

    int x = *p;

    if (p)
        do_something_to(x);

it may simply omit the test. As you have already dereferenced p by that
point, if p happens to be null, you invoke UB and the compiler can do
whatever it wants, including executing the code which should be executed
when p is non-null.

This isn't a theoretical situation; some versions of gcc *will* perform
the above optimisation. This particular case resulted in an exploitable
bug in a Linux kernel module (the bug was compounded by the fact that you
*can* have memory mapped to page zero; this is permitted for the benefit
of emulators such as DOSbox).

Alan Curry

Feb 15, 2010, 4:28:29 PM
In article <hl83o2$aos$1...@news.eternal-september.org>,

Michael Tsang <mik...@gmail.com> wrote:
|-----BEGIN PGP SIGNED MESSAGE-----
|Hash: SHA1
|
|Dereferencing a NULL pointer is undefined behaviour, but, on Linux, the
|program crashes with SIGSEGV. So, the behaviour of dereferencing a NULL
|pointer is defined to "crash the program with SIGSEGV".

Are you sure?

Compile this with and without -DUSE_STDIO and explain the results.

Both branches ask the system to do the exact same thing: fetch a byte from
the address indicated by NULL, and write it to the standard output.

#include <stdio.h>
#include <unistd.h>

int main(void)
{
#ifdef USE_STDIO
    if(fwrite(NULL, 1, 1, stdout)==0 || fflush(stdout))
        perror("fwrite from null pointer");
#else
    if(write(STDOUT_FILENO, NULL, 1)<0)
        perror("write from null pointer");
#endif
    return 0;
}

--
Alan Curry

James Kanze

Feb 15, 2010, 7:28:27 PM
On Feb 14, 5:07 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:

> Malcolm McLean <malcolm.mcle...@btinternet.com> writes:
> > On Feb 14, 4:11 pm, James Kanze <james.ka...@gmail.com> wrote:

> >> Dereferencing a null pointer is only undefined behavior if
> >> the code is actually executed. Something like sizeof(
> >> f(*(MyType*)0) ) is perfectly legal, and widely used in
> >> some template idioms (although I can't think of a
> >> reasonable use for it in C).

> > Nulls are dereferenced to produce the offsetof macro hack in C.

> Then I would say that it is not an example of what James was
> talking about. In his C++ example, no null pointer is
> dereferenced.

It's also a case of the implementation (compiler) guaranteeing
the implementation (library) that in this particular case, it
will work. That's an in-house guarantee, that they might not
have made public. (And of course, not all compilers do this.)

> Obviously there is a terminology issue here in that you might
> want to say that sizeof *(int *)0 is a dereference of a null
> pointer because, structurally, it applies * to such a pointer;
> but I would rather reserve the word dereference for an
> /evaluated/ application of * (or [] or ->). I'd go so far as
> to say that any other use is wrong.

It's not a question of what one would "rather". The standard is
very clear that dereferencing a null pointer is something that
happens at execution, not at compile time. And that a sizeof
expression is fully evaluated at compile time.

--
James Kanze

Ersek, Laszlo

Feb 15, 2010, 7:56:50 PM

> It's not a question of what one would "rather". The standard is
> very clear that dereferencing a null pointer is something that
> happens at execution, not at compile time. And that a sizeof
> expression is fully evaluated at compile time.

There might be a shorter path leading there:

C90 6.3.3.4 "The sizeof operator", p2: "The size is determined from the
type of the operand, which is not itself evaluated."

C99 6.5.3.4 "The sizeof operator", p2: "If the type of the operand is a
variable length array type, the operand is evaluated; otherwise, the
operand is not evaluated and the result is an integer constant."

C++98/C++03 5.3.3 "Sizeof", p1: "The operand is either an expression,
which is not evaluated, or a parenthesized type-id."
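
A small sketch of what those clauses mean in practice, assuming no variable length arrays are involved. The function names are made up for the example.

```c
#include <stddef.h>

/* Returns the value of a counter after it has appeared as a sizeof operand. */
int counter_after_sizeof(void)
{
    int counter = 0;
    size_t n = sizeof(counter++);   /* operand is not evaluated: no increment */
    (void)n;
    return counter;
}

/* Applies * to a null pointer, but only inside sizeof. */
size_t size_via_null_pointer(void)
{
    return sizeof *(int *)0;        /* same value as sizeof(int); nothing is read */
}
```

Some compilers warn about the side effect inside sizeof, which is a fair hint that the increment never happens.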

Cheers,
lacos

Ben Bacarisse

unread,
Feb 15, 2010, 8:37:23 PM2/15/10
to
James Kanze <james...@gmail.com> writes:

This is a C/C++ distinction. I'd already forgotten the cross post by
the end of my post. The C standard does not use the term so, for C
alone, a case could be made to use the term in a purely syntactic way
and I was arguing against that.

--
Ben.

Nobody

unread,
Feb 16, 2010, 1:55:32 AM2/16/10
to
On Mon, 15 Feb 2010 21:28:29 +0000, Alan Curry wrote:

> |Deferencing a NULL pointer is undefined behaviour, but, on Linux, the
> |program crashes with SIGSEGV. So, the behaviour of derefencing a NULL
> |pointer is defined to "crash the program with SIGSEGV".
>
> Are you sure?
>
> Compile this with and without -DUSE_STDIO and explain the results.
>
> Both branches ask the system to do the exact same thing: fetch a byte from
> the address indicated by NULL, and write it to the standard output.

Neither branch *dereferences* a null pointer; they just pass it to a
function. It's unknown whether either function ultimately dereferences the
pointer; they may test it for validity first.

In any case, the argument about dereferencing null pointers being
"defined" to crash the program with SIGSEGV is bogus.

First, SIGSEGV isn't defined to "crash" the program; you can install a
handler for SIGSEGV, and even if you don't, not everyone would consider
terminating on a signal to be a "crash" (does abort() "crash" the program?).
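
A hedged sketch of the handler point, assuming a POSIX system. It uses raise() to deliver SIGSEGV rather than a real bad access, because returning from a handler after a genuine fault is itself undefined. The function and variable names are invented for the example.

```c
#include <signal.h>

static volatile sig_atomic_t got_segv = 0;

/* Records that the signal arrived; does nothing else. */
static void on_segv(int sig)
{
    (void)sig;
    got_segv = 1;
}

/* Returns 1 if the handler ran and the program carried on, 0 otherwise. */
int survive_raised_segv(void)
{
    got_segv = 0;
    if (signal(SIGSEGV, on_segv) == SIG_ERR)
        return 0;
    raise(SIGSEGV);                  /* delivered to the handler, no bad access */
    signal(SIGSEGV, SIG_DFL);        /* restore the default action */
    return got_segv == 1;
}
```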

Second, SIGSEGV typically arises from accessing an address to which
nothing is mapped (or which is mapped but without the desired access). But
you can have memory mapped to page zero; 8086 (DOS) emulators frequently
do this.

Finally, reading a null pointer (as that term is defined by C) doesn't
necessarily result in reading address zero, as the compiler can optimise
the access away. In particular, in a context where a pointer has already
been dereferenced, recent versions of gcc will assume that the pointer is
non-null (if it's null, you've invoked UB by dereferencing it, so the
compiler can do whatever it wants, including doing whatever it's supposed
to do when the pointer is non-null).
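
The pattern in the last paragraph can be sketched as follows. This is an illustration of the shape of the code, not a claim about what any particular gcc version emits; the function names are hypothetical.

```c
#include <stddef.h>

/*
 * The dereference comes first, so a compiler is entitled to assume
 * p != NULL from here on and may drop the test below.
 */
int read_then_check(const int *p)
{
    int value = *p;        /* undefined behaviour if p is null */
    if (p == NULL)         /* may be treated as always false */
        return -1;
    return value;
}

/* Checking before the dereference keeps the test meaningful. */
int check_then_read(const int *p)
{
    if (p == NULL)
        return -1;
    return *p;
}
```

Only check_then_read may safely be called with a null pointer.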

tonydee

unread,
Feb 16, 2010, 3:12:53 AM2/16/10
to
On Feb 14, 10:23 pm, pete <pfil...@mindspring.com> wrote:
> Michael Tsang wrote:
> > undefined behaviour
>
> The way that term is used in the standard,
> is to describe programs outside of any context.
>
> The question is,
> "Does the standard place any limitations
> on the behavior of this program?"
> If the answer is "No", then you have undefined behavior.
>
> I think it is simplest to consider
> the behavior of an otherwise correct
> program which executes this statement
>      return (1 / (CHAR_BIT - 7));
> as being implementation defined

The way you're contrasting this with "/ (CHAR_BIT - 9)" suggests you
believe CHAR_BIT >= 8. I've heard rumours of systems where it was 7,
but let's ignore that.

I don't believe that your example is implementation defined vis-à-vis
the Standard.

"1.4 Definitions
[intro.defs]

--implementation-defined behavior: Behavior, for a well-formed program
construct and correct data, that depends on the implementation and
that each implementation shall document."

If your code divides 1 by some positive value, that has a well-defined
meaning and flow of control that is common to all C++ compilers/
environments, though the exact divisor and result may vary. Nothing
here needs to be documented per implementation.

> and the behavior of an otherwise correct
> program which executes this statement
>      return (1 / (CHAR_BIT - 9));
> as being undefined.

Only on a system where CHAR_BIT was equal to 9 would this result in
undefined behaviour. From the Standard:

5.6 Multiplicative operators [expr.mul]

4 ...
If the second operand of / or % is zero the behavior is undefined;
otherwise (a/b)*b + a%b is equal to a. If both operands are
nonnegative then the remainder is nonnegative; if not, the sign of
the remainder is implementation-defined

This applies at run-time. A program doesn't have some static property
of "undefined behaviour" just because some unsupported inputs could
cause undefined behaviour at run-time. That said, given CHAR_BIT may
be constant for a particular version of a compiler on a particular
system, it may be that compiling and running the program in that
environment will always generate undefined behaviour. Taking the code
in isolation from the compilation environment, it's more likely to
provide a negative divisor, triggering the mixed-signs clause above
and hence implementation-defined behaviour. So, there are three
possible run-time outcomes based on static analysis of the source
code: undefined behaviour, implementation defined behaviour, and well-
defined behaviour.

Cheers,
Tony

Nick Keighley

unread,
Feb 16, 2010, 3:24:18 AM2/16/10
to
On 16 Feb, 08:12, tonydee <tony_in_da...@yahoo.co.uk> wrote:

> [this] suggests you
> believe CHAR_BIT >= 8.  I've heard rumours of systems where it was 7 [...]

CHAR_BIT cannot be less than 8 on a standard conforming C (or C++)
implementation

Richard Heathfield

unread,
Feb 16, 2010, 3:27:22 AM2/16/10
to
[I note the cross-post to clc++. My answer is given in a C context,
which may or may not also apply to C++.]

tonydee wrote:
> On Feb 14, 10:23 pm, pete <pfil...@mindspring.com> wrote:

<snip>

>> I think it is simplest to consider
>> the behavior of an otherwise correct
>> program which executes this statement
>> return (1 / (CHAR_BIT - 7));
>> as being implementation defined
>
> The way you're contrasting this with "/ (CHAR_BIT - 9)" suggests you
> believe CHAR_BIT >= 8. I've heard rumours of systems where it was 7,
> but let's ignore that.

Yes, let's, since CHAR_BIT is *required* to be >= 8 on conforming
implementations.

> I don't believe that your example is implementation defined vis-à-vis
> the Standard.

The CHAR_BIT - 7 one is implementation-defined. The implementation
defines it by defining CHAR_BIT (as it is required to do), which is at
least 8 but which can be greater. So the result of the expression 1 /
(CHAR_BIT - 7) will be 1 unless CHAR_BIT exceeds 8, in which case it
will be 0.

When we shift to CHAR_BIT - 9, however, the result can be -1 (for
CHAR_BIT of 8), or 1 (for CHAR_BIT of 10), or 0 (for CHAR_BIT of 11+),
or undefined (for CHAR_BIT of 9). So, if all we can observe is the
source (i.e. we don't know the implementation), the safest observation
we can make is that the code exhibits undefined behaviour and should be
changed.
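
The case analysis above can be modelled with a small helper that takes a hypothetical CHAR_BIT value as a parameter, so every case can be tried on one machine. The function name and its return convention are assumptions made for this sketch.

```c
/*
 * Models 1 / (char_bit - k) for a hypothetical CHAR_BIT value.
 * Returns 0 and stores the quotient in *out when the division is
 * defined; returns -1 when the divisor would be zero.
 */
int one_over(int char_bit, int k, int *out)
{
    int divisor = char_bit - k;
    if (divisor == 0)
        return -1;          /* the real expression would be undefined here */
    *out = 1 / divisor;
    return 0;
}
```

With k set to 7 the helper never reports a zero divisor for any char_bit of 8 or more; with k set to 9 it does so exactly when char_bit is 9.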

<snip>

> If your code divides 1 by some positive value, that has a well-defined
> meaning and flow of control that is common to all C++ compilers/
> environments, though the exact divisor and result may vary. Nothing
> here needs to be documented per implementation.

CHAR_BIT does. Since the result of the expression depends on that value,
the behaviour is effectively implementation-defined.

<snip>

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within

tonydee

unread,
Feb 16, 2010, 4:33:51 AM2/16/10
to
On Feb 16, 5:27 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
> [I note the cross-post to clc++. My answer is given in a C context,
> which may or may not also apply to C++.]
>
> tonydee wrote:
> > On Feb 14, 10:23 pm, pete <pfil...@mindspring.com> wrote:
>
> >> I think it is simplest to consider
> >> the behavior of an otherwise correct
> >> program which executes this statement
> >>      return (1 / (CHAR_BIT - 7));
> >> as being implementation defined
>
> > I don't believe that your example is implementation defined vis-à-vis
> > the Standard.
>
> The CHAR_BIT - 7 one is implementation-defined. The implementation
> defines it by defining CHAR_BIT (as it is required to do), which is at
> least 8 but which can be greater. So the result of the expression 1 /
> (CHAR_BIT - 7) will be 1 unless CHAR_BIT exceeds 8, in which case it
> will be 0.

The _value_ of CHAR_BIT is implementation defined. Programs that
incorporate the value into the division above always receive a result
that's entirely specified - as a function of the inputs - by the
Standard. The division process and behaviour are well defined. I
don't think it's correct or useful to imagine that all behaviour
consequent to something implementation defined is itself
implementation defined.

If we look at what was being said:

> >> I think it is simplest to consider
> >> the behavior of an otherwise correct
> >> program which executes this statement
> >> return (1 / (CHAR_BIT - 7));
> >> as being implementation defined

Surely it is implied that it's the use of CHAR_BIT in the division,
and not CHAR_BIT itself, which might make the expression
implementation defined? I'm saying that in that exact but limited
sense, it's moved past the implementation defined aspect and division
behaviour is well defined.

> When we shift to CHAR_BIT - 9, however, the result can be -1 (for
> CHAR_BIT of 8), or 1 (for CHAR_BIT of 10), or 0 (for CHAR_BIT of 11+),
> or undefined (for CHAR_BIT of 9). So, if all we can observe is the
> source (i.e. we don't know the implementation), the safest observation
> we can make is that the code exhibits undefined behaviour and should be
> changed.

Again, this misses the subtlety I was trying to communicate. "The
code exhibits undefined behaviour" is misleading. It's only exhibited
when it happens. It certainly _can_ exhibit undefined behaviour, but
there are many environments where it will run with well-defined
behaviour. There may even be a compile-time assertion that
CHAR_BIT != 9 somewhere above. Any good program would handle this issue
in a robust fashion, but doing so is not a precondition for avoiding
undefined behaviour when the implementation has CHAR_BIT != 9. It
boils down to a defensive programming consideration.
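
One way to spell such a compile-time check, as a sketch: a preprocessor test that rejects the one problematic value. The function name is invented for the example.

```c
#include <limits.h>

/* Refuse to compile where the divisor below would be zero. */
#if CHAR_BIT == 9
#error "CHAR_BIT is 9: 1 / (CHAR_BIT - 9) would divide by zero"
#endif

int guarded_quotient(void)
{
    return 1 / (CHAR_BIT - 9);   /* divisor is known to be nonzero here */
}
```

On an implementation with CHAR_BIT equal to 8 the function returns -1, since 1 / -1 is exact and involves no rounding.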

> > If your code divides 1 by some positive value, that has a well-defined
> > meaning and flow of control that is common to all C++ compilers/
> > environments, though the exact divisor and result may vary.  Nothing
> > here needs to be documented per implementation.
>
> CHAR_BIT does. Since the result of the expression depends on that value,
> the behaviour is effectively implementation-defined.

(Discussed again above.)

Cheers,
Tony

Richard Heathfield

unread,
Feb 16, 2010, 4:41:31 AM2/16/10
to
tonydee wrote:

<snip>

> I don't think it's correct or useful to imagine that all behaviour
> consequent to something implementation defined is itself
> implementation defined.

It is clear that you have thought this through. I, too, have thought
this through. We have arrived at opposite conclusions. (This is hardly a
unique phenomenon.) And it seems unlikely that either of us will change
our position through debate.

I suggest, then, that we agree to disagree. :-)

Richard Bos

unread,
Mar 1, 2010, 4:15:16 PM3/1/10
to
ric...@cogsci.ed.ac.uk (Richard Tobin) wrote:

I'd much rather that it did both. I can see why you'd want a warning,
but I still want my compiler to optimise away a test which I'd not
realised was superfluous (or perhaps more likely, which is superfluous
on one architecture but not on another).

Richard

Tim Rentsch

unread,
Mar 2, 2010, 5:12:54 PM3/2/10
to
Seebs <usenet...@seebs.net> writes:

> On 2010-02-14, Richard Tobin <ric...@cogsci.ed.ac.uk> wrote:
>> In article <slrnhngdv0.8ao...@guild.seebs.net>,
>> Seebs <usenet...@seebs.net> wrote:
>>>It could be that the loop was written because the programmer wasn't *sure*
>>>it couldn't be null, but the compiler has proven it and thus feels safe
>>>optimizing.
>
>> All the more reason for a warning. Then the programmer can sleep
>> soundly, and perhaps modify the code accordingly. (And modify the
>> comment about it that he no doubt wrote.)
>
> Hmm.
>
> The problem is, this becomes a halting problem case, effectively. It's like
> the warnings for possible use of uninitialized variables.

There is one very important difference -- in one case the
compiler says the code /might/ not work the way you /think/ it
does, and in the other case the compiler says the code /won't/
work the way you said it /should/ work. Any optimizations
predicated on previous undefined behavior fall into the second
category, and warnings for these are "fool proof", because they
happen only when the compiler is doing something it's /sure/ is
dangerous, not when you're doing something that only /might/ be dangerous.

> We *know* that
> those warnings are going to sometimes be wrong, or sometimes be omitted when
> they were appropriate, so the compiler has to accept some risk
> of error.

Good compilers do only one of these for uninitialized variables,
namely, they sometimes warn that variables might be used without
initialization even though they aren't. No decent compiler that
purports to give warnings on uninitialized variable use ever
misses a case when this might happen.

> I
> think this optimization is in the same category -- there's too many boundary
> cases to make it a behavior that people rely on or expect.

I expect if you think about it a little longer you'll reach
a different conclusion.

Tim Rentsch

unread,
Mar 2, 2010, 5:17:19 PM3/2/10
to
ral...@xs4all.nl (Richard Bos) writes:

Definitely - as long as the compiler provides an option to give
the warning, I also want the option to do the optimization.

Seebs

unread,
Mar 2, 2010, 5:26:54 PM3/2/10
to
On 2010-03-02, Tim Rentsch <t...@x-alumni2.alumni.caltech.edu> wrote:
> Good compilers do only one of these for uninitialized variables,
> namely, they sometimes warn that variables might be used without
> initialization even though they aren't. No decent compiler that
> purports to give warnings on uninitialized variable use ever
> misses a case when this might happen.

I do not believe this to be the case.

> I expect if you think about it a little longer you'll reach
> a different conclusion.

Well, actually. The reason I hold my current belief is that I have had
opportunities to discuss "may be used uninitialized" warnings with gcc
developers. And it turns out that, in fact, spurious warnings are
consistently reported as bugs, and that at any given time, gcc is usually
known both to give spurious warnings and to miss some possible uninitialized
uses. In each case, the goal of the developers seems to be to maximize
the chances that gcc is correct.
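
To make the difficulty concrete, here is a sketch of the kind of function where the analysis has to guess. Whether any given compiler version warns about it is not claimed here; the function name is hypothetical.

```c
/*
 * value is assigned on every path that reads it, but proving that
 * means correlating the two tests of flag.
 */
int pick(int flag)
{
    int value;
    if (flag)
        value = 42;
    if (flag)
        return value;   /* never read uninitialized */
    return 0;
}
```

A warning here would be spurious. Replace the second test with an unrelated condition and a missing warning becomes a real miss.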

Now, maybe that makes gcc "not a good compiler", but it certainly makes gcc
the sort of compiler that customers appear to want, which is one which does
its best to minimize errors rather than accepting a larger number of errors
in order to ensure that they're all of the same sort.

Keep in mind that, stupid though it may be, many projects have a standing
policy of building with warnings-as-errors, so a spurious warning can cause
a fair bit of hassle.

Tim Rentsch

unread,
Mar 6, 2010, 8:12:02 AM3/6/10
to
Seebs <usenet...@seebs.net> writes:

> On 2010-03-02, Tim Rentsch <t...@x-alumni2.alumni.caltech.edu> wrote:
>> Good compilers do only one of these for uninitialized variables,
>> namely, they sometimes warn that variables might be used without
>> initialization even though they aren't. No decent compiler that
>> purports to give warnings on uninitialized variable use ever
>> misses a case when this might happen.
>
> I do not believe this to be the case.

Okay, I'll read on...

>> I expect if you think about it a little longer you'll reach
>> a different conclusion.
>
> Well, actually. The reason I hold my current belief is that I have had
> opportunities to discuss "may be used uninitialized" warnings with gcc
> developers.

Interjection: this response is not really on point for what I
was saying (unfortunately hidden because the context snipping was
a bit overzealous), but no matter, I'll try to clear that up
later...

> And it turns out that, in fact, spurious warnings are
> consistently reported as bugs, and that at any given time, gcc is usually
> known both to give spurious warnings and to miss some possible uninitialized
> uses. In each case, the goal of the developers seems to be to maximize
> the chances that gcc is correct.

It seems worth pointing out that this comment is of the form "I
have some private knowledge, not known to the general public, and
about which I'm not going to give any real specifics, that makes
me think I'm right." It's great if made to win an argument.
Not as good if the point is to communicate reasoning and reach
shared understanding.

> Now, maybe that makes gcc "not a good compiler", but it certainly makes gcc
> the sort of compiler that customers appear to want, which is one which does
> its best to minimize errors rather than accepting a larger number of errors
> in order to ensure that they're all of the same sort.

It's not really possible to give a useful response to this,
because there is no knowledge generally publicly available
on which to base a response. Effectively the statement just
stops further communication. Is that what you wanted? Or
were you meaning to do something else?

> Keep in mind that, stupid though it may be, many projects have a standing
> policy of building with warnings-as-errors, so a spurious warning can cause
> a fair bit of hassle.

Unfortunately the discussion got sidetracked onto whether or not
gcc is a decent compiler. Of course no sensible person wants
stupid, bogus, or nearly-information-content-free warnings of the
kind attributed to gcc above (okay that description may be a
little unfair, but it's not completely unfair). However -- and
this is the key point -- that's not what I was talking about; I
tried to make that clear but despite that the discussion got
turned into an assessment of gcc's warning policy.

To get back on track, two things: first, I'm talking about
warnings where "optimizations" are based on changing the expressed
intent by exercising a complete freedom of choice where undefined
behavior is involved; second, warnings about these can be given
exactly because the compiler has available _perfect knowledge_
about whether the condition in question has occurred -- it's not
any kind of heuristic like the case of uninitialized variables.
It's because of this perfect knowledge condition that giving
warnings for this situation is not equivalent to the halting
problem. Therefore a compiler can (and for purposes of the
question under discussion, will) faithfully give exact information
warnings about when such things occur. I think any sensible person
should agree that (an option that enables) getting these warnings
is desirable. (If gcc chooses to provide some sort of related-but-
not-quite-the-same warnings, those might or might not be desirable,
but in any case that's a separate discussion.)

Is my point a little bit clearer now?

Seebs

unread,
Mar 6, 2010, 2:33:48 PM3/6/10
to
On 2010-03-06, Tim Rentsch <t...@x-alumni2.alumni.caltech.edu> wrote:
> It seems worth pointing out that this comment is of the form "I
> have some private knowledge, not known to the general public, and
> about which I'm not going to give any real specifics, that makes
> me think I'm right." It's great if made to win an argument.
> Not as good if the point is to communicate reasoning and reach
> shared understanding.

I was responding to the allegation that I hadn't thought about it enough,
not necessarily to the underlying substance.

I agree that it's not a persuasive argument that my position is correct;
it is offered only as an argument that my position is not unconsidered.

>> Now, maybe that makes gcc "not a good compiler", but it certainly makes gcc
>> the sort of compiler that customers appear to want, which is one which does
>> its best to minimize errors rather than accepting a larger number of errors
>> in order to ensure that they're all of the same sort.

> It's not really possible to give a useful response to this,
> because there is no knowledge generally publicly available
> on which to base a response. Effectively the statement just
> stops further communication. Is that what you wanted? Or
> were you meaning to do something else?

You have a fair point here. I guess, to put it another way: You have
an argument (and it is a sound one) that we ought to prefer a compiler
which gives warnings whenever it's not totally sure, rather than risking
not giving a warning. I personally prefer one which does its best to guess
right, even if that means it may be wrong in either direction. I don't
much rely on the warnings (I hardly ever see them anyway, because I'm a
habitual initializer of variables), but when I do see spurious warnings,
they annoy me a great deal.

> To get back on track, two things: first, I'm talking about
> warnings where "optimizations" are based on changing the expressed
> intent by exercising a complete freedom of choice where undefined
> behavior is involved; second, warnings about these can be given
> exactly because the compiler has available _perfect knowledge_
> about whether the condition in question has occurred -- it's not
> any kind of heuristic like the case of uninitialized variables.

Ahh! I see. I think we were talking about two separate kinds of cases.
One is warnings about possibly-uninitialized values, where I favor gcc's
policy of trying for the most accurate warnings it can, even though this
means it sometimes omits a warning.

I would not object in the least to a warning flag that, say, requests warnings
whenever gcc optimizes a test out because it's concluded that a pointer is
dereferenced, and therefore non-null. I would even think it should probably
be on by default, because honestly, I can't think of a case where I would
intentionally write code subject to such an optimization.

Come to think of it, I think I'll go file that as an enhancement request with
our vendor.

(Actually, I think I can. Imagine a function which has as part of its
contract that its argument is non-null... Oh, but wait, there'd be no test
to optimize in that case. Hmm.)

> Is my point a little bit clearer now?

I think so.

Tim Rentsch

unread,
Mar 22, 2010, 10:28:30 AM3/22/10
to
Seebs <usenet...@seebs.net> writes:

> On 2010-03-06, Tim Rentsch <t...@x-alumni2.alumni.caltech.edu> wrote:
>> It seems worth pointing out that this comment is of the form "I
>> have some private knowledge, not known to the general public, and
>> about which I'm not going to give any real specifics, that makes
>> me think I'm right." It's great if made to win an argument.
>> Not as good if the point is to communicate reasoning and reach
>> shared understanding.
>
> I was responding to the allegation that I hadn't thought about it enough,
> not necessarily to the underlying substance.

Just for the record it wasn't my intention to make such an allegation.
Perhaps a prediction that you would reach a new conclusion upon
consideration of new facts.

> I agree that it's not a persuasive argument that my position is correct;
> it is offered only as an argument that my position is not unconsidered.

My confusion. I never considered your position to be unconsidered;
sorry if it came across otherwise.


>>> Now, maybe that makes gcc "not a good compiler", but it certainly makes gcc
>>> the sort of compiler that customers appear to want, which is one which does
>>> its best to minimize errors rather than accepting a larger number of errors
>>> in order to ensure that they're all of the same sort.
>
>> It's not really possible to give a useful response to this,
>> because there is no knowledge generally publicly available
>> on which to base a response. Effectively the statement just
>> stops further communication. Is that what you wanted? Or
>> were you meaning to do something else?
>
> You have a fair point here. I guess, to put it another way: You have
> an argument (and it is a sound one) that we ought to prefer a compiler
> which gives warnings whenever it's not totally sure, rather than risking
> not giving a warning. I personally prefer one which does its best to guess
> right, even if that means it may be wrong in either direction. I don't
> much rely on the warnings (I hardly ever see them anyway, because I'm a
> habitual initializer of variables), but when I do see spurious warnings,
> they annoy me a great deal.

I have to admit I don't like spurious warnings either. However there
are (at least) two different modes for acting on warning messages, and
I think that difference may be relevant here. One mode I might call
"lint like", where the warnings are taken as informational. The other
mode is (for lack of a better term) "-Werror like", where warnings
cause compilation to fail. I almost always use -Werror (and use lint
only rarely). Of course this means the set of warnings reported must
be at least somewhat selective, since there certainly are warning
conditions that are more about style than substance. I find it helps
to have the -Werror warnings that are reported be conservative (ie,
possibly false positives, but never false negatives), for two reasons.
One, if the compiler has a hard time figuring it out, people sometimes
do also, and even if I can see that a warning isn't strictly necessary
I don't want to assume all my co-workers can also (and also vice
versa). Two, like it or not, when routinely using -Werror to identify
and fail on certain warning conditions, people tend not to think so
much about those particular conditions, expecting the compiler to
catch their oversights. So it seems better to have warning conditions
be "hard" rather than "soft", with the understanding that we're
talking about using -Werror and that the set of warnings being tested
is not everything but a specifically chosen set. So at the end of
day I still grumble to myself about spurious warnings, but I think
it's better for overall quality and productivity to suffer these
once in a while to get the benefits of having hard-edged warning
conditions.


>> To get back on track, two things: first, I'm talking about
>> warnings where "optimizations" are based on changing the expressed
>> intent by exercising a complete freedom of choice where undefined
>> behavior is involved; second, warnings about these can be given
>> exactly because the compiler has available _perfect knowledge_
>> about whether the condition in question has occurred -- it's not
>> any kind of heuristic like the case of uninitialized variables.
>
> Ahh! I see. I think we were talking about two separate kinds of cases.
> One is warnings about possibly-uninitialized values, where I favor gcc's
> policy of trying for the most accurate warnings it can, even though this
> means it sometimes omits a warning.

Right, I was talking about a different situation, and that was
probably the most important point of what I was saying.

> I would not object in the least to a warning flag that, say, requests warnings
> whenever gcc optimizes a test out because it's concluded that a pointer is
> dereferenced, and therefore non-null. I would even think it should probably
> be on by default, because honestly, I can't think of a case where I would
> intentionally write code subject to such an optimization.

I probably could construct such cases if I put my mind to it,
perhaps even "natural" ones (I'm thinking macro calls here),
but I agree, nine times out of ten it's just bad thinking.

> Come to think of it, I think I'll go file that as an enhancement request with
> our vendor.
>
> (Actually, I think I can. Imagine a function which has as part of its
> contract that its argument is non-null... Oh, but wait, there'd be no test
> to optimize in that case. Hmm.)
>
>> Is my point a little bit clearer now?
>
> I think so.

It certainly seems so. As predicted, I thought you
would see my point, once you saw my point. :)

Seebs

unread,
Mar 22, 2010, 12:01:23 PM3/22/10
to
On 2010-03-22, Tim Rentsch <t...@x-alumni2.alumni.caltech.edu> wrote:
> Just for the record it wasn't my intention to make such an allegation.
> Perhaps a prediction that you would reach a new conclusion upon
> consideration of new facts.

Ahh, fair enough. I may or may not. I'm still waffling.

> I have to admit I don't like spurious warnings either. However there
> are (at least) two different modes for acting on warning messages, and
> I think that difference may be relevant here. One mode I might call
> "lint like", where the warnings are taken as informational. The other
> mode is (for lack of a better term) "-Werror like", where warnings
> cause compilation to fail. I almost always use -Werror (and use lint
> only rarely).

In the stuff I have to keep building reliably for our build system, I use
-Werror during development, and then disable it for release. We end up
with a lot of code that has to be compiled by an unpredictable variety of
customer compilers, whereupon it becomes essentially impossible to
avoid SOME version of gcc yielding warnings.

> Of course this means the set of warnings reported must
> be at least somewhat selective, since there certainly are warning
> conditions that are more about style than substance. I find it helps
> to have the -Werror warnings that are reported be conservative (ie,
> possibly false positives, but never false negatives), for two reasons.
> One, if the compiler has a hard time figuring it out, people sometimes
> do also, and even if I can see that a warning isn't strictly necessary
> I don't want to assume all my co-workers can also (and also vice
> versa).

For something like -Wuninitialized (or however they spell it), I am not
at all confident that there exists a way to be certain of never missing
such a case short of printing the message unconditionally.

In the case of something like the "optimize out null pointer tests when
they're irrelevant", it gets fussier. That said, Code Sourcery gave us
a really good example of a reason for which doing that silently might be
desirable.

> Two, like it or not, when routinely using -Werror to identify
> and fail on certain warning conditions, people tend not to think so
> much about those particular conditions, expecting the compiler to
> catch their oversights. So it seems better to have warning conditions
> be "hard" rather than "soft", with the understanding that we're
> talking about using -Werror and that the set of warnings being tested
> is not everything but a specifically chosen set. So at the end of
> day I still grumble to myself about spurious warnings, but I think
> it's better for overall quality and productivity to suffer these
> once in a while to get the benefits of having hard-edged warning
> conditions.

You know, thinking about this, it occurs to me that there's probably a key
difference between your workflow and mine.

I'm usually looking at cases where people have a large pool of open source
software to compile. They either don't want to change it, or are prohibited
by some term in some policy somewhere from changing it. They want it to
compile without hassle. They are not looking to fix the code; they're okay
with taking the risk that something will blow up, in many cases, but they
still want to compile it without failures or spurious warnings.

For an extreme example, we had people who were (for reasons I never quite
understood) committed to using an old version of openssh which relied on
calling function pointers through the wrong interfaces. On PPC, gcc smacked
this behavior down hard... So the ultimate customer requirement was:
* This MUST NOT produce a warning/diagnostic.
* It's fine if the resulting code segfaults instantly.

(Why? Because it was in a segment of code they didn't use.)

>> Ahh! I see. I think we were talking about two separate kinds of cases.
>> One is warnings about possibly-uninitialized values, where I favor gcc's
>> policy of trying for the most accurate warnings it can, even though this
>> means it sometimes omits a warning.

> Right, I was talking about a different situation, and that was
> probably the most important point of what I was saying.

Yup.

>> I would not object in the least to a warning flag that, say, requests warnings
>> whenever gcc optimizes a test out because it's concluded that a pointer is
>> dereferenced, and therefore non-null. I would even think it should probably
>> be on by default, because honestly, I can't think of a case where I would
>> intentionally write code subject to such an optimization.

> I probably could construct such cases if I put my mind to it,
> perhaps even "natural" ones (I'm thinking macro calls here),
> but I agree, nine times out of ten it's just bad thinking.

Macro calls and inlined functions. The latter is the case that I think
makes it particularly persuasive -- you don't want to get warnings for
that.

And I don't think the optimization occurs at a level where it can tell
you whether there's macros or inlined functions involved.

> It certainly seems so. As predicted, I thought you
> would see my point, once you saw my point. :)

The first rule of tautology club is the first rule of tautology club.

Tim Rentsch

Mar 26, 2010, 7:59:46 AM3/26/10
to
Seebs <usenet...@seebs.net> writes:

> On 2010-03-22, Tim Rentsch <t...@x-alumni2.alumni.caltech.edu> wrote:

[snip]


> For something like -Wuninitialized (or however they spell it), I am not
> at all confident that there exists a way to be certain of never missing
> such a case short of printing the message unconditionally.

Clearly there must be. At the very least, any code where the only
(read-)use of a variable directly follows an assignment to the
variable needn't be flagged. This example is ridiculously
trivial but it illustrates the point (at least I hope it does).
More generically, it's relatively easy to determine all code
paths that might lead to each read-use of a variable. If all
read-uses of a variable have an assignment to the variable
along all leading code paths, the variable is always initialized
before use. So surely _some_ cases can be excluded.
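
To make that concrete, here is a minimal sketch in C. The function names
are made up for illustration. The first function is the easy case, where
every path to the read passes through an assignment. The second is also
always initialized before use, but proving it requires noticing that the
two tests are correlated, and a compiler may or may not manage that
depending on version and optimization level.

```c
/* Easy case: both branches assign x before the only read,
   so a path-based analysis can prove x is always initialized. */
int always_set(int flag)
{
    int x;
    if (flag)
        x = 1;
    else
        x = 2;
    return x;
}

/* Harder case: x is read only when flag is nonzero, and it was
   assigned under that same condition. The read is safe, but an
   analysis that ignores the correlation sees a path on which x
   is read without having been assigned. */
int correlated(int flag)
{
    int x;
    if (flag)
        x = 1;
    if (flag)
        return x;
    return 0;
}
```

Cases like the first can be excluded with certainty; cases like the
second are where the choice between a spurious warning and a missed one
comes up.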

> In the case of something like the "optimize out null pointer tests when
> they're irrelevant", it gets fussier. That said, Code Sourcery gave us
> a really good example of a reason for which doing that silently might be
> desirable.

It would be interesting to hear about those.


>> Two, like it or not, when routinely using -Werror to identify
>> and fail on certain warning conditions, people tend not to think so
>> much about those particular conditions, expecting the compiler to
>> catch their oversights. So it seems better to have warning conditions
>> be "hard" rather than "soft", with the understanding that we're
>> talking about using -Werror and that the set of warnings being tested
>> is not everything but a specifically chosen set. So at the end of
>> the day I still grumble to myself about spurious warnings, but I think
>> it's better for overall quality and productivity to suffer these
>> once in a while to get the benefits of having hard-edged warning
>> conditions.
>
> You know, thinking about this, it occurs to me that there's probably a key
> difference between your workflow and mine. [snip elaboration]

Yes, that probably accounts for our different perspectives.


>> It certainly seems so. As predicted, I thought you
>> would see my point, once you saw my point. :)
>
> The first rule of tautology club is the first rule of tautology club.

That's a great cartoon! I should add, though, that my statement
wasn't quite a tautology, its intended meaning resting on two
slightly different interpretations of the phrase "see [or saw] my
point". (I hoped the smiley would be enough of a hint to
express that, but I guess not...)

Seebs

Mar 26, 2010, 12:20:05 PM3/26/10
to
On 2010-03-26, Tim Rentsch <t...@x-alumni2.alumni.caltech.edu> wrote:
> Seebs <usenet...@seebs.net> writes:
>> On 2010-03-22, Tim Rentsch <t...@x-alumni2.alumni.caltech.edu> wrote:
> [snip]
>> For something like -Wuninitialized (or however they spell it), I am not
>> at all confident that there exists a way to be certain of never missing
>> such a case short of printing the message unconditionally.

> Clearly there must be. At the very least, any code where the only
> (read-)use of a variable directly follows an assignment to the
> variable needn't be flagged. This example is ridiculously
> trivial but it illustrates the point (at least I hope it does).

You have a very good point.

>> In the case of something like the "optimize out null pointer tests when
>> they're irrelevant", it gets fussier. That said, Code Sourcery gave us
>> a really good example of a reason for which doing that silently might be
>> desirable.

> It would be interesting to hear about those.

Basically, macros and inline functions.

static inline int fooplusone(int *p)
{
    if (p) { return *p + 1; } else { return 0; }
}

void foo(void)
{
    int i = 0;
    int *ip = &i;
    *ip = 3;
    (void) fooplusone(ip);
}

Obviously, the compiler can optimize out the test.
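
For contrast, here is a sketch of the other kind of case, the one where
a warning would be welcome. The function name is hypothetical. The
pointer is dereferenced first and tested afterwards, so a compiler is
entitled to assume it is non-null by the time it reaches the test and may
drop the test entirely (gcc has -fdelete-null-pointer-checks for this
sort of thing; whether it fires on a given build depends on target and
optimization level).

```c
/* Dereference first, test second. Because *p has already been
   evaluated, a null p would already have been undefined behaviour,
   so the compiler may treat the test below as always false. */
int first_or_zero(int *p)
{
    int v = *p;        /* dereference happens here */
    if (p == 0)        /* test that may be optimized out */
        return 0;
    return v;
}
```

The inlined case above and this one can look the same to the optimizer,
which is the difficulty with warning about one and not the other.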

>> You know, thinking about this, it occurs to me that there's probably a key
>> difference between your workflow and mine. [snip elaboration]

> Yes, that probably accounts for our different perspectives.

And in fact, thinking about it more... I think that, when I'm developing
the bulk of the code, I tend more towards your side of things; I want tons
and tons of warnings and errors. It's just when I am dealing with customers
who just want it to build already that I start wanting more flexibility.

> That's a great cartoon! I should add, though, that my statement
> wasn't quite a tautology, its intended meaning resting on two
> slightly different interpretations of the phrase "see [or saw] my
> point". (I hoped the smiley would be enough of a hint to
> express that, but I guess not...)

Oh, it was, I was just commenting on phrasing.

Tim Rentsch

Mar 26, 2010, 6:59:44 PM3/26/10
to
Seebs <usenet...@seebs.net> writes:

> On 2010-03-26, Tim Rentsch <t...@x-alumni2.alumni.caltech.edu> wrote:
>> Seebs <usenet...@seebs.net> writes:

[snip]


>>> You know, thinking about this, it occurs to me that there's probably a key
>>> difference between your workflow and mine. [snip elaboration]
>
>> Yes, that probably accounts for our different perspectives.
>
> And in fact, thinking about it more... I think that, when I'm developing
> the bulk of the code, I tend more towards your side of things; I want tons
> and tons of warnings and errors. It's just when I am dealing with customers
> who just want it to build already that I start wanting more flexibility.

You might want to consider taking the warning flags (or some of them
anyway) out of the makefile distributed for customer builds.

I have my own gripe about this. I used to use -Wtraditional in
gcc. At the time it complained of constructs that are present in
both K&R C and ANSI/ISO C but have different semantics. At some
point it changed so it (also?) complained about constructs that
are in ANSI/ISO C but _are not_ in K&R C; the most obvious
example is function prototypes. In other words it changed from
being a pretty useful set of warning conditions to a totally
useless torrent of complaints about every function prototype.
Grrr.... And that's not the only time I've been bitten by gcc
deciding to change what a particular warning flag means.
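
As an illustration of the first kind of construct (present in both
dialects, different meaning), here is a small sketch of the well-known
promotion difference. Many K&R compilers used "unsigned preserving"
promotion, while ISO C uses "value preserving" promotion. The function
name is made up, and I'm not claiming any particular gcc version's
-Wtraditional reports this exact line; it is only meant to show what
"same syntax, different semantics" looks like. It assumes int is wider
than unsigned char, which holds on ordinary hosted systems.

```c
/* Under ISO C, small promotes to int, so 1 - 2 is the int -1 and
   the comparison is true. Under unsigned-preserving rules, small
   would promote to unsigned int, the subtraction would wrap to a
   large positive value, and the comparison would be false. */
int promoted_difference_is_negative(void)
{
    unsigned char small = 1;
    return (small - 2) < 0;
}
```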

Seebs

Mar 26, 2010, 7:09:09 PM3/26/10
to
On 2010-03-26, Tim Rentsch <t...@x-alumni2.alumni.caltech.edu> wrote:
> You might want to consider taking the warning flags (or some of them
> anyway) out of the makefile distributed for customer builds.

We have a couple of things like this. There's packages that use -Werror
until we're ready to ship, for instance.

Incidentally, I went through pseudo today with -W -Wextra and cleaned
up the warnings. I'm now using __attribute__((unused)) in a few places,
but I figure that's easy to make go away in the unlikely event that this
ever has cause to be compiled with something that doesn't support it.

(It's still sorta weird to me to be working on a program that is essentially
by definition nonportable.)

Nick

Mar 27, 2010, 4:23:13 AM3/27/10
to
Seebs <usenet...@seebs.net> writes:

> Incidentally, I went through pseudo today with -W -Wextra and cleaned
> up the warnings. I'm now using __attribute__((unused)) in a few places,
> but I figure that's easy to make go away in the unlikely event that this
> ever has cause to be compiled with something that doesn't support it.

I have

#ifdef __GNUC__
#define NEVER_RETURNS __attribute__ ((noreturn))
#define PERHAPS_UNUSED __attribute__((unused))
#else
#define NEVER_RETURNS
#define PERHAPS_UNUSED
#endif

in a "machine specific stuff" header file.
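
A minimal sketch of how these might be used, with the macros repeated so
the fragment stands alone. The functions die() and on_event() are
hypothetical names for illustration, not anything from my real code.

```c
#include <stdio.h>
#include <stdlib.h>

#ifdef __GNUC__
#define NEVER_RETURNS __attribute__ ((noreturn))
#define PERHAPS_UNUSED __attribute__((unused))
#else
#define NEVER_RETURNS
#define PERHAPS_UNUSED
#endif

/* The attribute goes on the declaration; on other compilers the
   macro expands to nothing and this is an ordinary prototype. */
static void die(const char *msg) NEVER_RETURNS;

static void die(const char *msg)
{
    fprintf(stderr, "fatal: %s\n", msg);
    exit(EXIT_FAILURE);
}

/* A callback-style function whose context parameter is required by
   the interface but not used here; the macro silences the
   unused-parameter warning under gcc. */
int on_event(int value, void *context PERHAPS_UNUSED)
{
    if (value < 0)
        die("negative value");
    return value * 2;
}
```

On a compiler that doesn't define __GNUC__, both macros vanish and the
code compiles unchanged, just without the extra diagnostics.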
--
Online waterways route planner | http://canalplan.eu
Plan trips, see photos, check facilities | http://canalplan.org.uk

Seebs

Mar 27, 2010, 3:44:14 PM3/27/10
to
On 2010-03-27, Nick <3-no...@temporary-address.org.uk> wrote:
> I have
>
> #ifdef __GNUC__
> #define NEVER_RETURNS __attribute__ ((noreturn))
> #define PERHAPS_UNUSED __attribute__((unused))
> #else
> #define NEVER_RETURNS
> #define PERHAPS_UNUSED
> #endif
>
> in a "machine specific stuff" header file.

Mind if I steal this?

Keith Thompson

Mar 27, 2010, 9:34:32 PM3/27/10
to
Nick <3-no...@temporary-address.org.uk> writes:
> Seebs <usenet...@seebs.net> writes:
>> Incidentally, I went through pseudo today with -W -Wextra and cleaned
>> up the warnings. I'm now using __attribute__((unused)) in a few places,
>> but I figure that's easy to make go away in the unlikely event that this
>> ever has cause to be compiled with something that doesn't support it.
>
> I have
>
> #ifdef __GNUC__
> #define NEVER_RETURNS __attribute__ ((noreturn))
> #define PERHAPS_UNUSED __attribute__((unused))
> #else
> #define NEVER_RETURNS
> #define PERHAPS_UNUSED
> #endif
>
> in a "machine specific stuff" header file.

You could just do

#ifndef __GNUC__
#define __attribute__(x)
#endif

I think the syntax is specifically designed to allow this. On the
other hand, it's not as flexible if another compiler supports a
different set of __attribute__'s and/or another mechanism for
specifying the same properties.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Richard Bos

Mar 28, 2010, 8:21:28 AM3/28/10
to
Nick <3-no...@temporary-address.org.uk> wrote:

> Seebs <usenet...@seebs.net> writes:
>
> > Incidentally, I went through pseudo today with -W -Wextra and cleaned
> > up the warnings. I'm now using __attribute__((unused)) in a few places,
> > but I figure that's easy to make go away in the unlikely event that this
> > ever has cause to be compiled with something that doesn't support it.
>
> I have
>
> #ifdef __GNUC__
> #define NEVER_RETURNS __attribute__ ((noreturn))
> #define PERHAPS_UNUSED __attribute__((unused))
> #else
> #define NEVER_RETURNS
> #define PERHAPS_UNUSED
> #endif
>
> in a "machine specific stuff" header file.

Why not simply

#ifndef __GNUC__
#define __attribute__(dummy)
#endif

Richard

Nick

Mar 28, 2010, 3:18:48 PM3/28/10
to
Keith Thompson <ks...@mib.org> writes:

> Nick <3-no...@temporary-address.org.uk> writes:
>> Seebs <usenet...@seebs.net> writes:
>>> Incidentally, I went through pseudo today with -W -Wextra and cleaned
>>> up the warnings. I'm now using __attribute__((unused)) in a few places,
>>> but I figure that's easy to make go away in the unlikely event that this
>>> ever has cause to be compiled with something that doesn't support it.
>>
>> I have
>>
>> #ifdef __GNUC__
>> #define NEVER_RETURNS __attribute__ ((noreturn))
>> #define PERHAPS_UNUSED __attribute__((unused))
>> #else
>> #define NEVER_RETURNS
>> #define PERHAPS_UNUSED
>> #endif
>>
>> in a "machine specific stuff" header file.
>
> You could just do
>
> #ifndef __GNUC__
> #define __attribute__(x)
> #endif
>
> I think the syntax is specifically designed to allow this. On the
> other hand, it's not as flexible if another compiler supports a
> different set of __attribute__'s and/or another mechanism for
> specifying the same properties.

You certainly could, and I never thought of that.

I dislike multiple underscores and leading underscores in my code,
though. Hiding the stuff that lives in the implementation namespace in
as small a space as possible feels, in some way I find hard to pin down,
like the "better" way of doing it. So whenever I use something nonstandard,
I tend to vector it through my own names as rapidly as possible.

This file is therefore also the one that assigns one of stricmp and
strcasecmp to caseless_strcmp (and provides the space to write your own
if you have neither).
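
Roughly like this. It is a sketch rather than the actual file, and
HAVE_STRCASECMP and HAVE_STRICMP are hypothetical macros that the build
would have to define. With neither defined you get the hand-written
fallback.

```c
#include <ctype.h>
#include <string.h>

#if defined(HAVE_STRCASECMP)        /* hypothetical build-time macro */
#include <strings.h>                /* POSIX declares strcasecmp here */
#define caseless_strcmp strcasecmp
#elif defined(HAVE_STRICMP)         /* hypothetical build-time macro */
#define caseless_strcmp stricmp
#else
/* Fallback: compare byte by byte after folding to lower case.
   The casts to unsigned char keep tolower() within its defined
   domain for characters with the high bit set. */
int caseless_strcmp(const char *a, const char *b)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;

    while (*p != '\0' && tolower(*p) == tolower(*q)) {
        p++;
        q++;
    }
    return tolower(*p) - tolower(*q);
}
#endif
```

The rest of the code only ever sees caseless_strcmp, so the choice of
underlying function stays in this one header.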

Nick

Mar 28, 2010, 3:19:32 PM3/28/10
to
Seebs <usenet...@seebs.net> writes:

> On 2010-03-27, Nick <3-no...@temporary-address.org.uk> wrote:
>> I have
>>
>> #ifdef __GNUC__
>> #define NEVER_RETURNS __attribute__ ((noreturn))
>> #define PERHAPS_UNUSED __attribute__((unused))
>> #else
>> #define NEVER_RETURNS
>> #define PERHAPS_UNUSED
>> #endif
>>
>> in a "machine specific stuff" header file.
>
> Mind if I steal this?

I've posted it on Usenet - it's there for anyone who wants it.

Nick

Mar 28, 2010, 3:20:51 PM3/28/10
to
ral...@xs4all.nl (Richard Bos) writes:

Well one reason is that someone reading my code who isn't familiar with
GNU extensions will wonder what it's all about. I, perhaps wrongly, feel
my names make it much more obvious what they mean.

Phil Carmody

Mar 30, 2010, 3:46:50 AM3/30/10
to

Because you've just invaded every other implementation's
reserved (tautologically always and for every use) namespace?

Phil
--
I find the easiest thing to do is to k/f myself and just troll away
-- David Melville on r.a.s.f1
