
C Standard Regarding Null Pointer Dereferencing

Shao Miller

Jul 21, 2010, 6:51:23 PM
Hello Readers,

Please respond with the _highest_ levels of pedantry you can muster
up.

This e-mail is in regards to how a C translator/compiler should handle
the expression:

*(char *)0

Consider the following program:

int main(void) {
    (void)*(char *)0;
    return 0;
}

The question is: Does the above program imply undefined behaviour?

References here from the C standard draft with filename 'n1256.pdf'.

Looking at the second line of the program:

(A) "Expression and null statements", 6.8.3, Semantics 2:

"The expression in an expression statement is evaluated as a void
expression for its side effects."

The footnote 134 adds, "Such as assignments, and function calls which
have side effects."

This appears to describe the second line of the program pretty nicely.

(B) "void", 6.3.2.2, point 1:

"The (nonexistent) value of a void expression...shall not be used in
any way... If an expression of any other type is evaluated as a void
expression, its value or designator is discarded. (A void expression
is evaluated for its side effects.)"

Note that this doesn't read "_only_ evaluated for its side effects."
However, (A) doesn't read "_only_", either, but one can get that
impression due to the explicit mentioning of "side effects" in both
(A) and (B).

(C) "Address and indirection operators", 6.5.3.2, Semantics 4:

"...if [the operand] points to an object, the result is an lvalue
designating the object."

(D) "Address and indirection operators", 6.5.3.2, Semantics 4:

"If the operand has type 'pointer to type', the result has type
'type'."

(E) "Address and indirection operators", 6.5.3.2, Semantics 4:

"...If an invalid value has been assigned to the pointer, the
behavior...is undefined."

The footnote 87 adds, "Among the invalid values for dereferencing a
pointer...are a null pointer..." This footnote is referenced from
(E).

(C), (D) and (E) are in regards to the unary '*' operator, and where I
perceive a challenge in interpretation. This operator is followed by
a cast-expression, so such an expression would make up the operand, if
I'm not mistaken. The particular cast-expression in line two of the
program is '(char *)0'. _Is_this_an_assigned_value_?
_Is_"assigned"_meant_there_purposefully_or_not_?

(F) "object", 3.13, point 1:

"region of data storage in the execution environment, the contents of
which can represent values"

(G) "value", 3.17, point 1:

"precise meaning of the contents of an object when interpreted as
having a specific type"

By (G), is '(char *)0' a value? Maybe not by (G), but there are other
parts in the text which read as though expressions can have values,
without needing any objects. The "integer constant expression with
the value 0" in (H) below is such an example. Perhaps it _may_be_ a
value iff _used_ for its value?

(H) "Pointers", 6.3.2.3, points 3 and 4:

"An integer constant expression with the value 0, or such an
expression cast to type void *, is called a null pointer constant. If
a null pointer constant is converted to a pointer type, the resulting
pointer, called a null pointer, is guaranteed to compare unequal to a
pointer to any object or function."

"Conversion of a null pointer to another pointer type yields a null
pointer of that type. Any two null pointers shall compare equal."

By (H), it would appear that '(char *)0' is a pointer and a null
pointer. Also, it cannot point to an object. Thus, this operand does
not point to an object for (C), and we must forget (C)'s application
to our case.

(I) "Cast operators", 6.5.4, Semantics 4:

"Preceding an expression by a parenthesized type name converts the
value of the expression to the named type. ..."

Another example where an expression _has_ a value. But the text reads
"value of the expression" to describe the _use_ of that particular
property of the expression. Similar to "...expression with the value
0" in (H).

The footnote 89 adds, "A cast does not yield an lvalue."

By (I), it would appear that '(char *)0' converts the value of '0' to
a 'char *' type. But is this value _assigned_to_a_pointer_ in (E)?
We do now know that the type for this operand is 'char *' for (D).
Thus the unary '*' operator should yield a result with type 'char', by
(D).

(J) "The sizeof operator", 6.5.3.4, Semantics 2:

"...The size is determined from the type of the operand. ...the
operand is not evaluated."

In this, we see that a particular property "type" for the operand is
used. "...the operand is not evaluated" suggests that there is at
least one case in the C language where an expression can yield a
result with a type while avoiding that expression's evaluation.

But compare (J) with (A) and (B), which do describe evaluation, albeit
with "nonexistent" values. (A) and (B) both mention side effects.

(K) "Program execution", 5.1.2.3, point 2:

"Accessing a volatile object, modifying an object, modifying a file,
or calling a function that does any of those operations are all side
effects, which are changes in the state of the execution environment.
Evaluation of an expression may produce side effects."

From (K), '*(char *)0' does not access a volatile object, nor does it
modify an object (remember that there's no assignment!), nor does it
modify a file, nor call a function doing any of those operations. It
does not appear to have any "side effects" at all. Iff '(char *)0'
can itself be considered an object (beyond being a pointer, a null
pointer, a cast expression, and having type 'char *'), then we _still_
don't have any side effects. For example: Would it be a volatile
object? Are we modifying an object?

If we constrain (A) and (B) to mean "_only_ evaluated for any side
effects", then (K) suggests '*(char *)0' has no side effects. This
constraint is not explicitly in the text, however. One can ponder if
it is meant or not.

Now then, let us please consider how '*(char *)0' evaluates if we take
"...If an invalid value has been assigned to the pointer..." from (E)
_literally_. There is no assignment here. There is conversion of the
value of the expression '0' to a null pointer. Then we are applying
the '*' operator to that null pointer. This
application yields a result with type 'char'. According to (J), this
expression can even be an operand to 'sizeof', since it has a type.
There is no object and there is no value.

Is there undefined behaviour? Perhaps consider it in terms of
variables and constants: In '*(char *)0', everything is constant. In
'*(char *)x', x is variable. Could we suppose that "has been
assigned" from (E) is used there _intentionally_, specifically because
with constants, we have full knowledge at translation-time, but with
variables, we need objects and an execution environment? In other
words, is an implementation _allowed_ to attempt to dereference a null
pointer, knowing 100% full well at translation time that that's what
the expression _looks_ like? With variables, the execution of the
program might or might not dereference a null pointer, and that can
be trapped or not.

Consider the usual idea of '*x' as "object pointed-to by x" versus
splitting the idea into the more esoteric "result having a type,
possibly designating an object, and possibly having a value, depending
on properties of x".

What do you think? Thank you with sincerity for your time,

- Shao Miller

Denis McMahon

Jul 21, 2010, 7:11:18 PM
On 21/07/10 23:51, Shao Miller wrote:

> Please respond with the _highest_ levels of pedantry you can muster
> up.

Do your own homework.

Rgds

Denis McMahon

Richard Heathfield

Jul 21, 2010, 7:42:40 PM
Shao Miller wrote:
> Hello Readers,
>
> Please respond with the _highest_ levels of pedantry you can muster
> up.
>
> This e-mail is in regards to how a C translator/compiler should handle
> the expression:
>
> *(char *)0

Any way it likes.

>
> Consider the following program:
>
> int main(void) {
> (void)*(char *)0;
> return 0;
> }
>
> The question is: Does the above program imply undefined behaviour?

Yes.

>
> References here from the C standard draft with filename 'n1256.pdf'.
>
> Looking at the second line of the program:
>
> (A) "Expression and null statements", 6.8.3, Semantics 2:
>
> "The expression in an expression statement is evaluated as a void
> expression for its side effects."

(void)*(char *)0;

is an expression. Therefore, it is evaluated.

>
> The footnote 134 adds, "Such as assignments, and function calls which
> have side effects."
>
> This appears to describe the second line of the program pretty nicely.
>
> (B) "void", 6.3.2.2, point 1:
>
> "The (nonexistent) value of a void expression...shall not be used in
> any way... If an expression of any other type is evaluated as a void
> expression, its value or designator is discarded. (A void expression
> is evaluated for its side effects.)"

That's a violation of a "shall" outside a constraint - i.e. undefined
behaviour.

<snip>

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within

Shao Miller

Jul 21, 2010, 7:43:55 PM
On Jul 21, 7:11 pm, Denis McMahon <denis.m.f.mcma...@googlemail.co.uk>
wrote:
> Do your own homework.

Wow. I've not been a student since 1995, but never did homework then,
either. This isn't homework in the typical, student-applicable
sense. I'm thinking about this subject matter here at home, yes. If
you are trying to hint at something more and this is not just a wild,
inaccurate guess, please do enlighten.

Keith Thompson

Jul 21, 2010, 8:10:20 PM
Richard Heathfield <r...@see.sig.invalid> writes:

> Shao Miller wrote:
>> Please respond with the _highest_ levels of pedantry you can muster
>> up.
[...]

>> Consider the following program:
>>
>> int main(void) {
>> (void)*(char *)0;
>> return 0;
>> }
>>
>> The question is: Does the above program imply undefined behaviour?
>
> Yes.

I tend to agree (I haven't done the research yet), but ...

[...]

>> (B) "void", 6.3.2.2, point 1:
>>
>> "The (nonexistent) value of a void expression...shall not be used in
>> any way... If an expression of any other type is evaluated as a void
>> expression, its value or designator is discarded. (A void expression
>> is evaluated for its side effects.)"
>
> That's a violation of a "shall" outside a constraint - i.e. undefined
> behaviour.

But it doesn't apply here. The void expression in question is
(void)*(char *)0

The (nonexistent) value of that expression is not used in any way by the
program. It occurs as the expression of an expression-statement, so the
result is discarded. Note that the statement
(void)42;
has well-defined behavior.

Hmm. Off the top of my head, I can't think of any way to violate the
"shall" in 6.3.2.2p1 without also violating some constraint. For
example:

x = (void)42;

violates the constraint specified in 6.5.16.1p1; void isn't one of
the permitted types for the right operand of a simple assignment.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Shao Miller

Jul 21, 2010, 8:33:27 PM
On Jul 21, 7:42 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
>
> Any way it likes.
>
Despite the material I've offered?

>
> Yes.
>
Do you have any useful references to go along with that claim?

>
> (void)*(char *)0;
>
> is an expression. Therefore, it is evaluated.
>

If we take the text literally, we note the absence of "only" in "for
its side effects", as mentioned. Thus I agree that if we take the
text literally, the expression '*(char *)0' is evaluated. I'm not
sure why you mention '(void)*(char *)0;' being evaluated, however.
That doesn't seem relevant to me, since the question of UB surrounds
'*(char *)0', even beyond its now-agreed-upon evaluation for its
context as part of a void expression.

>
> That's a violation of a "shall" outside a constraint - i.e. undefined
> behaviour.
>

What is the violation? If there's no value, there's no value to be
used in any way or not to be used in any way.

Forget about the void expression context altogether and consider:

*(char *)0

Does this expression _have_ a value, given the text for the unary '*'
operator?

Can an implementation _even_attempt_to_get_ a value for this
expression, given that left alone, this expression has a result with a
type, but no mention of how its value can be gotten? A null pointer
does not refer to an object, so how do you get a value here?

Do you _need_ a value for this expression? Is there some kind of
requirement that during expression evaluation, a value is mandated at
some point? Back to the variables and constants perspective: We are
not comparing a value by using this expression, we are not attempting
to assign using this expression, the expression is not an argument to
a function, we are not attempting to use the expression as an lvalue,
etc. What exactly does evaluation entail?

(L) "Program execution", 5.1.2.3, point 3:

"In the abstract machine, all expressions are evaluated as specified
by the semantics. ..."

I do not see where the semantics entail that a value is required for
the expression '*(char *)0' outside of where such a value might be
_used_. For example:

(M) "Simple assignment", 6.5.16.1, Semantics 2:

"In simple assignment (=), the value of the right operand is converted
to the type of the assignment expression and replaces the value stored
in the object designated by the left operand."

Why the use of "the value of the right operand" rather than merely
"the operand"? Because it _demands_ a value. Back to the context of
a void expression, we do not _demand_ a value at all. In fact, any
value would be discarded. If there's no value, there's no value to
even discard.

By (L), does "evaluated" _really_ necessitate "the act of computing a
value" for '*(char *)0'? In the semantics (C), (D), (E), if we take
the text literally (like we did when we agreed above), there is no
assignment. The requirement for type is even satisfied.

Similarly, how about:

*(char *)58

Does that have a value? Does it not depend just the same upon
context? Used in a void expression, no value is required. Evaluate
all you like, you get something with a type, but there's no mention of
a mandatory value, is there? Essentially, I'm approaching it
constructively: The expression evaluation semantics have built us up a
thing for which a type is defined, but no value. Add to that, that
there is no requirement for a value. Is there some confusion between
"expression is evaluated" and "expression is evaluated, thus giving it
a value"?

Thanks anyway.

Richard Heathfield

Jul 21, 2010, 8:37:57 PM
Keith Thompson wrote:
> Richard Heathfield <r...@see.sig.invalid> writes:
>> Shao Miller wrote:
>>> Please respond with the _highest_ levels of pedantry you can muster
>>> up.
> [...]
>>> Consider the following program:
>>>
>>> int main(void) {
>>> (void)*(char *)0;
>>> return 0;
>>> }
>>>
>>> The question is: Does the above program imply undefined behaviour?
>> Yes.
>
> I tend to agree (I haven't done the research yet), but ...
>
> [...]
>
>>> (B) "void", 6.3.2.2, point 1:
>>>
>>> "The (nonexistent) value of a void expression...shall not be used in
>>> any way... If an expression of any other type is evaluated as a void
>>> expression, its value or designator is discarded. (A void expression
>>> is evaluated for its side effects.)"
>> That's a violation of a "shall" outside a constraint - i.e. undefined
>> behaviour.
>
> But it doesn't apply here.

Yes, it does.


> The void expression in question is
> (void)*(char *)0

The undefined behaviour comes from the expression *(char *)0, which is
evaluated. The cast to void is neither here nor there.

> The (nonexistent) value of that expression is not used in any way by the
> program. It occurs as the expression of an expression-statement, so the
> result is discarded. Note that the statement
> (void)42;
> has well-defined behavior.

Sure. But that's because 42 doesn't invoke undefined behaviour when
evaluated.

Richard Heathfield

Jul 21, 2010, 8:44:55 PM
Shao Miller wrote:
> On Jul 21, 7:42 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
>> Any way it likes.
>>
> Despite the material I've offered?

Open and shut case.

>
>> Yes.
>>
> Do you have any useful references to go along with that claim?

You provided all the necessary references yourself.

>
>> (void)*(char *)0;
>>
>> is an expression. Therefore, it is evaluated.
>>
> If we take the text literally, we note the absence of "only" in "for
> its side effects", as mentioned. Thus I agree that if we take the
> text literally, the expression '*(char *)0' is evaluated.

Right.

> I'm not
> sure why you mention '(void)*(char *)0;' being evaluated, however.

<shrug> It's evaluated for its side effects, which are not overly
numerous in this case - in fact, the only side effect is UB.


> That doesn't seem relevant to me, since the question of UB surrounds
> '*(char *)0', even beyond its now-agreed-upon evaluation for its
> context as part of a void expression.
>
>> That's a violation of a "shall" outside a constraint - i.e. undefined
>> behaviour.
>>
> What is the violation? If there's no value, there's no value to be
> used in any way or not to be used in any way.

It isn't the use that matters. It's the evaluation that matters.

char *p;
*p; /* UB, even though *p is not used */

>
> Forget about the void expression context altogether and consider:
>
> *(char *)0
>
> Does this expression _have_ a value, given the text for the unary '*'
> operator?

Syntactically, yes. Semantically, no. Hence, UB.

>
> Can an implementation _even_attempt_to_get_ a value for this
> expression, given that left alone, this expression has a result with a
> type, but no mention of how its value can be gotten? A null pointer
> does not refer to an object, so how do you get a value here?

You are discovering why the behaviour is undefined.


<snip>

> Similarly, how about:
>
> *(char *)58
>
> Does that have a value?

It depends on whether (char *)58 points into an object.


> Does it not depend just the same upon
> context? Used in a void expression, no value is required. Evaluate
> all you like,

And the moment you try, if the pointer doesn't point into an object, the
B is U.

pete

Jul 21, 2010, 9:12:09 PM
Shao Miller wrote:

> *(char *)0


> (C) "Address and indirection operators", 6.5.3.2, Semantics 4:
>
> "...if [the operand] points to an object,
> the result is an lvalue designating the object."


What does the standard say that the result is,
if the operand doesn't point to an object?


> Is there undefined behaviour?

> What do you think?

--
pete

Keith Thompson

Jul 21, 2010, 9:37:11 PM
Richard Heathfield <r...@see.sig.invalid> writes:
> Keith Thompson wrote:
>> Richard Heathfield <r...@see.sig.invalid> writes:
>>> Shao Miller wrote:
>>>> Please respond with the _highest_ levels of pedantry you can muster
>>>> up.
>> [...]
>>>> Consider the following program:
>>>>
>>>> int main(void) {
>>>> (void)*(char *)0;
>>>> return 0;
>>>> }
>>>>
>>>> The question is: Does the above program imply undefined behaviour?
>>> Yes.
>>
>> I tend to agree (I haven't done the research yet), but ...
>>
>> [...]
>>
>>>> (B) "void", 6.3.2.2, point 1:
>>>>
>>>> "The (nonexistent) value of a void expression...shall not be used in
>>>> any way... If an expression of any other type is evaluated as a void
>>>> expression, its value or designator is discarded. (A void expression
>>>> is evaluated for its side effects.)"
>>> That's a violation of a "shall" outside a constraint -
>>> i.e. undefined behaviour.
>>
>> But it doesn't apply here.
>
> Yes, it does.

I believe you're mistaken. (As usual, I'm prepared to be convinced
otherwise.)

>> The void expression in question is
>> (void)*(char *)0
>
> The undefined behaviour comes from the expression *(char *)0, which is
> evaluated. The cast to void is neither here nor there.

I agree. But the "shall" you were referring to above was the one in
6.3.2.2p1, which applies only to void expressions. I don't argue that
the behavior of (void)*(char *)0 is well defined; I argue that it's
not 6.3.2.2p1 that causes it to be undefined.

Actually, I just noticed that there are two "shall"s in that paragraph.
Here's the whole thing:

The (nonexistent) value of a void expression (an expression
that has type void) shall not be used in any way, and implicit
or explicit conversions (except to void) shall not be applied
to such an expression. If an expression of any other type is
evaluated as a void expression, its value or designator is
discarded. (A void expression is evaluated for its side effects.)

Is either of these "shall"s violated by (void)*(char *)0?

If you didn't mean to imply that either of these particular "shall"s
is violated by this particular expression, please re-read what you
wrote above, particularly the line "Yes, it does."

>> The (nonexistent) value of that expression is not used in any way by the
>> program. It occurs as the expression of an expression-statement, so the
>> result is discarded. Note that the statement
>> (void)42;
>> has well-defined behavior.
>
> Sure. But that's because 42 doesn't invoke undefined behaviour when
> evaluated.

Precisely.

Shao Miller

Jul 21, 2010, 9:55:15 PM
On Jul 21, 8:44 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
>
> Open and shut case.
>
Obviously it "feels" like it should be undefined behaviour, since
we've all been trying to avoid the act of a "null pointer dereference"
for a long time. But how many of these deal with _objects_ assigned a
null pointer value, versus an expression which is merely a null
pointer on its own? I believe that the vast majority are in the
former category, possibly biasing an interpretation of the referenced
text. Think of it this way: Your abstract machine plops a null
pointer value into an object 'foo' ("has been assigned to the pointer"
from (E)). The next operation is "fetch the value of the object 'foo'
points to", perhaps because it's about to be used. If 'foo' contains
a null pointer value, the behaviour is undefined. This is what we're
used to. But given a non-variable operand to '*', (E) does not apply,
since there's no assignment. We can rely on (C) only.

>
> You provided all the necessary references yourself.
>

I still appreciate your feedback and would still appreciate if you
could directly address which portions of the referenced text support
your valuable discussion.

>
> Right.
>
I'm glad we can share a frame of reference here, then.

>
> <shrug> It's evaluated for its side effects, which are not overly
> numerous in this case - in fact, the only side effect is UB.
>

But _why_ is it UB? Because of some notion that "we are attempting to
get a value using a null pointer"? Some of my argument goes, "we are
_not_ required to get a value via the '*' unary operator." We
certainly must and do yield a result with a type, as per (D).

>
> It isn't the use that matters. It's the evaluation that matters.
>
> char *p;
> *p; /* UB, even though *p is not used */
>

This is a great example. Why is this UB? My conclusion would be
because of a path from (C). In an attempt to determine "if it points
to an object," we require the value of 'p', which is already UB. Note
the use of 'p' versus the non-use of '*p'. In '*(char *)0', we
already know darned well that '(char *)0' shall not point to an
object, so (C) does not apply. Neither does (E), so we're left with
(D), a type. It might help to use parentheses: *(p) First we need
the value of 'p', which is UB.

>
> Syntactically, yes. Semantically, no. Hence, UB.
>

It syntactically has a value? Where does the syntax describe this?
The syntax says it is a "unary-expression".

How about:

struct {
int x;
} foo, *bar;
bar = &foo;
(*bar).x = 10;

Did we get any value from '*bar' there? Surely we got an object, but
a value? Did we "fetch" the "value" of the object pointed-to by
'bar', simply to toss it away because we are only setting a member?
"Structure and union members", 6.5.2.3 doesn't say anything about
requiring a value for its first operand, does it? It talks about
type. (M) is quite clear about requiring a value.

>
> You are discovering why the behaviour is undefined.
>

Not really.

>
> It depends on whether (char *)58 points into an object.
>

Possibly thanks to (C), yes. There's no "depends" on '0', is there?

>
> And the moment you try, if the pointer doesn't point into an object, the
> B is U.
>

I'd really rather read your reasoning about _why_, based on the
referenced text, than "the B is U."

(N) "undefined behavior", 3.4.3, point 1:

"behavior, upon use of a nonportable or erroneous program construct or
of erroneous data,
for which this International Standard imposes no requirements"

Is there a nonportable program construct somewhere? Is there an
erroneous one? Is there erroneous data? I don't see them. I see us
having satisfied all requirements; '*(char *)0' has a type, as per (D).

Shao Miller

Jul 21, 2010, 9:58:22 PM
On Jul 21, 9:12 pm, pete <pfil...@mindspring.com> wrote:
>
> What does the standard say that the result is,
> if the operand doesn't point to an object?
>
> > Is there undefined behaviour?
> What do you think?
>
I don't know the standard. I can only say that the referenced text (a
standard draft) has (D). It says the result has a type ('char'),
since its operand has type "pointer-to-char". No UB. See the struct
example in another post. We get an object, not a value.

Ben Bacarisse

Jul 21, 2010, 9:58:52 PM
Keith Thompson <ks...@mib.org> writes:
<snip>

> Hmm. Off the top of my head, I can't think of any way to violate the
> "shall" in 6.3.2.2p1 without also violating some constraint. For
> example:
>
> x = (void)42;
>
> violates the constraint specified in 6.5.16.1p1; void isn't one of
> the permitted types for the right operand of a simple assignment.

Hardly important, but I think one such is:

int array[1] = {(void)42};

I think this violates no constraints so 6.3.2.2 p1 is important to
render it undefined.

Initialising a scalar does not provide a simpler example because 6.7.8
p11 imports all of the constraints attached to assignment. Thus

int x = {(void)42};

must be diagnosed.

--
Ben.

Keith Thompson

Jul 21, 2010, 10:42:37 PM

6.7.8p11:
The initializer for a scalar shall be a single expression,
optionally enclosed in braces. The initial value of the object
is that of the expression (after conversion); the same type
constraints and conversions as for simple assignment apply,
taking the type of the scalar to be the unqualified version of
its declared type.

Ah, but it does violate a constraint.

The syntax for "initializer" is:

initializer:
assignment-expression
{ initializer-list }
{ initializer-list , }

where an initializer-list is basically a list of initializers.
So {(void)42} is an initializer (for array), and (void)42 is also an
initializer (for array[0]). Since array[0] is a scalar, (void)42 is
"The initializer for a scalar", and the constraints referred to in
6.7.8p11 apply.

As I studied 6.7.8, I was relieved to discover this; otherwise a
compiler wouldn't be required to diagnose

double arr[] = { "hello" };

and that would be bad.

Tim Rentsch

Jul 21, 2010, 10:55:05 PM
Shao Miller <sha0....@gmail.com> writes:

> Please respond with the _highest_ levels of pedantry you can muster
> up.
>
> This e-mail

You mean newsgroup posting.

> is in regards to how a C translator/compiler should handle
> the expression:
>
> *(char *)0
>
> Consider the following program:
>
> int main(void) {
> (void)*(char *)0;
> return 0;
> }
>
> The question is: Does the above program imply undefined behaviour?

> [snip 165 more lines]

Yes.

Shao Miller

Jul 21, 2010, 11:22:38 PM
On Jul 21, 10:55 pm, Tim Rentsch <t...@alumni.caltech.edu> wrote:
>
> You mean newsgroup posting.
>
Thanks for the correction, Tim.

>
> Yes.
Is this the "_highest_" level "of pedantry you" could "muster up"? If
so, then thanks.

How about the following code:

int main(void) {
struct foo {
int x;
char y[2048];
int z;
};
return (*(struct foo*)0).z;
}

Are we attempting to get any sort of "value" when evaluating the
expression '(*(struct foo*)0)'? We have a type, as (D) suggests.

Have you any thoughts on why the text referenced in (E) reads, "has
been assigned to the pointer", Tim? Is that a mistake like my "e-
mail" above, do you suppose? If not, do you see any assignment of a
null pointer value in this code?

Thanks.

Tim Rentsch

Jul 22, 2010, 12:40:32 AM
Shao Miller <sha0....@gmail.com> writes:

> On Jul 21, 10:55 pm, Tim Rentsch <t...@alumni.caltech.edu> wrote:
>>
>> You mean newsgroup posting.
>>
> Thanks for the correction, Tim.
>
>>
>> Yes.
> Is this the "_highest_" level "of pedantry you" could "muster up"? If
> so, then thanks.

I'd used up all my pedantry capital in the earlier "newsgroup"
statement.


> How about the following code:
>
> int main(void) {
> struct foo {
> int x;
> char y[2048];
> int z;
> };
> return (*(struct foo*)0).z;
> }
>
> Are we attempting to get any sort of "value" when evaluating the
> expression '(*(struct foo*)0)'? We have a type, as (D) suggests.

No, but doing so is undefined behavior, for the same reason
as the '*(char*)0' example.

> Have you any thoughts on why the text referenced in (E) reads, "has

> been assigned to the pointer", Tim? [snip]

I think it's old, carelessly imprecise wording that's
never been revised to correct the imprecision, probably
partly because it means just what it seems to mean to
people who don't think about it too carefully.

christian.bau

Jul 22, 2010, 4:53:22 AM
On Jul 22, 1:37 am, Richard Heathfield <r...@see.sig.invalid> wrote:

> The undefined behaviour comes from the expression *(char *)0, which is
> evaluated. The cast to void is neither here nor there.

First * (char *) 0 is evaluated, giving an lvalue (no undefined
behaviour yet). However, when that expression is used as the argument
in a cast, and also if it were just used as the complete expression in
an expression statement and in some other situations, it is converted
to an rvalue and the undefined behaviour happens at that point. There
is a slight difference in the C++ language, where using it in an
expression statement would not cause undefined behaviour.

For an application programmer, all this doesn't make any difference:
If there is _doubt_ about undefined behaviour or not, then don't do
it. For an optimising compiler, there is no difference because no code
should be generated for (void) * (char *) expr except for the side
effects of expr, whatever expr is. For a non-optimising compiler,
since this _is_ undefined behaviour, the compiler _may_ produce a read
access to the location (char *) 0 which will do whatever it does on
that particular machine. So if a non-optimising compiler doesn't know
to remove read accesses when the data is not used, that is fine.

Richard Heathfield

Jul 22, 2010, 5:01:51 AM
Keith Thompson wrote:
> Richard Heathfield <r...@see.sig.invalid> writes:

<snip>

>
>>> The void expression in question is
>>> (void)*(char *)0
>> The undefined behaviour comes from the expression *(char *)0, which is
>> evaluated. The cast to void is neither here nor there.
>
> I agree.

That is all I was intending to claim. If you are picking a nit with the
way in which I claimed it, I am perfectly happy to cede the point, but
the claim itself (that the expression exhibits undefined behaviour) is
correct.

Richard Heathfield

unread,
Jul 22, 2010, 5:09:53 AM7/22/10
to
Shao Miller wrote:
> On Jul 21, 8:44 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
>> Open and shut case.
>>
> Obviously it "feels" like it should be undefined behaviour, since
> we've all been trying to avoid the act of a "null pointer dereference"
> for a long time.

See 6.5.3.2.

"The unary * operator denotes indirection. If the operand points to a
function, the result is a function designator; if it points to an
object, the result is an lvalue designating the object. If the operand
has type ‘‘pointer to type’’, the result has type ‘‘type’’. If an
invalid value has been assigned to the pointer, the behavior of the
unary * operator is undefined."

NULL is an invalid value - it is guaranteed not to point to any object
or function. (See 6.3.2.3.)

Therefore, using * on a null pointer invokes UB.

Ben Bacarisse

unread,
Jul 22, 2010, 6:25:09 AM7/22/10
to
Keith Thompson <ks...@mib.org> writes:

Obviously I thought that the paragraph about scalars was not to be taken
as applying recursively. Reading your point below, that now seems to be a
daft way to read it, but let me just say why I took it to be so (if only
for a few hours!). First, just after the paragraph about scalars, p12
reads:

12 The rest of this subclause deals with initializers for objects that
have aggregate or union type.

which I took to mean "and the previous clauses don't" for no good reason
other than I seem to have a habit of reading more than is intended into
phrases that are simply informative. Second, p20 starts:

20 If the aggregate or union contains elements or members that are
aggregates or unions, these rules apply recursively to the
subaggregates or contained unions.

as if to suggest that "these rules" are applied recursively but only
down to the level of sub-aggregates not the scalars within them. Again,
this is just reading too much into it. That the rules about aggregates
apply recursively, does not mean that the others about scalars don't.

> As I studied 6.7.8, I was relieved to discover this; otherwise a
> compiler wouldn't be required to diagnose
>
> double arr[] = { "hello" };
>
> and that would be bad.

Yes, and for that reason alone it seems clear (now) that p11 must apply
to all enclosed scalars as much as to the top-level ones.
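
A small sketch of what that reading means in practice, assuming a hypothetical struct point (not from the thread): each scalar inside a brace-enclosed list is initialized as if by simple assignment, so an int initializer is converted to double at any nesting depth, while a string literal for a double element is a constraint violation.

```c
#include <assert.h>

struct point { double x, y; };   /* hypothetical aggregate for illustration */

int nested_scalar_init(void)
{
    /* Each enclosed scalar is converted as if by assignment: int -> double. */
    double arr[] = { 1, 2 };

    /* The same rule applies inside sub-aggregates; pts[1].y is
       implicitly initialized to 0.0. */
    struct point pts[2] = { { 1, 2 }, { 3 } };

    /* double bad[] = { "hello" };   constraint violation: a char * cannot
                                     be assigned to a double, so a
                                     diagnostic is required */

    assert(sizeof arr / sizeof arr[0] == 2);
    return arr[0] == 1.0 && arr[1] == 2.0 &&
           pts[0].y == 2.0 && pts[1].x == 3.0 && pts[1].y == 0.0;
}
```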

--
Ben.

Stargazer

unread,
Jul 22, 2010, 6:30:17 AM7/22/10
to

In the many quotes you forgot the one which (along with 6.3.2.3.[3-4])
is the only one relevant to your inquiry:

6.3.2.3.[5] An integer may be converted to any pointer type. Except as
previously specified, the result is implementation-defined, might not
be correctly aligned, might not point to an entity of the referenced
type, and might be a trap representation.

Without the exception for integer 0 cast to a pointer and the "Except as
previously specified..." clause, *(char*)0 would qualify as undefined
behavior by 6.3.2.3.[5] and 6.5.3.2.[4]. With the addition of the null
pointer specification, (char*)0 is guaranteed NOT to point to an object,
but it's never specified as a constraint or as requiring a specific
action. So it's somewhere between "undefined" and "illegal". Most
implementations seem to choose the "undefined" side, and previous
discussion here, IIRC, ended with a similar suggestion.

Daniel


Stargazer

unread,
Jul 22, 2010, 6:37:00 AM7/22/10
to
On Jul 22, 11:53 am, "christian.bau" <christian....@cbau.wanadoo.co.uk> wrote:
> On Jul 22, 1:37 am, Richard Heathfield <r...@see.sig.invalid> wrote:
>
> > The undefined behaviour comes from the expression *(char *)0, which is
> > evaluated. The cast to void is neither here nor there.
>
> First * (char *) 0 is evaluated, giving an lvalue (no undefined
> behaviour yet).

It is either undefined (consider 6.3.2.3.[5] with 6.5.3.2.[4]) or
illegal (consider 6.3.2.3.[3] that null pointer "is guaranteed to
compare unequal to a pointer to any object or function"). So '*' in
*(char*)0 either references an address which is not defined to be legal,
or dereferences a pointer to something that is necessarily not an
object.

Daniel

Shao Miller

unread,
Jul 22, 2010, 9:01:24 AM7/22/10
to
On Jul 22, 12:40 am, Tim Rentsch <t...@alumni.caltech.edu> wrote:
>
> I think it's old, carelessly imprecise wording that's
> never been revised to correct the imprecision, probably
> partly because it means just what it seems to mean to
> people who don't think about it too carefully.

I'm sorry you seem to be implying that I haven't thought about this
carefully. I'm glad that you think the wording needs to be
addressed. The latter seems like an opinion worth offering.

Shao Miller

unread,
Jul 22, 2010, 9:22:24 AM7/22/10
to
On Jul 22, 5:09 am, Richard Heathfield <r...@see.sig.invalid> wrote:
>
> See 6.5.3.2.
>
> "The unary * operator denotes indirection. If the operand points to a
> function, the result is a function designator; if it points to an
> object, the result is an lvalue designating the object. If the operand
> has type ‘‘pointer to type’’, the result has type ‘‘type’’. If an
> invalid value has been assigned to the pointer, the behavior of the
> unary * operator is undefined."
>
This reference causes me to wonder if you actually read the original
newsgroup posting. For some reason, I get the impression that few
responders actually did. I realize it was a long one, and there's
only so much time in a day. I brought the question up because I
believe that either:
1. The standard needs to be addressed due to an ambiguity, should it
be the case that it has not been already, XOR
2. There is no undefined behaviour

>
> NULL is an invalid value - it is guaranteed not to point to any object
> or function. (See 6.3.2.3.)
>

Yet another item referenced in the original post. Look at your
previous reference then look at this one. Where is a null pointer
value assigned? It's not. Yet the cast expression (the operand)
'(char *)0' _has_ a type, so the result of applying '*' _has_ a type.
That is all that 'sizeof' requires. That is enough for a void
expression. It is enough for the '.' postfix operator. No _value_ is
required in any of those three contexts. It would not be enough for
an assignment or a comparison.
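
The 'sizeof' part of this is easy to check, since the operand of 'sizeof' is not evaluated (unless it has variable length array type). A minimal sketch, assuming a hypothetical helper bump() whose side effect would be visible if the operand were evaluated; it says nothing either way about the void-expression case:

```c
#include <assert.h>
#include <stddef.h>

static int evaluations = 0;      /* would change if bump() were called */
static int storage = 42;

/* Hypothetical helper with a visible side effect. */
int *bump(void)
{
    ++evaluations;
    return &storage;
}

size_t size_without_evaluation(void)
{
    /* Only the type of '*bump()' is needed; bump() is never called. */
    size_t n = sizeof *bump();
    assert(evaluations == 0);
    return n;
}
```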

>
> Therefore, using * on a null pointer invokes UB.
>

Using '*' on a pointer that has been assigned an invalid value is UB.
'(char *)0' is not an lvalue (no cast expression is), hence it cannot
be assigned a value. It is a pointer. It is a null pointer. It is
not a pointer that has been assigned a null pointer value.

I do continue to value your feedback and am hopeful that you or
another responder may pinpoint a definitive reason for UB. So far,
Tim's suggestion that "the wording is imprecise" strikes me as most
likely, iff there really is undefined behaviour.

Tim Rentsch

unread,
Jul 22, 2010, 10:05:47 AM7/22/10
to
Shao Miller <sha0....@gmail.com> writes:

> On Jul 22, 12:40 am, Tim Rentsch <t...@alumni.caltech.edu> wrote:
>>
>> I think it's old, carelessly imprecise wording that's
>> never been revised to correct the imprecision, probably
>> partly because it means just what it seems to mean to
>> people who don't think about it too carefully.
>
> I'm sorry you seem to be implying that I haven't thought about this
> carefully. [snip]

Oh no, I didn't mean that at all. It's obvious you have
thought about it carefully. If you hadn't, your question
wouldn't have come up. You see what I'm saying now?

Shao Miller

unread,
Jul 22, 2010, 10:14:26 AM7/22/10
to
On Jul 22, 6:30 am, Stargazer <stargazer3...@gmail.com> wrote:
>
> In the many quotes you forgot the one which (along with 6.3.2.3.[3-4])
> is the only one relevant to your inquiry:
>
> 6.3.2.3.[5] An integer may be converted to any pointer type. Except as
> previously specified, the result is implementation-defined, might not
> be correctly aligned, might not point to an entity of the referenced
> type, and might be a trap representation.
>
> Without the exception for integer 0 cast to a pointer and the "Except as
> previously specified..." clause, *(char*)0 would qualify as undefined
> behavior by 6.3.2.3.[5] and 6.5.3.2.[4].
>
char *x = (char *)0;
Here we declare a pointer-to-char and initialize it with '0' cast to
pointer-to-char. There's no UB here, right? So how is application of
the unary '*' operator to "the result" of the conversion in
6.3.2.3.[3] (a null pointer, '(char *)0') impacted by these further
references? Section 6.3.2.3 has "had its say" by the time we have the
result '(char *)0'. There's no UB. Then we apply the unary '*'
operator.
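
The part nobody disputes, the conversion and the initialization, can be sketched as follows (the function name and the local 'object' are illustrative only). The sketch deliberately stops before applying the unary '*' operator:

```c
#include <assert.h>
#include <stddef.h>

int null_pointer_checks(void)
{
    char object = 'x';
    char *x = (char *)0;     /* conversion of a null pointer constant */

    assert(x == NULL);       /* null pointers compare equal to each other */
    assert(x != &object);    /* and unequal to a pointer to any object */
    assert(!x);              /* a null pointer compares equal to 0 */

    return x == NULL && x != &object;
}
```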

>
> With the addition of null pointer
> specification, (char*)0 is guaranteed NOT to point to an object, but
> it's never specified as a constraint or as requiring a specific
> action.
>

What is "it"? The expression? The result? The value? What
constraint do you need? What action do you need?

I propose that it is well-defined:
1. The expression details the operation of a cast. The result has a
type, pointer-to-char. An object of that type would have size and
alignment.
2. The result is called "a null pointer"
3. The result can be used as a value in an assignment

>
> So it's somewhere between "undefined" and "illegal". Most
> implementations seem to choose the "undefined" side, and previous
> discussion here, IIRC, ended with a similar suggestion.
>

You appear to be talking about '(char *)0' and '*(char *)0'
simultaneously.

Shao Miller

unread,
Jul 22, 2010, 10:25:18 AM7/22/10
to
On Jul 22, 6:37 am, Stargazer <stargazer3...@gmail.com> wrote:
>
> It is either undefined (consider 6.3.2.3.[5] with 6.5.3.2.[4]) or
> illegal (consider 6.3.2.3.[3] that null pointer "is guaranteed to
> compare unequal to a pointer to any object or function"). So '*' in
> *(char*)0 either references an address
>
I believe that you are confused, here. 6.5.3.2 is "Address and
indirection operators". Is it specified which of '&' and '*' is
"address" and which is "indirection"? Do they have a one-to-one
correspondence? It really doesn't matter at all. _Nowhere_ does the
text for '*' read "reference" nor does it read "address" at all.
Please clarify.

>
> which is not defined to be legal,
> or dereferences a pointer to something that is necessarily not an
> object.
>

The entirety of the 'n1256.pdf' draft doesn't include the word
"dereference," but that's what some of us call it, including me.
However, it might have led to "magical thinking" regarding the '*'
unary operator. It is an operator. It has operands. It doesn't need
to "dereference an object" and it doesn't need to "dereference
something that is necessarily not an object".

But I appreciate your feedback, regardless of that.

Shao Miller

unread,
Jul 22, 2010, 10:39:00 AM7/22/10
to
On Jul 22, 10:05 am, Tim Rentsch <t...@alumni.caltech.edu> wrote:
Thanks, Tim. I apologize for misunderstanding. I also agree that if
UB is intended for the case of:

(void)*(char *)0;

then something, somewhere in the text needs to be modified; 6.5.3.2
would be an easy place for it, should it be the case that the wording
is carelessly imprecise. An alternative might be that the wording is
accurate and intended, allowing for no undefined behaviour in the
original post's question. Any attempted use of the _value_ of the
result of '*(char *)0' is certainly undisputedly undefined behaviour,
since there is no defined means to obtain such a value (it is defined
that there is no object to obtain a value from).

Keith Thompson

unread,
Jul 22, 2010, 12:27:52 PM7/22/10
to
Ben Bacarisse <ben.u...@bsb.me.uk> writes:
> Keith Thompson <ks...@mib.org> writes:
[...]

> Obviously I thought that the paragraph about scalars was not to be taken
> as applying recursively. Reading your point below, that now seems to be a
> daft way to read it, but let me just say why I took it to be so (if only
> for a few hours!). First, just after the paragraph about scalars, p12
> reads:
>
> 12 The rest of this subclause deals with initializers for objects that
> have aggregate or union type.
>
> which I took to mean "and the previous clauses don't" for no good reason
> other than I seem to have a habit of reading more than is intended into
> phrases that are simply informative. Second, p20 starts:
>
> 20 If the aggregate or union contains elements or members that are
> aggregates or unions, these rules apply recursively to the
> subaggregates or contained unions.
>
> as if to suggest that "these rules" are applied recursively but only
> down to the level of sub-aggregates not the scalars within them. Again,
> this is just reading too much into it. That the rules about aggregates
> apply recursively, does not mean that the others about scalars don't.

But it would be nice if that section were clearer. You and I
both took a while to figure out how the wording reflects the (in
retrospect fairly obvious) intent.

>> As I studied 6.7.8, I was relieved to discover this; otherwise a
>> compiler wouldn't be required to diagnose
>>
>> double arr[] = { "hello" };
>>
>> and that would be bad.
>
> Yes, and for that reason alone it seems clear (now) that p11 must apply
> to all enclosed scalars as much as to the top-level ones.

But I'm not entirely comfortable interpreting ambiguous wording by
picking the interpretation that doesn't lead to bad consequences.

I think the "applies recursively" wording should be earlier in that
section.

Ben Bacarisse

unread,
Jul 22, 2010, 1:20:37 PM7/22/10
to
Keith Thompson <ks...@mib.org> writes:

> Ben Bacarisse <ben.u...@bsb.me.uk> writes:
>> Keith Thompson <ks...@mib.org> writes:

<snip>


>>> As I studied 6.7.8, I was relieved to discover this; otherwise a
>>> compiler wouldn't be required to diagnose
>>>
>>> double arr[] = { "hello" };
>>>
>>> and that would be bad.
>>
>> Yes, and for that reason alone it seems clear (now) that p11 must apply
>> to all enclosed scalars as much as to the top-level ones.
>
> But I'm not entirely comfortable interpreting ambiguous wording by
> picking the interpretation that doesn't lead to bad consequences.
>
> I think the "applies recursively" wording should be earlier in that
> section.

Agreed.

--
Ben.

pete

unread,
Jul 23, 2010, 3:29:13 AM7/23/10
to
Shao Miller wrote:
>
> On Jul 21, 9:12 pm, pete <pfil...@mindspring.com> wrote:
> >
> > What does the standard say that the result is,
> > if the operand doesn't point to an object?
> >
> > > Is there undefined behaviour?

> I don't know the standard.

You have undefined behavior because
you have a null pointer as an operand of the indirection operator.

This is the relevant text:

6.5.3.2 Address and indirection operators
Constraints
1 The operand of the unary & operator shall be either
a function designator,
the result of a [] or unary * operator,
or an lvalue that designates an object
that is not a bit-field and is not declared
with the register storage-class specifier.

--
pete

Tim Rentsch

unread,
Jul 23, 2010, 3:43:40 AM7/23/10
to
pete <pfi...@mindspring.com> writes:

Wrong operator. Indirection is '*', not '&'.

pete

unread,
Jul 23, 2010, 3:52:45 AM7/23/10
to
Tim Rentsch wrote:
>
> pete <pfi...@mindspring.com> writes:
>
> > Shao Miller wrote:
> >>
> >> On Jul 21, 9:12 pm, pete <pfil...@mindspring.com> wrote:
> >> >
> >> > What does the standard say that the result is,
> >> > if the operand doesn't point to an object?
> >> >
> >> > > Is there undefined behaviour?
> >
> >> I don't know the standard.
> >
> > You have undefined behavior because
> > you have a null pointer as an operand of the indirection operator.

> Wrong operator. Indirection is '*', not '&'.

I was discussing '*'.

The following is the original post:

This e-mail is in regards to how a C translator/compiler should handle
the expression:

*(char *)0

--
pete

Tim Rentsch

unread,
Jul 23, 2010, 3:59:34 AM7/23/10
to
pete <pfi...@mindspring.com> writes:

I know that, but the paragraph you quoted was talking
about the address operator, not the indirection operator.

Here is the quoted paragraph again:

> This is the relevant text:
>
> 6.5.3.2 Address and indirection operators
> Constraints
> 1 The operand of the unary & operator shall be either
> a function designator,
> the result of a [] or unary * operator,
> or an lvalue that designates an object
> that is not a bit-field and is not declared
> with the register storage-class specifier.

This constraint is not about the indirection operator.
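
For reference, the operand forms that the constraint permits for '&' can be sketched like this (all names are illustrative, not from the thread). Every pointer involved is valid, so the sketch has no bearing on the null pointer question:

```c
#include <assert.h>

int twice(int v) { return 2 * v; }

int address_operand_forms(void)
{
    int a[3] = { 10, 20, 30 };
    int n = 5;
    int *p = &n;

    int (*fp)(int) = &twice;   /* operand is a function designator */
    int *e1 = &a[1];           /* operand is the result of [] */
    int *e2 = &*p;             /* operand is the result of unary * */
    int *e3 = &n;              /* operand is an lvalue designating an object */

    /* register int r = 0;  &r;   would violate the constraint */

    assert(fp(4) == 8);
    return e1 == a + 1 && e2 == p && e3 == p;
}
```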

Shao Miller

unread,
Jul 23, 2010, 6:12:14 AM7/23/10
to
I found the relevant portion of the text of 'n1256.pdf' which renders:

(void)*(char *)0;

to yield undefined behaviour. It is "Cast operators", 6.5.4,
Semantics 4:

"Preceding an expression by a parenthesized type name converts the
value of the expression to the named type."

It is _here_ that we _require_ a _value_ for the _result_ (of the
expression) that is '*(char *)0'. Why? There are two casts here.
The cast to 'char *' is fine. The cast to void is not, since it
doesn't survive without a value for '*(char *)0'. This is before we
consider the whole as a void expression.

There is no definition for the value of the result of '*(char *)0'.

Unfortunately, we are still left with a tricky case; that of:

*(void *)0;

To summarize some points:
1. The value of the expression '0' is cast to 'void *'
2. The result of [1] has type 'void *'.
3. The result of [1] is a pointer.
4. The result of [1] is a null pointer.
5. The result of [1] is a null pointer constant.
6. The result of [1] is a scalar.
7. The result of [1] has a value (as per the cast conversion).
8. The value of the result of [1] is not assigned to a pointer.
9. We apply the unary '*' operator.
10. We give this operator in [9] the expression '(void *)0' as its
operand.
11. The result of [1] is thus the resulting (let's say, "effective")
operand in [9].
12. The effective operand in [9] does not point to a function.
13. The effective operand in [9] does not point to an object, as per
[4].
14. The result of [9] cannot be an lvalue by [13].
15. The effective operand in [9] has type 'void *', as per [2].
16. The result of [9] is defined to have type 'void', due to [15].
17. No invalid value has been assigned to the pointer, as per [8].
18. The result of [9] is not required to possess a value.
19. No undefined behaviour, thus far.
20. The entirety of '*(void *)0' is an expression.
21. The expression in [20] is a void expression.
22. The void expression in [20] has a non-existent value, congruent
with [18].
23. The void expression in [20] is evaluated, including for its side
effects.
24. The expression in [20] does not access a volatile object.
25. The expression in [20] does not modify an object.
26. The expression in [20] does not modify a file.
27. The expression in [20] does not call a function.
28. The expression statement '*(void *)0;' yields no undefined
behaviour and is fully defined as a legal expression statement.

Opinions, thoughts, pedantry?

Thank you for all of the feedback so far. This subject matter is good
to know in case of writing an implementation.

- Shao Miller

Ben Bacarisse

unread,
Jul 23, 2010, 7:23:09 AM7/23/10
to
pete <pfi...@mindspring.com> writes:

> Shao Miller wrote:
>>
>> On Jul 21, 9:12 pm, pete <pfil...@mindspring.com> wrote:
>> >
>> > What does the standard say that the result is,
>> > if the operand doesn't point to an object?
>> >
>> > > Is there undefined behaviour?
>
>> I don't know the standard.
>
> You have undefined behavior because
> you have a null pointer as an operand of the indirection operator.

Whilst I believe that's true in this case it is not (as I am sure you
know) a sufficient condition. For example, both

&*(char *)0

and

sizeof *(char *)0

are explicitly well-defined despite having * applied to a null pointer
operand. This may be the germ of Shao Miller's question. Since a void
expression is evaluated "for its side effects" can the absence of side
effects render the expression in question unevaluated?

For what it's worth, my view is that this wording is there simply to
give permission for an implementation to optimise away such side-effect
free evaluations as the one the OP gave originally. That does not
change the fact that it is undefined.
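
A minimal sketch of those two well-defined cases, assuming a C99 (or later) implementation; the function name is illustrative. In '&*E' neither operator is evaluated (6.5.3.2p3), and the operand of 'sizeof' is not evaluated either:

```c
#include <assert.h>
#include <stddef.h>

int null_operand_exceptions(void)
{
    /* As if both & and * were omitted: the result is just (char *)0. */
    char *p = &*(char *)0;

    /* Only the type of the operand matters; nothing is evaluated. */
    size_t n = sizeof *(char *)0;

    assert(p == NULL);
    return p == NULL && n == 1;   /* sizeof (char) is 1 by definition */
}
```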

<snip>
--
Ben.

Tim Rentsch

unread,
Jul 23, 2010, 12:43:41 PM7/23/10
to
Shao Miller <sha0....@gmail.com> writes:

> I found the relevant portion of the text of 'n1256.pdf' which renders:
>
> (void)*(char *)0;
>
> to yield undefined behaviour. It is "Cast operators", 6.5.4,

> Semantics 4: [snip elaboration]

The expression '*(char *)0' is undefined behavior if it
is evaluated. Any subsequent cast is irrelevant to the
question about whether the behavior is defined.

Shao Miller

unread,
Jul 23, 2010, 12:48:21 PM7/23/10
to
On Jul 23, 7:23 am, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>
>   &*(char *)0
>
> and
>
>   sizeof *(char *)0
>
> are explicitly well-defined despite having * applied to a null pointer
> operand.  This may be the germ of Shao Miller's question.  Since a void
> expression is evaluated "for its side effects" can the absence of side
> effects render the expression in question unevaluated?
>
'Twas the original germ, yes. Then we agreed to take the text
literally and that the expression is evaluated. I still didn't see
any UB, because:

A value for the result of application of the unary '*' is not a
requirement of the text for the unary '*' operator in 'n1256.pdf'.

Consider:

(*p).f();

Where '(*p)' yields an lvalue (assuming 'p' points to an object), thus
satisfying the requirement for an object by the membership '.'
operator. But ponder what "value" was involved for '(*p)' in this
case. The aggregate value of the object pointed to by 'p' in its
entirety? I should hardly think so.

>
> For what it's worth, my view is that this wording is there simply to
> give permission for an implementation to optimise away such side-effect
> free evaluations as the one the OP gave originally.
>

Agreed.

>
> That does not
> change the fact that it is undefined.
>

As per my post in response to the original post, and due to the cast
when casting to '(void)', I now agree. Not due to application of the
unary '*' operator to a null pointer which has not been assigned a
null pointer value. See also the challenge with:

*(void *)0;

Shao Miller

unread,
Jul 23, 2010, 12:55:25 PM7/23/10
to
On Jul 23, 12:43 pm, Tim Rentsch <t...@alumni.caltech.edu> wrote:
>
> The expression '*(char *)0' is undefined behavior if it
> is evaluated.  Any subsequent cast is irrelevant to the
> question about whether the behavior is defined.
>
If and only if you do not take the text for the unary '*' operator
literally. That text describes undefined behaviour when a null
pointer value has been assigned to the pointer. Here we have a null
pointer, not a null pointer value assigned to a pointer.

We agreed that it's possible that that text might be imprecise, and
might need to be addressed, did we not? But it's also possible that
it's precise, and there is no undefined behaviour until casting to
'(void)'.

Would you agree?

Tim Rentsch

unread,
Jul 23, 2010, 1:30:28 PM7/23/10
to
Shao Miller <sha0....@gmail.com> writes:

I don't. The wording could be better, but there is no
doubt about the meaning. The Standard is written in
formal English but it is not a math textbook, and it's
at best a waste of time to read it like one.

If you want to get technical, it can NEVER be the case
that the operand of an indirection operator has been
assigned. In the expression '*p', where p has been
declared to be of some pointer type, the operand 'p'
has already been converted to a value by virtue of
6.3.2.1p2. There is no difference between '*p' and
'*(char*)0' in this regard -- both operate on values,
not objects. So it's completely nonsensical to try to
understand "has been assigned" as applying to one class
of operand expression but not another. They are all
just values.
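
A small sketch of that point, using valid pointers only (the names text, as_function_result and same_value_any_expression are illustrative). Whether the operand is a declared object, a function result or a cast, the '*' operator receives a pointer value in each case:

```c
#include <assert.h>

static char text[] = "Y";

char *as_function_result(void) { return text; }

int same_value_any_expression(void)
{
    char *p = text;                 /* p is an object holding a pointer */

    char a = *p;                    /* operand: value of p after lvalue
                                       conversion (6.3.2.1p2) */
    char b = *as_function_result(); /* operand: a function's return value */
    char c = *(char *)text;         /* operand: the result of a cast */

    assert(a == 'Y');
    return a == b && b == c;
}
```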

Seebs

unread,
Jul 23, 2010, 2:02:35 PM7/23/10
to
On 2010-07-23, Shao Miller <sha0....@gmail.com> wrote:
> We agreed that it's possible that that text might be imprecise, and
> might need to be addressed, did we not? But it's also possible that
> it's precise, and there is no undefined behaviour until casting to
> '(void)'.
>
> Would you agree?

No. It is adequately precise, adequately clear, and the undefined behavior
is unambiguous.

-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / usenet...@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!

Ben Bacarisse

unread,
Jul 23, 2010, 2:31:01 PM7/23/10
to
Shao Miller <sha0....@gmail.com> writes:

> On Jul 23, 7:23 am, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>>
>>   &*(char *)0
>>
>> and
>>
>>   sizeof *(char *)0
>>
>> are explicitly well-defined despite having * applied to a null pointer
>> operand.  This may be the germ of Shao Miller's question.  Since a void
>> expression is evaluated "for its side effects" can the absence of side
>> effects render the expression in question unevaluated?
>>
> 'Twas the original germ, yes. Then we agreed to take the text
> literally and that the expression is evaluated. I still didn't see
> any UB, because:
>
> A value for the result of application of the unary '*' is not a
> requirement of the text for the unary '*' operator in 'n1256.pdf'.

It says "if it [the operand] points to an object the result is...". If
the pointer does not point to an object the behaviour is undefined by
omission. There is a clarifying clause about invalid pointers but it
adds no new meanings, which is fortunate since it uses the clumsy "if an
invalid value has been assigned to the pointer" phrase. So *E is
defined only when E points to a function or when E points to an object.
To which function or object does (char *)0 point?

I don't understand your text about "a value for the result of [the]
application" not being "a requirement of the text" but I don't think I
need to. *E is defined when E points to a function or an object and I
think your example fails both tests.

> Consider:
>
> (*p).f();
>
> Where '(*p)' yields an lvalue (assuming 'p' points to an object), thus
> satisfying the requirement for an object by the membership '.'
> operator. But ponder what "value" was involved for '(*p)' in this
> case. The aggregate value of the object pointed to by 'p' in its
> entirety? I should hardly think so.

Why? I should think exactly that.

<snip>


> As per my post in response to the original post, and due to the cast
> when casting to '(void)', I now agree. Not due to application of the
> unary '*' operator to a null pointer which has not been assigned a
> null pointer value.

Let me get this clear. You are saying that (void)*(char *)0 is UB due
to the cast and, presumably, that *(char *)0 is not because there is no
cast? If so, I won't ask you to rehash the argument -- I'll find it in
my news feed if I want to go look.

As you can see from the above, I disagree.

> See also the challenge with:
>
> *(void *)0;

Also undefined for the same reason.

--
Ben.

Shao Miller

unread,
Jul 23, 2010, 3:59:11 PM7/23/10
to
On Jul 23, 1:30 pm, Tim Rentsch <t...@alumni.caltech.edu> wrote:
>
> I don't.  The wording could be better, but there is no
> doubt about the meaning.
>
After your fine reference to the text below, I'd have to agree.

>
> The Standard is written in
> formal English but it is not a math textbook, and it's
> at best a waste of time to read it like one.
>

I am not aware of anyone who's reading it like a math textbook and I'd
have to agree. It could be worth-while reading its fine detail and
discussing and resolving perceived ambiguities, for the case where one
might be interested in developing a translator for C.

>
> If you want to get technical,
>

Indeed I did.

>
> it can NEVER be the case
> that the operand of an indirection operator has been
> assigned.  In the expression '*p', where p has been
> declared to be of some pointer type, the operand 'p'
> has already been converted to a value by virtue of
> 6.3.2.1p2.  There is no difference between '*p' and
> '*(char*)0' in this regard -- both operate on values,
> not objects.  So it's completely nonsensical to try to
> understand "has been assigned" as applying to one class
> of operand expression but not another.  They are all
> just values.

This is to me an extremely valuable reference to the text of
'n1256.pdf'. I agree that with this reference in mind, it's
nonsensical to treat "If an invalid value has been assigned to the
pointer" as being intended to mean anything other than "If the operand
has an invalid value"... If only the text said "operand." It
doesn't. It says "pointer."

Is there any doubt that the operand has a value? We can assign '(char
*)0' or even '(void *)0' to an object. I don't think there's any
doubt that the operand has a value.

This could potentially be a cause for confusion, since sentences 2 and
3 explicitly use "operand" and "points to" and "has type". The next
sentence could very well mean, "if the value of the operand _is_ an
invalid value..." (Emphasis mine.) It could also mean, "if the value
of the operand was an invalid value assigned to the operand..."

Do you understand why I am asking about all of this? In the execution
environment if we attempt to access an object at an invalid location,
it should be undisputed as undefined behaviour. But expression
evaluation != execution. Evaluation of a constant scalar expression
such as '(char *)0' need not be "executed" at all. That is to say,
the text defines an attempted object access to an invalid location as
undefined behaviour. It could even be trapped by the best
implementation. But evaluation of an expression which is an
application of the unary '*' operator does not necessitate an object
access to any location. If it did, the text should include something
like:

"The result of evaluation of the unary '*' operator shall be the value
of an object pointed to by the operand, if the operand points to an
object."

But that might not be the case. Consider these:

(*p).f();
(*q).x = 10;
*r = 11;
(*s)();

For 'p', 'q' and 'r', if they point to an object, the result is an
lvalue. It's not a "value". There's no need to "fetch" the "value"
during the indirection at all, is there? Thus we only get undefined
behaviour if they _don't_ point to an object, which is a determination
that might only be possible during execution.

For 's', the indirection is intended to result in a function
designator. Not an lvalue. Not a "value".
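
To make the four statements concrete, here is a sketch with valid pointers only. It assumes a hypothetical 'struct foo' with an 'int' member 'x' and a function pointer member 'f' (C has no member functions), and it takes 'q' to be a plain 'struct foo *', so the member access is written '(*q).x':

```c
#include <assert.h>

struct foo {
    int x;
    int (*f)(void);          /* function pointer member, called as (*p).f() */
};

int seven(void) { return 7; }

int indirection_forms(void)
{
    struct foo obj = { 0, seven };
    struct foo *p = &obj, *q = &obj;
    int target = 0;
    int *r = &target;
    int (*s)(void) = seven;

    int from_p = (*p).f();   /* *p is an lvalue; '.' then selects f */
    (*q).x = 10;             /* *q is an lvalue; only member x is written */
    *r = 11;                 /* *r is an lvalue used as assignment target */
    int from_s = (*s)();     /* *s is a function designator, not an lvalue */

    assert(obj.x == 10);
    return from_p == 7 && target == 11 && from_s == 7;
}
```

Nothing here shows what happens with a null operand; it only illustrates which results are lvalues and which is a function designator.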

It is clear that many people have tied evaluation of the unary '*'
operator to "yielding an object, pointed-to by the operand" in their
thinking. But this is not the case.

Also consider a Turing machine implementation with a tape and a head.
In the 'q' example above, if 'q' were assigned the value '(struct foo
*)0', the head might move to position zero, where "read" and "write"
are invalid. No read nor write is attempted. Then the head moves by
the offset of the 'x' member. At last, we attempt a write when we
assign, assuming that reads and writes are valid at that position.
Why should there be undefined behaviour by moving the head to position
0 any more so than to any other location which is invalid for objects
or for which the validity is not guaranteed?
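
For comparison, the "move by the offset of the member" step has a portable counterpart that needs no null pointer at all: the standard 'offsetof' macro from <stddef.h>. A minimal sketch, assuming a hypothetical 'struct foo' layout and a valid object:

```c
#include <assert.h>
#include <stddef.h>

struct foo { int pad; int x; };   /* hypothetical layout for illustration */

int member_via_offset(void)
{
    struct foo obj = { 0, 0 };
    struct foo *q = &obj;

    /* The offset of member x, computed without any pointer value. */
    size_t off = offsetof(struct foo, x);

    /* Step from the start of the object to the member, then write. */
    int *px = (int *)((char *)q + off);
    *px = 10;

    assert(&q->x == px);
    return obj.x;
}
```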

Does anyone understand why "has been assigned" could be important?

char *p;
*p = 'Y';

If the Turing machine's head attempts to move to the location as per
'p', that location might not be a valid location for the head to move
to. Undefined behaviour. But how can you have _an_expression_ with a
_constant_scalar_value_ at _translation_time_ (let alone during
execution) possibly represent an invalid location for the head to move
to?

Keith Thompson

unread,
Jul 23, 2010, 4:00:48 PM7/23/10
to

I certainly agree that the wording in the description of unary "*"
needs to be improved. On the other hand, I think the intent is
reasonably unambiguous.

6.5.3.2p4:

The unary * operator denotes indirection. If the operand points
to a function, the result is a function designator; if it points
to an object, the result is an lvalue designating the object. If
the operand has type ‘‘pointer to type’’, the result has type
‘‘type’’. If an invalid value has been assigned to the pointer,
the behavior of the unary * operator is undefined.

The phrase "has been assigned to" makes sense only for a pointer
*object*, but the operand of "*" needn't be an lvalue; it can be any
arbitrary expression of pointer type.
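
To illustrate, here is a minimal sketch in which the operand of unary
"*" is not an lvalue: once a function call result and once an
arithmetic expression. The names 'storage', 'second' and
'sum_via_rvalues' are made up for this illustration. Nothing "has been
assigned to" either operand, yet each is an ordinary pointer value.

```c
/* Illustrative only: 'storage', 'second' and 'sum_via_rvalues' are
   hypothetical names, not taken from the thread or the standard. */
static int storage[3] = {10, 20, 30};

/* Returns a pointer value; the call expression 'second()' is not an lvalue. */
static int *second(void)
{
    return storage + 1;
}

/* Applies unary '*' to two operands of pointer type that are not lvalues. */
static int sum_via_rvalues(void)
{
    return *second() + *(storage + 2);   /* 20 + 30 */
}
```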

The unqualified word "pointer" very often means "pointer object",
but it can also mean "pointer value"; see, for example, the Standard's
description of the value returned by malloc(). I think the author of
the above paragraph just momentarily failed to make the distinction
properly.

The intent is that applying "*" to an invalid pointer value has
undefined behavior (a null pointer is "invalid" in this context).
I'm sure beyond reasonable doubt that the authors intended this to
apply whether the operand is an lvalue or not. Apart from having
it apply only to lvalues being nonsensical, if that were the intent
it could have been worded much more clearly.

Shao Miller

Jul 23, 2010, 4:12:52 PM
On Jul 23, 2:02 pm, Seebs <usenet-nos...@seebs.net> wrote:
>
> No.  It is adequately precise, adequately clear, and the undefined behavior
> is unambiguous.
>
Thank you for your opinion. I would have also appreciated some
reasoning, in order to help to convince myself of this. The opinion
is appreciated regardless of the lack of reasoning. Poll-wise,
dereferencing a null pointer is undefined behaviour during
evaluation. Reason-wise, it's still incomplete, for me.

Shao Miller

Jul 23, 2010, 4:31:05 PM
On Jul 23, 2:31 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>
> It says "if it [the operand] points to an object the result is...".  If
> the pointer does not point to an object the behaviour is undefined by
> omission.
>
But it also defines the result in terms of its type, based on the type
of the operand. In our situation, this is defined. The omission of
when the pointer does not point to an object only impacts the
definition of the result, _when_ that result is an _lvalue_. It
doesn't apply to functions, for example. The "has type" clause covers
all three situations:
1. The operand points to an object
2. The operand points to a function
3. The operand points to neither, but has a pointer type

>
> There is a clarifying clause about invalid pointers but it
> adds no new meanings, which is fortunate since it uses the clumsy "if an
> invalid value has been assigned to the pointer" phrase.
>

Whole-heartedly agreed as clumsy, if and only if it's not explicitly
there for a good reason.

>
> So *E is
> defined only when E points to a function or when E points to an object.
> To which function or object does (char *)0 point?
>

Its type is defined when the operand has a pointer type. It's
_further_ defined as an lvalue or a function designator, under certain
_additional_ circumstances.

>
> I don't understand your text about "a value for the result of [the]
> application" not being "a requirement of the text" but I don't think I
> need to.  *E is defined when E points to a function or an object and I
> think your example fails both tests.
>

Again, what do you mean by '*E'? Do you mean "the value of '*E'" or
"the type of '*E'" or both or neither or more than both?

>
> Why?  I should think exactly that.
>

So when 'p' points to a structure and we apply '*' to it, you suggest
that evaluation entails the requirement for knowing the value of '*p'
altogether? If so, does that knowledge require fetching the value?

>
> Let me get this clear.  You are saying that (void)*(char *)0 is UB due
> to the cast
>

The cast to 'void', yes, since a cast requires a value.

>
> and, presumably, that *(char *)0 is not because there is no
> cast?
>

There is no cast in evaluation of this expression that requires a
value for '*(char *)0'. One versus two casts, above. Note that the
result of '*(char *)0' has type 'char', and is thus not a void
expression. '*(void *)0' has type 'void', and is thus a void
expression.

>
> If so, I won't ask you to re-hash the argument -- I'll find it in
> my news feed if I want to go look.
>

I apologize for re-hashing it anyway, should that inconvenience you.
Perhaps it'll help someone else.

>
> As you can see from the above, I disagree.
>

That's fine, and your attention has been appreciated. I mean it.

>
> Also undefined for the same reason.
>

Uncertain for me.

Shao Miller

Jul 23, 2010, 4:43:01 PM
On Jul 23, 4:00 pm, Keith Thompson <ks...@mib.org> wrote:
>
> I certainly agree that the wording in the description of unary "*"
> needs to be improved.
>
Agreed. Thank you.

>
> On the other hand, I think the intent is
> reasonably unambiguous.
>

If I had been able to find a discussion concerning this agreed-upon
ambiguity, I would have said "somewhat unambiguous." This is the
first discussion I am aware of, so I would say "all but entirely
unambiguous." :)

>
> The phrase "has been assigned to" makes sense only for a pointer
> *object*, but the operand of "*" needn't be an lvalue; it can be any
> arbitrary expression of pointer type.
>

Agreed. Thanks.

>
> The unqualified word "pointer" very often means "pointer object",
> but it can also mean "pointer value"; see, for example, the Standard's
> description of the value returned by malloc().
>

Agreed. Thanks.

>
> I think the author of
> the above paragraph just momentarily failed to make the distinction
> properly.
>

Possibly. But the use of "operand" in the other sentences, the use of
"value of" in other parts of the text for other operators, does leave
me uncertain.

>
> The intent is that applying "*" to an invalid pointer value has
> undefined behavior (a null pointer is "invalid" in this context).
> I'm sure beyond reasonable doubt that the authors intended this to
> apply whether the operand is an lvalue or not.  Apart from having
> it apply only to lvalues being nonsensical, if that were the intent
> it could have been worded much more clearly.
>

This intent certainly seems likely if we treat the responses to this
discussion as providing evidence for intended meaning to common
interpretation. Thank you for this type of evidence, for what it's
worth.

Seebs

Jul 23, 2010, 5:09:04 PM
On 2010-07-23, Shao Miller <sha0....@gmail.com> wrote:
> Thank you for your opinion. I would have also appreciated some
> reasoning, in order to help to convince myself of this. The opinion
> is appreciated regardless of the lack of reasoning. Poll-wise,
> dereferencing a null pointer is undefined behaviour during
> evaluation. Reason-wise, it's still incomplete, for me.

I really don't think the problem is reasoning, because you've seen tons
of that, and it's had no effect whatsoever. I don't know what the issue
is; this really is pretty straightforward. '(char *) 0' is a pointer, and
it does not point to a valid object, therefore, '*(char *) 0' is undefined
behavior to evaluate. In the abstract machine, expressions are evaluated,
with a few specialized exceptions (such as sizeof). So whether or not you
cast it, or use the value, the expression *is* evaluated, and evaluating the
expression produces undefined behavior.

I honestly can't see what the confusing part is. The expression is
evaluated, and evaluating the expression yields undefined behavior, therefore
undefined behavior occurs. There's nothing tricky or fancy going on.
The only possible problems with the wording are cases in which it's still
clear what is intended, even if the text is poorly-phrased.

Seebs

Jul 23, 2010, 5:10:46 PM
On 2010-07-23, Shao Miller <sha0....@gmail.com> wrote:
> But it also defines the result in terms of its type, based on the type
> of the operand. In our situation, this is defined. The omission of
> when the pointer does not point to an object only impacts the
> definition of the result, _when_ that result is an _lvalue_.

No, it impacts the definition of what happens. There is no definition
given for what happens.

Okay, look at it this way:

*(char *) 0

What is the defined result of evaluating this expression? Show us the
explicit definition of what you get when you evaluate this.

If you can't, it's undefined. Failure-to-define is sufficient to make
behavior undefined; even if you don't think there's an explicit statement that
it's undefined, unless you can provide the definition, it's still undefined.

Shao Miller

Jul 23, 2010, 5:30:47 PM
On Jul 23, 5:10 pm, Seebs <usenet-nos...@seebs.net> wrote:
>
> No, it impacts the definition of what happens.  There is no definition
> given for what happens.
>
> Okay, look at it this way:
>
>         *(char *) 0
>
> What is the defined result of evaluating this expression?  Show us the
> explicit definition of what you get when you evaluate this.
>
> If you can't, it's undefined.  Failure-to-define is sufficient to make
> behavior undefined; even if you don't think there's an explicit statement that
> it's undefined, unless you can provide the definition, it's still undefined.
>
Thank you for this feedback once again, Seebs.

*(char *)0

is an expression defined to evaluate to a result having type 'char'.
The operand for the unary '*' operator has type pointer-to-char, which
is why.

Now then, if the operand points to an object, the result is
furthermore an lvalue.

If the operand points to a function, the result is furthermore a
function designator.

Seems defined quite nicely to me. Do you see any error with this
reasoning? Does the text of 'n1256.pdf' stipulate that the result of
this expression is required to have anything _more_ than a type; that
it must additionally be at least one of an lvalue or a function
designator?

pete

Jul 23, 2010, 5:34:56 PM
Tim Rentsch wrote:
>
> pete <pfi...@mindspring.com> writes:
>
> > Tim Rentsch wrote:

> >> Wrong operator. Indirection is '*', not '&'.
> >
> > I was discussing '*'.

> I know that, but the paragraph you quoted was talking
> about the address operator, not the indirection operator.

Oops!

--
pete

Seebs

Jul 23, 2010, 5:49:40 PM
On 2010-07-23, Shao Miller <sha0....@gmail.com> wrote:
> *(char *)0

> is an expression defined to evaluate to a result having type 'char'.
> The operand for the unary '*' operator has type pointer-to-char, which
> is why.

And what is that result?

> Now then, if the operand points to an object, the result is
> furthermore an lvalue.

It doesn't point to an object. But it does point to an object type.

The unary * operator denotes indirection. If the operand points to
a function, the result is a function designator; if it points to
an object, the result is an lvalue designating the object. If the
operand has type "pointer to type", the result has type "type". If
an invalid value has been assigned to the pointer, the behavior of
the unary * operator is undefined.

There are three ways we can consider this text. All three yield identical
conclusions.

METHOD #1:

"Has been assigned to" is clumsy wording, but it obviously includes any
possible case in which a pointer *has* an invalid value. A null pointer
is by definition invalid, because it doesn't point to an object. The
behavior is undefined.

METHOD #2:

Consider more closely this sentence:
If the operand points to a function, the result is a function
designator; if it points to an object, the result is an
lvalue designating the object.

This offers the sum total of definitions of the behavior of the unary-*
operator. Since the operand does not point to a function or an object, its
behavior is not defined by this sentence. The behavior is undefined.

METHOD #3:

Let's imagine that the "type" argument is meaningful, and that since the
operand has a *type* of a pointer-to-object, the result is "an lvalue
designating the object". Then let's see 6.3.2.1, paragraph 1:

An lvalue is an expression with an object type or an
incomplete type other than void; if an lvalue does not
designate an object when it is evaluated, the behavior is
undefined.

It does not designate an object, we evaluate it, therefore the behavior is
undefined.

What it comes down to is: Dereferencing null pointers yields undefined
behavior. We know this, the standard is adequately clear on it, and running
around ignoring parts of it at random or adding extra significance to
"has been assigned" does not change it. The indirection operator does not
have any defined behavior when applied to something which is not a pointer
to an object. The behavior is undefined. It is up to you whether you prefer
to think of this as being undefined because the lvalue does not generate
an object, or because * is not defined in its behavior when not given a
pointer to an object-or-function; either way, it's undefined.

We could doubtless improve the text with something like "if the pointer
does not point to an object or function, the behavior is undefined", but
that the text could be improved does not mean that there is any ambiguity
here. In some cases, you can assert confidently that the wording is
poor but that the meaning is clear, and this is one of those cases.
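
In practice, the conclusion above is usually handled by testing the
pointer before applying unary '*'. A minimal sketch follows; 'read_or'
is a hypothetical helper name chosen for this example, not anything
from the standard or the thread.

```c
#include <stddef.h>

/* Hypothetical helper: yields '*p' only when 'p' is not a null pointer,
   otherwise yields 'fallback'. The indirection is never evaluated for a
   null pointer, so no undefined behaviour is involved on that path. */
static char read_or(const char *p, char fallback)
{
    return (p != NULL) ? *p : fallback;
}
```

The check covers null pointers only; a pointer to an object whose
lifetime has ended is still the caller's problem.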

Ben Bacarisse

Jul 23, 2010, 7:25:45 PM
Shao Miller <sha0....@gmail.com> writes:

> On Jul 23, 2:31 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>>
>> It says "if it [the operand] points to an object the result is...".  If
>> the pointer does not point to an object the behaviour is undefined by
>> omission.
>>
> But it also defines the result in terms of its type, based on the type
> of the operand. In our situation, this is defined. The omission of
> when the pointer does not point to an object only impacts the
> definition of the result, _when_ that result is an _lvalue_. It
> doesn't apply to functions, for example. The "has type" clause covers
> all three situations:
> 1. The operand points to an object
> 2. The operand points to a function
> 3. The operand points to neither, but has a pointer type

No. You are playing games with the language. Two clauses of one
sentence (separated by ;) talk about the result. A separate sentence
talks about the type. These are not three cases but two attributes of
this form of expression.

C defines the type of many expression forms whether the result is
defined or not. << and >> expressions have the type of the promoted
left operand. In some cases the result (or behaviour) is undefined.
The fact that the type is known does not make all << and >> expressions
defined.
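
A small sketch of the shift case: the type of 'value << count' is
always known, but the result is only defined when 'count' is less than
the width of the promoted left operand. The helper name 'checked_shl'
is made up for this example, and the width computation assumes
'unsigned int' has no padding bits.

```c
#include <limits.h>

/* Hypothetical helper: performs 'value << count' only when the standard
   defines the result. Returns 1 and stores the result on success,
   returns 0 (leaving '*out' untouched) when the shift would be undefined.
   Assumes 'unsigned int' has no padding bits. */
static int checked_shl(unsigned value, unsigned count, unsigned *out)
{
    if (count >= sizeof(unsigned) * CHAR_BIT)
        return 0;               /* shifting here would be undefined */
    *out = value << count;      /* type is 'unsigned' in either case */
    return 1;
}
```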

What, in your opinion, is the result of the expression *(char *)0? If
you can't find words from that standard to explain what the result is
(not just its type) then it is undefined by omission.

>> There is a clarifying clause about invalid pointers but it
>> adds no new meanings, which is fortunate since it uses the clumsy "if an
>> invalid value has been assigned to the pointer" phrase.
>>
> Whole-heartedly agreed as clumsy, if and only if it's not explicitly
> there for a good reason.

It's clear we disagree about that phrase too. Presumably you accept as
valid code like this:

int *ip;
{ int i = 42; ip = &i; }
*ip = 0;

because no invalid pointer has been assigned to ip -- the pointer has
merely become invalid without any assignment.
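
By contrast, here is a sketch of the same indirection performed while
the pointed-to object's lifetime has not ended, which is the case the
standard does define. The function name 'store_through_live_pointer' is
invented for this illustration.

```c
/* Hypothetical example: 'i' is still alive when '*ip' is evaluated,
   so 'ip' points to an object and '*ip' is an lvalue designating it. */
static int store_through_live_pointer(void)
{
    int i = 42;
    int *ip = &i;
    *ip = 0;        /* well defined: writes to 'i' */
    return i;
}
```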

>> So *E is
>> defined only when E points to a function or when E points to an object.
>> To which function or object does (char *)0 point?
>>
> Its type is defined when the operand has a pointer type. It's
> _further_ defined as an lvalue or a function designator, under certain
> _additional_ circumstances.

Yes and, by omission, when neither circumstance applies nothing more can be
inferred about the result (from the standard). This is the definition of
undefined. That the expression has type is not in dispute (my original
intervention using sizeof *(char *)0 relies on the expression having a
type) but having a type does not mean that the expression has a defined result.
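
A minimal sketch of the sizeof point: the operand of sizeof is not
evaluated here (it is not a variable length array), so only the type
of '*(char *)0' matters and no indirection takes place. The function
name 'size_of_deref' is made up for this example.

```c
#include <stddef.h>

/* Hypothetical example: uses only the type of '*(char *)0', which is
   'char'. The operand of sizeof is not evaluated, so no null pointer
   is dereferenced. */
static size_t size_of_deref(void)
{
    return sizeof *(char *)0;   /* same as sizeof(char) */
}
```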

>> I don't understand your text about "a value for the result of [the]
>> application" not being "a requirement of the text" but I don't think I
>> need to.  *E is defined when E points to a function or an object and I
>> think your example fails both tests.
>>
> Again, what do you mean by '*E'? Do you mean "the value of '*E'" or
> "the type of '*E'" or both or neither or more than both?

I mean the expression form (E was a kind of syntax place-holder) is
undefined. It often has a type, but its evaluation has no defined
result.

>> Why?  I should think exactly that.
>>
> So when 'p' points to a structure and we apply '*' to it, you suggest
> that evaluation entails the requirement for knowing the value of '*p'
> altogether? If so, does that knowledge require fetching the value?

The result of *p is the entire object. I can't answer your first
question because I don't know what "knowing the value of *p altogether"
means (the problem word for me is "knowing"). As for the second, all
reasonable compilers will look to see what is actually used from *p
to avoid fetching any more than is needed.

*p; /* probably fetch nothing (volatile objects excepted) */
s = *p; /* probably fetch it all -- at least we know where to put it */
(*p).m; /* probably behaves like p->m (i.e. only m is fetched) */
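
A short sketch of the last line, assuming a made-up 'struct pair': the
two spellings select the same member. How much a compiler actually
fetches is its own business; the sketch only shows that the results
agree.

```c
/* Hypothetical type and helpers, for illustration only. */
struct pair {
    int m;
    int n;
};

/* Selects member 'm' through explicit indirection. */
static int member_via_star(const struct pair *p)
{
    return (*p).m;
}

/* Selects member 'm' through the arrow operator. */
static int member_via_arrow(const struct pair *p)
{
    return p->m;
}
```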

<snip>
--
Ben.

Shao Miller

Jul 23, 2010, 7:31:07 PM
On Jul 23, 5:49 pm, Seebs <usenet-nos...@seebs.net> wrote:
>
> And what is that result?
>
What is "what"? The result is defined to have a type. The result
_is_ a result. It _has_ a type.

>
> It doesn't point to an object.  But it does point to an object type.
>

Agreed for '(char *)0'. Not so for '(void *)0'. I'm with you here,
though.

>
> There are three ways we can consider this text.  All three yield identical
> conclusions.
>
> METHOD #1:
>
> "Has been assigned to" is clumsy wording, but it obviously includes any
> possible case in which a pointer *has* an invalid value.
>

Agreed on clumsy wording, if and only if it's not _intentional_. That
remains to be proven or at the very least, demonstrated. What makes
this obvious to you?

>
> A null pointer
> is by definition invalid,
>

Invalid _what_? A null pointer is a valid value for assignment, is it
not? If you are specifically referring to the context of the
evaluation of the expression '*(char *)0', your claim just above
_requires_ us to accept your previous claim before that; namely, that
"has been assigned" is clumsy and that it includes the case where the
pointer _has_ an invalid value. Furthermore, the non-normative
footnote is the only clue we have as to the possibility of a null
pointer being such an invalid value.

>
> because it doesn't point to an object.  The
> behavior is undefined.
>

You have invented the requirement for the pointer operand to "point to
an object". Consider a function pointer. You have invented
supposedly undefined behaviour. In this Method #1, you have not
convinced me, I'm sorry to say. I was hopeful. It might even be of
no consequence to you whether I've been convinced or not. That's
fine.

Where I might say "invented" below, I additionally mean, "or have not
cited a reference which supports."

>
> METHOD #2:
>
> Consider more closely this sentence:
>         If the operand points to a function, the result is a function
>         designator; if it points to an object, the result is an
>         lvalue designating the object.
>
> This offers the sum total of definitions of the behavior of the unary-*
> operator.

Considered. You have just invented the offering that this sentence is
the sum total of definitions of the behaviour for the operator. If
that were so, we could discard the statement regarding type. In fact,
why is that statement in there at all? If the sentence you reference
just above is the "sum total" you describe, why would there be any
reason to add a further definition for the result's type? I propose
that the evaluation result can thus have a type and not be one of an
lvalue, a function designator. I would be much more convinced of the
validity of this Method #2 if the referenced text either explained
that one of these possibilities is required, or some other portion of
the draft explained that sentences like these have such a requirement
implicitly. This Method #2 does not address the "has been assigned to
the pointer", which would make it even more convincing to me.

>
> Since the operand does not point to a function or an object, its
> behavior is not defined by this sentence.  The behavior is undefined.
>

This claim requires acceptance of the disputed claim above whereby the
sentence is the "sum total" of definitions for the result of
evaluation. Acceptance of that claim means we can discard the
possibility of invalid values being assigned to the pointer. Thus,
the "has been assigned to" sentence could be discarded. So why is it
in there at all, along with the definition of the type?

>
> METHOD #3:
>
> Let's imagine that the "type" argument is meaningful, and that since the
> operand has a *type* of a pointer-to-object, the result is "an lvalue
> designating the object".
>

It does _not_ have type pointer-to-object. The expression '(char *)0'
has type pointer-to-char. 'char' can certainly _be_ the type _for_ an
object. '(char *)0' certainly _isn't_ a pointer-to-object, as it is a
null pointer, "guaranteed to compare unequal to a pointer to any
object or function" according to 6.3.2.3, point 3. The result is thus
_not_ "an lvalue designating the object."
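
The 6.3.2.3 guarantee cited above can be sketched as follows.
'is_null_char_pointer' is a hypothetical name; the comparison itself is
well defined and involves no indirection.

```c
#include <stddef.h>

/* Hypothetical helper: compares a pointer value against the null
   pointer of type 'char *'. A pointer to any object compares unequal
   to it, so this returns 0 for the address of an object. */
static int is_null_char_pointer(const char *p)
{
    return p == (char *)0;
}
```

Note that a result of 0 only says the pointer is not null; it does not
by itself show that the pointer points to a live object.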

>
> Then let's see 6.3.2.1, paragraph 1:
>
>         An lvalue is an expression with an object type or an
>         incomplete type other than void; if an lvalue does not
>         designate an object when it is evaluated, the behavior is
>         undefined.
>
> It does not designate an object, we evaluate it, therefore the behavior is
> undefined.
>

This claim requires acceptance of the disputed claim above that the
result must be an lvalue because the type of the operand is pointer-to-
char. If the result is not an lvalue, evaluation of the result is not
evaluation of an lvalue and the above reference does not apply.

>
> What it comes down to is:  Dereferencing null pointers yields undefined
> behavior.  We know this, the standard is adequately clear on it,
>

We _don't_ know this. Some of us, possibly the majority, _believe_
it. According to some of your arguments, the standard is adequately
clear on it. Why then does Method #1 detail "clumsy wording"? That
would appear to make it inadequately clear. Do we discard Method #1
or do you instead mean, "so far, everyone I've ever known to interpret
null pointer indirection shares the interpretation I have."

>
> and running
> around ignoring parts of it at random
>

Which parts have been ignored? Your 6.3.2.1, point 1 has been
considered and responded to. If I have ignored a cited reference in
another responder's response, then I would be glad of it being brought
to my attention, so I can settle this matter to rest once and for all,
and forget about all of the opinions, inventions, and what appear to
be some plain-and-simple "No, you can't [but I cannot seem to put my
finger on exactly why]" responses that I am interpreting.

I'm not running around. I'm not dead-set in thinking that there
_is_no_ undefined behaviour. I found that there _is_ undefined
behaviour for evaluation of the expression '(void)*(char *)0;'. Nobody
has shown it (UB for '*(char *)0') yet to a satisfactorily reasonable
degree without invoking "I don't believe the intended meaning is
congruent with your interpretation; the wording could be improved." I
have provided a few arguments regarding _no_need_ for the evaluated
result to imply undefined behaviour; only the remote possibility that
implementations with a desire to conform might need to be aware of or
revisit a couple of scenarios and treat the behaviour as defined
rather than undefined. I was hopeful that "obvious" meant that there
was a simple sequence of reasoning to follow based on the text of the
draft or of a standard of C. This hope remains, with the kind
intentions and assistance of responders such as yourself.

>
> or adding extra significance to
> "has been assigned" does not change it.
>

This is backwards. I am _not_deducting_ significance from "has been
assigned." You and other kind responders have been deducting. My
arguments treat the significance literally, do they not?

>
> The indirection operator does not
> have any defined behavior when applied to something which is not a pointer
> to an object.
>

Except that pointers to functions are defined, as well as with the
defined behaviour of having a result with a type. Thus this claim is
false. The claim does, however, certainly re-emphasize the
commonality of this argument throughout this thread.

>
> The behavior is undefined.  It is up to you whether you prefer
> to think of this as being undefined because the lvalue does not generate
> an object, or because * is not defined in its behavior when not given a
> pointer to an object-or-function; either way, it's undefined.
>

You require one of the previous claims to be true here. You suggest
that the definition of the result's type makes for an incomplete
definition for the result. I shall continue to choose "looks as
though it's defined," instead. This might be of no consequence to
you, but I do appreciate your attempts to provide evidence.

>
> We could doubtless improve the text with something like "if the pointer
> does not point to an object or function, the behavior is undefined", but
> that the text could be improved does not mean that there is any ambiguity
> here.
>

You have brought to light two points of ambiguity:
1. The semantic point including "has been assigned to" may only be
intended to mean "if the value of the pointer is an invalid value."
In which case, there still is no normative definition for what
constitutes an invalid value, though we have a hint from the footnote.
2. The semantic point's sentence regarding the result's type should
not be specified independently from the other two definitions of
_possible_ properties of the result, given certain circumstances.

>
> In some cases, you can assert confidently that the wording is
> poor but that the meaning is clear, and this is one of those cases.
>

At this time, I cannot. You and some others have.

Thank you so much!

Ben Bacarisse

Jul 23, 2010, 7:50:15 PM
Shao Miller <sha0....@gmail.com> writes:

> On Jul 23, 5:49 pm, Seebs <usenet-nos...@seebs.net> wrote:
>>
>> And what is that result?
>>
> What is "what"? The result is defined to have a type. The result
> _is_ a result. It _has_ a type.

Oh, be serious!

6.5 Expressions

1 An expression is a sequence of operators and operands that specifies
computation of a value, or that designates an object or a function,
or that generates side effects, or that performs a combination
thereof.

What about the expression *(char *)0? Does it generate side-effects?
No. Does it designate an object or a function? No. It must therefore
specify the computation of a value. We are all certain of the type of
that value (char) but we don't yet know which of the many chars it is.

By the way, I am happy just to agree to disagree about this. I'll
continue to write C avoiding constructs like *(char *)0 and so will you!

--
Ben.

Seebs

Jul 23, 2010, 8:17:34 PM
On 2010-07-23, Shao Miller <sha0....@gmail.com> wrote:
> On Jul 23, 5:49 pm, Seebs <usenet-nos...@seebs.net> wrote:

>> And what is that result?

> What is "what"? The result is defined to have a type. The result
> _is_ a result. It _has_ a type.

But unless we know what the result is -- not just its type -- it is
NOT DEFINED.

> Agreed for '(char *)0'. Not so for '(void *)0'. I'm with you here,
> though.

Yes, so "*(void *) 0" is a constraint violation.

> Agreed on clumsy wording, if and only if it's not _intentional_. That
> remains to be proven or at the very least, demonstrated. What makes
> this obvious to you?

Complete consistency across dozens of implementations and the last thirty
years of writing about C, reading about C, programming in C, and looking
at the code to implementations. I was on the committee. We have all, always,
agreed absolutely that dereferencing null pointers is clearly and
unambiguously undefined behavior.

>> A null pointer
>> is by definition invalid,

> Invalid _what_?

An invalid pointer -- as opposed to one which points to an object.

> A null pointer is a valid value for assignment, is it
> not? If you are specifically referring to the context of the
> evaluation of the expression '*(char *)0', your claim just above
> _requires_ us to accept your previous claim before that; namely, that
> "has been assigned" is clumsy and that it includes the case where the
> pointer _has_ an invalid value. Furthermore, the non-normative
> footnote is the only clue we have as to the possibility of a null
> pointer being such an invalid value.

No, it doesn't. It only requires that we understand that the standard
is consistent about referring to a pointer which does not definitely point
to an object as "invalid". (Invalid includes both null pointers and pointers
to objects whose lifetime is over.)

>> because it doesn't point to an object.  The
>> behavior is undefined.

> You have invented the requirement for the pointer operand to "point to
> an object".

No, I haven't.

I have observed that there is no definition provided for the behavior of
indirection through a pointer which does not point to either an object or
a function.

> It might even be of
> no consequence to you whether I've been convinced or not. That's
> fine.

At this point, I see nothing in any of your posts to suggest that you are
even sincere; you are acting precisely like a troll.

> Where I might say "invented" below, I additionally mean, "or have not
> cited a reference which supports."

Because obviously, accusing someone of lying (which is what "invented" means
in this context) is the clearest way to communicate that you didn't see or
understand a citation.

> Considered. You have just invented the offering that this sentence is
> the sum total of definitions of the behaviour for the operator.

No, I haven't.

First off, "invented" means "newly created", and that is precisely equivalent
to accusing me of lying about what the standard says. I see no real reason
to continue arguing with you at this point.

Secondly, that is the ENTIRE POINT of a standard. It is the sum total of
the definition. You have not offered or cited or suggested or hinted at
any explanation of what value should be yielded by dereferencing a null
pointer, *because there is none*. That means it's undefined. If it were
defined, we would know not only its type, but its value.

> If that were so, we could discard the statement regarding type.

No, we couldn't, because the value and the type are two different things,
and it is in some cases possible to dispute which type an expression would
have even knowing its value.

> It does _not_ have type pointer-to-object. The expression '(char *)0'
> has type pointer-to-char.

Yes, and since char is an object type, pointer-to-char is a pointer-to-object
type, as opposed to a pointer to an incomplete type or a pointer to a function
type.

> 'char' can certainly _be_ the type _for_ an
> object. '(char *)0' certainly _isn't_ a pointer-to-object, as it is a
> null pointer, "guaranteed to compare unequal to a pointer to any
> object or function" according to 6.3.2.3, point 3. The result is thus
> _not_ "an lvalue designating the object."

And that means that the standard does not define the results of the
indirection.

What is not defined is undefined. QED.

And because you are either an idiot or an unbelievable jerk, I'm plonking
you. You can't possibly be this stupid, and there is no way I'm putting
up with your continued totally unsupported accusations that other people are
lying. The only way that could be unintentional would be if your English
skills were weak enough that it is necessarily trolling for you to be
arguing with people about what they think sentences in English mean.

Either you should stop disputing people's interpretations of English, or you
know it well enough that the accusations of dishonesty are clearly
intentional. Either way, we're done, and I sincerely hope I never have to
see anything you have to say again. Go away. You do not have an attitude
conducive to learning about C, or any other language, and with the way you've
treated people, I see no reason to believe you will ever acquire one.

Keith Thompson

Jul 23, 2010, 8:17:35 PM
Shao Miller <sha0....@gmail.com> writes:
> On Jul 23, 5:49 pm, Seebs <usenet-nos...@seebs.net> wrote:
>>
>> And what is that result?
>>
> What is "what"? The result is defined to have a type. The result
> _is_ a result. It _has_ a type.

And a value. What is the value of the result? Nothing in the standard
defines what that value is, therefore the value is undefined.

[...]

>> "Has been assigned to" is clumsy wording, but it obviously includes any
>> possible case in which a pointer *has* an invalid value.
>>
> Agreed on clumsy wording, if and only if it's not _intentional_. That
> remains to be proven or at the very least, demonstrated. What makes
> this obvious to you?

How could it be intentional? If it's intentional, it doesn't make any
sense.

Once again, here's the passage in question, 6.5.3.2p4:

The unary * operator denotes indirection. If the operand points
to a function, the result is a function designator; if it points
to an object, the result is an lvalue designating the object. If
the operand has type ‘‘pointer to type’’, the result has
type ‘‘type’’. If an invalid value has been assigned to
the pointer, the behavior of the unary * operator is undefined.

It is not possible to assign a value, invalid or otherwise, to "the
pointer" unless "the pointer" is a pointer object. The context does not
imply the existence of any pointer object to which the phrase "the
pointer" could refer. Even if the operand happens to be an lvalue,
it is no longer an lvalue (and thus no longer designates an object)
before the "*" operator is applied to it. 6.3.2.1p2:

Except when it is the operand of [list snipped] an lvalue that
does not have array type is converted to the value stored in
the designated object (and is no longer an lvalue).

The author implicitly assumed the existence of a pointer object that
does not exist.

[big snip]

Keith Thompson

Jul 23, 2010, 8:34:38 PM
Ben Bacarisse <ben.u...@bsb.me.uk> writes:
[...]

> 6.5 Expressions
>
> 1 An expression is a sequence of operators and operands that specifies
> computation of a value, or that designates an object or a function,
> or that generates side effects, or that performs a combination
> thereof.
>
> What about the expression *(char *)0? Does it generate side-effects?
> No. Does it designate an object or a function? No. It must therefore
> specify the computation of a value. We are all certain of the type of
> that value (char) but we don't yet know which of the many chars it is.

(void)0 neither computes a value, nor designates an object or function,
nor generates side effects, or any combination thereof.

This is yet another reason the standard's definition of "expression" is
flawed. The other is that some expressions, for example 42, contain no
operators or operands.

It's not *fatally* flawed; I don't think it's led anyone to an incorrect
understanding of what "expression" means. But a more accurate
definition would refer to the syntax.

Morris Keesan

Jul 23, 2010, 9:03:37 PM
On Fri, 23 Jul 2010 19:50:15 -0400, Ben Bacarisse <ben.u...@bsb.me.uk>
wrote:
...

> What about the expression *(char *)0? Does it generate side-effects?
> No.

Can you point to anything in the standard which says that the expression
doesn't generate side-effects? Since evaluating that expression results
in undefined behavior, I would argue that whether or not there are side
effects is undefined.
--
Morris Keesan -- mke...@post.harvard.edu

Ben Bacarisse

Jul 23, 2010, 9:23:46 PM
"Morris Keesan" <mke...@post.harvard.edu> writes:

> On Fri, 23 Jul 2010 19:50:15 -0400, Ben Bacarisse
> <ben.u...@bsb.me.uk> wrote:
> ...
>> What about the expression *(char *)0? Does it generate side-effects?
>> No.
>
> Can you point to anything in the standard which says that the expression
> doesn't generate side-effects? Since evaluating that expression results
> in undefined behavior, I would argue that whether or not there are side
> effects is undefined.

Shao Miller does not believe that. I was taking him through the
consequences of that belief so that he would have to say what value the
expression has.

--
Ben.

Shao Miller

Jul 23, 2010, 9:45:43 PM
On Jul 23, 7:25 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>
> No.  You are playing games with the language.  Two clauses of one
> sentence (separated by ;) talk about the result.  A separate sentence
> talks about the type.  These are not three cases but two attributes of
> this form of expression.
>
I did not intend to play games with the language. If that has been
the case, then I sincerely apologize for doing so. Any inability of
my own to have implicitly understood this is nobody's challenge but
mine. Perhaps I have been confused surrounding "expressions" and
"results". The simplest way for me to accept your claim would be to
assert that "every expression must have a value." Since there is no
definition of a value for this scenario, that would quite simply lead
to an incompletely defined result, which I would be happy to call
"undefined behaviour" regarding the evaluation of the expression, or
even during execution.

>
> C defines the type of many expression forms whether the result is
> defined or not.
>

Here again, I was possibly missing the equivalence between "value" and
"result." Often interchangeable in everyday usage, I perhaps have
incorrectly assumed that the referenced C draft might have very
specific meanings for each and might distinguish them. In the
original post, we see a definition for "value" which could easily
contribute to such a misunderstanding.

>
> << and >> expressions have the type of the promoted
> left operand.  In some cases the result (or behaviour) is undefined.
> The fact that the type is known does not make all << and >> expressions
> defined.
>

Your argument for tying both "type" and "value" together before
yielding a defined result is a very good one. By my previous
suggestions, "The type of the result is
that of the promoted left operand" for these two operators would again
yield a result with a type, but possibly no defined value. The text
for these operators does seem to cover all possibilities however,
signed, unsigned, UB, implementation-defined. The constraint for
"integer type" helps. You cannot have an integer type which is
neither signed nor unsigned. It would be nice if a constraint for the
unary '*' operator were that the pointer must either point to an
object or to a function. Perhaps some kind reader could introduce
such a constraint into a future standard. Thanks for this reference,
Ben.

>
> What, in your opinion, is the result of the expression *(char *)0?  If
> you can't find words from that standard to explain what the result is
> (not just its type) then it is undefined by omission.
>

Here again I can only say "result with a type". Accepting that
evaluation of an expression shall yield a result with both type and
value means that I would have to say that '*(char *)0' is an
incompletely defined result, which I would happily call undefined
behaviour.

>
> It's clear we disagree about that phrase too.  Presumably you accept as
> valid code like this:
>
>   int *ip;
>   { int i = 42; ip = &i; }
>   *ip = 0;
>

I accept this code as guaranteed to imply UB. If responsible for some
portion of development of a C implementation advertising conformance
and criticized by a stake-holder regarding diligence in regards to
this kind of code, I would be much more confident to be able to point
at the (albeit, non-normative) footnote which suggests that 'ip' _has_
been assigned a value which is "the address of an object after the end
of its lifetime". I would be less confident without.

>
> because no invalid pointer has been assigned to ip -- the pointer has
> merely become invalid without any assignment.
>

I disagreed above, meaning I believe we share a qualification for
"undefined behaviour" here.

>
> Yes and, by omission, when neither circumstance applies nothing more can be
> inferred about the result (from the standard).  This is the definition of
> undefined.  That the expression has type is not in dispute (my original
> intervention using sizeof *(char *)0 relies on the expression having a
> type) but a type does not mean that the expression has a defined result.
>

This possible misunderstanding of mine has been detailed twice
above. You appear to suggest that the semantics must define both a
value and a type to accomplish a defined result. Anything less is
undefined.

>
> I mean the expression form (E was a kind of syntax place-holder) is
> undefined.  It often has a type, but its evaluation has no defined
> result.
>

Ok.

>
> The result of *p is the entire object.
>

This troubles me just a bit, due to section 6.5, point 2. I would
worry that in:

*x = *x + 1;

That we have "read" the "prior value" twice, inappropriately, for each
evaluation of the unary '*' operator. I have not fully explored this
avenue of thought and do not intend it as an argument by any means.
Please feel free to discard.

>
> I can't answer your first
> question
>

That you have answered any of the questions at all is valuable for
anyone concerned about the subject.

>
> because I don't know what "knowing the value of *p altogether"
> means (the problem word for me is "knowing").  As for the second, all
> reasonable compilers will look to see what is actually used from *p
> to avoid fetching any more than is needed.
>

Agreed.

>
>   *p; /* probably fetch nothing (volatile objects excepted) */
>   s = *p; /* probably fetch it all -- at least we know where to put it */
>   (*p).m; /* probably behaves like p->m (i.e. only m is fetched) */
>

Also agreed. This makes me curious about something like:

int main(void) {
    static int foo = 15;
    struct bar {
        char c[(size_t)&foo];
        int baz;
    };
    return (*(struct bar *)0).baz;
}

for the not-yet-accepted (disputed) interpretation that something like
moving a Turing machine's head to position 0 but then moving it by the
offset of 'baz' before reading or writing to a potential object might
be a reasonable thing to do in C. (The common response has been that
it's UB to try this.) Some implementations might allow for such
behaviour, but that's obviously not evidence of any sort that it's not
UB.

My sense after your post here is that it would be very easy to let go
of any uncertainty regarding the un/defined behaviour of '*(char *)0'
and friends if we easily accept that a result must have a defined
value and a defined type. Thanks for that, Ben.

Shao Miller

Jul 23, 2010, 9:51:54 PM
On Jul 23, 7:50 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>
> Oh, be serious!
>
Please accept my apologies for not meeting your expectations for
serious discussion. I will try harder. :)

>
> 6.5 Expressions
>
>   1 An expression is a sequence of operators and operands that specifies
>     computation of a value, or that designates an object or a function,
>     or that generates side effects, or that performs a combination
>     thereof.
>

This fine reference is _extremely_ helpful in accepting that
evaluation of an expression (which we don't do for sizeof, for
example, but _as_defined_ for sizeof) implies both a type and a
value. Thank you, Ben!

>
> What about the expression *(char *)0?  Does it generate side-effects?
> No.  Does it designate an object or a function?  No.  It must therefore
> specify the computation of a value.  We are all certain of the type of
> that value (char) but we don't yet know which of the many chars it is.
>

Very convincing argument. Excellent.

>
> By the way, I am happy just to agree to disagree about this.
>

I would be too, but perhaps that's not a requirement. You have kindly
chipped away here a good bit.

>
> I'll
> continue to write C avoiding constructs like *(char *)0 and so will you!
>

Absolutely. The only reason something like '*(void *)0;' might be
interesting to me if it did _not_ have undefined behaviour would be
for development of an implementation or for an easy "do-nothing"
preprocessor macro; but something more than just ';'.

Shao Miller

Jul 23, 2010, 10:01:34 PM
On Jul 23, 8:17 pm, Seebs <usenet-nos...@seebs.net> wrote:

A thoroughly considered response.

It is clear that my discussion has agitated you. I did not intend to
imply to any audience that you have been lying to me with your
responses. To clarify, I do not believe you have lied to anyone in
your responses. I was hoping to benefit from your experience, and that
of the other responders, on this subject.

I _sincerely_ apologize, Peter. I will try to keep all of your
discussion's valuable points in mind as the subject matter becomes
clearer. Please do not attribute ill intent where an explanation of
another sort will do. I help people with computer-related subjects
all day, every day, for years. Sometimes I want to ask, "How much are
they paying you to waste my time?!" If that's how you feel, I don't
wish to aggravate that feeling. I won't trouble you any more, with
good fortune.

Perhaps this could lighten the mood? Agitating you with your
evaluation of my void expression '(void)*(char *)0;' was an unintended
side-effect.

Rich Webb

Jul 23, 2010, 10:15:45 PM
On 24 Jul 2010 00:17:34 GMT, Seebs <usenet...@seebs.net> wrote:

>On 2010-07-23, Shao Miller <sha0....@gmail.com> wrote:
>> On Jul 23, 5:49 pm, Seebs <usenet-nos...@seebs.net> wrote:
>
>>> And what is that result?
>
>> What is "what"? The result is defined to have a type. The result
>> _is_ a result. It _has_ a type.
>
>But unless we know what the result is -- not just its type -- it is
>NOT DEFINED.

If I can jump in here ...

As I understand it, undefined behavior can be but is not required to be
defined by an implementation. And so ...

>> Agreed for '(char *)0'. Not so for '(void *)0'. I'm with you here,
>> though.
>
>Yes, so "*(void *) 0" is a constraint violation.

... is indeed a constraint violation but nonetheless is permitted to be
defined by e.g. an implementation targeted at embedded applications.

From the Rationale: Undefined behavior gives the implementer license not
to catch certain program errors that are difficult to diagnose. It also
identifies areas of possible conforming language extension: the
implementer may augment the language by providing a definition of the
officially undefined behavior.

Or am I missing the whole point here? (Wouldn't be the first time.)

--
Rich Webb Norfolk, VA

Shao Miller

Jul 23, 2010, 10:34:53 PM
On Jul 23, 8:34 pm, Keith Thompson <ks...@mib.org> wrote:
>
> (void)0 neither computes a value, nor designates an object or function,
> nor generates side effects, or any combination thereof.
>
Wait a minute, didn't post #56 have something about 'void's and
values? Here '0' has type and value. Casting to 'void' appears to
discard that value. Does that mean that the result of evaluating
'(void)0' has a type but no value? Is that void expression a
legitimate expression statement? Thanks, the Other Keith.

So is the result of:

*(void *)0

required to have a value or could it just as well be a void
expression, possibly used in an expression statement? Is it somewhere
required that unary '*' should have defined type and value in the
result but that casting to void may not? "Cast operators" has
semantics that appear to define "converts the value of the expression
to the named type." So '(void)58' yields UB, right? Even before
considered as a void expression?

>
> This is yet another reason the standard's definition of "expression" is
> flawed.  The other is that some expressions, for example 42, contain no
> operators or operands.
>

Agreed. And while we might know what to do and what not to do while
programming, development of an implementation might require a stricter
understanding.

>
> It's not *fatally* flawed; I don't think it's led anyone to an incorrect
> understanding of what "expression" means.  But a more accurate
> definition would refer to the syntax.
>

Agreed, with the exception of the question above.

Shao Miller

Jul 23, 2010, 10:45:35 PM
On Jul 23, 8:17 pm, Keith Thompson <ks...@mib.org> wrote:
>
> And a value.  What is the value of the result?  Nothing in the standard
> defines what that value is, therefore the value is undefined.
>
This appears to be the critical piece. A result shall have a defined
value and a defined type. Anything less is undefined behaviour. And
yet, a void expression describes a non-existent value for an
expression, albeit with a type of 'void'. We can get one of these
from casting to 'void', right? Even though a cast converts a value to
a named type?

>
> How could it be intentional?  If it's intentional, it doesn't make any
> sense.
>
> Once again, here's the passage in question, 6.5.3.2p4:
>
>     The unary * operator denotes indirection. If the operand points
>     to a function, the result is a function designator; if it points
>     to an object, the result is an lvalue designating the object. If
>     the operand has type ‘‘pointer to type’’, the result has
>     type ‘‘type’’. If an invalid value has been assigned to
>     the pointer, the behavior of the unary * operator is undefined.
>
> It is not possible to assign a value, invalid or otherwise, to "the
> pointer" unless "the pointer" is a pointer object.  The context does not
> imply the existence of any pointer object to which the phrase "the
> pointer" could refer.  Even if the operand happens to be an lvalue,
> it is no longer an lvalue (and thus no longer designates an object)
> before the "*" operator is applied to it.  6.3.2.1p2:
>

There is no constraint that the operand is a pointer object. The
constraint is that the operand is a pointer. We know from an earlier
reference of yours that the operand is a value. Perhaps this "has
been assigned..." goes beyond "the operand" (it doesn't mention the
operand, unlike the other sentences), and means something more like
"if the value was that of a pointer object, and that object had been
assigned..." It's just plain fishy, at the very least.

> ... ... ...


> The author implicitly assumed the existence of a pointer object that
> does not exist.
>

Could very well be. Even seems to be a common interpretation amongst
discussants.

Shao Miller

Jul 23, 2010, 11:07:21 PM
On Jul 23, 10:15 pm, Rich Webb <bbew...@mapson.nozirev.ten> wrote:

> On 24 Jul 2010 00:17:34 GMT, Seebs <usenet-nos...@seebs.net> wrote:
>
> >Yes, so "*(void *) 0" is a constraint violation.
>
> ... is indeed a constraint violation but nonetheless is permitted to be
> defined by e.g. an implementation targeted at embedded applications.
>
And what constraint is that? There is one constraint for unary '*':
"The operand of the unary * operator shall have pointer type."
Constraints versus semantics. 5.1.2.3, point 3 (again): "In the
abstract machine, all expressions are evaluated as specified by the
semantics." As a matter of fact, 'sizeof' even makes use of the
semantics insofar as the result's type, even though the expression
operand is not evaluated!

>
> From the Rationale: Undefined behavior gives the implementer license not
> to catch certain program errors that are difficult to diagnose.  It also
> identifies areas of possible conforming language extension:  the
> implementer may augment the language by providing a definition of the
> officially undefined behavior.
>
> Or am I missing the whole point here? (Wouldn't be the first time.)
>

An enjoyable reference. Thanks for sharing it, Rich. The question of
the original post has been answered because its cast to 'void'
requires a value because its operand does not already have type void.
There are a few useful points floating around, from my perspective:
1. Does '*(void *)0' yield undefined behaviour?
2. The last sentence of 6.5.3.2, Semantics 4 seems to apply only to an
lvalue operand, but the constraint 2 does not require this. Something
there should change.
3. The definition for "expression" doesn't appear to apply to Keith's
"42" nor his "(void)0". Something there should change.
4. It is not entirely clear whether or not a result is required to
have both a type and a value.
5. Undefined behaviour is undesirable in a general sense.

Ben Bacarisse

Jul 23, 2010, 11:36:44 PM
Shao Miller <sha0....@gmail.com> writes:

> On Jul 23, 7:25 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>>
>> No.  You are playing games with the language.  Two clauses of one
>> sentence (separated by ;) talk about the result.  A separate sentence
>> talks about the type.  These are not three cases but two attributes of
>> this form of expression.
>>
> I did not intend to play games with the language. If that has been
> the case, then I sincerely apologize for doing so. Any inability of
> my own to have implicitly understood this is nobody's challenge but
> mine.

OK, I accept that. You said:

| The omission of when the pointer does not point to an object only
| impacts the definition of the result, _when_ that result is an
| _lvalue_.

To the extent I can give that any meaning at all, it is at odds with the
plain words of the paragraph that is causing you so much trouble. This
made me suspect that you are looking for trouble -- deliberately trying
to misread the plain words to find a confusion. I am happy to be wrong
about that.

To be clear, the omission (the failure to specify a result) impacts the
result when the operand is neither a function pointer nor an object
pointer.

<snip>


> The simplest way for me to accept your claim would be to
> assert that "every expression must have a value."

But that would be wrong, e.g. (void)(1+2).

> Since there is no
> definition of a value for this scenario, that would quite simply lead
> to an incompletely defined result, which I would be happy to call
> "undefined behaviour" regarding the evaluation of the expression, or
> even during execution.

OK, undefined behaviour it is, then.

>> C defines the type of many expression forms whether the result is
>> defined or not.
>>
> Here again, I was possibly missing the equivalence between "value" and
> "result." Often interchangeable in everyday usage, I perhaps have
> incorrectly assumed that the referenced C draft might have very
> specific meanings for each and might distinguish them. In the
> original post, we see a definition for "value" which could easily
> contribute to such a misunderstanding.

I think you've not taken Tim's excellent advice to heart. If you treat
the standard as a set of formal definitions with rigid consequences (like a
piece of mathematics) you will find that almost no programs have any
meaning. 0 is not an expression (it lacks operators); 42 has no value
(there is no object to have a value as per the definition) and so on.
You have to read it with a slice of common sense. The meaning of its
terms is partly to be gleaned from absorbing how they are used. Look
at how "result" and "value" are used. "Result" is not defined so you
have to guess. What is "the result of an expression"? Most people
conclude that it is some notion of a quantity with an associated type.

>> << and >> expressions have the type of the promoted
>> left operand.  In some cases the result (or behaviour) is undefined.
>> The fact that the type is known does not make all << and >> expressions
>> defined.
>>
> Your argument for tying both "type" and "value" together before
> yielding a defined result is a very good one. By my previous
> suggestions, "The type of the result is
> that of the promoted left operand" for these two operators would again
> yield a result with a type, but possibly no defined value. The text
> for these operators does seem to cover all possibilities however,
> signed, unsigned, UB, implementation-defined.

It does not matter. If it only covered a few cases, those not
explicitly covered would be undefined. Knowing the type would not make
the result any less undefined.

> The constraint for
> "integer type" helps. You cannot have an integer type which is
> neither signed nor unsigned.

No, it does not help. Knowing that -1 << 512 is of type int does not
make it any less undefined.

> It would be nice if a constraint for the
> unary '*' operator were that the pointer must either point to an
> object or to a function. Perhaps some kind reader could introduce
> such a constraint into a future standard.

Such a constraint is not possible. Constraints must be diagnosed by the
implementation when the program is translated and the compiler can't
tell when the operand of * does not point to an object. Is this OK or
not:

int f(int *ip) { return *ip; }

?

>> What, in your opinion, is the result of the expression *(char *)0?  If
>> you can't find words from that standard to explain what the result is
>> (not just its type) then it is undefined by omission.
>>
> Here again I can only say "result with a type". Accepting that
> evaluation of an expression shall yield a result with both type and
> value means that I would have to say that '*(char *)0' is an
> incompletely defined result, which I would happily call undefined
> behaviour.

We are agreed!

<snip>


> This possible misunderstanding of mine has been detailed twice
> above. You appear to suggest that the semantics must define both a
> value and a type to accomplish a defined result. Anything less is
> undefined.

Yes, that is my view. If a "defined result" could be just a type, why
does the standard not say more about what one can do with these results?
Is *(char *)0 << *(char *)0 just a pure int result? An expression's
result is either defined or undefined -- just a type is not enough.

<snip>


>> The result of *p is the entire object.
>>
> This troubles me just a bit, due to section 6.5, point 2. I would
> worry that in:
>
> *x = *x + 1;
>
> That we have "read" the "prior value" twice, inappropriately, for each
> evaluation of the unary '*' operator. I have not fully explored this
> avenue of thought and do not intend it as an argument by any means.
> Please feel free to discard.

You can read the prior value as often as you like. The limit is on
modifying the stored value more than once. And whilst I agree that it's
not entirely clear what constitutes a "read" of the value, most people
would say that it requires an lvalue to value conversion to be a read.

<snip>


>>   *p; /* probably fetch nothing (volatile objects excepted) */
>>   s = *p; /* probably fetch it all -- at least we know where to put it */
>>   (*p).m; /* probably behaves like p->m (i.e. only m is fetched) */
>>
> Also agreed. This makes me curious about something like:
>
> int main(void) {
> static int foo = 15;
> struct bar {
> char c[(size_t)&foo];
> int baz;
> };
> return (*(struct bar *)0).baz;
> }

That's a constraint violation. If your compiler does not complain about
it, get another one! The array size must be an integer constant
expression.

> for the not-yet-accepted (disputed) interpretation that something like
> moving a Turing machine's head to position 0 but then moving it by the
> offset of 'baz' before reading or writing to a potential object might
> be a reasonable thing to do in C. (The common response has been that
> it's UB to try this.) Some implementations might allow for such
> behaviour, but that's obviously not evidence of any sort that it's not
> UB.

I don't follow this at all. The example is undefined because of the
constraint violation -- it need not even generate any executable code.

> My sense after your post here is that it would be very easy to let go
> of any uncertainty regarding the un/defined behaviour of '*(char *)0'
> and friends if we easily accept that a result must have a defined
> value and a defined type.

Just do it. Take the red pill.

--
Ben.

Ben Bacarisse

Jul 23, 2010, 11:47:12 PM
Ben Bacarisse <ben.u...@bsb.me.uk> writes:
> Shao Miller <sha0....@gmail.com> writes:
<snip>

>> int main(void) {
>> static int foo = 15;
>> struct bar {
>> char c[(size_t)&foo];
>> int baz;
>> };
>> return (*(struct bar *)0).baz;
>> }
>
> That's a constraint violation. If your compiler does not complain about
> it, get another one! The array size must be an integer constant
> expression.

Actually it is not a constraint violation but it certainly violates a
"shall" about integer constant expressions (6.6 p6). I am surprised
this is not a CV since I can't see any reason it can't be checked at
compile time, but it's not one.

<snip>
--
Ben.

Richard Harter

Jul 24, 2010, 12:27:56 AM
On 23 Jul 2010 21:09:04 GMT, Seebs <usenet...@seebs.net>
wrote:

>On 2010-07-23, Shao Miller <sha0....@gmail.com> wrote:
>> Thank you for your opinion. I would have also appreciated some
>> reasoning, in order to help to convince myself of this. The opinion
>> is appreciated regardless of the lack of reasoning. Poll-wise,
>> dereferencing a null pointer is undefined behaviour during
>> evaluation. Reason-wise, it's still incomplete, for me.
>
>I really don't think the problem is reasoning, because you've seen tons
>of that, and it's had no effect whatsoever. I don't know what the issue
>is; this really is pretty straightforward. '(char *) 0' is a pointer, and
>it does not point to a valid object, therefore, '*(char *) 0' is undefined
>behavior to evaluate. In the abstract machine, expressions are evaluated,
>with a few specialized exceptions (such as sizeof). So whether or not you
>cast it, or use the value, the expression *is* evaluated, and evaluating the
>expression produces undefined behavior.

You may have conceded his point by admitting to the specialized
exceptions. Evaluating *(char *)0 is undefined; nonetheless it
has a defined type. The statement

sizeof(*(char *)0);

does not (or at least should not) invoke undefined behaviour
because the expression is not evaluated.


Shao Miller

Jul 24, 2010, 1:05:53 AM
On Jul 23, 11:36 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>
> OK, I accept that.  You said:
>
> | The omission of when the pointer does not point to an object only
> | impacts the definition of the result, _when_ that result is an
> | _lvalue_.
>
> To the extent I can give that any meaning at all, it is at odds with the
> plain words of the paragraph that is causing you so much trouble.  This
> made me suspect that you are looking for trouble -- deliberately trying
> to misread the plain words to find a confusion.  I am happy to be wrong
> about that.
>
You _are_ wrong about "looking for trouble". Please continue to be
happy. :)

>
> To be clear, the omission (the failure to specify a result) impacts the
> result when the operand is neither a function pointer nor an object
> pointer.
>

Ok. But I'd rather that even this was clearer.
1. There is a sentence which specifies a value for the result.
2. There is a sentence which specifies a type for the result.
3. If the sentence regarding the value does not apply, the sentence
regarding the type is _insufficient_ to define a whole result.

>
> >  The simplest way for me to accept your claim would be to
> > assert that "every expression must have a value."
>
> But that would be wrong, e.g. (void)(1+2).
>

Ok. But I'd rather that even this was clearer:
1. There is a sentence which specifies a value for the result.
2. There is a sentence which specifies a type for the result.
3. If the sentence regarding the value does not apply, the sentence
regarding the type is _sufficient_ to define a whole result.

>
> OK, undefined behaviour it is, then.
>

Agreed conditional upon acceptance of at least one of:
1. "...has been assigned..." really means something more like "is an
invalid value"

OR:

2. Casting to 'void' and application of the unary '*' operator are
treated differently. Both may fail to define a value for the result
of an evaluation, but the cast is permitted as defined behaviour.

>
> I think you've not taken Tim's excellent advice to heart.  If you treat
> the standard as a set of formal definitions with rigid consequences (like a
> piece of mathematics) you will find that almost no programs have any
> meaning.  0 is not an expression (it lacks operators); 42 has no value
> (there is no object to have a value as per the definition) and so on.

Each of these points feels like a blow, including any failure on my
part to treat the referenced draft as anything more than a guide to be
supplemented by popular consensus.

>
> You have to read it with a slice of common sense.
>

"Common sense" meaning "popular interpretation" to me. Very well;
accepted.

>
> The meaning of its
> terms is partly to be gleaned from absorbing how they are used.  Look
> at how "result" and "value" are used.  "Result" is not defined so you
> have to guess.  What is "the result of an expression"?  Most people
> conclude that it is some notion of a quantity with an associated type.
>

If writing a translator, I might have a 'struct result' with a pointer
to a type and a pointer to a value. I might initialize these with
NULL each. If an "operator" for a 'struct result' demanded one of
these properties but it was not defined, I might diagnose undefined
behaviour. It simply seemed to me that there were circumstances in
which some code path for "evaluation" might not ever use one of the
properties, which would lead me to question the validity of diagnosing
as UB if one compares as NULL but there was no expectation for it to
be non-NULL... Such as a void expression, which appears to be more
limited than I thought (casts to void and functions returning void,
for example).

>
> It does not matter.  If it only covered a few cases, those not
> explicitly covered would be undefined.  Knowing the type would not make
> the result any less defined.
>

Well...

>
> No, it does not help.  Knowing that -1 << 512 is of type int does not
> make it any less undefined.
>

> Such a constraint is not possible.  Constraints must be diagnosed by the
> implementation when the program is translated and the compiler can't
> tell when the operand of * does not point to an object.
>

A constraint that: "except when the '*' operator is used as the
operand to the 'sizeof' operator, an expression evaluating to a null
pointer constant or to a null pointer constant cast to any pointer
type shall not be the operand," might do, mightn't it?

>
> Is this OK or
> not:
>
>   int f(int *ip) { return *ip; }
>
> ?
>

Yes. The function call assigns the value of an argument to the 'ip'
parameter. Passing in an invalid value would result in UB.

>
> We are agreed!


>
> Yes, that is my view.  If a "defined result" could be just a type, why
> does the standard not say more about what one can do with these results?
> Is *(char *)0 << *(char *)0 just a pure int result?  An expression's
> result is either defined or undefined -- just a type is not enough.
>

Well actually, it does explain what you can do with the results. I
had made earlier references to these. "Cast operators"' first
constraint says "Unless...the operand shall have scalar type". Its
first semantic point talks about "the value of the expression."
"Simple assignment" talks about "type" and "value" for the
"operands". That explicitness (along with void expressions) was part
of why a result was not required to have both, against the consensus
here. However it was the _consumers_ of the results that I was taking
to give meaning to constraint-valid and semantically valid
expressions. The consensus appears to be that the results are defined
or not, regardless of the consumers or their properties, except for
'sizeof' (which nobody has disputed).

>
> You can read the prior value as often as you like.  The limit is on
> modifying the stored value more than once.  And whilst I agree that it's
> not entirely clear what constitutes a "read" of the value, most people
> would say that it requires an lvalue-to-value conversion to be a read.
>

That saved me some investigation. Very much appreciated.

>
> Just do it.  Take the red pill.
>

The tiny print of the brand name appears to read "DeFacto"; I think
that's an Italian company.

Thanks so much, Ben. You've been a great help.

Richard Heathfield

Jul 24, 2010, 2:59:15 AM
Given Seebs's plonk announcement, I consider it worthwhile to echo Shao
Miller's apology here, in full. [Apart from my sig block, I have added
no further content past this point. Hence the top-posting.]


--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within

Seebs

Jul 24, 2010, 3:12:27 AM
On 2010-07-24, Richard Harter <c...@tiac.net> wrote:
> You may have conceded his point by admitting to the specialized
> exceptions. Evaluating *(char *)0 is undefined; none-the-less it
> has a defined type. The statement

> sizeof(*(char *)0);

> does not (or at least should not) invoke undefined behaviour
> because the expression is not evaluated.

Right.

But that's because sizeof() is magical in that it uses the type, not the
value. The expressions under discussion *are* evaluated. Basically,
sizeof()'s special language is the exception that proves the rule -- and
also explains why you might care about the type of an expression which
cannot be evaluated without producing undefined behavior. It's perfectly
reasonable to use such an expression *as the operand of sizeof* -- because
that uses the type rather than the result.
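
A minimal sketch of that point, assuming a C99 or later compiler. The
function name is made up for illustration and appears nowhere in the
thread. The operand is not a variable length array, so it is never
evaluated and no null pointer is dereferenced:

```c
#include <stddef.h>

/* Hypothetical helper: yields the size of the type of *(char *)0.
   Only the type (char) of the operand is used; the operand itself
   is not evaluated. */
size_t size_of_null_deref(void)
{
    return sizeof(*(char *)0);   /* sizeof(char), which is 1 by definition */
}
```

Calling size_of_null_deref() is well defined, whereas a statement such
as (void)*(char *)0; evaluates the indirection and is not.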

Shao Miller

Jul 24, 2010, 3:51:47 AM
On Jul 24, 12:27 am, c...@tiac.net (Richard Harter) wrote:
>
> You may have conceded his point by admitting to the specialized
> exceptions.  Evaluating *(char *)0 is undefined; none-the-less it
> has a defined type.  The statement
>
>     sizeof(*(char *)0);
>
> does not (or at least should not) invoke undefined behaviour
> because the expression is not evaluated.
>
Thanks for suggesting that, Richard. That seems reasonable.

It's really unfortunate that on-topic, civil discussion resulted in
upset, here. Evidently the word "invented" can be perceived by some
folks as un-civil. I wish that I'd known of that possibility before-
hand. I honestly meant it in the context of definitions without
references, either made-up on-the-spot or not helping due to a lack of
citation. Also, there should be the possibility of "mistakenly
interpreting" and "mistakenly remembering" versus "intentionally
misrepresenting."

Communications can be challenged when one doesn't understand another
person's frame of reference. For Peter, he claims to have a wealth of
knowledge and experience. For me to boldly state that parts of his
discussion were invented might have come as a most unexpected and most
unlikely circumstance. I did not anticipate that possibility and now
suffer the loss of any future valuable discussion from him.

I try to interact much of the time through questions-only. This often
works out extremely effectively. In this discussion, I was a bit
desperate to understand the situation, especially after many
unsatisfactory (to me) responses, and must admit having neglected this
nice strategy, which has traditionally kept things civil, in my
experience. This lesson will stick with me.

Now you are quite right by me, Richard, when you suggest that if anyone
at all had recognized that my frame of reference included the point
that a result need not have both value and type, then a simple response
would have sufficed: "Your result has a defined type but no defined
value. Defined results must have both. Here's the reference..." This
might have shortened the path to a question:

Is the result of evaluating an expression with void type defined to
have all three of: 1. A result, 2. a type, 3. a value, given that all
constraints are met and the evaluation can be performed entirely
according to the semantics without invoking undefined behaviour?

Also, to be fair, there really was some ambiguity in the definitions
used by the discussants, due to ambiguity in the referenced draft. At
some point along I had assumed that there might be an oversight found
in some implementations and people's interpretations. Instead, I
discovered that these things, in fact, define the reality for C. The
draft standard is perhaps a possible _target_ for conformance, but
perhaps not the best definition to discuss in regards to adherence,
without some guess-work or asking around.

Having said that, it only confused me further when kind responders
both suggested that the draft is clear/obvious/unambiguous and that it
isn't. It really would have been better (for my benefit, anyway) to
agree on one or the other as soon as possible. Perhaps this feedback
will benefit future discussion, with good fortune.

Shao Miller

Jul 24, 2010, 6:11:01 AM
Well I might as well document this bit of trivia, since there've been
a couple of other bits of trivia mentioned.

01. void foo(void) {
02. return;
03. }
04.
05. int main(void) {
06. int i = 13;
07. void *v = &i;
08. (void)13;
09. foo();
10. *v;
11. return 0;
12. }

Does evaluation of the cast operator on line 08 yield a defined result
with a defined type and a defined value?

Does evaluation of the function call operator on line 09 yield a
defined result with a defined type and a defined value?

Do all defined results require a defined type and a defined value?

Does the evaluation of the indirection operator on line 10 yield a
defined result with a defined type?

Only posted as a trivial reference. Feel free to respond or ignore,
at your capable discretion. :)

Richard Heathfield

Jul 24, 2010, 6:42:33 AM
Shao Miller wrote:

<snip>

> It's really unfortunate that on-topic, civil discussion resulted in
> upset, here. Evidently the word "invented" can be perceived by some
> folks as un-civil. I wish that I'd known of that possibility before-
> hand. I honestly meant it in the context of definitions without
> references, either made-up on-the-spot or not helping due to a lack of
> citation. Also, there should be the possibility of "mistakenly
> interpreting" and "mistakenly remembering" versus "intentionally
> misrepresenting."
>
> Communications can be challenged when one doesn't understand another
> person's frame of reference. For Peter, he claims to have a wealth of
> knowledge and experience.

The word "claim" is rather loaded. Peter /does/ have a wealth of
knowledge and experience. That doesn't mean he is necessarily right, but
it does mean that the probability of his being right is significantly
high. When I find myself disagreeing with Peter, it always gives me
pause for thought.

<snip>

Richard Heathfield

Jul 24, 2010, 6:44:54 AM
Shao Miller wrote:
> Well I might as well document this bit of trivia, since there've been
> a couple of other bits of trivia mentioned.
>
> 01. void foo(void) {
> 02. return;
> 03. }
> 04.
> 05. int main(void) {
> 06. int i = 13;
> 07. void *v = &i;
> 08. (void)13;
> 09. foo();
> 10. *v;
> 11. return 0;
> 12. }
>
> Does evaluation of the cast operator on line 08 yield a defined result
> with a defined type and a defined value?

No. Operators yield values, but they are not themselves evaluated.
Therefore, there is no "evaluation of the cast operator".

>
> Does evaluation of the function call operator on line 09 yield a
> defined result with a defined type and a defined value?

No, for the same reason.

>
> Do all defined results require a defined type and a defined value?

What do you mean by "result" in this context? Is a successful write to
stdout a "result"? Some would say yes.

>
> Does the evaluation of the indirection operator on line 10 yield a
> defined result with a defined type?

No - see above.

pete

Jul 24, 2010, 7:59:55 AM
Shao Miller wrote:

> 08. (void)13;

> Does evaluation of the cast operator on line 08 yield a defined result
> with a defined type and a defined value?

The standard only mentions values for expressions of object type.

The expression in the expression statement
on line 08 is of type void.

The value of such an expression
is described by the standard as being "nonexistent".

N869
6.3.2.2 void
[#1] The (nonexistent) value of a void expression (an
expression that has type void) shall not be used in any way,
and implicit or explicit conversions (except to void) shall
not be applied to such an expression.
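
As an illustration of the quoted paragraph (a sketch only; the names
calls, bump and void_expression_demo are hypothetical), a void
expression is still evaluated for its side effects even though it has
no value to use:

```c
/* Hypothetical names, for illustration only. */
static int calls;

static void bump(void)
{
    ++calls;                 /* the only observable effect of calling bump() */
}

int void_expression_demo(void)
{
    int n = 0;

    calls = 0;
    (void)13;                /* evaluated; the value 13 is discarded */
    (void)(n = 13);          /* the assignment still happens */
    bump();                  /* void expression, evaluated for its side effect */
    /* int bad = (void)13;      would not compile: a void expression
                                cannot be used to initialize an int */

    return n + calls;        /* 13 + 1 */
}
```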


--
pete

Tim Rentsch

Jul 24, 2010, 10:45:54 AM
Shao Miller <sha0....@gmail.com> writes:

> On Jul 23, 1:30 pm, Tim Rentsch <t...@alumni.caltech.edu> wrote:
>>
>> I don't. The wording could be better, but there is no
>> doubt about the meaning.
>>
> After your fine reference to the text below, I'd have to agree.
>
>>
>> The Standard is written in
>> formal English but it is not a math textbook, and it's
>> at best a waste of time to read it like one.
>>
> I am not aware of anyone who's reading it like a math textbook and I'd
> have to agree. It could be worth-while reading its fine detail and
> discussing and resolving perceived ambiguities, for the case where one
> might be interested in developing a translator for C.
>
>>
>> If you want to get technical,
>>
> Indeed I did.
>
>>
>> it can NEVER be the case
>> that the operand of an indirection operator has been
>> assigned. In the expression '*p', where p has been
>> declared to be of some pointer type, the operand 'p'
>> has already been converted to a value by virtue of
>> 6.3.2.1p2. There is no difference between '*p' and
>> '*(char*)0' in this regard -- both operate on values,
>> not objects. So it's completely nonsensical to try to
>> understand "has been assigned" as applying to one class
>> of operand expression but not another. They are all
>> just values.
> This is to me an extremely valuable reference to the text of
> 'n1256.pdf'. I agree that with this reference in mind, it's
> nonsensical to treat "If an invalid value has been assigned to the
> pointer" as being intended to mean anything other than "If the operand
> has an invalid value"... If only the text said "operand." It
> doesn't. It says "pointer."
>
> Is there any doubt that the operand has a value? We can assign '(char
> *)0' or even '(void *)0' to an object. I don't think there's any
> doubt that the operand has a value.
>
> This could potentially be a cause for confusion, since sentences 2 and
> 3 explicitly use "operand" and "points to" and "has type". The next
> sentence could very well mean, "if the value of the operand _is_ an
> invalid value..." (Emphasis mine.) It could also mean, "if the value
> of the operand was an invalid value assigned to the operand..."
>
> Do you understand why I am asking about all of this? In the execution
> environment if we attempt to access an object at an invalid location,
> it should be undisputed as undefined behaviour. But expression
> evaluation != execution. Evaluation of a constant scalar expression
> such as '(char *)0' need not be "executed" at all. That is to say,
> the text defines an attempted object access to an invalid location as
> undefined behaviour. It could even be trapped by the best
> implementation. But evaluation of an expression which is an
> application of the unary '*' operator does not necessitate an object
> access to any location. If it did, the text should include something
> like:
>
> "The result of evaluation of the unary '*' operator shall be the value
> of an object pointed to by the operand, if the operand points to an
> object."
>
> But that might not be the case. Consider these:
>
> (*p).f();
> (*q)->x = 10;
> *r = 11;
> (*s)();
>
> For 'p', 'q' and 'r', if they point to an object, the result is an
> lvalue. It's not a "value". There's no need to "fetch" the "value"
> during the indirection at all, is there? Thus we only get undefined
> behaviour if they _don't_ point to an object, which is a determination
> that might only be possible during execution.
>
> For 's', the indirection is intended to result in a function
> designator. Not an lvalue. Not a "value".
>
> It is clear that many people have tied evaluation of the unary '*'
> operator to "yielding an object, pointed-to by the operand" in their
> thinking. But this is not the case.
>
> Also consider a Turing machine implementation with a tape and a head.
> In the 'q' example above, if 'q' were assigned the value '(struct foo
> *)0', the head might move to position zero, where "read" and "write"
> are invalid. No read nor write is attempted. Then the head moves by
> the offset of the 'x' member. At last, we attempt a write when we
> assign, assuming that reads and writes are valid at that position.
> Why should there be undefined behaviour by moving the head to position
> 0 any more so than to any other location which is invalid for objects
> or for which the validity is not guaranteed?
>
> Does anyone understand why "has been assigned" could be important?
>
> char *p;
> *p = 'Y';
>
> If the Turing machine's head attempts to move to the location as per
> 'p', that location might not be a valid location for the head to move
> to. Undefined behaviour. But how can you have _an_expression_ with a
> _constant_scalar_value_ at _translation_time_ (let alone during
> execution) possibly represent an invalid location for the head to move
> to?

Your thinking seems very confused. I suggest you stop
thinking about execution on real machines or Turing
machines, and focus on reading the Standard to understand
what it says about semantics on the abstract machine. The
questions you've asked can be answered by reading the
Standard carefully, not just isolated sections but all of
sections 1-6, and considering what it's trying to say about
semantics on the abstract machine, which is the only one
that really counts. I'm not inclined to spend any more
effort responding to someone who seems to want other people
to do the work for something that he appears to be capable
of doing himself, if only he would put in more effort on
that and less on raising captious objections.

Ben Bacarisse

Jul 24, 2010, 11:09:37 AM
Shao Miller <sha0....@gmail.com> writes:

> On Jul 23, 11:36 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:

<snip>


>> To be clear, the omission (the failure to specify a result) impacts the
>> result when the operand is neither a function pointer nor an object
>> pointer.
>>
> Ok. But I'd rather that even this was clearer.
> 1. There is a sentence which specifies a value for the result.

Check. 6.5.3.2 p4 "If the operand points to a function, the result is a
function designator; if it points to an object, the result is an lvalue
designating the object."

> 2. There is a sentence which specifies a type for the result.

Check (well there are two, in fact). 6.5.3.2 p2 "The operand of the
unary * operator shall have pointer type." And 6.5.3.2 p4 "If the
operand has type 'pointer to type', the result has type 'type'."

> 3. If the sentence regarding the value does not apply, the sentence
> regarding the type is _insufficient_ to define a whole result.

Check. 4 p2 "Undefined behavior is otherwise indicated in this
International Standard by the words 'undefined behavior' or by the
omission of any explicit definition of behavior."

Underlying the specific issue you have is a problem that the standard
has never quite managed to resolve. There are three attributes that
matter about an expression and/or its "result": (a) the quantity (which
character, which integer, etc.), (b) the type, and (c) whether it is an
lvalue (there is a detail about whether it is also a modifiable lvalue
but let's simplify for the moment).

(b) and (c) can be determined from the syntactic form of the expression
along with some type analysis whereas (a) is a dynamic property of the
expression at run time. To use "result" for all of these clouds this
distinction and has led you to think that a "result" can be defined when
only the type is known.

I'd prefer the wording to be done like this:

Form: *E
Constraints: The operand, E, must have type 'pointer to T'.
Type: An expression of the form *E has type T and is an
lvalue if T is an object type.
Result: If the result of evaluating E is a pointer to a function,
the result is a function designator denoting the pointed
to function. If the result of evaluating E is a
pointer to an object, the result denotes that object.

It would then be clear that the type is not really "part of the result"
but a property of the expression form -- something essentially static
and not associated with the evaluation. I'm not suggesting it -- the
work would be monstrous and there would be endless details to get right
(variably modified array types spring to mind) but this highlights what
the current wording is dealing with.
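
The idea that the type belongs to the expression form rather than to
its evaluation can be sketched with _Generic. Note the assumptions:
_Generic is a C11 feature and is not part of the C99 draft (n1256)
discussed in this thread, and the function name is hypothetical.

```c
/* Assumes a C11 compiler; _Generic does not exist in C99.
   The function name is made up for illustration. */
int deref_of_char_pointer_has_type_char(void)
{
    /* The controlling expression of _Generic is not evaluated; only
       its type is inspected, so no null pointer is dereferenced. */
    return _Generic(*(char *)0, char: 1, default: 0);
}
```

The selection is made entirely at translation time, which fits the
view that the type is a static property of *E.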

Of course, the way it is done now is much more intuitive. For most
expressions, it suggests that the result is a quantity tagged with a type
and lvalue-ness. But this does not work for sizeof, for example. It
does not (usually) evaluate its operand so the dynamic view of a
type-tagged result has no meaning. People know that the type can be
determined without evaluation so they apply common sense to understand
the sizeof operator. It's a shame that the wording is not perfect, but
it is not nearly as confusing as you seem to think.
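
The "(usually)" above refers to variable length arrays. A small sketch,
assuming a compiler with C99 VLA support (optional in C11 and later);
both function names are hypothetical:

```c
#include <stddef.h>

/* Hypothetical helper: an ordinary operand is not evaluated. */
int plain_operand_not_evaluated(void)
{
    int n = 3;
    (void)sizeof(n++);              /* type is int; n++ never runs */
    return n;                       /* still 3 */
}

/* Hypothetical helper: an operand of variable length array type
   is evaluated (C99 6.5.3.4 p2). */
int vla_operand_evaluated(size_t *size_out)
{
    int n = 3;
    *size_out = sizeof(int[n++]);   /* size uses n == 3, then n becomes 4 */
    return n;                       /* 4 */
}
```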

<snip another discussion about void expressions. I don't want to get
into that here>

>> OK, undefined behaviour it is, then.
>>
> Agreed conditional upon acceptance of at least one of:
> 1. "...has been assigned..." really means something more like "is an
> invalid value"

That is what most people take it to mean. Why? Because making special
provision for when a pointer is an object that has been assigned to
makes no sense when taken literally. Given:

const int *ip = 0;

*ip would not be covered but it would be after:

int *ip;
ip = 0;

Both would remain undefined by omission, so what value would the literal
interpretation serve?

> OR:
>
> 2. Casting to 'void' and application of the unary '*' operator are
> treated differently. Both may fail to define a value for the result
> of an evaluation, but the cast is permitted as defined behaviour.

I don't see how this makes any difference but it does not matter because
I'm choosing (1) not (2)!

>> I think you've not taken Tim's excellent advice to heart.  If you treat
>> the standard as a set of formal definitions with rigid consequences (like a
>> piece of mathematics) you will find that almost no programs have any
>> meaning.  0 is not an expression (it lacks operators); 42 has no value
>> (there is no object to have a value as per the definition) and so on.
> Each of these points feels like a blow, including any failure on my
> part to treat the referenced draft as anything more than a guide to be
> supplemented by popular consensus.

Hmmm. Now I doubt your sincerity again. Read what you wrote. You are
suggesting that I (and indirectly Tim) want you to treat the largely
unpaid work of dozens of experts over more than two decades as no more
than a guide to the language.

Furthermore, neither of us is suggesting that "popular consensus" is
the main tool to be used when there is ambiguity. That would be absurd.
Do you see how that comes over?

>> You have to read it with a slice of common sense.
>>
> "Common sense" meaning "popular interpretation" to me. Very well;
> accepted.

I think you need to refine your understanding of the term "common
sense".

>> The meaning of its
>> terms is partly to be gleaned from absorbing how they are used.  Look
>> at how "result" and "value" are used.  "Result" is not defined so you
>> have to guess.  What is "the result of an expression"?  Most people
>> conclude that it is some notion of a quantity with an associated type.
>>
> If writing a translator, I might have a 'struct result' with a pointer
> to a type and a pointer to a value. I might initialize these with
> NULL each. If an "operator" for a 'struct result' demanded one of
> these properties but it was not defined, I might diagnose undefined
> behaviour.

That's hard to get right though it can be done. You need to make sure
that sizeof (1/0) works properly and that you distinguish between
"plain" values and lvalues. Your eval function needs flags to say what
sort of "evaluation" to do. In the example, 1/0 needs to be "evaluated"
for its type alone. I put evaluation in quotes because it is not C's
notion of evaluate but one that comes from the interpreter you are
writing.
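
A rough sketch of such an evaluator with a mode flag follows. It is a
toy model under stated assumptions: every name (struct node, struct
result, eval, the enumerators) is hypothetical, only int constants,
division and sizeof are handled, and integer overflow in the division
is not modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names come from the standard. */
enum type_tag  { TYPE_INT, TYPE_SIZE_T };
enum node_kind { NODE_INT_CONST, NODE_DIV, NODE_SIZEOF };
enum eval_mode { EVAL_TYPE_ONLY, EVAL_FULL };

struct node {
    enum node_kind kind;
    long constant;              /* used by NODE_INT_CONST */
    const struct node *lhs;     /* used by NODE_DIV and NODE_SIZEOF */
    const struct node *rhs;     /* used by NODE_DIV */
};

struct result {
    enum type_tag type;   /* always known: a static property of the form */
    bool has_value;       /* false after a type-only "evaluation" */
    bool undefined;       /* true if a full evaluation hit undefined behaviour */
    long value;
};

struct result eval(const struct node *n, enum eval_mode mode)
{
    struct result r = { TYPE_INT, false, false, 0 };

    switch (n->kind) {
    case NODE_INT_CONST:
        if (mode == EVAL_FULL) {
            r.has_value = true;
            r.value = n->constant;
        }
        break;
    case NODE_DIV: {
        struct result a = eval(n->lhs, mode);
        struct result b = eval(n->rhs, mode);
        if (mode == EVAL_FULL) {
            if (a.undefined || b.undefined || b.value == 0) {
                r.undefined = true;          /* e.g. 1/0 */
            } else {
                r.has_value = true;
                r.value = a.value / b.value;
            }
        }
        break;
    }
    case NODE_SIZEOF: {
        /* The operand is only ever examined for its type. */
        struct result operand = eval(n->lhs, EVAL_TYPE_ONLY);
        r.type = TYPE_SIZE_T;
        if (mode == EVAL_FULL) {
            r.has_value = true;
            r.value = (operand.type == TYPE_INT) ? (long)sizeof(int)
                                                 : (long)sizeof(size_t);
        }
        break;
    }
    }
    return r;
}
```

With nodes built for 1/0, a full evaluation reports undefined, while
wrapping the same node in NODE_SIZEOF yields a defined value because
the operand is visited in EVAL_TYPE_ONLY mode.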

> It simply seemed to me that there were circumstances in
> which some code path for "evaluation" might not ever use one of the
> properties, which would lead me to question the validity of diagnosing
> as UB if one compares as NULL but there was no expectation for it to
> be non-NULL... Such as a void expression, which appears to be more
> limited than I thought (casts to void and functions returning void,
> for example).

I don't follow all of this. Yes, there are cases where one would not
ever use one of the properties such as in sizeof (1/0), but in
(void)(1/0) you need to evaluate 1/0 for its value property so that you
can throw it away.

<snip>

| It would be nice if a constraint for the
| unary '*' operator were that the pointer must either point to an
| object or to a function. Perhaps some kind reader could introduce
| such a constraint into a future standard.

>> Such a constraint is not possible.  Constraints must be diagnosed by the
>> implementation when the program is translated and the compiler can't
>> tell when the operand of * does not point to an object.
>>
> A constraint that: "except when the '*' operator is used as the
> operand to the 'sizeof' operator, an expression evaluating to a null
> pointer constant or to a null pointer constant cast to any pointer
> type shall not be the operand," might do, mightn't it?

Either this can't be a constraint (because you mean to include something
that can't be tested at compile time) or you have now changed the
suggestion to catch only a few cases. It all depends on what you mean
by "evaluating to a null pointer constant or to a null pointer constant
cast to any pointer type".

>> Is this OK or
>> not:
>>
>>   int f(int *ip) { return *ip; }
>>
>> ?
>>
> Yes. The function call assigns the value of an argument to the 'ip'
> parameter. Passing in an invalid value would result in UB.

You proposed a constraint (which I have put back since you cut it)
"were that the pointer must either point to an object or to a
function". My example (yes, we both agree it is UB) shows that you
can't tell *at compile time* if the constraint you originally proposed
is or is not violated.
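
To make the compile-time point concrete, here is a small sketch. The
function f is the one from the example above; pick and call_f_safely
are hypothetical helpers added for illustration. Whether the operand
of * points to an object depends on a run-time value, so only a
run-time check can guard the call:

```c
#include <stddef.h>

int f(int *ip) { return *ip; }

/* Hypothetical helper: which pointer comes back depends on a
   run-time value, so a translator cannot know it in general. */
int *pick(int *candidate, int use_it)
{
    return use_it ? candidate : NULL;
}

/* Hypothetical helper: calls f only when the pointer is known to
   point to an object; otherwise returns the fallback. */
int call_f_safely(int use_it, int fallback)
{
    int x = 42;
    int *p = pick(&x, use_it);

    return p != NULL ? f(p) : fallback;   /* f(NULL) would be undefined */
}
```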

>> We are agreed!
>>
>> Yes, that is my view.  If a "defined result" could be just a type, why
>> does the standard not say more about what one can do with these results?
>> Is *(char *)0 << *(char *)0 just a pure int result?  An expression's
>> result is either defined or undefined -- just a type is not enough.
>>
> Well actually, it does explain what you can do with the results. I
> had made earlier references to these. "Cast operators"' first
> constraint says "Unless...the operand shall have scalar type". Its
> first semantic point talks about "the value of the expression."
> "Simple assignment" talks about "type" and "value" for the
> "operands". That explicitness (along with void expressions) was part
> of why a result was not required to have both, against the consensus
> here.

There is no mention of your suggested type-only results. They would be
a major part of the language. How do they work? What can we do with
them?

> However it was the _consumers_ of the results that I was taking
> to give meaning to constraint-valid and semantically valid
> expressions. The consensus appears to be that the results are defined
> or not, regardless of the consumers or their properties, except for
> 'sizeof' (which nobody has disputed).

If it is a consensus, it is born out of the wording. An evaluation of
*E is defined when E points to an object or a function. (char *)0 does
neither. To allow the fact that the expression form has a type to mean
that it also has some sort of valueless, type-only result is to invent a whole
new language.

<snip>
--
Ben.

Ben Bacarisse

Jul 24, 2010, 11:51:34 AM
Shao Miller <sha0....@gmail.com> writes:

Let me propose something else. Post an example where you think that
some significant part of the meaning of the program depends on the
answers to your questions. I.e. find an example that matters. This
will interest people.

Everyone here (I am guessing) has their favourite examples of where the
literal wording in the standard falls short of giving an answer to some
question or other but most people want to write effective well-defined C
programs and they somehow manage to do that despite these details.

--
Ben.

Shao Miller

Jul 24, 2010, 1:34:45 PM
On Jul 24, 6:42 am, Richard Heathfield <r...@see.sig.invalid> wrote:
>
> The word "claim" is rather loaded. Peter /does/ have a wealth of
> knowledge and experience. That doesn't mean he is necessarily right, but
> it does mean that the probability of his being right is significantly
> high. When I find myself disagreeing with Peter, it always gives me
> pause for thought.
>
Is it possible that "claims" could be injected with additional meaning
by the reader but not by the writer? If I don't fully know the whole
picture regarding everyone's status, can I reasonably say "has"
instead of "claims to have"? In other words, I am a newcomer, here.
I don't know any of you. It might be beneficial to my understanding
of C to get to know some of you. :) But can I possibly know before-
hand what magic words will set people off, such as "invent" and
"claims"?

Is it reasonable for me to simply take note of these words and avoid
them in the future? Can I do so without feedback like yours,
Richard? My answer would be "no." And so I thank you.

And also you have provided some evidence for the claim; this is
recorded. Would you have offered that evidence without my use of
"claims"? Does that last question suggest that I used "claims"
intentionally towards that end? I didn't. It was meant as a simple
statement of fact.

"Claims" will be dropped from my vocabulary here now, surely. Do you
happen to know of a nice list of words to avoid, like these ones?
That would be great. :)

Shao Miller

Jul 24, 2010, 2:06:35 PM
On Jul 24, 6:44 am, Richard Heathfield <r...@see.sig.invalid> wrote:
>
> No. Operators yield values, but they are not themselves evaluated.
> Therefore, there is no "evaluation of the cast operator".
>
> No, for the same reason.
>
> What do you mean by "result" in this context? Is a successful write to
> stdout a "result"? Some would say yes.
>
Previous discussion led me to believe that there was a common
understanding of the term "result". If that's so, that common
understanding is what I intended to ask about, here.

>
> No - see above.
>
By your correction above, it would appear that these questions are
broken. I shall attempt to ask different ones, instead. Thank you!

01. void foo(void) {
02. return;
03. }
04.
05. int main(void) {
06. int i = 13;
07. void *v = &i;
08. (void)13;
09. foo();
10. *v;
11. return 0;
12. }

When line 08 is evaluated by a conforming implementation, is the
behaviour well-defined to produce a result for the expression
'(void)13'? Does that definition define both a type for the result as
well as a value for the result?

When line 09 is evaluated by a conforming implementation, is the
behaviour well-defined to produce a result for the expression
'foo()'? Does that definition define both a type for the result as
well as a value for the result?

Using a conforming implementation, is the evaluation of every
expression within a strictly conforming program well-defined to
produce a result? In this same circumstance, is that result well-
defined to possess both a type as well as a value?

When line 10 is evaluated by a conforming implementation, is the
behaviour well-defined to produce a result for the expression '*v'?
Does that definition define a type for the result?

Shao Miller

Jul 24, 2010, 2:10:18 PM
On Jul 24, 7:59 am, pete <pfil...@mindspring.com> wrote:
>
> The standard only mentions values for expressions of object type.
>
> The expression in the expression statement
> on line 08 is of type void.
>
Agreed. How can a conforming implementation make that very
determination?

>
> The value of such an expression
> is described by the standard as being "nonexistent".
>
> N869
>        6.3.2.2  void
>        [#1] The  (nonexistent)  value  of  a  void  expression  (an
>        expression that has type void) shall not be used in any way,
>        and implicit or explicit conversions (except to void)  shall
>        not  be  applied to such an expression.
>

Thanks, pete!

Shao Miller

Jul 24, 2010, 2:25:53 PM
On Jul 24, 10:45 am, Tim Rentsch <t...@alumni.caltech.edu> wrote:
>
> Your thinking seems very confused.  I suggest you stop
> thinking about execution on real machines or Turing
> machines,
Turing machines were only mentioned as a possible explanation for why
the author of the "has been assigned..." piece of semantic for unary
'*' might have intentionally meant the text to be taken literally.

>
> and focus on reading the Standard to understand
> what it says about semantics on the abstract machine.  The
> questions you've asked can be answered by reading the
> Standard carefully, not just isolated sections but all of
> sections 1-6, and considering what it's trying to say about
> semantics on the abstract machine, which is the only one
> that really counts.
>

Have you taken my Turing machine example to mean that I am worried
about anything but the abstract machine?

>
> I'm not inclined to spend any more
> effort responding to someone who seems to want other people
> to do the work for something that he appears to be capable
> of doing himself, if only he would put in more effort on
> that and less on raising captious objections.
>

If one perceives a gross ambiguity in the draft of a standard for C,
what is the best course of action to gain clarity on it after
exhausting the material in that draft? Do the references cited
through the original post and thereafter by its author suggest that
the draft material was read thoroughly? How many references by other
posters appear in the original poster's initial handful of posts? How
many do not? How many times should a person read something before
asking for others to help by sharing their interpretations and their
reasoning for those interpretations?

I can appreciate that you have made your judgment about me here and I
thank you for what you did contribute, Tim.

Seebs

Jul 24, 2010, 3:29:27 PM
On 2010-07-24, Richard Heathfield <r...@see.sig.invalid> wrote:
> The word "claim" is rather loaded. Peter /does/ have a wealth of
> knowledge and experience. That doesn't mean he is necessarily right, but
> it does mean that the probability of his being right is significantly
> high. When I find myself disagreeing with Peter, it always gives me
> pause for thought.

Which, given who's saying it, I find quite flattering.

I guess that word, again, strengthens my point: If those remarks really
aren't intended as offensive (and "claim" has the same sorts of connotations
of dishonesty that "invented" does), then that implies a level of familiarity
with English inconsistent with arguing with more fluent speakers when they
tell you that a given text is clear in its meaning.

Shao Miller

Jul 24, 2010, 3:50:22 PM
On Jul 24, 11:09 am, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>
> Check. 6.5.3.2 p4 "If the operand points to a function, the result is a
> function designator; if it points to an object, the result is an lvalue
> designating the object."
>
Thank you. It does appear that this point defines a value for the
result under certain circumstances.

>
> > 2. There is a sentence which specifies a type for the result.
>
> Check (well there are two, in fact).  6.5.3.2 p2 "The operand of the
> unary * operator shall have pointer type."  And 6.5.3.2 p4 "If the
> operand has type 'pointer to type', the result has type 'type'."
>

Agreed.

> > 3. If the sentence regarding the value does not apply, the sentence
> > regarding the type is _insufficient_ to define a whole result.
>
> Check. 4 p2 "Undefined behavior is otherwise indicated in this
> International Standard by the words 'undefined behavior' or by the
> omission of any explicit definition of behavior."
>

Agreed, and originally intended as the meaning.

>
> Underlying the specific issue you have is a problem that the standard
> has never quite managed to resolve.  There are three attributes that
> matter about an expression and/or its "result"...
> ... ... ...


> I'd prefer the wording to be done like this:
>
>    Form:        *E
>    Constraints: The operand, E, must have type 'pointer to T'.
>    Type:        An expression of the form *E has type T and is an
>                 lvalue if T is an object type.
>    Result:      If the result of evaluating E is a pointer to a function,
>                 the result is a function designator denoting the pointed
>                 to function.  If the result of evaluating E is a
>                 pointer to an object, the result denotes that object.
>
> It would then be clear that the type is not really "part of the result"
> but a property of the expression form -- something essentially static
> and not associated with the evaluation...
> ... ... ...
Agreed. Could your suggestion be met with criticisms for your simple
preference? Would it be enjoyable if anyone suggested that your
opinion on perceived ambiguity for readers of this material is not
worth offering? I thank you for it.

>
> Of course, the way it is done now is much more intuitive...
> ... ... ...


> It's a shame that the wording is not perfect, but
> it is not nearly as confusing as you seem to think.
>

What is it exactly that causes you to believe that I find the wording
confusing?

>
> <snip another discussion about void expressions.  I don't want to get
> into that here>
>

That's unfortunate for me, but I accept it.

>
> That is what most people take it to mean.  Why?  Because making special
> provision for when a pointer is an object that has been assigned to
> makes no sense when taken literally.  Given:
>
>   const int *ip = 0;
>

I see the truth of it. This initialization specifies the value
initially stored in the object but there is no assignment expression.
If we take the text to mean "...assigned to the pointer by an
assignment expression...", then "the pointer" could not have been
assigned-to.

>
> *ip would not be covered but it would be after:
>
>   int *ip;
>   ip = 0;
>
> Both would remain undefined by omission, so what value would the literal
> interpretation serve?
>

I don't follow you here. You said '*ip' "would be" covered "after"
and then said both would remain undefined by omission.

>
> > 2. Casting to 'void' and application of the unary '*' operator are
> > treated differently.  Both may fail to define a value for the result
> > of an evaluation, but the cast is permitted as defined behaviour.
>
> I don't see how this makes any difference but it does not matter because
> I'm choosing (1) not (2)!
>

I will choose (1) as well, as that's the overwhelming consensus, even
though I perceive the possibility of a different intention by the
author. I won't bore anyone by repeating it. Consider me convinced.

>
> Hmmm.  Now I doubt your sincerity again.  Read what you wrote.  You are
> suggesting that I (and indirectly Tim) want you to treat the largely
> unpaid work of dozens of experts over more than two decades as no more
> than a guide to the language.
>

Please excuse me. Have I suggested incorrectly? Have I
suggested that this "unpaid work of dozens of experts over more than
two decades" is not a valuable resource? If so, what would cause you
to suggest that I have implied/meant/stated that? I believe this
resource to be _the_ most valuable resource for C. Is it a guide? In
what way(s) is it more than a guide? Is it a math textbook? I would
answer that with "no."

>
> Furthermore, neither of us is suggesting that "popular consensus" is
> the main tool to be used when there is ambiguity.  That would be absurd.
>
> Do you see how that comes over?
>

Then I have misinterpreted and apologize. What should be the main
tool in case of ambiguity?

>
> I think you need to refine your understanding of the term "common
> sense".
>

An accepted possibility. So do you suggest that common sense and
perusal of the draft/standard is all that is required to develop a
conforming implementation? If so, do you see how that might come over
to someone? Would you be willing to help to refine an interpretation
of the term "common sense"?


>
> That's hard to get right though it can be done.  You need to make sure
> that sizeof (1/0) works properly and that you distinguish between
> "plain" values and lvalues.  Your eval function needs flags to say what
> sort of "evaluation" to do.  In the example, 1/0 needs to be "evaluated"
> for its type alone.  I put evaluation in quotes because it is not C's
> notion of evaluate but one that comes from the interpreter you are
> writing.
>

Good tips. :)

>
> I don't follow all of this.  Yes, there are cases where one would not
> ever use one of the properties such as in sizeof (1/0), but in
> (void)(1/0) you need to evaluate 1/0 for its value property so that you
> can throw it away.
>

You have explained that you are not interested in discussing 'void' at
this time. I accept this and won't trouble you, here.

>
> Either this can't be a constraint (because you mean to include something
> that can't be tested at compile time) or you have now changed the
> suggestion to catch only a few cases.  It all depends on what you mean
> by "evaluating to a null pointer constant or to a null pointer constant
> cast to any pointer type".
>

Can an expression at translation-time evaluate to a null pointer
constant? Can an expression at run-time evaluate to a null pointer
constant? 6.3.2.3 p3 includes detail concerning an "integer constant
expression". Could the value of an object be an integer constant
expression? If everyone knows or at least is adamant that "you can't
dereference a null pointer," could this constraint hurt the standard
for C? Could it help to prevent questions like mine from accumulating
more responses than a single response with a citation?

>
> You proposed a constraint (which I have put back since you cut it)
> "were that the pointer must either point to an object or to a
> function".  My example (yes, we both agree it is UB) shows that you
> can't tell *at compile time* if the constraint you originally proposed
> is or is not violated.
>

Agreed. I have abandoned that constraint in favour of the "null
pointer constant" constraint mentioned.

>
> There is no mention of your suggested type-only results.  They would be
> a major part of the language.  How do they work?  What can we do with
> them?
>

Could a void expression be considered a type-only result? 6.3.2.2 p1
describes an expression with no value (an empty set) and type 'void'.
Did we agree that such an expression is evaluated? The way I
perceive them to work is: If an expression is defined to have a
type "T", no value is defined for its evaluation, and that type "T"
is 'void', the expression is a void expression. Is this a reasonable
perspective? Sorry that we wound up with 'void', here. I apologize
but wished to answer your questions.

>
> If it is a consensus, it is born out of the wording....
>
Agreed.

Thanks, Ben!

Shao Miller

Jul 24, 2010, 3:54:02 PM
On Jul 24, 11:51 am, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>
> Let me propose something else.  Post an example where you think that
> some significant part of the meaning of the program depends on the
> answers to your questions.  I.e. find an example that matters.  This
> will interest people.
>
Are you suggesting that the above example doesn't matter?

>
> Everyone here (I am guessing) has their favourite examples of where the
> literal wording in the standard falls short of giving an answer to some
> question or other but most people want to write effective well-defined C
> programs and they somehow manage to do that despite these details.
>

One such well-defined program might even be itself a C
implementation. It would be good to do it right by everyone here.

Shao Miller

Jul 24, 2010, 4:10:59 PM
On Jul 24, 3:29 pm, Seebs <usenet-nos...@seebs.net> wrote:
>
> Which, given who's saying it, I find quite flattering.
>
> I guess that word, again, strengthens my point:  If those remarks really
> aren't intended as offensive (and "claim" has the same sorts of connotations
> of dishonesty that "invented" does), then that implies a level of familiarity
> with English inconsistent with arguing with more fluent speakers when they
> tell you that a given text is clear in its meaning.
>
There's no pleasing all of the people, all of the time. If someone is
misinterpreting my words to carry connotations that weren't intended,
I can only try to address such in the future based on feedback, but
it's really _their_ constraint on interpretation. If someone chooses
to interpret negative connotations, I don't know exactly where that
comes from. Perhaps it is congruent with the same perspective that
makes it all right to presume homogeneity amongst English speakers'
use of English in discussion of a subject matter.

There is nothing non-objective about using "claims," but it'll be
dropped nonetheless in the future.

Please have tolerance for other people and the way they might write,
even if you've previously been worn down by abusers. If Bayes tells
you an abuser is likely at hand, by all means, fine. That's entirely
reasonable.

Keith Thompson

Jul 24, 2010, 7:30:50 PM
Shao Miller <sha0....@gmail.com> writes:
> On Jul 24, 7:59 am, pete <pfil...@mindspring.com> wrote:
>> The standard only mentions values for expressions of object type.
>>
>> The expression in the expression statement
>> on line 08 is of type void.
>>
> Agreed. How can a conforming implementation make that very
> determination?

I don't understand the question. The expression statement in question
was
    (void)13;
Determining the type of (void)13 is just one of the thousands
of things an implementation is required to do.


--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Richard Heathfield

Jul 24, 2010, 7:33:25 PM
Shao Miller wrote:
> On Jul 24, 6:42 am, Richard Heathfield <r...@see.sig.invalid> wrote:
>> The word "claim" is rather loaded. Peter /does/ have a wealth of
>> knowledge and experience. That doesn't mean he is necessarily right, but
>> it does mean that the probability of his being right is significantly
>> high. When I find myself disagreeing with Peter, it always gives me
>> pause for thought.
>>
> Is it possible that "claims" could be injected with additional meaning
> by the reader but not by the writer?

Yes, of course, and there's nothing the writer can do about that. That's
why we have to be careful how we write, if we do not wish to be
misunderstood. (That applies to everyone, in all contexts, not just in
comp.lang.c, and it applies as much to me as it does to you or to anyone
else.)


> If I don't fully know the whole
> picture regarding everyone's status, can I reasonably say "has"
> instead of "claims to have"?

Status is as status does. People acquire "status" (if that's the right
word) in comp.lang.c by posting good, sound advice, by entering civilly
into technical discussions and demonstrating their knowledge of C in
those discussions. (That doesn't mean they are always right, of course.
But the way in which a person "loses" a discussion can tell you much
about them, good or bad.) And, in general, those who are best at C tend
to be those who aren't actually all that interested in status. What
they're interested in is C. But there are times when a poster's personal
experience is highly relevant. For example, Peter Seebach ("Seebs") was
for some years an active member of the ISO committee responsible for
updating the C language specification. (So was Larry Jones. He will no
doubt correct me if I am wrong, but I *think* he's still on the
committee.) Such people are not guaranteed to be right by any means, but
proving them wrong requires care and skill!

You get to know, after a while, who tends to be right more often than
not. Seebs is right more often than not. So is Keith Thompson. So is
Eric Sosman. So is David Thompson. (Non-exhaustive list.) If you are
ever lucky enough to see the return of Lawrence Kirby or Chris Torek or
Steve Summit, you will find that they are hardly ever wrong (but "hardly
ever" is not "never").

That's one reason that newcomers to a newsgroup (*any* newsgroup) do
well to "lurk" (read, but not post) for a fairly long time - when Usenet
was first becoming popular, six months was the general recommendation.
In practice, most people don't do that, and it is hardly rare for them
to regret posting prematurely.

By reading the group regularly, as I say, you get to know who are the
people who actually know the language. As for how to deal with such
people, you could do worse than check out articles by "blmblm" (B L
Massingill, I think), who - although far from ignorant about C - would
perhaps not lay claim to deep expertise, but whose articles are
invariably polite and well-argued.

> In other words, I am a newcomer, here.
> I don't know any of you. It might be beneficial to my understanding
> of C to get to know some of you. :) But can I possibly know
> beforehand what magic words will set people off, such as "invent" and
> "claims"?

You can't, I suppose. But what you /can/ do is learn.

>
> Is it reasonable for me to simply take note of these words and avoid
> them in the future? Can I do so without feedback like yours,
> Richard? My answer would be "no." And so I thank you.

It ain't so much the words as the way how you use them. It comes with
experience.

>
> And also you have provided some evidence for the claim; this is
> recorded. Would you have offered that evidence without my use of
> "claims"?

Nope. No point in rebutting a non-existent statement.


> Does that last question suggest that I used "claims"
> intentionally towards that end? I didn't. It was meant as a simple
> statement of fact.
>
> "Claims" will be dropped from my vocabulary here now, surely. Do you
> happen to know of a nice list of words to avoid, like these ones?
> That would be great. :)

It isn't so much a word-list as an exercise in objectivity. It can be
quite difficult to stand back from the text you write and see it as
others will see it. But it is a worthwhile exercise, nonetheless.

Richard Heathfield

Jul 24, 2010, 7:49:35 PM
Shao Miller wrote:

<snip>

> 01. void foo(void) {
> 02. return;
> 03. }
> 04.
> 05. int main(void) {
> 06. int i = 13;
> 07. void *v = &i;
> 08. (void)13;
> 09. foo();
> 10. *v;
> 11. return 0;
> 12. }
>
> When line 08 is evaluated by a conforming implementation, is the
> behaviour well-defined to produce a result for the expression
> '(void)13'?

Line 08 works like this. 13 (which is of type int) is evaluated. The
value thus yielded is used as an operand for the cast operator. The cast
is to void, so the value is discarded. It's hard to argue that there is
any "result" here - the compiler is perfectly at liberty to ignore the
line completely, since there are no side effects.

> Does that definition define both a type for the result as
> well as a value for the result?

There is no definition on line 08. The expression (void)13 has type
void, and this type "comprises an empty set of values", so it is
meaningless to talk about a "value" for an expression of void type.

>
> When line 09 is evaluated by a conforming implementation, is the
> behaviour well-defined to produce a result for the expression
> 'foo()'? Does that definition define both a type for the result as
> well as a value for the result?

See above.

>
> Using a conforming implementation, is the evaluation of every
> expression within a strictly conforming program well-defined to
> produce a result?

What do you mean by "result" here? A value? If so, the answer is "no".
Your line 09 is an example of an expression that produces no value.


> In this same circumstance, is that result well-
> defined to possess both a type as well as a value?

See above.

> When line 10 is evaluated by a conforming implementation, is the
> behaviour well-defined to produce a result for the expression '*v'?
> Does that definition define a type for the result?

Since void * cannot be dereferenced, the question does not arise.

Seebs

Jul 24, 2010, 11:42:00 PM
On 2010-07-24, Richard Heathfield <r...@see.sig.invalid> wrote:
> You get to know, after a while, who tends to be right more often than
> not. Seebs is right more often than not. So is Keith Thompson. So is
> Eric Sosman. So is David Thompson. (Non-exhaustive list.) If you are
> ever lucky enough to see the return of Lawrence Kirby or Chris Torek or
> Steve Summit, you will find that they are hardly ever wrong (but "hardly
> ever" is not "never").

My boss once found a bug in some of Chris Torek's code!

... once.

I haven't yet, although I have now at least once managed to sneak a bug
past his code review. (That said, my ability to sneak retroactively-obvious
bugs past code review has become something of a local legend.)

>> In other words, I am a newcomer, here.
>> I don't know any of you. It might be beneficial to my understanding
>> of C to get to know some of you. :) But can I possibly know
>> beforehand what magic words will set people off, such as "invent" and
>> "claims"?

> You can't, I suppose. But what you /can/ do is learn.

That said, I'm pretty sure the usual convention of using "invented" to mean
"made up" (and thus, not derived from reality, and thus implicitly dishonest)
is sufficiently widespread to not need special newsgroup-specific knowledge.

> It isn't so much a word-list as an exercise in objectivity. It can be
> quite difficult to stand back from the text you write and see it as
> others will see it. But it is a worthwhile exercise, nonetheless.

Very much so.

As a quick starting point, just check the things you say to see whether they
would make any sense at all if you were confident the people you're talking
to were being honest. If they wouldn't, people will reasonably assume you
to be implying that they are dishonest.

Accusing someone of having "invented" something right after they've said
that a given text communicates it implies clearly that the text didn't
communicate that at all -- meaning it implies that they're being dishonest.

Interesting side note: Several of our protagonist's problems with the
Standard reflect a similar problem -- not being aware of the logical
implications of what is said and what is unsaid. If I were going to try to
address such a thing, I'd start by studying the Gricean Maxims, because
they're the underlying substrate over which words create meaning.

Shao Miller

Jul 25, 2010, 3:40:29 AM
On Jul 24, 7:30 pm, Keith Thompson <ks...@mib.org> wrote:
>
> I don't understand the question.  The expression statement in question
> was
>     (void)13;
> Determining the type of (void)13 is just one of the thousands
> of things an implementation is required to do.
>
Absolutely agreed that it is just one of those things. The challenge
I perceive from this code is:

Why is a cast to 'void' well-defined but "dereferencing" a 'void *'
not well-defined?

If I'm not mistaken, an expression with 'void' type has no value
(6.3.2.2,p1 "nonexistent" , 6.2.5,p19 "empty set"). When we read
about the "Cast operators" (6.5.4), what I directly observe from its
text is that this operator "converts the value of the expression to
the named type." The named type in '(void)13' is, of course, 'void'.
We know that an expression with 'void' type has no value, so that must
mean that the value is discarded during the conversion of the value,
would you agree?

Now we look at the text for the unary '*' operator (6.5.3.2,p4).
There we see a definition for the type of the result (of an
evaluation), just as we do for casting, would you agree? Thus if the
operand has type pointer-to-void, it suggests that the result has type
'void'. I would suggest that we must use the very same reasoning as
we do in our interpretation of cast operators to conclude that the
result is thus a void-expression (6.3.2.2,p1). The sentences
describing the result if pointing to an object and pointing to a
function do not apply. Nonetheless, the text appears to define the
result of evaluating application of unary '*' to an expression with
type pointer-to-void as being a result with type 'void'. A result
with type 'void' can be considered a void expression just as much as
the cast can be, can it not?

Same thing with a "function returning void" (6.5.2.2,p1). Its
evaluation is defined to be a result with type 'void' (6.5.2.2,p5).

Why is it that at least the C implementation named "GCC" appears to
distinguish between these three scenarios? Do other implementations,
as well?

My experience and my "common sense" suggests that "you cannot
dereference a 'void *'." Unfortunately, the text of the referenced
draft does not make that explicit (as far as I've yet been able to
determine; 6.5.3.2). Then thinking it through, it appears that
there's really _no_need_ for such indirection to yield UB. It could
simply be another form of void expression, like the two others.

If we can agree on this, what implications might there be for existent
implementations? I can only perceive a change to treat the behaviour
(of dereferencing a pointer-to-void) as well-defined, rather than
undefined. That really doesn't seem like a big deal, to me.

If we forget about _any_ of the debate regarding the "...has been
assigned..." business and accept it to mean what a quick glance might
suggest, we would _still_ have UB if the operand was a null pointer
value, if we additionally accept the non-normative footnote regarding
a null pointer being an invalid value. No gigantic implications for
implementations, there.

I fear that someone might respond as though the "points to an object"
and "points to a function" combined sentence somehow has some type of
priority over the following "has type" sentence... But _please_ note
the lack of "shall"s and "shall not"s.

I also fear that someone might respond that any 'void *' value would
be an invalid value, because such a value neither points to an object
nor to a function. That argument would also ignore the equal
precedence of the sentence regarding "has type".

Direct comparison with the "cast operators" might be useful. The text
does not define a value when the conversion is to type 'void'. But
rather than being undefined behaviour, we know from other parts of the
draft that 'void' represents no value(s). So defining the type of an
expression to be 'void' essentially defines the expression to be a
void expression. I cannot see how this could be any different for the
unary '*' operator (which we might have to temporarily detach our
familiarity with in order to study in this regard).

Please take your time to consider the implications before responding.
I am very hopeful for an agreement here, but assume that if major
overhauls and negative implications might be perceived, that an
agreement is less likely to happen. I really don't see how:

int x;
void *p = &x;
*p;

could be a huge deal as well-defined behaviour.

It is also possible that someone will point out a reference in the
text that I've read several times but have interpreted differently,
which makes the above proposal impossible. I'm happy either way.

Shao Miller

Jul 25, 2010, 4:14:31 AM
On Jul 24, 7:33 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
> ... ... ...

> > And also you have provided some evidence for the claim; this is
> > recorded.  Would you have offered that evidence without my use of
> > "claims"?
>
> Nope. No point in rebutting a non-existent statement.
>
You might have misunderstood me, here. The "evidence" I meant to
refer to was another person offering that Peter _does_ have a wealth
of knowledge and experience for C. Regardless of this evidence, to be
honest, it was never a question to me. I have been operating under
the _assumption_ that _any_ intelligent response received in this
forum (as Peter Seebach's posts after his first are) is that of a
seasoned C developer. The benefit of the doubt is there; my
expectation would be for a responder to have to establish themselves
as _less_than_that_ before I would begin to associate less
credibility. Additionally, I shall not mistakenly associate personal
incompatibilities with a lack of C "seasoning".

>
> ... ... ...
>
I agree 100% with the entirety of this response post of yours,
Richard. It is, in my opinion, a worth-while read for any newcomer in
my situation. I will try to keep what you have stated in mind and
consider your references. I can't thank you enough, really. I did
not particularly anticipate these experiences when posting about
interests in C.

I'll also not be drawn into personal back-and-forth, because my focus
here is C.

It's possible that part of what has landed me in "hot water" here is
my expectation that discussion could take the form of directly
addressing a response's points in a manner of debate. Also, that I
sometimes attempt to calibrate my responses based on inferences about
the posters. This can obviously backfire, big-time. For example, one
way to establish rapport might be to mimic some of the attitudes
perceived as being possessed by a discussant. If a poster appears to
me to demonstrate authority without references, I might respond in the
same tone, insofar as it doesn't violate any of my personal
constraints for civil discussion. It's quite possible to fail to
determine what constitutes civil discussion in the perspective of the
other poster.

Anyway, back to C... Because that's what the subject-at-hand is... :)

Richard Heathfield

Jul 25, 2010, 4:18:09 AM
Shao Miller wrote:
> On Jul 24, 7:30 pm, Keith Thompson <ks...@mib.org> wrote:
>> I don't understand the question. The expression statement in question
>> was
>> (void)13;
>> Determining the type of (void)13 is just one of the thousands
>> of things an implementation is required to do.
>>
> Absolutely agreed that it is just one of those things. The challenge
> I perceive from this code is:
>
> Why is a cast to 'void' well-defined but "dereferencing" a 'void *'
> not well-defined?

Because the Standard defines the one but not the other. It's not clear
what you're trying to get at here.


> If I'm not mistaken, an expression with 'void' type has no value
> (6.3.2.2,p1 "nonexistent" , 6.2.5,p19 "empty set"). When we read
> about the "Cast operators" (6.5.4), what I directly observe from its
> text is that this operator "converts the value of the expression to
> the named type." The named type in '(void)13' is, of course, 'void'.
> We know that an expression with 'void' type has no value, so that must
> mean that the value is discarded during the conversion of the value,
> would you agree?

13 has a value. (void)13 has no value. Neither, it appears, does your point.

<snip>

Richard Heathfield

Jul 25, 2010, 4:22:15 AM
Shao Miller wrote:

<snip>

> The benefit of the doubt is there;

Yes, but you are rapidly using it up.


<snip>

> It's possible that part of what has landed me in "hot water" here is
> my expectation that discussion could take the form of directly
> addressing a response's points in a manner of debate.

No, that's the usual way of things in clc, so that is not the
explanation for the high temperature of your entry into the group.

<snip>

Shao Miller

Jul 25, 2010, 4:59:42 AM
On Jul 25, 4:18 am, Richard Heathfield <r...@see.sig.invalid> wrote:
>
> Because the Standard defines the one but not the other. It's not clear
> what you're trying to get at here.
>
I cannot refer to the standard at this time, I'm afraid. I can only
refer to the draft with filename 'n1256.pdf'. In the rest of my post
I have provided quite a bit of detail regarding my interpretation that
this draft _does_ fully define the result of evaluating the (sole)
unary-expression on the third line below of something like:

int i;
void *v = &i;
*v;

>
> 13 has a value. (void)13 has no value. Neither, it appears, does your point.
>

Did you read the rest of the post? If so, it would be beneficial to
me if you would address exactly where you perceive faults in my
reasoning about the subject matter.

"The Standard defines the one but not the other" and the implication
that my point has no value really do not help me. My post provides
detail for how the draft _does_ define both. I have even met with
agreement on this point by another discussant in another C-devoted
forum, whose pedantry and accuracy I have historically valued.

Why is it that you have snipped so much of my post instead of pointing
out statements you disagree with?

My point is that there is a form of void expression in C which is not
commonly considered. I have observed at least one implementation to
print a warning, where I see no trouble. I have reasoned that this
and any other implementation's treatment of this form of void
expression may need addressing.

Is that clear?

Shao Miller

Jul 25, 2010, 5:03:45 AM
On Jul 25, 4:22 am, Richard Heathfield <r...@see.sig.invalid> wrote:

> Shao Miller wrote:
>
> > The benefit of the doubt is there;
>
> Yes, but you are rapidly using it up.
>
I am sorry to report that I do not understand this statement. Could
you please clarify what you mean here?

>
> > It's possible that part of what has landed me in "hot water" here is
> > my expectation that discussion could take the form of directly
> > addressing a response's points in a manner of debate.
>
> No, that's the usual way of things in clc, so that is not the
> explanation for the high temperature of your entry into the group.
>

You have not provided any alternative explanation, but that's fine. I
shall continue to focus on my concerns with C.
