
Indeterminate values and Undefined behaviour - anyone "know" ?


Roger Onslow

Nov 19, 1997, 3:00:00 AM

There appears to be some confusion (or at least discussion on
interpretation) about what constitutes 'undefined behaviour' in relation to
'indeterminate values' according to the standard. Mostly this has been in
relation to what happens to a pointer after 'free' or 'fclose'.

The standard says that undefined behaviour includes the use of an object
with indeterminate value (note: it says use of the object, NOT use of the
indeterminate value).

I'd like to see if there is a consensus on ALL (or any :-) of the following
examples.

For all these, assume we have..

int* p;

and the value of p is currently indeterminate (either not initialised yet,
or just after a call to 'free' - whatever).

Which of the following expressions would officially (by the standard) result
in 'undefined behaviour'? This does NOT mean which ones actually work in
some particular implementation.

p

*p

&p

*&p

p == NULL

p = NULL

(unsigned char*)p

*(unsigned char*)p

(unsigned char*)&p

*(unsigned char*)p

sizeof(p)

sizeof(*p)

memcpy(q, p, sizeof(int)) /* where q is an int* as well */

memcpy(&q, &p, sizeof(int*)) /* where q is an int* as well */

One way of interpreting that part of the standard would say that ALL of
these are undefined because they 'use' an object of indeterminate value - but
that cannot be right, because then you could never assign a value. Does
'use' mean "use as an r-value"?
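The "could never assign a value" point can be made concrete. The following is a minimal sketch, assuming the reading that only fetching the stored value of p counts as a 'use'; the function name store_without_read is hypothetical and not from the standard or this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: touches an uninitialised pointer object only in
 * ways that never fetch its stored value, then gives it a value. */
int *store_without_read(void)
{
    int *p;                                      /* value indeterminate here */
    int **pp = &p;                               /* address-of: no read of p */
    unsigned char *bytes = (unsigned char *)&p;  /* converts the address only */

    /* Both pointers designate the start of the same object. */
    assert((void *)bytes == (void *)pp);

    p = NULL;      /* a store: the old contents are never fetched */
    return *pp;    /* reading p is fine now, it holds a null pointer */
}
```

Under that reading, every line before the assignment names the object p without reading it, and the read in the return statement happens only after p has a determinate value.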

I hope to hear some interesting and informed replies to this.

Roger


Peter Seebach

Nov 19, 1997, 3:00:00 AM

In article <#glFW$H98G...@geraldo.newaygo.mi.us>,

Roger Onslow <Roger_Ons...@compsys.com.au> wrote:
>and the value of p is currently indeterminate (either not initialised yet,
>or just after a call to 'free' - whatever).

I would argue that those cases should, some day, be different, although
they aren't now.

> p
> *p
UD
> &p
NUD


> *&p
> p == NULL
> p = NULL
> (unsigned char*)p
> *(unsigned char*)p

UD
> (unsigned char*)&p
NUD
> *(unsigned char*)p
UD
> sizeof(p)
NUD
> sizeof(*p)
NUD


> memcpy(q, p, sizeof(int)) /* where q is an int* as well */

UD


> memcpy(&q, &p, sizeof(int*)) /* where q is an int* as well */

UD

The last one is controversial; I say that the library gets no dispensation
here; memcpy doesn't say it can copy indeterminately valued objects, therefore
it can't.

>One way of interpreting that part of the standard would say that ALL of
>these are undefined because they 'use' an object of indeterminate value - but
>that cannot be right, because then you could never assign a value. Does
>'use' mean "use as an r-value"?

The reason the sizeof's aren't undefined is that sizeof doesn't (in C89)
evaluate its argument. (In C9X, it may somewhat in VLA contexts.) Taking
the address of an object is not using the object. Assigning to an object
is not using it.
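A small sketch of the sizeof point, assuming no variable length arrays are involved; the function name operand_after_sizeof is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical demo: returns i after it has appeared inside sizeof. */
int operand_after_sizeof(void)
{
    int i = 0;
    int *p = NULL;
    size_t a = sizeof(i++);   /* i++ is not executed: operand unevaluated */
    size_t b = sizeof *p;     /* no dereference: only the type int matters */

    assert(a == sizeof(int) && b == sizeof(int));
    return i;                 /* i was never incremented */
}
```

Some compilers warn about the side effect inside sizeof, which is the same observation: the operand contributes a type and nothing else.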

I think you've basically got it - use as an rvalue is a good description.

-s
--
se...@plethora.net -- I am not speaking for my employer. Copyright '97
All rights reserved. This was not sent by my cat. C and Unix wizard -
send mail for help, or send money for a consultation. Visit my new ISP
<URL:http://www.plethora.net/> --- More Net, Less Spam! Plethora . Net

Lawrence Kirby

Nov 19, 1997, 3:00:00 AM

In article <64usco$prh$1...@darla.visi.com>
se...@plethora.net "Peter Seebach" writes:

>In article <#glFW$H98G...@geraldo.newaygo.mi.us>,
>Roger Onslow <Roger_Ons...@compsys.com.au> wrote:
>>and the value of p is currently indeterminate (either not initialised yet,
>>or just after a call to 'free' - whatever).
>
>I would argue that those cases should, some day, be different, although
>they aren't now.

Do you mean separate out indeterminate value from indeterminate bit pattern,
or indeterminate value as seen through a different type of lvalue? It isn't
clear that they aren't different now, or to put it another way it isn't
clear that they are the same.

>> p
>> *p
>UD
>> &p
>NUD
>> *&p
>> p == NULL
>> p = NULL
>> (unsigned char*)p
>> *(unsigned char*)p
>UD

I certainly hope that p = NULL is NUD! (I have of course included <stddef.h>)

>> (unsigned char*)&p
>NUD
>> *(unsigned char*)p
>UD
>> sizeof(p)
>NUD
>> sizeof(*p)
>NUD
>> memcpy(q, p, sizeof(int)) /* where q is an int* as well */
>UD
>> memcpy(&q, &p, sizeof(int*)) /* where q is an int* as well */
>UD
>
>The last one is controversial; I say that the library gets no dispensation
>here; memcpy doesn't say it can copy indeterminately valued objects, therefore
>it can't.

I agree that the library gets no dispensation because the definition of the
behaviour of functions like memcpy() is in terms of copying characters, so
it is defined if and only if copying characters is defined.

The argument seems to be that unsigned char is in some sense special in that
it can represent as valid values all bit combinations in a byte. Whether
this is true or not is a matter for debate in itself but assume for the
moment that it is. The argument then seems to be that since any bit pattern
in the object represents a valid value reading the value can't result in
undefined behaviour. I don't agree with this, if the standard says that
an operation results in undefined behaviour then it does just that. For
example a compiler could recognise that the value of an indeterminate object
is being read and generate code to abort when that happens.
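To make "defined in terms of copying characters" concrete, here is a hedged sketch of a byte-wise copy. copy_bytes is a hypothetical stand-in, not the library's memcpy, and the example deliberately copies a pointer object whose value is determinate, so it takes no side in the disagreement above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for memcpy on non-overlapping objects: every
 * access goes through an lvalue of type unsigned char. */
void *copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(dst != NULL && src != NULL);
    while (n-- > 0)
        *d++ = *s++;          /* one character read, one character store */
    return dst;
}

/* Copies the representation of one pointer object into another. */
int *copy_pointer_object(int *p)
{
    int *q;

    copy_bytes(&q, &p, sizeof p);   /* p itself is never read as an int* */
    return q;
}
```

Whether the same loop is defined when the source bytes belong to an indeterminately valued object is exactly the question being argued; the sketch only shows which reads are involved.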

>>One way of interpreting that part of the standard would say that ALL of
>>these are undefined because they 'use' an object of indeterminate value - but
>>that cannot be right, because then you could never assign a value. Does
>>'use' mean "use as an r-value"?
>
>The reason the sizeof's aren't undefined is that sizeof doesn't (in C89)
>evaluate its argument. (In C9X, it may somewhat in VLA contexts.) Taking
>the address of an object is not using the object. Assigning to an object
>is not using it.

The last is stretching it as far as the wording is concerned. However,
it must be the intent; not being able to assign to an indeterminately
valued object would be untenable in practice. Is there any reason not to
amend 3.16 from "or of indeterminately valued objects" to "or of
indeterminate values"? The problem is with the value, not any object it
may or may not be stored in.

--
-----------------------------------------
Lawrence Kirby | fr...@genesis.demon.co.uk
Wilts, England | 7073...@compuserve.com
-----------------------------------------


Peter Seebach

Nov 19, 1997, 3:00:00 AM

In article <879956...@genesis.demon.co.uk>,

Lawrence Kirby <fr...@genesis.demon.co.uk> wrote:
>>> p == NULL
>>> p = NULL
>>> (unsigned char*)p
>>> *(unsigned char*)p
>>UD

>I certainly hope that p = NULL is NUD! (I have of course included <stddef.h>)

Y'know, some days, everything you put on a plate looks like crow.

*sigh*.

>I agree that the library gets no dispensation because the definition of the
>behaviour of functions like memcpy() is in terms of copying characters, so
>it is defined if and only if copying characters is defined.

Yes.

>The argument seems to be that unsigned char is in some sense special in that
>it can represent as valid values all bit combinations in a byte. Whether
>this is true or not is a matter for debate in itself but assume for the
>moment that it is. The argument then seems to be that since any bit pattern
>in the object represents a valid value reading the value can't result in
>undefined behaviour. I don't agree with this, if the standard says that
>an operation results in undefined behaviour then it does just that. For
>example a compiler could recognise that the value of an indeterminate object
>is being read and generate code to abort when that happens.

Right. This is clarified (not very well) in C9X, where we have "trap
representations". See, there's two *unrelated* reasons for reads
of indeterminately-valued objects to be undefined behavior.

1. Because We Said So, with the intent that an implementor could
legitimately trap references to uninitialized data, and with the
understanding that, by definition, such are programming errors.
2. Because some implementations cannot be reasonably implemented
if they are required to handle invalid objects, especially pointers
or floating point.

Hmm.

Douglas A. Gwyn

Nov 19, 1997, 3:00:00 AM

In article <#glFW$H98G...@geraldo.newaygo.mi.us>,
"Roger Onslow" <Roger_Ons...@compsys.com.au> wrote:
[p has an indeterminate value]
> &p
> (unsigned char*)&p
> sizeof(p)
> sizeof(*p)

> memcpy(&q, &p, sizeof(int*)) /* where q is an int* as well */

The above are allowed in a strictly conforming program;
the others are not (because they require evaluation of p).

Incidentally, you don't need the parentheses in the sizeof operands.


Kaz Kylheku

Nov 20, 1997, 3:00:00 AM

In article <#glFW$H98G...@geraldo.newaygo.mi.us>,
Roger Onslow <Roger_Ons...@compsys.com.au> wrote:
>There appears to be some confusion (or at least discussion on
>interpretation) about what constitutes 'undefined behaviour' in relation to
>'indeterminate values' according to the standard. Mostly this has been in
>relation to what happens to a pointer after 'free' or 'fclose'.
>
>The standard says that undefined behaviour includes the use of an object
>with indeterminate value (Note - it says use of the object, NOT use of the
>indeterminate value)

Actually I could cite several quotes from the standard which contradict this
directly.

For example, after an fclose, it is the value of the pointer that is
indeterminate. It is not written that any object is indeterminate.
All existing copies of that pointer are indeterminate no matter where
they are, even if they are not stored in objects.

Thus, in the expression

fclose(p) + p;

the value of 'p' on the right side is indeterminate, even if it is fetched
prior to the call to fclose(). The call will invalidate even the intermediate
value that exists somewhere other than in a storage object.
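In practice the usual defence is to make sure the stale value is never read again. A minimal sketch, assuming the caller holds the only copy of the stream pointer; close_and_clear is a hypothetical helper, not a standard function.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: closes *fp and overwrites the caller's pointer,
 * so the value that fclose made indeterminate is never read afterwards.
 * Returns fclose's result, or EOF if there was nothing to close. */
int close_and_clear(FILE **fp)
{
    int rc;

    assert(fp != NULL);
    if (*fp == NULL)
        return EOF;
    rc = fclose(*fp);   /* *fp is read before the call, while still valid */
    *fp = NULL;         /* a store, not a read of the stale value */
    return rc;
}
```

This clears one object only. Any other copies of the pointer, including a value already fetched into a temporary as in the expression above, are untouched and remain indeterminate.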

Michael Norrish

Nov 20, 1997, 3:00:00 AM

se...@plethora.net (Peter Seebach) writes:

>> The argument seems to be that unsigned char is in some sense
>> special in that it can represent as valid values all bit
>> combinations in a byte. Whether this is true or not is a matter for
>> debate in itself but assume for the moment that it is. The argument
>> then seems to be that since any bit pattern in the object
>> represents a valid value reading the value can't result in
>> undefined behaviour. I don't agree with this, if the standard says
>> that an operation results in undefined behaviour then it does just
>> that. For example a compiler could recognise that the value of an
>> indeterminate object is being read and generate code to abort when
>> that happens.

> Right. This is clarified (not very well) in C9X, where we have "trap
> representations". See, there's two *unrelated* reasons for reads
> of indeterminately-valued objects to be undefined behavior.

But, but, but.

With this understanding, you can't use unsigned chars to pick structs
apart byte by byte because you can't tell if you might be about to
load some indeterminately valued bit of padding into your byte
variable.

If you're willing to give unsigned char special dispensation to not
suffer "trap" errors, then you can surely allow unsigned char lvalues
to access indeterminate values.
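A sketch of the byte-by-byte inspection in question, written so that it sidesteps the problem rather than settling it: every byte of the struct, padding included, is written before any byte is read. The struct layout and the names byte_sum and sum_of_cleared_sample are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sample {     /* hypothetical layout; padding after c is likely */
    char c;
    long n;
};

/* Sums the bytes of an object through unsigned char lvalues. */
unsigned long byte_sum(const void *obj, size_t size)
{
    const unsigned char *b = obj;
    unsigned long sum = 0;
    size_t i;

    assert(obj != NULL);
    for (i = 0; i < size; i++)
        sum += b[i];
    return sum;
}

/* Writes every byte first, so no padding byte is read while unwritten. */
unsigned long sum_of_cleared_sample(void)
{
    struct sample s;

    memset(&s, 0, sizeof s);   /* padding bytes now hold zero as well */
    /* Note: a later store to s.c or s.n may leave the padding bytes with
     * unspecified values again (stated explicitly from C99 onwards). */
    return byte_sum(&s, sizeof s);
}
```

Without the memset, the loop would read padding bytes that were never written, which is the case this post is worried about.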

But hey, if you're right the set of strictly conforming programs is
just smaller than I thought it was.

Michael.
-- Oh for a language where everything that was syntactically valid
-- had a meaning (Java and SML have this property I believe).


Roger Onslow

Nov 21, 1997, 3:00:00 AM

Kaz Kylheku wrote in message <651r6g$dmc$1...@helios.crest.nt.com>...

>In article <#glFW$H98G...@geraldo.newaygo.mi.us>,
>Roger Onslow <Roger_Ons...@compsys.com.au> wrote:
>>There appears to be some confusion (or at least discussion on
>>interpretation) about what constitutes 'undefined behaviour' in relation to
>>'indeterminate values' according to the standard. Mostly this has been in
>>relation to what happens to a pointer after 'free' or 'fclose'.
>>
>>The standard says that undefined behaviour includes the use of an object
>>with indeterminate value (Note - it says use of the object, NOT use of the
>>indeterminate value)
>
>Actually I could cite several quotes from the standard which contradict this
>directly.

You can?

Does that mean the standard is self-contradictory?

>For example, after an fclose, it is the value of the pointer that is
>indeterminate.

Yeup - that is correct. The standard says the value of the pointer is
INDETERMINATE.

It also defines the USE of an OBJECT with INDETERMINATE VALUE (ie. the
pointer) as being (one kind of) undefined behaviour.

>It is not written that any object is indeterminate.

I never said that. I (and the standard) say that the USE of an object with
indeterminate value (not an indeterminate object - which is not a term used
in the standard) is undefined behaviour.

It does NOT (interestingly enough) say anything about using the
indeterminate value (of course, I would ASSUME myself that that would result
in undefined behaviour, but that is not what the standard says (from my
reading))

>All existing copies of that pointer are indeterminate no matter where
>they are, even if they are not stored in objects.


That's right... but how does that "contradict this directly" ??

>Thus, in the expression
>
> fclose(p) + p;
>
>the value of 'p' on the right side is indeterminate, even if it is fetched
>prior to the call to fclose().

That's right - even other pointers to the same memory would be indeterminate

>The call will invalidate even the intermediate

>value...

The call doesn't invalidate it - the value is indeterminate - not INVALID
(the standard doesn't say the value HAS to change to something invalid, but
that its value is indeterminate - you (or your program) cannot determine
what it is).

>...that exists somewhere other than in a storage object.

The standard didn't say a 'storage object'; it said an 'object' with
indeterminate value. One could argue that the intermediate/temporary value
is an object without a name.

Any thoughts anyone??

BTW: You didn't address the list of expressions that I gave to say which
would be undefined behaviour. For the benefit of other readers, the list
was...

p
*p
&p
*&p


p == NULL
p = NULL
(unsigned char*)p
*(unsigned char*)p

(unsigned char*)&p
*(unsigned char*)p
sizeof(p)
sizeof(*p)

memcpy(q, p, sizeof(int)) /* where q is an int* as well */


memcpy(&q, &p, sizeof(int*)) /* where q is an int* as well */

where the int* p; has indeterminate value (either unassigned or after a call
to free(p) etc).

Roger


R S Haigh

Nov 21, 1997, 3:00:00 AM

In article <#glFW$H98G...@geraldo.newaygo.mi.us>, "Roger Onslow" <Roger_Ons...@compsys.com.au> writes:
> There appears to be some confusion (or at least discussion on
> interpretation) about what constitutes 'undefined behaviour' in relation to
> 'indeterminate values' according to the standard. Mostly this has been in
> relation to what happens to a pointer after 'free' or 'fclose'.
>
> [etc]

Supplementary question: is it safe to copy a struct by assignment if not
all members have been initialized? Does it make a difference if there
are uninitialized pointer members?
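One way to avoid having to answer the question is to make sure no member is ever uninitialised at the point of the copy. A minimal sketch; struct node and copy_node are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct node {            /* hypothetical struct for illustration */
    int key;
    struct node *next;
};

/* Builds a node and copies it by assignment. */
struct node copy_node(int key)
{
    struct node a = {0};  /* key is 0, next is a null pointer */
    struct node b;

    assert(a.next == NULL);
    a.key = key;
    b = a;                /* no member of a is indeterminate here */
    return b;
}
```

The `= {0}` initialiser gives every member not named explicitly the value it would have with static storage duration, so the pointer member starts out as a null pointer rather than as garbage.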

--


Tom Payne

Nov 21, 1997, 3:00:00 AM

In comp.std.c Roger Onslow <Roger_Ons...@compsys.com.au> wrote:

[...]
: I (and the standard) refer to the USE of an object with
: indeterminate value (not an indeterminate object - which is not a term used
: in the standard) is undefined behaviour.

: It does NOT (interestingly enough) say anything about using the
: indeterminate value (of course, I would ASSUME myself that that would result
: in undefined behaviour, but that is not what the standard says (from my
: reading))

[...]
: >Thus, in the expression
: >
: > fclose(p) + p;
: >
: >the value of 'p' on the right side is indeterminate, even if it is fetched
: >prior to the call to fclose().

: That's right - even other pointers to the same memory would be indeterminate

[...]
: (the standard doesn't say the value HAS to change to something invalid, but
: that its value is indeterminate - you (or your program) cannot determine
: what it is).

[...]
: The standard didn't say a 'storage object'; it said an 'object' with
: indeterminate value. One could argue that the intermediate/temporary value
: is an object without a name.

Perhaps provide a specific citation from the standard.

Appendix G.2 attempts to catalog the circumstances in which behavior
is undefined. What I found that seems relevant to your questions are
the following three items:

- An invalid array reference, null pointer reference, or reference
to an object declared with automatic storage duration in a
terminated block occurs (6.3.3.2).

- The value of an uninitialized object that has automatic storage
duration is used before a value is assigned (6.5.7).

- The value of a pointer that refers to space deallocated by a call
to free or realloc function is referred to (7.10.3).

Specifically, I did not find any mention of "indeterminate value".
Rather these prohibit, respectively:

- The occurrence of certain invalid pointer values.

- The use of the value of any uninitialized automatic object.

- Referring to certain other invalid pointer values.

Perhaps G.2 is incomplete.

Tom Payne

Lawrence Kirby

Nov 21, 1997, 3:00:00 AM

In article <SOztunm...@geraldo.newaygo.mi.us>
Roger_Ons...@compsys.com.au "Roger Onslow" writes:

...

>It does NOT (interestingly enough) say anything about using the
>indeterminate value (of course, I would ASSUME myself that that would result
>in undefined behaviour, but that is not what the standard says (from my
>reading))

I suspect that there isn't any situation where you create an indeterminate
value other than in an object. In the case of not returning a value
from a non-void function 6.6.6.4 bypasses the indeterminate value issue
completely by simply saying:

"If a return statement without an expression is executed, and the value of
the function is used by the caller, the behaviour is undefined."

...

>The call doesn't invalidate it - the value is indeterminate - not INVALID
>(the standard doesn't say the value HAS to change to something invalid, but
>that its value is indeterminate - you (or your program) cannot determine
>what it is).
>
>>...that exists somewhere other than in a storage object.
>
>The standard didn't say a 'storage object'; it said an 'object' with
>indeterminate value. One could argue that the intermediate/temporary value
>is an object without a name.
>
>Any thoughts anyone??

Don't do that - a value is *not* an object. Anyway as I noted above I don't
think this is necessary since only values stored in objects can be
indeterminate.

Tom Payne

Nov 21, 1997, 3:00:00 AM

In comp.std.c Nick Maclaren <nm...@cus.cam.ac.uk> wrote:
: In article <654osc$oi5$1...@skylark.ucr.edu>, Tom Payne <t...@cs.ucr.edu> wrote:
: >
: >Appendix G.2 attempts to catalog the circumstances in which behavior
: >is undefined. What I found that seems relevant to your questions are
: >the following three items: ...
: >
: >Specifically, I did not find any mention of "indeterminate value".
: >
: >Perhaps G.2 is incomplete.

: When I looked into this area, I convinced myself that the C standard DID
: include the concept of "indeterminate value", but only implicitly. G.2
: describes only one of the half-dozen forms of "undefined" that occur in
: the standard - the others are usually undefined by implication, rather
: than by specification.


The apparent prohibition on read access to pointer objects holding
invalid values seems to be a special case, introduced into the
language to accommodate architectures that trap such access to certain
registers. From first principles, I would expect the use of an
"indeterminate value" in evaluating an expression to yield an
indeterminate value but not *necessarily* lead to "undefined
behavior." In fact, if the indeterminate value were, say, the left
operand of a comma operator, I'd expect it to have no significant
effect.

Tom Payne

Nick Maclaren

Nov 21, 1997, 3:00:00 AM

In article <654osc$oi5$1...@skylark.ucr.edu>, Tom Payne <t...@cs.ucr.edu> wrote:
>
>Appendix G.2 attempts to catalog the circumstances in which behavior
>is undefined. What I found that seems relevant to your questions are
>the following three items: ...
>
>Specifically, I did not find any mention of "indeterminate value".
>
>Perhaps G.2 is incomplete.

When I looked into this area, I convinced myself that the C standard DID
include the concept of "indeterminate value", but only implicitly. G.2
describes only one of the half-dozen forms of "undefined" that occur in
the standard - the others are usually undefined by implication, rather
than by specification.

One of the few places that the word "indeterminate" is used explicitly
is the return value from a valid call to signal() that returns SIG_ERR
when within a signal handler. But there are a fair number of others
where it is implicit (e.g. isdigit('0')) or determinate in strange ways
(e.g. the return value from ftell() on a text stream).


Nick Maclaren,
University of Cambridge Computer Laboratory,
New Museums Site, Pembroke Street, Cambridge CB2 3QG, England.
Email: nm...@cam.ac.uk
Tel.: +44 1223 334761 Fax: +44 1223 334679

Nick Maclaren

Nov 22, 1997, 3:00:00 AM

In article <654vo9$653$1...@skylark.ucr.edu>, Tom Payne <t...@cs.ucr.edu> wrote:
>
>The apparent prohibition on read access to pointer objects holding
>invalid values seems to be a special case, introduced into the
>language to accommodate architectures that trap such access to certain
>registers. From first principles, I would expect the use of an
>"indeterminate value" in evaluating an expression to yield an
>indeterminate value but not *necessarily* lead to "undefined
>behavior." In fact, if the indeterminate value were, say, the left
>operand of a comma operator, I'd expect it to have no significant
>effect.

I failed to find ANY use of the "indeterminate" concept that WASN'T a
special case! Yes, I agree with your summary - well, almost. There
are at least the following types of indeterminacy in the standard:

Indeterminate, but valid and meaningful (in some sense)
Indeterminate and meaningless, but valid and consistent
Indeterminate, meaningless and inconsistent, but still valid
Indeterminate and invalid, but consistent and usable
Indeterminate, invalid and inconsistent, but usable
Indeterminate, and possibly even unusable in itself

The difference between "invalid" and "unusable" can apply to all data
types, but is really only implied by the standard for pointers. A
pointer to a non-object (perhaps something that WAS an object) can be
tested against NULL - or so I read the standard. A pointer value that
is wholeheartedly unusable cannot even be tested against NULL or (your
example) delivered as the first operand of a comma operator.

Remember that some architectures will raise an exception when storing a
value that is completely invalid - and that this is NOT restricted to
floating-point. From a software engineering point of view, it would
be wonderful to see such architectures become the norm, though things
like most windowing systems would die horribly until their specifications
and coding were improved.

Roger_...@compsys.com.au

Nov 22, 1997, 3:00:00 AM

In article <64tusn$p...@sjx-ixn9.ix.netcom.com>,

Douglas A. Gwyn <gw...@ix.netcom.com> wrote:
>
> In article <#glFW$H98G...@geraldo.newaygo.mi.us>,
> "Roger Onslow" <Roger_Ons...@compsys.com.au> wrote:
> [p has an indeterminate value]
> > &p
> > (unsigned char*)&p
> > sizeof(p)
> > sizeof(*p)
> > memcpy(&q, &p, sizeof(int*)) /* where q is an int* as well */
>
> The above are allowed in a strictly conforming program;
> the others are not (because they require evaluation of p).

But the standard says "use" of an OBJECT with indeterminate value - not
evaluation of an indeterminate value, or even evaluation of an
_object_ with indeterminate value.

I agree, of course, that sizeof (in C) is not a use because it is a
compile time action - so neither object nor its value are used - only the
type.

[I really was putting in some 'trick' questions here ;-)]

> Incidentally, you don't need the parentheses in the sizeof operands.

I know - I just do it out of habit - just like some people use

return (value);

mind you, I guess you could use that to make return a macro and do
something special like debug output on return.

Roger

-------------------==== Posted via Deja News ====-----------------------
http://www.dejanews.com/ Search, Read, Post to Usenet

Tom Payne

Nov 23, 1997, 3:00:00 AM

In comp.std.c Nick Maclaren <nm...@cus.cam.ac.uk> wrote:

: I failed to find ANY use of the "indeterminate" concept that WASN'T a
: special case! Yes, I agree with your summary - well, almost. There
: are at least the following types of indeterminacy in the standard:

: Indeterminate, but valid and meaningful (in some sense)
: Indeterminate and meaningless, but valid and consistent
: Indeterminate, meaningless and inconsistent, but still valid
: Indeterminate and invalid, but consistent and usable
: Indeterminate, invalid and inconsistent, but usable
: Indeterminate, and possibly even unusable in itself

Perhaps C/C++ programmers need many words for "indeterminacy," as the
Eskimos do for "snow". From the implementation perspective:

-- There are objects that contain, for whatever reason, a bit pattern
that does not encode a value in the *current* range of their declared
type. This is a static concept for types like float, but it is
dynamic for types like pointer and reference. For such types,
the same bit pattern can be valid one instant and not the next.
(Can it go the other way, as well?)

-- There are objects that are uninitialized, but happen to have a valid
(or invalid) bit pattern, quite by accident. In any case the programmer
doesn't have a useful way to deal with such entities.

-- There are objects that contain values that have carefully been tailored
to be random.

The question in my mind is the relationship between these various
kinds of indeterminacy and various kinds of indeterminacy of behavior.

: The difference between "invalid" and "unusable" can apply to all data
: types, but is really only implied by the standard for pointers. A
: pointer to a non-object (perhaps something that WAS an object) can be
: tested against NULL - or so I read the standard. A pointer value that
: is wholeheartedly unusable cannot even be tested against NULL or (your
: example) delivered as the first operand of a comma operator.

Right. As I understand it, the Standard has been deliberately crafted
so that any read access to a dangling pointer object begets undefined
behavior. That's a bit of a special case in that a pointer value
might be valid when it is stored but invalid when it is read. The
standard seems to prohibit individually each of the possible ways of
obtaining an "invalid value," which automatically prohibits storing
an invalid value.
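The "valid when stored, invalid when read" case turns up routinely with realloc. A hedged sketch of one common way to stay clear of it: remember a position as an index, which stays meaningful across the call, rather than as an interior pointer. The names grow and value_after_grow are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: doubles an int buffer of old_count elements.
 * Returns the new block, or NULL with the old block left untouched. */
int *grow(int *buf, size_t old_count)
{
    assert(buf != NULL && old_count > 0);
    return realloc(buf, 2 * old_count * sizeof *buf);
}

/* Stores a value, grows the buffer, and reads the value back by index. */
int value_after_grow(void)
{
    size_t count = 4, where = 2;      /* index saved instead of &buf[2] */
    int *buf = malloc(count * sizeof *buf);
    int *bigger, result;

    if (buf == NULL)
        return -1;                    /* allocation failed, nothing to show */
    buf[where] = 99;
    bigger = grow(buf, count);
    if (bigger == NULL) {             /* old block is still valid on failure */
        result = buf[where];
        free(buf);
        return result;
    }
    result = bigger[where];           /* the old value of buf is not read */
    free(bigger);
    return result;
}
```

After a successful realloc the variable buf still holds the same bits, but under the reading given above it may no longer be read at all, which is why the code switches to bigger and never compares the two.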

Tom Payne

Kaz Kylheku

Nov 23, 1997, 3:00:00 AM

In article <SOztunm...@geraldo.newaygo.mi.us>,

Roger Onslow <Roger_Ons...@compsys.com.au> wrote:
>Kaz Kylheku wrote in message <651r6g$dmc$1...@helios.crest.nt.com>...
>>>The standard says that undefined behaviour includes the use of an object
>>>with indeterminate value (Note - it says use of the object, NOT use of the
>>>indeterminate value)
>>
>>Actually I could cite several quotes from the standard which contradict this
>>directly.
>
>You can?
>
>Does that mean the standard is self-contradictory?
>
>>For example, after an fclose, it is the value of the pointer that is
>>indeterminate.
>
>Yeup - that is correct. The standard says the value of the pointer is
>INDETERMINATE.
>
>It also defines the USE of an OBJECT with INDETERMINATE VALUE (ie. the
>pointer) as being (one kind of) undefined behaviour.

But objects don't *have* value! They can only *represent* values. A value
arises when an object is interpreted through an expression. The expression's
type implies the representation that is used to interpret a value.

>>It is not written that any object is indeterminate.
>
>I never said that. I (and the standard) refer to the USE of an object with
>indeterminate value (not an indeterminate object - which is not a term used
>in the standard) is undefined behaviour.
>
>It does NOT (interestingly enough) say anything about using the
>indeterminate value (of course, I would ASSUME myself that that would result
>in undefined behaviour, but that is not what the standard says (from my
>reading))

Use of the object's value and use of an object are really the same thing.
Can you use an object without interpreting its value?

It has already been acknowledged in the earlier debates that the standard
is somewhat poorly worded in these matters. At one point it actually implies
that an object has an inherent value that exists independently of the
expression used to access that object.

>The call doesn't invalidate it - the value is indeterminate - not INVALID

Well, an indeterminate pointer is invalid as well. But I apologize for
bringing in that term!

>(the standard doesn't say the value HAS to change to something invalid, but
>that its value is indeterminate - you (or your program) cannot determine
>what it is).
>
>>...that exists somewhere other than in a storage object.
>
>The standard didn't say a 'storage object'; it said an 'object' with
>indeterminate value. One could argue that the intermediate/temporary value
>is an object without a name.

One could not reasonably argue that at all.

Douglas A. Gwyn

Nov 24, 1997, 3:00:00 AM

In article <64tusn$p...@sjx-ixn9.ix.netcom.com>,
Douglas A. Gwyn <gw...@ix.netcom.com> wrote:
>[p has an indeterminate value]
>> &p
>> (unsigned char*)&p
>> sizeof(p)
>> sizeof(*p)
>> memcpy(&q, &p, sizeof(int*)) /* where q is an int* as well */
>The above are allowed in a strictly conforming program;
>the others are not (because they require evaluation of p).

Oops, I missed
p = NULL
(just didn't notice it in the list). That too is allowed,
for the same reason as all the others: it doesn't require
evaluation of the indeterminately-valued prior contents of p.


Douglas A. Gwyn

Nov 24, 1997, 3:00:00 AM

In article <879956...@genesis.demon.co.uk>,

fr...@genesis.demon.co.uk (Lawrence Kirby) wrote:
>I don't agree with this, if the standard says that
>an operation results in undefined behaviour then it does just that.

Good thing that's not what it says for the example in question.


Douglas A. Gwyn

Nov 24, 1997, 3:00:00 AM

In article <651r6g$dmc$1...@helios.crest.nt.com>,

k...@helios.crest.nt.com (Kaz Kylheku) wrote:
>For example, after an fclose, it is the value of the pointer that is
>indeterminate. It is not written that any object is indeterminate.

>All existing copies of that pointer are indeterminate no matter where
>they are, even if they are not stored in objects.

That's an excellent illustration of the fact that it is the
*value* that is indeterminate, not any particular object qua
object. That (along with the property that all objects can be
inspected via aliasing to an array of unsigned char) is why
memcpy() of an indeterminately-valued* addressable object is
safe.

* The adjective "indeterminately-valued" is meaningless unless
the type is specified (or implicit from the context), because
the notion of the "value" of an object is meaningless without
knowing the associated type to be used to map bit patterns to
values.
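
As a rough illustration of the footnote, the sketch below copies an int into an array of unsigned char and back. The bytes alone are just bytes; only reading them back through the type int gives the int value again. The helper name round_trip_int is an assumption made for this example.

```c
#include <assert.h>
#include <string.h>

/* Copies the representation of an int into unsigned char storage and
   then back into an int object. */
int round_trip_int(int value)
{
    unsigned char bytes[sizeof(int)];
    int copy;

    assert(sizeof bytes == sizeof copy);
    memcpy(bytes, &value, sizeof value);  /* inspect the object as bytes */
    memcpy(&copy, bytes, sizeof copy);    /* the type int maps them to a value */
    return copy;
}
```

For any int argument the function returns the same value, because memcpy transfers the representation without interpreting it.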

There is an unrelated situation in which an *object* itself
can be invalid, for example an auto variable that has gone
out of scope. It is possible for the dead activation record
to be inaccessible. This doesn't usually happen on simple
architectures, but it could in a truly segmented architecture.


Douglas A. Gwyn

unread,
Nov 24, 1997, 3:00:00 AM11/24/97
to

In article <654slk$sls$1...@lyra.csx.cam.ac.uk>,

nm...@cus.cam.ac.uk (Nick Maclaren) wrote:
>One of the few places that the word "indeterminate" is used explicitly
>is the return value from a valid call to signal() that returns SIG_ERR
>when within a signal handler. But there are a fair number of others
>where it is implicit (e.g. isdigit('0')) or determinate in strange ways
>(e.g. the return value from ftell() on a text stream).

What do you think is indeterminate about the value of isdigit('0')?


Douglas A. Gwyn

unread,
Nov 24, 1997, 3:00:00 AM11/24/97
to

In article <654osc$oi5$1...@skylark.ucr.edu>,

Tom Payne <t...@cs.ucr.edu> wrote:
>Appendix G.2 attempts to catalog the circumstances in which behavior
>is undefined.

But it doesn't repeat the definition of what constitutes undefined
behavior, which definition does in fact say that accessing
indeterminate values is one way to obtain undefined behavior.
(The use of the word "object" in that context is unfortunate,
and perhaps the root of all this confusion. The important thing
is the value, so long as the object is still alive; outside their
lifetimes, objects *also* have indeterminate values, but that's a
more severe case in that not even aliasing to array of unsigned
char allows strictly-conforming access.)

Since the subject has come up, I recently completed a scan of the
C9x draft for the relevant keywords and updated Annex G (called
Annex K in the most recent C9x draft); the result (not yet official)
is posted at URLs ftp://ftp.arl.mil/arch/annex-k.ps (PostScript) and
ftp://ftp.arl.mil/arch/arch/annex-k.txt (plain ASCII text).


Douglas A. Gwyn

unread,
Nov 24, 1997, 3:00:00 AM11/24/97
to

In article <8801980...@dejanews.com>,

Roger_...@compsys.com.au wrote:
>But the standard says "use" of an OBJECT with indeterminate value -

No, it does not. Indeed, in a separate posting you cited the wording:

| * Undefined behavior --- behavior, upon use of a
| nonportable or erroneous program construct, of
| erroneous data, or of indeterminately-valued objects,
| for which the Standard imposes no requirements.

The main reason "objects" is in there is because that was the
smoothest way to phrase the definition. (Since there will
always be an object involved, it is technically correct to
include "object" in that phrase.) Your emphasis should be
on the indeterminate value, not on the object.

When you take words out of context and add your own emphasis,
it is easy to misunderstand the intent.


Tom Payne

unread,
Nov 24, 1997, 3:00:00 AM11/24/97
to

In comp.std.c Tom Payne <t...@cs.ucr.edu> wrote:

: In comp.std.c Roger Onslow <Roger_Ons...@compsys.com.au> wrote:

: [...]
: : I (and the standard) refer to the USE of an object with
: : indeterminate value (not an indeterminate object - which is not a term used
: : in the standard) is undefined behaviour.

: : It does NOT (interestingly enough) say anything about using the
: : indeterminate value (of course, I would ASSUME myself that that would result
: : in undefined behaviour, but that is not what the standard says (from my
: : reading))

[...]
: Perhaps provide a specific citation from the standard.

: Appendix G.2 attempts to catalog the circumstances in which behavior
: is undefined. What I found that seems relevant to your questions are
: the following three items:

: - An invalid array reference, null pointer reference, or reference
: to an object declared with automatic storage duration in a
: terminated block occurs (6.3.3.2).

: - The value of an uninitialized object that has automatic storage
: duration is used before a value is assigned (6.5.7).

: - The value of a pointer that refers to space deallocated by a call
: to free or realloc function is referred to (7.10.3).

: Specifically, I did not find any mention of "indeterminate value".
: Rather these prohibit, respectively:

Oops! According to 3.16:

undefined behavior: Behavior, upon use of a nonportable or erroneous
program construct, of erroneous data, or of indeterminately valued
objects, for which this International Standard imposes no
requirements.

This seems to contradict my (earlier) understanding that the
prohibitions on read access to pointer objects holding invalid values
were somehow special.

I am, however, a bit puzzled by the distinction between "erroneous
data" and an indeterminate value.

Tom Payne

Kaz Kylheku

unread,
Nov 24, 1997, 3:00:00 AM11/24/97
to

In article <880146...@genesis.demon.co.uk>,

Lawrence Kirby <fr...@genesis.demon.co.uk> wrote:
>In article <SOztunm...@geraldo.newaygo.mi.us>
> Roger_Ons...@compsys.com.au "Roger Onslow" writes:
>>The standard didn't say a 'storage object', it said an 'object' with
>>indeterminate value. One could argue that the intermediate/temporary value
>>is an object without a name.
>>
>>Any thoughts anyone??
>
>Don't do that - a value is *not* an object. Anyway as I noted above I don't
>think this is necessary since only values stored in objects can be
>indeterminate.

What about fclose(p) + p?

The right hand operand could be fetched before the call to fclose. After being
fetched, it exists somewhere as a value independent of the underlying object.
Yet the call to fclose() can invalidate that value.
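
One defensive habit that sidesteps part of this problem (it is not something proposed in the thread) is to overwrite the pointer object right after the call, so that at least that object no longer holds an indeterminate value. The helper name close_and_clear is hypothetical. Any other copies of the old pointer, including a value already fetched into an expression, stay indeterminate.

```c
#include <assert.h>
#include <stdio.h>

/* Closes *fpp if it is open and stores NULL in its place.
   Returns the result of fclose, or 0 if there was nothing to close. */
int close_and_clear(FILE **fpp)
{
    int rc = 0;

    assert(fpp != NULL);
    if (*fpp != NULL) {
        rc = fclose(*fpp);  /* the old value of *fpp is indeterminate after this */
        *fpp = NULL;        /* overwrite it without reading it again */
    }
    return rc;
}
```

A second call on the same object is then harmless, because the function only ever evaluates a null or a still-valid pointer.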

Craig Franck

unread,
Nov 25, 1997, 3:00:00 AM11/25/97
to

Douglas A. Gwyn <gw...@ix.netcom.com> wrote:
>In article <8801980...@dejanews.com>,
> Roger_...@compsys.com.au wrote:
>>But the standard says "use" of an OBJECT with indeterminate value -
>
>No, it does not. Indeed, in a separate posting you cited the wording:
>
>| * Undefined behavior --- behavior, upon use of a
>| nonportable or erroneous program construct, of
>| erroneous data, or of indeterminately-valued objects,
>| for which the Standard imposes no requirements.
>
>The main reason "objects" is in there is because that was the
>smoothest way to phrase the definition. (Since there will
>always be an object involved, it is technically correct to
>include "object" in that phrase.) Your emphasis should be
>on the indeterminate value, not on the object.

That may be true, but I can still see how an indeterminately-valued
object can have an indeterminate value, *regardless* of what its
current state is. The idea is *any* value it has is indeterminate.
Imagine an indeterminate region of memory: Read its value once, get
one value, read it again, get another. Which one is correct? The
values are valid; you just can't determine what that value is. To
use a low level analogy, if you clocked RAM so fast that the data
bus didn't have time to settle you would have an indeterminately-
valued object located at any given byte. You can use memcpy all you
like, but you'd just be transferring gibberish.

>When you take words out of context and add your own emphasis,
>it is easy to misunderstand the intent.

I think most people internally add their own emphasis to just about
everything.

--
Craig
clfr...@worldnet.att.net
Manchester, NH
I try to take one day at a time, but sometimes several
days attack me at once. -- Ashleigh Brilliant


Douglas A. Gwyn

unread,
Nov 25, 1997, 3:00:00 AM11/25/97
to

In article <65cct3$58$1...@skylark.ucr.edu>,

Tom Payne <t...@cs.ucr.edu> wrote:
>I am, however, a bit puzzled by the distinction between "erroneous
>data" and an indeterminate value.

Erroneous data has a well-defined value, but does not meet the
preconditions for usage in some context. E.g.
char foo[3];
strcpy(foo,"bar"); // erroneous
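
To spell out why the call above is erroneous: "bar" needs four bytes (three characters plus the terminating null), and foo has only three. A small sketch of a checked copy follows; the helper name copy_if_it_fits is an assumption made for this example.

```c
#include <assert.h>
#include <string.h>

/* Copies src into dst only when src and its terminating '\0' fit.
   Returns 1 on success, 0 when the precondition of strcpy is not met. */
int copy_if_it_fits(char *dst, size_t dstsize, const char *src)
{
    assert(dst != NULL && src != NULL);
    if (strlen(src) + 1 > dstsize)
        return 0;           /* "bar" into char[3] ends up here */
    strcpy(dst, src);
    return 1;
}
```

With a char[3] destination the helper refuses to copy "bar"; with a char[4] destination it copies the string in full.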


Oon Lin

unread,
Nov 25, 1997, 3:00:00 AM11/25/97
to


Hi , I have programmed in C for a few years now and have been using
pointers very frequently. But suddenly out of the blue it just struck me..

What EXACTLY is a pointer ??
I know that a variable is actually a memory location which stores values /
data and that a pointer points to a memory location. But is a pointer
itself actually a variable, special only in that it stores the
address of a memory location ? Does the compiler allocate any memory for
a pointer like it does for a variable ? Would the size of memory
allocation be the same ?

I tried the following simple code to see if there's any difference :

// Program with variables
#include <stdio.h>

int main(void) {
    int a ;
    int b ;
    int c ;

    return 0 ;
}

// Program with pointers
#include <stdio.h>

int main(void) {
    int *a ;
    int *b ;
    int *c ;

    return 0 ;
}

BOTH OF THESE TWO PROGRAMS COMPILED TO OBJECT AND EXECUTABLE CODE WITH
DIFFERENT SIZES. IT SEEMS LIKE THE PROGRAM WITH POINTERS ACTUALLY COMPILED
TO CODE WITH A BIGGER SIZE ON MY MACHINE.

CAN ANYONE HELP ?

THANKS !

KEAN


***************************************************************************
Lin Oon Kean * Kean's Corner In The World Wide Web : *
email : Oon...@jcu.edu.au * http://lionfish.jcu.edu.au/~sci-okl *
Bachelor Of Science (Comp.) ******************************************
James Cook University Of North Queensland * oon...@cyberdude.com
***************************************************************************
"Computer Science is the recipe for stress and freaking out"
**************************************************************************

Roger_...@compsys.com.au

unread,
Nov 25, 1997, 3:00:00 AM11/25/97
to

In article <65apap$l...@sjx-ixn6.ix.netcom.com>,

Douglas A. Gwyn <gw...@ix.netcom.com> wrote:
>
> In article <8801980...@dejanews.com>,
> Roger_...@compsys.com.au wrote:
> >But the standard says "use" of an OBJECT with indeterminate value -
>
> No, it does not.

Yes it does - well it says "use ... of an indeterminately-valued object"
(if that means anything different to 'an object with indeterminate value'
then I'd like to know what).

In any case it is the use of the object that is undefined behaviour - NOT
the use of the value.

> Indeed, in a separate posting you cited the wording:
> | * Undefined behavior --- behavior, upon use of a
> | nonportable or erroneous program construct, of
> | erroneous data, or of indeterminately-valued objects,
> | for which the Standard imposes no requirements.
>
> The main reason "objects" is in there is because that was the
> smoothest way to phrase the definition. (Since there will
> always be an object involved, it is technically correct to
> include "object" in that phrase.) Your emphasis should be
> on the indeterminate value, not on the object.

No, the STANDARD's emphasis should have been on the indeterminate value -
it's not MY wording or emphasis that is wrong here. It would have been
just as 'smooth' to have said ".. or of an indeterminate value". But the
standard does NOT say that, it says "or of indeterminately valued
objects". It's pretty clear and obvious.

Surely those who take the time to write standards so that the intent is
clear and whose language is precise could have chosen much clearer
wording if that had NOT been the intent.

> When you take words out of context and add your own emphasis,
> it is easy to misunderstand the intent.

It was NOT taken out of context (what could be more in context than the
definition of undefined behaviour) and it was NOT my own emphasis.

Taking words out of context and adding your own emphasis seems to be
EXACTLY what you have done here - by saying the word 'objects' is just
there to make the sentence sound smooth.

Sounds like a pretty lame excuse for very poor wording of the standard (if the
'intent' was not as it was written).

Roger_...@compsys.com.au

unread,
Nov 25, 1997, 3:00:00 AM11/25/97
to

In article <65dgen$r...@bgtnsc02.worldnet.att.net>,

Craig Franck <clfr...@worldnet.att.net> wrote:
>
> Douglas A. Gwyn <gw...@ix.netcom.com> wrote:
> >In article <8801980...@dejanews.com>,
> > Roger_...@compsys.com.au wrote:
> >>But the standard says "use" of an OBJECT with indeterminate value -
> >
> >No, it does not. Indeed, in a separate posting you cited the wording:
> >
> >| * Undefined behavior --- behavior, upon use of a
> >| nonportable or erroneous program construct, of
> >| erroneous data, or of indeterminately-valued objects,
> >| for which the Standard imposes no requirements.
> >
> >The main reason "objects" is in there is because that was the
> >smoothest way to phrase the definition. (Since there will
> >always be an object involved, it is technically correct to
> >include "object" in that phrase.) Your emphasis should be
> >on the indeterminate value, not on the object.
>
> That may be true, but I can still see how an indeterminately-valued
> object can have an indeterminate value, *regardless* of what its
> current state is. The idea is *any* value it has is indeterminate.
> Imagine an indeterminate region of memory: Read its value once, get
> one value, read it again, get another. Which one is correct? The
> values are valid; you just can't determine what that value is. To
> use a low level analogy, if you clocked RAM so fast that the data
> bus didn't have time to settle you would have an indeterminately-
> valued object located at any given byte. You can use memcpy all you
> like, but you'd just be transferring gibberish.

I think you are missing what the concept of 'indeterminate value' is in
the context of the standard.

It means (AFAIK) that you cannot predict (in theory) what the value will
be. It may be a valid value that you might 'expect' - or it could be some
other valid value, or it could be an invalid value (eg. if ints are
stored in BCD, then it might be an invalid BCD word).

> >When you take words out of context and add your own emphasis,
> >it is easy to misunderstand the intent.
>
> I think most people internally add their own emphasis to just about
> everything.

hmm - now how should that read.. I think most people INTERNALLY add their
own emphasis to just about everything. I think MOST people internally add
their own emphasis to just about everything. I think most people
internally add their own emphasis to JUST ABOUT everything. I think most
PEOPLE internally add their own emphasis to just about everything. ...
:-)

Roger_...@compsys.com.au

unread,
Nov 25, 1997, 3:00:00 AM11/25/97
to

In article <65an0k$3...@sjx-ixn1.ix.netcom.com>,

Douglas A. Gwyn <gw...@ix.netcom.com> wrote:
>
> In article <64tusn$p...@sjx-ixn9.ix.netcom.com>,
> Douglas A. Gwyn <gw...@ix.netcom.com> wrote:
> >[p has an indeterminate value]
> >> &p
> >> (unsigned char*)&p
> >> sizeof(p)
> >> sizeof(*p)
> >> memcpy(&q, &p, sizeof(int*)) /* where q is an int* as well */
> >The above are allowed in a strictly conforming program;
> >the others are not (because they require evaluation of p).
>
> Oops, I missed
> p = NULL
> (just didn't notice it in the list). That too is allowed,
> for the same reason as all the others: it doesn't require
> evaluation of the indeterminately-valued prior contents of p.

Mind you, as I previously noted, the standard doesn't say that
'evaluation of indeterminately valued contents' is undefined behaviour
but the 'use of indeterminately-valued objects' is.

It depends on what is 'defined' as "use".

I personally would expect it to REALLY mean 'use as an r-value'

Does the standard define what 'use' of an object means??

Roger_...@compsys.com.au

unread,
Nov 25, 1997, 3:00:00 AM11/25/97
to

In article <65ag07$ieq$1...@skylark.ucr.edu>,

But...

The standard still defines the 'use .. of indeterminately-valued objects'
as (one kind of) undefined behaviour.

So indeterminate really means un-usable - no matter what. And as such
it doesn't matter whether the actual value is valid or invalid or
meaningful or what - you cannot (in a standard conforming program) use
the object anyway.

Roger_...@compsys.com.au

unread,
Nov 25, 1997, 3:00:00 AM11/25/97
to

In article <65aokk$d...@dfw-ixnews5.ix.netcom.com>,

Douglas A. Gwyn <gw...@ix.netcom.com> wrote:
[snip]

> (The use of the word "object" in that context is unfortunate,
> and perhaps the root of all this confusion. The important thing
> is the value, so long as the object is still alive; outside their
> lifetimes, objects *also* have indeterminate values, but that's a
> more severe case in that not even aliasing to array of unsigned
> char allows strictly-conforming access.)

It is very unfortunate, if that is not what was meant. How did that slip
through?

> Since the subject has come up, I recently completed a scan of the
> C9x draft for the relevant keywords and updated Annex G (called
> Annex K in the most recent C9x draft); the result (not yet official)
> is posted at URLs ftp://ftp.arl.mil/arch/annex-k.ps (PostScript) and
> ftp://ftp.arl.mil/arch/arch/annex-k.txt (plain ASCII text).

I cannot get access to either of those sites. I would be very interested
to read the relevant sections (eg. where undefined behaviour is defined)
- could you please post them?

Roger_...@compsys.com.au

unread,
Nov 25, 1997, 3:00:00 AM11/25/97
to

In article <65cct3$58$1...@skylark.ucr.edu>,

Tom Payne <t...@cs.ucr.edu> wrote:
> Oops! According to 3.16:
>
> undefined behavior: Behavior, upon use of a nonportable or erroneous
> program construct, of erroneous data, or of indeterminately valued
> objects, for which this International Standard imposes no
> requirements.
>
> This seems to contradict my (earlier) understanding that the
> prohibitions on read access to pointer objects holding invalid values
> were somehow special.

That's right !!

> I am, however, a bit puzzled by the distinction between "erroneous
> data" and an indeterminate value.

Perhaps "erroneous data" is bad input for the program??

Also an indeterminate value is not the same as a value with bad data in
it.

With an indeterminate value, you (literally) cannot determine WHAT is in
it (valid or not).

Roger_...@compsys.com.au

unread,
Nov 25, 1997, 3:00:00 AM11/25/97
to

In article <65alsi$2ot$1...@helios.crest.nt.com>,

k...@helios.crest.nt.com (Kaz Kylheku) wrote:
> In article <SOztunm...@geraldo.newaygo.mi.us>,
> Roger Onslow <Roger_Ons...@compsys.com.au> wrote:
> >Kaz Kylheku wrote in message <651r6g$dmc$1...@helios.crest.nt.com>...
> >>>The standard says that undefined behaviour includes the use of an object
> >>>with indeterminate value (Note - it says use of the object, NOT use of the
> >>>indeterminate value)
> >>Actually I could cite several quotes from the standard which contradict
> >>this directly.
> >You can?
> >Does that mean the standard is self-contradictory?
> >>For example, after an fclose, it is the value of the pointer that is
> >>indeterminate.
> >Yeup - that is correct. The standard says the value of the pointer is
> >INDETERMINATE.
> >It also defines the USE of an OBJECT with INDETERMINATE VALUE (ie. the
> >pointer) as being (one kind of) undefined behaviour.
> But objects don't *have* value! They can only *represent* values. A value
> arises when an object is interpreted through an expression. The expression's
> type implies the representation that is used to interpret a value.

So what was the standard talking about then?

The standard does say "The value of a pointer that refers to freed space
is indeterminate". That could certainly be read as saying the pointer is
indeterminately-valued (because its value is indeterminate).

> >>It is not written that any object is indeterminate.
> >I never said that. I (and the standard) refer to the USE of an object with
> >indeterminate value (not an indeterminate object - which is not a term used
> >in the standard) is undefined behaviour.
> >It does NOT (interestingly enough) say anything about using the
> >indeterminate value (of course, I would ASSUME myself that that would result
> >in undefined behaviour, but that is not what the standard says (from my
> >reading))

> Use of the object's value and use of an object are really the same thing.
> Can you use an object without interpreting its value?

Yes - you can assign to it, or take its address.

> It has already been acknowledged in the earlier debates that the standard
> is somewhat poorly worded in these matters. At one point it actually implies
> that an object has an inherent value that exists independently of the
> expression used to access that object.

The hide of it !!! :-)

> >The call doesn't invalidate it - the value is indeterminate - not INVALID
> Well, an indeterminate pointer is invalid as well. But I apologize for
> bringing in that term!

No - it isn't invalid. It _could_ still be a valid value, but you cannot
determine what it will be, and a standard conforming program wouldn't be
able to tell because looking will cause undefined behaviour. It's a bit
like Schrödinger's cat.

> >(the standard doesn't say the value HAS to change to something invalid, but
> >that its value is indeterminate - you (or your program) cannot determine
> >what it is).
> >>...that exists somewhere other than in a storage object.
> >The standard didn't say a 'storage object', it said an 'object' with
> >indeterminate value. One could argue that the intermediate/temporary value
> >is an object without a name.

> One could not reasonably argue that at all.

Then does that mean that the use of an intermediate/temporary value
that is indeterminate does NOT constitute undefined behaviour, because
there is no 'indeterminately-valued object' to use?

Roger_...@compsys.com.au

unread,
Nov 25, 1997, 3:00:00 AM11/25/97
to

In article <65anvh$n...@sjx-ixn10.ix.netcom.com>,

Douglas A. Gwyn <gw...@ix.netcom.com> wrote:
[snip]
> * The adjective "indeterminately-valued" is meaningless unless
> the type is specified (or implicit from the context)

So (that part of) the definition of undefined behaviour is meaningless.

This ain't looking too good for the poor old standard (or those that
wrote it).

Steve W. Jackson

unread,
Nov 25, 1997, 3:00:00 AM11/25/97
to Oon Lin

[Posted and mailed]

In article <Pine.OSF.3.93.971125...@sailfish.jcu.edu.au>,
Oon Lin <sci...@jcu.edu.au> writes:
;>
;>
;> Hi , I have programmed in C for a few years now and have been using
;> pointers very frequently. But suddenly out of the blue it just struck me..
;>
;> What EXACTLY is a pointer ??
;> I know that a variable is actually a memory location which stores values /
;> data and that a pointer points to a memory location. But is a pointer
;> itself actually a variable, special only in that it stores the
;> address of a memory location ? Does the compiler allocate any memory for
;> a pointer like it does for a variable ? Would the size of memory
;> allocation be the same ?

You basically answered your own question above. A variable is a named way
of referencing a memory location of a specified type. When you say "int a"
you're using the name "a" to refer to a memory location of your system's
choosing which is of type "int" and occupies whatever amount of storage
an "int" requires. When you say "int *p" you're creating a name "p" to
refer to a location, also -- but this location is an address itself, which
indicates the location of memory which is intended to contain a value of
type "int". Every variable you declare must have memory allocated for it,
and pointers are no exception. Just as "a" had enough memory allocated
to hold an "int" value, "p" must have enough memory allocated to hold a
value of whatever size your system requires for memory addresses (pointer
to "int", in this specific case).
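
A short sketch of that description: p is an object in its own right, with its own storage, and what it stores is the address of a. The struct and function names are made up for illustration, and the sizes reported depend on the platform.

```c
#include <assert.h>
#include <stddef.h>

/* Facts about an int and a pointer to it, gathered at run time. */
struct storage_facts {
    size_t int_size;             /* storage needed for the int */
    size_t pointer_size;         /* storage needed for the pointer itself */
    int p_holds_address_of_a;    /* 1 if p stores the address of a */
    int p_is_a_separate_object;  /* 1 if p lives somewhere other than a */
    int value_read_through_p;    /* the int found by following p */
};

struct storage_facts inspect_storage(void)
{
    int a = 42;
    int *p = &a;
    struct storage_facts f;

    assert(p != NULL);
    f.int_size = sizeof a;
    f.pointer_size = sizeof p;
    f.p_holds_address_of_a = (p == &a);
    f.p_is_a_separate_object = ((void *)&p != (void *)&a);
    f.value_read_through_p = *p;
    return f;
}
```

The two size fields may or may not be equal on a given system; the other three fields hold on any conforming implementation.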

;>
;> I tried the following simple code to see if there's any difference :
;>
;> // Program with variables
;> #include <stdio.h>
;>
;> int main(void) {
;>     int a ;
;>     int b ;
;>     int c ;
;>
;>     return 0 ;
;> }
;>
;> // Program with pointers
;> #include <stdio.h>
;>
;> int main(void) {
;>     int *a ;
;>     int *b ;
;>     int *c ;
;>
;>     return 0 ;
;> }
;>
;> BOTH OF THESE TWO PROGRAMS COMPILED TO OBJECT AND EXECUTABLE CODE WITH
;> DIFFERENT SIZES. IT SEEMS LIKE THE PROGRAM WITH POINTERS ACTUALLY COMPILED
;> TO CODE WITH A BIGGER SIZE ON MY MACHINE.

What this means is that the size of a memory address on your system is a
larger size than that of an "int" value. If you're on a machine with a
32-bit word, say, an "int" might require a word (32 bits) of storage. On
that same system, though, with modern memory capacities, you might need a
64-bit location to hold the memory address of any dynamically allocated
memory given to you -- regardless of its type.

;>
;> CAN ANYONE HELP ?
;>
;> THANKS !
;>
;> KEAN

I hope I didn't ramble meaninglessly and that this is in some way
helpful to you.

= Steve =

--
+-------------------------+-----------------------------------+
| Steve W. Jackson | CACI, Inc - FEDERAL |
| Project Manager | Software Revitalization Division |
| 334-244-7400 | 600 Interstate Park Drive |
| 334-244-7447 Fax | Suite 623 |
| sjac...@mgm.caci.com | Montgomery, AL 36109 USA |
+-------------------------+-----------------------------------+

hami...@soapnotes.com

unread,
Nov 25, 1997, 3:00:00 AM11/25/97
to

A pointer is a variable that holds an address and nothing else.


Alicia Carla Longstreet

unread,
Nov 25, 1997, 3:00:00 AM11/25/97
to Oon Lin

Oon Lin wrote:
>
> Hi , I have programmed in C for a few years now and have been using
> pointers very frequently. But suddenly out of the blue it just struck me..

> What EXACTLY is a pointer ??

A pointer is a variable whose contents refer to a location within the
platform's address space. What it is EXACTLY is platform dependent.
Typically it is a memory address. It can effectively be used to
indirectly reference a different data object.

> I know that a variable is actually a memory location which stores values /
> data and that a pointer points to a memory location. But is a pointer
> itself actually a variable, special only in that it stores the
> address of a memory location ? Does the compiler allocate any memory for
> a pointer like it does for a variable ? Would the size of memory
> allocation be the same ?

Generally storage is allocated for a pointer. I can not think of an
architecture where this would not be true. This in no way implies that
such an architecture does not exist. Nor does it imply that such an
architecture may not be developed at some point in the future.

The need to know the specifics of what a pointer is and how it functions
is platform dependent. You can obtain more in-depth information if you
ask on a newsgroup dedicated to the platform you want to know about.

See the trailer for a list of newsgroups.



> I tried the following simple code to see if there's any difference :

> // Program with variables
> #include <stdio.h>

> int main(void) {
>     int a ;
>     int b ;
>     int c ;
>     return 0 ;
> }

> // Program with pointers
> #include <stdio.h>
> int main(void) {
>     int *a ;
>     int *b ;
>     int *c ;
>     return 0 ;
> }

> BOTH OF THESE TWO PROGRAMS COMPILED TO OBJECT AND EXECUTABLE CODE WITH
> DIFFERENT SIZES. IT SEEMS LIKE THE PROGRAM WITH POINTERS ACTUALLY COMPILED
> TO CODE WITH A BIGGER SIZE ON MY MACHINE.

On my platform (Intel Pentium running Windows 95) a pointer is 32 bits
in size. Depending on the compiler I use an int could be either 16 bits
in size or 32 bits in size. Try this program:

#include <stdio.h>
#include <stdlib.h>

int main( void ) {
    /* sizeof yields a size_t, so cast it to match the %lu conversion */
    printf( "Size of Pointer:%lu\n", (unsigned long)sizeof(void*) );
    printf( "Size of int:%lu\n", (unsigned long)sizeof(int) );
    printf( "Size of float:%lu\n", (unsigned long)sizeof(float) );
    printf( "Size of double:%lu\n", (unsigned long)sizeof(double) );
    printf( "Size of char:%lu\n", (unsigned long)sizeof(char) );
    printf( "Size of pointer to char:%lu\n", (unsigned long)sizeof(char*) );
    printf( "Size of pointer to long:%lu\n", (unsigned long)sizeof(long*) );
    /* Add more if you want. */
    return (EXIT_SUCCESS);
}

Observe the results on your platform with your compiler. This may
explain your variance.

f:\learnc>lcc vsizes.c

f:\learnc>lcclnk vsizes.obj

f:\learnc>vsizes.exe
Size of Pointer:4
Size of int:4
Size of float:4
Size of double:8
Size of char:1
Size of pointer to char:4
Size of pointer to long:4

f:\learnc>

--
*****************************************************************
* Very funny Scotty, now beam down my clothes.
* Puritanism: the haunting fear that someone, somewhere may be happy.
* Ever stop to think and forget to start again?
* Keep honking...I'm reloading.
=========================================
Alicia Carla Longstreet ca...@ici.net
=========================================
READ THE FAQ for more information:
C-FAQ ftp sites: ftp://ftp.eskimo.com or ftp://rtfm.mit.edu
Hypertext C-FAQ: http://www.eskimo.com/~scs/C-faq/top.html

Newsgroups:
comp.os.msdos.djgpp The DJGPP Port of Gnu C
gnu.gcc The gcc compiler
comp.lang.asm.x86 Intel Assembler
comp.os.msdos.programmer DOS O/S Related Issues
comp.os.ms-windows.programmer.misc MS/Windows Programming
comp.os.os2.programmer.misc OS/2 Programming
comp.sys.mac.programmer.misc Macintosh Programming
comp.os.linux.misc General Linux Questions
comp.unix.programmer General Unix Questions
comp.unix.[vendor] Various Unix vendors

Dan Pop

unread,
Nov 25, 1997, 3:00:00 AM11/25/97
to

>Hi , I have programmed in C for a few years now and have been using
>pointers very frequently. But suddenly out of the blue it just struck me..
>
>What EXACTLY is a pointer ??

A variable, pretty much like any other variable.

>I know that a variable is actually a memory location which stores values /
>data and that a pointer points to a memory location. But is a pointer
>itself actually a variable, special only in that it stores the
>address of a memory location ?

Yes. It is also special because it can be used as an operand for the
unary * operator (ordinary variables don't have this property).

>Does the compiler allocate any memory for
>a pointer like it does for a variable ?

Yes.

>Would the size of memory allocation be the same ?

Not necessarily. On most systems, the size of a pointer is the same as
the size of either an int or a long. But you don't need to rely on
that: if you need to know the size of a pointer, just use the sizeof
operator on that pointer.

>I tried the following simple code to see if there's any difference :
>
>int main(void) {
>    int a ;
>    int b ;
>    int c ;
>
>    return 0 ;
>}
>
>int main(void) {
>    int *a ;
>    int *b ;
>    int *c ;
>
>    return 0 ;
>}
>
>BOTH OF THESE TWO PROGRAMS COMPILED TO OBJECT AND EXECUTABLE CODE WITH
>DIFFERENT SIZES. IT SEEMS LIKE THE PROGRAM WITH POINTERS ACTUALLY COMPILED
>TO CODE WITH A BIGGER SIZE ON MY MACHINE.

There is no need to shout, we can hear you. You can't draw any conclusion
whatsoever from the size of the executable code, since both your
programs are functionally equivalent to:

int main() {}

a compiler could just produce the same code, and this would give you no
clue about the sizes of pointers vs int's.

Here's a better way:

dxplus02:~/tmp 125> cat sizes.c
#include <stdio.h>

int main()
{
    printf("char:\t%ld (obviously)\n", (long)sizeof(char));
    printf("short:\t%ld\n", (long)sizeof(short));
    printf("int:\t%ld\n", (long)sizeof(int));
    printf("long:\t%ld\n", (long)sizeof(long));
    printf("void *:\t%ld\n", (long)sizeof(void *));
    return 0;
}
dxplus02:~/tmp 126> cc sizes.c
dxplus02:~/tmp 127> ./a.out
char: 1 (obviously)
short: 2
int: 4
long: 8
void *: 8

On your system you may get different results, e.g.

ues1:~/tmp 27> ./a.out
char: 1 (obviously)
short: 2
int: 4
long: 4
void *: 4

On the vast majority of implementations, all pointers have the same size,
regardless of the type they point to, but you should not rely on that.
All that is guaranteed is that void pointers and char pointers have the
same size and representation.

Dan
--
Dan Pop
CERN, IT Division
Email: Dan...@cern.ch
Mail: CERN - PPE, Bat. 31 1-014, CH-1211 Geneve 23, Switzerland

Kaz Kylheku

unread,
Nov 25, 1997, 3:00:00 AM11/25/97
to

In article <Pine.OSF.3.93.971125...@sailfish.jcu.edu.au>,

Oon Lin <sci...@jcu.edu.au> wrote:
>
>
>Hi , I have programmed in C for a few years now and had been using
>pointers very frequently. But suddenly out of the blue it just struck me..
>
>What EXACTLY is a pointer ??

A pointer belongs to the family of scalar types, which also includes the
integral types and floating point types (these latter two are called
arithmetic types to distinguish them from pointers).

When stored in a storage object, a pointer value has some kind of concrete
representation, the same way that an integer or floating point number has
a representation. That representation is basically an assignment of meaning
to various bit patterns stored in that object.

In most C implementations, pointers are simply represented using natural
machine addresses, which are basically binary numbers. In segmented mainframe
architectures, a pointer might consist of two fields: an object selector field,
and an offset into that object. The selector is an index into a descriptor
table, which holds the actual physical address of each described object, as
well as information about its size, access permissions and so on.

Whatever representation is chosen, it must somehow support the idea of pointer
arithmetic. C gives you the ability to subtract two pointers of like type, or
to displace a pointer by adding an integer to it. These things are readily
possible if a pointer is represented as a binary machine address. They are also
possible if the pointer is represented as a selector/offset, because the
arithmetic can be done using the offset portion of a pointer.

>address of a memory location ? Does the compiler allocate any memory for
>a pointer like it does for a variable ? Would the size of memory
>allocation be the same ?

In translating a definition of a pointer variable, the C implementation
allocates as much storage for the object as is required to represent a pointer
of the given type. It must be remembered that pointers to different types need
not have the same size and representation. But the objects are real, and have a
size.

>I tried the following simple codes to see if there's any difference :
>

>// Program with variables
>#include <stdio.h>
>

>int main(void) {
>int a ;
>int b ;
>int c ;
>
> return ;
>}
>

>// Program with pointers
>#include <stdio.h>
>

>int main(void) {
>int *a ;
>int *b ;
>int *c ;
>
>
> return ;
>}
>
>BOTH OF THESE TWO PROGRAMS COMPILED TO OBJECT AND EXECUTABLE CODES WITH
>DIFFERENT SIZES. IT SEEMS LIKE THE PROGRAM WITH POINTER ACTUALLY COMPILED
>TO A CODE WITH BIGGER SIZE OF MY MACHINE.

This does not tell you anything, because in both cases the variables in
question are local to the function main. Local variables are not normally
allocated in the program's translated image and hence do not contribute
to its size. They are allocated dynamically (at run time) when their enclosing
statement block is executed and disposed of when their enclosing block
terminates. This is called ``automatic storage''.

Why is the second program bigger? Who knows.

Even though this is probably not the reason why the second
program is bigger, it is possible that values of type ``pointer to int''
require more storage than values of type ``int''. This is not unheard of. For
example, on the Motorola 68000 architecture, a natural representation for the
type int is a 16 bit word which requires two bytes. But pointers (to all types)
are naturally represented as 32 bit quantities requiring four bytes.

Why don't you try this test instead?

#include <stdio.h>

int main(void)
{
printf("size of an int: %lu\n", (unsigned long) sizeof(int));
printf("size of an int *: %lu\n", (unsigned long) sizeof(int *));
return 0;
}

Craig Franck

unread,
Nov 25, 1997, 3:00:00 AM11/25/97
to

Roger_...@compsys.com.au wrote:
>In article <65dgen$r...@bgtnsc02.worldnet.att.net>,
> Craig Franck <clfr...@worldnet.att.net> wrote:

>> That may be true, but I can still see how an indeterminately-valued
>> object can have an indeterminate value, *regardless* of what it's
>> current state is. The idea is *any* value it has is indeterminate.
>> Imagine an indeterminate region of memory: Read its value once, get
>> one value, read it again, get another. Which one is correct? The
>> values are valid; you just can't determine what that value is. To
>> use a low level analogy, if you clocked RAM so fast that the data
>> bus didn't have time to settle you would have an indeterminately-
>> valued object located at any given byte. You can use memcpy all you
>> like, but you'd just be transfering gibberish.
>
>I think you are missing what the concept of 'indeterminate value' is in
>the context of the standard.
>
>It means (AFAIK) that you cannot predict (in theory) what the value will
>be. It may be a valid value that you might 'expect' - or it could be some
>other valid value, or it could be an invalid value (eg. if ints are
>stored in BCD, then it might be an invalid bcd word).

I realize that's the case. The value of an indeterminately-valued
object may be:

[1] A bit-pattern invalid for an object of that type.
[2] A valid bit-pattern for an object of that type.
[3] A valid bit-pattern for an object of that type that changes
on successive accesses.
[4] Some other bit-pattern that has nothing to do with the current
bit-pattern at the location of that object; the compiler could
just make it so that an indeterminately-valued object is all 0s,
or all 1s, etc.

Steve mouatt

unread,
Nov 26, 1997, 3:00:00 AM11/26/97
to

Steve W. Jackson wrote:
>
> [Posted and mailed]
>
> In article <Pine.OSF.3.93.971125...@sailfish.jcu.edu.au>,
> Oon Lin <sci...@jcu.edu.au> writes:
> ;>
> ;>
> ;> Hi , I have programmed in C for a few years now and had been using

> ;> pointers very frequently. But suddenly out of the blue it just struck me..
> ;>
> ;> What EXACTLY is a pointer ??

Whilst in C terms it is a variable, as you and others suggest, in
physical terms it may be a memory location or it might in fact be
simply the contents of a register for a transient moment.

Not only is the physical manifestation platform dependent, it is also
compiler and optimiser dependent.

Lawrence Kirby

unread,
Nov 26, 1997, 3:00:00 AM11/26/97
to

In article <65apap$l...@sjx-ixn6.ix.netcom.com>

gw...@ix.netcom.com "Douglas A. Gwyn" writes:

>In article <8801980...@dejanews.com>,
> Roger_...@compsys.com.au wrote:
>>But the standard says "use" of an OBJECT with indeterminate value -
>
>No, it does not. Indeed, in a separate posting you cited the wording:
>

>| * Undefined behavior --- behavior, upon use of a
>| nonportable or erroneous program construct, of
>| erroneous data, or of indeterminately-valued objects,
>| for which the Standard imposes no requirements.
>
>The main reason "objects" is in there is because that was the
>smoothest way to phrase the definition.

What would be wrong with changing "or of indeterminately valued objects"
to "or of indeterminate values"?

To me it looks simpler, smoother and more accurate.

> (Since there will
>always be an object involved, it is technically correct to
>include "object" in that phrase.) Your emphasis should be
>on the indeterminate value, not on the object.

Then the standard should have a similar emphasis.

--
-----------------------------------------
Lawrence Kirby | fr...@genesis.demon.co.uk
Wilts, England | 7073...@compuserve.com
-----------------------------------------


Lawrence Kirby

unread,
Nov 26, 1997, 3:00:00 AM11/26/97
to

In article <8804615...@dejanews.com> Roger_...@compsys.com.au writes:

...

>I think you are missing what the concept of 'indeterminate value' is in
>the context of the standard.
>
>It means (AFAIK) that you cannot predict (in theory) what the value will
>be. It may be a valid value that you might 'expect' - or it could be some
>other valid value, or it could be an invalid value (eg. if ints are
>stored in BCD, then it might be an invalid bcd word).

It means more than that - you can't even look at the value so there's
no point even trying to guess what the value might be.

Douglas A. Gwyn

unread,
Nov 27, 1997, 3:00:00 AM11/27/97
to

In article <880565...@genesis.demon.co.uk>,

fr...@genesis.demon.co.uk (Lawrence Kirby) wrote:
>What would be wrong with changing "or of indeterminately valued objects"
>to "or of indeterminate values"?

There would not be anything significantly wrong with making that change,
which you should feel free to suggest in a public comment.


Douglas A. Gwyn

unread,
Nov 27, 1997, 3:00:00 AM11/27/97
to

In article <8804625...@dejanews.com>,

Roger_...@compsys.com.au wrote:
>So indeterminate really means un-usable - not matter what. And as such
>it doesn't matter whether the actual value is valid or invalid or
>meaningful or what - you cannot (in a standard conforming program) use
>the object anyway.

If you would stop spouting your misconceptions and pay attention to
the explanation(s) I (and Kaz, and maybe others) have been giving,
perhaps you would understand this issue by now. FORGET THE WORD
"OBJECT" IN 3.16; YOUR FIXATION ON "OBJECT" HAS LED YOU ASTRAY.

"Value" has meaning only in association with a *type*; if the same
object (representation) is accessed using a different type, it is
possible for a different, possibly valid, value to result. If the
other type is inappropriate, undefined behavior results; however,
during the lifetime of an object, a strictly conforming program
can always access the bytes of an addressable object via aliasing
to array of unsigned char, a la memcpy().

An attempt to use the value of an indeterminately-valued object
is certainly one way to obtain undefined behavior, but this has
nothing to do with the nature of objectness -- it has everything
to do with the nature of valueness. Such an object qua object
may be just fine, it's just that its value (according to some
type, e.g. the declared type) is at that moment not a valid
value for whatever reason.

There are also "objects" that are no longer valid *as objects*,
for example auto variables that have gone out of scope. They
also are considered to "have indeterminate values", but for a
different reason than in the case of invalidated pointer values.
Such objects may *not* be memcpy()ed, because they don't exist.


Lawrence Kirby

unread,
Nov 27, 1997, 3:00:00 AM11/27/97
to

In article <65j5ph$n...@sjx-ixn10.ix.netcom.com>

gw...@ix.netcom.com "Douglas A. Gwyn" writes:

Well, I've already made the comment in public! :-)

Seriously, what are the requirements of a public comment that would actually
bring it to the attention of the committee? I imagine most submissions
are channelled through representatives. Maybe having a tame Peter Seebach
would help. :-) Presumably the committee would be looking for this sort of
comment in response to the C9X draft when it comes out; I guess there's
not much point in trying to fix the current form of C89 now.

Peter Seebach

unread,
Nov 27, 1997, 3:00:00 AM11/27/97
to

In article <65j5ie$e...@sjx-ixn1.ix.netcom.com>,

Douglas A. Gwyn <gw...@ix.netcom.com> wrote:
>possible for a different, possibly valid, value to result. If the
>other type is inappropriate, undefined behavior results; however,
>during the lifetime of an object, a strictly conforming program
>can always access the bytes of an addressable object via aliasing
>to array of unsigned char, a la memcpy().

Here's a rare thing: Members of the committee disagreeing.

If I say
{
unsigned char x;
printf("%c\n", x);
}
I have invoked undefined behavior, because x had indeterminate value.

The promise that you can access something as an array of unsigned char
does not affect or qualify 3.16's prohibition on accessing indeterminately
valued objects. It is a qualification on the rule that trying to look
at an object through the "wrong" type is undefined behavior.

Indeterminately valued things are, so far as I can tell, *not* protected
in any way by this. Access to them, by any means, is undefined behavior.

>An attempt to use the value of an indeterminately-valued object
>is certainly one way to obtain undefined behavior, but this has
>nothing to do with the nature of objectness -- it has everything
>to do with the nature of valueness.

Hmm. Perhaps this can be made consistent in terms of the current
standard after all:

When I access an indeterminately valued object, i.e., stack garbage,
I am accessing something which, as uninitialized space, is "always"
indeterminate.

When I access a pointer to freed space through an lvalue of type
unsigned char, I am accessing something which is only indeterminately
valued *as a pointer*.

Hmm.

-s
--
se...@plethora.net -- I am not speaking for my employer. Copyright '97
All rights reserved. This was not sent by my cat. C and Unix wizard -
send mail for help, or send money for a consultation. Visit my new ISP
<URL:http://www.plethora.net/> --- More Net, Less Spam! Plethora . Net

Lawrence Kirby

unread,
Nov 28, 1997, 3:00:00 AM11/28/97
to

In article <65ktv8$gbf$8...@darla.visi.com>
se...@plethora.net "Peter Seebach" writes:

...

>Here's a rare thing: Members of the committee disagreeing.
>
>If I say
> {
> unsigned char x;
> printf("%c\n", x);
> }
>I have invoked undefined behavior, because x had indeterminate value.

Right, I don't think that there is any doubt about that.

...

>>An attempt to use the value of an indeterminately-valued object
>>is certainly one way to obtain undefined behavior, but this has
>>nothing to do with the nature of objectness -- it has everything
>>to do with the nature of valueness.
>
>Hmm. Perhaps this can be made consistent in terms of the current
>standard after all:
>
>When I access an indeterminately valued object, i.e., stack garbage,
>I am accessing something which, as uninitialized space, is "always"
>indeterminate.
>
>When I access a pointer to freed space through an lvalue of type
>unsigned char, I am accessing something which is only indeterminately
>valued *as a pointer*.

Agreed, that is the conclusion I have come to over the course of this
discussion. I was wondering if there were any odd cases due to the
possibility of aliasing corresponding signed and unsigned types (as
per 6.3) but I can't think of any. The most worrying case is that of
unions where writing to a member and reading from another is
implementation-defined (6.3.2.3). On platforms where reading a value
can cause a trap, does the implementation have to bend over backwards to
make sure it doesn't in this case? It has always struck me that this
guarantee of implementation-defined behaviour causes a lot of headaches
for no commensurate benefits. Would removing it actually break anything?

Lawrence Kirby

unread,
Nov 28, 1997, 3:00:00 AM11/28/97
to

In article <65d9ob$6a2$1...@helios.crest.nt.com>
k...@helios.crest.nt.com "Kaz Kylheku" writes:

>In article <880146...@genesis.demon.co.uk>,
>Lawrence Kirby <fr...@genesis.demon.co.uk> wrote:
>>In article <SOztunm...@geraldo.newaygo.mi.us>


>> Roger_Ons...@compsys.com.au "Roger Onslow" writes:
>>>The standard didn't say a 'storage object' is said an 'object' with
>>>indeterminate value. One could argue that the intermediate/temporary value
>>>is an object without a name.
>>>

>>>Any thoughts anyone??
>>
>>Don't do that - a value is *not* an object. Anyway as I noted above I don't
>>think this is necessary since only values stored in objects can be
>>indeterminate.
>
>What about fclose(p) + p?
>
>The right hand operand could be fetched before the call to fclose. After being
>fetched, it exists somewhere as a value independent of the underlying object.
>Yet the call to fclose() can invalidate that value.

An interesting case but not one that causes a problem. Since there is an
ordering that has undefined behaviour, the expression overall has
undefined behaviour. Therefore there's no need to worry about any other
ordering.

Douglas A. Gwyn

unread,
Nov 28, 1997, 3:00:00 AM11/28/97
to

In article <880676...@genesis.demon.co.uk>,

fr...@genesis.demon.co.uk (Lawrence Kirby) wrote:
>>If I say
>> {
>> unsigned char x;
>> printf("%c\n", x);
>> }
>>I have invoked undefined behavior, because x had indeterminate value.
>Right, I don't think that there is any doubt about that.

While the standard currently categorizes that as undefined behavior,
nonetheless it will not "break" using any conforming implementation.
(But the output will be more or less unpredictable.)
It would have been more interesting with
printf("%u\n", (unsigned)x);
This is a form of "benign" undefined behavior.

Here is the *real* test, because it doesn't output anything that
depends on the indeterminate value:

int main(void) { unsigned char x; x = x; return 0; }

I maintain that that is (or should be) a universally acceptable
program, despite the fact that it "uses" an indeterminate value.
(This depends on the fact that there are only a finite number of
possibilities for the contents of the storage designated by x,
and the behavior is well-defined for every single one of them.)

>The most worrying case is that of
>unions where writing to a member and reading from another is
>implementation-defined (6.3.2.3).

As I recall, that's another thing that should be fixed by C9x.
Certainly, this should not have been specified as "implementation-
defined", because we don't want to insist that it be meaningful in
all cases for every implementation.


Peter Seebach

unread,
Nov 28, 1997, 3:00:00 AM11/28/97
to

In article <65ln1a$e...@dfw-ixnews4.ix.netcom.com>,

Douglas A. Gwyn <gw...@ix.netcom.com> wrote:
>In article <880676...@genesis.demon.co.uk>,
> fr...@genesis.demon.co.uk (Lawrence Kirby) wrote:
>>>If I say
>>> {
>>> unsigned char x;
>>> printf("%c\n", x);
>>> }
>>>I have invoked undefined behavior, because x had indeterminate value.
>>Right, I don't think that there is any doubt about that.

>While the standard currently categorizes that as undefined behavior,
>nonetheless it will not "break" using any conforming implementation.

Why not? The conforming implementation is allowed to, whenever an
object is referenced, look in a separate table of "is this determinate",
realize the byte has indeterminate value, and refuse to load it.

Now, no *practical* implementation will break it, but I think they're
allowed to.

>Here is the *real* test, because it doesn't output anything that
>depends on the indeterminate value:

> int main(void) { unsigned char x; x = x; return 0; }

>I maintain that that is (or should be) a universally acceptable
>program, despite the fact that it "uses" an indeterminate value.

>(This depends on the fact that there are only a finite number of
>possibilities for the contents of the storage designated by x,
>and the behavior is well-defined for every single one of them.)

Ahh, but that's the *contents* - values in C can have attributes which
are not contained in them, such as "type", and I believe they can also
have attributes such as "determinacy" - i.e., in the same way that
an interpreter can check and realize that these bytes were not originally
a float at all, it is allowed to realize that they were not originally
determinate, even if any pattern they had *would* be valid.

x is allowed to have no pattern of bits at all, at this point.

Craig Franck

unread,
Nov 29, 1997, 3:00:00 AM11/29/97
to

Douglas A. Gwyn <gw...@ix.netcom.com> wrote:
>In article <880676...@genesis.demon.co.uk>,
> fr...@genesis.demon.co.uk (Lawrence Kirby) wrote:
>>>If I say
>>> {
>>> unsigned char x;
>>> printf("%c\n", x);
>>> }
>>>I have invoked undefined behavior, because x had indeterminate value.
>>Right, I don't think that there is any doubt about that.
>
>While the standard currently categorizes that as undefined behavior,
>nonetheless it will not "break" using any conforming implementation.
>(But the output will be more or less unpredictable.)

It seems you are saying that there are certain circumstances in which
the undefined behavior that gets invoked must still result in a well
behaved program, or the implementation is broken.

>It would have been more interesting with
> printf("%u\n", (unsigned)x);
>This is a form of "benign" undefined behavior.

So, you're imposing a benignness requirement.

>Here is the *real* test, because it doesn't output anything that
>depends on the indeterminate value:
>
> int main(void) { unsigned char x; x = x; return 0; }
>
>I maintain that that is (or should be) a universally acceptable
>program, despite the fact that it "uses" an indeterminate value.
>(This depends on the fact that there are only a finite number of
>possibilities for the contents of the storage designated by x,
>and the behavior is well-defined for every single one of them.)

OK, but what exactly are you saying?

[1] It shouldn't be undefined behavior; let's change it.

[2] An implementation must continue to function as if you didn't
invoke undefined behavior in some circumstances (it has to ignore
it and continue to function in a stable manner; no going loopy at
that point or terminating the program with a diagnostic, for
example).

Douglas A. Gwyn

unread,
Nov 29, 1997, 3:00:00 AM11/29/97
to

In article <65mu8k$4kv$3...@darla.visi.com>,

se...@plethora.net (Peter Seebach) wrote:
>> int main(void) { unsigned char x; x = x; return 0; }
>x is allowed to have no pattern of bits at all, at this point.

That's why I drew up that example, because it points up how we disagree.


Peter Seebach

unread,
Nov 29, 1997, 3:00:00 AM11/29/97
to

[note crosspost]
In article <65o8kv$9...@dfw-ixnews12.ix.netcom.com>,

Douglas A. Gwyn <gw...@ix.netcom.com> wrote:

Hmm. Okay, let's summarize - your argument is that, since x is
an unsigned char, it *must* have a meaningful value. Mine is that
x has indeterminate value (because it's not initialized), and that
access to indeterminately-valued objects invokes undefined behavior,
and that, thus, whether or not x has a value, the compiler is allowed
to do anything it wants at this point.

It might be worth seeking interpretation from the committee on this
in the future, because this has the potential to make or break a *LOT*
of code.

Think about
struct { int a, b; } x, y;
x.a = 1;
y = x;
I say "undefined, copies x.b".

Now, think about
x.a = 1; x.b = 2;
y = x;
I say it's defined, but I could see someone arguing that I'm accessing
indeterminately-valued padding. ;) (I think they're wrong.)

Douglas A. Gwyn

unread,
Nov 29, 1997, 3:00:00 AM11/29/97
to

In article <65nnrq$r...@mtinsc04.worldnet.att.net>,

Craig Franck <clfr...@worldnet.att.net> wrote:
>OK, but what exactly are you saying?

I already said it!

There is a conflict between the guaranteed access to the innards
of live objects via aliasing to array of unsigned char and the
"indeterminate value" of an uninitialized auto object of type
unsigned char. The way *I* would resolve this conflict is to
note what happens for uninitialized auto objects of other types;
there is no statement in the standard that the values of the
*component unsigned chars* be indeterminate, and indeed the
clear intent (according to Clive and me, anyway) is that
accessing *those* unsigned chars does not evoke undefined
behavior. On that model, it must be deemed a mere accident that
the declared type happened to be unsigned char; the component
unsigned char is still accessible. The blanket statement that
using indeterminate values produces undefined behavior did not
take this special situation into account. A correct fix would
be to change that condition to state that using indeterminate
values, other than values of objects of character type during
their lifetimes, produces undefined behavior.


Peter Seebach

unread,
Nov 29, 1997, 3:00:00 AM11/29/97
to

In article <65o9fa$7...@dfw-ixnews11.ix.netcom.com>,

Douglas A. Gwyn <gw...@ix.netcom.com> wrote:
>There is a conflict between the guarantees access of the innards
>of live objects via aliasing to array of unsigned char and the
>"indeterminate value" of an uninitialized auto object of type
>unsigned char.

I don't think so. There is *NOT* a guarantee that you can access
"anything" through unsigned char. Where is this guarantee? All
I can find is that the 'unsigned char' is exempt from *yet another*
source of undefined behavior, which is inappropriate aliasing.

>The way *I* would resolve this conflict is to
>note what happens for uninitialized auto objects of other types;
>there is no statement in the standard that the values of the
>*component unsigned chars* be indeterminate, and indeed the
>clear intent (according to Clive and me, anyway) is that
>accessing *those* unsigned chars does not evoke undefined
>behavior.

See, I think it's there that we part ways - I see no guarantee that
it is possible to access anything through unsigned char.

I see only a guarantee that, if you can safely read it in any way
whatsoever, you can *also* safely read it as unsigned char.

>A correct fix would
>be to change that condition to state that using indeterminate
>values, other than values of objects of character type during
>their lifetimes, produces undefined behavior.

I don't think that's necessary - the language as it stands is, IMHO,
consistent and viable. There is never any *correct* reason to access
an indeterminately-valued object, or more generally, to access
indeterminately-valued space.

Douglas A. Gwyn

unread,
Nov 29, 1997, 3:00:00 AM11/29/97
to

In article <65o9qp$kb2$1...@darla.visi.com>,

se...@plethora.net (Peter Seebach) wrote:
>>There is a conflict between the guarantees access of the innards
>>of live objects via aliasing to array of unsigned char and the
>>"indeterminate value" of an uninitialized auto object of type
>>unsigned char.
>I don't think so. There is *NOT* a guarantee that you can access
>"anything" through unsigned char. Where is this guarantee? All
>I can find is that the 'unsigned char' is exempt from *yet another*
>source of undefined behavior, which is inappropriate aliasing.

That's a literal reading of the text in 6.3, but the footnote
indicates a wider intent, which I thought had been adopted for
C9x (one of Clive's proposals). Certainly the committee has
agreed verbally during meetings that memcpy-like operations
are supposed to work on padding and indeterminate values (it
was always assumed implicitly that the operation occurs within
the object's lifetime).

>... There is never any *correct* reason to access
>an indeterminately-valued object, or more generally, to access
>indeterminately-valued space.

That's easy to dispute -- consider memcpy of a dynamically
allocated structure, some of whose members are not valid because
they specify properties that are not relevant for the node type.
The uninitialized members are indeterminate, but everybody seems
to agree that copying the whole node bytewise should be safe.


Craig Franck

unread,
Nov 29, 1997, 3:00:00 AM11/29/97
to

se...@plethora.net (Peter Seebach) wrote:
>In article <65o9fa$7...@dfw-ixnews11.ix.netcom.com>,
>Douglas A. Gwyn <gw...@ix.netcom.com> wrote:
>>There is a conflict between the guarantees access of the innards
>>of live objects via aliasing to array of unsigned char and the
>>"indeterminate value" of an uninitialized auto object of type
>>unsigned char.
>
>I don't think so. There is *NOT* a guarantee that you can access
>"anything" through unsigned char. Where is this guarantee? All
>I can find is that the 'unsigned char' is exempt from *yet another*
>source of undefined behavior, which is inappropriate aliasing.

There is a feeling held by some that you should be able to do "raw
reads" of memory, completely unfettered access on a byte by byte
basis of any solid object.

[There have only been a couple of days in the last week or so that
I have been able to spend any real time with the news; I will have
to go back and read this thread all the way through at some point.]

>>The way *I* would resolve this conflict is to
>>note what happens for uninitialized auto objects of other types;
>>there is no statement in the standard that the values of the
>>*component unsigned chars* be indeterminate, and indeed the
>>clear intent (according to Clive and me, anyway) is that
>>accessing *those* unsigned chars does not evoke undefined
>>behavior.
>
>See, I think it's there that we part ways - I see no guarantee that
>it is possible to access anything through unsigned char.
>
>I see only a guarantee that, if you can safely read it in any way
>whatsoever, you can *also* safely read it as unsigned char.

That is the way I understand it.

[...]

>There is never any *correct* reason to access
>an indeterminately-valued object, or more generally, to access
>indeterminately-valued space.

That's a solid position to take. In most instances it would be bad
indeed (especially if issues such as with padding bytes are resolved
by considering them to be undefined and not indeterminately-valued).

Kaz Kylheku

unread,
Nov 29, 1997, 3:00:00 AM11/29/97
to

In article <65o90h$jer$1...@darla.visi.com>,

Peter Seebach <se...@plethora.net> wrote:
>[note crosspost]
>In article <65o8kv$9...@dfw-ixnews12.ix.netcom.com>,
>Douglas A. Gwyn <gw...@ix.netcom.com> wrote:
>>In article <65mu8k$4kv$3...@darla.visi.com>,
>> se...@plethora.net (Peter Seebach) wrote:
>>>> int main(void) { unsigned char x; x = x; return 0; }
>>>x is allowed to have no pattern of bits at all, at this point.
>>That's why I drew up that example, because it points up how we disagree.
>
>Hmm. Okay, let's summarize - your argument is that, since x is
>an unsigned char, it *must* have a meaningful value. Mine is that
>x has indeterminate value (because it's not initialized), and that
>access to indeterminately-valued objects invokes undefined behavior,
>and that, thus, whether or not x has a value, the compiler is allowed
>to do anything it wants at this point.

C9X (Working Draft 1997-11-21) clearly is in agreement with Doug (and myself)
in this issue: namely, that an object may be accessed through an l-value of
type unsigned char even if it's indeterminate as a value of its declared
type.

The only problem is that this isn't clear from the 9899:1990 standard.

Thus, what matters is which interpretation is meaningful for all or most of the
existing implementations of the language. New implementations will quite likely
use the newer documentation to settle questions such as these. My understanding
is that when the new working document becomes a standard, the previous standard
shall be retired.

If all the implementations have predictable behavior upon accessing
uninitialized unsigned char objects, then the issue is really moot.

Now, for the relevant bits from the Working Document (footnotes elided):

6.1.2.8 Representations of types

...

6.1.2.8.1 General

1 Values of type unsigned char shall be represented using a pure binary notation.

2 When stored in objects of any other object type, values of that type consist of
n*CHAR_BIT bits, where n is the size of an object of that type, in bytes. The
value may be copied into an object of type unsigned char [n] (e.g., by memcpy);
the resulting set of bytes is called the object representation of the value.
Two values with the same object representation shall compare equal, but values
that compare equal might have different object representations.

3 Certain object representations might not represent a value of that type. If the
stored value of an object has such a representation and is accessed by an lvalue
expression that does not have character type, the behavior is undefined.
If such a representation is produced by a side effect that modifies all or any
part of the object by an lvalue expression that does not have character type,
the behavior is undefined. Such representations are called trap representations.

The new standard makes it quite clear that you can fiddle with an object using
an l-value of character type without causing a problem.
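
To make paragraph 2 of the draft text concrete, here is a minimal sketch of copying a value into an unsigned char [n] array with memcpy and copying it back. It deliberately uses an int that already holds a determinate value, so it shows only the uncontested case; the function name round_trip_int is made up for the illustration.

```c
#include <assert.h>
#include <string.h>

/* Copy the object representation of an int into a byte array and back.
   The int holds a determinate value, so nothing here is in dispute. */
int round_trip_int(int value)
{
    unsigned char bytes[sizeof(int)];
    int copy;

    memcpy(bytes, &value, sizeof value);   /* object -> unsigned char [n] */
    memcpy(&copy, bytes, sizeof copy);     /* unsigned char [n] -> object */
    return copy;
}
```

Calling round_trip_int(12345) should hand back 12345. Whether the same copy is allowed when the source object is indeterminate is the open question.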

If this is true of all existing C89 implementations, then it's true of the
current language as a matter of fact. We can interpret the current standard this
way or that, but it will be superseded anyway.

Douglas A. Gwyn

Nov 29, 1997

In article <65o90h$jer$1...@darla.visi.com>,

se...@plethora.net (Peter Seebach) wrote:
>Think about
> struct { int a, b; } x, y;
> x.a = 1;
> y = x;
>I say "undefined, copies x.b".
>Now, think about
> x.a = 1; x.b = 2;
> y = x;
>I say it's defined, but I could see someone arguing that I'm accessing
>indeterminately-valued padding. ;) (I think they're wrong.)

Actually, I agree with you about the definedness of these.
The first example copies the "value" of the structure,
which has to mean copies the values of its members, and
the value of x.b is indeterminate.
In the second example, all the members have determinate
values. Padding does not participate.
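
A small sketch of the second case, where every member is assigned before the structure is copied. The struct layout and the name copy_preserves_members are assumptions for illustration only; padding after the char member is likely but not guaranteed.

```c
#include <assert.h>

struct pair { char tag; int count; };   /* padding after tag is likely */

/* Returns 1 when every member of the copy equals the original. */
int copy_preserves_members(void)
{
    struct pair x, y;

    x.tag = 'a';
    x.count = 2;      /* all members are now determinate */
    y = x;            /* structure assignment copies the member values */
    return y.tag == x.tag && y.count == x.count;
}
```

The check compares members rather than using memcmp, because the padding bytes of x and y are not required to match.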


Peter Seebach

Nov 29, 1997

In article <65puij$hef$1...@helios.crest.nt.com>,

Kaz Kylheku <k...@helios.crest.nt.com> wrote:
>C9X (Working Draft 1997-11-21) clearly is in agreement with Doug (and myself)
>on this issue: namely, that an object may be accessed through an l-value of
>type unsigned char even if it's indeterminate as a value of its declared
>type.

I disagree; I don't see the wording guaranteeing that.

>part of the object by an lvalue expression that does not have character type,
>the behavior is undefined. Such representations are called trap representations.

>The new standard makes it quite clear that you can fiddle with an object using
>an l-value of character type without causing a problem.

No, it makes it clear that character types have no *trap representations*.

Do you think you can do
unsigned char x = *(unsigned char *)0;
? No.

THE ABILITY TO REFER TO THINGS THROUGH UNSIGNED CHAR LVALUES DOES NOT
MAKE UP FOR ANY *OTHER* KIND OF UNDEFINED BEHAVIOR.

Undefined behavior's rules say that access to indeterminately valued objects
is undefined behavior. It doesn't say "indeterminately valued objects whose
bits are a trap representation for a given type", it doesn't qualify this
in *any way*. None of the things that say "you can do this with an unsigned
char lvalue" say "even if it is indeterminately valued". Therefore, they
are not limiting that undefinedness.

I really don't see the wording in either standard making this guarantee.

All it's saying is that, if there is *any* way you can access something
safely, you can *also* access it as a sequence of chars.

Peter Seebach

Nov 29, 1997

In article <65po2b$8...@sjx-ixn4.ix.netcom.com>,

Douglas A. Gwyn <gw...@ix.netcom.com> wrote:
>That's a literal reading of the text in 6.3, but the footnote
>indicates a wider intent, which I thought had been adopted for
>C9x (one of Clive's proposals). Certainly the committee has
>agreed verbally during meetings that memcpy-like operations
>are supposed to work on padding and indeterminate values (it
>was always assumed implicitly that the operation occurs within
>the object's lifetime).

I have seen it agreed that it works on padding. I have seen
people say that they think unsigned charness allows copying
indeterminate values, but I've also seen them back down confronted
with 3.16. :)

>>... There is never any *correct* reason to access
>>an indeterminately-valued object, or more generally, to access
>>indeterminately-valued space.

>That's easy to dispute -- consider memcpy of a dynamically
>allocated structure, some of whose members are not valid because
>they specify properties that are not relevant for the node type.
>The uninitialized members are indeterminate, but everybody seems
>to agree that copying the whole node bytewise should be safe.

I'm not sure I agree, and I think that, if we want that, the wording
should be changed to reflect it. (For instance, 3.16 could say
"... access of indeterminately valued objects except through lvalues
of character type" to give the meaning you seem to find in it.)
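
For code that has to work under either reading, one hedged workaround is to make sure no byte of the node is left as allocation garbage before it is copied. The sketch below uses calloc for that; the node layout and the name clone_int_node are hypothetical and not taken from anyone's post.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum node_kind { NODE_INT, NODE_TEXT };

struct node {
    enum node_kind kind;
    int number;          /* meaningful only for NODE_INT  */
    const char *text;    /* meaningful only for NODE_TEXT */
};

/* Allocate a zero-filled node, set only the relevant members, and return
   a bytewise copy.  Returns NULL if either allocation fails. */
struct node *clone_int_node(int number)
{
    struct node *src = calloc(1, sizeof *src);
    struct node *dst = malloc(sizeof *dst);

    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return NULL;
    }
    src->kind = NODE_INT;
    src->number = number;            /* text is never assigned */
    memcpy(dst, src, sizeof *dst);   /* no byte of *src is malloc garbage */
    free(src);
    return dst;
}
```

The caller owns the returned node and frees it. The text member of the copy is still not something to read as a pointer, since all-bits-zero is not promised to be a null pointer.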

Kaz Kylheku

Nov 30, 1997

In article <65q0ni$7ut$2...@darla.visi.com>,

Peter Seebach <se...@plethora.net> wrote:
>>The new standard makes it quite clear that you can fiddle with an object using
>>an l-value of character type without causing a problem.
>
>No, it makes it clear that character types have no *trap representations*.
>
>Do you think you can do
> unsigned char x = *(unsigned char *)0;
>? No.

Of course not, because there is no object at the null pointer address, so
we can no longer say that we are accessing an object.

This is a whole different issue completely. We are now talking about the
semantics of the unary operator * in connection with invalid pointers.

>THE ABILITY TO REFER TO THINGS THROUGH UNSIGNED CHAR LVALUES DOES NOT
>MAKE UP FOR ANY *OTHER* KIND OF UNDEFINED BEHAVIOR.

It clearly makes up for undefined behavior related to referencing a valid,
existing object that has an indeterminate value from the point of view of
some representation which leaves room for non-values or trap representations.

>Undefined behavior's rules say that access to indeterminately valued objects
>is undefined behavior. It doesn't say "indeterminately valued objects whose

But that is vacuous unless you pin down which type you are talking about.
An object can simultaneously exhibit more than one value based on which
lvalue type is used to look at it. Every possible encoding in a byte
corresponds to some valid character type value.

I'm saying that an object interpreted as a character type cannot possibly have
an indeterminate value, so the rule does not apply.

The value of an uninitialized auto object of type unsigned char is simply
a random number between 0 and UCHAR_MAX. It is no less determinate than the
return value of rand().

>bits are a trap representation for a given type", it doesn't qualify this
>in *any way*. None of the things that say "you can do this with an unsigned
>char lvalue" say "even if it is indeterminately valued". Therefore, they
>are not limiting that undefinedness.
>
>I really don't see the wording in either standard making this guarantee.
>
>All it's saying is that, if there is *any* way you can access something
>safely, you can *also* access it as a sequence of chars.

No, because it also essentially says that even if there is no other way to
access something safely, you can still access it as a sequence of chars. The
object could have a bit pattern that is not a value of its declared type (e.g.
a trap representation, or other), but it can still be accessed through a
character type.

Kaz Kylheku

Nov 30, 1997

In article <65q0r0$7ut$3...@darla.visi.com>,

Peter Seebach <se...@plethora.net> wrote:
>I'm not sure I agree, and I think that, if we want that, the wording
>should be changed to reflect it. (For instance, 3.16 could say
>"... access of indeterminately valued objects except through lvalues
>of character type" to give the meaning you seem to find in it.)

That would be redundant since an object can't be indeterminate from the
point of view of an lvalue of character type. :) Yet the change would be
beneficial as it would settle arguments.

Craig Franck

Nov 30, 1997

k...@helios.crest.nt.com (Kaz Kylheku) wrote:
>In article <65q0ni$7ut$2...@darla.visi.com>,
>Peter Seebach <se...@plethora.net> wrote:

>>Undefined behavior's rules say that access to indeterminately valued objects
>>is undefined behavior. It doesn't say "indeterminately valued objects whose
>
>But that is vacuous unless you pin down which type you are talking about.

Doesn't the lack of qualification say it all?

>An object can simultaneously exhibit more than one value based on which
>lvalue type is used to look at it. Every possible encoding in a byte
>corresponds to some valid character type value.
>
>I'm saying that an object interpreted as a character type cannot possibly have
>an indeterminate value, so the rule does not apply.

I think it's more of an issue of access rights.

>The value of an uninitialized auto object of type unsigned char is simply
>a random number between 0 and UCHAR_MAX. It is no less determinate than the
>return value of rand().

But you invoke undefined behavior if you try to look at it. If your
program abends at this point, can you really say you determined its
value?

>>I really don't see the wording in either standard making this guarantee.
>>
>>All it's saying is that, if there is *any* way you can access something
>>safely, you can *also* access it as a sequence of chars.
>
>No, because it also essentially says that even if there is no other way to
>access something safely, you can still access it as a sequence of chars.

I think it would be better to free implementations from this
perceived requirement.

--
Craig
clfr...@worldnet.att.net
Manchester, NH

Do not choose to be wrong for the sake of being
different. -- Lord Samuel


Peter Seebach

Dec 1, 1997

In article <65sdb5$iif$1...@helios.crest.nt.com>,

Kaz Kylheku <k...@helios.crest.nt.com> wrote:
>>THE ABILITY TO REFER TO THINGS THROUGH UNSIGNED CHAR LVALUES DOES NOT
>>MAKE UP FOR ANY *OTHER* KIND OF UNDEFINED BEHAVIOR.

>It clearly makes up for undefined behavior related to referencing a valid,
>existing object that has an indeterminate value from the point of view of
>some representation which leaves room for non-values or trap representations.

No, because the "undefined behavior from accessing through the wrong type"
does not say anything about "value", it just says "undefined behavior because
wrong type". All we say is that (unsigned char) lvalues aren't in a *type*
conflict.

>>Undefined behavior's rules say that access to indeterminately valued objects
>>is undefined behavior. It doesn't say "indeterminately valued objects whose

>But that is vacuous unless you pin down which type you are talking about.

Oh, so malloc'd memory isn't indeterminately valued then?

Newly allocated memory and auto objects are indeterminately valued for
all types - any attempt to limit their indeterminacy to "their type" is
missing the point, which is that there is *no* meaningful value. Not
in any type.

I believe this is a Good Thing - it allows things like Purify, and I
believe that "access to indeterminately valued object" in the general
case is a legitimate run-time diagnostic, and one which should be allowed
to abort execution.

>An object can simultaneously exhibit more than one value based on which
>lvalue type is used to look at it. Every possible encoding in a byte
>corresponds to some valid character type value.

So? Objects may have attributes which are not their value. When you
look through a pointer, you are allowed to know, not just what bits are
pointed at, but what type they are, and whether or not they are currently
determinate.

>I'm saying that an object interpreted as a character type cannot possibly have
>an indeterminate value, so the rule does not apply.

What you're saying is, IMHO, wrong. An object interpreted as a character
type cannot have a value which is a trap representation, to use the C9X
terms. The *representation* cannot be bad to load as a char - but the
language is not limited by what is sane to implement.

Accessing indeterminately valued space through unsigned char values is
like accessing an array-of-arrays as a single big array - a compiler is
allowed to know the extra information it needs to trap the error, even
though the error is *entirely* a function of the abstract machine, not of
the underlying physical machine.

C objects have at least one extra state beyond that their bits can represent,
and that is "there's no value here".

>The value of an uninitialized auto object of type unsigned char is simply
>a random number between 0 and UCHAR_MAX. It is no less determinate than the
>return value of rand().

No. It is *DEFINED* as indeterminate. Therefore, the bets are off.

This is like saying that
unsigned char x = 0;
x = x++;
*must* generate a number between 0 and UCHAR_MAX. A rule has been broken,
and the text of that rule is not modified by any of the text for chars.

>No, because it also essentially says that even if there is no other way to
>access something safely, you can still access it as a sequence of chars.

No, *you* say that. The standard says nothing of the sort. It says that
behavior is undefined when you access something through the wrong type,
but that unsigned char is never the wrong type. Indeterminacy is not
solely type related - look at allocated memory.

>The
>object could have a bit pattern that is not a value of its declared type (e.g.
>a trap representation, or other), but it can still be accessed through a
>character type.

Right! As long as it's *determinate* but invalid, that's fine. So, if
I do
    char *foo;
    unsigned char *x = (unsigned char *) &foo;
accesses to *x are still wrong. Now, I can
    int i;
    for (i = 0; i < sizeof(char *); ++i)
        *x++ = 0;
and now I am guaranteed that
    x = (unsigned char *) &foo;
    for (i = 0; i < sizeof(char *); ++i)
        assert(*x++ == 0);
will work... But 'foo' is not required to be a valid pointer.

At this point, the 'unsigned char' guarantee means that I can treat foo
as an array of (sizeof(char *)) unsigned chars, and I'm fine. Nothing goes
wrong.
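
The sequence just described can be put into one self-contained function, roughly as follows. The function name zero_pointer_bytes and the returned count are additions for the sake of the sketch; foo itself is never read as a char *.

```c
#include <assert.h>
#include <stddef.h>

/* Zero every byte of a pointer object through an unsigned char lvalue,
   then read the bytes back.  Returns the number of zero bytes seen. */
size_t zero_pointer_bytes(void)
{
    char *foo;                                   /* never read as a char * */
    unsigned char *x = (unsigned char *) &foo;
    size_t i, zeros = 0;

    for (i = 0; i < sizeof foo; ++i)
        x[i] = 0;                                /* writes only */
    for (i = 0; i < sizeof foo; ++i)
        if (x[i] == 0)                           /* bytes are determinate now */
            ++zeros;
    return zeros;
}
```

The result should equal sizeof(char *), which is all the guarantee amounts to: the bytes are zero, but nothing says foo now holds a null pointer.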

However, it is not the *TYPE* of foo that made it undefined; it was that
it was indeterminate.

If you want to pursue this type thing, you have to convince me that
malloc produces memory of a specific known type. :)

Douglas A. Gwyn

Dec 1, 1997

In article <65tdoh$806$8...@darla.visi.com>,

se...@plethora.net (Peter Seebach) wrote:
>>I'm saying that an object interpreted as a character type cannot possibly have
>>an indeterminate value, so the rule does not apply.
>What you're saying is, IMHO, wrong. An object interpreted as a character
>type cannot have a value which is a trap representation, to use the C9X
>terms. The *representation* cannot be bad to load as a char - but the
>language is not limited by what is sane to implement.

I agree with Kaz on this (assuming the object is within its lifetime).
"Trap representation" is a red herring -- it's true that "character types"
cannot have trap representations, but the main thing is that all bit
patterns in a char representation are valid values (regardless of any
trapping issues), which we insisted on to ensure that arbitrary (live)
objects *could* be accessed bytewise in s.c. programs. This differs
from the situation for other data types, which might (for example)
contain extra "must be zero" bits or else the implementation's
assumptions will be violated, which is really what makes such cases
undefined behavior. (Trapping is another, somewhat different, case.)

>C objects have at least one extra state beyond that their bits can represent,
>and that is "there's no value here".

I disagree with that -- it's not an evident part of the "abstract
machine". The C Standard's use of "indeterminate value" is simply
to allow implementations license to misbehave when such values are
accessed, thus "undefined behavior". But in examples such as the
one we were discussing, where the representation has to be of a
valid value, there is no way for the implementation to misbehave
(except if the implementor intentionally decides to make it
misbehave, which requires a lot of extra mechanism, because he
believes that the standard allows him to play games like that).

>>The value of an uninitialized auto object of type unsigned char is simply
>>a random number between 0 and UCHAR_MAX. It is no less determinate than the
>>return value of rand().
>No. It is *DEFINED* as indeterminate. Therefore, the bets are off.

`But "glory" doesn't mean "a nice knock-down argument,"' Alice objected.
`When I use a word,' Humpty Dumpty said in rather a scornful tone, `it means just what I choose it to mean -- neither more nor less.'
`The question is,' said Alice, `whether you can make words mean so many different things.'
` The question is,' said Humpty Dumpty, `which is to be master -- that's all.'

>This is like saying that
> unsigned char x = 0;
> x = x++;
>*must* generate a number between 0 and UCHAR_MAX. A rule has been broken,
>and the text of that rule is not modified by any of the text for chars.

This is a different situation. The original discussion concerned
bytewise access "in a proper manner", while this one is "improper".
There are several good practical reasons for x = x++ to fail,
but no good reason for bytewise access of a live object to fail.

>>No, because it also essentially says that even if there is no other way to
>>access something safely, you can still access it as a sequence of chars.
>No, *you* say that. The standard says nothing of the sort. It says that
>behavior is undefined when you access something through the wrong type,
>but that unsigned char is never the wrong type.

Which should tell one something about the intention!


R S Haigh

Dec 1, 1997

In article <65pnga$k...@dfw-ixnews6.ix.netcom.com>, Douglas A. Gwyn <gw...@ix.netcom.com> writes:
> In article <65o90h$jer$1...@darla.visi.com>,

> se...@plethora.net (Peter Seebach) wrote:
> >Think about
> > struct { int a, b; } x, y;
> > x.a = 1;
> > y = x;
> >I say "undefined, copies x.b".
> >Now, think about
> > x.a = 1; x.b = 2;
> > y = x;
> >I say it's defined, but I could see someone arguing that I'm accessing
> >indeterminately-valued padding. ;) (I think they're wrong.)
>
> Actually, I agree with you about the definedness of these.
> The first example copies the "value" of the structure,
> which has to mean copies the values of its members, and
> the value of x.b is indeterminate.
> In the second example, all the members have determinate
> values. Padding does not participate.

Suppose I zero a struct with calloc or memset, and then copy it
by assignment. Are all the members considered to have values?
Does it depend on the types of the members? If so, what happens
if the object is a union instead of a struct, and e.g the union
has both int and pointer members?
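
Until that is answered, a cautious sketch is to zero the bytes and then also assign each member, since the standard does not promise that a null pointer or a floating-point zero is represented by all-bits-zero. The struct record type and the function name are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct record {
    int count;
    double ratio;
    char *name;
};

/* Zero the bytes (padding included), assign every member explicitly,
   then copy by structure assignment and return the copy. */
struct record record_cleared_copy(void)
{
    struct record a, b;

    memset(&a, 0, sizeof a);
    a.count = 0;
    a.ratio = 0.0;     /* floating zero need not be all-bits-zero */
    a.name = NULL;     /* a null pointer need not be all-bits-zero */
    b = a;             /* every member of a has a value by now */
    return b;
}
```

With the explicit assignments in place, the copy does not depend on how the question about memset alone is resolved.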

--

Kaz Kylheku

Dec 1, 1997

In article <65tjfa$i...@sjx-ixn8.ix.netcom.com>,
Douglas A. Gwyn <gw...@ix.netcom.com> wrote:
>In article <65tdoh$806$8...@darla.visi.com>,

> se...@plethora.net (Peter Seebach) wrote:
>>>I'm saying that an object interpreted as a character type cannot possibly have
>>>an indeterminate value, so the rule does not apply.
>>What you're saying is, IMHO, wrong. An object interpreted as a character
>>type cannot have a value which is a trap representation, to use the C9X
>>terms. The *representation* cannot be bad to load as a char - but the
>>language is not limited by what is sane to implement.
>
>I agree with Kaz on this (assuming the object is within its lifetime).
>"Trap representation" is a red herring -- it's true that "character types"
>cannot have trap representations, but the main thing is that all bit
>patterns in a char representation are valid values (regardless of any
>trapping issues), which we insisted on to ensure that arbitrary (live)
>objects *could* be accessed bytewise in s.c. programs. This differs
>from the situation for other data types, which might (for example)
>contain extra "must be zero" bits or else the implementation's
>assumptions will be violated, which is really what makes such cases
>undefined behavior. (Trapping is another, somewhat different, case.)

I don't believe that a C object can carry any information that is not
somehow represented in the bits which comprise it. In an implementation where
CHAR_BIT is eight, an unsigned char can be in one of 256 states. Nothing more.
The width of the object is defined as eight bits. More information requires
more width.

Purify is useful, but it's non-conforming. It changes a C implementation into
something stricter than the abstract language. This is analogous to the LCLint
checker, which imposes stricter rules than what C requires (albeit at the
static-check stage rather than at execution; nevertheless the analogy is
good).

What I know of Purify is that it keeps extra information about memory in
separate structures. At the level of an individual byte, it is able to
determine whether that byte is initialized, and other attributes.
Purify does not work together with the language at the abstract level.

The concept behind Purify is somewhat flawed because it catches innocuous
errors, but lets the big ones go. It's like a non-Euclidean fishnet that only
catches small fish, but somehow lets through the big ones. :) For example, it
would never catch the use of a pointer object that became indeterminate due to
fclose()---the very error that sparked the debate we are yet carrying on.


The last time I used Purify, it wasn't able to do proper bounds checking
on pointers. If you were able to make a pointer hop from one object to another
using illegal arithmetic, it would be approved, because the validity of
accesses is checked by looking at the objects rather than the pointers
through which they are accessed. Purify says that if the pointer refers to
any valid object, it's a valid pointer. Contrast this with Bounds Checking GCC
which can diagnose incorrect pointer arithmetic.

If I am approaching some kind of point, then it is that it's not worthwhile to
have the language definition support debugging tools like Purify. These tools
don't have to behave in a conforming manner; instead, they should do whatever
it takes to find software defects.

In any case, a conforming C implementation can easily diagnose the use of an
uninitialized value, either at compile time or at run time, without being
considered non-conforming. Regardless of whether or not it is well defined to
use the value of an uninitialized unsigned char, the C implementation can
diagnose such use.

Tom Payne

Dec 1, 1997

In comp.std.c Kaz Kylheku <k...@helios.crest.nt.com> wrote:

[...]
: Now, for the relevant bits from the Working Document (footnotes elided):

: 6.1.2.8 Representations of types

: ...

: 6.1.2.8.1 General

: 1 Values of type unsigned char shall be represented using a pure binary notation.

: 2 When stored in objects of any other object type, values of that type consist of
: n*CHAR_BIT bits, where n is the size of an object of that type, in bytes. The
: value may be copied into an object of type unsigned char [n] (e.g., by memcpy);
: the resulting set of bytes is called the object representation of the value.
: Two values with the same object representation shall compare equal, but values
: that compare equal might have different object representations.

: 3 Certain object representations might not represent a value of that type. If the
: stored value of an object has such a representation and is accessed by an lvalue
: expression that does not have character type, the behavior is undefined.
: If such a representation is produced by a side effect that modifies all or any
: part of the object by an lvalue expression that does not have character type,
: the behavior is undefined. Such representations are called trap representations.

: The new standard makes it quite clear that you can fiddle with an object using
: an l-value of character type without causing a problem.

The words quoted from the draft *say* that if you fiddle with an
object holding a trap representation using an lvalue other than
character you invoke undefined behavior. Does the committee *intend*
the converse, i.e., that if you fiddle with such an object using an
(unsigned) character lvalue, there is no problem?

Also, a determinate value in a pointer object can sometimes turn into
a trap representation without modification, e.g., as the result of a
return, a free, or a close. I had understood that certain
architectures trap read access to certain pointer objects when they
hold trap values. Is this not going to cause a problem?
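
Whatever the answer, a common defensive habit avoids the question for the pointer object in hand: overwrite it right after the free, since assignment stores a new value without reading the old one. A minimal sketch, with free_and_clear as a made-up helper name:

```c
#include <assert.h>
#include <stdlib.h>

/* Free *pp and overwrite it so that no later read sees a stale value. */
void free_and_clear(int **pp)
{
    free(*pp);    /* the pointer value is read before it becomes invalid */
    *pp = NULL;   /* assignment does not read the old value */
}
```

This only helps for the one object passed in; any other copies of the freed pointer are left holding the indeterminate value.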

Tom Payne

Thad Smith

Dec 1, 1997

In article <65o9qp$kb2$1...@darla.visi.com>,
se...@plethora.net (Peter Seebach) wrote:
>In article <65o9fa$7...@dfw-ixnews11.ix.netcom.com>,

>Douglas A. Gwyn <gw...@ix.netcom.com> wrote:

>>A correct fix would
>>be to change that condition to state that using indeterminate
>>values, other than values of objects of character type during
>>their lifetimes, produces undefined behavior.
>
>I don't think that's necessary - the language as it stands is, IMHO,
>consistent and viable. There is never any *correct* reason to access
>an indeterminately-valued object, or more generally, to access
>indeterminately-valued space.

Does "correct reason" refer to style outside the scope of the
standard? Must the following program return EXIT_SUCCESS? What if a
and b are unsigned int?

#include <stdlib.h>
int main (void) {
    unsigned char a, b;
    b = a+1;
    if (b == a+1) return EXIT_SUCCESS;
    else return EXIT_FAILURE;
}
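
Leaving indeterminacy aside, the arithmetic alone is worth working through, because the two types behave differently even when a is initialized. The sketch below passes a in as a parameter; it assumes unsigned char promotes to int (true wherever UCHAR_MAX < INT_MAX), and the function names are invented.

```c
#include <assert.h>
#include <limits.h>

/* The unsigned char test with a supplied by the caller.
   Returns 1 where the program above would return EXIT_SUCCESS. */
int uchar_case(unsigned char a)
{
    unsigned char b;

    b = a + 1;          /* a promotes; the sum is converted back on store */
    return b == a + 1;  /* compared in the promoted type */
}

/* The same test with unsigned int operands. */
int uint_case(unsigned int a)
{
    unsigned int b;

    b = a + 1;          /* unsigned arithmetic wraps modulo UINT_MAX + 1 */
    return b == a + 1;
}
```

Under that assumption, uchar_case(UCHAR_MAX) fails: a + 1 is UCHAR_MAX + 1 as an int, b wraps to 0 on the store, and the comparison is false. The unsigned int version succeeds for every determinate a, so only the unsigned char version can fail on arithmetic grounds alone.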

Thad

Peter Seebach

Dec 2, 1997

In article <65tjfa$i...@sjx-ixn8.ix.netcom.com>,

Douglas A. Gwyn <gw...@ix.netcom.com> wrote:
>In article <65tdoh$806$8...@darla.visi.com>,

> se...@plethora.net (Peter Seebach) wrote:
>>What you're saying is, IMHO, wrong. An object interpreted as a character
>>type cannot have a value which is a trap representation, to use the C9X
>>terms. The *representation* cannot be bad to load as a char - but the
>>language is not limited by what is sane to implement.

>I agree with Kaz on this (assuming the object is within its lifetime).
>"Trap representation" is a red herring -- it's true that "character types"
>cannot have trap representations, but the main thing is that all bit
>patterns in a char representation are valid values (regardless of any
>trapping issues), which we insisted on to ensure that arbitrary (live)
>objects *could* be accessed bytewise in s.c. programs. This differs
>from the situation for other data types, which might (for example)
>contain extra "must be zero" bits or else the implementation's
>assumptions will be violated, which is really what makes such cases
>undefined behavior. (Trapping is another, somewhat different, case.)

No, it's why we *labeled* them undefined behavior. All that *makes*
them undefined behavior is that we say, without any qualifications,
that access to indeterminately-valued objects is undefined behavior.
Not only for types where some values may be meaningless, but for *all*
types.

>I disagree with that -- it's not an evident part of the "abstract
>machine". The C Standard's use of "indeterminate value" is simply
>to allow implementations license to misbehave when such values are
>accessed, thus "undefined behavior". But in examples such as the
>one we were discussing, where the representation has to be of a
>valid value, there is no way for the implementation to misbehave
>(except if the implementor intentionally decides to make it
>misbehave, which requires a lot of extra mechanism, because he
>believes that the standard allows him to play games like that).

If this is the intent, our words are broken; right now, the standard
does indeed allow the implementor to play extra games like that, and
I would personally love to have such an implementation.

>>No. It is *DEFINED* as indeterminate. Therefore, the bets are off.

>`But "glory" doesn't mean "a nice knock-down argument,"' Alice objected.
>`When I use a word,' Humpty Dumpty said in rather a scornful tone, `it
>means just what I choose it to mean -- neither more nor less.'
>`The question is,' said Alice, `whether you can make words mean so many
>different things.'
>` The question is,' said Humpty Dumpty, `which is to be master -- that's all.'

Yes. The C standard says, unambiguously, that the cases in discussion
are undefined behavior. You and Kaz are arguing that, since there's no
good reason for this behavior to be undefined, it really isn't, but that's
not how the *definition* works.

>>This is like saying that
>> unsigned char x = 0;
>> x = x++;
>>*must* generate a number between 0 and UCHAR_MAX. A rule has been broken,
>>and the text of that rule is not modified by any of the text for chars.

>This is a different situation. The original discussion concerned
>bytewise access "in a proper manner", while this one is "improper".
>There are several good practical reasons for x = x++ to fail,
>but no good reason for bytewise access of a live object to fail.

It doesn't need to have a good reason to be allowed. We allowed it;
the words are there.

I think there *is* a good reason for bytewise access of a live object to
fail - because the object has never been initialized, and this is a violation
of a rule in the standard. That's good enough for me.

Furthermore, such a failure is *useful* - it can catch something which
is by its very nature a programming error.

>>>No, because it also essentially says that even if there is no other way to
>>>access something safely, you can still access it as a sequence of chars.
>>No, *you* say that. The standard says nothing of the sort. It says that
>>behavior is undefined when you access something through the wrong type,
>>but that unsigned char is never the wrong type.

>Which should tell one something about the intention!

Yes. It is clearly our intention that unsigned char aliasing never
*introduces* undefined behavior. I see no solid intention that it should
provide definition for otherwise undefined behavior.

Peter Seebach

Dec 2, 1997

In article <65sdj6$iq6$1...@helios.crest.nt.com>,
Kaz Kylheku <k...@helios.crest.nt.com> wrote:
>In article <65q0r0$7ut$3...@darla.visi.com>,

>Peter Seebach <se...@plethora.net> wrote:
>>I'm not sure I agree, and I think that, if we want that, the wording
>>should be changed to reflect it. (For instance, 3.16 could say
>>"... access of indeterminately valued objects except through lvalues
>>of character type" to give the meaning you seem to find in it.)

>That would be redundant since an object can't be indeterminate from the
>point of view of an lvalue of character type. :)

Can you find any wording which says this? The quoted material about
character types has all had to do with aliasing, not determinacy.

>Yet the change would be
>beneficial as it would settle arguments.

I would like either change. I prefer a change clarifying that garbage space
(for instance, newly allocated memory) is *always* indeterminate, for
all types, because that would lead to a cleaner language, with no skin
off any legitimate code's nose.

Michael Rubenstein

Dec 2, 1997

On 2 Dec 1997 04:16:18 GMT, se...@plethora.net (Peter Seebach) wrote:

>In article <65tjfa$i...@sjx-ixn8.ix.netcom.com>,
>Douglas A. Gwyn <gw...@ix.netcom.com> wrote:
>>I disagree with that -- it's not an evident part of the "abstract
>>machine". The C Standard's use of "indeterminate value" is simply
>>to allow implementations license to misbehave when such values are
>>accessed, thus "undefined behavior". But in examples such as the
>>one we were discussing, where the representation has to be of a
>>valid value, there is no way for the implementation to misbehave
>>(except if the implementor intentionally decides to make it
>>misbehave, which requires a lot of extra mechanism, because he
>>believes that the standard allows him to play games like that).
>
>If this is the intent, our words are broken; right now, the standard
>does indeed allow the implementor to play extra games like that, and
>I would personally love to have such an implementation.

While I'm very happy to say that such machines are out of fashion, I
have worked with one in which accessing an uninitialized variable
could cause strange things to happen without the compiler playing
special games to cause this.

The IBM 1620 (early 1960s) was a variable field width machine. The
last digit of a number (it was a decimal machine) had a marker bit set
to indicate that it was the end. When moving a value, the length
specified by the source was moved to the destination, including the
marker bit.

GOTRAN was a FORTRAN-like language for the 1620. It did not support
variable length numbers under control of the program -- an integer was
always the same length in a program, though it could be set by a
compiler option. It did not initialize variables automatically at
all, so until you stored something in a variable there was no end
marker set. Hence, something like

I = J

with J not initialized could move too many digits, overwriting
something other than I.

The 1620 would have been quite unsuitable for C for a number of
reasons, but I don't see that the variable field length would cause a
problem. Since the marker was not accessible from a GOTRAN program
and presumably would not be from a C program, it shouldn't run afoul
of the prohibition against holes in char types any more than a parity
bit in memory does.

Michael M Rubenstein

Douglas A. Gwyn

Dec 2, 1997, 3:00:00 AM

In article <34879a1b....@nntp.ix.netcom.com>,

mik...@ix.netcom.com (Michael Rubenstein) wrote:
>The IBM 1620 (early 1960s) was a variable field width machine.

Happens the 1620 was the first computer I programmed.
I don't think the variable-length fields illuminate the
issue under discussion, because C's memory model allows
for "padding" in any data type other than a character
type, and requires all data types to be represented (in
effect) in an array of character type. We weren't
proposing that arbitrary indeterminate values could be
properly processed, just values of character type. Any
conforming C implementation on a 1620 (if it were
possible, which I doubt) would have to arrange for
character-type access to not require field marks, and
indeed to access field marks without treating them
differently from any other bit pattern.


Douglas A. Gwyn

Dec 2, 1997, 3:00:00 AM

In article <1997Dec1.1...@leeds.ac.uk>,

ecl...@leeds.ac.uk (R S Haigh) wrote:
>Suppose I zero a struct with calloc or memset, and then copy it
>by assignment. Are all the members considered to have values?
>Does it depend on the types of the members? If so, what happens
>if the object is a union instead of a struct, and e.g the union
>has both int and pointer members?

This is an old question. The answer is, all the integer types
that were 0-byte initialized by calloc (or memset) have value 0,
but the value of a 0-byte initialized pointer or floating type
is not necessarily determinate.

As to the union, it depends on how it's accessed. If you use
an integer-type member, it would have a well-defined 0 value;
if you use a pointer or floating type, it is not necessarily
determinate.

By "not necessarily determinate", I mean that's what actually
happens. In standardese it is simply "indeterminate", so a
strictly conforming program cannot use such values.
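
As an illustration of the point above (a sketch only; the struct `record` and the function `record_new` are made up for this example and do not come from the thread), the integer member can rely on calloc's zero bytes, while the floating and pointer members are stored to explicitly instead of relying on all-bits-zero:

```c
#include <stdlib.h>

/* Hypothetical record type, used only for illustration. */
struct record {
    long   count;   /* integer type: all-bits-zero means 0 */
    double weight;  /* floating type: all-bits-zero need not be 0.0 */
    char  *name;    /* pointer type: all-bits-zero need not be NULL */
};

/* Allocate a record whose members all hold determinate values.
 * Returns NULL if the allocation fails. */
struct record *record_new(void)
{
    struct record *r = calloc(1, sizeof *r);
    if (r == NULL)
        return NULL;
    /* r->count is already 0 from calloc. */
    r->weight = 0.0;   /* explicit stores make these two portable */
    r->name = NULL;
    return r;
}
```

The caller releases the record with free() as usual.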


Alex Krol

Dec 2, 1997, 3:00:00 AM

Peter Seebach wrote:
>
> In article <65tjfa$i...@sjx-ixn8.ix.netcom.com>,
> Douglas A. Gwyn <gw...@ix.netcom.com> wrote:
<snip >

> I think there *is* a good reason for bytewise access of a live object to
> fail - because the object has never been initialized, and this is a violation
> of a rule in the standard. That's good enough for me.
>
> Furthermore, such a failure is *useful* - it can catch something which
> is by its very nature a programming error.
>
> >>>No, because it also essentially says that even if there is no other way to
> >>>access something safely, you can still access it as a sequence of chars.
> >>No, *you* say that. The standard says nothing of the sort. It says that
> >>behavior is undefined when you access something through the wrong type,
> >>but that unsigned char is never the wrong type.
>
> >Which should tell one something about the intention!
>
> Yes. It is clearly our intention that unsigned char aliasing never
> *introduces* undefined behavior. I see no solid intention that it should
> provide definition for otherwise undefined behavior.


typedef struct myStructTag {
char ch;
int i;
long l;
} myStruct;

.......
myStruct foo,foo1;
unsigned char *bar;
unsigned char *ptr;
unsigned char *ptr1;

.......

foo.ch = 1;
foo.i = 2;
foo.l = 3;

bar = malloc(sizeof(myStruct));
ptr = (unsigned char *)&foo;
memcpy(bar,ptr,sizeof(myStruct));

Does this code invoke undefined behaviour in case of structure
padding? If yes, why is
memcpy(&foo1,&foo,sizeof(myStruct));
OK? What about
ptr1 = (unsigned char *)&foo1;
memcpy(ptr1,ptr,sizeof(myStruct));
?
If not, why must it invoke undefined behaviour in case of
uninitialised foo? What's the difference between indeterminate
value of padding byte and indeterminate value of uninitialised
foo.ch?
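
For concreteness, the case being asked about can be sketched as below. This is only an illustration: `copy_bytes` and `padding_bytes` are names invented here, and the amount of padding is whatever the implementation chooses. Every member is stored before the copy, so the only bytes memcpy reads that were never stored to are the padding bytes, if any.

```c
#include <stddef.h>
#include <string.h>

typedef struct myStructTag {
    char ch;
    int  i;
    long l;
} myStruct;

/* Copy *src to *dst byte by byte, padding included, as memcpy does. */
void copy_bytes(myStruct *dst, const myStruct *src)
{
    memcpy(dst, src, sizeof(myStruct));
}

/* Total number of padding bytes in myStruct on this implementation
 * (internal plus trailing); may be 0. */
size_t padding_bytes(void)
{
    return sizeof(myStruct) - (sizeof(char) + sizeof(int) + sizeof(long));
}
```

On many implementations padding_bytes() is nonzero, which is what makes the question a practical one.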

Regards,
Alex Krol

Douglas A. Gwyn

Dec 2, 1997, 3:00:00 AM

In article <66022i$n4g$2...@darla.visi.com>,

se...@plethora.net (Peter Seebach) wrote:
>>` The question is,' said Humpty Dumpty, `which is to be master -- that's all.'
>Yes. The C standard says, unambiguously, that the cases in discussion
>are undefined behavior. You and Kaz are arguing that, since there's no
>good reason for this behavior to be undefined, it really isn't, but that's
>not how the *definition* works.

My point was, we're in the process of revising the C standard,
and we can make it say whatever we want it to say. I believe
C89's characterization of "accessing indeterminately-valued
objects" as being one way to produce undefined behavior was
(a) confusing and (b) too broad, because of the special
dispensation we give aliased access via the character types.
(And I would say that a character type aliases itself, too.)

>>>>No, because it also essentially says that even if there is no other way to
>>>>access something safely, you can still access it as a sequence of chars.
>>>No, *you* say that. The standard says nothing of the sort. It says that
>>>behavior is undefined when you access something through the wrong type,
>>>but that unsigned char is never the wrong type.

>>Which should tell one something about the intention!
>Yes. It is clearly our intention that unsigned char aliasing never
>*introduces* undefined behavior. I see no solid intention that it should
>provide definition for otherwise undefined behavior.

You're still on the wrong wavelength -- the mention of
"undefined behavior" seems to have blinded you to the issue
of whether bytewise access of a live object ever *should*
be undefined behavior. Forget for a moment what C89 seems
to say, and ask whether a C standard *should* say that, and
if so, why? (The "why" is crucial; I see no good reason
for it, and several valid uses for the opposite property.)
If there is no practical reason for it, then we made a
mistake in the wording in C89, and should fix it for C9x.


Michael Rubenstein

Dec 2, 1997, 3:00:00 AM

On Tue, 02 Dec 1997 11:23:57 GMT, Douglas A. Gwyn <gw...@ix.netcom.com>
wrote:

>In article <34879a1b....@nntp.ix.netcom.com>,

I don't see any problem with requiring that something like

unsigned char c;
unsigned char d = c;

be legal; certainly I'd never argue that we should accommodate the
1620 (I still have unfond memories of what happened when I forgot to
load the addition and multiplication tables).

My point was that arguments based on what compilers do are
unconvincing. As I read the standard and the C9x draft, the above
results in undefined behavior and a conforming implementation on a
variable field length machine (certainly not the 1620) could copy c to
d with an instruction that presumes that the field marker for c is
set.

The problem is that I can find nothing in the draft that says that the
only reason accessing an uninitialized variable can cause undefined
behavior is that it might contain a trap representation. It says that
accessing indeterminately valued objects results in undefined behavior
and accessing objects that contain a trap representation (other than
as a character type) results in undefined behavior. Nothing I can
find relates these two statements.

If the intention is to require the above code to be valid, I think
explicit wording that indeterminately valued objects may be accessed
using an lvalue of char type is required.

Michael M Rubenstein

R S Haigh

Dec 2, 1997, 3:00:00 AM

In article <660reg$7...@sjx-ixn3.ix.netcom.com>, Douglas A. Gwyn <gw...@ix.netcom.com> writes:
> In article <1997Dec1.1...@leeds.ac.uk>,
> ecl...@leeds.ac.uk (R S Haigh) wrote:
> >Suppose I zero a struct with calloc or memset, and then copy it
> >by assignment. Are all the members considered to have values?
> >Does it depend on the types of the members? If so, what happens
> >if the object is a union instead of a struct, and e.g the union
> >has both int and pointer members?
>
> This is an old question. The answer is, all the integer types
> that were 0-byte initialized by calloc (or memset) have value 0,
> but the value of a 0-byte initialized pointer or floating type
> is not necessarily determinate.
>
> As to the union, it depends on how it's accessed. If you use
> an integer-type member, it would have a well-defined 0 value;
> if you use a pointer or floating type, it is not necessarily
> determinate.

My question was actually about copying by assignment a struct or union
initialised this way. That is

(a) if I have a struct in which all the bits are zero but the (pointer
values of the) pointer members are not necessarily determinate, can I
assign the value of that struct to another struct? (It's been said that
the "value" of a struct for assignment purposes is essentially those
of the members, implying the answer "no".)

(b) if the answer to (a) is no, then what's the value of a union for
assignment purposes, and what's the requirement for assigning a union
to be valid -- i.e. can I assign it uninitialised, and if not, how
much do I have to do to it before I can?

--


R S Haigh

Dec 2, 1997, 3:00:00 AM

In article <660269$n4g$3...@darla.visi.com>, se...@plethora.net (Peter Seebach) writes:

> I would like either change. I prefer a change clarifying that garbage space
> (for instance, newly allocated memory) is *always* indeterminate, for
> all types, because that would lead to a cleaner language, with no skin
> off any legitimate code's nose.

Hang on. Are you saying that if I malloc a struct, and one of its members
is a char array, into which I strcpy a string, I should have to zero
all the spare bytes after the nul byte before I can copy or argument-pass
the struct? Does anybody do this?

--


Peter Seebach

Dec 2, 1997, 3:00:00 AM

In article <660tc6$m...@dfw-ixnews5.ix.netcom.com>,

Douglas A. Gwyn <gw...@ix.netcom.com> wrote:
>My point was, we're in the process of revising the C standard,
>and we can make it say whatever we want it to say.

Ahh. I think I see the confusion; I've read your posts as saying that
it *isn't* undefined behavior, not that it *shouldn't be* undefined.

>I believe
>C89's characterization of "accessing indeterminately-valued
>objects" as being one way to produce undefined behavior was
>(a) confusing and (b) too broad, because of the special
>dispensation we give aliased access via the character types.
>(And I would say that a character type aliases itself, too.)

I would say it's just right - I believe all access to indeterminately
valued objects to be presumptively invalid, and I don't think anyone
is hurt by it being left undefined.

>>Yes. It is clearly our intention that unsigned char aliasing never
>>*introduces* undefined behavior. I see no solid intention that it should
>>provide definition for otherwise undefined behavior.

>You're still on the wrong wavelength -- the mention of
>"undefined behavior" seems to have blinded you to the issue
>of whether bytewise access of a live object ever *should*
>be undefined behavior.

It should if there's no *reason* for it to have a meaningful value. Not
"I can't see an easy way for the value to be meaningless", but "in the
abstract machine, there is not yet any value there".

unsigned char x;
x is a piece of paper. I can write a value between 0 and UCHAR_MAX on it.

I have not yet written on this piece of paper.

I do not believe it is meaningful or correct to ask "what is written on
this piece of paper?"

>Forget for a moment what C89 seems
>to say, and ask whether a C standard *should* say that, and
>if so, why?

Because the question "what is in this place where nothing is" is not
meaningful. By the semantics of the abstract machine, the indeterminately
valued object has *no* stored value. None. There is no value there. It
is meaningless to retrieve that which was never stored.

>If there is no practical reason for it, then we made a
>mistake in the wording in C89, and should fix it for C9x.

I see a very compelling reason for it: Cleanliness.

I think it would be silly to try to define an abstract machine such
that regions of memory which have never been written to, and have no
initializer, explicit or implicit, must have values.

I think it is very reasonable to leave retrieving values which do not
exist as undefined behavior.

Peter Seebach

Dec 2, 1997, 3:00:00 AM

In article <1997Dec2.1...@leeds.ac.uk>,

Oh, hey, that's a *good* question...

Hmm.

I think you should have to zero them. I had never thought about this
before, and I'm not *entirely* comfortable with this answer - but I do
grant that it seems to me that otherwise, you are copying pieces of
uninitialized memory.
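
What "zero them" would look like in practice, as a rough sketch (the struct `named`, its 32-byte array and the function `named_new` are assumptions made up for this example): clear the whole allocation first, so that no byte of the struct, spare array bytes and padding included, is left unwritten before the struct is copied or passed.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct with a char array member. */
struct named {
    int  id;
    char name[32];
};

/* Allocate a struct named with every byte written before use.
 * Returns NULL if name does not fit or the allocation fails. */
struct named *named_new(int id, const char *name)
{
    struct named *n;

    if (strlen(name) >= sizeof n->name)
        return NULL;
    n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    memset(n, 0, sizeof *n);   /* spare bytes after the nul are now 0 */
    n->id = id;
    strcpy(n->name, name);
    return n;
}
```

Whether the memset is actually required is exactly what is in dispute here; the sketch only shows the cost of assuming that it is.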

Here's the basic question:

Case 1:
int *ip;

I don't think anyone believes that '*ip' is necessarily an int.

Case 2:
int *ip = malloc(sizeof(int));

I don't believe that '*ip' is necessarily an int, but some people may.

Case 3:
int **ipp = malloc(sizeof(int *));
I don't believe that *ipp is necessarily a pointer to int, and I'm
fairly sure no one believes that **ipp is an int.

Case 4:
unsigned char x;
I don't believe that x has a value yet. Some people disagree.

Case 5:
unsigned char *x = malloc(1);
I don't believe that *x has a value yet. Some people disagree.

Case 6:
unsigned char *x = malloc(2);
x[0] = '\0';
I don't believe that x[1] has a value yet. What do you think?

Norman Diamond

Dec 3, 1997, 3:00:00 AM

In article <V+tg0Q9y...@csn.net>, th...@csn.net (Thad Smith) writes:
>In article <65o9qp$kb2$1...@darla.visi.com>,
>se...@plethora.net (Peter Seebach) wrote:
>>In article <65o9fa$7...@dfw-ixnews11.ix.netcom.com>,
>>Douglas A. Gwyn <gw...@ix.netcom.com> wrote:
>>>A correct fix would be to change that condition to state that using
>>>indeterminate values, other than values of objects of character type
>>>during their lifetimes, produces undefined behavior.

>>There is never any *correct* reason to access an indeterminately-valued
>>object, or more generally, to access indeterminately-valued space.

Does there exist such a thing as "indeterminately-valued space" other
than an object? After all if you have a pointer to malloc'ed space then
the space is an object of the type forced on it by the pointer, and if
you don't have a pointer to the space then you (or at least your C program)
doesn't have the space.

>Does "correct reason" refer to style outside the scope of the standard?

I can't speak for Seebs but the only logical interpretation I make is "yes".

>Must the following program return EXIT_SUCCESS? [...]


> #include <stdlib.h>
> int main (void) {
> unsigned char a, b;
> b = a+1;
> if (b == a+1) return EXIT_SUCCESS;
> else return EXIT_FAILURE;
> }

Even assuming Gwyn-san's fix is applied, the answer is still no.
a+1 might equal UCHAR_MAX+1 while b might equal 0. (After all,
it's theoretically possible for char to be shorter than int :-)
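
The arithmetic can be seen without touching an indeterminate value at all. In the sketch below `a` is passed in as a determinate argument and `roundtrip_equal` is a made-up name. With a equal to UCHAR_MAX, on an implementation where int is wider than char, a+1 is UCHAR_MAX+1 in int arithmetic, b receives 0, and the comparison fails:

```c
#include <limits.h>

/* Returns 1 if a+1, stored into an unsigned char, still compares equal
 * to a+1, and 0 otherwise. */
int roundtrip_equal(unsigned char a)
{
    unsigned char b = a + 1;   /* sum is computed after promotion, then
                                  reduced modulo UCHAR_MAX+1 on the store */
    return b == a + 1;         /* b is promoted; this a+1 is not reduced */
}
```

If unsigned char promotes to unsigned int instead (char as wide as int), both sides wrap to 0 and the function returns 1 for every argument.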

>What if a and b are unsigned int?

Whether or not Gwyn-san's fix is applied, evaluation of a yields
undefined behavior.

=====

Now, here's another problem:
struct s { int a; int b; } s1, s2;
s1.a = 27;
s2.a = s1.a; /* undefined behavior */

To see the problem, compare it to:
s2 = s1; /* undefined behavior */
where the value of s1 includes the indeterminate value of s1.b.

Then remember that in evaluating the expression s1.a, the first operand
of the dot operator is evaluated before the named member is selected.

--
<< If this were the company's opinion, I would not be allowed to post it. >>
"I paid money for this car, I pay taxes for vehicle registration and a driver's
license, so I can drive in any lane I want, and no innocent victim gets to call
the cops just 'cause the lane's not goin' the same direction as me" - J Spammer

Jack Klein

Dec 3, 1997, 3:00:00 AM

Peter Seebach <se...@plethora.net> wrote in article
<65tdoh$806$8...@darla.visi.com>...

[snip]

> No, because the "undefined behavior from accessing through the wrong type"
> does not say anything about "value", it just says "undefined behavior because
> wrong type". All we say is that (unsigned char) lvalues aren't in a *type*
> conflict.

[snip]

I come to a troubling situation if I follow your idea to what I
_think_ is its logical conclusion, please correct me if my
extrapolation misrepresents your thinking.

I think you are maintaining that all storage has indeterminate
value unless explicitly initialized by the program (statically
or by assignment) or by a library function. Of course this
includes such things as function arguments, which are
initialized by the calling code. Then you insist that any
reference to the contents of such indeterminate storage results
in undefined behavior, even though the program has the right
to access this memory, and even if performed using a pointer or
cast to unsigned char, which has no trap representation.

Am I carrying this too far to conclude that it is impossible to
perform hardware access in conforming C? The C89 and C9x
standards both permit initializing a pointer with an address
constant, which can be the hardware address of a UART, disk
drive controller, etc. Many of these addresses will be read
only hardware registers and can never be initialized before
reading. Furthermore, the value they contain when read can at
any given time be totally unpredictable to the code, and take on
any value between 0 and UCHAR_MAX, assuming CHAR_BIT equal to
register width.

One could always push the hardware interface back into the
operating system, but that itself is often written in C.

On processors where hardware can be mapped into a separate space
(e.g., Intel x86) you can avoid this question since these
unpredictable values can only be read by non standard extension
functions (i.e., inp(), outp(), which are often inlined), or by
calling a non C function written in assembly language or
anything other than C. But even in this case, unless the "non
C" function does some translating instead of returning the raw
contents of a device register, we haven't really changed the
indeterminate nature of the value returned.

So if a memory mapped hardware device is mapped to a structure
or an array containing only unsigned chars, by assigning an
address constant to a pointer to the array or structure type,
does a program cease to become conforming when it reads an
unsigned char which is volatile and totally unpredictable (i.e.,
indeterminate) at any time it is not being read?

Am I extrapolating your line of reasoning incorrectly, or would
hardware devices be a special case and exempt from automatic
undefined behavior? I ask this seriously because in my job I do
a lot of device programming, some in assembly language, some in
distinctly non conforming C, but some through mechanisms
described above which _could_ be strictly conforming, unless it
matters that the code has not written, and cannot know until it
reads, the value it will read from the received data register of
a UART, for example.
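
The kind of access meant here looks roughly like the sketch below. Since a real device address is implementation-specific, an ordinary initialised object (`fake_rx_register`, a made-up name, as are `rx_register` and `read_rx`) stands in for the register, so the sketch shows only the shape of the code and not the case where nothing in the program ever stored the value.

```c
/* Stand-in for a receive data register. On real hardware the pointer
 * would come from an address constant, not from an ordinary object. */
static volatile unsigned char fake_rx_register = 0x41;

/* Hypothetical accessor for the register's address. */
volatile unsigned char *rx_register(void)
{
    return &fake_rx_register;
}

/* Read one byte; volatile forces an actual access on every call. */
unsigned char read_rx(volatile unsigned char *reg)
{
    return *reg;
}
```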

Jack


Peter Seebach

Dec 3, 1997, 3:00:00 AM

In article <662bmm$lme$1...@nntpd.lkg.dec.com>,

Norman Diamond <dia...@tbj.dec.com> wrote:
>>>There is never any *correct* reason to access an indeterminately-valued
>>>object, or more generally, to access indeterminately-valued space.

>Does there exist such a thing as "indeterminately-valued space" other
>than an object?

I believe so, and I think this is the crux of the matter.

>After all if you have a pointer to malloc'ed space then
>the space is an object of the type forced on it by the pointer, and if
>you don't have a pointer to the space then you (or at least your C program)
>doesn't have the space.

void *v = malloc(10);

at this point, I have '10 bytes' of space. That space is indeterminately
valued, because it has never been initialized. What type is it?

int *ip = v;
unsigned char *up = v;

Now what type is it?

(I know, if sizeof(int) > 10, the int pointer is even less valid than
usual.)

>>Does "correct reason" refer to style outside the scope of the standard?

>I can't speak for Seebs but the only logical interpretation I make is "yes".

Indeed. Basically, what "intrinsically" meaningful code would we hope
to support by saying that random garbage is not garbage if you access it
in a certain way? There's no value there!

>Now, here's another problem:
> struct s { int a; int b; } s1, s2;
> s1.a = 27;
> s2.a = s1.a; /* undefined behavior */

>To see the problem, compare it to:
> s2 = s1; /* undefined behavior */
>where the value of s1 includes the indeterminate value of s1.b.

>Then remember that in evaluating the expression s1.a, the first operand
>of the dot operator is evaluated before the named member is selected.

Oh, bloody hell.

Pardon me while I despair of ever having a usable language.

Michael Norrish

Dec 3, 1997, 3:00:00 AM

se...@plethora.net (Peter Seebach) writes:

>> Now, here's another problem:
>> struct s { int a; int b; } s1, s2;
>> s1.a = 27;
>> s2.a = s1.a; /* undefined behavior */

>> To see the problem, compare it to:
>> s2 = s1; /* undefined behavior */
>> where the value of s1 includes the indeterminate value of s1.b.

>> Then remember that in evaluating the expression s1.a, the first
>> operand of the dot operator is evaluated before the named member is
>> selected.

> Oh, bloody hell.

> Pardon me while I despair of ever having a usable language.

Well, I don't know that the language in 6.3.2.3 actually requires the
evaluation of the first expression. It just says that a field
selection "designates a member of a structure or union object".

This is wrong for another reason in my opinion: it should admit the
possibility that the field is designating a member of a structure or
union _value_. One of the examples talks about using the dot operator
with the result of a function call; just where is the structure or
union object in this case?

Anyway, my reading of the example looks like:
s1 is an lvalue. Therefore s1.a is also an lvalue.
Therefore s1.a denotes an object, and this object is the
(initialised) integer with value 27, and referring to it is
legitimate.
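
Spelled out as a complete function (a sketch; `copy_member_only` is a made-up name), the example reads and writes only the member a. On the reading above, s1.b is never accessed, so its lack of a stored value does not matter:

```c
struct s { int a; int b; };

/* Stores to s1.a, copies it to s2.a, and returns s2.a.
 * Neither s1.b nor s2.b is ever read or written. */
int copy_member_only(void)
{
    struct s s1, s2;

    s1.a = 27;        /* s1.b is deliberately left unstored */
    s2.a = s1.a;      /* designates the int object s1.a only */
    return s2.a;
}
```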

Michael.
"Noone will ever be able to give a precise account of a language
like C without first formalising their presentation" - me, just now.

Michael Norrish

Dec 3, 1997, 3:00:00 AM

se...@plethora.net (Peter Seebach) writes:

> I think it would be silly to try to define an abstract machine such
> that regions of memory which have never been written to, and have no
> initializer, explicit or implicit, must have values.

> I think it is very reasonable to leave retrieving values which do not
> exist as undefined behavior.

Why not specify that they have unspecified values?

You say that a good reason for your approach is "cleanliness", but it
is not clean in my opinion. To my mind, it complicates the abstract
machine unnecessarily.

In particular, your abstract machine now has to keep track of which
bits of memory have been written to. If you allow all "in-scope"
locations to be accessed with an unsigned char, you have less baggage
in your abstract machine.

Your conception of the machine has every byte in memory in one of
three possible states:

i. inaccessible
ii. uninitialised
iii. initialised and accessible

And what about padding? I believed that C9x was going to say special
things about this situation, so perhaps we have a fourth alternative,
iv. padding

In the other model, the abstract machine keeps track only of whether a
byte is accessible or not. This also nicely subsumes the structure
padding question, making for a more elegant language specification
that also nicely corresponds to what an implementation is likely to do
anyway.

Michael.


Anthony Towns

Dec 3, 1997, 3:00:00 AM

se...@plethora.net (Peter Seebach) writes:

> >Hang on. Are you saying that if I malloc a struct, and one of its members
> >is a char array, into which I strcpy a string, I should have to zero
> >all the spare bytes after the nul byte before I can copy or argument-pass
> >the struct? Does anybody do this?
> Oh, hey, that's a *good* question...
> Hmm.
> I think you should have to zero them.

Oh, yuck!

I'd _much_ rather something more akin to setting the remaining elements
to be undefined.

So that:

struct s { int a; int b; };

struct s x, y;
x.a = 1;
y = x; /* okay, copies x.a, unspecified behaviour
for {x,y}.b */
assert( y.a == 1 ); /* okay */
assert( y.b == x.b ); /* undefined behaviour */

This extends to:

struct s x, y;
x = y; /* okay, but unspecified behaviour for all
* members of both x and y. */
assert( x.a == y.a ); /* undefined behaviour */

You could, alternately, add a special case that structures with no
defined members may not be copied.

I find this a much better way of thinking about things. I've even
defined C++ constructors explicitly so this worked the same way.

OTOH, I'm just a kid. What do I know?

Cheers,
aj

--
Anthony Towns <a...@humbug.org.au> <http://student.uq.edu.au/~s343676/>
I don't speak for anyone save myself. PGP encrypted mail preferred.

``NT, Networking, Security. Pick any two (you can't have all three).''
-- _The Twelve Networking Truths_, RFC 1925, paraphrased

Ulric Eriksson

Dec 3, 1997, 3:00:00 AM

In article <661fds$3lh$3...@darla.visi.com>,

Peter Seebach <se...@plethora.net> wrote:
>In article <1997Dec2.1...@leeds.ac.uk>,
>R S Haigh <ecl...@leeds.ac.uk> wrote:
>>In article <660269$n4g$3...@darla.visi.com>, se...@plethora.net (Peter
>>Seebach) writes:
>>> I would like either change. I prefer a change clarifying that garbage space
>>> (for instance, newly allocated memory) is *always* indeterminate, for
>>> all types, because that would lead to a cleaner language, with no skin
>>> off any legitimate code's nose.
>
>>Hang on. Are you saying that if I malloc a struct, and one of its members
>>is a char array, into which I strcpy a string, I should have to zero
>>all the spare bytes after the nul byte before I can copy or argument-pass
>>the struct? Does anybody do this?
>
>Oh, hey, that's a *good* question...
>
>Hmm.
>
>I think you should have to zero them. I had never thought about this
>before, and I'm not *entirely* comfortable with this answer - but I do
>grant that it seems to me that otherwise, you are copying pieces of
>uninitialized memory.

But memory per se *never* has a type, or a value for that matter.
It is only when you want to access a value that the type comes
into consideration. My understanding was that this is exactly
what unsigned char is for: a type where every bit pattern can be
interpreted as a value.

Ulric
--
They ask me what gauge strings I use. They all play like in 6, 7, 8, 9
gauge strings and my smallest gauge is 16, then it goes to 18, it goes
to 20, it goes to 38, 48, 58 and 60. I don't play on no wuzzy strings.

Peter Seebach

Dec 3, 1997, 3:00:00 AM

In article <yvsbtyy...@merganser.cl.cam.ac.uk>,

Michael Norrish <mn...@cl.cam.ac.uk> wrote:
>se...@plethora.net (Peter Seebach) writes:
>> I think it would be silly to try to define an abstract machine such
>> that regions of memory which have never been written to, and have no
>> initializer, explicit or implicit, must have values.

>> I think it is very reasonable to leave retrieving values which do not
>> exist as undefined behavior.

>Why not specify that they have unspecified values?

Well, why specify them at all? They have no purpose...

>You say that a good reason for your approach is "cleanliness", but it
>is not clean in my opinion. To my mind, it complicates the abstract
>machine unnecessarily.

I think it simplifies it; rather than a special set of cases where
you can access space that has never been stored to, there's a simple
rule.

The other solution (it's unspecified in all types) is bad because it
imposes penalties on systems with lots of trap representations, unless
we say 'unspecified and may be a trap representation for some types'.

>In particular, your abstract machine now has to keep track of which
>bits of memory have been written to. If you allow all "in-scope"
>locations to be accessed with an unsigned char, you have less baggage
>in your abstract machine.

No, it doesn't. It merely *may* keep track of which bits of memory
have been written to. This leaves it more freedom, without imposing
requirements.

>Your conception of the machine has every byte in memory in one of
>three possible states:

> i. inaccessible
> ii. uninitialised
>iii. initialised and accessible

Hmm... I don't see any way to make an 'inaccessible' byte of memory.

>And what about padding? I believed that C9x was going to say special
>things about this situation, so perhaps we have a fourth alternative,
> iv. padding

Yeah...

>In the other model, the abstract machine keeps track only of whether a
>byte is accessible or not. This also nicely subsumes the structure
>padding question, making for a more elegant language specification
>that also nicely corresponds to what an implementation is likely to do
>anyway.

Hmm. That's a point - I just hate the idea that you're allowed to
look at uninitialized space.

Ulric Eriksson

Dec 3, 1997, 3:00:00 AM

In article <6647dq$ko3$5...@darla.visi.com>,

Peter Seebach <se...@plethora.net> wrote:
>In article <yvsbtyy...@merganser.cl.cam.ac.uk>,
>Michael Norrish <mn...@cl.cam.ac.uk> wrote:
>
>>In particular, your abstract machine now has to keep track of which
>>bits of memory have been written to. If you allow all "in-scope"
>>locations to be accessed with an unsigned char, you have less baggage
>>in your abstract machine.
>
>No, it doesn't. It merely *may* keep track of which bits of memory
>have been written to. This leaves it more freedom, without imposing
>requirements.
>
>>In the other model, the abstract machine keeps track only of whether a
>>byte is accessible or not. This also nicely subsumes the structure
>>padding question, making for a more elegant language specification
>>that also nicely corresponds to what an implementation is likely to do
>>anyway.
>
>Hmm. That's a point - I just hate the idea that you're allowed to
>look at uninitialized space.

What about realloc? That seems impossible to write in C if there
is no way to copy uninitialized memory.
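
To make the difficulty concrete, here is roughly what a realloc written in C has to do (a sketch only: `grow_block` is a made-up name, and unlike the real realloc it needs the old size from the caller). The memcpy copies the whole old block, whether or not the program ever stored anything into those bytes:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical realloc-like helper. Returns the new block, or NULL if
 * the allocation fails, in which case the old block is left untouched. */
void *grow_block(void *old, size_t old_size, size_t new_size)
{
    void *fresh = malloc(new_size);

    if (fresh == NULL)
        return NULL;
    if (old != NULL) {
        /* Copies every byte of the old block that fits in the new one,
         * including any the caller never wrote to. */
        memcpy(fresh, old, old_size < new_size ? old_size : new_size);
        free(old);
    }
    return fresh;
}
```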

R S Haigh

Dec 3, 1997, 3:00:00 AM

In article <6647dq$ko3$5...@darla.visi.com>, se...@plethora.net (Peter Seebach) writes:

> No, it doesn't. It merely *may* keep track of which bits of memory
> have been written to. This leaves it more freedom, without imposing
> requirements.
>

> [snip]


>
> I just hate the idea that you're allowed to
> look at uninitialized space.

What about unions? Assigning to any one member is typically not going to
touch bytes that are there to be used by other members.

OTOH, I don't think a tagging scheme is necessarily excluded. The
significant property of a byte isn't really whether it's been
written to at all, it's whether it acquired its state
as a result of assigning some scalar value to an object that
the byte forms part of. This attribute can be propagated by raw
aggregate copying operations, so that e.g. the result of copying a
struct with some (or all) "uninitialized" bytes (in this sense)
is another struct with the same "uninitialized" bytes, notwithstanding
those bytes obviously having been written to.
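
A small sketch of the union case, with a hypothetical `union number` chosen only for illustration: storing to the one-byte member leaves the remaining bytes of the union unwritten, yet a raw copy still has to move all of them.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical union: the char member covers only the first byte of
 * the storage it shares with the double member. */
union number {
    unsigned char small;
    double        big;
};

/* Raw aggregate copy. Bytes that no assignment has set are copied
 * along with the rest, in whatever state they happen to be. */
void copy_union_bytes(union number *dst, const union number *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof *dst);
}
```

Under the tagging scheme described above, the bytes of the destination beyond `small` would carry the same "uninitialized" attribute as the corresponding bytes of the source, even though `memcpy` has written to them.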

--


Douglas A. Gwyn

Dec 3, 1997, 3:00:00 AM

R S Haigh wrote:
> My question was actually about copying by assignment a struct or union
> initialised this way. That is
> (a) if I have a struct in which all the bits are zero but the (pointer
> values of the) pointer members are not necessarily determinate, can I
> assign the value of that struct to another struct? (It's been said that
> the "value" of a struct for assignment purposes is essentially those
> of the members, implying the answer "no".)

That's right -- struct assignment (= operator) is (or should be) defined
in terms of assignments of all its members. But you can memcpy() it.
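
As a rough illustration of the distinction, the sketch below duplicates a struct with `memcpy` instead of the `=` operator. The type `struct node` and the function `clone_raw` are hypothetical names invented for this example; the point is only that the copy is bytewise, so the pointer member is never read as a pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: after calloc, next is all-bits-zero, which is
 * not guaranteed to be a null pointer. */
struct node {
    struct node *next;
    int          count;
};

/* Duplicate a node bytewise. No member is read through its own type,
 * so the state of src->next does not matter for the copy itself. */
struct node *clone_raw(const struct node *src)
{
    struct node *dst;

    assert(src != NULL);
    dst = malloc(sizeof *dst);
    if (dst != NULL)
        memcpy(dst, src, sizeof *dst);
    return dst;
}
```

A caller would still store a real pointer value (for example `NULL`) into `next` before reading it as a pointer.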

> (b) if the answer to (a) is no, then what's the value of a union for
> assignment purposes, and what's the requirement for assigning a union
> to be valid -- i.e. can I assign it uninitialised, and if not, how
> much do I have to do to it before I can?

The union's assignment (to a compatible union) is (or should be) defined
as assignment of the last member stored into (that's the one a strictly
conforming program is allowed to fetch), and it would thus be undefined
behavior if no member has yet been initialized to a determinate value.

(It is probable that the [draft] standard insufficiently specifies what
is meant by simple assignment, as these questions indicate.)

Calloc or memset initialization of objects of integer type (with
0-valued bytes) is one valid way to initialize their values. I don't
recommend this in general, but there are situations where it is better
than the alternatives.
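
A minimal sketch of that kind of initialization, assuming a hypothetical table of counters (the names `new_counters` and `reset_counters` are invented here). It relies on zero-valued bytes only for an integer type; nothing in it assumes that all-bits-zero is a null pointer or a floating-point zero.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate n counters, all starting at 0,
 * by asking calloc for zero-filled storage. */
unsigned long *new_counters(size_t n)
{
    return calloc(n, sizeof(unsigned long));
}

/* Hypothetical helper: set n existing counters back to 0 by
 * overwriting their bytes with zeros. */
void reset_counters(unsigned long *c, size_t n)
{
    assert(c != NULL || n == 0);
    if (n > 0)
        memset(c, 0, n * sizeof *c);
}
```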

Douglas A. Gwyn

Dec 3, 1997, 3:00:00 AM

Michael Rubenstein wrote:
> If the intention is to require the above code to be valid, I think
> explicit wording that indeterminately valued object may be accessed
> using an lvalue of char type is required.

Yes, I can agree with that, and I even suggested such wording earlier
in the discussion.

Douglas A. Gwyn

Dec 3, 1997, 3:00:00 AM

Peter Seebach wrote:
> In article <660tc6$m...@dfw-ixnews5.ix.netcom.com>,

> Douglas A. Gwyn <gw...@ix.netcom.com> wrote:
> >My point was, we're in the process of revising the C standard,
> >and we can make it say whatever we want it to say.
> Ahh. I think I see the confusion; I've read your posts as saying that

I'm saying that (a) the intent, as illuminated by committee
discussions, was that it not be undefined and (b) we should
improve the specification to make this intent clear.

> I would say it's just right - I believe all access to indeterminately
> valued objects to be presumptively invalid, and I don't think anyone
> is hurt by it being left undefined.

Again, I'm not talking about *actually* indeterminate values.
The fact that these character values got labeled "indeterminate"
is unfortunate, because in actuality they are "unspecified", not
"indeterminate". That technical distinction has practical import.

> unsigned char x;
> x is a piece of paper. I can write a value between 0 and UCHAR_MAX on it.
> I have not yet written on this piece of paper.
> I do not believe it is meaningful or correct to ask "what is written on
> this piece of paper?"

As with most analogies, that is imperfect and thus misleading.

The actual in-scope character variable is not a piece of paper;
its identifier denotes a specific collection of bits (which are
implemented using real physical resources, e.g. flip-flop
transistor pairs in an integrated circuit). Those bits must
actually exist for the lifetime of the object, *and* each of them
must actually be in either a 0 or 1 state, although the C standard
does not specify which state (for auto or malloc()ed storage).

> I think it would be silly to try to define an abstract machine such
> that regions of memory which have never been written to, and have no
> initializer, explicit or implicit, must have values.

When accessed as bytes, they *must* have values, because otherwise
one could not memcpy() structures (which in general contain
uninitialized padding) or other non-character types. Even when
accessed as non-character types, their values are "indeterminate",
not "nonexistent". The reason we said that accessing "indeterminate"
values falls into the category of "undefined behavior" is that
*requiring* well-defined behavior could put an undue burden on some
reasonable implementations that have MBZ or other "normalization"
requirements on representations of values for certain types.
However, this does not apply to character types, and the committee
has deliberately decided that. The only real problem is that we
have not reconciled the "indeterminate value" wording with this
intent. I have (in an earlier posting) suggested how we could
easily do so.
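
The memcpy() argument can be sketched as follows. The type `struct record` is hypothetical, and whether it actually contains padding depends on the implementation, so `record_padding` computes the amount rather than assuming it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct: most ABIs insert padding after tag so that
 * value is suitably aligned, but the standard does not require it. */
struct record {
    char tag;
    long value;
};

/* Number of bytes in the struct that belong to no member. */
size_t record_padding(void)
{
    return sizeof(struct record) - (sizeof(char) + sizeof(long));
}

/* Bytewise copy: padding bytes are read and written as unsigned char
 * along with the members, although no assignment ever set them. */
void record_copy(struct record *dst, const struct record *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof *dst);
}
```

If the padding bytes had no values when accessed as bytes, `record_copy` would be undefined for any record built by assigning its members one at a time.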

Douglas A. Gwyn

Dec 3, 1997, 3:00:00 AM

Jack Klein wrote:
> Peter Seebach <se...@plethora.net> wrote ...

> Am I carrying this too far to conclude that it is impossible to
> perform hardware access in conforming C? The C89 and C9x
> standards both permit initializing a pointer with an address
> constant, which can be the hardware address of a UART, disk
> drive controller, etc. Many of these addresses will be read
> only hardware registers and can never be initialized before
> reading. Furthermore, the value they contain when read can at
> any given time be totally unpredictable to the code, and take on
> any value between 0 and UCHAR_MAX, assuming CHAR_BIT equal to
> register width.

You asked questions about "conforming" programs. A "conforming"
program is relatively uninteresting, because its only requirement
is that *some* conforming C implementation accept it. The
useful category is "strictly conforming" program.

Introduction of volatile objects confuses the issue by
introducing external factors beyond the scope of strict
conformance to the standard.

Douglas A. Gwyn

Dec 3, 1997, 3:00:00 AM

Peter Seebach wrote:
> R S Haigh <ecl...@leeds.ac.uk> wrote:
> >Hang on. Are you saying that if I malloc a struct, and one of its members
> >is a char array, into which I strcpy a string, I should have to zero
> >all the spare bytes after the nul byte before I can copy or argument-pass
> >the struct? Does anybody do this?
> I think you should have to zero them. I had never thought about this
> before, and I'm not *entirely* comfortable with this answer - but I do
> grant that it seems to me that otherwise, you are copying pieces of
> uninitialized memory.

Yes, you have to do that (in a strictly conforming program) before
assigning the struct or passing it as an argument.
(Note that one reason the section on simple assignment doesn't
say in detail that assigning a struct is equivalent to assigning
its members is that an array member gets elementwise assigned,
not assigned as an array [which is not allowed in Standard C].
We could still figure out a way to correctly specify the details.)
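
A sketch of what that requirement looks like in practice, using a hypothetical `struct person` and capacity `NAME_CAP` invented for this example: every byte of the array member is stored to before the string is copied in, so a later struct assignment copies only initialized elements.

```c
#include <assert.h>
#include <string.h>

#define NAME_CAP 32   /* hypothetical capacity, including the NUL */

struct person {
    char name[NAME_CAP];
    int  age;
};

/* Initialize *p so that every element of name has been stored to,
 * not just the characters up to and including the terminating NUL. */
void person_init(struct person *p, const char *name, int age)
{
    size_t len;

    assert(p != NULL && name != NULL);

    len = strlen(name);
    if (len > sizeof p->name - 1)
        len = sizeof p->name - 1;           /* truncate, keep room for NUL */

    memset(p->name, 0, sizeof p->name);     /* zero the spare bytes first */
    memcpy(p->name, name, len);
    p->age = age;
}
```

After `person_init(&a, "Ada", 36)`, the assignment `b = a` copies an array whose elements have all been given values.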

> Here's the basic question:
> Case 1:
> int *ip;
> I don't think anyone believes that '*ip' is necessarily an int.

That expression has type int, but (assuming ip is not properly
initialized) does not designate a valid value ("indeterminate"
in both the English and C standard sense).

> Case 2:
> int *ip = malloc(sizeof(int));
> I don't believe that '*ip' is necessarily an int, but some people may.

That expression has type int, but (until the pointed-to storage
is properly initialized) does not designate a valid value
("indeterminate" in both the English and C standard sense).

> Case 3:
> int **ipp = malloc(sizeof(int *));
> I don't believe that *ipp is necessarily a pointer to int, and I'm
> fairly sure no one believes that **ipp is an int.

The first expression has type int*, the second int, but
(assuming the pointed-to storage is not properly initialized)
neither designates a valid value ("indeterminate" in both the
English and C standard sense).

> Case 4:
> unsigned char x;
> I don't believe that x has a value yet. Some people disagree.

That expression has type unsigned char (before promotion, when
used in mixed contexts), and designates an unspecified value
("indeterminate" in the English sense, but we're arguing about
whether that is an appropriate application of the C standard's
terminology, or more accurately, whether in this case the
relationship between "use of indeterminate value" and
"undefined behavior" is appropriate).

> Case 5:
> unsigned char *x = malloc(1);
> I don't believe that *x has a value yet. Some people disagree.

That expression has type unsigned char, and designates an
unspecified value ("indeterminate" in the English sense).

> Case 6:
> unsigned char *x = malloc(2);
> x[0] = '\0';
> I don't believe that x[1] has a value yet. What do you think?

That expression has type unsigned char, and designates an
unspecified value ("indeterminate" in the English sense,
same comment as for case 5 re. the standard's sense).
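
The types of the expressions in Cases 1 to 6 can be checked mechanically without reading any object. The sketch below uses `_Generic` and `static_assert`, which are C11 features and so postdate this discussion; the macro and function names are hypothetical. The operand of `_Generic` is not evaluated, so no indeterminate value is accessed and no pointer is dereferenced at run time.

```c
#include <assert.h>

/* C11 or later. Each macro yields 1 if the expression has the named
 * type and 0 otherwise; the expression itself is never evaluated. */
#define IS_INT(e)     _Generic((e), int: 1, default: 0)
#define IS_INT_PTR(e) _Generic((e), int *: 1, default: 0)
#define IS_UCHAR(e)   _Generic((e), unsigned char: 1, default: 0)

static_assert(IS_INT(0), "an int constant has type int");

/* Returns 1 if the expressions from the cases have the expected types. */
int cases_have_expected_types(void)
{
    int *ip = 0;            /* null pointers: used only for their types */
    int **ipp = 0;
    unsigned char *x = 0;

    return IS_INT(*ip)          /* Cases 1 and 2 */
        && IS_INT_PTR(*ipp)     /* Case 3, first expression */
        && IS_INT(**ipp)        /* Case 3, second expression */
        && IS_UCHAR(x[1]);      /* Cases 5 and 6 */
}
```

This says nothing about whether the designated values are valid; it only confirms that an expression has a type independently of what its object contains.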

I'm not sure this clarified anything. The real issue has to
do with the way we use the term "indeterminate value" in the
standard. As always, when we overload meanings onto the same
concept, confusion and lack of flexibility arise. Because
"indeterminate value" is an appropriate English characterization
of the condition, I don't propose changing that to something
like "indeterminate unless accessed bytewise" or other mess;
however, the implication that *every* use of an indeterminate value
produces undefined behavior is simply wrong and should be emended.
