
Using deleted pointers. What is undefined?


Peter C. Chapin

May 5, 2004, 6:48:45 AM

This question recently came up for me while having a conversation with
another programmer.

After deleting a dynamically allocated object the pointer that was
pointing at that object becomes "invalid." What does this mean exactly? In
particular, what operations are still allowed on the pointer? In the 1998
standard, section 3.7.3.2, paragraph 4 it says, "The effect of using an
invalid pointer value is undefined." However, that paragraph does not
describe what "use" means in this context. I can't find anywhere in the
standard where it describes what it means to use a pointer. Are the
following "uses" of a pointer?

1. Deleting the pointer again (this is obviously a use).

2. Passing the pointer to a function or returning it from a function?

3. Copying the pointer?

4. Converting the pointer, for example to void* ?

The programmer I was talking with earlier said that on some systems merely
loading a bad address into a register can cause a fault. Thus passing a
deleted pointer to a function might cause a fault on such a system if a
register calling convention was being used. It seems to me that an
implementation for such a system could get around that problem if it
wanted to. Is it required to?

This came up in the following context: Suppose one had a vector of
pointers where each element of the vector pointed at a dynamically
allocated object. Suppose that one then deleted all of those objects,
producing a vector of invalid pointers. Then suppose that one erased one
of the pointers in the middle of the vector. Does this invoke undefined
behavior?
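
(For concreteness, here is a minimal sketch of the scenario being asked about; the Widget type and all names are illustrative only, not from the post:)

#include <vector>

struct Widget { int value; };

int main()
{
    std::vector<Widget*> v;
    for (int i = 0; i != 3; ++i)
        v.push_back(new Widget());

    for (std::vector<Widget*>::size_type i = 0; i != v.size(); ++i)
        delete v[i];              // every element now holds an invalid pointer value

    v.erase(v.begin() + 1);       // erasing copies the trailing (invalid) pointers --
                                  // the question is whether that copy is itself UB
}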

Peter

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

R.F. Pels

May 5, 2004, 3:19:51 PM

Peter C. Chapin wrote:

> standard where it describes what it means to use a pointer.

Using the pointer probably means dereferencing it. If the pointer is
dereferenced, anything can happen. From nothing to random behavior to
crash.

> The programmer I was talking with earlier said that on some systems merely
> loading a bad address into a register can cause a fault.

True. I can remember Motorola chips that barfed on addresses not being an
even number.

> This came up in the following context: Suppose one had a vector of
> pointers where each element of the vector pointed at a dynamically
> allocated object. Suppose that one then deleted all of those objects,
> producing a vector of invalid pointers.

Correct.

> Then suppose that one erased one of the pointers in the middle of the
> vector. Does this invoke undefined behavior?

I think one has to keep in mind that a pointer is in itself a variable like
any other. Using its value is where the difference lies. In this case,
removing a pointer with an invalid value isn't going to create problems.

--
Ruurd
.o.
..o
ooo

Andrew Koenig

May 5, 2004, 3:32:56 PM

"Peter C. Chapin" <pch...@sover.net> wrote in message
news:Xns94DFE833AD38...@207.106.92.237...

> The programmer I was talking with earlier said that on some systems merely
> loading a bad address into a register can cause a fault. Thus passing a
> deleted pointer to a function might cause a fault on such a system if a
> register calling convention was being used. It seems to me that an
> implementation for such a system could get around that problem if it
> wanted to. Is it required to?

The programmer you were talking with is correct: Passing a deleted pointer
to a function is undefined behavior.

The only thing you are allowed to do with a deleted pointer is destroy it or
give it a new value.

> This came up in the following context: Suppose one had a vector of
> pointers where each element of the vector pointed at a dynamically
> allocated object. Suppose that one then deleted all of those objects,
> producing a vector of invalid pointers. Then suppose that one erased one
> of the pointers in the middle of the vector. Does this invoke undefined
> behavior?

Yes, because doing so will copy the pointers after the one erased, and
copying a deleted pointer invokes undefined behavior.
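
(A small sketch of that rule; the observe() declaration is hypothetical, purely to stand in for "some function":)

void observe(int*);      // hypothetical declaration, for illustration only

int main()
{
    int* p = new int(42);
    delete p;            // p now holds an invalid pointer value

    // Each of these would read that invalid value, so each is formally UB
    // (left commented out):
    //   int* q = p;     // copying it
    //   observe(p);     // passing it to a function by value
    //   void* v = p;    // converting it
    //   delete p;       // deleting it again

    // What remains well-defined: give the pointer a new value...
    p = 0;
    // ...or simply let it be destroyed at the end of its scope.
}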

llewelly

May 5, 2004, 3:36:26 PM

"Peter C. Chapin" <pch...@sover.net> writes:

> This question recently came up for me while having a conversation with
> another programmer.
>
> After deleting a dynamically allocated object the pointer that was
> pointing at that object becomes "invalid." What does this mean exactly? In
> particular, what operations are still allowed on the pointer? In the 1998
> standard, section 3.7.3.2, paragraph 4 it says, "The effect of using an
> invalid pointer value is undefined." However, that paragraph does not
> describe what "use" means in this context. I can't find anywhere in the
> standard where it describes what it means to use a pointer. Are the
> following "uses" of a pointer?
>
> 1. Deleting the pointer again (this is obviously a use).
>
> 2. Passing the pointer to a function or returning it from a function?
>
> 3. Copying the pointer?
>
> 4. Converting the pointer, for example to void* ?

Core issue 312 is about this:
http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/cwg_active.html#312

And there have been some threads about it in comp.std.c++ :
http://xrl.us/bze7
http://xrl.us/bze8

(and others you can find by searching the
archives.)

>
> The programmer I was talking with earlier said that on some systems merely
> loading a bad address into a register can cause a fault.

On IA32, loading an invalid segment selector into a segment register
will cause a fault. A C++ implementation that wished to support
IA32's 46-bit segmented addressing mode would use segment
registers.

I can't name other examples (68020?), but I believe that this was
considered an important safety feature in the pre-RISC era.

> Thus passing a
> deleted pointer to a function might cause a fault on such a system if a
> register calling convention was being used. It seems to me that an
> implementation for such a system could get around that problem if it
> wanted to. Is it required to?

No-one seems to think so.

> This came up in the following context: Suppose one had a vector of
> pointers where each element of the vector pointed at a dynamically
> allocated object. Suppose that one then deleted all of those objects,
> producing a vector of invalid pointers. Then suppose that one erased one
> of the pointers in the middle of the vector. Does this invoke undefined
> behavior?

[snip]

Probably, but there is an unresolved issue.

Francis Glassborow

May 5, 2004, 4:40:44 PM

In message <Xns94DFE833AD38...@207.106.92.237>, Peter C.
Chapin <pch...@sover.net> writes
>

>This question recently came up for me while having a conversation with
>another programmer.
>
>After deleting a dynamically allocated object the pointer that was
>pointing at that object becomes "invalid." What does this mean exactly? In
>particular, what operations are still allowed on the pointer? In the 1998
>standard, section 3.7.3.2, paragraph 4 it says, "The effect of using an
>invalid pointer value is undefined." However, that paragraph does not
>describe what "use" means in this context. I can't find anywhere in the
>standard where it describes what it means to use a pointer. Are the
>following "uses" of a pointer?

We have to distinguish between 'the pointer value' and 'the pointer as a
storage location'. The storage location is fine but even looking at its
contents (i.e. the pointer's value) invokes undefined behaviour.

>
>1. Deleting the pointer again (this is obviously a use).

UB


>
>2. Passing the pointer to a function or returning it from a function?

If you mean a pointer value, yes, UB; if you mean a pointer or reference to
a pointer (i.e. an object that could contain a pointer value), no.

>
>3. Copying the pointer?

UB


>
>4. Converting the pointer, for example to void* ?

UB
Both these last two unambiguously use the value stored and that is the
thing that must not be touched.

>
>The programmer I was talking with earlier said that on some systems merely
>loading a bad address into a register can cause a fault. Thus passing a
>deleted pointer to a function might cause a fault on such a system if a
>register calling convention was being used. It seems to me that an
>implementation for such a system could get around that problem if it
>wanted to. Is it required to?

No, it isn't, even if it could.

>
>This came up in the following context: Suppose one had a vector of
>pointers where each element of the vector pointed at a dynamically
>allocated object. Suppose that one then deleted all of those objects,
>producing a vector of invalid pointers. Then suppose that one erased one
>of the pointers in the middle of the vector. Does this invoke undefined
>behavior?

Yes.


--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

Max Zinal

May 5, 2004, 4:43:46 PM

Peter C. Chapin wrote:
> After deleting a dynamically allocated object the pointer that was
> pointing at that object becomes "invalid." What does this mean exactly? In
> particular, what operations are still allowed on the pointer?

In all the architectures known to me a pointer is an address
of a memory region - and, thus, a number. Its size depends on the
architecture and may depend on the pointer type (BTW, do Crays really have
32-bit instruction - e.g. function - pointers and 64-bit data pointers?)
So on all those architectures you can safely do anything with that
number as long as you do not try to dereference it: copying and pointer
arithmetic are safe operations.

>
> 1. Deleting the pointer again (this is obviously a use).

You get an unspecified behaviour. Depending on the platform and
particular implementation - and on the set of runtime libraries used! -
you can end up with:
- immediate segmentation fault signal
- Heap damage which will show itself _only_ with the next
call to allocation/deallocation primitives or even later
- runtime error from a smart library in debug mode
- much, much more

>
> 2. Passing the pointer to a function or returning it from a function?
>

This is a copy operation. Pretty safe, I think, unless a function or its
caller tries to dereference the pointer. It is not very smart to allow
an invalid pointer to be passed anywhere - unsafe practice.

> 3. Copying the pointer?
>

Same as above.

> 4. Converting the pointer, for example to void* ?
>

Same as above.

> The programmer I was talking with earlier said that on some systems merely
> loading a bad address into a register can cause a fault. Thus passing a
> deleted pointer to a function might cause a fault on such a system if a
> register calling convention was being used. It seems to me that an
> implementation for such a system could get around that problem if it
> wanted to. Is it required to?
>

I do not think that this is specified anywhere in the Standard, but I
*really* doubt that such systems exist.

Antoun Kanawati

May 6, 2004, 5:11:52 AM

Max Zinal wrote:
> Peter C. Chapin wrote:
>
>>After deleting a dynamically allocated object the pointer that was
>>pointing at that object becomes "invalid." What does this mean exactly? In
>>particular, what operations are still allowed on the pointer?
> [snip]

>>4. Converting the pointer, for example to void* ?
> Same as above.

Doesn't virtual inheritance introduce some sort of stored
offset inside the object? In such cases, pointer conversion
would require dereferencing, hence mucking up the program.

Right?

Dave Moore

May 6, 2004, 10:09:29 AM

Francis Glassborow <fra...@robinton.demon.co.uk> wrote in message news:<SDArefS1...@robinton.demon.co.uk>...

> In message <Xns94DFE833AD38...@207.106.92.237>, Peter C.
> Chapin <pch...@sover.net> writes
> >
> >This question recently came up for me while having a conversation with
> >another programmer.
> >
> >After deleting a dynamically allocated object the pointer that was
> >pointing at that object becomes "invalid." What does this mean exactly? In
> >particular, what operations are still allowed on the pointer? In the 1998
> >standard, section 3.7.3.2, paragraph 4 it says, "The effect of using an
> >invalid pointer value is undefined." However, that paragraph does not
> >describe what "use" means in this context. I can't find anywhere in the
> >standard where it describes what it means to use a pointer. Are the
> >following "uses" of a pointer?
>
> We have to distinguish between 'the pointer value' and 'the pointer as a
> storage location'. The storage location is fine but even looking at its
> contents (i.e. the pointer's value) invokes undefined behaviour.
>

Yes, clearly the semantics are very important here .. in reading the
replies to this post, I got very confused. I will try to summarize
the confusion here in hopes of enlightenment.

I always understood that a pointer's value was an "address" .. that
is, some number referring to a region of memory. If one tries to
dereference the pointer before the memory is initialized, or after it
has been cleared (deleted), then this is UB. Otherwise, (I thought)
one is free to do whatever with the pointer. So, proceeding to the
OP's examples:

>
> >1. Deleting the pointer again (this is obviously a use).
>
> UB

Ok, this seems clear .. since deletion obviously requires dereferencing
the pointer.

> >
> >2. Passing the pointer to a function or returning it from a function?
>
> If you mean a pointer value, yes UB, if you mean pointer or reference to
> a pointer (i.e. an object that could contain a pointer value) no.
>
> >
> >3. Copying the pointer?
>
> UB

Ok, these two I don't understand .. to me, a pointer's "value" is the
address it is storing. Why is it UB to pass (i.e. copy) that address
into another location? It seems to me that this would just involve
copying of the address stored in the pointer. Of course it (almost)
goes without saying that the invalid pointer should not be used .. and
thus propagating it around is probably a "bad thing" ... so is this
why the Standard specifies this as UB? Or is there something else
that I am missing? For example, is there a difference between a
null-pointer and a deleted one? AFAIK, it is kosher to pass, copy
or otherwise use a null-pointer, provided you never dereference it ...
so what is the difference with a deleted pointer?

int *i1=0; // NULL pointer .. dereferencing is UB
double *d1 = new double(1.0);
delete d1; // deleted pointer .. dereferencing is UB

int *i2=i1; // copying a NULL pointer .. this is not UB, right?
double *d2=d1; // copying a deleted pointer .. is this really UB?

> >4. Converting the pointer, for example to void* ?
> UB
> Both these last two unambiguously use the value stored and that is the
> thing that must not be touched.

Once again, I don't see how this should be UB ... how does conversion
to void * dereference the deleted pointer? AFAICT it should just change
the way the C++ program deals with the memory addressed by the pointer
in the future, without actually looking at the contents of the memory.

For that matter, I would expect even sizeof(*p) to be a valid
operation for a deleted pointer, since the sizeof operator explicitly
does not dereference its argument.

>
> >
> >The programmer I was talking with earlier said that on some systems merely
> >loading a bad address into a register can cause a fault.

Ok .. now I need a definition of a "bad address" I guess ... to me, it
would be an address referring to a physically non-existent or perhaps
otherwise non-addressable (e.g. somehow reserved) region of memory.
However here the OP implies that a pointer to uninitialized memory is
also somehow a bad address. I fail to see how this can be true at the
level of machine code, unless the delete operation somehow changes the
actual address stored in the pointer, and not just the memory it
points to. If this is the case, and deletion actually changes the
address stored in the pointer, why not just change it to 0? Wouldn't
that make life easier, or am I missing something again?


Any clarification of the above issues would be greatly appreciated.
Thanks,

Dave Moore

ka...@gabi-soft.fr

May 6, 2004, 12:10:36 PM

"Andrew Koenig" <a...@acm.org> wrote in message
news:<3e7mc.39637$Xj6.6...@bgtnsc04-news.ops.worldnet.att.net>...

> "Peter C. Chapin" <pch...@sover.net> wrote in message
> news:Xns94DFE833AD38...@207.106.92.237...

> > This came up in the following context: Suppose one had a vector of
> > pointers where each element of the vector pointed at a dynamically
> > allocated object. Suppose that one then deleted all of those
> > objects, producing a vector of invalid pointers. Then suppose that
> > one erased one of the pointers in the middle of the vector. Does
> > this invoke undefined behavior?

> Yes, because doing so will copy the pointers after the one erased, and
> copying a deleted pointer invokes undefined behavior.

And more fundamentally, of course, because the standard says that the
contents of an std::vector must be Assignable and Copiable, and a
deleted pointer is neither. Formally, the simple presence of an invalid
pointer in a standard container means that you have undefined behavior.
Even the sequence:

delete v[ i ] ;
v[ i ] = NULL ;

is illegal, because an implementation is allowed to read the element
when calculating the reference. (I can't imagine any implementation
doing this, of course.) In fact, the only strictly legal way to delete
a pointer in a vector would be:

T* p = v[ i ] ;
v[ i ] = NULL ;
delete p ;

In practice, of course, the first case will never cause you any
problems. Deleting all of the pointers in a vector, then destructing
the vector won't either. In neither case is there any reason for the
vector to use the deleted pointers.

I don't think I would risk it much further, however.
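
(A minimal sketch of that pattern applied to a whole vector, in 2004-era C++; the T stand-in type and the destroy_all name are mine, not from the post:)

#include <cstddef>
#include <vector>

struct T { };   // stand-in element type for this sketch

void destroy_all(std::vector<T*>& v)
{
    for (std::size_t i = 0; i != v.size(); ++i) {
        T* p = v[i];
        v[i] = NULL;   // the container never holds an invalid pointer value
        delete p;
    }
    v.clear();
}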

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

ka...@gabi-soft.fr

May 6, 2004, 12:12:56 PM

Max Zinal <Zl...@mail.ru> wrote in message
news:<10837713...@smtp.tvcom.ru>...

> Peter C. Chapin wrote:
> > After deleting a dynamically allocated object the pointer that was
> > pointing at that object becomes "invalid." What does this mean
> > exactly? In particular, what operations are still allowed on the
> > pointer?

> In all the architectures which are known to me a pointer is an address
> of a memory region - and, this, a number. Its size depends on the
> architecture and may depend on pointer type (BTW, does Crays really
> have 32-bit instruction - e.g. function - pointers and 64-bit data
> pointers?) So on all those architectures you can safely do anything
> with that number unless you do not try to dereference it: copying and
> pointer arithmetic are safe operations.

The question isn't whether certain machines let you get away with
something. The question is whether it is guaranteed by the standard.
(Or at least I understood it that way.)

I don't know about the Cray, but different sized pointers for different
things used to be pretty common. I've done a lot of work with 32 bit
data pointers and 16 bit function pointers, or vice versa, and I've used
machines with 16 bit int* and 32 bit char*. (Byte addressing is actually a
fairly recent innovation.)

> > 1. Deleting the pointer again (this is obviously a use).

> You get an unspecified behaviour.

You get undefined behavior.

> Depending on the platform and particular implementation - and on a set
> of used runtime libraries! - you can end with:
> - immediate segmentation fault signal
> - Heap damage which will show itself _only_ with the next
> call to allocation/deallocation primitives or even later
> - runtime error from a smart library in debug mode
> - much, much more

Especially much, much more. One of the most frequent effects is that
the code runs correctly with all of your tests, but crashes on the first
big demo in front of an important customer.

> > 2. Passing the pointer to a function or returning it from a function?

> This is a copy operation. Pretty safe, I think, unless a function or
> its caller tries to dereference the pointer. It is not very smart to
> allow an invalid pointer to be passed anywhere - unsafe practice.

Just accessing the pointer is illegal. The standard is fairly vague
about what "using" a pointer means, but traditionnally, this has been
understood to more or less correspond to an lvalue to rvalue conversion;
you can write a new value to the pointer, but you cannot read the old
value.

The exact phrase in the standard is "The effect of using an invalid
pointer value is undefined." IMHO, the key operative word here is
"value"; it is the lvalue to rvalue conversion which uses the value of
the pointer. But it probably wouldn't hurt to clarify the wording
some.

> > 3. Copying the pointer?

> Same as above.

Correct. Except the above that you wrote wasn't correct.

> > 4. Converting the pointer, for example to void* ?

> Same as above.

Yes, since it is always undefined behavior.

> > The programmer I was talking with earlier said that on some systems
> > merely loading a bad address into a register can cause a fault. Thus
> > passing a deleted pointer to a function might cause a fault on such
> > a system if a register calling convention was being used. It seems
> > to me that an implementation for such a system could get around that
> > problem if it wanted to. Is it required to?

> I do not think that this is specified anywhere in the Standard, but I
> *really* doubt that such system exist.

Of course it's specified in the standard. And you can doubt all you
want, but such systems DID exist, which is why the rules are so
stringent.

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


John Potter

May 7, 2004, 8:00:55 AM

On 6 May 2004 10:09:29 -0400, dtm...@rijnh.nl (Dave Moore) wrote:

> to me, a pointer's "value" is the
> address it is storing. Why is it UB to pass (i.e. copy) that address
> into another location?

The best way to think about this is to recognize that the standard
allows many things, not just what I might think makes sense.

int main () {
    int x;
    x == 0;
}

This program has undefined behavior because it looks at the value of
an uninitialized int. The standard allows a C++ interpreter which
stores a trap value in uninitialized variables and aborts the program
on any access.

The plain simple answer to the question is that it is UB because the
standard says it is UB. It is law not logic, do not reason.

John

Frank Birbacher

May 7, 2004, 8:09:43 AM

Hi!

Francis Glassborow wrote:
>>This came up in the following context: Suppose one had a vector of
>>pointers where each element of the vector pointed at a dynamically
>>allocated object. Suppose that one then deleted all of those objects,
>>producing a vector of invalid pointers. Then suppose that one erased one
>>of the pointers in the middle of the vector. Does this invoke undefined
>>behavior?
>
>
> Yes.

o_O *puzzled*

So what about passing around past-the-end iterators of arrays (which are invalid pointers, because you may not dereference them)?? Is passing
them around undefined behaviour?? If not, what's the difference?

Frank

Francis Glassborow

May 7, 2004, 8:59:54 AM

In message <306d400f.04050...@posting.google.com>, Dave Moore
<dtm...@rijnh.nl> writes

>Ok .. now I need a definition of a "bad address" I guess ... to me, it
>would be an address referring to a physically non-existent or perhaps
>otherwise non-addressable (e.g. somehow reserved) region of memory.
>However here the OP implies that a pointer to uninitialized memory is
>also somehow a bad address. I fail to see how this can be true at the
>level of machine code, unless the delete operation somehow changes the
>actual address stored in the pointer, and not just the memory it
>points to. If this is the case, and deletion actually changes the
>address stored in the pointer, why not just change it to 0? Wouldn't
>that make life easier, or am I missing something again?

OK let me focus on this last section because all the problems you have
with the rest are based on this. A pointer (variable) stores a pointer
value (often referred to as an address to avoid confusion). In order to
store that value it stores a specific bit-pattern, but that bit-pattern
is not itself an address, just a representation of one (indeed a single
address can -- and does on some systems -- have more than one
bit-representation).

Now when a pointer becomes indeterminate, even if the bit-pattern
remains the same (and certainly according to the opinion of the last
WG14 -- C -- there is no requirement that this be so) it no longer
represents a valid value. Moreover, this is not playing games with
words; real programs change the valid range of addresses by such means
as (for some architectures) changing segment registers. In addition a
multitude of other low level events can change the meaning of a
bit-pattern.

The process of deletion can, and sometimes does, make the bit-pattern no
longer represent a valid address owned by the process. Not least, a
garbage collector could, I believe, legally change the bit-pattern.

The upshot is that as far as C and C++ are concerned any touching of the
bit-pattern stored in a pointer whose value was the subject of a delete
expression has undefined behaviour. Anything else would unduly constrain
implementors.

If you really want to copy the bit-pattern you can do so by accessing
the storage as an array of unsigned char. But even then the bit-pattern
may have changed (though the C and C++ Standards do not require it to
change.)
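
(A sketch of that escape hatch, assuming all one wants is the bytes of the representation, not a usable pointer value:)

#include <cstring>

int main()
{
    int* p = new int(1);
    delete p;

    // Copy the representation, not the value: treat the pointer object's
    // storage as raw bytes, as described above.
    unsigned char bytes[sizeof p];
    std::memcpy(bytes, &p, sizeof p);
}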


--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

Peter C. Chapin

May 7, 2004, 9:13:41 AM

dtm...@rijnh.nl (Dave Moore) wrote in news:306d400f.0405060052.534c64a6
@posting.google.com:

> int *i1=0; // NULL pointer .. dereferencing is UB
> double *d1 = new double(1.0);
> delete d1; // deleted pointer .. dereferencing is UB
>
> int *i2=i1; // copying a NULL pointer .. this is not UB, right?
> double *d2=d1; // copying a deleted pointer .. is this really UB?

It certainly seems to be the consensus of this group that the last
statement does, in fact, invoke undefined behavior.

> Ok .. now I need a definition of a "bad address" I guess ... to me, it
> would be an address referring to a physically non-existent or perhaps
> otherwise non-addressable (e.g. somehow reserved) region of memory.
> However here the OP implies that a pointer to uninitialized memory is
> also somehow a bad address. I fail to see how this can be true at the
> level of machine code, unless the delete operation somehow changes the
> actual address stored in the pointer, and not just the memory it
> points to.

One could imagine an implementation where the deallocation function hands
memory back to the operating system now and then. Thus after doing

delete p;

the memory pointed at by p might eventually leave the valid address space
of the process even if the address stored in p never changes. I suspect
there are few deallocators like this in typical PC programs, but it
certainly seems possible.

Peter

ka...@gabi-soft.fr

May 8, 2004, 4:02:11 PM

dtm...@rijnh.nl (Dave Moore) wrote in message
news:<306d400f.04050...@posting.google.com>...

Not at all. The pointer has a value, which is (or rather may be) an
address. Roughly speaking, we can consider this value as being in one
of four categories:
1. the address of an object, or the raw memory which will or has held
an object,
2. the address one behind the last object in an array -- for this use,
a scalar object is considered to be an array of one,
3. a null pointer, or
4. an invalid pointer.

What you can do with a pointer depends on which category its value is
in. You can only dereference pointers in category 1. You can only
compare pointers for inequality if they are in category 1 or 2, and only
then if they both point into the same array. You can only access the
value of the pointer (the address) if the pointer is in category 1, 2 or
3. You can take the address of the pointer, or assign something to it,
in all four categories.
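
(A small illustration of the four categories and the operations just described; variable names are mine:)

int main()
{
    int  a[4] = { 0, 1, 2, 3 };
    int* p    = &a[1];        // category 1: points at an object
    int* end  = a + 4;        // category 2: one past the end of the array
    int* null = 0;            // category 3: a null pointer
    int* bad  = new int(7);
    delete bad;               // category 4: an invalid pointer value

    *p = 42;                  // dereference: category 1 only
    if (p < end) { }          // relational comparison: categories 1/2, same array
    if (p != null) { }        // reading the value: categories 1, 2 and 3
    bad = 0;                  // assignment is fine in all four categories;
                              // even `int* copy = bad;` before this line would be UB
}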

> So, proceeding to the
> OP's examples:

>>> 1. Deleting the pointer again (this is obviously a use).

>> UB

> Ok, this seems clear .. since deletion obviously requires dereferencing
> the pointer.

Actually, there are possible implementations of delete that wouldn't
require dereferencing. But the standard says it is illegal, so it is
illegal.

>>> 2. Passing the pointer to a function or returning it from a
>>> function?

>> If you mean a pointer value, yes UB, if you mean pointer or
>> reference to a pointer (i.e. an object that could contain a pointer
>> value) no.

>>> 3. Copying the pointer?

>> UB

> Ok, these two I don't understand .. to me, a pointer's "value" is the
> address it is storing.

Correct.

> Why is it UB to pass (i.e. copy) that address into another location?

Because the standard says so. And because there have been
implementations (at least of C) where simply reading such a pointer
caused a hardware trap. In all cases, I'm sure that the compiler could
have worked around this, using general purpose registers or something
like memcpy, but the standard explicitly doesn't require it to.

> It seems to me that this would just involve copying of the address
> stored in the pointer. Of course it (almost) goes without saying that
> the invalid pointer should not be used .. and thus propagating it
> around is probably a "bad thing" ... so is this why the Standard
> specifies this as UB? Or is there something else that I am missing?
> For example, is there a difference between a null-pointer and a
> deleted one? AFAIK, it is kosher to pass, copy or otherwise use a
> null-pointer, provided you never dereference it ... so what is the
> difference with a deleted pointer?

The fact that the standard says its value is invalid.

A similar situation exists with regards to uninitialized POD types:
anything which might cause an uninitialized int or double (or pointer)
to be read is also undefined behavior.

> int *i1=0; // NULL pointer .. dereferencing is UB
> double *d1 = new double(1.0);
> delete d1; // deleted pointer .. dereferencing is UB

> int *i2=i1; // copying a NULL pointer .. this is not UB, right?

Right.

> double *d2=d1; // copying a deleted pointer .. is this really UB?

Right.

>>> 4. Converting the pointer, for example to void* ?
>> UB
>> Both these last two unambiguously use the value stored and that is
>> the thing that must not be touched.

> Once again, I don't see how this should be UB ... how does conversion
> to void * dereference the deleted pointer?

It doesn't dereference it (unless you use dynamic_cast for the
conversion, and the original type is polymorphic). It does access the
value, however, and that is forbidden.

> AFAICT it should just change the way the C++ program deals with the
> memory addressed by the pointer in the future, without actually
> looking at the contents of the memory.

> For that matter, I would expect even sizeof(*p) to be a valid
> operation for a deleted pointer, since the sizeof operator explicitly
> does not dereference its argument.

It is, since sizeof doesn't actually evaluate its operand. All sorts of
things are legal in a sizeof but will get you into trouble if you
actually were to execute them.

>>> The programmer I was talking with earlier said that on some systems
>>> merely loading a bad address into a register can cause a fault.

> Ok .. now I need a definition of a "bad address" I guess ... to me, it
> would be an address referring to a physically non-existent or perhaps
> otherwise non-addressable (e.g. somehow reserved) region of memory.

For example. And calling delete can remap the memory controller so that
the deleted address becomes inaccessible.

> However here the OP implies that a pointer to uninitialized memory is
> also somehow a bad address.

You can have a valid pointer to uninitialized memory. What else would
you pass to a constructor, for example?

> I fail to see how this can be true at the level of machine code,
> unless the delete operation somehow changes the actual address stored
> in the pointer, and not just the memory it points to.

That's because there are machine architectures you are not familiar
with. Almost nothing would surprise me at the level of machine code.

> If this is the case, and deletion actually changes the address stored
> in the pointer, why not just change it to 0? Wouldn't that make life
> easier, or am I missing something again?

Well, it would mask a certain number of errors, so that they would only
show up later, when it would be much harder to find the problem.

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


ka...@gabi-soft.fr

May 8, 2004, 4:02:47 PM

Antoun Kanawati <ant...@comcast.net> wrote in message
news:<ubdmc.29072$TD4.4173085@attbi_s01>...

> Max Zinal wrote:
>> Peter C. Chapin wrote:

>>> After deleting a dynamically allocated object the pointer that was
>>> pointing at that object becomes "invalid." What does this mean
>>> exactly? In particular, what operations are still allowed on the
>>> pointer?

>> [snip]

>>> 4. Converting the pointer, for example to void* ?
>> Same as above.

> Doesn't virtual inheritance introduce some sort of stored
> offset inside the object? In such cases, pointer conversion
> would require dereferencing, hence mucking up the program.

If you do a dynamic_cast on a pointer to an object which isn't
constructed, you can get into deep trouble, very quickly. Yes. But
even a static_cast on a deleted pointer can fail.

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Francis Glassborow

May 8, 2004, 9:53:33 PM

In message <2fv6n7F...@uni-berlin.de>, Frank Birbacher
<bloodym...@gmx.net> writes

>Francis Glassborow wrote:
> >>This came up in the following context: Suppose one had a vector of
> >>pointers where each element of the vector pointed at a dynamically
> >>allocated object. Suppose that one then deleted all of those objects,
> >>producing a vector of invalid pointers. Then suppose that one erased one
> >>of the pointers in the middle of the vector. Does this invoke undefined
> >>behavior?
> >
> >
> > Yes.
>
>o_O *puzzled*
>
>So what about passing around past-the-end iterators of arrays (which are
>invalid pointers, because you may not dereference them)?? Is passing
>them around undefined behaviour?? If not, what's the difference?

One beyond the end pointers are explicitly required to contain valid
addresses, just ones that must not be dereferenced. The address is valid
and so accessible but we must not try to use the address to access the
(non-existent) object.
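
(For example, a sketch, not from the original post:)

int main()
{
    int a[3] = { 1, 2, 3 };
    int* end = a + 3;                 // one past the end: a valid pointer value

    for (int* p = a; p != end; ++p)   // copying and comparing `end` is fine
        *p += 1;

    // *end = 0;                      // but dereferencing it would be UB
}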

--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

llewelly

May 8, 2004, 10:15:12 PM

Combine that with an architecture that traps when an invalid pointer
is loaded into an address register, and we have the reason why:
'delete p; T* q= p;' is undefined behavior.

llewelly

May 8, 2004, 10:30:05 PM

Frank Birbacher <bloodym...@gmx.net> writes:

> Hi!
>
> Francis Glassborow wrote:
> >>This came up in the following context: Suppose one had a vector of
> >>pointers where each element of the vector pointed at a dynamically
> >>allocated object. Suppose that one then deleted all of those objects,
> >>producing a vector of invalid pointers. Then suppose that one erased one
> >>of the pointers in the middle of the vector. Does this invoke undefined
> >>behavior?
> >
> >
> > Yes.
>
> o_O *puzzled*
>
> So what about passing around past-the-end iterators of arrays (which
> are invalid pointers, because you may not dereference them)?? Is
> passing them around undefined behaviour?? If not, what's the
> difference?

A past-the-end pointer is not invalid. It can be used in
additive (5.7), relational (5.9), equality (5.10), decrement
(5.2.6), and other expressions.

Similarly for past-the-end iterators. AFAIK, the only things you can't
do with past-the-end iterators or pointers are advance them or
dereference them.

John Potter

May 9, 2004, 8:22:48 AM

On 7 May 2004 08:09:43 -0400, Frank Birbacher <bloodym...@gmx.net>
wrote:

> So what about passing around past-the-end iterators of arrays (which
> are invalid pointers, because you may not dereference them)?? Is
> passing them around undefined behaviour?? If not, what's the difference?

Confusion with terms. There are four kinds of pointer values and three
kinds of iterator values. The set of operations allowed depends upon the
kind of value.

Dereferencible pointer/iterator : all operations
Past the end pointer/iterator : no dereference
Null pointer : no dereference
   (Arithmetic must stay within the valid range. Adding 0 to null.)
Invalid pointer/iterator : no use

The first three are all valid values which may be copied and compared.
Invalid pointer/iterator values occur through lack of initialization or some
operation. If the standard says it is invalid, any use is UB.

John

Maciej Sobczak

May 9, 2004, 8:23:36 AM

Hi,

Frank Birbacher wrote:

> So what about passing around past-the-end iterators of arrays (which
> are invalid pointers, because you may not dereference them)?? Is passing
> them around undefined behaviour?? If not, what's the difference?

The difference is that the pointer can be:
- dereferenceable,
- past-the-end, or
- invalid.

The Standard intentionally contains the wording (5.7/5) that makes
past-the-end pointers non-dereferenceable, but in some sense valid
pointers.
"In some sense valid" means that there is a given set of operations that are
well-defined (this includes comparisons, pointer arithmetic back into the
array and similar stuff), but this set does not include dereferencing.

So:

int *a = new int[10];
int *p1 = a + 5;
int *p2 = a + 10; // past-the-end

p1 == p2; // OK (gives false as a result)
--p2; // OK, becomes dereferenceable
++p2; // OK, goes back to past-the-end
int *p3 = p2; // OK, p2 is valid expression
++p3; // UB
cout << *p1; // OK, p1 is dereferenceable
cout << *p2; // UB, p2 is not dereferenceable

delete [] a;
p1 == p2; // UB, now all a, p1, p2 and p3 are invalid

--
Maciej Sobczak : http://www.msobczak.com/
Programming : http://www.msobczak.com/prog/

David Olsen

May 9, 2004, 9:24:11 AM

Frank Birbacher wrote:
> So what about passing around past-the-end iterators of arrays (which
> are invalid pointers, because you may not dereference them)?? Is
> passing them around undefined behaviour?? If not, what's the
> difference?

A pointer that points just past the end of an array can be used safely
in a number of contexts, such as copying it or comparing it against a
different pointer that points within the same array. It just can't be
dereferenced. Whether or not you consider it to be a valid pointer
depends on your definition of valid. My preferred terminology is that
such a pointer is valid but not dereferencible.

A pointer that has been deleted, however, is not valid in any way. It
can't be used safely in any context.

--
David Olsen
qg4h9...@yahoo.com

Kurt Stege

May 10, 2004, 11:14:10 AM

On 8 May 2004 16:02:11 -0400, ka...@gabi-soft.fr wrote:

>The pointer has a value, which is (or rather may be) an
>address. Roughly speaking, we can consider this value as being in one
>of four categories:
> 1. the address of an object, or the raw memory which will or has held
> an object,
> 2. the address one behind the last object in an array -- for this use,
> a scalar object is considered to be an array of one,
> 3. a null pointer, or
> 4. an invalid pointer.
>
>What you can do with a pointer depends on which category its value is
>in. You can only dereference pointers in category 1.

OK.

>You can only
>compare pointers for inequality if they are in category 1 or 2, and only
>then if they both point into the same array.

Surely you mean that null pointers (3.) also qualify for comparison.

And, important for the current thread, I assume whenever a pointer
may be compared with another pointer, it may be read and copied (the
pointer, not the value it is pointing to).

Every pointer of type 1. or 2. may be compared to a pointer of
type 3., the result will be false (different). And every pointer
of type 3. may be compared to a pointer of type 3. with the
result true (same).

Here I am talking about comparisons "==" and "!=".

For other comparisons like "<" the restrictions you mentioned
are valid.


>You can only access the value of the pointer (the address) if
>the pointer is in category 1, 2 or 3.

Yes.

>You can take the address of the pointer, or assign something to it,
>in all four categories.

Yes.

That summarises the defined standard behaviour. It may be surprising
to learn that copying a pointer of type 4. leads to undefined behaviour.

Regards,
Kurt.

ka...@gabi-soft.fr

May 11, 2004, 4:41:34 PM

Kurt Stege <kst...@innovative-systems.de> wrote in message
news:<2g926s...@uni-berlin.de>...

> On 8 May 2004 16:02:11 -0400, ka...@gabi-soft.fr wrote:
> >The pointer has a value, which is (or rather may be) an address.
> >Roughly speaking, we can consider this value as being in one of four
> >categories:
> > 1. the address of an object, or the raw memory which will or has held
> > an object,
> > 2. the address one behind the last object in an array -- for this use,
> > a scalar object is considered to be an array of one,
> > 3. a null pointer, or
> > 4. an invalid pointer.
> >What you can do with a pointer depends on which category its value is
> >in. You can only dereference pointers in category 1.

> OK.

> >You can only
> >compare pointers for inequality if they are in category 1 or 2, and only
> >then if they both point into the same array.

> Surely you mean that null pointers (3.) also qualify for comparison.

Not for inequality.

> And, important for the current thread, I assume whenever a pointer may
> be compared with another pointer, it may be read and copied (the
> pointer, not the value it is pointing to).

Yes. I mention that below.

> Every pointer of type 1. or 2. may be compared to a pointer of type
> 3., the result will be false (different). And every pointer of type
> 3. may be compared to a pointer of type 3. with the result true
> (same).

> Here I am talking about comparisons "==" and "!=".

And I specifically said compare for inequality, that is >, >=, < and
<=. But you're right that I should have mentioned comparisons for
equality below, to make it clear.

> For other comparisons like "<" the restictions you mentioned
> are valid.

> >You can only access the value of the pointer (the address) if the
> >pointer is in category 1, 2 or 3.

> Yes.

> >You can take the address of the pointer, or assign something to it,
> >in all four categories.

> Yes.

> That summaries the defined standard behaviour. It may be surprising to
> learn that copying a pointer of type 4. leads to undefined behaviour.

It seems to surprise a lot of people, but it has been the case since
work began on the C standard, at least. The C++ rules here are exactly
the same as those in C.

--
James Kanze GABI Software

Conseils en informatique orientée objet/

Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


toddmars...@yahoo.com

May 12, 2004, 2:02:23 PM

ka...@gabi-soft.fr wrote in message news:<d6652001.04051...@posting.google.com>...

> That summaries the defined standard behaviour. It may be surprising to
> learn that copying a pointer of type 4. leads to undefined behaviour.

How is a compiler supposed to implement these 'standard' behaviours at
run-time?
Seems like adding a lot of code where it's not needed. Doesn't seem
like the C++ I'm used to. Where's a good place to look to understand
how a compiler implements undefined behaviour when copying a pointer?

Andrew Koenig

May 12, 2004, 8:16:16 PM

<ka...@gabi-soft.fr> wrote in message
news:d6652001.04050...@posting.google.com...

> dtm...@rijnh.nl (Dave Moore) wrote in message

> Not at all. The pointer has a value, which is (or rather may be) an
> address. Roughly speaking, we can consider this value as being in one
> of four categories:
> 1. the address of an object, or the raw memory which will or has held
> an object,
> 2. the address one behind the last object in an array -- for this use,
> a scalar object is considered to be an array of one,
> 3. a null pointer, or
> 4. an invalid pointer.
>
> What you can do with a pointer depends on which category its value is
> in. You can only dereference pointers in category 1. You can only
> compare pointers for inequality if they are in category 1 or 2, and only
> then if they both point into the same array. You can only access the
> value of the pointer (the address) if the pointer is in category 1, 2 or
> 3. You can take the address of the pointer, or assign something to it,
> in all four categories.

This statement is not quite correct:

You can use == and != on pointers in category 1, 2, or 3.

You can also use <, >, <=, and >= on pointers in category 1 or 2, provided
that they point into (or one past the end of) the same array.
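
(A short illustration of those two rules; a sketch, names are mine:)

int main()
{
    int a[8] = { 0 };
    int* p = a + 2;          // category 1
    int* q = a + 8;          // category 2: one past the end
    int* n = 0;              // category 3: null

    bool ok1 = (p == n);     // equality: any of categories 1, 2, 3
    bool ok2 = (p < q);      // relational: categories 1 or 2, same array
    (void)ok1; (void)ok2;
}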

Andrew Koenig

unread,
May 12, 2004, 8:18:13 PM5/12/04
to
<ka...@gabi-soft.fr> wrote in message
news:d6652001.04050...@posting.google.com...
> "Andrew Koenig" <a...@acm.org> wrote in message

> pointer in a standard container means that you have undefined behavior.
> Even the sequence:
>
> delete v[ i ] ;
> v[ i ] = NULL ;
>
> is illegal, because an implementation is allowed to read the element
> when calculating the reference.

I am very skeptical of this argument, because if true, it would render
undefined the following code:

int main()
{
    int x[10];
    for (int i = 0; i != 10; ++i)
        x[i] = i;
}

For the sake of argument, let's rewrite v[i] as *(v+i). Can you tell me
where the standard permits *(v+i) to be dereferenced if it is used as an
lvalue?

(If you can find such a place, I think a defect report is in order, because
of the example above)

Max Zinal

May 12, 2004, 8:27:16 PM

ka...@gabi-soft.fr wrote:
> It seems to surprise a lot of people, but it has been the case since
> work began on the C standard, at least. The C++ rules here are exactly
> the same as those in C.

IMHO it tends to surprise so many of us because the reasons for those
'pointer rules' are outside the scope of the language itself. C has been
called 'a high-level assembler' many times, but you cannot find any
reason for denying access to a pointer value unless you know about
the 'address registers' on those exotic hardware architectures,
which doesn't seem to integrate well with the logic on which
the language has been based.

Segmented address models are becoming less and less common, as well
as word-only addressing. But now I realize the reasons for those
rules - thank you for your comments. Perhaps these rules should be
written in huge red letters in some of the most basic and primitive
books on C/C++.

Francis Glassborow

May 12, 2004, 8:28:32 PM

In message <c7aadd5c.04051...@posting.google.com>,
toddmars...@yahoo.com writes

>ka...@gabi-soft.fr wrote in message news:<d6652001.04051...@posting.google.com>...
>> That summaries the defined standard behaviour. It may be surprising to
>> learn that copying a pointer of type 4. leads to undefined behaviour.
>
>How is a compiler supposed to implement these 'standard' behaviours in
>run-time?
>Seems like adding alot of code where it's not needed. Doesn't seem
>like the C++ I'm used to. Where's a good place to look to understand
>how a compiler implements undefined behaviour when copying a pointer?

What is that supposed to mean? The compiler has to do zilch for
undefined behaviour because it has no responsibility for the
consequences.

--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

llewelly

May 13, 2004, 6:51:03 AM

Max Zinal <Zl...@mail.ru> writes:

> ka...@gabi-soft.fr wrote:
>> It seems to surprise a lot of people, but it has been the case since
>> work began on the C standard, at least. The C++ rules here are exactly
>> the same as those in C.
>
> IMHO it tends to surprise so many of us because the reasons for that
> 'pointer rules' are outside the scope of the language itself. C has been
> called 'a high-level assembler' many times, but you cannot find any
> reason for denying access to a pointer value uness you know about
> the 'address registers' on those exotic hardware acrhitectures,
> which doesn't seem to integrate well with the logic on which
> the language has been based.
>
> Segmented address models are becoming less and less common as well
> as word-only addressing.

[snip]

But since faulting on the load of an invalid address into an address
register is a security feature, it may come back.

James Dennett

May 13, 2004, 7:00:02 AM

Andrew Koenig wrote:

> <ka...@gabi-soft.fr> wrote in message
> news:d6652001.04050...@posting.google.com...
>
>>"Andrew Koenig" <a...@acm.org> wrote in message
>
>
>>pointer in a standard container means that you have undefined behavior.
>>Even the sequence:
>>
>> delete v[ i ] ;
>> v[ i ] = NULL ;
>>
>>is illegal, because an implementation is allowed to read the element
>>when calculating the reference.
>
>
> I am very skeptical of this argument, because if true, it would render
> undefined the following code:
>
> int main()
> {
> int x[10];
> for (int i = 0; i != 10; ++i)
> x[i] = i;
> }

No, it would not. Arguments about the restrictions imposed
by standard library containers have nothing to say about
built-in arrays. Items in a container are required to obey
certain rules. If a deleted pointer doesn't satisfy those
rules, then the delete v[i]; v[i] = NULL; fragment is not
guaranteed to work, nor is delete v[0]; v[1];

> For the sake of argument, let's rewrite v[i] as *(v+i). Can you tell me
> where the standard permits *(v+i) to be dereferenced if it is used as an
> lvalue?
>
> (If you can find such a place, I think a defect report is in order, because
> of the example above)

I don't see that your example relates to the issue of whether
a deleted pointer can legally be held in (say) a std::vector<int*>.

Did you miss the "in a standard container" restriction?

-- James

ka...@gabi-soft.fr

May 13, 2004, 9:17:54 AM

Max Zinal <Zl...@mail.ru> wrote in message
news:<10843834...@smtp.tvcom.ru>...

> ka...@gabi-soft.fr wrote:
> > It seems to surprise a lot of people, but it has been the case since
> > work began on the C standard, at least. The C++ rules here are
> > exactly the same as those in C.

> IMHO it tends to surprise so many of us because the reasons for that
> 'pointer rules' are outside the scope of the language itself.

The reasons for everything in C and in C++ are "outside the scope of the
language". The language doesn't dictate usability or implementability;
it is usability and implementability which have dictated the language.

> C has been called 'a high-level assembler' many times, but you cannot
> find any reason for denying access to a pointer value uness you know
> about the 'address registers' on those exotic hardware acrhitectures,
> which doesn't seem to integrate well with the logic on which the
> language has been based.

So all the world's a VAX:-). There was a time when many C programmers
seemed to think so.

> Segmented address models are becoming less and less common as well as
> word-only addressing.

One of the two machines I have on my desk has segmented addressing. As
do several machines I have at home.

Generally, because they are 32 bit machines, and most programs don't use
more than 4 GB of main memory, we choose to ignore the segmentation;
some current OS's (Windows and Linux, for example) don't even allow us
to use it. But it is there, and I've written programs in the past that
did use it (48 bit addresses on an Intel 80386).

Of course, just because the memory is segmented doesn't mean that free
or delete have to invalidate the pointer. And it is entirely possible
for free or delete to invalidate it on some non-segmented architectures;
all that is needed is for the machine to have separate address
registers, and for the compiler to systematically use them.

> But now I realize the reasons for that rules - thank you for your
> comments. Purhaps this rules should be written in huge red letters in
> some of the most basic and primitive books on C/C++.

It *should* be pointed out relatively early in any C/C++ book. Almost
as soon as pointers and dynamic allocation are introduced.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


ka...@gabi-soft.fr

May 13, 2004, 9:18:29 AM

toddmars...@yahoo.com wrote in message
news:<c7aadd5c.04051...@posting.google.com>...

> ka...@gabi-soft.fr wrote in message
> news:<d6652001.04051...@posting.google.com>...
> > That summaries the defined standard behaviour. It may be surprising
> > to learn that copying a pointer of type 4. leads to undefined
> > behaviour.

> How is a compiler supposed to implement these 'standard' behaviours in
> run-time?
> Seems like adding alot of code where it's not needed. Doesn't seem
> like the C++ I'm used to. Where's a good place to look to understand
> how a compiler implements undefined behaviour when copying a pointer?

The whole point of undefined behavior is that the compiler doesn't have
to implement anything, regardless of what the hardware does. The point
behind the undefined behavior here is precisely to free the compiler
from having to generate special code to make it work when it doesn't
work naturally with the hardware.

Very much in the philosophy of C/C++ that I know: you get whatever the
hardware gives you.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


ka...@gabi-soft.fr

May 13, 2004, 9:19:06 AM

"Andrew Koenig" <a...@acm.org> wrote in message
news:<XBroc.47041$Ut1.1...@bgtnsc05-news.ops.worldnet.att.net>...

That's what I tried to say. The comparisons for inequality are <, >, <=
and >=. (I didn't mention comparisons for equality, but since to
compare, you need to read the pointer, they fall into the group
accessing the value of the pointer, i.e. groups 1, 2, and 3.)

I guess I should have been a bit wordier.

--
James Kanze GABI Software

Conseils en informatique orientée objet/

Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Kurt Stege

May 13, 2004, 7:32:29 PM

On 12 May 2004 14:02:23 -0400, toddmars...@yahoo.com wrote:

>ka...@gabi-soft.fr wrote in message news:<d6652001.04051...@posting.google.com>...
>> That summaries the defined standard behaviour. It may be surprising to
>> learn that copying a pointer of type 4. leads to undefined behaviour.

Somehow, your quoting went crazy. These two lines were not written
by James, but by myself.

>How is a compiler supposed to implement these 'standard' behaviours in
>run-time?

This question I don't grasp. What "standard behaviours" are you
mentioning? The compiler does not have to do anything to "implement"
"undefined behaviour". The standard says that any code (or non-code)
for undefined behaviour is OK. Even when what you are expecting happens,
that is OK for undefined behaviour ;-)

>Seems like adding a lot of code where it's not needed.

Undefined behaviour is most of the time not special code generated
by the compiler, but just what happens by accident when executing
the code that was generated for the cases with defined behaviour.
In our example that would be to copy a valid pointer or a NULL pointer.

>Doesn't seem like the C++ I'm used to. Where's a good place to
>look to understand how a compiler implements undefined behaviour
>when copying a pointer?

That depends on the compiler, and is typically not defined even for a
concrete compiler. That's why it's called undefined behaviour :-)
To get "undefined behaviour", the compiler does not have to implement
anything. It may add special code, of course, but you are right,
that is not typical for C++.

Hope that helps,
Kurt.

Rob

unread,
May 16, 2004, 5:44:29 AM5/16/04
to

"David Olsen" <qg4h9...@yahoo.com> wrote in message
news:2g26r6F...@uni-berlin.de...

> Frank Birbacher wrote:
> > So what about passing around past-the-end iterators of arrays (which
> > are invalid pointers, because you may not dereference them)?? Is
> > passing them around undefined behaviour?? If not, what's the
> > difference?

It is quite acceptable to pass past-the-end iterators around, and compare
them with other iterators. It is undefined behaviour to dereference them.

>
> A pointer that points just past the end of an array can be used safely
> in a number of contexts, such as copying it or comparing it against a
> different pointer that points within the same array. It just can't be
> dereferenced. Whether or not you consider it to be a valid pointer
> depends on your definition of valid. My preferred terminology is that
> such a pointer is valid but not dereferencible.

I suspect you're splitting hairs unnecessarily. If your definition
of "valid" is something like "any variable of type pointer can
represent this value", then all pointers will be valid.

>
> A pointer that has been deleted, however, is not valid in any way. It
> can't be used safely in any context.
>

Really? What is unsafe with the following?

#include <iostream>
main()
{
int *x = new int;
int *y = new int;
delete x;
if (x != y)
std::cout << "True\n";
else
std::cout << "False\n";
return 0;
}

John Potter

unread,
May 16, 2004, 7:30:30 PM5/16/04
to
On 16 May 2004 05:44:29 -0400, "Rob" <nos...@nonexistant.com> wrote:

> What is unsafe with the following?

You seem to be missing the point that this is law not logic.

> int *x = new int;
> int *y = new int;
> delete x;

The standard states that any use of the current value of x
has undefined behavior.

> if (x != y)

This uses the value of x. It is UB. UB is unsafe because
anything may happen. "Anything" means exactly that. Do not
ask what.

John

Francis Glassborow

unread,
May 16, 2004, 7:33:36 PM5/16/04
to
In message <40a6f...@news.iprimus.com.au>, Rob
<nos...@nonexistant.com> writes

>Really? What is unsafe with the following?
>
>#include <iostream>
>main()
>{
> int *x = new int;
> int *y = new int;
> delete x;
> if (x != y)

That the above line compares the values stored in two pointer variables,
one of which is in an indeterminate state.

> std::cout << "True\n";
> else
> std::cout << "False\n";
> return 0;
>}
>

--

Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

Rob

unread,
May 17, 2004, 10:06:04 AM5/17/04
to
"Francis Glassborow" <fra...@robinton.demon.co.uk> wrote in message
news:xQxTKDIO...@robinton.demon.co.uk...

> In message <40a6f...@news.iprimus.com.au>, Rob
> <nos...@nonexistant.com> writes
> >Really? What is unsafe with the following?
> >
> >#include <iostream>
> >main()
> >{
> > int *x = new int;
> > int *y = new int;
> > delete x;
> > if (x != y)
>
> That the above line compare the values stored in two pointer variables,
> one of which is in an indeterminate state.
>

I agree that the value of x is indeterminate. My quibble is that I fail to
see why the act of comparing an indeterminate value with a (for want
of better wording) determinate one is fundamentally unsafe.

I'll even go so far as to accept that the result of the comparison is
indeterminate, but it is also constrained to be either true or false.

John Potter

unread,
May 17, 2004, 4:39:09 PM5/17/04
to
On 17 May 2004 10:06:04 -0400, "Rob" <nos...@nonexistant.com> wrote:

> "Francis Glassborow" <fra...@robinton.demon.co.uk> wrote in message
> news:xQxTKDIO...@robinton.demon.co.uk...

> > In message <40a6f...@news.iprimus.com.au>, Rob
> > <nos...@nonexistant.com> writes

> > >Really? What is unsafe with the following?

> > >#include <iostream>
> > >main()
> > >{
> > > int *x = new int;
> > > int *y = new int;
> > > delete x;
> > > if (x != y)

> > That the above line compare the values stored in two pointer variables,
> > one of which is in an indeterminate state.

> I agree that the value of x is indeterminate. My quibble is that I fail to
> see why the act of comparing an indeterminate value with a (for want
> of better wording) determinate one is fundamentally unsafe.

> I'll even go so far as to accept that the result of the comparison is
> indeterminate, but it is also constrained to be either true or false.

There are no constraints on undefined behavior. Since this one is so
easy to detect, the implementation may generate code to call abort as
a valid translation of the if statement.

You are still missing the point. What an implementation is allowed to
do is determined by the C++ standard. The standard says that when a
pointer value is used as an operand to delete, looking at any pointer
which had that value is undefined behavior. Undefined behavior says
that the implementation may generate any code it likes. My favorite
example is to send email to your boss suggesting that you should seek
new employment for using undefined behavior in production code. There
was a compiler which detected some forms of undefined behavior and
generated code to start nethack.

Using something which has no defined behavior is unsafe.

John

Ralf Fassel

unread,
May 17, 2004, 4:40:54 PM5/17/04
to
* Francis Glassborow <fra...@robinton.demon.co.uk>

| > int *x = new int;
| > int *y = new int;
| > delete x;
| > if (x != y)
|
| That the above line compare the values stored in two pointer
| variables, one of which is in an indeterminate state.

I would not expect the value of x to change just because the object it
points to is deleted.

R'

Randy Maddox

unread,
May 18, 2004, 11:22:43 AM5/18/04
to
"Rob" <nos...@nonexistant.com> wrote in message news:<40a81...@news.iprimus.com.au>...

> "Francis Glassborow" <fra...@robinton.demon.co.uk> wrote in message
> news:xQxTKDIO...@robinton.demon.co.uk...
> > In message <40a6f...@news.iprimus.com.au>, Rob
> > <nos...@nonexistant.com> writes
> > >Really? What is unsafe with the following?
> > >
> > >#include <iostream>
> > >main()
> > >{
> > > int *x = new int;
> > > int *y = new int;
> > > delete x;
> > > if (x != y)
> >
> > That the above line compare the values stored in two pointer variables,
> > one of which is in an indeterminate state.
> >
>
> I agree that the value of x is indeterminate. My quibble is that I fail to
> see why the act of comparing an indeterminate value with a (for want
> of better wording) determinate one is fundamentally unsafe.
>
> I'll even go so far as to accept that the result of the comparison is
> indeterminate, but it is also constrained to be either true or false.
>

It is not necessarily unsafe, although it well could be. It is
definitely, however, undefined. And that means that the result is not
constrained to be either true or false. Once you enter the land of
UB, all bets are off. Whatever happens is whatever happens. It may
be as you expect, but it may be completely different. It's undefined.
That's the whole issue here.


Randy.

llewelly

unread,
May 18, 2004, 11:23:24 AM5/18/04
to
"Rob" <nos...@nonexistant.com> writes:

> "Francis Glassborow" <fra...@robinton.demon.co.uk> wrote in message
> news:xQxTKDIO...@robinton.demon.co.uk...
>> In message <40a6f...@news.iprimus.com.au>, Rob
>> <nos...@nonexistant.com> writes
>> >Really? What is unsafe with the following?
>> >
>> >#include <iostream>
>> >main()
>> >{
>> > int *x = new int;
>> > int *y = new int;
>> > delete x;
>> > if (x != y)
>>
>> That the above line compare the values stored in two pointer variables,
>> one of which is in an indeterminate state.
>>
>
> I agree that the value of x is indeterminate. My quibble is that I fail to
> see why the act of comparing an indeterminate value with a (for want
> of better wording) determinate one is fundamentally unsafe.

[snip]

(a) An implementation of delete could return memory to the operating
system after it was de-allocated, and the memory could then be
unmapped, making pointers into it invalid.

(b) Some architectures fault when an invalid address is loaded into an
address register. This is considered a safety feature, as it
prevents use of an invalid address.

The combination of (a) and (b) is one example the authors of the
standard wanted to allow by making the use of an invalid pointer
undefined.

To put it another way, use of a pointer to a destroyed object is a
conceptual error, and it is good the standard allows an
implementation to detect such an error. (Too bad such
implementations aren't more common. :-)

David Olsen

unread,
May 18, 2004, 11:32:22 AM5/18/04
to
Rob wrote:
> "Francis Glassborow" <fra...@robinton.demon.co.uk> wrote in message news:xQxTKDIO...@robinton.demon.co.uk...
>>In message <40a6f...@news.iprimus.com.au>, Rob <nos...@nonexistant.com> writes
>>
>>>Really? What is unsafe with the following?
>>>
>>>#include <iostream>
>>>main()
>>>{
>>> int *x = new int;
>>> int *y = new int;
>>> delete x;
>>> if (x != y)
>>
>>That the above line compare the values stored in two pointer variables,
>>one of which is in an indeterminate state.
>
> I agree that the value of x is indeterminate. My quibble is that I fail to
> see why the act of comparing an indeterminate value with a (for want
> of better wording) determinate one is fundamentally unsafe.

Because the C++ standard says that comparing an indeterminate pointer value with
something else is undefined behavior.

> I'll even go so far as to accept that the result of the comparison is
> indeterminate, but it is also constrained to be either true or false.

No, the behavior is undefined. It is not constrained to return any
value at all.

I would be very, very surprised if you ever used a compiler where
if (x != y)
in your example code returned anything other than true. But there have
existed (and may still exist) computer architectures where simply
loading an invalid address into a register will cause a trap. Because
such architectures exist, and because you can't do anything useful with
an indeterminate pointer, there is no good reason for the C++ standard
to define the behavior of using such a pointer.

--
David Olsen
qg4h9...@yahoo.com

John Potter

unread,
May 18, 2004, 5:57:22 PM5/18/04
to
On 17 May 2004 16:40:54 -0400, Ralf Fassel <ral...@gmx.de> wrote:

> * Francis Glassborow <fra...@robinton.demon.co.uk>
> | > int *x = new int;
> | > int *y = new int;
> | > delete x;
> | > if (x != y)

> | That the above line compare the values stored in two pointer
> | variables, one of which is in an indeterminate state.

> I would not expect the value of x to change just because the object it
> points to is deleted.

Expectations have little to do with reality and nothing to do with the
standard. Even if it does not change, looking at it can call abort or
anything else.

John

James Dennett

unread,
May 18, 2004, 5:58:06 PM5/18/04
to
Ralf Fassel wrote:
> * Francis Glassborow <fra...@robinton.demon.co.uk>
> | > int *x = new int;
> | > int *y = new int;
> | > delete x;
> | > if (x != y)
> |
> | That the above line compare the values stored in two pointer
> | variables, one of which is in an indeterminate state.
>
> I would not expect the value of x to change just because the object it
> points to is deleted.

You might not, but your expectation is not based on the C++
language as defined by its ISO standard. That standard quite
clearly states that the value *is* indeterminate after the
call to delete. An implementation is quite allowed to add
the pointer value to a list of forbidden values for which
it checks, for example, or do the equivalent in hardware.
While that wouldn't even have to change the value stored in
memory, it could make *that value* a trap value.
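
As a crude user-level sketch of the kind of bookkeeping such an
implementation (or a debugging layer) is allowed to do; the replacement
operators and the fixed-size table below are purely illustrative,
written in C++03-era style, and nothing the standard requires:

    #include <cstdio>
    #include <cstdlib>
    #include <new>

    // Illustrative only: remember the last few addresses handed to operator delete.
    static void*       recently_freed[256];
    static std::size_t freed_count = 0;

    void* operator new(std::size_t n) throw(std::bad_alloc)
    {
        void* p = std::malloc(n ? n : 1);
        if (!p) throw std::bad_alloc();
        return p;
    }

    void operator delete(void* p) throw()
    {
        if (p)
            recently_freed[freed_count++ % 256] = p;
        std::free(p);
    }

    bool looks_recently_deleted(const void* p)
    {
        for (std::size_t i = 0; i < 256 && i < freed_count; ++i)
            if (recently_freed[i] == p)
                return true;
        return false;
    }

    int main()
    {
        int* x = new int(42);
        void* raw = x;                    // even reading this copy later is formally suspect
        delete x;
        if (looks_recently_deleted(raw))
            std::puts("that address was recently deleted");
        return 0;
    }

Note the irony: under the strict reading argued above, even handing the
saved raw copy to the checking function is a "use" of an invalid
pointer value.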

-- James

ka...@gabi-soft.fr

unread,
May 18, 2004, 6:00:48 PM5/18/04
to
Ralf Fassel <ral...@gmx.de> wrote in message
news:<ygaad07...@ozelot.akutech-local.de>...

> * Francis Glassborow <fra...@robinton.demon.co.uk>
> | > int *x = new int;
> | > int *y = new int;
> | > delete x;
> | > if (x != y)

> | That the above line compare the values stored in two pointer
> | variables, one of which is in an indeterminate state.

> I would not expect the value of x to change just because the object it
> points to is deleted.

The bit pattern may not change, but what the bit pattern means may be
modified by the delete.

You might want to try it on an Intel 80286, running under RMX-286, for
example. I think this is the system where I actually saw the problem;
it was definitely on an 80286. And I've also seen it on an 80386, using
a real-time OS whose name I've forgotten. This was before the days of
C++ (for me, at least), so the language was PL/M, but freeing memory
returned it to the OS, which unmapped it. And at least one compiler
systematically used the instruction LES to load an address, even for
comparison -- loading an unmapped address using the LES instruction
caused a hardware trap. (I think the 386 compiler used two instructions
to load the address, loading both parts directly into general purpose
registers, rather than using LES, LFS or LGS. So the comparison
wouldn't core dump.)

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

[ See http://www.gotw.ca/resources/clcm.htm for info about ]

ka...@gabi-soft.fr

unread,
May 18, 2004, 6:01:11 PM5/18/04
to
"Rob" <nos...@nonexistant.com> wrote in message
news:<40a81...@news.iprimus.com.au>...
> "Francis Glassborow" <fra...@robinton.demon.co.uk> wrote in message
> news:xQxTKDIO...@robinton.demon.co.uk...
> > In message <40a6f...@news.iprimus.com.au>, Rob
> > <nos...@nonexistant.com> writes
> > >Really? What is unsafe with the following?

> > >#include <iostream>
> > >main()
> > >{
> > > int *x = new int;
> > > int *y = new int;
> > > delete x;
> > > if (x != y)

> > That the above line compare the values stored in two pointer
> > variables, one of which is in an indeterminate state.

> I agree that the value of x is indeterminate. My quibble is that I
> fail to see why the act of comparing an indeterminate value with a
> (for want of better wording) determinate one is fundamentally unsafe.

Because the standard says that any use of an indeterminate value is
undefined behavior. Because there are machines which have trapping
representations for different types, and the indeterminate value may
trap.

> I'll even go so far as to accept that the result of the comparison is
> indeterminate, but it is also constrained to be either true or false.

Or core dump. Or ...

The standard doesn't make any requirements. In practice, given exactly
the snippet above, the only results I've seen are false, and a core
dump.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

[ See http://www.gotw.ca/resources/clcm.htm for info about ]

Hyman Rosen

unread,
May 19, 2004, 5:30:49 AM5/19/04
to
David Olsen wrote:
> But there have existed (and may still exist) computer architectures where
> simply loading an invalid address into a register will cause a trap.

And lest anyone think that this was some obscure platform that
didn't matter, we're talking about the Intel 286 processor here.

Francis Glassborow

unread,
May 19, 2004, 9:33:43 PM5/19/04
to
In message <40a81...@news.iprimus.com.au>, Rob
<nos...@nonexistant.com> writes

>"Francis Glassborow" <fra...@robinton.demon.co.uk> wrote in message
>news:xQxTKDIO...@robinton.demon.co.uk...
>> In message <40a6f...@news.iprimus.com.au>, Rob
>> <nos...@nonexistant.com> writes
>> >Really? What is unsafe with the following?
>> >
>> >#include <iostream>
>> >main()
>> >{
>> > int *x = new int;
>> > int *y = new int;
>> > delete x;
>> > if (x != y)
>>
>> That the above line compare the values stored in two pointer variables,
>> one of which is in an indeterminate state.
>>
>
>I agree that the value of x is indeterminate. My quibble is that I fail to
>see why the act of comparing an indeterminate value with a (for want
>of better wording) determinate one is fundamentally unsafe.
>
>I'll even go so far as to accept that the result of the comparison is
>indeterminate, but it is also constrained to be either true or false.

The Standard allows an indeterminate value to be a trap value.

--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions:
http://www.spellen.org/youcandoit/projects

Francis Glassborow

unread,
May 19, 2004, 9:39:56 PM5/19/04
to
In message <ygaad07...@ozelot.akutech-local.de>, Ralf Fassel
<ral...@gmx.de> writes

>* Francis Glassborow <fra...@robinton.demon.co.uk>
>| > int *x = new int;
>| > int *y = new int;
>| > delete x;
>| > if (x != y)
>|
>| That the above line compare the values stored in two pointer
>| variables, one of which is in an indeterminate state.
>
>I would not expect the value of x to change just because the object it
>points to is deleted.

Strictly speaking the value has changed:-) I think you are confusing a
bit pattern with a value. A value is the result of interpreting a bit
pattern in a certain context. Before the delete was applied to x, the
bit-pattern contained in x was the address of an int; after the delete,
regardless of whether the bit-pattern has or has not changed, it is no
longer the address of anything, not even of raw memory. I.e. its value
has changed.


--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions:
http://www.spellen.org/youcandoit/projects

Ralf Fassel

unread,
May 19, 2004, 9:57:57 PM5/19/04
to
* ka...@gabi-soft.fr

| The bit pattern may not change, but what the bit pattern means may
| be modified by the delete.

Forgive me for not having the standard at hand, but the crucial point
here seems to be the `delete' statement?

Or is it also invalid to do
int *foo;
if (foo == 0xDEADBEEF) // wow, what a coincidence

since `foo' might get initialized with a value just deleted in the
previous block?

R'

Gabriel Dos Reis

unread,
May 20, 2004, 7:16:50 AM5/20/04
to
Ralf Fassel <ral...@gmx.de> writes:

| * ka...@gabi-soft.fr
| | The bit pattern may not change, but what the bit pattern means may
| | be modified by the delete.
|
| Forgive me for not having the standard at hand, but the crucial point
| here seems to be the `delete' statement?
|
| Or is it also invalid to do
| int *foo;
| if (foo == 0xDEADBEEF) // wow, what a coincidence

Reading an uninitialized variable of a type other than character leads
to undefined behaviour.

--
Gabriel Dos Reis
g...@integrable-solutions.net

James Kanze

unread,
May 20, 2004, 9:35:22 AM5/20/04
to
Ralf Fassel <ral...@gmx.de> writes:

|> * ka...@gabi-soft.fr
|> | The bit pattern may not change, but what the bit pattern means may
|> | be modified by the delete.

|> Forgive me for not having the standard at hand, but the crucial
|> point here seems to be the `delete' statement?

|> Or is it also invalid to do
|> int *foo;
|> if (foo == 0xDEADBEEF) // wow, what a coincidence

Yes. It shouldn't even compile. You can't compare a pointer to an int.

Supposing a reinterpret_cast on the integral constant, or a declaration
of foo as an int, it is undefined behavior (unless foo has static
lifetime), since any access to an uninitialized variable is undefined
behavior.
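
(Concretely, a form that would at least compile; the cast is mine, not
Ralf's, and reading an uninitialized automatic foo is of course still
undefined behaviour:)

    void check()
    {
        int* foo;
        if (foo == reinterpret_cast<int*>(0xDEADBEEF))  // compiles, but reading foo is UB
            ;   // "wow, what a coincidence"
    }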

|> since `foo' might get initialized with a value just deleted in the
|> previous block?

Or any other trapping representation.

Trapping representations are allowed for all non-character types.

--
James Kanze


Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung

9 place Sémard, 78210 St.-Cyr-l'École, France +33 (0)1 30 23 00 34

James Kanze

unread,
May 20, 2004, 9:37:49 AM5/20/04
to
Hyman Rosen <hyr...@mail.com> writes:

|> David Olsen wrote:
|> > But there have existed (and may still exist) computer
|> > architectures where simply loading an invalid address into a
|> > register will cause a trap.

|> And lest anyone think that this was some obscure platform that
|> didn't matter, we're talking about the Intel 286 processor here.

Amongst others: the same thing is true on the 386 and all of its
followers as well.

For this to be a problem, three things are in fact necessary:

- The implementation traps on invalid addresses. All Intel processors
286 and up trap if an invalid segment is loaded; I think that there
are other processors with separate address registers which will
check when the register is loaded as well.

- The implementation invalidates addresses passed to free/operator
delete. This is not the case for modern Windows or Linux -- it was
the case for some real-time OS's, however; I've used at least two
which did this. (Note that at least under Linux, free/operator
delete does not normally even return the memory to the OS. This
seems to be an undocumented feature/bug of all Unix systems.)

- the compiler generates code which actually loads the address into an
address register, or in the case of Intel, loads the segment part of
the address into a segment register. This was the case for early
compilers, since they simply reused the code generation techniques
lifted from the 8086 compilers; loading a segment register in
protected mode is a very expensive operation, however, so most
compilers gradually changed their code generators to not do it
unless they really intended to dereference. And of course, today,
neither Windows nor Linux even allow user code to access the segment
registers -- the 640K will be enough for anyone syndrome strikes
again.

--
James Kanze
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France +33 (0)1 30 23 00 34

[ See http://www.gotw.ca/resources/clcm.htm for info about ]

llewelly

unread,
May 20, 2004, 12:40:35 PM5/20/04
to
Ralf Fassel <ral...@gmx.de> writes:

> * ka...@gabi-soft.fr
> | The bit pattern may not change, but what the bit pattern means may
> | be modified by the delete.
>
> Forgive me for not having the standard at hand, but the crucial point
> here seems to be the `delete' statement?
>
> Or is it also invalid to do
> int *foo;
> if (foo == 0xDEADBEEF) // wow, what a coincidence
>
> since `foo' might get initialized with a value just deleted in the
> previous block?

This case is actually simpler than the delete case. For the delete
case, the standard merely says the pointer is rendered invalid,
and that it is undefined behavior to 'use' an invalid pointer,
with no definition of 'use'.

Here we can refer to 4.1/1, which (approximately) says that if an
object is uninitialized, a program that necessitates
lvalue-to-rvalue conversion has undefined behavior. Conceptually,
lvalue-to-rvalue conversion is what happens when you 'read' the
value of an object. So this is undefined behavior.

(Now it seems 4.1/1 must be what John Potter was thinking of when he
declared that a 'use' was an lvalue-to-rvalue conversion.)

I am nearly certain there are only 4 things you can safely do with an
uninitialized pointer:

(a) assign it a new value.
(b) take its address.
(c) bind a reference to it.
(d) destroy it.
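
A minimal sketch of those four operations (the function name is mine,
just to make the list concrete):

    void demo()
    {
        int* p;             // uninitialized: reading its value would be undefined behaviour
        int** pp = &p;      // (b) take its address
        int*& rp = p;       // (c) bind a reference to it
        p = 0;              // (a) assign it a new value (assigning through rp is also fine)
        (void)pp; (void)rp;
    }                       // (d) p is destroyed here, at end of scope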

John Potter

unread,
May 21, 2004, 5:47:15 AM5/21/04
to
On 20 May 2004 12:40:35 -0400, llewelly <llewe...@xmission.dot.com>
wrote:

> (Now it seems 4.1/1 must be what John potter was thinking of when he
> declare that a 'use' was an lvalue-to-rvalue conversion.)

I hope I said it the other way. An lvalue to rvalue conversion is
a use of the value. Looking at an rvalue is also a use even when
there was no lvalue to convert.

John

llewelly

unread,
May 22, 2004, 5:39:15 AM5/22/04
to
John Potter <jpo...@falcon.lhup.edu> writes:

> On 20 May 2004 12:40:35 -0400, llewelly <llewe...@xmission.dot.com>
> wrote:
>
> > (Now it seems 4.1/1 must be what John potter was thinking of when he
> > declare that a 'use' was an lvalue-to-rvalue conversion.)
>
> I hope I said it the other way.

[snip]

Sorry, I misquoted you. You said 'An rvalue conversion is a use.' No
mention of lvalues.

http://xrl.us/b4m6 is a google groups link to the post.

James Dennett

unread,
May 16, 2004, 7:45:09 PM5/16/04
to
Rob wrote:

> "David Olsen" <qg4h9...@yahoo.com> wrote in message
> news:2g26r6F...@uni-berlin.de...
> > Frank Birbacher wrote:
> > > So what about passing around past-the-end iterators of arrays (which
> > > are invalid pointers, because you may not dereference them)?? Is
> > > passing them around undefined behaviour?? If not, what's the
> > > difference?
>
> It is quite acceptable to pass past-the-end iterators around, and compare
> them with other iterators. It is undefined behaviour to dereference them.
>
> >
> > A pointer that points just past the end of an array can be used safely
> > in a number of contexts, such as copying it or comparing it against a
> > different pointer that points within the same array. It just can't be
> > dereferenced. Whether or not you consider it to be a valid pointer
> > depends on your definition of valid. My preferred terminology is that
> > such a pointer is valid but not dereferencible.
>
> I suspect you're splitting hairs unnecessarily. If your definition
> of "valid" is something like "any variable of type pointer can
> represent this value", then all pointers will be valid.

It's valid in that there are a number of well-defined operations
on its value. For some pointers, the only things you can do with
them are read the bytes of their representations or assign
new values to the pointer.
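
A sketch of the first of those, reading the bytes of the representation
through unsigned char (whether the 1998 wording strictly blesses even
this is part of the debate, but it is the usual reading; the names are
mine):

    #include <cstddef>
    #include <cstdio>

    int main()
    {
        int* x = new int;
        delete x;
        // Look at the bytes of x's representation, not at its value as a pointer.
        const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&x);
        for (std::size_t i = 0; i != sizeof x; ++i)
            std::printf("%02X ", (unsigned)bytes[i]);
        std::printf("\n");
        return 0;
    }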

> > A pointer that has been deleted, however, is not valid in any way. It
> > can't be used safely in any context.
> >
>
> Really? What is unsafe with the following?
>
> #include <iostream>

Pedantically, missing #include <ostream>.

> main()

Missing return type, diagnosed by most compilers I use, and
required in C++ for a long, long time.

> {
> int *x = new int;
> int *y = new int;
> delete x;

x now has an indeterminate value...

> if (x != y)

...which is read here, giving undefined behavior. Trapping
would be reasonable, IMO. A warning at compile-time would
also be quite possible in this situation.

-- James

Allan W

unread,
May 25, 2004, 8:43:08 AM5/25/04
to
James Kanze <ka...@gabi-soft.fr> wrote

> And of course, today,
> neither Windows nor Linux even allow user code to access the segment
> registers -- the 640K will be enough for anyone syndrome strikes
> again.

In today's 32-bit systems, a single segment can be up to 4 gigabytes.
In 64-bit systems, a single segment can be up to 16384 terabytes.
(Is there a name for 1024 terabytes?) {petabyte mod/fwg}

Guilty then of "the 16384T will be enough for anyone syndrome" which
might be considered shortsighted someday -- but perhaps not quite yet.

ka...@gabi-soft.fr

unread,
May 26, 2004, 10:06:20 AM5/26/04
to
all...@my-dejanews.com (Allan W) wrote in message
news:<7f2735a5.04052...@posting.google.com>...

> James Kanze <ka...@gabi-soft.fr> wrote
> > And of course, today, neither Windows nor Linux even allow user
> > code to access the segment registers -- the 640K will be enough
> > for anyone syndrome strikes again.

> In today's 32-bit systems, a single segment can be up to 4 gigabytes.
> In 64-bit systems, a single segment can be up to 16384 terabytes.
> (Is there a name for 1024 terabytes?) {petabyte mod/fwg}

> Guilty then of "the 16384T will be enough for anyone syndrome" which
> might be considered shortsighted someday -- but perhaps not quite yet.

The Intel based Windows and Linux systems I know are mainly 32 bit
systems. And IMHO, the "4 Gb will be enough for anyone" is already
short-sighted.

If the processor can do it, the OS shouldn't get in the way.

--
James Kanze GABI Software

Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

toddmars...@yahoo.com

unread,
May 26, 2004, 7:51:42 PM5/26/04
to
> >> That summarises the defined standard behaviour. It may be surprising to
> >> learn that copying a pointer of type 4. leads to undefined behaviour.
>
> Somehow, your quoting went crazy. These two lines were not written
> by James, but by myself.
>
> >How is a compiler supposed to implement these 'standard' behaviours in
> >run-time?
>
> This question I don't grasp. Whar "standard behaviours" are you
> mentioning? The compiler does not have to do anything to "implement"
> "undefined behaviour" The standard says that any code (or non-code)
> for undefined behaviour is OK. Even when happens what you are expecting
> is OK for undefined behaviour ;-)
Sorry about the quoting. Trying to be concise, have to do it by hand.

---So if you copy an invalid pointer you get undefined behavior? Seems
to me that copying a variable to another of the same type should be
easy for a compiler to implement, and if not (whatever the contents)
the compiler inserts code to check if the contents of the variable is
an invalid pointer. The compiler knows the pointer is invalid;
otherwise it copies a variable.
---Also, if a pointer is invalid, it is already loaded into an address
register and would cause a fault before it is copied. The discussions
about copying invalid pointers don't account for the source of the
copy.
If it is NOT in an address register, the compiler has inserted code to
test if the pointer is valid, (else it could be in an address
register) thus adding extra code I feel is weird, testing for invalid
pointers to 'implement standard behaviours' which are undefined. Yes,
it is a contradiction, which I am complaining about.
So we are saying that copying pointers can de-reference them if only
temporarily. I guess I can't complain about that.
Todd.

llewelly

unread,
May 26, 2004, 7:54:17 PM5/26/04
to
ka...@gabi-soft.fr writes:

> all...@my-dejanews.com (Allan W) wrote in message
> news:<7f2735a5.04052...@posting.google.com>...
>> James Kanze <ka...@gabi-soft.fr> wrote
>> > And of course, today, neither Windows nor Linux even allow user
>> > code to access the segment registers -- the 640K will be enough
>> > for anyone syndrome strikes again.
>
>> In today's 32-bit systems, a single segment can be up to 4 gigabytes.
>> In 64-bit systems, a single segment can be up to 16384 terabytes.
>> (Is there a name for 1024 terabytes?) {petabyte mod/fwg}
>
>> Guilty then of "the 16384T will be enough for anyone syndrome" which
>> might be considered shortsighted someday -- but perhaps not quite yet.
>
> The Intel based Windows and Linux systems I know are mainly 32 bit
> systems. And IMHO, the "4 Gb will be enough for anyone" is already
> short-sighted.
>
> If the processor can do it, the OS shouldn't get in the way.

I think if an implementation on IA32 used segment:offset combos to
represent C++ pointers, the C++ object model would prevent
objects larger than 4GB, unless it laid out the segments in a
linear fashion, and generated more complicated code for pointer
arithmetic. But 4GB objects should be enough for anyone ...

Ben Hutchings

unread,
May 27, 2004, 5:30:20 PM5/27/04
to
llewelly wrote:
> ka...@gabi-soft.fr writes:
<snip>

>> The Intel based Windows and Linux systems I know are mainly 32 bit
>> systems. And IMHO, the "4 Gb will be enough for anyone" is already
>> short-sighted.
>>
>> If the processor can do it, the OS shouldn't get in the way.
>
> I think if an implementation on IA32 used segment:offset combos to
> represent C++ pointers, the C++ object model would prevent
> objects larger than 4GB, unless it laid out the segments in a
> linear fashion, and generated more complicated code for pointer
> arithmetic.

That sort of thing has been done before by C compilers for DOS (where
the limit would normally be 64 KB per object); it was called "huge
model".

> But 4GB objects should be enough for anyone ...

An array is a single object and some programs store almost all their
data in a few arrays.

ka...@gabi-soft.fr

unread,
May 28, 2004, 12:02:58 PM5/28/04
to
Ben Hutchings <do-not-s...@bwsint.com> wrote in message
news:<slrncbbjg3.1pc....@shadbolt.i.decadentplace.org.uk>...

> llewelly wrote:
>> ka...@gabi-soft.fr writes:
> <snip>
>>> The Intel based Windows and Linux systems I know are mainly 32 bit
>>> systems. And IMHO, the "4 Gb will be enough for anyone" is already
>>> short-sighted.

>>> If the processor can do it, the OS shouldn't get in the way.

>> I think if an implementation on IA32 used segment:offset combos to
>> represent C++ pointers, the C++ object model would prevent
>> objects larger than 4GB, unless it laid out the segments in a
>> linear fashion, and generated more complicated code for pointer
>> arithmetic.

> That sort of thing has been done before by C compilers for DOS (where
> the limit would normally be 64 KB per object); it was called "huge
> model".

That sort of thing only works if you have some sort of arithmetical
mapping of segment:offset into a physical address. It doesn't work in
protected mode.

>> But 4GB objects should be enough for anyone ...

> An array is a single object and some programs store almost all their
> data in a few arrays.

It's more a question of: if the hardware doesn't allow it...

And 4GB per object can be restrictive for certain applications. Those
applications simply cannot run on an IA-32 architecture. On the other
hand, 4GB total is restrictive for a lot more applications, and those
applications could run on an IA-32 architecture, if the OS allows it.
(Curiously enough, the only OS I've used which did allow it was one
designed for small, embedded applications, which typically don't have
much memory. And I used it about 15 years ago, when 4GB really was a
lot of memory.)

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

[ See http://www.gotw.ca/resources/clcm.htm for info about ]

Allan W

unread,
May 29, 2004, 7:50:19 AM5/29/04
to
Ben Hutchings <do-not-s...@bwsint.com> wrote

> llewelly wrote:
> > ka...@gabi-soft.fr writes:
> <snip>
> >> The Intel based Windows and Linux systems I know are mainly 32 bit
> >> systems. And IMHO, the "4 Gb will be enough for anyone" is already
> >> short-sighted.
> >>
> >> If the processor can do it, the OS shouldn't get in the way.

I think that's exactly what Microsoft Windows 95 (and more recent) have
done. There have been differences from one version to the next, but in
general it's been possible to allocate as much virtual memory as you
want in a single segment, subject only to hardware limitations.

Intel 16-bit processors (8086) support up to 64K segments of up to 64KB
each, and the maximum physical address space is 1MB. The first 32-bit
processors (386, 486, Pentium, et al.) support up to 4G segments of up to
4GB each, and the maximum total physical memory is also 4GB. Since
loading the segment registers takes a relatively long time, and since
each segment can be up to 100% of the physical memory, 32-bit Microsoft
Windows platforms support the "flat model" where we don't even bother
loading the segment registers (except of course on a task switch). But
that doesn't mean it HAS to work this way.

I stopped paying attention to processor limits. I'll assume that time
has marched on, and so the newer processors can address more than 4GB
of physical memory. If this is true, all that's required in order to
support this, is using a 64-bit pointer consisting of a 32-bit segment
and a 32-bit offset -- and using some care to ensure that pointer
arithmetic doesn't alter the segment.

> > I think if an implementation on IA32 used segment:offset combos to
> > represent C++ pointers, the C++ object model would prevent
> > objects larger than 4GB, unless it laid out the segments in a
> > linear fashion, and generated more complicated code for pointer
> > arithmetic.
>
> That sort of thing has been done before by C compilers for DOS (where
> the limit would normally be 64 KB per object); it was called "huge
> model".
>
> > But 4GB objects should be enough for anyone ...
>
> An array is a single object and some programs store almost all their
> data in a few arrays.

But even then, if you need more than 4GB for your program it's probably
because you have a small number (one?) of special huge objects. If these
go into their own segments, then 4GB should be enough for all the rest.
So I'd have to say that I agree, at least for the next decade or so,
that 4GB objects *should* be enough.

llewelly

unread,
May 30, 2004, 6:44:09 AM5/30/04
to
all...@my-dejanews.com (Allan W) writes:

> Ben Hutchings <do-not-s...@bwsint.com> wrote
> > llewelly wrote:

[snip]


> > > I think if an implementation on IA32 used segment:offset combos to
> > > represent C++ pointers, the C++ object model would prevent
> > > objects larger than 4GB, unless it laid out the segments in a
> > > linear fashion, and generated more complicated code for pointer
> > > arithmetic.
> >
> > That sort of thing has been done before by C compilers for DOS (where
> > the limit would normally be 64 KB per object); it was called "huge
> > model".

But, AFAIR, none of them laid the segments in a linear fashion, and
none of them supported objects larger than 4 GB.

> >
> > > But 4GB objects should be enough for anyone ...
> >
> > An array is a single object and some programs store almost all their
> > data in a few arrays.
>
> But even then, if you need more than 4GB for your program it's probably
> because you have a small number (one?) of special huge objects. If these
> go into their own segments, then 4GB should be enough for all the
> rest. So I'd have to say that I agree, at least for the next decade or so,
> that 4GB objects *should* be enough.

Actually, I left out a smiley. :-) I don't believe '4GB objects should
be enough for anyone' - it's easy enough to find numerical
simulations that use 4 or 8 GB for one array. I *do* believe it
is enough for many kinds of programs, but not for 'anyone', even
today.

Ben Hutchings

unread,
May 30, 2004, 9:58:11 PM5/30/04
to
llewelly wrote:

> all...@my-dejanews.com (Allan W) writes:
>
>> Ben Hutchings <do-not-s...@bwsint.com> wrote
>> > llewelly wrote:
> [snip]
>> >> I think if an implementation on IA32 used segment:offset combos to
>> >> represent C++ pointers, the C++ object model would prevent
>> >> objects larger than 4GB, unless it laid out the segments in a
>> >> linear fashion, and generated more complicated code for pointer
>> >> arithmetic.
>> >
>> > That sort of thing has been done before by C compilers for DOS (where
>> > the limit would normally be 64 KB per object); it was called "huge
>> > model".
>
> But, AFAIR, none of them laid the segements in a linear fashion, and
> none of them supported objects larger than 4 GB.
<snip>

In DOS, without an extender, every segment is a 64 KB region
overlapping the previous one but offset by 16 bytes. In the huge
model, pointers would be "far" pointers (i.e. tuples of segment,
offset) by default and pointer arithmetic involved some adjustment to
ensure that it worked as if they were linear addresses. So an object
could, I think, be as large as the full address space, i.e. 1 MB,
whereas the limit would normally be 64 KB.

I'm saying that perhaps a similar technique could be applied to break
through the 4 GB limit of linear addresses on x86, though I doubt
anyone will try this since switching to x86-64 won't be so hard for
customers as switching away from DOS was.

Dave Harris

unread,
May 31, 2004, 6:33:14 PM5/31/04
to
toddmars...@yahoo.com () wrote (abridged):

> ---So if you copy an invalid pointer you get undefined behavior?

Yes.


> Seems to me that copying a variable to another of the same type
> should be easy for a compiler to implement

It is.


> and if not (whatever the contents) the compiler inserts code to check
> if the contents of the variable is an invalid pointer.

Allowing behaviour to be undefined here eliminates the need for such
checks.


> ---Also, if a pointer is invalid, it is already loaded into an address
> register and would cause a fault before it is copied.

Presumably whatever hardware mechanism makes the pointer a trapping value,
also deals with copies of that value in registers. I don't actually know
how this works - perhaps they magically become 0.


> If it is NOT in an address register, the compiler has inserted code to
> test if the pointer is valid, (else it could be in an address
> register) thus adding extra code I feel is weird, testing for invalid
> pointers to 'implement standard behaviours' which are undefined.

There's no need for such extra code. "Undefined behaviour" means the
compiler can let invalid pointers through to the hardware.

-- Dave Harris, Nottingham, UK

Balog Pal

unread,
Jun 1, 2004, 5:48:56 AM6/1/04
to
"Ben Hutchings" <do-not-s...@bwsint.com> wrote in message
news:slrncbkbq2.1pc....@shadbolt.i.decadentplace.org.uk...

> In DOS, without an extender, every segment is a 64 KB region
> overlapping the previous one but offset by 16 bytes. In the huge
> model, pointers would be "far" pointers (i.e. tuples of segment,
> offset) by default and pointer arithmetic involved some adjustment to
> ensure that it worked as if they were linear addresses. So an object
> could, I think, be as large as the full address space, i.e. 1 MB,
> whereas the limit would normally be 64 KB.

No, a huge pointer differs from a far pointer only in that it is kept
normalised, meaning its offset part is kept below 16, allowing you to
reach almost 64k ahead. With far pointers, arithmetic keeps the segment as
is and changes the offset part only.

With huge you can have a huge *array* object, with access to elements via
pointer math, but no luck with elements bigger than 64k.
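
A small sketch of the real-mode arithmetic behind that normalisation
(8086-style 16-byte paragraphs assumed; the function names are mine):

    #include <cstdio>

    unsigned long linear(unsigned seg, unsigned off)
    {
        return 16UL * seg + off;            // how the hardware forms the address
    }

    void normalise(unsigned long lin, unsigned& seg, unsigned& off)
    {
        seg = (unsigned)(lin >> 4);         // push as much as possible into the segment
        off = (unsigned)(lin & 0xF);        // so the offset stays below 16
    }

    int main()
    {
        unsigned seg, off;
        normalise(linear(0x1234, 0x5678), seg, off);
        std::printf("%04X:%04X\n", seg, off);   // prints 179B:0008
        return 0;
    }

After normalisation, plain offset arithmetic can run almost 64K forward
before wrapping, which is what makes huge arrays (though not huge
elements) workable.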

> I'm saying that perhaps a similar technique could be applied to break
> through the 4 GB limit of linear addresses on x86

AFAIK segments are available in WIN32 drivers. So if someone is interested
he can write such a driver and use it as an extended malloc, then think of
some way to "do stuff" with the memory. :) Just like with XMS back in DOS. I
never liked it due to copying/speed issues. EMS, using paging into a
window, was way better, and in fact pretty easy to use. And fast too. I
think you can use similar tech within WIN32 as file mapping, which can be
served without ever flushing to disk by a sane OS, provided it has the
physical memory. (While any other/regular memory is subject to paging, so
there's little difference.)

>, though I doubt
> anyone will try this since switching to x86-64 won't be so hard for
> customers as switching away from DOS was.

I agree with this one. There may be a need to go a little over the limit,
say to a few times 2G of space. There it may be worth some struggle to
keep working in the "old" world, using mapping, or a process pool. But
generally it's better to move those applications to 64-bit instead of the
hackery.

Paul

Ben Hutchings

unread,
Jun 1, 2004, 2:53:19 PM6/1/04
to
Balog Pal wrote:
> "Ben Hutchings" <do-not-s...@bwsint.com> wrote in message
> news:slrncbkbq2.1pc....@shadbolt.i.decadentplace.org.uk...
>
> > In DOS, without an extender, every segment is a 64 KB region
> > overlapping the previous one but offset by 16 bytes. In the huge
> > model [...] an object

> > could, I think, be as large as the full address space, i.e. 1 MB,
> > whereas the limit would normally be 64 KB.
<snip>
> With huge you can have a huge *array* object, with access to elements via
> pointer math, but no luck with elements bigger than 64k.

An array is an object, so you're just confirming what I said. You
can't have class instances larger than 64 KB, but then you can't have
them in many 32-bit implementations of C++ since they store member
offsets in 16-bit fields in instructions and in member-function-
pointers.

> > I'm saying that perhaps a similar technique could be applied to break
> > through the 4 GB limit of linear addresses on x86
>
> AFAIK segments are available in WIN32 drivers.

<snip>

You are right that a driver could manipulate segments, but that should
be true for any x86 OS, and both Linux and NT provide user-mode APIs
for manipulating segments (with appropriate security checks, at least
in theory). However, all segments are normally mapped onto a single
virtual address space and I don't know whether the architecture or
operating systems provide any way around that.

James Kanze

unread,
Jun 2, 2004, 8:51:24 AM6/2/04
to
bran...@cix.co.uk (Dave Harris) writes:

|> > ---Also, if a pointer is invalid, it is already loaded into an
|> > address register and would cause a fault before it is copied.

|> Presumably whatever hardware mechanism makes the pointer a trapping
|> value, also deals with copies of that value in registers. I don't
|> actually know how this works - perhaps they magically become 0.

Normally, the only way you can convert a valid address to an invalid one
would be by programming the memory controller. In doing that, the
addresses you use are those of the memory controller, and the address
which becomes invalid (or more likely, just parts of it) is just plain
data, which is output, or perhaps memcpy'ed or otherwise transferred (as
raw bytes or words) to the MMU (which may itself be memory mapped).

--
James Kanze
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France +33 (0)1 30 23 00 34
