Pointer comparison and undefined behavior

Bjorn Reese

Aug 30, 2004, 1:56:52 PM

Assume that I have a large, and properly initialized, char
array. I have a char pointer that may or may not point into
the array.

Is there any way to determine if the char pointer points
into the array, without invoking undefined behavior in case
the pointer does not point into the array? (and without
comparing for equality with each location in the array)

The code fragment I have in mind is this:

if ( (pointer >= array) && (pointer < array + sizeof(array)) )

Reference: C99 6.5.8/5

--
mail1dotstofanetdotdk

Jack Klein

Aug 30, 2004, 11:46:38 PM

On Mon, 30 Aug 2004 19:56:52 +0200, "Bjorn Reese"
<bre...@see.signature> wrote in comp.std.c:

> Assume that I have a large, and properly initialized, char
> array. I have a char pointer that may or may not point into
> the array.
>
> Is there any way to determine if the char pointer points
> into the array, without invoking undefined behavior in case
> the pointer does not point into the array? (and without
> comparing for equality with each location in the array)

No, even though the undefined behavior on most platforms would be to
work as you want it to.

> The code fragment I have in mind is this:
>
> if ( (pointer >= array) && (pointer < array + sizeof(array)) )
>
> Reference: C99 6.5.8/5

You neglected to quote the last sentence of that paragraph:

"In all other cases, the behavior is undefined."

There is really no way to give it defined behavior. Even converting
the pointers to (u)intptr_t, assuming the platform defines these
types, doesn't help because there is no guarantee that any arithmetic
differences between the resulting integers represent those between the
pointers, and there are a few platforms out there where the
relationship is at the very least non-intuitive.

So this is one of those things that you do if you really need to on a
platform where you know it works as needed. But add a big comment
with lots of asterisks and exclamation points about the fact that it
might not be portable.
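
A minimal sketch of what such a loudly commented, platform-specific check might look like, assuming a flat address space in which converting a pointer to uintptr_t yields its linear address. The name points_into_flat is hypothetical, and nothing in the standard guarantees that the answer is meaningful for a pointer outside the array:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * !!! NOT PORTABLE !!! *************************************************
 * The pointer-to-integer conversions are implementation-defined, and the
 * C standard does not promise that integer order matches address order.
 * Re-check this on every new platform!
 * **********************************************************************
 */
static bool points_into_flat(const char *p, const char *base, size_t size)
{
    uintptr_t ip = (uintptr_t)(const void *)p;
    uintptr_t ib = (uintptr_t)(const void *)base;

    /* The integer comparison is always defined; its meaning is not. */
    return ip >= ib && ip - ib < size;
}
```

Unlike the relational comparison of unrelated pointers, this does not invoke undefined behavior, but whether the result reflects the actual memory layout still depends entirely on the platform.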

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html

Douglas A. Gwyn

Aug 31, 2004, 2:20:55 AM

Bjorn Reese wrote:
> Assume that I have a large, and properly initialized, char
> array. I have a char pointer that may or may not point into
> the array.
> Is there any way to determine if the char pointer points
> into the array, without invoking undefined behavior in case
> the pointer does not point into the array? (and without
> comparing for equality with each location in the array)

No. Why would you encounter such a requirement?

Christian Bau

Aug 31, 2004, 3:12:09 AM

In article <RvudnW7raMl...@comcast.com>,

/* The pointer p must be pointing to an element of array a */
assert (p >= a && p < a + sizeof (a) / sizeof (a[0]));

On many implementations, this will do exactly what it is supposed to do
except that it doesn't check if the pointer p has correct alignment, and
will sometimes fail to assert when p is an undefined pointer. This is a
very useful but non-portable debugging tool.

By the way, if the C Standard stated that (size_t) (p-q) sometimes gives
an _unspecified_ result, but not an _undefined_ result, then I could do
the test correctly myself:

size_t distance = (size_t) (p - &a [0]);

if (distance < sizeof (a) / sizeof (a [0]) && p == &a [distance])
...

Is there any justification involving non-hypothetical implementations of
C for (size_t) (p-q) being undefined when p and q point to different
arrays instead of being just unspecified?

pete

Aug 31, 2004, 8:40:27 AM

Implementing memmove() in C.

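void *memmove(void *s1, const void *s2, size_t n)
{
    unsigned char *p1 = s1;
    const unsigned char *p2 = s2;

    /* Step back from s2 + n, testing each position for equality with s1. */
    p2 += n;
    while (p2 != s2 && --p2 != s1) {
        ;
    }
    if (p2 != s2) {
        /* s1 lies inside the source past its first byte: copy backwards. */
        p2 = s2;
        p2 += n;
        p1 += n;
        while (n-- != 0) {
            *--p1 = *--p2;
        }
    } else {
        /* No harmful overlap: copy forwards. */
        while (n-- != 0) {
            *p1++ = *p2++;
        }
    }
    return s1;
}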

--
pete

Dan Pop

Aug 31, 2004, 10:41:47 AM

In <413471...@mindspring.com> pete <pfi...@mindspring.com> writes:

>Douglas A. Gwyn wrote:
>>
>> Bjorn Reese wrote:
>> > Assume that I have a large, and properly initialized, char
>> > array. I have a char pointer that may or may not point into
>> > the array.
>> > Is there any way to determine if the char pointer points
>> > into the array, without invoking undefined behavior in case
>> > the pointer does not point into the array? (and without
>> > comparing for equality with each location in the array)
>>
>> No. Why would you encounter such a requirement?
>
>Implementing memmove() in C.

memmove() is part of the standard C library so that it doesn't have to
be implemented in (portable) C. Honestly, I would hope for a hand
optimised assembly version instead of your extremely naive C version.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Dan...@ifh.de

Bjorn Reese

Aug 31, 2004, 3:56:44 PM

On Tue, 31 Aug 2004 02:20:55 -0400, Douglas A. Gwyn wrote:

> No. Why would you encounter such a requirement?

Thank you for the answer.

The need for this type of comparison came from an optimization
of string handling. The code in question is part of a dictionary
data structure. The explanation and examples below are conceptual
(I did not write the original code, so I hope I got it right.)

The code stores all strings in a special data structure, which
consists of a linked list. Each entry contains a char array.

struct chunk {
    struct chunk *next;
    char *array;
    size_t array_size;
};

The array contains many strings (separated by zero). Pointers to
the individual strings are maintained elsewhere.

A lookup function needs to determine if a given string is already
in the data structure or not. This function iterates through the
linked list, and examines if the array in the current chunk
contains the string.

for ( current = first_chunk;
      current != NULL;
      current = current->next) {
    if ( (pointer >= current->array) &&
         (pointer < current->array + current->array_size) ) {
        return 1; /* Found */
    }
}
return 0; /* Not found */

The actual implementation (xmlDictOwns) can be found at:

http://cvs.gnome.org/viewcvs/libxml2/dict.c?view=markup

--
mail1dotstofanetdotdk

pete

Aug 31, 2004, 6:45:20 PM

I write portable versions of standard library functions in C,
so that I can post code examples for discussion,
without having to explain what the code is supposed to do.

The first part of the function definition shows a portable way
to tell if a pointer points into an array, as per OP's question.
The existence of the memmove function itself,
is an example of when you might "encounter such a requirement".

--
pete

Jack Klein

Aug 31, 2004, 10:08:35 PM

On Tue, 31 Aug 2004 21:56:44 +0200, "Bjorn Reese"
<bre...@see.signature> wrote in comp.std.c:

> On Tue, 31 Aug 2004 02:20:55 -0400, Douglas A. Gwyn wrote:

As I said, if you need to do it, make sure it works on your original
platform and just do it. Mark it with a comment that it needs to be
checked on porting, but the implementations where it will not work are
few and far between.

Note that if the objects in the array are larger than 1 byte in size,
however, this will not verify that this is actually a pointer to one
of the objects.

Douglas A. Gwyn

Sep 1, 2004, 2:05:23 AM

Christian Bau wrote:
> /* The pointer p must be pointing to an element of array a */
> assert (p >= a && p < a + sizeof (a) / sizeof (a[0]));

No. This can fail miserably if p does not in fact
lie within the range of a. This being comp.std.c,
when the fellow asks if there is a way, we presume
he is asking for a way that is guaranteed to work
according to the C standard.

> Is there any justification involving non-hypothetical implementations of
> C for (size_t) (p-q) being undefined when p and q point to different
> arrays instead of being just unspecified?

Yes. The subtraction is meaningless if p and q
point into different segments.

Douglas A. Gwyn

Sep 1, 2004, 2:10:43 AM

pete wrote:

> Douglas A. Gwyn wrote:
>>No. Why would you encounter such a requirement?

> Implementing memmove() in C. ...

It's not needed for memmove, just for a particular
attempt at a faster tweak to memmove. Presumably
the best implementation of memmove for a given
platform would be obtained by less generic code
anyway, or if not, a flat global data address space
exists on the platform so that such code can be
used there. On platforms with different addressing
architectures, requiring the compiler to support an
artificial nonsegmented address space would slow
things down substantially for essentially every
application, and the "optimization" of memmove
might turn out to be slower too in that case.

Douglas A. Gwyn

Sep 1, 2004, 2:21:22 AM

Bjorn Reese wrote:
> A lookup function needs to determine if a given string is already
> in the data structure or not.

More likely, you want to find out which chunk a
managed string is already assigned to, although I'm
not sure what good use could subsequently be made of
that information.

If you just wanted to know whether you have an
unmanaged string pointer or a managed string pointer,
that should be done by using separate types (i.e. a
typedef for a managed-string pointer) and not
jumbling them together at the managed-string package
interface.
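
One way the separate-type idea could look in practice, as a rough sketch only. It uses a small struct wrapper rather than a bare typedef, since a typedef alone does not create a distinct type in C. The names managed_str, managed_intern, managed_length and the fixed-size pool are all hypothetical and are not taken from the code under discussion:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical handle type: only the string package creates these. */
typedef struct managed_str {
    const char *chars;   /* always points into package-owned storage */
} managed_str;

static char pool[256];   /* hypothetical package-owned storage */
static size_t pool_used;

/* Copy s into the pool and return a handle; chars is NULL if the pool is full. */
static managed_str managed_intern(const char *s)
{
    managed_str h = { NULL };
    size_t len = strlen(s) + 1;

    if (len <= sizeof pool - pool_used) {
        memcpy(pool + pool_used, s, len);
        h.chars = pool + pool_used;
        pool_used += len;
    }
    return h;
}

/* Functions that require a managed string take the handle type, so no
   run-time "is this pointer ours?" test is needed. */
static size_t managed_length(managed_str h)
{
    return strlen(h.chars);
}
```

With this shape the compiler rejects a plain char * where a managed_str is expected, which replaces the pointer range test with a compile-time check.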

Dan Pop

Sep 1, 2004, 7:57:48 AM

In <UMqdnXcCne4...@comcast.com> "Douglas A. Gwyn" <DAG...@null.net> writes:

>Christian Bau wrote:
>
>> Is there any justification involving non-hypothetical implementations of
>> C for (size_t) (p-q) being undefined when p and q point to different
>> arrays instead of being just unspecified?
>
>Yes. The subtraction is meaningless if p and q
>point into different segments.

This does NOT explain why the result couldn't be unspecified instead of
undefined.

Dan Pop

Sep 1, 2004, 8:00:09 AM

In <4134FF...@mindspring.com> pete <pfi...@mindspring.com> writes:

>The existence of the memmove function itself,
>is an example of when you might "encounter such a requirement".

It's a non-example, because in this particular case you can simply use
the standard library function, which need not be implemented in portable
C code.

Christian Bau

Sep 1, 2004, 12:20:09 PM

In article <UMqdnXcCne4...@comcast.com>,

"Douglas A. Gwyn" <DAG...@null.net> wrote:

> Christian Bau wrote:
> > /* The pointer p must be pointing to an element of array a */
> > assert (p >= a && p < a + sizeof (a) / sizeof (a[0]));
>
> No. This can fail miserably if p does not in fact
> lie within the range of a. This being comp.std.c,
> when the fellow asks if there is a way, we presume
> he is asking for a way that is guaranteed to work
> according to the C standard.

You are missing the point. In your previous post you didn't seem to
understand the need for the test whether a pointer p points into an
array. Most people would have understood by the comment that precedes
the code what the code is _supposed_ to do, giving an example for the
_need_ for that kind of information.

> > Is there any justification involving non-hypothetical implementations of
> > C for (size_t) (p-q) being undefined when p and q point to different
> > arrays instead of being just unspecified?
>
> Yes. The subtraction is meaningless if p and q
> point into different segments.

You are missing the point again. If the subtraction is meaningless, why
can't the result be just _unspecified_? Are there any implementations
where code that would calculate the pointer difference correctly when
the result is specified by the C Standard would do something other than
producing an arbitrary value for unrelated pointers?

If you can't see why this would be valuable, then I am willing to
explain it very slowly to you.

Barry Margolin

Sep 1, 2004, 11:22:54 PM

In article
<christian.bau-7FB...@slb-newsm1.svr.pol.co.uk>,
Christian Bau <christ...@cbau.freeserve.co.uk> wrote:

What difference does it make? On flat-memory architectures, both the
comparison and subtraction will do what you expect, so it doesn't matter
what the standard says. On other architectures, neither expression will
do anything useful, so the code won't work on those systems.

--
Barry Margolin, bar...@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***

Douglas A. Gwyn

Sep 2, 2004, 1:15:33 AM

Dan Pop wrote:
> "Douglas A. Gwyn" <DAG...@null.net> writes:
>>Yes. The subtraction is meaningless if p and q
>>point into different segments.
> This does NOT explain why the result couldn't be unspecified instead of
> undefined.

Sure it does. Unspecified behavior requires that there
be a set of meaningful alternatives from which the
implementation can choose.

Christian Bau

Sep 2, 2004, 2:47:22 AM

In article <barmar-6463A1....@comcast.dca.giganews.com>,
Barry Margolin <bar...@alum.mit.edu> wrote:

> In article
> <christian.bau-7FB...@slb-newsm1.svr.pol.co.uk>,
> Christian Bau <christ...@cbau.freeserve.co.uk> wrote:
>
> > In article <UMqdnXcCne4...@comcast.com>,
> > "Douglas A. Gwyn" <DAG...@null.net> wrote:
> > > Yes. The subtraction is meaningless if p and q
> > > point into different segments.
> >
> > You are missing the point again. If the subtraction is meaningless, why
> > can't the result be just _unspecified_? Are there any implementations
> > where code that would calculate the pointer difference correctly when
> > the result is specified by the C Standard would do something other than
> > producing an arbitrary value for unrelated pointers?
> >
> > If you can't see why this would be valuable, then I am willing to
> > explain it very slowly to you.
>
> What difference does it make? On flat-memory architectures, both the
> comparison and subtraction will do what you expect, so it doesn't matter
> what the standard says. On other architectures, neither expression will
> do anything useful, so the code won't work on those systems.

I take this as an invitation for a slow and careful explanation.

Let's say I have an array

double a [100];

and a pointer

double* p;

I know that the pointer p points to some object of type double, and I
wish to know whether p points to any element of the array a. The only
portable way to find out that p points to any element of a is to run a
loop for 0 <= i < 100 and compare p == &a[i].
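
For reference, that portable loop might be written roughly as follows. The function name points_to_element is hypothetical; the sketch relies only on ==, which is defined for any two valid pointers:

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if p compares equal to the address of some a[i] with i < n.
   Only == is applied to the pointers, never < or -. */
static bool points_to_element(const double *p, const double *a, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (p == &a[i]) {
            return true;
        }
    }
    return false;
}
```

The cost is linear in the array length, which is the price of staying within what the standard defines.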

The obvious test (p >= &a[0] && p < &a[100]) will produce undefined
behavior if p doesn't point to any array element and can therefore not
be used. Undefined behavior that I have personally encountered was
producing a TRUE result when p did _not_ point to any array element.

But if (size_t) (p - &a[0]) would produce an _unspecified_ result, then
I can do this test very easily:

size_t d = (size_t) (p - &a[0]);
// There are two possibilities: If p points to a[i] then d = i,
// if p doesn't point to any array element then d is unspecified.

if (d >= 100) {
    // If p points to a[i] then d = i and therefore d < 100. So
    // d >= 100 proves that p doesn't point to an array element
    ... Does not point to array element ...
} else {
    // We have 0 <= d < 100, which means we can calculate q. q will
    // point to some element of array a.
    double* q = &a [d];

    if (p == q) {
        // q points to an element of array a, p and q compare equal,
        // therefore p points to an element of array a. Note that I
        // specified that p actually points to an object of type
        // double to avoid the case that p points past the last element
        // of an array which happens to be located just in front of a.
        ... Does point to an array element ...
    } else {
        // If p had pointed to an element of a, for example a [i], then
        // d would be equal to i, and q would be equal to p. This is not
        // the case, so p does not point to an element of a.
        ... Does not point to an array element.
    }
}

I have the impression that some people use the reasoning "Undefined
behavior in C" => "This couldn't possibly be useful" => "Undefined
behavior in C is fine, because it couldn't possibly be useful".

Christian Bau

Sep 2, 2004, 2:58:06 AM

In article <s6ydnT51IfT...@comcast.com>,

"Douglas A. Gwyn" <DAG...@null.net> wrote:

It doesn't. Not in my C Standard. It requires that there are two or more
alternatives with no requirement on which is chosen in any instance.

Pointer comparison for unrelated pointers could very easily be
unspecified behavior. The two obvious alternatives are that p >= q
yields either a value of 1 or a value of 0, with no further
requirements. For pointer difference, the obvious alternatives are that
p-q produces any value which does not lie outside the minimum and
maximum values of type ptrdiff_t.

If you can find the words "meaningful alternatives" anywhere in the C
Standard, then please tell me. If you know of an implementation where
straightforward code to calculate pointer differences that are defined
by the C Standard will do anything other than yielding values of type
ptrdiff_t for the cases not defined by the C Standard then please tell
me.

James Kuyper

Sep 2, 2004, 7:13:10 AM

"Douglas A. Gwyn" <DAG...@null.net> wrote in message news:<s6ydnT51IfT...@comcast.com>...

How about: the result must be an integer in the valid range of
ptrdiff_t that is not the same as the result from p-r, for any r that
is a pointer into or one past the end of the same array that p points
into? That would be sufficiently well defined to allow programmatic
detection of whether or not p points into a given array. Are there any
implementations for which such a requirement would pose an undue
burden? For that matter, are there any implementations for which that
requirement isn't already being met?

Dik T. Winter

Sep 2, 2004, 7:34:16 AM

In article <8b42afac.04090...@posting.google.com> kuy...@wizard.net (James Kuyper) writes:
...

> How about: the result must be an integer in the valid range of
> ptrdiff_t that is not the same as the result from p-r, for any r that
> is a pointer into or one past the end of the same array that p points
> into? That would be sufficiently well defined to allow programmatic
> detection of whether or not p points into a given array. Are there any
> implementations for which such a requirement would pose an undue
> burden?

How about a segmented architecture where a pointer consists of two parts
(segment and location in segment)? Would the above requirement mean that
with a pointer subtraction first the segment part has to be subtracted
and the final result depends on whether the result of this first
subtraction is 0 or not?
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/

Christian Bau

Sep 2, 2004, 7:38:24 PM

In article <8b42afac.04090...@posting.google.com>,
kuy...@wizard.net (James Kuyper) wrote:

That would be inefficient on a machine using segment + offset, where
each individual object has to be contained completely in one segment.
Pointer difference doesn't need to take the segments into account.
However, you could have two objects at the same offset in different
segments, and without looking at segments the pointer difference would
be zero.

Comparison for equality obviously would have to compare both segment and
offset on such a machine.
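
The point can be illustrated with a toy model of a segment + offset pointer. This is only a sketch: struct far_ptr, far_diff and far_equal are hypothetical names and do not describe any real compiler's pointer representation:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a segment:offset pointer; not any real ABI. */
struct far_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* Cheap difference: only meaningful when both pointers share a segment,
   because the segment parts are never examined. */
static int32_t far_diff(struct far_ptr p, struct far_ptr q)
{
    return (int32_t)p.offset - (int32_t)q.offset;
}

/* Equality has to look at both parts. */
static bool far_equal(struct far_ptr p, struct far_ptr q)
{
    return p.segment == q.segment && p.offset == q.offset;
}
```

In this model two objects at offset 0x0020 in segments 0x1000 and 0x2000 give far_diff equal to zero even though far_equal is false, which is the case a "difference must be distinct" rule would forbid.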

Barry Margolin

Sep 3, 2004, 1:53:12 AM

In article
<christian.bau-E28...@slb-newsm1.svr.pol.co.uk>,
Christian Bau <christ...@cbau.freeserve.co.uk> wrote:

Cute. Is there any other application of this subtraction? Perhaps
instead of changing the rule for pointer subtraction, it would be better
to add an operator or macro that tests whether one pointer points to a
location between two other pointers, or points to within some object.
Similar to the way the offsetof() operator was added rather than
defining the underlying pointer subtraction.

The advantage of creating a new operator is that it can generate the
most efficient code for the memory architecture, rather than requiring
you to do all the extra work above on simple, flat-memory systems.

Douglas A. Gwyn

Sep 3, 2004, 3:14:17 AM

Barry Margolin wrote:
> The advantage of creating a new operator is that it can generate the
> most efficient code for the memory architecture, rather than requiring
> you to do all the extra work above on simple, flat-memory systems.

It also has the advantage of not adding overhead unless
it is used, unlike extensions to the spec for pointer
subtraction etc.

Wojtek Lerch

Sep 3, 2004, 8:04:55 AM

"Douglas A. Gwyn" <DAG...@null.net> wrote in message
news:IvGdneHWPrB...@comcast.com...

On the other hand, unlike a new operator, it may turn out that the extension
to pointer subtraction is already implemented by all existing compilers.

Dik T. Winter

Sep 3, 2004, 8:33:22 AM

Yes, for some definition of "all existing".

Dan Pop

Sep 3, 2004, 9:21:09 AM

In <barmar-D107F5....@comcast.dca.giganews.com> Barry Margolin <bar...@alum.mit.edu> writes:

>Cute. Is there any other application of this subtraction? Perhaps
>instead of changing the rule for pointer subtraction, it would be better
>to add an operator or macro that tests whether one pointer points to a
>location between two other pointers, or points to within some object.
>Similar to the way the offsetof() operator was added rather than
>defining the underlying pointer subtraction.
>
>The advantage of creating a new operator is that it can generate the
>most efficient code for the memory architecture, rather than requiring
>you to do all the extra work above on simple, flat-memory systems.

OTOH, the advantages of changing the semantics of pointer subtraction
between unrelated pointers from undefined to unspecified are:

1. Minimal changes in text of the standard.

2. Most existing implementations don't require *any* changes at all in
order to conform to the changes in the standard.

The need to check whether one pointer points inside an object doesn't
arise very often, and the few cases where such checks are needed could
cope with the 1 or 2 tests involved by the method based on pointer
subtraction (1 test if q - p is larger than the object, 2 tests
otherwise).

David Adrien Tanguay

Sep 3, 2004, 8:02:25 PM

Barry Margolin wrote:
> Perhaps
> instead of changing the rule for pointer subtraction, it would be better
> to add an operator or macro that tests whether one pointer points to a
> location between two other pointers, or points to within some object.
> Similar to the way the offsetof() operator was added rather than
> defining the underlying pointer subtraction.

We implemented it as memwithin(), a function in stdlib.h. It could be inlined
by the compiler in the same way that all mem* etc. might be.
--
David Tanguay http://www.sentex.ca/~datanguayh/
Kitchener, Ontario, Canada [43.24N 80.29W]

Keith Thompson

Sep 3, 2004, 9:18:13 PM

David Adrien Tanguay <datan...@sentex.cookie.can> writes:
> Barry Margolin wrote:
> > Perhaps instead of changing the rule for pointer subtraction, it
> > would be better to add an operator or macro that tests whether one
> > pointer points to a location between two other pointers, or points
> > to within some object. Similar to the way the offsetof() operator
> > was added rather than defining the underlying pointer subtraction.
>
> We implemented it as memwithin(), a function in stdlib.h. It could be inlined
> by the compiler in the same way that all mem* etc. might be.

stdlib.h? The other mem* functions are in string.h.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

David Adrien Tanguay

Sep 4, 2004, 12:10:29 AM

Keith Thompson wrote:
> David Adrien Tanguay <datan...@sentex.cookie.can> writes:
>>We implemented it as memwithin(), a function in stdlib.h. It could be inlined
>>by the compiler in the same way that all mem* etc. might be.
>
>
> stdlib.h? The other mem* functions are in string.h.

Oops, my mento.

Douglas A. Gwyn

Sep 5, 2004, 3:14:40 AM

David Adrien Tanguay wrote:
> We implemented it as memwithin(), a function in stdlib.h.

Please don't do that. The name is reserved and may be
used for a different purpose, or with a different
interface, in a future revision of the C standard.

pete

Sep 10, 2004, 8:26:24 AM

James Kuyper wrote:
>
> "Douglas A. Gwyn" <DAG...@null.net> wrote in message news:<s6ydnT51IfT...@comcast.com>...
> > Dan Pop wrote:
> > > "Douglas A. Gwyn" <DAG...@null.net> writes:
> > >>Yes. The subtraction is meaningless if p and q
> > >>point into different segments.
> > > This does NOT explain why the result couldn't be unspecified instead of
> > > undefined.
> >
> > Sure it does. Unspecified behavior requires that there
> > be a set of meaningful alternatives from which the
> > implementation can choose.
>
> How about: the result must be an integer in the valid range of
> ptrdiff_t that is not the same as the result from p-r, for any r that
> is a pointer into or one past the end of the same array that p points
> into?

That seems a little strange,
given that p-r isn't guaranteed to be within the range of ptrdiff_t.

--
pete

James Kuyper

Sep 10, 2004, 9:32:54 PM

pete <pfi...@mindspring.com> wrote in message news:<41419D...@mindspring.com>...

Requiring that ptrdiff_t be capable of representing p-r is part of the
same package of ideas.

Christian Bau

Sep 11, 2004, 4:20:55 AM

In article <8b42afac.04091...@posting.google.com>,
kuy...@wizard.net (James Kuyper) wrote:

If for example size_t = 32 bit unsigned and ptrdiff_t = 32 bit signed,
then char*p = malloc (3000000000); char*q = p+3000000000; could create
two pointers such that q-p doesn't fit into ptrdiff_t. Currently this
invokes undefined behavior. This should probably be changed to
"unspecified result" as well.

(Some people will now come up and say that this might be inefficient on
some weird implementation. There is a very simple and convincing
counter-argument for this which they could easily find themselves if
they are not too lazy to think. Let's see what happens).

Douglas A. Gwyn

Sep 11, 2004, 10:54:11 AM

Christian Bau wrote:
> ... Currently this
> invokes undefined behavior. This should probably be changed to
> "unspecified result" as well.

No, undefined behavior is the right category.
If it were instead unspecified that would mean
that there are multiple valid results according
to the standard and that the implementation has
to choose one of them. But there is no valid
result when an arithmetic value is not
representable in the signed integer type.

Max T. Woodbury

Nov 23, 2004, 9:50:55 AM

While the meaning of the difference between pointers in different segments
may be ambiguous, it can be combined with other information that checks for
that difference to provide a definitive result to the original question. Exactly
how that combination should be performed is almost certainly complex enough to
require standardization much the same way that 'offsetof' is standardized. An
'isinmem(void * q, void * p, size_t n)' macro that checks to see if q is within
the object pointed to by p that is size n would be most useful. ('isinobject' is
another name that could be used for this macro.)

In other words, you failed to distinguish between ambiguous and meaningless.

On the more general question of the difference between two pointers that may
not be within the same object, are there architectures where taking that
difference could cause an exception? It may also be relevant whether or not
that difference can be used to transform one of those pointers without
exception. If exceptions can be avoided, it would make sense to specify that
the difference is unspecified or implementation defined rather than undefined.

m...@mtew.isa-geek.net

Douglas A. Gwyn

Nov 24, 2004, 3:04:48 AM

Max T. Woodbury wrote:
> In other words, you failed to distinguish between ambiguous and meaningless.

No, I didn't. There is no *meaning* to the "difference"
between pointers into different segments on a true
segmented architecture. And if one wanted to arbitrarily
force implementations to give it a meaning, that might
require making ptrdiff_t wider than is normally necessary,
and would almost certainly force most pointer difference
operations to generate additional "normalization" code.
We chose not to do that, because of the poor trade-off
between the exceedingly limited added utility versus the
run-time benefits from having it specified the way it is.

Max T. Woodbury

Nov 24, 2004, 12:23:57 PM

Argh! Just because additional information is required to answer
important questions about the "difference" does NOT mean that
the difference does not contain any information. The *meaning* of
that information is complex and useless by itself, (that is
ambiguous) but is not devoid of all meaning (that is meaningless).
Just repeating your assertion does NOT correct your error.

Your argument assumes that pointer arithmetic is the same as signed integer
arithmetic. Bluntly, it is usually closer to unsigned arithmetic.
The difference between two unsigned quantities has to be a signed
quantity to preserve the ordering for small differences as would be
found in the difference between two pointers into the same array. However
the clock-like nature of the sequence of unsigned values makes large
differences somewhat less consistent with what you would normally expect
of signed quantities, but is none the less predictable. With that in mind,
ptrdiff_t can be the same size as size_t. It might be a good idea to
add a footnote to the definition of subtraction of pointer values to
remind people that surprises are possible with humongous objects (i.e.
ones where sizeof(object) > PTRDIFF_MAX).
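
The clock-like behavior described here can be sketched with fixed-width integers standing in for addresses. This is an illustrative model only, assuming 32-bit "addresses"; wrap_to_signed and addr_diff are hypothetical names, and the signed reinterpretation is written out by hand so that it does not rely on implementation-defined conversion:

```c
#include <stdint.h>

/* Reinterpret a 32-bit unsigned value as a two's complement signed value,
   using only arithmetic whose result is fully defined. */
static int32_t wrap_to_signed(uint32_t u)
{
    if (u <= (uint32_t)INT32_MAX) {
        return (int32_t)u;
    }
    return -(int32_t)(UINT32_MAX - u) - 1;
}

/* Hypothetical 32-bit "address" difference with wraparound. */
static int32_t addr_diff(uint32_t a, uint32_t b)
{
    return wrap_to_signed(a - b);   /* a - b is computed modulo 2^32 */
}
```

Small differences keep their sign and ordering (7 and 5 differ by 2 or -2 depending on the operand order), while a 3000000000-byte span comes out as -1294967296, the kind of surprise such a footnote would warn about.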

To restate this in other terms: ptrdiff_t is the type of the result of
taking the difference between two pointers, and its size is
implementation-defined. The standard requires that it be a signed type.
This is an artifice to make the difference have desirable properties for
small differences, the kind of differences most often encountered. Because
the parts of pointers that support arithmetic are most often unsigned
values, the result of subtracting them will have the properties of an
unsigned value.
There are rules for converting unsigned values to signed values. Those rules
are used in this case. Those rules produce surprising results for large
differences. You are objecting (with reason) to that surprise, but your
solution is inconsistent with the rest of the definition of 'C'. In your
insistence that ptrdiff_t be larger than size_t, you are ignoring all the
rules specifically set up to reflect what happens when large unsigned
quantities are converted to signed quantities.

So there is no REQUIREMENT that ptrdiff_t be larger than size_t. (You
did say 'might' so you are not completely wrong on that point.) Your
conclusion that "normalization" would 'almost certainly' be required is
incorrect because there is no requirement that the difference between
unsigned values be "normalized" when converted to a signed value. There
is an additional conversion implicit in taking the difference between
pointers, namely the one that divides the difference by the size of the
element being addressed. Exactly where that conversion is applied could
change the result. If it were done before the difference was taken
rather than after, there might be fewer surprises. I suspect that is what
you mean by "normalize". However, early conversion introduces a number of
nasty problems, and not requiring that particular solution showed
considerable wisdom on the committee's part.

If you had argued that ambiguity was sufficient grounds for declaring
the difference between pointers to be 'undefined behavior' I would not
have disagreed with you, but that is NOT what you did. It could also
be argued that the difference is unspecified or implementation defined.
The point is that this was a somewhat (but not completely) arbitrary
decision. If you had argued that making this 'undefined-behaviour'
put the smallest constraint on implementations, you could have settled
that issue without argument.

But none of this addresses the original question. The answer to that
question is that there is no way to do what is wanted that stays within
the standard, much like there is no way to define 'offsetof' that stays
within the strict definitions of the standard. However every
implementation can in fact answer the question one way or another
just as every implementation implements 'offsetof' one way or another.
That points to a need for an 'isin' macro or function and its absence
might be construed as a defect. In any case, it is an issue that needs
to be addressed by the committee.

m...@mtew.isa-geek.net

Douglas A. Gwyn

Nov 24, 2004, 12:58:58 PM

Max T. Woodbury wrote:
> Your argument assumes that pointer arithmetic is the same as signed integer
> arithmetic.

Not at all. Read what I *said*. There is NO meaning
to the "difference" between pointers into different
segments in a true segmented architecture. If you
aren't familiar with such architectures, there are
Burroughs 6500 family documents on-line at the
bitsavers archive site.

> ... In your
> insistence that ptrdiff_t be larger than size_t, you are ignoring all the
> rules specifically set up to reflect what happens when large unsigned
> quantities are converted to signed quantities.

I said nothing of the kind; you're making up the
argumentation that you're attributing to me.

Wojtek Lerch

Nov 24, 2004, 1:07:23 PM

Douglas A. Gwyn wrote:
> Max T. Woodbury wrote:
>
>> In other words, you failed to distinguish between ambiguous and
>> meaningless.
>
>
> No, I didn't. There is no *meaning* to the "difference"
> between pointers into different segments on a true
> segmented architecture. And if one wanted to arbitrarily

Not if you already know that they point into different segments; but the
point of the proposal is to have a reasonable way to find out whether
they do or don't. If you don't know whether a pointer points to one of
the N elements of a given array or somewhere else, the number that
ptr-arr produces is not completely meaningless even if it is completely
bogus -- it tells you that ptr does *not* point to any element of arr
other than arr[ptr-arr]. It doesn't say unambiguously where ptr does
point to, but it does rule out at least N-1 possibilities (all N if
ptr-arr turns out to be out of range).

> force implementations to give it a meaning, that might
> require making ptrdiff_t wider than is normally necessary,
> and would almost certainly force most pointer difference
> operations to generate additional "normalization" code.

No, the proposal was for the subtraction to produce an unspecified
number if the two pointers don't point into the same object. The
implementation wouldn't need to generate any additional code, just make
sure that the normal code doesn't trap.

> We chose not to do that, because of the poor trade-off
> between the exceedingly limited added utility versus the
> run-time benefits from having it specified the way it is.

That was the original question: what are the benefits from it being
completely undefined behaviour instead of requiring it to return an
unspecified number? Do you know of any implementations where
subtracting pointers that point to different segments causes a trap
instead of producing some number?

Douglas A. Gwyn

Nov 24, 2004, 1:35:28 PM

Wojtek Lerch wrote:
> No, the proposal was for the subtraction to produce an unspecified
> number if the two pointers don't point into the same object.

But that is useless without the additional information
that they do point into the same object, for which the
result is already well defined by the current spec.
Why should any algorithm be subtracting pointers to
objects that it doesn't know lie in the same subspace?

The only genuine use of some related facility seems to
be, in some instances, for a library function given a
pointer parameter to check that it validly points into
the data structure upon which the function operates.
But usually such a function expects a pointer to an
object "base" (e.g. start of a struct), not to some
arbitrary offset within a valid object, and therefore
simple equality comparison is adequate; that is
already allowed by the spec.

Max T. Woodbury

Nov 24, 2004, 1:51:09 PM

"Douglas A. Gwyn" wrote:
>
> Max T. Woodbury wrote:
> > Your argument assumes that pointer arithmetic is the same as signed integer
> > arithmetic.
>
> Not at all. Read what I *said*. There is NO meaning
> to the "difference" between pointers into different
> segments in a true segmented architecture. If you
> aren't familiar with such architectures, there are
> Burroughs 6500 family documents on-line at the
> bitsavers archive site.

You are simply repeating yourself and repeating your error.

I am familiar with a couple segmented architectures. I understand what
you are trying to say, but you are failing to understand what I am saying.
You fail again to grasp the difference between ambiguous and meaningless.

>
> > ... In your
> > insistence that ptrdiff_t be larger than size_t, you are ignoring all the
> > rules specifically set up to reflect what happens when large unsigned
> > quantities are converted to signed quantities.
>
> I said nothing of the kind; you're making up the
> argumentation that you're attributing to me.

All right, you said 'might' and I exaggerated, but you still said something
about making ptrdiff_t larger than size_t, which was what I was refuting.

Wojtek Lerch

Nov 24, 2004, 2:31:47 PM

Douglas A. Gwyn wrote:
> Wojtek Lerch wrote:
>> No, the proposal was for the subtraction to produce an unspecified
>> number if the two pointers don't point into the same object.
>
>
> But that is useless without the additional information
> that they do point into the same object, for which the
> result is already well defined by the current spec.

No, the *purpose* of it is to let you find out *efficiently* whether
they point into the same object or not.

> Why should any algorithm be subtracting pointers to
> objects that it doesn't know lie in the same subspace?

To find out whether they do or not. Currently, the only way is to run a
loop that will rule out all the possibilities one at a time. The
subtraction would rule out all except one.

> The only genuine use of some related facility seems to
> be, in some instances, for a library function given a
> pointer parameter to check that it validly points into
> the data structure upon which the function operates.
> But usually such a function expects a pointer to an
> object "base" (e.g. start of a struct), not to some
> arbitrary offset within a valid object, and therefore
> simple equality comparison is adequate; that is
> already allowed by the spec.

Usually, yes. But sometimes arrays are involved, and the pointer may
validly point to any of N array elements, and you need N equality
comparisons instead of one. A subtraction followed by one equality
comparison is likely to be more efficient in such cases.

James Kuyper

Nov 24, 2004, 3:34:21 PM

Max T. Woodbury wrote:
...

> I am familiar with a couple segmented architectures. I understand what
> you are trying to say, but you are failing to understand what I am saying.
> You fail again to grasp the difference between ambiguous and meaningless.

If it's not meaningless, then what is the meaning of the difference that
could be calculated between pointers into different segments, on that
architecture?

James Kuyper

Nov 24, 2004, 4:47:31 PM

Max T. Woodbury wrote:

> It would be the difference in the offsets within the respective segments.

Which is meaningless.

> That value is *almost* totally useless unless the two segments were actually
> aliases for the same physical memory address.

Then those address would be pointers into the same segment, even if one
of them used a different segment to point there. What meaning is there
to the difference when they point into truly different memory blocks?

Max T. Woodbury

Nov 24, 2004, 4:21:42 PM

It would be the difference in the offsets within the respective segments.

That value is *almost* totally useless unless the two segments were actually
aliases for the same physical memory address. It is conceivable that the
contents of two segments were codependent in some other fashion so that that
value had useful implications. In other words, there are a very few cases
where such a difference could be used, but there *could* be some.

You should understand that I am NOT saying that this operation should be
classified as anything but an 'undefined behavior'. It is NOT portable and is
almost always an error when done. However the operation is not *QUITE*
meaningless.

m...@mtew.isa-geek.net

Max T. Woodbury

Nov 24, 2004, 3:56:56 PM

Umm, let's stop feeding this bear...

What you want to do is reasonable and can be done one way or
another in every C implementation, but you have to step outside
the strict language of the standard to do it. As a practical
matter you should include a configuration header in your program
that defines an 'isin' macro that encapsulates the solution and
tailor that header to whatever platform you are using. On some
implementations you would actually have to call a real function
or modify the compiler to prevent side effects from killing you
in really perverse uses of the macro.
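
A rough sketch of what such a configuration header might contain, written as a function rather than a macro so that arguments are evaluated only once. The switch ISIN_FLAT_ADDRESSES and the signature of isin are assumptions made up for this example. The first variant is only reasonable on a platform documented to convert pointers to uintptr_t as flat addresses (uintptr_t itself is optional in C99); the fallback uses nothing but equality comparisons.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical configuration switch: define ISIN_FLAT_ADDRESSES only on
   platforms known to convert pointers to uintptr_t as flat addresses. */
#ifdef ISIN_FLAT_ADDRESSES
/* Platform-specific variant: relies on the implementation-defined
   pointer-to-integer conversion preserving address order. */
static int isin(const char *p, const char *base, size_t n)
{
    uintptr_t a = (uintptr_t)p;
    uintptr_t b = (uintptr_t)base;
    return a >= b && a - b < n;
}
#else
/* Conforming fallback: equality comparisons only, O(n) in the worst case. */
static int isin(const char *p, const char *base, size_t n)
{
    size_t i;
    for (i = 0; i < n; ++i)
        if (base + i == p)
            return 1;
    return 0;
}
#endif
```

A caller would write isin(ptr, array, sizeof array) and leave the choice of variant to whoever ports the header.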

This is the same problem you would face if you were trying to
implement 'offsetof' if it were not defined in <stddef.h>. The
fact that there is no 'isin' specification in the standard is as
much of a defect as the absence of an 'offsetof' specification would
be.

Fiddling with the definition of pointer differences is not really
going to solve the problem because there are fundamental issues that
make taking some of these differences non-portable. That means that
that operation really does represent an undefined behavior as far
as the standard is concerned. On the other hand it is to be expected
that EVERY implementation will define that behavior one way or
another even if the result is almost useless. The reason it can
not simply be 'implementation defined' is because taking that particular
kind of difference is an error in almost all cases and making it
'implementation defined' would imply that it was not an error in any
case. 'unspecified' would also imply that it was not the error it is.

m...@mtew.isa-geek.net

Max T. Woodbury

Nov 24, 2004, 8:35:12 PM

James Kuyper wrote:
>
> Max T. Woodbury wrote:
> > James Kuyper wrote:
> >
> >>Max T. Woodbury wrote:
> >>...
> >>
> >>>I am familiar with a couple segmented architectures. I understand what
> >>>you are trying to say, but you are failing to understand what I am saying.
> >>>You fail again to grasp the difference between ambiguous and meaningless.
> >>
> >>If it's not meaningless, then what is the meaning of the difference that
> >>could be calculated between pointers into different segments, on that
> >>architecture?
> >
> >
> > It would be the difference in the offsets within the respective segments.
>
> Which is meaningless.

Almost always useless, but not always meaningless.

> > That value is *almost* totally useless unless the two segments were actually
> > aliases for the same physical memory address.
>
> Then those address would be pointers into the same segment, even if one
> of them used a different segment to point there. What meaning is there
> to the difference when they point into truly different memory blocks?

They would point to the same OBJECT, but would nevertheless be different
SEGMENTS.

Meaning is a complex philosophical concept. One useful approach to meaning
has to deal with utility. That is, something has meaning if you can use the
information it contains. I pointed out that the difference between pointers
into different segments contains a specific kind of information and defined
how to calculate it. I also pointed out that a context could exist where
that information could be used to achieve an intended goal. I did not
specify details because I expect that anyone with good will can imagine
such a context and intent as well as or better than I can. Apparently
I expected too much.

A marginally useful example:

Set up a triple consisting of two tables and a function. Require that the
entries in the two tables are the same type. Require that each of the tables
is the sole occupant of a segment. Require that the function compute
something useful that depends on the difference in the index between the
table entries. The code for such an application can use the difference
between pointers into the different segments as the argument to the function.
That function defines the meaning of that difference.

Such an implementation would require recoding to work on a nonsegmented
architecture. That is, it would not be easily portable. But it is still
an example where the difference between pointers into different segments
would be useful and therefore meaningful.
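
For comparison, a sketch of what the portable recoding might look like: each pointer is first reduced to an index within its own table, which is defined behaviour, and only the indices are subtracted. The names table_a, table_b, TABLE_LEN and index_gap are hypothetical, and the function assumes pa points into table_a and pb into table_b.

```c
#include <stddef.h>

#define TABLE_LEN 8   /* hypothetical table size */

static int table_a[TABLE_LEN];
static int table_b[TABLE_LEN];

/* Hypothetical stand-in for the function in the triple: it needs only the
   difference between the indices of the two entries.
   Precondition: pa points into table_a and pb points into table_b. */
static long index_gap(const int *pa, const int *pb)
{
    /* Each subtraction stays inside a single array, so both are defined. */
    ptrdiff_t ia = pa - table_a;
    ptrdiff_t ib = pb - table_b;
    return (long)(ib - ia);
}
```

This yields the same number as the cross-segment subtraction would on the segmented machine described, without depending on how the two tables are placed in memory.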

Douglas A. Gwyn

Nov 25, 2004, 2:41:44 AM

Wojtek Lerch wrote:
> No, the *purpose* of it is to let you find out *efficiently* whether
> they point into the same object or not.

But that doesn't work. The difference of two pointers
to distinct objects (assuming the scheme you guys keep
assuming for making this extension) could be zero yet
the objects be in separate segments. Logically there
is no way to provide the additional information encoded
within the existing ptrdiff_t width; either ptrdiff_t
might have to be widened or a separate channel used for
the "same-object" information. (And all that is on the
assumption that it is feasible, which imposes a
requirement that the current C standard does not.)

Douglas A. Gwyn

Nov 25, 2004, 2:44:04 AM

Max T. Woodbury wrote:
> It would be the difference in the offsets within the respective segments.

Which has no *meaning*, i.e. it doesn't refer to anything
representing a useful aspect of reality. It might as
well be hard-wired to always return 42, just as meaningful.

Dan Pop

Nov 25, 2004, 7:24:56 AM

Just as meaningful and just as useful for its *intended* purpose, which
has been explained several times in this thread, but it has failed to
penetrate your thick skull.

The key point is that the test is a TWO step operation and the second
step eliminates any ambiguity caused by the unspecified nature of the
pointer subtraction operation when the pointers are not pointing inside
the same object or one byte after (assuming that the currently undefined
result is made unspecified by a future revision of the standard).

You have an array arr of N elements and a valid pointer p and you want
to know if p points inside the array. The first step is to compute
p - arr. If the result is not in the range 0 .. N-1, you know that
p is not pointing inside arr. If the result is within the expected
range, you need to perform a second test: p == arr + result, that is
guaranteed to work reliably, even when p points to a different memory
segment. If the equality test yields true, you *know* that p points
inside arr, otherwise you *know* that p doesn't point inside arr.

Within the framework of the current standard, you have to perform 1 to N
pointer equality tests in order to obtain the result without invoking
undefined behaviour and even you can realise that this is suboptimal for
large values of N.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Dan...@ifh.de

Currently looking for a job in the European Union

Wojtek Lerch

Nov 25, 2004, 8:45:09 AM

"Douglas A. Gwyn" <DAG...@null.net> wrote in message

news:KtmdnSa4cI0...@comcast.com...

> Wojtek Lerch wrote:
>> No, the *purpose* of it is to let you find out *efficiently* whether they
>> point into the same object or not.
>
> But that doesn't work. The difference of two pointers
> to distinct objects (assuming the scheme you guys keep
> assuming for making this extension) could be zero yet
> the objects be in separate segments.

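int arr[ N ], *ptr;

/* Assumes the proposed rule: ptr - arr yields some unspecified number,
   rather than undefined behaviour, when ptr does not point into arr. */
unsigned n = ptr - arr;
if ( n < N && ptr == & arr[ n ] )
    printf( "ptr points to arr[%u]\n", n );
else
    puts( "ptr doesn't point to any element of arr" );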

David Hopwood

Nov 25, 2004, 12:14:41 PM

Wojtek Lerch wrote:

> "Douglas A. Gwyn" <DAG...@null.net> wrote:
>>Wojtek Lerch wrote:
>>
>>>No, the *purpose* of it is to let you find out *efficiently* whether they
>>>point into the same object or not.
>>
>>But that doesn't work. The difference of two pointers
>>to distinct objects (assuming the scheme you guys keep
>>assuming for making this extension) could be zero yet
>>the objects be in separate segments.
>
> int arr[ N ], *ptr;
>
> unsigned n = ptr - arr;

You mean size_t (which is guaranteed to be unsigned and large enough).

> if ( n < N && ptr == & arr[ n ] )
>     printf( "ptr points to arr[%u]\n", n );
> else
>     puts( "ptr doesn't point to any element of arr" );

--
David Hopwood <david.nosp...@blueyonder.co.uk>

Wojtek Lerch

Nov 25, 2004, 1:19:45 PM

David Hopwood wrote:

> Wojtek Lerch wrote:
>> unsigned n = ptr - arr;
>
> You mean size_t (which is guaranteed to be unsigned and large enough).

Any unsigned type wide enough to hold the value of N is good enough.

Wojtek Lerch

Nov 25, 2004, 1:39:28 PM

Douglas A. Gwyn wrote:
> Dan Pop wrote:
>

>>Just as meaningful and just as useful for its *intended* purpose, which
>>has been explained several times in this thread, but it has failed to
>>penetrate your thick skull.

>>The key point is that the test is a TWO step operation ...
>
>
> So it is a two-step operation where the first step is a waste
> of time, since you agree that the constant 42 could be used
> instead of subtracting the pointers at all.

Only if the pointers don't point into the same array. If they do, the
subtraction is necessary. If they don't, returning 42 is good enough,
but doing a subtraction anyway is probably simpler and faster.

>>... If the result is within the expected
>>range, you need to perform a second test: p == arr + result, that is
>>guaranteed to work reliably, even when p points to a different memory
>>segment.
>

> Since result = p - arr, the only way such a scheme is going to
> work is if it amounts to the canonicalization (essentially,
> introduction of a flat address superspace) that I referred
> to previously.

No; if result is 42 except when p points to an element of arr, there's
no canonicalization, is there?

> The bottom line is that there is no way to make this work on
> all platforms without imposing additional requirements that
> affect even those programs that don't try to do this. If you

That's a question that hasn't been answered here: are there any existing
implementations that don't satisfy the additional requirement already?
Do you know of any implementations where the subtraction may trap
instead of returning a bogus number?

> want an is-in-object feature, it should be done in a different
> way than by messing with the requirements for pointer
> arithmetic.

On the other hand, if all existing implementations already meet the
proposed requirement, adding it to the standard would have less impact
on them than adding an is-in-object macro.

Douglas A. Gwyn

Nov 25, 2004, 1:56:32 PM

Wojtek Lerch wrote:
> No; if result is 42 except when p points to an element of arr, there's
> no canonicalization, is there?

No, I agree that *if* all the relevant intermediate
quantities were required to have valid values, then
the approach you outlined would work. It is the
requirement that pointer subtraction simultaneously
work correctly for the currently well-defined cases
and also result in a valid (albeit meaningless)
result for the distinct-object case that I object to.

> Do you know of any implementations where the subtraction may trap
> instead of returning a bogus number?

I know of reasonable architectures where C could be
implemented that would have that property, at least
using the most natural implementation (which the
current spec supports). We'd like to see the future
evolving in the direction of more secure computation,
so even if such architectures aren't widespread
today, why should we discourage their development?
(There are already instances of architectural design
deficiencies that are at least partly attributable
to other infelicitous aspects of the C spec, so it
does matter what the spec supports.)

Wojtek Lerch

Nov 25, 2004, 5:05:06 PM

Douglas A. Gwyn wrote:
> Wojtek Lerch wrote:

>> Do you know of any implementations where the subtraction may trap
>> instead of returning a bogus number?
>
> I know of reasonable architectures where C could be
> implemented that would have that property, at least
> using the most natural implementation (which the
> current spec supports). We'd like to see the future
> evolving in the direction of more secure computation,
> so even if such architectures aren't widespread
> today, why should we discourage their development?

That is a good point. I'm not sure about subtracting pointers pointing
into different segments, but an example that sounds pretty realistic to
me is a processor that has a choice of two opcodes for integer division:
one that stores the remainder in a register, and another one that traps
if the remainder wasn't zero. If they're equally fast, it makes more
sense for pointer subtraction to use the latter than the former.
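
To make the division step concrete, here is a small model of it in plain integer arithmetic, assuming the byte distance between two pointers is already known. The function exact_elem_diff is invented for this illustration, and its zero return value merely stands in for the trap that the hypothetical opcode would raise.

```c
#include <stddef.h>

/* Models the trapping division described above: stores the element count
   in *out and returns 1 when byte_diff is an exact multiple of elem_size;
   returns 0 (standing in for a trap) when a remainder would be left. */
static int exact_elem_diff(ptrdiff_t byte_diff, size_t elem_size,
                           ptrdiff_t *out)
{
    ptrdiff_t size = (ptrdiff_t)elem_size;
    if (size == 0 || byte_diff % size != 0)
        return 0;
    *out = byte_diff / size;
    return 1;
}
```

Two pointers into the same array of 4-byte elements always give a byte distance that is a multiple of 4; a distance of 6 bytes, which could only arise between unrelated objects, is where the trapping opcode would fire.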

Christian Bau

Nov 25, 2004, 5:55:30 PM

In article <KtmdnSa4cI0...@comcast.com>,

"Douglas A. Gwyn" <DAG...@null.net> wrote:

You just don't get it.

I want to write a function

int inrange (char* p, size_t n, char* q)

with the following properties: p and q must be valid pointers with the
property that p + n is also a valid pointer. The function returns 1 if
and only if there is an i, 0 <= i < n, such that p + i == q. A trivial
implementation taking O (n) time would be

int inrange (char* p, size_t n, char* q)
{
    size_t i;
    for (i = 0; i < n; ++i)
        if (p + i == q)
            return 1;
    return 0;
}

Now an implementation that takes O (1), provided that the difference
between pointers produces an arbitrary value of type ptrdiff_t in all
cases where the result would be undefined in the current standard:

int inrange (char* p, size_t n, char* q)
{
    ptrdiff_t diff = q - p;
    return diff >= 0 && diff < n && p+diff == q;
}

That's it.

Christian Bau

Nov 25, 2004, 5:56:58 PM

In article <KtmdnSG4cI3...@comcast.com>,

"Douglas A. Gwyn" <DAG...@null.net> wrote:

That's absolutely fine with me. If pointer difference yields a result of
42 in all cases where the current standard says the result is undefined,
that would be very helpful indeed.

Christian Bau

Nov 25, 2004, 6:06:52 PM

In article <30n38kF...@uni-berlin.de>,
Wojtek Lerch <Wojt...@yahoo.ca> wrote:

> That is a good point. I'm not sure about subtracting pointers pointing
> into different segments, but an example that sounds pretty realistic to
> me is a processor that has a choice of two opcodes for integer division:
> one that stores the remainder in a register, and another one that traps
> if the remainder wasn't zero. If they're equally fast, it makes more
> sense for pointer subtraction to use the latter than the former.

Something that actually exists: Architectures where an integer division
x/y as defined by the C Standard takes a certain amount of time, but an
integer division that only needs to produce the correct result if x is a
multiple of y would be faster.

Still not a problem; I don't need a "correct" result in this situation
whatever "correct" would mean.

And anyway, the C Standard requires that comparison for equality or
inequality works for all legal pointer values, even if one or both
operands are null pointers. Gwyn's "security" argument is completely
bogus: I can write a function that has the requested functionality in
completely conforming C, except that it would be rather slow. So making
pointer differences of unrelated pointers undefined behavior doesn't
help at all with security. And since I can always memcpy a pointer into
an array of unsigned char and look at it (non-portable, but hackers
don't care)...
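
A short sketch of the memcpy approach mentioned above. Copying a pointer's object representation into unsigned char storage and back is well defined; what the individual bytes mean is not portable. The helper names pointer_bytes and pointer_from_bytes are made up for this example.

```c
#include <string.h>

/* Copies the object representation of the pointer p into out.
   The caller must supply at least sizeof(void *) bytes. */
static void pointer_bytes(const void *p, unsigned char *out)
{
    memcpy(out, &p, sizeof p);
}

/* Rebuilds a pointer from bytes previously produced by pointer_bytes(). */
static const void *pointer_from_bytes(const unsigned char *in)
{
    const void *p;
    memcpy(&p, in, sizeof p);
    return p;
}
```

Any conclusion drawn from comparing the bytes of two different pointers is specific to the implementation at hand.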

David Hopwood

Nov 25, 2004, 7:42:04 PM

Douglas A. Gwyn wrote:
> Wojtek Lerch wrote:

>> Do you know of any implementations where the subtraction may trap
>> instead of returning a bogus number?
>
> I know of reasonable architectures where C could be
> implemented that would have that property, at least
> using the most natural implementation (which the
> current spec supports).

Name them.

--
David Hopwood <david.nosp...@blueyonder.co.uk>

Wojtek Lerch

Nov 25, 2004, 9:30:45 PM

"Christian Bau" <christ...@cbau.freeserve.co.uk> wrote in message
news:christian.bau-4D8...@slb-newsm1.svr.pol.co.uk...

> Something that actually exists: Architectures where an integer division
> x/y as defined by the C Standard takes a certain amount of time, but an
> integer division that only needs to produce the correct result if x is a
> multiple of y would be faster.

All that this example proves is the existence of systems where any
reasonable implementation of pointer subtraction always produces some
number. I don't think anybody here doubts their existence; in fact, they
seem to be in majority. The question is whether there actually is a
minority worth worrying about.

> And anyway, the C Standard requires that comparison for equality or
> inequality works for all legal pointer values, even if one or both
> operands are null pointers. Gwyn's "security" argument is completely
> bogus: I can write a function that has the requested functionality in
> completely conforming C, except that it would be rather slow. So making
> pointer differences of unrelated pointers undefined behavior doesn't
> help at all with security. And since I can always memcpy a pointer into
> an array of unsigned char and look at it (non-portable, but hackers
> don't care)...

I doubt he was concerned about allowing hackers to detect whether two
pointers point into the same object; I think the kind of security he had in
mind has to do with allowing implementations to terminate a buggy program
that attempts to subtract unrelated pointers, or to divide by zero, or to
dereference a null pointer.

Douglas A. Gwyn

Nov 26, 2004, 12:21:36 AM

Christian Bau wrote:
> You just don't get it.
> I want to write a function
> int inrange (char* p, size_t n, char* q)

> ...

Yes, yes, I did get that. I think the goal is
better expressed in just those terms than in terms
of changing the requirements for pointer arithmetic.

Douglas A. Gwyn

Nov 26, 2004, 12:22:39 AM

Wojtek Lerch wrote:
> I doubt he was concerned about allowing hackers to detect whether two
> pointers point into the same object; I think the kind of security he had in
> mind has to do with allowing implementations to terminate a buggy program
> that attempts to subtract unrelated pointers, or to divide by zero, or to
> dereference a null pointer.

Etc. Yes.

James Kuyper

Dec 4, 2004, 11:42:04 AM

"Max T. Woodbury" <max.teneyc...@verizon.net> wrote in message news:<41A536CD...@verizon.net>...

> James Kuyper wrote:
>> Max T. Woodbury wrote:
...

>>> It would be the difference in the offsets within the respective
>>> segments.
>>
>> Which is meaningless.
> Almost always useless, but not always meaningless.

OK, then please describe that meaning. Not how to calculate the number
- we're quite clear on that. What does that number mean? I can
subtract my birth year from my house number; the result is well
defined, but it doesn't mean anything. It might, coincidentally, be
equal to a number that would be meaningful in a different context, but
the context of this calculation gives no meaning to the result.

...

> information it contains. I pointed out that the difference between pointers
> into different segments contains a specific kind of information and defined
> how to calculate it. I also pointed out that a context could exist where
> that information could be used to achieve an intended goal. I did not
> specify details because I expect that anyone with good will can imagine such
> a context and intent as well as or better than I can. Apparently
> I expected too much.

You did expect too much. I can't imagine any such context. Your
example is extremely contrived.

> A marginally useful example:
>
> Set up a triple consisting of two tables and a function. Require that the
> entries in the two tables are the same type. Require that each of the tables
> is the sole occupant of a segment. Require that the function compute
> something useful that depends on the difference in the index between the
> table entries. The code for such an application can use the difference
> between pointers into the different segments as the argument to the function.
> That function defines the meaning of that difference.

Since C code provides no mechanism for ensuring that an object is
memory-segment aligned, and no guarantee that a pointer difference
gives the result that it would give on that particular implementation,
there's no meaning within portable C code for that difference. And
there's no reason for someone writing non-portable code to complain
about the restrictions of the C standard, because
implementation-specific extensions can provide whatever it is you're
asking for.

You can invent an implementation-specific meaning for anything.
Consider the following highly unportable code:

int main(void)
{
    int a, b;
    return &b-&a;
}

On a particular implementation, that code might deliberately have the
effect of returning as an exit status the secret key for unlocking a
nuclear weapon. However, such implementation-specific meanings aren't
worth taking into consideration in the C standard.

Max T. Woodbury

Dec 4, 2004, 3:26:39 PM

James Kuyper wrote:
> "Max T. Woodbury" <max.teneyc...@verizon.net>

>> information it contains. I pointed out that the difference between pointers
>> into different segments contains a specific kind of information and defined
>> how to calculate it. I also pointed out that a context could exist where
>> that information could be used to achieve an intended goal. I did not
>> specify details because I expect that anyone with good will can imagine such
>> a context and intent as well as or better than I can. Apparently
>> I expected too much.
>
> You did expect too much. I can't imagine any such context. Your
> example is extremely contrived.

Of course it was contrived. That you can not imagine where something like it
might be useful is a reflection on your philosophy. It also indicates that
further discussion would be useless.

I am sorry that your philosophy prevents you from grasping that meaning is
related to utility, at least in some philosophical systems. Until that
misunderstanding is cleared up, it will be impossible to answer your questions
in a fashion that you are likely to find acceptable.

In a few more rounds I'm sure Godwin will get invoked. Let's stop before
that happens.

m...@mtew.isa-geek.net