ptrdiff_t and portability

Serve Laurijssen

Nov 13, 2002, 3:31:56 AM

I was converting some class to C-code. The code computes the index of a
character in a string. If the character wasn't found it would return -1.

I did it like this:
wchar_t *p = wcschr(str, L' ');
int pos = (p != NULL ? p - str : -1);

Then I started wondering, suppose this code will have to run on a 64-bit
machine, will the pointer subtraction always work?
I could change the pos type to ptrdiff_t, but how about -1 then, will it be
ok to test for -1 or is there a macro I could use for that?

Richard Bos

Nov 13, 2002, 5:14:52 AM

"Serve Laurijssen" <s.laur...@c-content.nl> wrote:

> I was converting some class to C-code. The code computes the index of a
> character in a string. If the character wasn't found it would return -1.
>
> I did it like this:
> wchar_t *p = wcschr(str, L' ');
> int pos = (p != NULL ? p - str : -1);
>
> Then I started wondering, suppose this code will have to run on a 64-bit
> machine, will the pointer subtraction always work?

It will work as long as the length of the longest string you ever have
(more precisely, the longest run of non-space characters at the start of
a string) is small enough to fit in an int. It's unlikely that this will
ever fail to be the case if your strings are human-readable, but if they
may contain random computer-generated data, check this. (And even so, how
long a string do you need to overflow a 64-bit int?)

Richard

pete

Nov 13, 2002, 11:51:12 AM

ptrdiff_t is a good choice.
ptrdiff_t is an implementation-defined signed integer type.
(-1) is within the range of all signed integer types.
Comparing (-1) against a ptrdiff_t is good.
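
A minimal sketch of the ptrdiff_t variant, assuming the original two-line
snippet is wrapped in a helper. The name index_of_wchar is hypothetical and
does not appear in the thread. The subtraction p - str already has type
ptrdiff_t, so nothing is narrowed, and -1 stays usable as the "not found"
value because a match can never sit before the start of the string.

```c
#include <stddef.h>  /* ptrdiff_t, NULL */
#include <wchar.h>   /* wcschr */

/* Hypothetical helper: index of the first c in str, or -1 if c is absent. */
ptrdiff_t index_of_wchar(const wchar_t *str, wchar_t c)
{
    const wchar_t *p = wcschr(str, c);

    /* p - str has type ptrdiff_t, so no narrowing conversion takes place. */
    return (p != NULL) ? p - str : (ptrdiff_t)-1;
}
```

A caller would then test the result with a plain comparison such as
if (pos == -1), exactly as in the int version.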

--
pete

those who know me have no need of my name

Nov 13, 2002, 1:30:38 PM

in comp.lang.c i read:

>wchar_t *p = wcschr(str, L' ');
>int pos = (p != NULL ? p - str : -1);
>
>Then I started wondering, suppose this code will have to run on a 64-bit
>machine, will the pointer subtraction always work?

yes, but that's by knowing how current 64 bit platforms work and guessing
that future ones will be substantially similar. but it is poor programming
practice to program with direct consideration of the platform in this way,
things can change (and have), and then you'll suffer while you hunt down and
fix these problems, which you'll find quite annoying (i hope) since writing
better code to start with would have avoided the whole issue. ptrdiff_t
was created to cope with this problem, e.g., there are machines where int
is 32 bit but pointers are wider hence math with them may require more bits
to contain the result.

the most annoying thing about ptrdiff_t is that the standard doesn't
require that pointer math must be representable within the range of this
type, i.e., using it has the potential for having undefined behavior. but
no other type is appropriate and has the same potential for ub anyway.

>I could change the pos type to ptrdiff_t, but how about -1 then, will it be
>ok to test for -1 or is there a macro I could use for that?

ptrdiff_t is signed, so by itself there's no problem with using that value.
but the difference between two adjacent elements can be -1 (or 1) so it
isn't appropriate to use as an error indicator. i suggest PTRDIFF_MIN or
PTRDIFF_MAX instead, as they're the least likely values that could normally
be generated.
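
The sentinel suggestion could be sketched roughly as follows, assuming a
C99 implementation, since PTRDIFF_MIN comes from <stdint.h>. The names
INDEX_NOT_FOUND and offset_of_wchar are hypothetical and used only for
illustration.

```c
#include <stddef.h>  /* ptrdiff_t, NULL */
#include <stdint.h>  /* PTRDIFF_MIN (C99) */
#include <wchar.h>   /* wcschr */

/* Hypothetical sentinel: a value a real offset is very unlikely to take. */
#define INDEX_NOT_FOUND PTRDIFF_MIN

/* Hypothetical helper: offset of the first c in str, or INDEX_NOT_FOUND. */
ptrdiff_t offset_of_wchar(const wchar_t *str, wchar_t c)
{
    const wchar_t *p = wcschr(str, c);

    return (p != NULL) ? p - str : INDEX_NOT_FOUND;
}
```

Callers compare against INDEX_NOT_FOUND rather than a literal, so the
sentinel can change later without touching every call site.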

--
bringing you boring signatures for 17 years

Johan Aurér

Nov 13, 2002, 4:32:59 PM

"Serve Laurijssen" <s.laur...@c-content.nl> wrote in message news:<_3oA9.29$s%6.182718@zonnet-reader-1>...

> I was converting some class to C-code. The code computes the index of a
> character in a string. If the character wasn't found it would return -1.
>
> I did it like this:
> wchar_t *p = wcschr(str, L' ');
> int pos = (p != NULL ? p - str : -1);
>
> Then I started wondering, suppose this code will have to run on a 64-bit
> machine, will the pointer subtraction always work?

It will work if and only if the difference of p and str can be represented
in both ptrdiff_t and int.
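
The int half of that condition can be checked explicitly. Below is a
minimal sketch, assuming the caller really needs an int result; the helper
name index_as_int and the extra return code -2 are hypothetical. It does
nothing about the ptrdiff_t half: if p - str itself is not representable,
the subtraction is already undefined before the check runs.

```c
#include <limits.h>  /* INT_MAX */
#include <stddef.h>  /* ptrdiff_t, NULL */
#include <wchar.h>   /* wcschr */

/* Hypothetical helper: returns the index of c in str as an int,
   -1 if c is absent, or -2 if the index does not fit in an int. */
int index_as_int(const wchar_t *str, wchar_t c)
{
    const wchar_t *p = wcschr(str, c);
    ptrdiff_t d;

    if (p == NULL)
        return -1;
    d = p - str;        /* never negative here, because p >= str */
    if (d > INT_MAX)
        return -2;      /* offset exists but cannot be stored in an int */
    return (int)d;
}
```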

> I could change the pos type to ptrdiff_t, but how about -1 then, will it be
> ok to test for -1 or is there a macro I could use for that?

I don't know what kind of macro you're after, but comparing against -1 is
certainly alright. But even if you use ptrdiff_t, there is no guarantee
that the expression p - str can actually be computed (the result might not
be within the range of ptrdiff_t). If this is a concern, you can get
around the problem using size_t and addition, rather than ptrdiff_t and subtraction:

#include <stdio.h>
#include <wchar.h>

/* Return the index of the first c in s, or (size_t)-1 if c is absent. */
size_t find_wide_char(const wchar_t *s, wchar_t c)
{
    size_t n = 0;

    while (*s != c) {
        if (*s++ == L'\0')
            return (size_t)-1;  /* reached the terminator without a match */
        n++;
    }
    return n;
}

int main(void)
{
    wchar_t s[] = L"ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    size_t i = find_wide_char(s, L'C');

    if (i != (size_t)-1)
        printf("%lu\n", (unsigned long)i);
    else
        printf("not found\n");
    return 0;
}

--
au...@axis.com

Paul Sheer

Nov 14, 2002, 12:01:42 PM

> avoided the whole issue. ptrdiff_t was created to cope with this
> problem, e.g., there are machines where int is 32 bit but pointers are
> wider hence math with them may require more bits to contain the result.
>
> the most annoying thing about ptrdiff_t is that the standard doesn't
> require that pointer math must be representable within the range of this
> type, i.e., using it has the potential for having undefined behavior.
> but no other type is appropriate and has the same potential for ub
> anyway.

Myself i always cast pointers to (unsigned long) before
doing calculations or comparisons

In general unsigned long is always greater than or equal
to the width of a pointer. On a 64 bit system, unsigned long
is 64 bits, and so are pointers. On an eight bit system,
unsigned long is usually 32, and pointers are usually 16
bits. Hence unsigned long is generally safe.

BUT microsoft has recently proposed an unsigned long of
32 bits, and pointers of 64 bits: for their 64 bit stuff.
This is TOALLY INSANE (where did i read it??)

So the very best thing you can do is define

typedef unsigned long ptr_word;

at the start of your program. Then cast pointers to it
whenever doing such calcs. If you ever get to a system
that is weird, you can just change it to something else.

-paul

--

Paul Sheer Consulting IT Services . . Tel . . . +27 (0)21 6869634
Email . . . psh...@icon.co.za . . . . Work . . +27 (0)21 6503467
Linux development, cryptography, embedded, support, training
http://www.icon.co.za/~psheer . . . . . . http://rute.2038bug.com
L I N U X . . . . . . . . . . . . The Choice of a GNU Generation

Eric Sosman

Nov 14, 2002, 1:19:48 PM

Paul Sheer wrote:
>
> > avoided the whole issue. ptrdiff_t was created to cope with this
> > problem, e.g., there are machines where int is 32 bit but pointers are
> > wider hence math with them may require more bits to contain the result.
> >
> > the most annoying thing about ptrdiff_t is that the standard doesn't
> > require that pointer math must be representable within the range of this
> > type, i.e., using it has the potential for having undefined behavior.
> > but no other type is appropriate and has the same potential for ub
> > anyway.
>
> Myself i always cast pointers to (unsigned long) before
> doing calculations or comparisons

Thanks for warning us about your software.

The language promises that you can convert pointers to
integers and vice versa, but it does *not* promise that such a
conversion is meaningful:

6.3.2.3 Pointers
/5/ An integer may be converted to any pointer type.
Except as previously specified [null pointer constants],
the result is implementation-defined, might not be
correctly aligned, might not point to an entity of the
referenced type, and might be a trap representation.

/6/ Any pointer type may be converted to an integer
type. Except as previously specified, the result is
implementation-defined. If the result cannot be
represented in the integer type, the behavior is
undefined. The result need not be in the range of
values of any integer type.

That is, conversion between pointers and integers redefines
"GIGO" to mean "Goodstuff In, Garbage Out."

Now, most implementations will try to do the conversions
in a way that makes sense for the machine at hand; there's a
footnote in the Standard stating that this is the intent of
allowing the conversions in the first place. But footnotes
aren't normative and good intentions aren't guarantees, so
attempting such a conversion leaves you on shaky ground.

Do I sometimes convert pointers back and forth with
integers? Certainly: but only in code already known to be
non-portable, usually in connection with implementation-
specific memory managers. I don't do it "always" and I'd
suggest that anyone who does so is perpetuating the old
knock that "C combines the power of assembly language with
the portability of assembly language."

> In general unsigned long is always greater than or equal
> to the width of a pointer. On a 64 bit system, unsigned long
> is 64 bits, and so are pointers. On an eight bit system,
> unsigned long is usually 32, and pointers are usually 16
> bits. Hence unsigned long is generally safe.

You've never encountered the 128-bit pointers of IBM's
iSeries (AS/400) systems, I guess. I also guess you use
"in general ... always" to mean "sometimes."

> BUT microsoft has recently proposed an unsigned long of
> 32 bits, and pointers of 64 bits: for their 64 bit stuff.
> This is TOALLY INSANE (where did i read it??)

It's not "TOALLY" anything at all, and should create no
trouble for code that doesn't indulge in gratuitous idiocies.
Here's an idea: `double' is 64 bits in Microsoft's systems;
why not convert all your pointers to `double' to be safe?

--
Eric....@sun.com

those who know me have no need of my name

Nov 16, 2002, 3:29:37 AM

in comp.lang.c i read:

>Myself i always cast pointers to (unsigned long) before
>doing calculations or comparisons
>
>In general unsigned long is always greater than or equal
>to the width of a pointer. On a 64 bit system, unsigned long
>is 64 bits, and so are pointers. On an eight bit system,
>unsigned long is usually 32, and pointers are usually 16
>bits. Hence unsigned long is generally safe.

but it may not be safe in the future. given no other history i would have
expected a 64 bit system to have 64 bit int's and 128 bit long's, but that
wasn't done, mostly because of the mass of existing systems expecting int's
to be 32 bits and long's to be 64. so i expect the >64 bit systems of
the future to have 32 bit int's and 64 bit long's, just as today, but
pointers will be wider hence so will ptrdiff_t. your cast will then be too
narrow, necessitating fixes all around (even if just a typedef). but
casting to unsigned long long depends on c99(ish) compliance in the
compiler, so isn't very portable. and casting to a wider type than is
necessary likely involves way too much frobnication getting the math done,
which can be slow as hell on some platforms.

but then it's not horrible either.

Peter Shaggy Haywood

Nov 17, 2002, 6:57:51 PM

Groovy hepcat Paul Sheer was jivin' on Thu, 14 Nov 2002 19:01:42 +0200
in comp.lang.c.
Re: ptrdiff_t and portability's a cool scene! Dig it!

> Myself i always cast pointers to (unsigned long) before
> doing calculations or comparisons

[Gasp!] How horrible!

>In general unsigned long is always greater than or equal
>to the width of a pointer. On a 64 bit system, unsigned long

Nonsense! On a 32 bit Intel protected mode large memory model
platform, for example, an unsigned int is 32 bits, while a pointer is
48 bits. Are you telling me that 32 >= 48? Surely not.
Implementations with pointers larger than ints are not uncommon.
Your statement is patently false.

--

Dig the even newer still, yet more improved, sig!

http://alphalink.com.au/~phaywood/
"Ain't I'm a dog?" - Ronny Self, Ain't I'm a Dog, written by G. Sherry & W. Walker.
I know it's not "technically correct" English; but since when was rock & roll "technically correct"?

Dan Pop

Nov 18, 2002, 8:23:28 AM

In <m1n0oa2...@usa.net> those who know me have no need of my name <not-a-rea...@usa.net> writes:

>in comp.lang.c i read:
>
>>Myself i always cast pointers to (unsigned long) before
>>doing calculations or comparisons
>>
>>In general unsigned long is always greater than or equal
>>to the width of a pointer. On a 64 bit system, unsigned long
>>is 64 bits, and so are pointers. On an eight bit system,
>>unsigned long is usually 32, and pointers are usually 16
>>bits. Hence unsigned long is generally safe.
>
>but it may not be safe in the future. given no other history i would have
>expected a 64 bit system to have 64 bit int's and 128 bit long's, but that
>wasn't done, mostly because of the mass of existing systems expecting int's
>to be 32 bits and long's to be 64.

The existing systems, at the time, were certainly not expecting 64-bit
longs.

The main reason against 64-bit ints is that there would be only two
standard C types left (char and short) for three popular sizes: 8, 16 and
32-bit. The advantage of having long as a 128-bit type would not have
compensated for this. As a matter of fact, I am not aware of any C
implementation supporting a 128-bit integer type.

>so i expect the >64 bit systems of
>the future to have 32 bit int's and 64 bit long's, just as today, but
>pointers will be wider hence so will ptrdiff_t. your cast will then be too
>narrow, necessitating fixes all around (even if just a typedef). but
>casting to unsigned long long depends on c99(ish) compliance in the
>compiler, so isn't very portable. and casting to a wider type than is
>necessary likely involves way too much frobnication getting the math done,
>which can be slow as hell on some platforms.

If C99 compliance can be assumed, the right type is uintptr_t (defined in
<stdint.h>). And if the implementation can be determined as not
being C99 compliant, uintptr_t can be (almost) safely defined as
unsigned long.
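
A rough sketch of that fallback, reusing the ptr_word name from earlier in
the thread. It assumes that testing UINTPTR_MAX is an acceptable way to
detect uintptr_t, which C99 makes an optional type; the helper round_trips
is hypothetical and only demonstrates the one guarantee the type offers.

```c
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#include <stdint.h>              /* may define uintptr_t and UINTPTR_MAX */
#endif

#ifdef UINTPTR_MAX
typedef uintptr_t ptr_word;      /* the optional C99 type is available */
#else
typedef unsigned long ptr_word;  /* pre-C99 fallback: an assumption, not a guarantee */
#endif

/* Hypothetical helper: convert a pointer to ptr_word and back.  With
   uintptr_t, C99 guarantees the result compares equal to the original. */
int round_trips(void *p)
{
    ptr_word w = (ptr_word)p;

    return (void *)w == p;
}
```

Only the round trip is guaranteed. Arithmetic on the converted value is
still implementation-defined, so this does not replace ptrdiff_t for
computing an index.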

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Dan...@ifh.de

Dan Pop

Nov 18, 2002, 8:44:04 AM

In <3dd72cd9...@news.alphalink.com.au> phay...@alphalink.com.au.STOP.SPAM (Peter "Shaggy" Haywood) writes:

>Groovy hepcat Paul Sheer was jivin' on Thu, 14 Nov 2002 19:01:42 +0200
>in comp.lang.c.
>Re: ptrdiff_t and portability's a cool scene! Dig it!
>
>> Myself i always cast pointers to (unsigned long) before
>> doing calculations or comparisons
>
> [Gasp!] How horrible!
>
>>In general unsigned long is always greater than or equal
  ^^^^^^^^^^^^^^^^^^^^^^^^

>>to the width of a pointer. On a 64 bit system, unsigned long
>
> Nonsense!

Note that the statement started with "in general". As such, it is a
fairly accurate statement and it is nonsense to shout nonsense ;-)

>On a 32 bit Intel protected mode large memory model
>platform, for example, an unsigned int is 32 bits, while a pointer is
                        ^^^^^^^^^^^^^^^
He was talking about unsigned long, so the size of unsigned int is
irrelevant, in context.

>48 bits. Are you telling me that 32 >= 48? Surely not.

Care to mention a couple of such implementations?

>Implementations with pointers larger than ints are not uncommon.

Care to mention a couple of *common* implementations where pointers are
wider than unsigned longs? I'm not saying that such things don't exist,
merely claiming, based on my experience, that they do not qualify as
common. The common behaviour is to have one integer type as wide as
pointer types (and these days, unsigned long seems to be the best
candidate).

>Your statement is patently false.

Since it covers the vast majority of common implementations and it starts
with "in general", it looks patently true to me :-)

Peter Shaggy Haywood

Nov 18, 2002, 8:40:34 PM

Groovy hepcat Dan Pop was jivin' on 18 Nov 2002 13:44:04 GMT in
comp.lang.c.
Re: ptrdiff_t and portability's a cool scene! Dig it!

>In <3dd72cd9...@news.alphalink.com.au> phay...@alphalink.com.au.STOP.SPAM (Peter "Shaggy" Haywood) writes:
>
>>Groovy hepcat Paul Sheer was jivin' on Thu, 14 Nov 2002 19:01:42 +0200
>>in comp.lang.c.
>>Re: ptrdiff_t and portability's a cool scene! Dig it!
>>
>>> Myself i always cast pointers to (unsigned long) before
>>> doing calculations or comparisons
>>
>> [Gasp!] How horrible!
>>
>>>In general unsigned long is always greater than or equal
>  ^^^^^^^^^^^^^^^^^^^^^^^^
                            ^^^^^^^^^
>>>to the width of a pointer. On a 64 bit system, unsigned long
>>
>> Nonsense!
>
>Note that the statement started with "in general". As such, it is a

This was followed by "is always". What are we supposed to make of
that? Does he mean "in general" or "always"? It's one or the other. He
can't have it both ways.

>fairly accurate statement and it is nonsense to shout nonsense ;-)
>
>>On a 32 bit Intel protected mode large memory model
>>platform, for example, an unsigned int is 32 bits, while a pointer is
>                        ^^^^^^^^^^^^^^^
>He was talking about unsigned long, so the size of unsigned int is
>irrelevant, in context.

Whoops! I meant unsigned long. On the architecture to which I was
referring, sizeof(unsigned int) == sizeof(unsigned long) anyhow.

>>48 bits. Are you telling me that 32 >= 48? Surely not.
>
>Care to mention a couple of such implementations?

Not really, no. :) I can't think of any off hand.

>>Implementations with pointers larger than ints are not uncommon.
>
>Care to mention a couple of *common* implementations where pointers are
>wider than unsigned longs? I'm not saying that such things don't exist,
>merely claiming, based on my experience, that they do not qualify as
>common. The common behaviour is to have one integer type as wide as
>pointer types (and these days, unsigned long seems to be the best
>candidate).

Again, I can't think of any off hand. My experience isn't as great
as yours, I'm afraid.
Perhaps my statement about such implementations not being uncommon
was phrased badly. Perhaps they're not that common. You'd know better
than I would, I'm sure, Dan.

>>Your statement is patently false.
>
>Since it covers the vast majority of common implementations and it starts
>with "in general", it looks patently true to me :-)

Since that phrase was followed by "is always", I'm not so sure about
that.
Anyhow, my point, which is valid either way, is that he shouldn't
make assumptions about things that may not always be true.

Dan Pop

Nov 19, 2002, 7:30:16 AM

In <3dd99332...@news.alphalink.com.au> phay...@alphalink.com.au.STOP.SPAM (Peter "Shaggy" Haywood) writes:

>Groovy hepcat Dan Pop was jivin' on 18 Nov 2002 13:44:04 GMT in
>comp.lang.c.
>Re: ptrdiff_t and portability's a cool scene! Dig it!
>
>>In <3dd72cd9...@news.alphalink.com.au> phay...@alphalink.com.au.STOP.SPAM (Peter "Shaggy" Haywood) writes:
>>
>>>Groovy hepcat Paul Sheer was jivin' on Thu, 14 Nov 2002 19:01:42 +0200
>>>in comp.lang.c.
>>>Re: ptrdiff_t and portability's a cool scene! Dig it!
>>>
>>>> Myself i always cast pointers to (unsigned long) before
>>>> doing calculations or comparisons
>>>
>>> [Gasp!] How horrible!
>>>
>>>>In general unsigned long is always greater than or equal
>>  ^^^^^^^^^^^^^^^^^^^^^^^^
>                            ^^^^^^^^^
>>>>to the width of a pointer. On a 64 bit system, unsigned long
>>>
>>> Nonsense!
>>
>>Note that the statement started with "in general". As such, it is a
>
> This was followed by "is always". What are we supposed to make of
>that? Does he mean "in general" or "always"? It's one or the other. He
>can't have it both ways.

Well, "in general" was first, so it takes precedence over "is always" :-)

David Thompson

Nov 24, 2002, 8:54:59 PM

Dan Pop <Dan...@cern.ch> wrote :

> In <3dd99332...@news.alphalink.com.au> phay...@alphalink.com.au.STOP.SPAM (Peter "Shaggy" Haywood) writes:

...

> >>>>In general unsigned long is always greater than or equal

...
> >>Note that the statement started with "in general". ...

> > This was followed by "is always". What are we supposed to make of
> >that? Does he mean "in general" or "always"? It's one or the other. He
> >can't have it both ways.
>
> Well, "in general" was first, so it takes precedence over "is always" :-)
>

ObSortofC: but both "in general" and "is always" (here)
are prefix, and prefix (unary) operators bind with the
rightmost (and hence nearest to the primary-expr) first.

:-?

--
- David.Thompson 1 now at worldnet.att.net

Dan Pop

Nov 25, 2002, 8:28:57 AM

Which supports my statement :-) "is always" binds closer, so it is
overridden by "in general". As in (double)(int)foo, where the type of
the expression is double.
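
For readers who want to see the cast analogy in real C, here is a small
sketch. The function name truncate_then_widen is hypothetical. The inner
cast is applied first and discards the fraction, and the outer cast is
applied last and determines the type of the whole expression.

```c
/* Hypothetical helper illustrating (double)(int)foo. */
double truncate_then_widen(double foo)
{
    /* (int) is applied first and truncates toward zero;
       (double) is applied last and fixes the type of the result. */
    return (double)(int)foo;
}
```

So the result has type double, but its value still shows the effect of the
inner cast: 3.75 comes back as 3.0.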