actual usefulness of uintptr

Tom Yu

unread,

Nov 21, 2007, 7:09:40 PM11/21/07

to

I am investigating the portability of hashing pointer values for
storage in a hash table (along with additional data related to the
pointer value). c99 7.18.1.4 allows an implementation to provide
integer types (intptr_t and uintptr_t) capable of holding object
pointers. The specification of these types is remarkably lenient.
For example, as far as I can tell, the standard does not require that
two pointers to void that compare equal to each other convert to two
uintptr_t values that compare equal to each other. (The standard
doesn't even require that null pointers convert to integer zero, but
that can be worked around.) This makes it difficult to portably use
these integer types as keys for storing pointer values into a hash
table.

What was the Committee's intent with intptr_t and uintptr_t? Clearly
an implementation with a flat memory addressing scheme could provide
intptr_t and uintptr_t types which behave usefully for the purposes of
hashing pointers, etc., but given how few requirements the Standard
places on these types, it seems that the types are not useful for very
much other than as an alternative method for opaquely storing pointer
values, or for debugging.

---Tom

James Kuyper

unread,

Nov 21, 2007, 10:39:24 PM11/21/07

to

Tom Yu wrote:
> I am investigating the portability of hashing pointer values for
> storage in a hash table (along with additional data related to the
> pointer value). c99 7.18.1.4 allows an implementation to provide
> integer types (intptr_t and uintptr_t) capable of holding object
> pointers. The specification of these types is remarkably lenient.
> For example, as far as I can tell, the standard does not require that
> two pointers to void that compare equal to each other convert to two
> uintptr_t values that compare equal to each other.

Correct

> ...(The standard

> doesn't even require that null pointers convert to integer zero, but
> that can be worked around.)

Correct.

> ... This makes it difficult to portably use

> these integer types as keys for storing pointer values into a hash
> table.

I think it's difficult to even define what it would mean for such use to
be portable, at least in the general case. However, I can imagine
special cases where you could meaningfully talk about whether it's
portable. In every such special case I could think of, what you could
talk about is the fact that it can't be portable.

If you were using indices into an array as keys for a hash table, that
could be done in a portable fashion. The only way to do it with pointers
is by converting them into indices, by subtracting the start of the array.

> What was the Committee's intent with intptr_t and uintptr_t? Clearly
> an implementation with a flat memory addressing scheme could provide
> intptr_t and uintptr_t types which behave usefully for the purposes of
> hashing pointers, etc., but given how few requirements the Standard
> places on these types, it seems that the types are not useful for very
> much other than as an alternative method for opaquely storing pointer
> values, or for debugging.

You'll have to ask the committee for the exact reasons. However,
allowing a pointer to be converted to an integer type allows
manipulations of that integer value that could only be performed on the
pointer object itself by aliasing the pointer as an array of unsigned
char. This allows for such things as encrypting the value of the integer.

Douglas A. Gwyn

unread,

Nov 23, 2007, 2:15:08 PM11/23/07

to

Tom Yu wrote:
> ... The specification of these types is remarkably lenient.

> For example, as far as I can tell, the standard does not require that
> two pointers to void that compare equal to each other convert to two
> uintptr_t values that compare equal to each other.

True; no "normalization" is required. About the only thing you
can usefully do with such a value within the scope of the standard
is to convert it back to a pointer to the original target type,
which may have a different representation than the original but
will compare equal to it.

> (The standard doesn't even require that null pointers convert
> to integer zero, but that can be worked around.)

Null pointer values are not necessarily represented by all 0 bits.

> This makes it difficult to portably use these integer types as
> keys for storing pointer values into a hash table.

That is true. However, if all the pointers are into the same
(array) object, then you could instead hash the differences
from some fixed base pointer; the difference is an integer type.

> What was the Committee's intent with intptr_t and uintptr_t? Clearly
> an implementation with a flat memory addressing scheme could provide
> intptr_t and uintptr_t types which behave usefully for the purposes of
> hashing pointers, etc., but given how few requirements the Standard
> places on these types, it seems that the types are not useful for very
> much other than as an alternative method for opaquely storing pointer
> values, or for debugging.

That's right. Because some (relatively important) architectures
don't have flat address spaces, it's hard to guarantee much more
than that. I suppose we could have specified that the mapping
between pointer and integer had to be such that distinct integer
values obtained from conversion from pointers guaranteed that
when converted back to pointers the values would compare unequal
(i.e. that the conversion involve normalization of the
representation), but even that runs into problems where two
regions abut.

Tom Yu

unread,

Nov 26, 2007, 2:58:25 PM11/26/07

to

>>>>> "DAG" == Douglas A Gwyn <DAG...@null.net> writes:

DAG> Tom Yu wrote:

>> This makes it difficult to portably use these integer types as
>> keys for storing pointer values into a hash table.

DAG> That is true. However, if all the pointers are into the same
DAG> (array) object, then you could instead hash the differences
DAG> from some fixed base pointer; the difference is an integer type.

Unfortunately, the pointers point to memory allocated by code not
under my control, so I can't assume that they all point to memory that
is part of a single array. I guess I'll just document that the
pointer hashing code will only work on C implementations that convert
pointers to integers in a normalized way, and provide alternatives
(less optimal ones, e.g. O(n) direct pointer comparison in an array of
structures each containing such a pointer) for the implementations
which do not.

---Tom

Douglas A. Gwyn

unread,

Nov 26, 2007, 5:45:20 PM11/26/07

to

Tom Yu wrote:
> Unfortunately, the pointers point to memory allocated by code not
> under my control, so I can't assume that they all point to memory that
> is part of a single array. I guess I'll just document that the
> pointer hashing code will only work on C implementations that convert
> pointers to integers in a normalized way, and provide alternatives
> (less optimal ones, e.g. O(n) direct pointer comparison in an array of
> structures each containing such a pointer) for the implementations
> which do not.

You could provide the ptrdiff_t-based hashing scheme as an
alternative that would work wherever there is an overall flat
memory addressing scheme, which is most modern processors
(even many of those with multiple address spaces). The
"base" could be the address of any static data object.

Tom Yu

unread,

Dec 3, 2007, 10:33:33 PM12/3/07

to

>>>>> "DAG" == Douglas A Gwyn <DAG...@null.net> writes:

DAG> Tom Yu wrote:
>> Unfortunately, the pointers point to memory allocated by code not
>> under my control, so I can't assume that they all point to memory that
>> is part of a single array. I guess I'll just document that the
>> pointer hashing code will only work on C implementations that convert
>> pointers to integers in a normalized way, and provide alternatives
>> (less optimal ones, e.g. O(n) direct pointer comparison in an array of
>> structures each containing such a pointer) for the implementations
>> which do not.

DAG> You could provide the ptrdiff_t-based hashing scheme as an
DAG> alternative that would work wherever there is an overall flat
DAG> memory addressing scheme, which is most modern processors
DAG> (even many of those with multiple address spaces). The
DAG> "base" could be the address of any static data object.

That's an interesting idea. I had thought that pointer subtraction
was only defined for pointers that both point to elements of the same
array. This also brings to mind a question about how pointer
subtraction is valid with regard to two pointers into a single block
of allocated storage. My understanding is that allocated storage has
no effective type until something is stored into it. Of course, one
could argue that a single block of allocated storage is a valid single
array for the purposes of pointer subtraction provided that the bounds
of the array are correctly aligned for the pointed-to type.

---Tom

Douglas A. Gwyn

unread,

Dec 4, 2007, 12:12:37 PM12/4/07

to

Tom Yu wrote:
> DAG> You could provide the ptrdiff_t-based hashing scheme as an
> DAG> alternative that would work wherever there is an overall flat
> DAG> memory addressing scheme, which is most modern processors
> DAG> (even many of those with multiple address spaces). The
> DAG> "base" could be the address of any static data object.
> That's an interesting idea. I had thought that pointer subtraction
> was only defined for pointers that both point to elements of the same
> array.

That's the requirement for maximal portability (strictly conforming
program). However, I was talking about how to exploit capabilities
that happen to be supported by the specific class of platforms
(flat, non-segmented address space) but aren't guaranteed for other
platforms. The actual implementation of pointer subtraction is
almost certain to work on that class of platforms regardless of
whether the operands are based on the same array object or not.

> This also brings to mind a question about how pointer
> subtraction is valid with regard to two pointers into a single block
> of allocated storage. My understanding is that allocated storage has
> no effective type until something is stored into it. Of course, one
> could argue that a single block of allocated storage is a valid single
> array for the purposes of pointer subtraction provided that the bounds
> of the array are correctly aligned for the pointed-to type.

Pointer subtraction always involves a specific pointed-to type,
and the result is scaled accordingly. It doesn't matter whether
the actual storage was allocted via a variable declaration or by
dynamic allocation. The pointer values have to be aligned properly,
which for dynamic allocation means they point to starting bytes
that are some integral multiple of the pointed-to type units from
the pointer that was returned by malloc/realloc/calloc. (And for
the general case, the pointers must point within, or one byte past,
the requested region.)

actual usefulness of uintptr_t, etc.?

Tom Yu

James Kuyper

Douglas A. Gwyn

Tom Yu

Douglas A. Gwyn

Tom Yu

Douglas A. Gwyn