Reinterpret the bits of a float as unsigned long

thomas...@gmx.at

Feb 3, 2006, 12:34:39 PM

For a hash function I want to reinterpret the bits of a float
expression as unsigned long. The normal cast

(unsigned long) float_expression

truncates the float to an (unsigned long) integer.

But this is not the effect I want for a hash function.
With unions my hash function can be implemented:

union {
    long unsigned hashvalue;
    float floatvalue;
} value;

value.floatvalue = float_expression;

Now

value.hashvalue

contains the reinterpreted bits as intended. But I think a shorter
solution (without assignment or memcpy to a variable) should
be possible.

When I cast first to (void *) and after that to (unsigned long)
the gcc gives me the error:

cannot convert to a pointer type

It seems casting a float to a pointer is prohibited by gcc.

Has anybody an idea how to reinterpret the bits of a float as
unsigned long without assignment or memcpy (some sort
of tricky cast)?

Greetings Thomas Mertes

Seed7 Homepage: http://seed7.sourceforge.net
Wikipedia: http://en.wikipedia.org/wiki/Seed7
Project page: http://sourceforge.net/projects/seed7

P.J. Plauger

Feb 3, 2006, 12:40:36 PM

<thomas...@gmx.at> wrote in message
news:1138988079.2...@g47g2000cwa.googlegroups.com...

> For a hash function I want to reinterpret the bits of a float
> expression as unsigned long. The normal cast
>
> (unsigned long) float_expression
>
> truncates the float to an (unsigned long) integer.
>
> But this is not the effect I want for a hash function.
> With unions my hash function can be implemented:
>
> union {
> long unsigned hashvalue;
> float floatvalue;
> } value;
>
> value.floatvalue = float_expression;
>
> Now
>
> value.hashvalue
>
> contains the reinterpreted bits as intended. But I think a shorter
> solution (without assignment or memcpy to a variable) should
> be possible.
>
> When I cast first to (void *) and after that to (unsigned long)
> the gcc gives me the error:
>
> cannot convert to a pointer type
>
> It seems casting a float to a pointer is prohibited by gcc.
>
> Has anybody an idea to reinterpret the bits of a float as
> unsigned long without assignment or memcpy (some sort
> of tricky cast).

If float_expression is an lvalue, you can write:

*(unsigned long *)&(float_expression)

but that is fraught with peril.
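
A minimal sketch of the memcpy route, which avoids the aliasing and alignment questions raised by the pointer cast. The helper name `float_bits` and the use of `uint32_t` are assumptions made for illustration and do not come from the thread; the sketch also assumes that float is 32 bits wide, which the typedef checks at compile time.

```c
#include <stdint.h>
#include <string.h>

/* Compile-time check: the array size is -1 (an error) unless float
   and uint32_t have the same size. */
typedef char float_is_32_bits[(sizeof(float) == sizeof(uint32_t)) ? 1 : -1];

/* Hypothetical helper: copy the object representation of f into an
   integer of the same size. The parameter f is an lvalue inside the
   function, so callers may pass any float expression. */
static uint32_t float_bits(float f)
{
    uint32_t bits;

    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Optimizing compilers commonly turn a fixed-size memcpy like this into a plain register move, so the call need not cost more than the cast.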

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com

thomas...@gmx.at

Feb 3, 2006, 1:20:08 PM

"P.J. Plauger" <p...@dinkumware.com> wrote:

> If float_expression is a lvalue, you can write:
>
> *(unsigned long *)&(float_expression)
>
> but that is fraught with peril.

Thank you for your help. But the float_expression is not
guaranteed to be an lvalue. Do you have an rvalue solution as well?

"Nils O. Selåsdal"

Feb 3, 2006, 1:31:01 PM

thomas...@gmx.at wrote:
> "P.J. Plauger" <p...@dinkumware.com> wrote:
>
>> If float_expression is a lvalue, you can write:
>>
>> *(unsigned long *)&(float_expression)
>>
>> but that is fraught with peril.
>
> Thank you for your help. But the float_expression is not
> granted to be an lvalue. Do you have an rvalue solution as well?

You can get at the underlying representation by casting to an
unsigned char * and working your way through that:

unsigned char *raw = (unsigned char *)&floatval;

Fiddle with raw[0] to raw[sizeof(float)-1] (e.g. build an
unsigned long from the bits).
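
One way the byte-wise approach might look, as a sketch only. The function name `float_bytes_to_ulong` is hypothetical, and the code assumes 8-bit bytes and that a float is no wider than an unsigned long.

```c
#include <stddef.h>

/* Hypothetical helper: assemble an unsigned long from the bytes of a
   float. raw[0] ends up most significant, so the numeric result
   depends on the machine's byte order, which does not matter for a
   hash code. */
static unsigned long float_bytes_to_ulong(float floatval)
{
    const unsigned char *raw = (const unsigned char *)&floatval;
    unsigned long result = 0;
    size_t i;

    for (i = 0; i < sizeof(float); i++) {
        result = (result << 8) | raw[i];
    }
    return result;
}
```

Access through unsigned char is always permitted, so this form has no aliasing or alignment problem, at the price of a loop.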

Eric Sosman

Feb 3, 2006, 1:41:47 PM

thomas...@gmx.at wrote On 02/03/06 13:20,:

> "P.J. Plauger" <p...@dinkumware.com> wrote:
>
>
>>If float_expression is a lvalue, you can write:
>>
>> *(unsigned long *)&(float_expression)
>>
>>but that is fraught with peril.
>
>
> Thank you for your help. But the float_expression is not
> granted to be an lvalue. Do you have an rvalue solution as well?

You'll have to store the value in an lvalue first.
Consider: "Reinterpreting the bits" is an operation on
the representation of a value, not on the value itself.
A value doesn't "get represented" (in any C-accessible
sense) until it's stored; a value "in flight" in an
expression is just a naked value, without a representation
that C has any way to talk about.
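
To illustrate "store it first" without a named variable: a C99 compound literal creates an unnamed object, which is enough to give the value a representation. This is only a sketch; the union tag `float_pun` and the macro name `FLOAT_BITS` are made up for the example, and it assumes float is 32 bits wide and relies on the same union reinterpretation as the original post.

```c
#include <stdint.h>

/* Hypothetical union used for the reinterpretation; assumes that
   float and uint32_t have the same size. */
union float_pun {
    float    f;
    uint32_t u;
};

/* C99 compound literal: builds an unnamed union object initialized
   from expr, then reads the integer member of that object. */
#define FLOAT_BITS(expr) (((union float_pun){ .f = (expr) }).u)
```

The value is still stored in an object; the object just has no name, so the macro can be applied to any float expression, for example `FLOAT_BITS(a * b + 1.0f)`.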

Don't overlook PJP's closing remark, by the way.
When he says "fraught with peril" he doesn't mean there's
only one peril; he means there are multiple perils. Just
off-hand and without working hard, I can think of three.

--
Eric....@sun.com

P.J. Plauger

Feb 3, 2006, 2:34:07 PM

<thomas...@gmx.at> wrote in message
news:1138990808....@g43g2000cwa.googlegroups.com...

> "P.J. Plauger" <p...@dinkumware.com> wrote:
>
>> If float_expression is a lvalue, you can write:
>>
>> *(unsigned long *)&(float_expression)
>>
>> but that is fraught with peril.
>
> Thank you for your help. But the float_expression is not
> granted to be an lvalue. Do you have an rvalue solution as well?

No.

thomas...@gmx.at

Feb 3, 2006, 2:55:59 PM

Eric Sosman <Eric.Sos...@sun.com> wrote:

> You'll have to store the value in an lvalue first.
> Consider: "Reinterpreting the bits" is an operation on
> the representation of a value, not on the value itself.

Reinterpreting a float expression as unsigned long
does absolutely nothing with the value. In the machine
code you would see nothing at this place.

The C compiler (gcc) must be convinced to accept a
"do nothing" cast. Normal casts from float to (long) integers
are defined as trunc operations. But in my case nothing
should be done with the value. From the logical point of view
a "do nothing" operation does not need an lvalue.

The x86 gcc has the same size for float and unsigned long:

sizeof(float) = 4
sizeof(unsigned long) = 4

therefore all float values map to unsigned long values.
Just the compiler must be convinced.

Any ideas?

Eric Sosman

Feb 3, 2006, 4:04:00 PM

thomas...@gmx.at wrote On 02/03/06 14:55,:

> Eric Sosman <Eric.Sos...@sun.com> wrote:
>
>
>>You'll have to store the value in an lvalue first.
>>Consider: "Reinterpreting the bits" is an operation on
>>the representation of a value, not on the value itself.
>
>
> Reinterpreting a float expression as unsigned long
> does absolutely nothing with the value. In the machine
> code you would see nothing at this place.

I think we've said the same thing in different words:
Reinterpretation is about representations, not about
values.

> The C compiler (gcc) must be convinced to accept a
> "do nothing" cast. Normal casts from float to (long) integers
> are defined as trunc operations.

That's why the compiler must *not* be convinced to
change its behavior. It is doing what the language
definition requires.

> But in my case nothing
> should be done with the value. From the logical point of view
> a "do nothing" operation does not need an lvalue.

It needs a representation to be able to reinterpret.
C is unable to "see" representations that do not reside
in lvalues, and this "blindness" is the source of some of
C's power to optimize. For example, `x++' is defined as
having the same effect as `x += 1', yet on many systems
there will not even *be* a `1' anywhere in the compiled
code of `x++'. That `1' has no representation, yet it
is a value nonetheless.

> The x86 gcc has the same size for float and unsigned long:
>
> sizeof(float) = 4
> sizeof(unsigned long) = 4
>
> therefore all float values map to unsigned long values.
> Just the compiler must be convinced.

That takes care of one of the three perils I thought
of. Another is alignment: the fact that two objects have
the same size does not imply that they have the same
alignment requirements -- and it's not hard to imagine a
machine whose integer and F-P units connect to memory in
different fashions (one word: "coprocessor").

The third peril is more subtle: Some F-P schemes can
use different bit combinations for the same value. In
IEEE F-P, for example, -0 and +0 are numerically equal but
have different representations. Since you want to use the
reinterpreted-as-long bits as a hash code, you could find
yourself with different hash codes for equal keys. Your
hash table would work fine *most* of the time ...
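
A sketch of how the signed-zero case could be handled before the bits are taken. The name `float_hash` is hypothetical, and the code assumes a 32-bit IEEE 754 float; NaNs are not treated specially here, since a NaN never compares equal to any key anyway.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical hash helper: equal keys must give equal hash codes,
   so fold -0.0f onto +0.0f before looking at the bits. */
static uint32_t float_hash(float f)
{
    uint32_t bits;

    if (f == 0.0f) {
        f = 0.0f;   /* the test is true for +0.0f and -0.0f alike */
    }
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Compiler options that relax floating-point semantics (such as GCC's -ffast-math) may allow the normalization to be removed, so this relies on the default, standard-conforming mode.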

And, of course, once you've verified that all three
perils are either non-perilous or can be avoided on your
current machine, you or someone else will decide to use
the code on a different system -- where, all of a sudden,
`long' is eight bytes and requires eight-byte alignment.
The peril of avoiding peril rather than dealing with it
is that avoidance is local and avoidance is temporary ...

> Any ideas?

One thought is that using a floating-point value as a
hash key is a dodgy business at best. The problem isn't
so much with the value itself, but with the computation
that produces it: if you insert an item using `0.0f' as a
key and then try to retrieve it with `(float)cos(3.14159)'
you are quite likely to be disappointed. Discrepancies
as small as one part in four million are enough to bollix
your hash code and send the search to the wrong spot in
the table.
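
The sensitivity can be shown without any trigonometry. The sketch below builds the float immediately above 1.0f; all three function names are hypothetical, and a 32-bit IEEE 754 float is assumed.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the bits of a float as a 32-bit integer. */
static uint32_t bits_of(float f)
{
    uint32_t bits;

    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Hypothetical helper: the inverse of bits_of. */
static float float_from_bits(uint32_t bits)
{
    float f;

    memcpy(&f, &bits, sizeof f);
    return f;
}

/* The representable float directly above 1.0f, obtained by adding
   one to the integer form of 1.0f (one unit in the last place). */
static float just_above_one(void)
{
    return float_from_bits(bits_of(1.0f) + 1u);
}
```

The result differs from 1.0f by roughly one part in eight million, yet its bits, and therefore any hash code derived from them, are different.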

Are you sure you want to do this? What larger problem
are you trying to solve by building this hash table?

--
Eric....@sun.com

Eric Sosman

Feb 3, 2006, 4:25:14 PM

Eric Sosman wrote On 02/03/06 16:04,:
>
> [...] if you insert an item using `0.0f' as a

> key and then try to retrieve it with `(float)cos(3.14159)'

> you are quite likely to be disappointed. [...]

Never do trig on an empty stomach. That should
have been `(float)sin(3.14159)', of course.

--
Eric....@sun.com

thomas...@gmx.at

Feb 3, 2006, 5:40:14 PM

Eric Sosman <Eric.Sos...@sun.com> wrote

> Are you sure you want to do this? What larger problem
> are you trying to solve by building this hash table?

The Seed7 programming language supports hashtables for
all index types that provide a 'hashCode' and a 'compare'
function. If you use the type

hash [string] float

the 'hashCode(string)' and 'compare(string, string)' functions
are used to implement this hashtable. When the baseType of
the hashtable (float in this example) also has a 'hashCode'
and a 'compare' function, the type 'hash [string] float' can
be flipped (the values are turned into keys and the keys into values).
The result type of the flipped hashtable is

hash [float] array string

When your 'hash [string] float' table maps mountain names to
their height, it is possible to write a list of mountains sorted by
height with:

mountainsWithHeight := flip(HeightOfMountain);
for height range sort(keys(mountainsWithHeight)) do
  for mountainName range sort(mountainsWithHeight[height]) do
    writeln(mountainName rpad 20 <& " " <& height);
  end for;
end for;

This example is similar to the wordcnt.sd7 example in the
Seed7 package.

The Seed7 interpreter uses unions to store all object values.
The 'float to unsigned long' cast is done with unions there.
The Seed7 compiler produces a C program where the hashCode
function could just be a "do nothing" cast from float to
unsigned long.

Christian Bau

Feb 3, 2006, 5:56:59 PM

In article <ds0hnq$2ne$1...@news1brm.Central.Sun.COM>,
Eric Sosman <Eric....@sun.com> wrote:

It is the floating point that got you into trouble, because

(int) sin(3.14159) == (int) cos (3.14159) :-)

Eric Sosman

Feb 3, 2006, 6:00:56 PM

thomas...@gmx.at wrote On 02/03/06 17:40,:

> Eric Sosman <Eric.Sos...@sun.com> wrote
>
>
>>Are you sure you want to do this? What larger problem
>>are you trying to solve by building this hash table?
>

> [something about a language called Seed7, which supports
> hash tables using all manner of weird keys]

>
> When your 'hash [string] float' table maps mountain names to
> their height. It is possible to write a list of mountains sorted by
> height with:
>
> mountainsWithHeight := flip(HeightOfMountain);
> for height range sort(keys(mountainsWithHeight)) do
> for mountainName range sort(mountainsWithHeight[height]) do
> writeln(mountainName rpad 20 <& " " <& height);
> end for;
> end for;

Seems like either (1) Seed7 hash tables allow
duplicate keys, or (2) no two mountains can have
the same height ...

--
Eric....@sun.com

Al Balmer

Feb 3, 2006, 6:59:30 PM

On 3 Feb 2006 14:40:14 -0800, thomas...@gmx.at wrote:

>Eric Sosman <Eric.Sos...@sun.com> wrote
>
>> Are you sure you want to do this? What larger problem
>> are you trying to solve by building this hash table?
>
>The Seed7 programming language supports hashtables for

Thanks for the unsolicited advertisement. Not. Your post is not only
off-topic in c.l.c., but has no relevance to the OP's question.

--
Al Balmer
Sun City, AZ

thomas...@gmx.at

Feb 4, 2006, 2:05:27 AM

Eric Sosman <Eric.Sos...@sun.com> wrote:

> Seems like either (1) Seed7 hash tables allow
> duplicate keys, or (2) no two mountains can have
> the same height ...

(3) Flipping a hashtable creates a hashtable with
arrays of formerKeys as elements:

hash [formerValue] array formerKey

Therefore you need two for-loops to write the output
in the example.

pete

Feb 4, 2006, 5:09:02 AM

Eric Sosman wrote:

> A value doesn't "get represented" (in any C-accessible
> sense) until it's stored; a value "in flight" in an
> expression is just a naked value, without a representation
> that C has any way to talk about.

I disagree.

/* BEGIN new.c */

#include <stdio.h>

int main(void)
{
    /* The low two bits of -1 differ between the three integer
       representations the C standard allows. */
    switch (-1 & 3) {
    case 1:
        puts("signed magnitude");
        break;
    case 2:
        puts("ones' complement");
        break;
    default:
        puts("two's complement");
        break;
    }
    return 0;
}

/* END new.c */

--
pete

Eric Sosman

Feb 4, 2006, 8:38:34 AM

pete wrote:
> Eric Sosman wrote:
>
>
>>A value doesn't "get represented" (in any C-accessible
>>sense) until it's stored; a value "in flight" in an
>>expression is just a naked value, without a representation
>>that C has any way to talk about.
>
>
> I disagree.

> [counterexample with `&' snipped]

Good point. Not helpful to the O.P., though,
who wants access to the representation of a `float'.

--
Eric Sosman
eso...@acm-dot-org.invalid