hashlittle(), hashbig(), hashword() and endianness

99 views
Skip to first unread message

Alex Vinokur

unread,
Nov 15, 2007, 5:54:21 AM11/15/07
to Hash Functions
Some question concerning Bob Jenkins' functions
hashword(uint32_t*, size_t), hashlittle(uint8_t*, size_t) and
hashbig(uint8_t*, size_t) in lookup3.c.

Let k1 by a key: uint8_t* k1; strlen(k1)%sizeof(uint32_t) == 0.

1. hashlittle(k1) produces the same value on Little-Endian and Big-
Endian machines.
Let hashlittle(k1) be == L1.

2. hashbig(k1) produces the same value on Little-Endian and Big-Endian
machines.
Let hashbig(k1) be == B1.

L1 != B1


3. hashword((uint32_t*)k1) produces
* L1 on LittleEndian machine and
* B1 on BigEndian machine.


Is it possible to create two new hash functions on basis of
hashword():
i) hashword_little () that produces L1 on Little-Endian and Big-
Endian machines;
ii) hashword_big () that produces B1 on Little-Endian and Big-
Endian machines
?

Alex Vinokur
email: alex DOT vinokur AT gmail DOT com
http://mathforum.org/library/view/10978.html
http://sourceforge.net/users/alexvn

Blah Blah

unread,
Nov 29, 2007, 1:21:11 AM11/29/07
to hash-fu...@googlegroups.com
You could come up with a hash function that processed 4 bytes at a time such that it was insensitive to the endianness of the machine, but the hash function would probably suck, and you'll lose the speed advantages of the endian sensitive hashes.

  w = *k;
  t1 = w & 0xffff;
  t2 = ((w >> 8) & 0xff00) | ((w >> 24) & 0xff);
  t3 = t1 ^ t2;

t3 is now an endian independent hash of a 4-byte value.  You can combine this however you want with internal state to maintain the endian independent property.

With an appropriate vector processing unit, you could also look at processing 4 columns of bytes as separate strings and then combining the 4 hashes at the end in a fashion similar to the above. 

Cheers, Chuck

Andres Valloud

unread,
Nov 29, 2007, 1:25:23 AM11/29/07
to hash-fu...@googlegroups.com
If you can assume the hash function will look at only at a few bytes the value of which tends to be different, you may want to use an S-box hash feeding a bit xor funnel.

Thanks,
Andres.
Reply all
Reply to author
Forward
0 new messages