Probably an unreachable statement in hashlttle() and hashbig()

5 views
Skip to first unread message

Alex Vinokur

unread,
Apr 13, 2008, 8:26:27 AM4/13/08
to Hash Functions
Jenkins' function hashlttle() in http://burtleburtle.net/bob/c/lookup3.c
contains the following piece of the code.

It seems that this code holds an unreachable statement.
The similar code is in function hashbig() too.

=====================================
/*--------------- all but the last block: affect some 32 bits of
(a,b,c) */
while (length > 12)
{
a += k[0];
a += ((uint32_t)k[1])<<8;
a += ((uint32_t)k[2])<<16;
a += ((uint32_t)k[3])<<24;
b += k[4];
b += ((uint32_t)k[5])<<8;
b += ((uint32_t)k[6])<<16;
b += ((uint32_t)k[7])<<24;
c += k[8];
c += ((uint32_t)k[9])<<8;
c += ((uint32_t)k[10])<<16;
c += ((uint32_t)k[11])<<24;
mix(a,b,c);
length -= 12;
k += 12;
}

/*-------------------------------- last block: affect all 32 bits
of (c) */
switch(length) /* all the case statements fall
through */
{
case 12: c+=((uint32_t)k[11])<<24;
case 11: c+=((uint32_t)k[10])<<16;
case 10: c+=((uint32_t)k[9])<<8;
case 9 : c+=k[8];
case 8 : b+=((uint32_t)k[7])<<24;
case 7 : b+=((uint32_t)k[6])<<16;
case 6 : b+=((uint32_t)k[5])<<8;
case 5 : b+=k[4];
case 4 : a+=((uint32_t)k[3])<<24;
case 3 : a+=((uint32_t)k[2])<<16;
case 2 : a+=((uint32_t)k[1])<<8;
case 1 : a+=k[0];
break;
case 0 : return c; // <My comment - Alex Vinokur> It seems that
that case is unreachable, because of 'while (length > 12)' above

}
}

final(a,b,c);
return c;

=====================================

Alex Vinokur
email: alex DOT vinokur AT gmail DOT com
http://mathforum.org/library/view/10978.html
http://sourceforge.net/users/alexvn

bob_j...@burtleburtle.net

unread,
Apr 14, 2008, 1:22:29 PM4/14/08
to Hash Functions
It's true it won't be reached if the original length was 12, 24,
36, ... but it will be reached if the original length was 0.

On Apr 13, 5:26 am, Alex Vinokur <ale...@users.sourceforge.net> wrote:
> Jenkins' function hashlttle() inhttp://burtleburtle.net/bob/c/lookup3.c

Alex Vinokur

unread,
Apr 15, 2008, 12:43:00 AM4/15/08
to Hash Functions


On Apr 14, 8:22 pm, bob_jenk...@burtleburtle.net wrote:
> It's true it won't be reached if the original length was 12, 24,
> 36, ... but it will be reached if the original length was 0.

uint32_t hashlittle( const void *key, size_t length, uint32_t initval)

uint32_t hasValue = hashlittle (key, 0, initval) - hash value of an
empty array (?).
When/why can it be useful?

>
> On Apr 13, 5:26 am, Alex Vinokur <ale...@users.sourceforge.net> wrote:
>
>
>
> > Jenkins' function hashlttle() inhttp://burtleburtle.net/bob/c/lookup3.c
> > contains the following piece of the code.
>
> > It seems that this code holds an unreachable statement.
> > The similar code is in function hashbig() too.
>
[snipped]

bob_j...@burtleburtle.net

unread,
Apr 15, 2008, 11:31:29 AM4/15/08
to Hash Functions
On Apr 14, 9:43 pm, Alex Vinokur <ale...@users.sourceforge.net> wrote:
> On Apr 14, 8:22 pm, bob_jenk...@burtleburtle.net wrote:
>
> > It's true it won't be reached if the original length was 12, 24,
> > 36, ... but it will be reached if the original length was 0.
>
> uint32_t hashlittle( const void *key, size_t length, uint32_t initval)
>
> uint32_t hasValue = hashlittle (key, 0, initval) - hash value of an
> empty array (?).
> When/why can it be useful?

Suppose you have two database tables, each with a million rows with
three string columns each. To check if they contain the same rows
(but perhaps in different orders), sum the hashes of all the rows and
compare the sums. Hashing a single row is
h = hashlittle(col[0], len[0], seed);
h = hashlittle(col[1], len[1], h);
h = hashlittle(col[2], len[2], h);
If one table has a row ("","red","") and another has ("","","red"),
the tables should appear different. This requires the hash to accept
zero-length strings and do something other than a no-op to the
initval.

My impression is that zero-length keys are very common.

Andres Valloud

unread,
Apr 17, 2008, 11:29:34 AM4/17/08
to hash-fu...@googlegroups.com
In general, the issue is that the hash value of a null object should not be null.  This is particularly so in the sense of hash computation arithmetic, and even more particularly when the hash function can go through bytes without changing the resulting hash value (e.g.: multiplicative hash with initial value zero hashing a series of null bytes).  This does not apply to perfect hash functions though.

Andres.
Reply all
Reply to author
Forward
0 new messages