xxHash vs fletcher's checksum

534 views
Skip to first unread message

Ralph Doncaster

unread,
Aug 12, 2014, 11:03:49 AM8/12/14
to lz...@googlegroups.com
I think fletcher's checksum (fletcher-32, fletcher-4, etc) could be even faster than xxHash.
http://collaboration.cmc.ec.gc.ca/science/rpn/biblio/ddj/Website/articles/DDJ/1992/9205/9205b/9205b.htm

Looking at the source this seems to be the core of xxHash
        h32 += XXH_get32bits(p) * PRIME32_3;
        h32  = XXH_rotl32(h32, 17) * PRIME32_4; Which decomposes to multiply, add, rotate, & multiply. Fletcher is basically add, add carry, add, add carry. Add is faster than multiply on many platforms and never slower. It's not as strong, but that could be more than made up for with a larger checksum. A 64-bit fletcher's checksum based on 2 32-bit sums would be at least as fast as xxHash 32, and probably much stronger. Fletcher4 is one of the checksum algorithms used in ZFS.

Yann Collet

unread,
Aug 12, 2014, 10:27:57 PM8/12/14
to lz...@googlegroups.com
If I do remember correctly,
when I tested fletcher algorithm, its speed was relatively good (although a bit slower than murmurhash)
but its quality was poor, if not very poor.
It ended with a Q.Score at or below 2, and its collision ratio was significantly higher (like 3x higher than average).

It's not too difficult to make a very fast algorithm.
What is difficult is to make one which is both fast and feature good distribution and "random-like" behavior.

If all you need is raw speed, you can test the many variants from SanMayce :
The author concentrated on speed, and as a consequence completed some very fast hashes.
They also badly fail on almost every other smHasher test,
but you could nonetheless make a case that the quality of these hashes is "good enough".
And that's most likely true, depending on usage scenario.

At the end of the day, you"ll need to make some tests to ensure your end-benefit.
A faster hash may nonetheless end up in a global algorithm slow down, due to increased collisions, or longer distinction comparisons.

xxHash has proven to be a performance benefit to several applications not just because of its raw speed,
but also because of its better "randomness" property, and better distribution.
Both aspects matter, and randomness typically matters even more than raw speed.
Reply all
Reply to author
Forward
0 new messages