On Mar 6, 4:28 am, Austin Appleby <tanj...@gmail.com> wrote:
> I'm using a dictionary of 470k English words & variants from SCOWL.
>
> Avalanche-wise it seems fine; the collisions are the weird bit -
> the expected number of collisions is around 26, and I'm seeing 50-60,
> which coincidentally is the same number I'm getting on a good ol'
> multiplicative hash, so I'm a bit wary.
>
> What sort of perf numbers are you seeing for the different versions?
> Also, what sort of keys are you testing against?
>
I have a SCOWL dictionary in my tests as well, but mine has 650k words.
Ignore the elapsed time; the app isn't written for speed, but I include
it anyway as a rough guide:

hash/input             elapsed  collisions  chisq     active bins  items
bj-lookup3/scowl-dict  469      60          0.953436  488640       658244
SHA1/scowl-dict        859      69          0.952829  488491       658244
murmur/scowl-dict      453      45          0.950456  488930       658244

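As a sanity check on the collision counts, here is a minimal Python sketch of the usual birthday-style estimates. It assumes an ideal hash with 32-bit output (so SHA1 would have to be truncated to 32 bits) and counts a collision as a pair of keys with the same hash value. The post states neither, and the function names are made up for illustration. The 2**20 bin count in the last call is likewise only an assumption; the post does not give the table size.

```python
import math


def expected_collisions(n_items, hash_bits=32):
    """Estimate colliding pairs for an ideal hash (birthday approximation)."""
    pairs = n_items * (n_items - 1) / 2
    return pairs / 2 ** hash_bits


def expected_active_bins(n_items, n_bins):
    """Expected number of non-empty bins if items land uniformly at random."""
    # P(a given bin stays empty) = (1 - 1/n_bins) ** n_items,
    # computed via log1p to avoid rounding error for large n_bins.
    p_empty = math.exp(n_items * math.log1p(-1.0 / n_bins))
    return n_bins * (1.0 - p_empty)


if __name__ == "__main__":
    # 470k keys: the dictionary size from the quoted message.
    print(round(expected_collisions(470_000), 1))
    # 658244 keys: the item count from the results above.
    print(round(expected_collisions(658_244), 1))
    # ASSUMPTION: a table of 2**20 bins; not stated anywhere in the post.
    print(round(expected_active_bins(658_244, 2 ** 20)))
```

Under those assumptions the estimate is about 26 collisions for 470k keys and about 50 for 658244 keys, so counts of 45 to 69 on the larger dictionary would not be far off what a random function gives.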
Aggregated across all of the input files I test, murmur is #1 with
prime-sized tables and is neck and neck with the Python hash (FNV
variant) with pow2-sized tables.
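For anyone unsure why the table size matters: a rough Python sketch of the two usual ways to turn a hash value into a bucket index. The function names are hypothetical and this is not the test app's code, just the general technique.

```python
def bucket_prime(hash_value, table_size):
    """Bucket index for a prime-sized table: all hash bits affect the result."""
    return hash_value % table_size


def bucket_pow2(hash_value, table_size):
    """Bucket index for a power-of-two table: only the low bits are kept."""
    # table_size must be a power of two for the mask to be valid.
    assert table_size > 0 and table_size & (table_size - 1) == 0
    return hash_value & (table_size - 1)


if __name__ == "__main__":
    h1 = 0x00010005
    h2 = 0x00020005  # differs from h1 only in the high bits
    # With a pow2 table both values land in the same bucket.
    print(bucket_pow2(h1, 1024), bucket_pow2(h2, 1024))
    # With a prime-sized table the high bits still take part.
    print(bucket_prime(h1, 1021), bucket_prime(h2, 1021))
```

With a pow2 table the high bits of the hash are simply thrown away, so a hash with weak low bits looks worse there than it does with a prime modulus.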
I just use Hsieh's app for the speed tests (not the greatest, I know).

1 GHz Tbird, Linux (so no SFHAsm):
  SuperFastHash : 2.5100s
  MurmurHash    : 2.7600s
  MurmurHash2   : 2.4800s

AMD64 3200+, WinXP:
  SFHAsm        : 0.9840s
  SuperFastHash : 1.5310s
  MurmurHash    : 1.0310s
  MurmurHash2   : 0.8750s