xxHash seed and collision


Aman JIANG

Aug 6, 2016, 11:20:55 AM
to LZ4c
"Seed can be used to alter the result predictably."

Assume two different sequences produce the same result, i.e. a collision.
Can we use another seed to avoid this collision?

Cyan

Aug 6, 2016, 11:23:00 AM
to LZ4c
Yes,

though the problem is: how will you know which seed was used to calculate the hash?
You'll need a predictable way to select the same seed on both sides.
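
One way to picture "the same seed on both sides" is to either agree on a constant up front or ship the seed next to the hash. The sketch below is only an illustration: it does not call xxHash at all, but swaps in BLAKE2b from Python's standard library (truncated to 32 bits, with the seed passed as the salt) as a stand-in seeded hash. The names `SHARED_SEED`, `seeded_hash32`, `make_record` and `verify` are hypothetical.

```python
import hashlib
from typing import Tuple

# Hypothetical constant that both sides agree on ahead of time.
SHARED_SEED = 2016

def seeded_hash32(data: bytes, seed: int) -> int:
    """Stand-in seeded 32-bit hash (NOT xxHash): BLAKE2b with the seed as salt."""
    salt = seed.to_bytes(8, "little")
    digest = hashlib.blake2b(data, digest_size=4, salt=salt).digest()
    return int.from_bytes(digest, "little")

def make_record(data: bytes, seed: int = SHARED_SEED) -> Tuple[int, int]:
    """Producer side: keep the seed next to the hash so it can be recomputed later."""
    return seed, seeded_hash32(data, seed)

def verify(data: bytes, record: Tuple[int, int]) -> bool:
    """Consumer side: recompute the hash with the recorded seed and compare."""
    seed, expected = record
    return seeded_hash32(data, seed) == expected

record = make_record(b"hello")
print(verify(b"hello", record))
```

Storing the seed in the record costs a few bytes but lets the seed change later; a fixed shared constant costs nothing but can never be changed without rehashing everything.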

Roger Pack

Aug 12, 2016, 2:43:34 PM
to LZ4c
Yes, though even with two seeds they could still clash, about once every 2^64 times (or whatever the odds work out to).
And if you keep hashing long enough, you'll still get other collisions.
Just calling it out :)
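
To put rough numbers on "hash long enough", here is a small sketch using the standard birthday-bound approximation. It assumes an ideal hash whose outputs are uniformly distributed, which is an idealization and not a measured property of xxHash; the function name `collision_probability` is made up for this example.

```python
import math

def collision_probability(n_items: int, hash_bits: int) -> float:
    """Approximate chance that at least two of n_items distinct inputs
    share a hash value, for an ideal hash with hash_bits output bits."""
    pairs = n_items * (n_items - 1) / 2
    # 1 - exp(-pairs / 2^bits), written with expm1 to stay accurate for tiny values.
    return -math.expm1(-pairs / 2.0 ** hash_bits)

for n in (1_000, 77_163, 1_000_000):
    print(n, collision_probability(n, 32), collision_probability(n, 64))
```

Under this model a 32-bit hash reaches roughly a one-in-two chance of at least one collision at around 77,000 distinct inputs, while a 64-bit hash stays far below that for the same counts. Changing the seed moves which inputs collide, not how many.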

Nikolay Dimitrov

Mar 8, 2019, 12:57:52 AM
to LZ4c
Sorry for reviving an old thread.

My use case is hashing short strings (under 1 KB) and comparing each hash with previous hashes.
I was thinking of using the length of the hashed string as the seed. My assumption is that a collision is less likely when two strings have exactly the same length.

I'm curious what others think?

Cyan

Mar 8, 2019, 3:00:26 AM
to LZ4c
If the hash is "good quality", it shouldn't make any difference:
two 32-bit hashes of two different inputs are supposed to have exactly a 1 in 2^32 chance of colliding, irrespective of their size and seed.

Also, note that internally, the size of input is already used as part of the formula.
You could say it's a kind of "implicit seed".
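
For readers who want to see where the length enters, below is a pure-Python sketch of the 32-bit variant (XXH32). It is transcribed from memory of the public reference algorithm and is meant as an illustration only; check it against the official library before relying on it. The one value checked here is the empty input with seed 0, which gives 0x02CC5D05.

```python
P1, P2, P3, P4, P5 = 2654435761, 2246822519, 3266489917, 668265263, 374761393
MASK = 0xFFFFFFFF  # keep every intermediate value within 32 bits

def _rotl(x: int, r: int) -> int:
    return ((x << r) | (x >> (32 - r))) & MASK

def _round(acc: int, lane: int) -> int:
    acc = (acc + lane * P2) & MASK
    return (_rotl(acc, 13) * P1) & MASK

def xxh32(data: bytes, seed: int = 0) -> int:
    n, i = len(data), 0
    if n >= 16:
        # Four accumulators, each consuming one 4-byte lane per 16-byte stripe.
        v = [(seed + P1 + P2) & MASK, (seed + P2) & MASK,
             seed & MASK, (seed - P1) & MASK]
        while i + 16 <= n:
            for k in range(4):
                v[k] = _round(v[k], int.from_bytes(data[i:i + 4], "little"))
                i += 4
        h = (_rotl(v[0], 1) + _rotl(v[1], 7)
             + _rotl(v[2], 12) + _rotl(v[3], 18)) & MASK
    else:
        h = (seed + P5) & MASK
    h = (h + n) & MASK  # the input length is mixed in here: the "implicit seed"
    while i + 4 <= n:   # remaining whole 4-byte words
        h = (h + int.from_bytes(data[i:i + 4], "little") * P3) & MASK
        h = (_rotl(h, 17) * P4) & MASK
        i += 4
    while i < n:        # remaining single bytes
        h = (h + data[i] * P5) & MASK
        h = (_rotl(h, 11) * P1) & MASK
        i += 1
    # Final avalanche so every input bit can affect every output bit.
    h ^= h >> 15
    h = (h * P2) & MASK
    h ^= h >> 13
    h = (h * P3) & MASK
    h ^= h >> 16
    return h

print(hex(xxh32(b"")))
print(xxh32(b"abc", seed=0) != xxh32(b"abc", seed=1))
```

The seed and the length both feed the same 32-bit state before the tail bytes and the final avalanche, so using the length as the explicit seed adds nothing the formula does not already take into account.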

Nikolay Dimitrov

Mar 8, 2019, 3:35:20 AM
to LZ4c
Thanks for the response, Yann!

That was very helpful and makes perfect sense.