xxHash seed and collision

Aman JIANG

Aug 6, 2016, 11:20:55 AM

to LZ4c

"Seed can be used to alter the result predictably."

Assume two different sequences produce the same result, namely a collision.
Can we use another seed to avoid this collision?

Cyan

Aug 6, 2016, 11:23:00 AM

to LZ4c

Yes,

though the problem is: how will you know which seed was used to calculate the hash?

You'll need a predictable way to select the same seed on both sides.
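One way to read the point above, as a rough sketch rather than anything xxHash-specific: either both sides hard-code the same seed, or the seed travels with the hash. The example below uses BLAKE2s with the seed as salt purely as a stand-in seeded 32-bit hash, since it ships with Python; it is not xxHash. The names SHARED_SEED, seeded_hash32, make_record and verify are hypothetical.

```python
import hashlib
import struct

# Hypothetical constant that both sides agree on ahead of time.
SHARED_SEED = 0x9E3779B1


def seeded_hash32(data: bytes, seed: int) -> int:
    """Stand-in seeded 32-bit hash (BLAKE2s, seed used as salt). NOT xxHash."""
    salt = struct.pack("<I", seed & 0xFFFFFFFF)
    digest = hashlib.blake2s(data, digest_size=4, salt=salt).digest()
    return int.from_bytes(digest, "little")


def make_record(data: bytes, seed: int = SHARED_SEED) -> tuple:
    """Producer side: keep the seed next to the hash so it can be recomputed."""
    return (seed, seeded_hash32(data, seed))


def verify(data: bytes, record: tuple) -> bool:
    """Consumer side: recompute with the recorded seed and compare."""
    seed, expected = record
    return seeded_hash32(data, seed) == expected


if __name__ == "__main__":
    record = make_record(b"some payload")
    print(verify(b"some payload", record))
```

If the seed is a fixed convention, the record only needs the hash; storing the seed costs a few bytes but allows switching seeds later.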

Roger Pack

Aug 12, 2016, 2:43:34 PM

to LZ4c

Yes, though even with two seeds, the same pair of inputs could still collide under both, on the order of once in 2^64 (or whatever the combined width works out to).

And if you hash enough inputs, you'll still get other collisions.

Just calling it out :)
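To put rough numbers on "hash enough inputs and you'll still get collisions", here is a small sketch of the usual birthday-bound approximation. It assumes an ideal hash whose outputs are uniform and independent, which is an idealization and not a measured property of xxHash; the function names are made up for illustration.

```python
import math


def collision_probability(n_items: int, hash_bits: int) -> float:
    """Approximate chance that at least two of n_items distinct inputs
    share a hash value, for an ideal hash of hash_bits bits."""
    space = 2.0 ** hash_bits
    # 1 - exp(-n(n-1) / (2 * space)), written with expm1 for accuracy
    # when the probability is tiny.
    return -math.expm1(-n_items * (n_items - 1) / (2.0 * space))


def items_for_probability(p: float, hash_bits: int) -> float:
    """Approximate number of inputs needed to reach collision probability p."""
    space = 2.0 ** hash_bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))


if __name__ == "__main__":
    for bits in (32, 64):
        print(bits, "bits:", items_for_probability(0.5, bits), "inputs for a 50% chance")
    print("two inputs, 32 bits:", collision_probability(2, 32))
```

Under this model a 32-bit hash reaches an even chance of some collision at roughly 77 thousand inputs, and a 64-bit hash at roughly 5 billion. Changing the seed does not change these figures; it only changes which inputs collide.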

Nikolay Dimitrov

Mar 8, 2019, 12:57:52 AM

to LZ4c

Sorry for reviving an old thread.

My use case is hashing short strings (under 1 KB) and comparing each hash with previously computed hashes.

I was thinking of using the length of the string being hashed as the seed. My assumption is that a collision is less likely when the two strings have exactly the same length.

I'm curious what others think?

Cyan

Mar 8, 2019, 3:00:26 AM

to LZ4c

If the hash is "good quality", it shouldn't make any difference:

two 32-bit hashes of two different inputs are supposed to have a 1 in 2^32 chance of collision, irrespective of their size and seed.

Also, note that internally, the size of the input is already used as part of the formula.

You could say it's a kind of "implicit seed".
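For readers who want to see where the seed and the input size enter, below is an unofficial pure-Python sketch of the 32-bit variant, written from the published XXH32 description. Treat it as illustrative only and use the reference C library for real work; the name xxh32_sketch is made up here. The seed initializes the accumulators, and the line marked "implicit seed" is where the length is added.

```python
import struct

MASK32 = 0xFFFFFFFF
P1, P2, P3, P4, P5 = 0x9E3779B1, 0x85EBCA77, 0xC2B2AE3D, 0x27D4EB2F, 0x165667B1


def _rotl(x: int, r: int) -> int:
    """32-bit left rotation."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def _round(acc: int, lane: int) -> int:
    """Mix one 32-bit little-endian lane into an accumulator."""
    acc = (acc + lane * P2) & MASK32
    return (_rotl(acc, 13) * P1) & MASK32


def xxh32_sketch(data: bytes, seed: int = 0) -> int:
    n = len(data)
    i = 0
    if n >= 16:
        # The seed sets the starting state of the four accumulators.
        v = [(seed + P1 + P2) & MASK32, (seed + P2) & MASK32,
             seed & MASK32, (seed - P1) & MASK32]
        while i + 16 <= n:
            lanes = struct.unpack_from("<4I", data, i)
            v = [_round(acc, lane) for acc, lane in zip(v, lanes)]
            i += 16
        h = (_rotl(v[0], 1) + _rotl(v[1], 7)
             + _rotl(v[2], 12) + _rotl(v[3], 18)) & MASK32
    else:
        h = (seed + P5) & MASK32
    h = (h + n) & MASK32  # implicit seed: the input length is folded in here
    while i + 4 <= n:  # remaining 4-byte words
        (lane,) = struct.unpack_from("<I", data, i)
        h = (_rotl((h + lane * P3) & MASK32, 17) * P4) & MASK32
        i += 4
    while i < n:  # remaining single bytes
        h = (_rotl((h + data[i] * P5) & MASK32, 11) * P1) & MASK32
        i += 1
    # Final avalanche.
    h ^= h >> 15
    h = (h * P2) & MASK32
    h ^= h >> 13
    h = (h * P3) & MASK32
    h ^= h >> 16
    return h


if __name__ == "__main__":
    print(hex(xxh32_sketch(b"", 0)))
    print(hex(xxh32_sketch(b"", 1)))
```

Because the length is already mixed in before the tail bytes and the final avalanche, passing the length again as the seed adds no extra protection against collisions.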

Nikolay Dimitrov

Mar 8, 2019, 3:35:20 AM

to LZ4c

Thanks for the response, Yann!

That was very helpful and makes perfect sense.

