Redis Cluster and Hashing

Vladut Paiu

Sep 30, 2011, 10:21:38 AM
to Redis DB
Hello all,

While implementing a C cluster client on top of the hiredis library, I
had to move the hashing function that distributes keys among Cluster
nodes inside my application.

While doing this, I was kind of surprised that Redis uses CRC16 to
hash keys between the Redis instances inside a cluster. Is there any
reason for choosing it, other than the obvious speed of the hash
calculation?
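
To make the question concrete, this is roughly the mapping I mean; a
minimal sketch, assuming the CRC16-CCITT (XMODEM) variant (polynomial
0x1021, initial value 0) and 4096 buckets. The function names are my
own, not part of the hiredis API:

    #include <stdint.h>
    #include <stddef.h>

    /* Bitwise CRC16-CCITT, XMODEM variant: poly 0x1021, init 0x0000.
     * Assumed here to be the variant Redis Cluster uses. */
    static uint16_t crc16(const char *buf, size_t len) {
        uint16_t crc = 0;
        for (size_t i = 0; i < len; i++) {
            crc ^= (uint16_t)((unsigned char)buf[i]) << 8;
            for (int bit = 0; bit < 8; bit++)
                crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                     : (uint16_t)(crc << 1);
        }
        return crc;
    }

    /* 4096 buckets: keep only the low 12 bits of the CRC16 output. */
    static unsigned key_to_bucket(const char *key, size_t len) {
        return crc16(key, len) & 0x0FFF;
    }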

From what I know and have read before, CRC in general performs quite
poorly when it comes to string keys. While looking for actual
benchmarks of hashing functions, I found [1], which describes the
tests they performed. They also analyzed CRC32; the results are at [2].
Another very interesting thing these tests showed is that taking the
low-order bits of the CRC32 output as the hash value turns out to be
very bad for the uniform distribution of text keys, while the
high-order bits of the CRC32 behave much better.

I'll try to take some time and do some kind of 'benchmark': set up a
cluster of X nodes, stress-test it with a 'smart' client, and check
whether the load is roughly the same on all the machines.

[1] http://bretm.home.comcast.net/~bretm/hash/5.html
[2] http://bretm.home.comcast.net/~bretm/hash/8.html

Regards,
Vlad

Salvatore Sanfilippo

Sep 30, 2011, 10:42:51 AM
to redi...@googlegroups.com
On Fri, Sep 30, 2011 at 4:21 PM, Vladut Paiu <vladu...@gmail.com> wrote:
> While doing this, I was kind of surprised that Redis uses CRC16 to
> hash keys between the Redis Instances inside a Cluster. Is there any
> reason for choosing it, other than the obvious speed of the hash
> calculation ?

Hello Vladut,

You are right. I did some experimental tests with the most obvious way
key names are generated, that is "prefix:<num>", and I don't like the
results too much.
In a 100-node cluster this is not going to be a problem, since most
buckets are balanced, but from time to time there are buckets that
have very, very few keys.
If one has a 4096-node cluster, or a node has the "wrong" buckets
assigned, then this is going to create problems. I did some tests
previously, but always simulating real clusters, never checking what
the distribution was in the actual 4096 buckets. A cluster of 100
nodes will mask the bias, but it shows when you look at the single
buckets.

I guess it is time to change it now. Ideas about what we should use?
It should be fast and *easy* to implement, as client implementations
will need to have it as well.
What we need is a hash function with good distribution *especially*
in the lower 12 bits.

Cheers,
Salvatore

--
Salvatore 'antirez' Sanfilippo
open source developer - VMware

http://invece.org
"We are what we repeatedly do. Excellence, therefore, is not an act,
but a habit." -- Aristotele

Salvatore Sanfilippo

Sep 30, 2011, 10:50:47 AM
to redi...@googlegroups.com
Note: sorry, maybe the tests I did were not accurate; I'm testing
more, as I get exactly the same bad distribution with the djb hash
function as well, which, while not great, should not be this bad.

Salvatore Sanfilippo

Sep 30, 2011, 11:00:00 AM
to redi...@googlegroups.com
OK, I was wrong:

actually, while the djb hash function has a very bad distribution,
crc16 is showing a very good distribution.

Actual test: 1 million keys from "key:0" to "key:999999".
The distribution is the following (I can't spam the list with too much
data, so I'll not provide all of it):

Expected elements per bucket: 244

What I got instead:

"best" bucket: 269 elements
"worst" bucket: 206 elements

I would say this is good, especially since this key generation is a
very good test: the prefix is always the same, and the keys differ
only in a few trailing bytes of the string.

Under the same test djb performs very, very poorly, with for instance
a best bucket containing 781 elements and a worst containing just 10.

Waiting for your tests as well, but I'm pretty confident we are well
served by crc16, especially since we use almost all of its output (12
of 16 bits); this can be good with small keys, compared to bigger hash
functions where we would take just the last 12 bits.
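
For anyone who wants to reproduce this without Ruby, here is more or
less the same test as a standalone C sketch (it assumes the XMODEM
variant of CRC16, as sketched earlier in the thread):

    #include <stdio.h>
    #include <stdint.h>
    #include <stddef.h>

    #define BUCKETS 4096
    #define KEYS    1000000

    /* Assumed CRC16 variant: XMODEM, poly 0x1021, init 0x0000. */
    static uint16_t crc16(const char *buf, size_t len) {
        uint16_t crc = 0;
        for (size_t i = 0; i < len; i++) {
            crc ^= (uint16_t)((unsigned char)buf[i]) << 8;
            for (int bit = 0; bit < 8; bit++)
                crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                     : (uint16_t)(crc << 1);
        }
        return crc;
    }

    int main(void) {
        static unsigned count[BUCKETS];  /* zero-initialized */
        char key[32];

        /* Hash "key:0" .. "key:999999" into 4096 buckets
         * using the low 12 bits of the CRC16 output. */
        for (int i = 0; i < KEYS; i++) {
            int len = snprintf(key, sizeof(key), "key:%d", i);
            count[crc16(key, (size_t)len) & 0x0FFF]++;
        }

        unsigned best = 0, worst = (unsigned)-1;
        for (int b = 0; b < BUCKETS; b++) {
            if (count[b] > best)  best  = count[b];
            if (count[b] < worst) worst = count[b];
        }
        printf("expected ~%d per bucket, best %u, worst %u\n",
               KEYS / BUCKETS, best, worst);
        return 0;
    }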

Salvatore

Sergei Tulentsev

Sep 30, 2011, 11:30:30 PM
to redi...@googlegroups.com
I am not very experienced in writing distributed clustered data structure servers, and this question puzzles me: why not some modern hash function (MD5 for instance)? :-)

--
Best regards,
Sergei Tulentsev

Sergei Tulentsev

Sep 30, 2011, 11:38:42 PM
to redi...@googlegroups.com
I also found this:
" Input bits directly determine the state of certain output bits and have no effect on the state of the other output bits. This simply reflects the fact that CRC32 wasn't designed for uniform distribution of output bits. Its hash value is manipulated in a very specific way designed to detect channel transmission errors. "

http://bretm.home.comcast.net/~bretm/hash/8.html

I guess this statement is also true for CRC16.

Joran Greef

Oct 1, 2011, 9:52:24 AM
to redi...@googlegroups.com
Are CRCs used for hash maps? This seems to be essentially the same
use case.

I think Bob Jenkins's one-at-a-time hash may be much faster than a
CRC, simpler, and may provide better distribution:
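
Here it is as commonly published, from memory (a sketch, not
benchmarked against Redis keys):

    #include <stdint.h>
    #include <stddef.h>

    /* Bob Jenkins's one-at-a-time hash, as commonly published. */
    uint32_t one_at_a_time(const unsigned char *key, size_t len) {
        uint32_t h = 0;
        for (size_t i = 0; i < len; i++) {
            h += key[i];
            h += h << 10;
            h ^= h >> 6;
        }
        h += h << 3;
        h ^= h >> 11;
        h += h << 15;
        return h;
    }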

Salvatore Sanfilippo

Oct 1, 2011, 10:15:59 AM
to redi...@googlegroups.com
On Sat, Oct 1, 2011 at 5:30 AM, Sergei Tulentsev
<sergei.t...@gmail.com> wrote:
> I am not very experienced in writing distributed clustered data structure
> servers, and this question puzzles me: why not some modern hash function
> (MD5 for instance)? :-)

Hi Sergei, the problem with MD5 is that it is too slow for this application.
Redis internally needs to perform many key -> slot conversions, so
using MD5 would waste resources: MD5 not only offers good distribution
(so good that it is statistically impossible to tell the difference
from true random data), it is also designed to be "one way", so that
it is computationally very hard to recover information about the input
from the output, and so forth.
Fortunately we don't need all that: we only need a decent distribution
in the last 12 bits of the output when hashing biased
strings (keys are often far from random, with a prefix that is always
fixed and with bytes mostly in the A-z 0-9 range).

So we can pick something much faster. More about CRC16 in the reply to
the other email from you :)

Salvatore Sanfilippo

Oct 1, 2011, 10:47:09 AM
to redi...@googlegroups.com
On Sat, Oct 1, 2011 at 5:38 AM, Sergei Tulentsev
<sergei.t...@gmail.com> wrote:
> I also found this:
> " Input bits directly determine the state of certain output bits and have no
> effect on the state of the other output bits. This simply reflects the fact
> that CRC32 wasn't designed for uniform distribution of output bits. Its hash
> value is manipulated in a very specific way designed to detect channel
> transmission errors. "

Exactly: CRC16 and CRC32 are designed to make sure that flipped bits
or a few different bytes in the input will *likely* result in
something different enough. A hash function, instead, should be
designed for good distribution, that is, different inputs should
produce any of the possible 65536 outputs (in the case of a 16-bit
output) with the same probability, even when hashing very related
inputs, that is, inputs that are not random (like our keys in the form
key:0, key:1 and so forth). (Note: this is not the only property of a
hash function; another is, for instance: how much does the output
change if I make a very small change in the input?)

Now what is interesting is that even if CRC16 is designed for a
different purpose, this does not mean that it completely fails at the
other task. Also note that we use a lot of the CRC16 output, and
always the *same* part of it, that is, the last 12 bits, so this is
not like using CRC16 to implement a hash table that can have different
sizes. We are interested in the specific distribution of the last 12
bits of the CRC16 output.

And well, this works remarkably well for some reason I'm not
completely sure I understand.

See for instance the output of the following Ruby program I
implemented for this purpose:

https://gist.github.com/1256122

You can switch the UseSHA1 constant to true or false in order to
switch from crc16 to SHA1; you'll see that for this use case the two
hash functions perform about the same.

You may think it performs so well because the sample size is 1,000,000
elements, but actually even if you use just 4096 as the sample size,
where you would expect roughly 0 to 2 elements per hash bucket, the
result is very similar to a perfect random distribution.

So I think CRC16 is fine for us.

From a "mathematical" standpoint I think that even if CRC16 is a weak
hash function, we are getting the best of it here: we get a 16-bit
output computed from a prefix that is always the same (so it is an
invariant and we can ignore it) plus an input that, even for small
datasets, is usually more than 3 bytes. This seems enough to guarantee
a good lower-12-bit distribution.
If some expert could confirm that, I'll be much happier ;)

Cheers,
Salvatore

John D. Mitchell

Oct 1, 2011, 11:36:25 AM
to redi...@googlegroups.com
Hi folks,

Just noticed this thread about hashing and your concerns about distribution.

For the Redis use cases, cryptographic hashes are slower and provide no value relative to a good non-crypto hash function.

Given your stated common case of <common prefix><number> keys,
I strongly suggest looking at the Larson hash function.
It's simple, fast, and has particularly good behavior in similar tests:
- http://www.strchr.com/hash_functions
(See in particular their test cases called "Numbers" and "Prefix".)
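
To make that concrete, the Larson hash as usually quoted is just a
multiply-and-add loop; a sketch, assuming the multiplier 101 shown on
that page:

    #include <stddef.h>

    /* Larson's multiplicative string hash: h = h * 101 + next byte. */
    static unsigned larson(const char *key, size_t len, unsigned seed) {
        unsigned h = seed;
        for (size_t i = 0; i < len; i++)
            h = h * 101 + (unsigned char)key[i];
        return h;
    }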

Go wild,
John

Tobias Petry

Oct 2, 2011, 4:49:08 AM
to Redis DB
It's a German article but you can use Google Translate:
http://www.ntecs.de/projects/guugelhupf/doc/html/x435.html
It seems "SDBM" would be an optimal fit.
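
For reference, sdbm as commonly published computes
h(i) = h(i-1) * 65599 + c, usually written with shifts; a sketch:

    #include <stddef.h>

    /* The sdbm hash: h = h * 65599 + c, with 65599 = 64 + 65536 - 1. */
    static unsigned long sdbm(const unsigned char *str, size_t len) {
        unsigned long h = 0;
        for (size_t i = 0; i < len; i++)
            h = (unsigned long)str[i] + (h << 6) + (h << 16) - h;
        return h;
    }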

Tobias

Salvatore Sanfilippo

Oct 2, 2011, 5:30:34 AM
to redi...@googlegroups.com
Hi John! I would say from that table that we are best served by CRC16.
CRC32 is one of the entries, and it does not show the pathological
behavior that Larson shows when hashing numbers. It is perfectly
possible for Redis keys to be binary numbers, as keys are binary safe.

Given the good results I obtained in my tests, and also the evidence
in this table that, contrary to common belief, CRC is not so bad for
hashing, I would stick with CRC16 unless evidence against it in our
use case (the lower 12 bits, for hashing keys into a fixed-size table)
is found.

Thanks for this email, very interesting link!
Salvatore

Salvatore Sanfilippo

Oct 2, 2011, 5:33:47 AM
to redi...@googlegroups.com
Oh also that's interesting, from that site:

"For table size less than 2^16, we can improve the quality of hash
function by XORing high and low words, so that more letters will be
taken into account"

This appears to support my hypothesis that, since we take 12 bits of
output from a 16-bit hash function, we are getting better results
compared to a 32-bit hash function where only 12 bits of the output
are used.
I guess we could similarly improve the result even more by XORing the
upper 4 bits into odd positions across the lower 12 bits of output,
but that seems to make the specification more complex without a good
reason...
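
For what it's worth, the simplest form of that idea would look like
the sketch below; hypothetical, not something I'm proposing for the
spec (it assumes the crc16() sketched earlier in the thread):

    #include <stdint.h>
    #include <stddef.h>

    uint16_t crc16(const char *buf, size_t len); /* as sketched earlier */

    /* Hypothetical fold: XOR the upper 4 bits of the 16-bit output
     * into the lower 12 before masking, so all 16 bits influence
     * the bucket. */
    static unsigned key_to_bucket_folded(const char *key, size_t len) {
        uint16_t h = crc16(key, len);
        return (h ^ (h >> 12)) & 0x0FFF;
    }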

Cheers,
Salvatore
