Why is there a CLUSTERDOWN in case of a single master failure?


Tomas Kramar

Apr 14, 2015, 9:23:32 AM
to redi...@googlegroups.com
Hello,

we are evaluating Redis Cluster here and noticed that when one of the masters goes down, for a short period every request to any host returns CLUSTERDOWN. I've tried searching around, but the only thing I was able to find is that this behavior baffles a lot of people trying Redis Cluster. I get that it is the expected behavior, but I couldn't find any explanation as to why. I'd expect requests that map to the hash slots that are down to fail, while other hash slots continue to work.
Can you please point me in the right direction?

Thanks,
Tomas

Freire Zhu

Apr 14, 2015, 10:31:29 PM
to redi...@googlegroups.com
According to the explanation at http://redis.io/topics/cluster-spec#availability, the cluster becomes unavailable when a single master and all of its slaves go down. In other words, as soon as any sub-range of hash slots becomes unreachable, the whole cluster refuses to keep serving queries.
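From a client's point of view that looks roughly like the sketch below (assuming redis-py; the address, key, and exact error text are illustrative, not taken from the thread):

# Rough sketch of what a client sees while a hash-slot range is uncovered and
# cluster-require-full-coverage is left at its default of "yes".
# The host and key are made up; the error message may vary between versions.
import redis

r = redis.Redis(host="10.0.0.1", port=6379)
try:
    r.get("some-key")
except redis.exceptions.ResponseError as err:
    print(err)  # e.g. "CLUSTERDOWN The cluster is down"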

On Tuesday, April 14, 2015 at 9:23:32 PM UTC+8, Tomas Kramar wrote:

Jano Suchal

Apr 15, 2015, 2:25:58 AM
to redi...@googlegroups.com
Yes, but that would mean that if the probability of a node going down is P, then with N nodes the probability of the *entire* cluster being down is roughly N*P. So adding a node makes your cluster *less* resilient to node failures. With 99% single-node availability, a cluster of 100 nodes would be down most of the time. Is this by design?
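For reference, a back-of-the-envelope check of that intuition, assuming independent node failures, no replicas, and full coverage required (a Python sketch, not a statement about real deployments):

# Probability that every node is up at a given moment, assuming independent
# failures, no replicas, and cluster-require-full-coverage at its default.
def cluster_availability(node_availability, num_nodes):
    return node_availability ** num_nodes

print(cluster_availability(0.99, 100))  # ~0.366, i.e. unavailable roughly 63% of the time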

Salvatore Sanfilippo

Apr 15, 2015, 4:32:52 AM
to Redis DB
Hello, you can control this with a configuration option. From the
example redis.conf:

# By default Redis Cluster nodes stop accepting queries if they detect there
# is at least an hash slot uncovered (no available node is serving it).
# This way if the cluster is partially down (for example a range of hash slots
# are no longer covered) all the cluster becomes, eventually, unavailable.
# It automatically returns available as soon as all the slots are covered again.
#
# However sometimes you want the subset of the cluster which is working,
# to continue to accept queries for the part of the key space that is still
# covered. In order to do so, just set the cluster-require-full-coverage
# option to no.
#
# cluster-require-full-coverage yes
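
A minimal sketch of relaxing that option at runtime with redis-py, assuming cluster-require-full-coverage is accepted by CONFIG SET on your Redis version (otherwise set it in each node's redis.conf and restart); the node addresses below are made up:

# Relax full-coverage on every node so that still-covered hash slots keep
# serving while an uncovered range fails. Addresses are hypothetical.
import redis

nodes = [("10.0.0.1", 6379), ("10.0.0.2", 6379), ("10.0.0.3", 6379)]
for host, port in nodes:
    r = redis.Redis(host=host, port=port)
    r.config_set("cluster-require-full-coverage", "no")
    print(host, r.config_get("cluster-require-full-coverage"))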



--
Salvatore 'antirez' Sanfilippo
open source developer - Pivotal http://pivotal.io

"If a system is to have conceptual integrity, someone must control the
concepts."
— Fred Brooks, "The Mythical Man-Month", 1975.

Salvatore Sanfilippo

Apr 15, 2015, 4:46:18 AM
to Redis DB
Hello Jano, conceptually it is as you say, but the reality is a bit different:

1) For a cluster where each master has two replicas (just an example)
to be down, it is not enough for any three random nodes to be down:
they must be a master and its replicas. The larger N is, the less
likely it is that three randomly failing nodes are all related (a
master and its slaves).
2) P, the probability of a single node failing, is small compared to
the largest intended N for Redis Cluster, which is at most 1000 nodes.
3) Single node failures are (unlike partitions) unrelated events that
hardly ever happen at the same time. So even in a network with a very
high P (unreliable hardware), thanks to the Redis Cluster feature
known as "replicas migration", multiple failures can occur one after
the other and the additional slaves will migrate to the orphaned
masters, making the real-world availability much better than the
theoretical one, which is instead computed in terms of multiple
simultaneous failures.

Regarding point 3: if you have 100 masters with 1 slave each, and then
attach 10 additional slaves to a single random master, then as failures
keep happening and masters no longer backed by any slave appear as a
result, the additional slaves will migrate to those orphaned masters in
order to protect them.
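
To put rough numbers on point 1, here is a sketch under the same independence assumption as the earlier calculation, ignoring replica migration (which makes the real numbers better still):

# Probability that at least one shard (a master plus its replicas) is
# entirely down, assuming independent node failures. Illustrative only.
def p_any_shard_down(p_node_down, num_shards, replicas_per_master):
    p_shard_down = p_node_down ** (1 + replicas_per_master)
    return 1 - (1 - p_shard_down) ** num_shards

print(p_any_shard_down(0.01, 100, 0))  # no replicas: ~63% chance some slot range is uncovered
print(p_any_shard_down(0.01, 100, 1))  # one replica per master: ~1%
print(p_any_shard_down(0.01, 100, 2))  # two replicas per master: ~0.01%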

Cheers,
Salvatore

Jano Suchal

Apr 15, 2015, 5:32:16 PM
to redi...@googlegroups.com
This is it! Thanks!