Dear Redis community,
2014 has just started, and this was supposed to be the time when Redis
Cluster reached stability. While a stable release is still not planned
for the coming days, the process of taking Redis Cluster through
increasing stages of stability, to the point where we'll be confident
enough to call it a production-ready product, has started. This email
summarizes what is missing from Redis Cluster in order to be complete,
and the next steps that will happen in the coming weeks to reach
stability.
Note that "production-ready" will be a moving target, as usual, and
different users may start to use Redis Cluster at different times. For
this reason the roadmap that the Redis 3.0 release will follow is a
bit different compared to the one of previous releases.
This is indeed the first topic addressed by this email.
Usually the Redis roadmap for a new release is the following:
1) Freeze the development of new features unless they are considered
to have low impact on the rest of the code base.
2) Wait for the release to get more and more stable, publishing
Release Candidate versions of Redis.
3) When critical bugs start to be reported very rarely, and then for
weeks stop being reported at all, so that the next RC version provides
only minor fixes, I call the release stable.
This time, instead of development -> freeze -> RCs -> stable, we'll
add a new step, betas, so the process will look like this:
development -> freeze -> betas -> RCs -> stable.
In this way we'll ship betas as usable, upgradable (when the next
beta is available) releases of Redis as soon as possible, to favor
early adoption of Redis Cluster in environments where it has been
tested and considered stable enough for the task at hand.
The first beta will be version 2.9.51; the first RC, as usual, will
be 2.9.101.
In the rest of this email you'll find the details of what is
currently missing. However, before enumerating all the missing
features and boring you to death, I'll get to the point: when will
Redis Cluster be stable enough to be used in your production
environment?
This is a hard question, since software does not follow a
pre-scheduled plan to get stable ;-) But as explained above, there
will be different levels of stability available to you as soon as
possible.
To start, I'll release a new beta of Redis Cluster every month. This
is the plan:
10 Feb: beta-1, with all the missing points listed below implemented.
1 March: beta-2
1 April: beta-3 or RC1
After this point a new beta (or RC, based on feedback and bug
reports) will be released monthly, until no critical bug is reported
for a few weeks. At that point we'll call it 3.0.0. This is likely to
happen before June.
The really critical factor in the Cluster ETA is, IMHO, client
libraries. Redis Cluster itself needs some more work, but at this
point it is an incremental process that will move forward easily. The
client landscape, instead, is a bit lacking right now.
I suggest that organizations interested in Redis Cluster invest some
money or development time, in the form of donations to the
developer(s) of their client of choice, in order to speed up the
process.
Having Redis Cluster working well server side will not help without
good clients. While Redis clients tend not to be super complex, Redis
Cluster clients are a bit more articulated and require some testing
and care.
And now a list of issues that are work in progress:
Enhancement: read-only access to slave nodes.
One thing missing (but fortunately a few lines of code away) is
reading from slaves.
As detailed in the issue linked above, this will be fixed with an
additional READONLY command that tells the node that we no longer want
read-after-write consistency for this session, and are happy to
potentially read stale data.
The client will only be redirected if it issues a read for a hash
slot not served by the node, and in that specific case the redirection
message will not list just a single ip / port pair, but the master and
all the slaves for that hash slot.
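For reference, the way keys map to hash slots is what clients and
redirections are built on: a key is hashed with CRC16 (the CCITT /
XModem variant) and taken modulo 16384. A minimal sketch in Python
(note that the real rule also extracts an optional {...} hash tag
from the key before hashing, which is omitted here for brevity):

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16-CCITT (XModem): polynomial 0x1021, initial value 0,
    no reflection -- the variant used by Redis Cluster."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def hash_slot(key: bytes) -> int:
    """Map a key to one of the 16384 Redis Cluster hash slots."""
    return crc16_xmodem(key) % 16384

slot = hash_slot(b"foo")  # some value in 0..16383
```

A client library keeps a slots -> node map and uses this function to
pick the right node before sending a command, falling back to the
redirection protocol when its map is stale.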
Enhancement: cluster snapshots.
The ability to take snapshots of the whole cluster, in the form of RDB
files with attached information about the served hash slots, is
critical for safe operations.
This way it will be possible to restore the cluster at a later time if
needed, in any setup where there is at least the same number of
masters.
Enhancement: import tool.
This is another important concern for everybody migrating from a
different sharding setup (for example client-side sharding or
Twemproxy) to Redis Cluster.
The migration tool will take the addresses of the old Redis instances
and the address of the cluster, and will copy every key from the old
instances to the cluster via SCAN + MIGRATE.
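The shape of that copy loop can be sketched as follows. This is a toy
model that uses in-memory dicts in place of the source instance and
the cluster; the real tool would issue SCAN against each old instance
and MIGRATE for each returned key (MIGRATE, by default, also removes
the key from the source after transferring it):

```python
def scan(store: dict, cursor: int, count: int = 10):
    """Toy stand-in for Redis SCAN: returns (next_cursor, batch of
    keys). A returned cursor of 0 means the iteration is complete."""
    keys = sorted(store.keys())
    batch = keys[cursor:cursor + count]
    next_cursor = cursor + count
    if next_cursor >= len(keys):
        next_cursor = 0
    return next_cursor, batch

def migrate_all(source: dict, cluster: dict):
    """Copy every key from a source instance to the cluster, batch by
    batch, mirroring the SCAN + MIGRATE loop of the import tool."""
    cursor = 0
    while True:
        cursor, batch = scan(source, cursor)
        for key in batch:
            cluster[key] = source[key]  # the real tool issues MIGRATE here
        if cursor == 0:
            break
```

The cursor-based loop is what makes the approach safe on live
instances: SCAN guarantees a full walk of the keyspace without
blocking the server the way KEYS would.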
Redis-trib resharding / check / fix enhancements
Redis-trib currently is able to do things like checking the cluster,
and trying to fix it when inconsistencies are found. This must be
improved.
For example, when instances restart and find keys that mismatch the
assigned slots, the instances mark those slots as 'migrating'.
Redis-trib is already able to fix some cases of slots/keys
inconsistencies, but it should be able to deal with any possible
mess.
This is an example of what it is able to do already:
$ ./redis-trib.rb fix 127.0.0.1:7001
[snip of long output]
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
[WARNING] Node 127.0.0.1:7001 has slots in importing state.
[WARNING] Node 127.0.0.1:7003 has slots in migrating state.
[WARNING] The following slots are open: 16
>>> Fixing open slot 16
Set as migrating in: 127.0.0.1:7003
Set as importing in: 127.0.0.1:7001
Moving slot 16 from 127.0.0.1:7003
>>> Check slots coverage...
[OK] All 16384 slots covered.
Here I created this issue on purpose by blocking a resharding that was
in progress. redis-trib fix was able to fix the problem correctly in
this simple case, but it should deal with more complex problems as
well.
Failover user-tunable settings and other improvements
Redis Cluster failover is still not using replication offsets in
order to promote the replica that is likely to have diverged least
from the master (fewer chances of data loss during failures), and this
must be addressed.
Here the trick is simply to delay the failover attempt of every slave
except the one with the greatest replication offset by a few hundred
milliseconds. Note that the other slaves will still try to get
promoted, just later, so the failover happens anyway if the best
candidate is not available.
Another related issue is that currently a slave will not try to get
promoted at all if its data is older than NODE_TIMEOUT*10
milliseconds. This means that after a master fails, if all the slaves
are, for example, restarted, the cluster can't continue without human
intervention, since no slave is considered good enough to get
promoted. This should be changed into a user-configurable parameter,
so that the user can even specify "0" as the max disconnection time
from the master for a slave to still attempt promotion, favoring
availability over limiting divergence.
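One possible shape for that delay trick, as a sketch rather than the
actual implementation (the ranking rule is from the description above,
but the step of 500 ms is an illustrative constant):

```python
def failover_delays(offsets: dict, step_ms: int = 500) -> dict:
    """Given each slave's replication offset, assign a failover start
    delay: the most up-to-date slave starts immediately, and each
    further-behind slave waits an extra step_ms. This biases the
    election toward the slave with the least data loss, while keeping
    every slave as a fallback candidate if the best one is down."""
    ranked = sorted(offsets.items(), key=lambda kv: kv[1], reverse=True)
    return {name: rank * step_ms for rank, (name, _) in enumerate(ranked)}

delays = failover_delays({"slave-a": 1000, "slave-b": 5000, "slave-c": 4200})
```

In this example slave-b (offset 5000) gets delay 0, slave-c waits one
step, and slave-a, the most diverged, waits two.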
Enhancement: replicas migration.
This is a very simple but extremely powerful concept that I would love
to implement before the stable release, even if it is actually an
incremental improvement that could be provided later.
The idea is the following: in a master-slave setup, the actual
ability to survive failures while still preserving every hash slot is
limited by the number of replicas you have for a given hash slot.
So if you have just 1 slave for every master, and you are unlucky
enough that both the master and the slave serving the same hash slots
fail, the cluster will not be able to continue at all.
However in practice instances don't always fail at the same time, so
you may configure a Redis Cluster with 10 masters, giving a single
slave to each of the first 9 masters, but assigning 3 slaves to the
last master, number 10.
Now what happens if master number 1 or its slave fails? A promotion
will happen if needed, and the cluster will continue with just 1
instance left serving that set of hash slots.
However with "replicas migration" the slaves of master number 10 will
notice this, and one replica will migrate from master number 10 to
master number 1, providing more safety.
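The balancing rule described above can be sketched like this. This is
illustrative only: the real algorithm has to make all nodes agree on
which single replica moves, a coordination problem this toy ignores.

```python
def plan_replica_migration(replicas: dict):
    """replicas maps each master to the list of its slaves. If some
    master is orphaned (no slaves) while another has spares (2 or
    more), move one replica from the best-provisioned master to the
    orphaned one. Returns (moved_slave, donor, recipient) or None."""
    orphans = [m for m, s in replicas.items() if not s]
    donors = [m for m, s in replicas.items() if len(s) >= 2]
    if not orphans or not donors:
        return None
    donor = max(donors, key=lambda m: len(replicas[m]))
    slave = replicas[donor].pop()
    replicas[orphans[0]].append(slave)
    return (slave, donor, orphans[0])
```

With the 10-masters example above, after master 1 loses its only
slave, one of master 10's three slaves would be reassigned to master
1, so no master is left without a replica.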
Testing.
Needless to say, on top of all this we need more testing: I'll do a
lot of manual testing as I'm already doing, but also no stable release
of Redis Cluster will happen before we have automated tests for it.
Probably I'll use a different test infrastructure / code in order to
test Redis Cluster, and the normal test suite will not run the cluster
tests by default unless explicitly required.
However we need basic cluster testing for sure.
Here the idea is to avoid spawning instances on-demand like the
current test suite does, and instead to spawn N instances at startup
that are easily addressable by instance number from the testing code,
and to set up different scenarios to run different tests.
Ok, that's all for now. If you need more clarification, please reply
to this message and I'll do my best to address your concerns.
I understand Redis Cluster was announced too early, but the actual
amount of work that went into it was not big until recently... this
is why it took so much time.
However now that I play with it, I believe it will have a profound
impact on the Redis ecosystem, if we are able to provide a solid
implementation. I'll do my best to deliver something solid, and with
such a goal I can't rush too much; however, with the betas -> RCs ->
stable phases we'll try to offer different stability levels at
different times, suitable for different kinds of users and use cases.
Thanks for your patience and help,
Salvatore 'antirez' Sanfilippo
open source developer - GoPivotal
To "attack a straw man" is to create the illusion of having refuted a
proposition by replacing it with a superficially similar yet
unequivalent proposition (the "straw man"), and to refute it, without
ever having actually refuted the original position.
— Wikipedia (Straw man page)