I found a minor issue with the cluster tool (redis-trib.rb). For economic reasons we want to run more instances of redis than we have machines running redis. Each server runs, say, three instances of redis: a master and two slaves. The masters divvy up the 16384 hash slots evenly. Each master needs at least one replica, and it's very important that this replica exist on a server other than the one where the master runs, in case the whole server fails. But the cluster tool isn't aware of any mapping between masters, slaves, and servers. It appears the utility consistently assigns the first sockets in the list to the masters, but I can't determine how it assigns the slaves after that. So it's possible to avoid putting two or more masters on one server, but not to avoid putting two slaves of the same master on one server, or a slave on the same server as its master.
Of course none of this would matter if there were a one-to-one relationship between instances and servers. What's needed is some kind of affinity feature, or at least a way to tell the tool how masters and slaves should be allocated across servers. For smaller-scale redis deployments like mine it would be enough to know how slaves are assigned to masters based on the order of the sockets provided.
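To make concrete what I mean by an affinity feature, here's a sketch of the assignment policy I have in mind. This is NOT how redis-trib.rb actually works; the allocate function name and the host:port handling are my own, just to illustrate the rule "one master per host, replicas on hosts their shard doesn't already use":

```ruby
# A sketch of an anti-affinity assignment policy -- not redis-trib.rb's
# actual logic.  Instances are "host:port" strings; masters are spread
# round-robin across hosts, and each replica is placed on a host that
# holds neither its master nor another replica of the same master,
# when enough hosts are available.
def allocate(instances, replicas_per_master)
  by_host = instances.group_by { |i| i.split(':').first }
  hosts   = by_host.keys
  pools   = by_host.transform_values(&:dup)

  # Masters are taken round-robin across hosts, so no host gets two.
  masters_needed = instances.size / (replicas_per_master + 1)
  masters = (0...masters_needed).map { |n| pools[hosts[n % hosts.size]].shift }

  # Replicas come from hosts not yet used by this master's shard.
  masters.each_with_object({}) do |m, assignment|
    used = [m.split(':').first]
    assignment[m] = Array.new(replicas_per_master) do
      host = hosts.find { |h| !used.include?(h) && !pools[h].empty? } ||
             hosts.find { |h| !pools[h].empty? }  # forced onto a used host
      used << host
      pools[host].shift
    end
  end
end
```

With nine instances on three hosts and two replicas per master, a policy like this yields three shards whose master and replicas each land on three distinct hosts, which is exactly the placement I'm after.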
Below I've provided the output of redis-trib.rb to show the problem. With this cluster, the total loss of machine two (172.30.6.232) would mean losing the middle master plus both replicas of the first! Not a catastrophe, but not good either. With nine instances of redis on three servers, it should be possible to lose two servers and still have access to all shards.
If anyone can shed light on the matter I would appreciate it.
# /usr/local/bin/redis-trib.rb create \
--replicas 2 \
>>> Creating cluster
>>> Performing hash slots allocation on 9 nodes...
Using 3 masters:
slots:0-5460 (5461 slots) master
slots:5461-10922 (5462 slots) master
slots:10923-16383 (5461 slots) master
replicates 247e3ffb596ee3dcbd6da0ee517fbb675242fe76
replicates 247e3ffb596ee3dcbd6da0ee517fbb675242fe76
replicates d057060ec152d8e2eae42a8bb1fe829ca8db26c7
replicates 3b693e12a4e453d56636e7bef9be6a21a4445409
replicates d057060ec152d8e2eae42a8bb1fe829ca8db26c7
replicates 3b693e12a4e453d56636e7bef9be6a21a4445409
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join...
slots:0-5460 (5461 slots) master
slots:5461-10922 (5462 slots) master
slots:10923-16383 (5461 slots) master
slots: (0 slots) master
replicates 247e3ffb596ee3dcbd6da0ee517fbb675242fe76
slots: (0 slots) master
replicates 247e3ffb596ee3dcbd6da0ee517fbb675242fe76
slots: (0 slots) master
replicates d057060ec152d8e2eae42a8bb1fe829ca8db26c7
slots: (0 slots) master
replicates 3b693e12a4e453d56636e7bef9be6a21a4445409
slots: (0 slots) master
replicates d057060ec152d8e2eae42a8bb1fe829ca8db26c7
slots: (0 slots) master
replicates 3b693e12a4e453d56636e7bef9be6a21a4445409
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
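For what it's worth, the two-server-loss claim above is easy to check mechanically. A minimal sketch with hypothetical host names, comparing the ideal placement against a skewed one where both replicas of a shard share a host, as in the output above:

```ruby
# Returns true iff every shard still has a live instance after any two
# hosts fail simultaneously.  A shard is represented as the list of
# hosts holding its master and its replicas.
def survives_any_two_host_loss?(shards, hosts)
  hosts.combination(2).all? do |down|
    shards.all? { |placement| !(placement - down).empty? }
  end
end

hosts = %w[host1 host2 host3]

# Ideal: each shard's three instances sit on three distinct hosts.
ideal  = [%w[host1 host2 host3], %w[host2 host3 host1], %w[host3 host1 host2]]

# Skewed, like the log above: one shard's master on host1 with both
# replicas crowded onto host2 -- losing host1 and host2 together
# loses that shard entirely.
skewed = [%w[host1 host2 host2], %w[host2 host3 host1], %w[host3 host1 host2]]
```

The ideal layout passes the check; the skewed one fails it, which is the gap I'd like the tool (or an affinity option) to close.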