Hi All,
We have been using lettuce for a while now, and it has always worked like a charm, but as we try to scale Redis we are running into some issues.
Below is the scenario. Our setup:
- 70 master node cluster
- one slave per master, i.e. 70 masters + 70 slaves
- ~60 application servers which push data into the Redis nodes
The 60 application servers are expected to grow to 100.
However, lettuce does not detect cluster topology changes (e.g. a failover) on its own; we either have to refresh the topology explicitly by calling ClusterClient.reloadPartitions(), or start the background refresh job via ClusterClientOptions.
Since the cluster has a large number of nodes, any node may fail temporarily and a slave may be promoted to master, so we decided to run the built-in refresh job configured via ClusterClientOptions.
The job runs every 60 seconds, but with 140 total nodes the "cluster nodes" command takes around 30 ms (it is the slowest command we see; everything else completes in microseconds), whereas it took about 100 microseconds on a 30-node cluster.
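For reference, this is roughly how we enable the periodic job. This is a sketch against the lettuce 4.x API; package names and builder methods may differ in your version:

```java
import java.util.concurrent.TimeUnit;

import com.lambdaworks.redis.RedisURI;
import com.lambdaworks.redis.cluster.ClusterClientOptions;
import com.lambdaworks.redis.cluster.ClusterTopologyRefreshOptions;
import com.lambdaworks.redis.cluster.RedisClusterClient;

public class TopologyRefreshConfig {

    public static void main(String[] args) {
        RedisClusterClient clusterClient = RedisClusterClient.create(
                RedisURI.create("redis://some-seed-node:6379")); // placeholder host

        // Refresh the topology view every 60 seconds in the background.
        ClusterTopologyRefreshOptions refreshOptions = ClusterTopologyRefreshOptions.builder()
                .enablePeriodicRefresh(true)
                .refreshPeriod(60, TimeUnit.SECONDS)
                .build();

        clusterClient.setOptions(ClusterClientOptions.builder()
                .topologyRefreshOptions(refreshOptions)
                .build());
    }
}
```

As an aside, newer lettuce versions also offer `dynamicRefreshSources(false)` on the same builder, which limits topology queries to the seed nodes instead of all discovered nodes; if that is available in your version it might already mitigate part of what follows.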
Moreover, ClusterTopologyRefreshTask hits each of the 140 nodes on every refresh.
With all 60 servers hitting all 140 nodes every 60 seconds, each Redis node receives on average one "cluster nodes" request per second, and each request takes 30 ms.
That means each node spends about 3% of its time serving the "cluster nodes" command.
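The back-of-the-envelope math behind that 3% figure, with our numbers plugged in:

```java
public class RefreshOverhead {

    /**
     * Fraction of wall-clock time a single Redis node spends answering
     * CLUSTER NODES, given that every application server queries every
     * node once per refresh period.
     */
    static double overheadFraction(int appServers, double refreshPeriodSec, double cmdMs) {
        // Requests landing on each node per second: 60 servers / 60 s = 1.0
        double requestsPerNodePerSec = appServers / refreshPeriodSec;
        // Time spent per second of wall clock: 1.0 req/s * 30 ms = 30 ms/s = 3%
        return requestsPerNodePerSec * cmdMs / 1000.0;
    }

    public static void main(String[] args) {
        System.out.printf("overhead = %.1f%%%n", overheadFraction(60, 60.0, 30.0) * 100);
        // prints "overhead = 3.0%"
    }
}
```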
We are planning to double the size of the Redis cluster, and the number of application servers will also grow.
The former will make the "cluster nodes" command even slower, and the latter will push each node above one "cluster nodes" request per second.
The combined effect will definitely hurt Redis, considering "cluster nodes" is already around 1000 times slower than our other commands.
We dug into the lettuce code and found that when ClusterTopologyRefreshTask refreshes its partitions, it executes the "cluster nodes" command on all partitions (140 nodes in our case) every 60 seconds, using TimedAsyncCommand.
One solution we can think of: iterate over the set of RedisURIs, make a synchronous "cluster nodes" call to each in turn, and as soon as one call succeeds, use its result to refresh the partitions. Only one (random) Redis node would then be queried per refresh, which would solve our case.
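A rough sketch of the loop we have in mind. This is not lettuce API; `fetchClusterNodes` is a hypothetical helper standing in for "open a connection to one URI and issue CLUSTER NODES synchronously":

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

public class SingleNodeRefresh {

    /**
     * Try the known URIs in random order and return the first successful
     * CLUSTER NODES output, instead of querying every node in the cluster.
     */
    static String firstSuccessful(List<String> uris, Function<String, String> fetchClusterNodes) {
        List<String> shuffled = new ArrayList<>(uris);
        Collections.shuffle(shuffled); // spread the load across nodes over time
        for (String uri : shuffled) {
            try {
                return fetchClusterNodes.apply(uri); // one sync call to one node
            } catch (RuntimeException e) {
                // node down or timed out; fall through and try the next one
            }
        }
        throw new IllegalStateException("no node answered CLUSTER NODES");
    }
}
```

With this, a refresh costs one "cluster nodes" round trip per client instead of 140, and the random order plus fallback keeps it resilient to individual node failures.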
The above issue is only noticeable when the cluster is very large, because that is when the "cluster nodes" command becomes very slow.
Suggestions / possible solutions are welcome.
Thanks