Hi all,
I've been trying out Redis Cluster to see how it works and performs for high data rate clients (possibly processing tens of thousands of events/sec).
I ran some tests where a number of concurrent clients performed SET and GET operations with random keys. The goal was to saturate the Redis Cluster, i.e. to stress at least one of the master nodes until it fully utilizes a CPU core. From these tests (see the details below) I found that the cluster can handle 30.000-45.000 ops/sec per master node (depending on the number of clients and the size of the cluster). Previously, with the standalone version, we could reach 100.000+ ops/sec with a single Redis instance (with pipelining we could even go way above 100.000 ops/sec).
Is this a reasonable figure? What could be the source of this more than 50% performance drop? Async replication to slaves and persistence to disk are not the source: switching them off does not change the numbers. An interesting observation though is that the CPU load of the master instances differs: although CRC16 spreads the operations equally among the masters, there are always one or two masters that saturate the CPU first, while the other masters still have room at 0.5-0.8 CPU load. Are there any inherent asymmetries in the cluster? Also, to reach the saturation point of 30.000-45.000 ops/sec per master, I need to start a large number of concurrent clients, and the client-side performance then drops from 8.000 to 2.000-3.000 ops/sec per client (i.e. response latencies go up from 0.13 ms to 0.5 ms).
Any ideas/comments about this?
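To make the "CRC16 spreads the operations equally" claim concrete, here is a minimal Python sketch of the key-to-slot mapping described in the Redis Cluster specification (CRC16/XMODEM of the key, modulo 16384, with hash-tag handling). The slot-to-master mapping in `master_for_slot` is an assumption (equal contiguous ranges), as are the `key:<random number>` key names; the real benchmark keys and slot assignment may differ.

```python
import binascii
import random
from collections import Counter

HASH_SLOTS = 16384  # fixed number of hash slots in Redis Cluster


def key_slot(key: bytes) -> int:
    """Return the hash slot for a key: CRC16 (XMODEM) mod 16384."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty {...} section is used as the hash tag.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    # crc_hqx with an initial value of 0 is the CRC16-CCITT/XMODEM variant.
    return binascii.crc_hqx(key, 0) % HASH_SLOTS


def master_for_slot(slot: int, masters: int) -> int:
    """ASSUMPTION: slots are split into equal contiguous ranges per master."""
    return slot * masters // HASH_SLOTS


def key_share_per_master(masters: int, n_keys: int, seed: int = 1) -> list[float]:
    """Fraction of randomly named keys that lands on each master."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_keys):
        key = b"key:%d" % rng.getrandbits(64)  # hypothetical key naming
        counts[master_for_slot(key_slot(key), masters)] += 1
    return [counts[m] / n_keys for m in range(masters)]


if __name__ == "__main__":
    for m, share in enumerate(key_share_per_master(masters=7, n_keys=70_000)):
        print(f"master {m}: {share:.3%} of keys")
```

If the shares come out close to 1/M under these assumptions, the uneven CPU load is unlikely to come from the key distribution itself, and it is worth comparing the actual slot ranges per master (CLUSTER NODES) and the per-node command counts instead.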
And finally a related question: do you know of any Redis Cluster clients that plan to support async and/or pipeline operations?
Br,
Peter
--------------------------------------------
BENCHMARK SETUP
--------------------------------------------
- M redis masters
- S redis slaves
- 2N client processes (N doing random writes, N doing random reads)
Methodology:
- all processes evenly spread across 7 physical servers (~64 GB memory each, 1 GbE network)
- no pipelining or async operations, since Redis Cluster clients do not support those
- to compensate for the client-side performance loss originating from round-trip times (0.13 ms between servers),
we keep starting more clients until at least one of the Redis instances fills up 1 CPU (we treat this point as the saturation point of the cluster)
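The reason for adding clients can be put into numbers: a client that issues one request at a time and waits for each reply cannot exceed 1/latency operations per second. The sketch below is only that arithmetic, under the assumption that each client has exactly one request in flight; the function names are made up for illustration.

```python
import math


def max_sync_ops_per_client(latency_ms: float) -> float:
    """Upper bound on ops/sec for a client with one request in flight."""
    return 1000.0 / latency_ms


def implied_latency_ms(ops_per_client: float) -> float:
    """Average per-request latency implied by a measured per-client rate."""
    return 1000.0 / ops_per_client


def clients_needed(target_ops_per_sec: float, latency_ms: float) -> int:
    """Minimum number of synchronous clients needed to offer a target load."""
    return math.ceil(target_ops_per_sec / max_sync_ops_per_client(latency_ms))


if __name__ == "__main__":
    # 0.13 ms round trip gives roughly the 8.000 ops/sec/client ceiling.
    print(f"{max_sync_ops_per_client(0.13):.0f} ops/sec at 0.13 ms")
    # 0.5 ms latency corresponds to about 2.000 ops/sec/client.
    print(f"{max_sync_ops_per_client(0.5):.0f} ops/sec at 0.5 ms")
    # Clients needed to offer 106.000 ops/sec at the bare network round trip.
    print(clients_needed(106_000, 0.13), "clients")
```

This is why the per-client rate falls as the servers get busier: any queueing on the Redis side adds directly to the latency each synchronous client sees.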
--------------------------------------------
TEST 1: STANDALONE
--------------------------------------------
- single instance Redis
- Figures at saturation point:
2N = 20 clients
106.000 op/s total
5.300 op/s/client
--------------------------------------------
TEST 2: SMALL CLUSTER
--------------------------------------------
- M=S=3
- Figures at saturation point:
2N = 42 clients
45.000 op/s/master
3.200 op/s/client
--------------------------------------------
TEST 3: MEDIUM CLUSTER
--------------------------------------------
- M=S=7
- Figures at saturation point:
2N = 84 clients
45.000 op/s/master
3.800 op/s/client
--------------------------------------------
TEST 4: MEDIUM CLUSTER NO REPLICATION
--------------------------------------------
- M=7
- S=0
- Figures at saturation point:
2N = 84 clients
45.000 op/s/master
3.800 op/s/client
--------------------------------------------
TEST 5: LARGE CLUSTER
--------------------------------------------
- M=S=14
- Figures at saturation point:
2N = 168 clients
32.000 op/s/master
2.700 op/s/client
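
As a rough cross-check of the figures above, the per-master rate should equal (clients x op/s/client) / masters. The short sketch below recomputes that from the reported numbers only; the `Run` class and its field names are just for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    name: str
    masters: int          # 1 for the standalone instance
    clients: int          # 2N
    ops_per_client: int   # op/s/client at the saturation point

    @property
    def total_ops(self) -> int:
        """Aggregate op/s offered by all clients."""
        return self.clients * self.ops_per_client

    @property
    def ops_per_master(self) -> float:
        """Aggregate op/s divided evenly over the masters."""
        return self.total_ops / self.masters


RUNS = [
    Run("TEST 1: STANDALONE", 1, 20, 5_300),
    Run("TEST 2: SMALL CLUSTER", 3, 42, 3_200),
    Run("TEST 3: MEDIUM CLUSTER", 7, 84, 3_800),
    Run("TEST 4: MEDIUM CLUSTER NO REPLICATION", 7, 84, 3_800),
    Run("TEST 5: LARGE CLUSTER", 14, 168, 2_700),
]

if __name__ == "__main__":
    for run in RUNS:
        print(f"{run.name}: total {run.total_ops} op/s, "
              f"{run.ops_per_master:.0f} op/s/master")
```

The recomputed per-master rates (about 44.800, 45.600, 45.600 and 32.400 op/s) agree with the reported 45.000 and 32.000, and they put the aggregate cluster throughput at roughly 134.000, 319.000 and 454.000 op/s for 3, 7 and 14 masters.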