Tips for a fast, read-only Redis server

Aardvark Zebra

Jan 7, 2014, 2:54:17 PM1/7/14
to redi...@googlegroups.com
(I posted this using Google's web interface to the group, but it never showed up; so I offer my sincere apologies if this shows up twice)

I am trying to optimize my Redis server, which will be used only for key-value retrieval.
The data size is about 8GB; each key is about 40 bytes, each value about 16 bytes.
Keys and values are binary.

The machine is a quad Xeon with 24GB memory and GbE network connection.

I'm using the hiredis library on the clients. Each query is a fetch of about 500 keys at once.

The code is something like this (lightly simplified):
--------
        for (i = 0; i < NKEYS; i++) {
            redisAppendCommand(c, "GET %b", key[i], keylen[i]);
        }

        for (i = 0; i < NKEYS; i++) {
            if (redisGetReply(c, (void **)&replies[i]) == REDIS_OK) {
                /* use replies[i], then freeReplyObject(replies[i]) */
            }
        }
---------

I'm trying to squeeze the maximum QPS out of it. One thing I've not tried is sharding the DB into 8 Redis instances of ~2GB each (each running on its own port), using the same sharding algorithm on the client side to decide which port to hit.
Any other tips?

Thanks!

Josiah Carlson

Jan 8, 2014, 12:28:59 PM1/8/14
to redi...@googlegroups.com
My first suggestion would be to use the MGET command, which I've benchmarked as significantly faster than pipelined GET requests (2-10x, depending on the client). You can also pipeline MGET requests for the full effect.

Whether or not sharding will substantially help performance will depend on whether or not you can saturate your one Redis process. Do you know your required QPS? Expected growth?

One thing that may throw a bit of confusion into the mix: if you use plain strings with GET/SET, your 8 gigs of data may not fit in Redis on your 24 gig box - there is some overhead to store keys/values. If you are planning on sharding anyway, you can shard your strings into hashes (discussed at http://redis.io/topics/memory-optimization and other places). While you'd need to use pipelined HGET calls instead of pipelined GET or MGET calls, you would likely see memory use closer to 10-12 gigs, instead of 30+ gigs.

 - Josiah



--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to redis-db+u...@googlegroups.com.
To post to this group, send email to redi...@googlegroups.com.
Visit this group at http://groups.google.com/group/redis-db.
For more options, visit https://groups.google.com/groups/opt_out.

Arnaud GRANAL

Jan 8, 2014, 12:57:24 PM1/8/14
to redi...@googlegroups.com
On Wed, Jan 8, 2014 at 7:28 PM, Josiah Carlson <josiah....@gmail.com> wrote:
> My first suggestion would be to use the MGET command, which I benchmark to
> be significantly faster than pipelined GET requests (2-10x, depending on
> clients). You can also pipeline MGET requests for the full effect.
>
> Whether or not sharding will substantially help performance will depend on
> whether or not you can saturate your one Redis process. Do you know your
> required QPS? Expected growth?
>

I was in a similar situation, where I needed to fetch a lot of keys at
the same time.

Redis spends more time parsing and reading MGET commands / pipelined
GETs than executing them.

The solution is to avoid sending long strings through the socket; then
you get a very fast Redis:
total_commands_processed:2559986635
keyspace_hits:1340709160158

There are two ways to do this:
- implement it in C inside Redis, or
- implement it in Lua.

Let's say you have keys like "field2130", "field2131", ... "field2136"
and you want to SUM them; you can just write a simple loop:

long long total = 0;
robj *key = startKey;                /* holds the first key name, "field2130" */

for (i = 2130; i <= 2136; i++) {
    robj *o = lookupKeyRead(c->db, key);
    if (o != NULL) {
        total += (long)o->ptr;       /* only valid for integer-encoded values */
    }
    incrKeyName((char*)key->ptr);    /* <---- your function here: "field2130" -> "field2131" */
}
addReplyLongLong(c, total);

and voila.
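The Lua route needs no server patch: the same server-side sum can be sent once with EVAL (or loaded with SCRIPT LOAD and called via EVALSHA). A sketch, assuming the values are plain integers:

```lua
-- sum.lua: call with the key names in KEYS, e.g.
--   redis-cli EVAL "$(cat sum.lua)" 3 field2130 field2131 field2132
local total = 0
for i = 1, #KEYS do
  local v = redis.call('GET', KEYS[i])
  if v then total = total + tonumber(v) end
end
return total
```

Only the key names and a single integer cross the wire, instead of 500 values.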

Arnaud.

Aardvark Zebra

Jan 8, 2014, 6:55:26 PM1/8/14
to redi...@googlegroups.com
Thanks! I'm in the process of benchmarking multiple "GET"s vs. a single "MGET".
But first I wanted to know whether the code is correct, or whether I'm missing something.

Also, on the server, "netstat -nt" prints the following:
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State     
tcp    30307      0 XX.XX.XX.68:6379           XX.XX.XX.49:50591          ESTABLISHED
tcp    13384      0 XX.XX.XX.68:6379           XX.XX.XX.48:44679          ESTABLISHED
tcp    29434      0 XX.XX.XX.68:6379           XX.XX.XX.55:60014          ESTABLISHED
tcp    29967      0 XX.XX.XX.68:6379           XX.XX.XX.51:43848          ESTABLISHED
tcp    29397      0 XX.XX.XX.68:6379           XX.XX.XX.50:55911          ESTABLISHED
tcp    29556      0 XX.XX.XX.68:6379           XX.XX.XX.54:41364          ESTABLISHED

So it looks like the receive queue is filling up quicker than the server can consume it.
I have tuned TCP values to increase the buffer sizes.
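For reference, the kind of tuning meant here is the kernel's socket receive-buffer limits; the values below are only illustrative, not the ones actually used (/etc/sysctl.conf):

```
# max and default socket receive buffer, in bytes
net.core.rmem_max = 16777216
net.core.rmem_default = 262144
# min / default / max for TCP receive buffers
net.ipv4.tcp_rmem = 4096 262144 16777216
```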

Josiah Carlson

Jan 8, 2014, 7:12:40 PM1/8/14
to redi...@googlegroups.com
Increasing TCP's buffer sizes will increase the amount of data that your client can send to Redis and be buffered by TCP, but it won't necessarily improve Redis performance. In this case you have outstanding requests that are waiting for Redis to process. Making that backlog bigger won't make Redis process your requests faster.

 - Josiah

Aardvark Zebra

Jan 9, 2014, 1:34:56 AM1/9/14
to redi...@googlegroups.com
Time to report on some experiments.
Setup: one quad-Xeon server with 24GB memory and 1GbE, with around 8GB of data inserted into Redis 2.8.3.
Keys average around 46B, values around 10B.
Clients: 8 clients, of the exact same spec.

For this test, each client selects a random 10% subset of the keys that were loaded into Redis.
All clients start hitting the server at the same instant.
Each 'query' consists of 500 randomly selected keys. After the query returns, the values received are compared with the values stored, to make sure there are no errors.

Two techniques were used:
1. Create a pipelined list of "GET"s using redisAppendCommand(), or
2. Create a 500-element "MGET" using redisCommandArgv()

100K queries were performed. The average numbers are:
77 QPS for the pipelined "GET"s above;
201 QPS for the MGET.

Running 'top' on the server, the CPU utilization of the Redis server seems to be ~100% during the test.

Question: would it be correct to say that the only major gains now will come from sharding the Redis server, and starting multiple servers on this host to take advantage of the multiple CPUs?




Arnaud GRANAL

Jan 9, 2014, 1:50:27 AM1/9/14
to redi...@googlegroups.com
On Thu, Jan 9, 2014 at 8:34 AM, Aardvark Zebra <exm...@gmail.com> wrote:
> Question: would it be correct to say that the only major gains now will come
> from sharding the Redis server, and starting multiple servers on this host
> to take advantage of the multiple CPUs?
>

It depends. Do the keys you are trying to fetch follow some logic? Like:
colors = [red, blue, green, yellow] and prefix = 'mykey',

so the keys are mykeyred, mykeyblue, mykeygreen, mykeyyellow?

Arnaud.

Aardvark Zebra

Jan 9, 2014, 2:43:46 AM1/9/14
to redi...@googlegroups.com
No, they're not; but out of curiosity, what difference would it make if it were indeed the case?


Aardvark Zebra

Jan 9, 2014, 2:46:30 AM1/9/14
to redi...@googlegroups.com
Just to clarify: these numbers are per client; for the overall throughput
you would multiply by 8. Sorry about the confusion!

Arnaud GRANAL

Jan 9, 2014, 2:53:04 AM1/9/14
to redi...@googlegroups.com
Let's say your keys are about 50 bytes each: 500 keys = 500 *
50 = 25,000 bytes per query, which adds up quickly in recv-q (the
number of bytes not yet consumed by Redis).

If instead you can send a single command of 50 bytes, "get_all_my_keys",
the queue doesn't fill up (hence, many more commands per second).
Ideally, if the operation is processed inside Redis, it doesn't have
to return 500 values but only one or two.

Also, I would recommend benchmarking with real-world latency (e.g.
0.8ms) instead of locally.
If you don't have a second server, you can simulate the latency
using "tc" on Linux, though this might not be perfectly accurate
(because of interrupt handling, etc.).

For your second post: yes, Redis scales more or less linearly across
cores, so if you have 8 cores you can expect around 8x the performance
of a single instance.
I wouldn't put my hand in the fire on this, though; it's what I've
seen on my own machines, and your experience may vary :o)

Arnaud.

Arnaud GRANAL

Jan 9, 2014, 2:54:08 AM1/9/14
to redi...@googlegroups.com
On Thu, Jan 9, 2014 at 9:53 AM, Arnaud GRANAL <ser...@gmail.com> wrote:
> If instead, you can send a command of 50 bytes "get_all_my_keys", you

Mea culpa. This looks more like 16 bytes to me :)