Hi all --
First off, Redis is an amazing tool! We've used it fairly extensively
for many different things in our company, and have been very happy
with the
performance and results. So thanks for all of the effort and
development!
We recently switched our caching store from Memcached to Redis. The
primary reason we did this was to take advantage of Redis' VM feature
to allow some of the lesser-accessed values to be swapped to disk,
boosting the overall size of cache beyond what could be stored in
memory. We have found this to be true; indeed, we are able to store
many more values in our Redis-backed cache than we could in our
Memcached-only cluster, where everything must be stored in
memory. So on this front we are quite happy.
However, one thing we have noticed is that read performance out of the
cache can be quite erratic and unpredictable. Most of the read
requests finish very quickly (~1ms) but a large number of them finish
orders of magnitude more slowly (10ms - 1s). Even worse, there seem to
be times when many such requests will be slow, leading to degradation
in our site performance (see graph):
https://skitch.com/vincentchu/ri12f/gnuplot
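To quantify the tail, something like the following could summarize
sampled GET latencies into percentiles (a rough sketch; the redis-py
client, host name, and key list are assumptions on my part):

```python
def percentile(sorted_samples, p):
    """Return the p-th percentile (0-100) from a pre-sorted list."""
    idx = min(len(sorted_samples) - 1, int(len(sorted_samples) * p / 100.0))
    return sorted_samples[idx]

def summarize(latencies_ms):
    """Summarize a list of GET latencies (in ms) as p50 / p99 / max."""
    s = sorted(latencies_ms)
    return {"p50": percentile(s, 50),
            "p99": percentile(s, 99),
            "max": s[-1]}

# Sampling loop against a live instance (sketch, assuming redis-py):
# import time, random, redis
# r = redis.Redis(host="cache01", port=6379)
# latencies = []
# for key in random.sample(keys, 1000):
#     t0 = time.time()
#     r.get(key)
#     latencies.append((time.time() - t0) * 1000.0)
# print(summarize(latencies))
```

Running this periodically would show whether the slow requests
cluster in time (matching the graph) or are spread evenly.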
As background, we have a Redis cluster of two boxes, each with 16 GB
of RAM. Each box runs one Redis instance (v2.2.2) with the following
configuration file:
https://gist.github.com/877761
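For reference, the VM-related portion of the config looks roughly
like this (directive names are the standard 2.2 ones; the values
here are illustrative except vm-max-memory, which matches ours; the
full file is in the gist above):

```
vm-enabled yes
# swap file path is illustrative
vm-swap-file /var/lib/redis/redis.swap
# 3 GB of Redis memory before values start getting swapped out
vm-max-memory 3221225472
# page geometry / IO threads at the 2.2 defaults
vm-page-size 32
vm-pages 134217728
vm-max-threads 4
```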
We do an average of around 500 reads / second out of the cluster, with
peaks of around 800-900 reads / sec. Thus each Redis instance will see
around 400-500 reads / sec at most. We are using these Redis
instances as pure key => value stores, with simple string values of
around 1-4k in size. Each box has around 250 connected clients. One
thing I noticed is that once the Redis boxes began to fill with
keys, both boxes became very I/O bound: disk utilization shot up to
near 100% and has not gone down since. Moreover, memory usage has
climbed to the full 16 GB. I've even seen the boxes start to swap a
bit, though this has remained pretty stable. Our read patterns are
pretty long-tailed, so a subset of keys is accessed very heavily.
We've been using redis-tools, as recommended by
http://redis.io/topics/virtual-memory,
to watch how many keys are moving in and out of swap. Most of the
time there is no "swap-out" activity, but every once in a while I
see the following:
https://gist.github.com/877771
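Alongside redis-tools, the INFO command also reports vm_* counters
when VM is enabled, so something like this could poll them (a
sketch assuming redis-py; the exact field names depend on the Redis
version):

```python
def vm_stats(info):
    """Filter an INFO dict down to the VM-related fields.

    `info` is the dict returned by redis-py's Redis.info(); with
    vm-enabled, the server reports fields prefixed with "vm_"
    (exact names vary by version).
    """
    return {k: v for k, v in info.items() if k.startswith("vm_")}

# Usage against a live instance (sketch):
# import time, redis
# r = redis.Redis(host="cache01", port=6379)
# while True:
#     print(vm_stats(r.info()))
#     time.sleep(1)
```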
In any case, it would be great if I could get some hints on how
best to configure our Redis instances to maximize read rate and
minimize variability in read throughput:
* Basic question: Am I even using the right Redis configuration?
Data
durability / recovery would be nice, but it's not hugely important for
us since we just want to use Redis as a cache that can use disk to
boost the number of keys that it is able to store.
* Should we increase the size of the swap that Redis is allowed to
use? In our current setup, vm-max-memory is set to 3221225472 (3
GB).
If we do increase this, will this force Redis' aggregate memory usage
to exceed the 16 GB of RAM we have on each box, causing it to swap?
Redis is currently using all of the RAM on the box.
* Am I simply bumping into the maximum performance I should expect?
If so, would buying bigger boxes (more RAM) or adding more boxes to
the cluster be the prudent move?
* Compressing string values on the fly? One thing we've thought about
is compressing the strings on their way in/out of the cluster. This
would of course mean more work for our app servers, but this might
also lower the amount of I/O coming in and out of the disk, and allow
Redis to cache more in memory.
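On the compression idea, a thin wrapper around SET/GET using zlib
would look something like this (a sketch; the marker, size cutoff,
and compression level are choices I'm assuming, not anything Redis
requires):

```python
import zlib

MAGIC = b"Z1:"  # marker so uncompressed legacy values can still be read

def compress_value(raw):
    """Compress a string value on the way in; skip tiny payloads."""
    data = raw.encode("utf-8")
    if len(data) < 256:  # not worth compressing very small values
        return data
    return MAGIC + zlib.compress(data, 1)  # level 1: cheap on app CPU

def decompress_value(stored):
    """Invert compress_value on the way out."""
    # (Real code would also guard against raw values that happen to
    # start with the marker.)
    if stored.startswith(MAGIC):
        return zlib.decompress(stored[len(MAGIC):]).decode("utf-8")
    return stored.decode("utf-8")

# Usage (sketch, assuming redis-py):
# r.set(key, compress_value(value))
# value = decompress_value(r.get(key))
```

Since our values are 1-4k strings, even cheap compression should
shrink the pages Redis has to swap, at the cost of some app-server
CPU.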
Thanks again, and hats off to the Redis team for all of the hard work.
Any help or hints would be much appreciated!
Cheers,
Vince