How fast is Redis on sorted sets?


Andy

Jan 18, 2012, 1:18:19 AM
to Redis DB
Hi,

I've seen benchmarks showing Redis running on commodity hardware
getting close to or exceeding 100K GET/SET operations per second.

I'm mainly interested in the sorted set feature of Redis.

How many ZINCRBY or ZRANGE operations per second can Redis perform on a
commodity server? How does that performance compare to MySQL's ORDER BY
using a secondary index?

Also, the documentation states: "A complex object in memory like a
list or a set is something 10 times bigger than the same object
serialized on disk." Does that mean a bunch of zsets in Redis
takes up 10 times more memory than an equivalent table (using a
secondary index for sorting) in MySQL?

Thanks.

Josiah Carlson

Jan 18, 2012, 2:51:32 AM
to redi...@googlegroups.com
On Tue, Jan 17, 2012 at 10:18 PM, Andy <selfor...@gmail.com> wrote:
> I've seen benchmarks showing Redis running on commodity hardware
> getting close to or exceeding 100K GET/SET operations per second.
>
> I'm mainly interested in the sorted set feature of Redis.
>
> How many ZINCRBY or ZRANGE operations per second can Redis perform on a
> commodity server? How does that performance compare to MySQL's ORDER BY
> using a secondary index?

Using 'redis-benchmark -q' on my $200 core-2 duo box:
PING (inline): 109890.11 requests per second
PING: 111111.11 requests per second
MSET (10 keys): 67114.09 requests per second
SET: 113636.37 requests per second
GET: 113636.37 requests per second
INCR: 112359.55 requests per second
LPUSH: 114942.53 requests per second
LPOP: 113636.37 requests per second
SADD: 111111.11 requests per second
SPOP: 112359.55 requests per second
LPUSH (again, in order to bench LRANGE): 113636.37 requests per second
LRANGE (first 100 elements): 66225.17 requests per second
LRANGE (first 300 elements): 32362.46 requests per second
LRANGE (first 450 elements): 23696.68 requests per second
LRANGE (first 600 elements): 18903.59 requests per second

I wrote a Python script (which uses hiredis via the standard Python
redis library) to benchmark ZINCRBY/ZRANGE and SET/LPUSH/LRANGE (for
comparison):
ZINCRBY: 71232.23 per second
ZRANGE (first 100): 23803.95 per second
ZRANGE (first 300): 9206.23 per second
ZRANGE (first 450): 5957.89 per second
ZRANGE (first 600): 4268.74 per second
SET: 92908.21 per second
LPUSH: 91704.67 per second
LRANGE (first 100): 23810.03 per second
LRANGE (first 300): 9200.13 per second
LRANGE (first 450): 6082.21 per second
LRANGE (first 600): 4317.51 per second

There seems to be a very slight performance penalty for using ZRANGE
over LRANGE, and a 25% penalty for using ZINCRBY over SET. The Python
numbers are significantly slower on the [Z|L]RANGE side, I suspect
because of response parsing.
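
For reference, a minimal sketch of the kind of loop behind these
numbers (this assumes redis-py with hiredis installed; commands are
sent in non-transactional pipelines, 1000 deep):

import time
import redis

r = redis.Redis()  # redis-py uses hiredis for parsing when installed
N = 100000

# non-transactional pipeline, flushed every 1000 commands
p = r.pipeline(transaction=False)
start = time.time()
for i in range(N):
    # execute_command takes raw protocol order (ZINCRBY key incr member),
    # which sidesteps argument-order differences between redis-py versions
    p.execute_command('ZINCRBY', 'zkey', 1, 'member:%d' % (i % 1000))
    if (i + 1) % 1000 == 0:
        p.execute()
print('ZINCRBY: %.2f per second' % (N / (time.time() - start)))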


As for the comparison to MySQL's ORDER BY using a secondary
index: if the index is already created, MySQL can get away with
perhaps a half-dozen random disk IOs or better for almost any volume
of data. That is roughly 60 ms on the upper end, though as quick as
under 1 ms if the proper regions of the disk are cached (or similar
times for SSDs and low memory). Assuming that your Redis zset has been
incrementally updated, Redis should also be able to respond in under
1 ms for any reasonable ZRANGE query. If you can fit your data in
memory for Redis, and you can force MySQL to keep its index in memory,
Redis vs. MySQL will probably be similar in performance. But what you
gain with Redis is a new way of thinking about your data and a new
tool to solve your problems.

What problem are you really trying to solve?

> Also in the documentation it states "A complex object in memory like a
> list or a set is something 10 times bigger than the same object
> serialized on disk." Does that also mean a bunch of zsets in Redis
> takes up 10 times more memory than an equivalent table (using
> secondary index for sorting) in MySQL?

The Redis documentation is primarily referring to the representation
in the dump file, which is compressed. You can think of it
(conceptually) like a gzipped MySQL binlog. Essentially useless by
itself, but you can feed it into MySQL and have a database that is
consistent up to the point the binlog was created. If you were to
create an in-memory table in MySQL, you would be able to compare
memory usage directly between the two.
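
If you want to measure the ratio on your own data, a quick sketch
(the dump path below is an assumption; use whatever your redis.conf
specifies):

import os
import redis

r = redis.Redis()
r.save()  # synchronous SAVE: returns only after dump.rdb is written
in_memory = r.info()['used_memory']  # bytes used by the live dataset
on_disk = os.path.getsize('/var/lib/redis/dump.rdb')  # assumed path
print('memory is %.1fx the dump size' % (float(in_memory) / on_disk))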

Regards,
- Josiah

Andy

Jan 18, 2012, 3:48:55 AM
to Redis DB
Thank you Josiah for the benchmark!

Can you run ZRANGE again, but fetch just the first 10 elements this
time? I just want to get a rough idea, as that closely mirrors what I
want to do. Really appreciate it.

Also, there's about a 3x performance difference between LRANGE using
redis-benchmark and LRANGE using your Python script. Is that because
redis-benchmark is written in C while your script is in Python?

> What problem are you really trying to solve?

2 things:

1) Q&A system - all answers to a specific question can be represented
as a sorted set, sorted by the score of each answer. User voting will
be used to determine the scores.

2) Tracking - every time a user visits a page, I need to keep track of
which category that page belongs to. So each user will have a sorted
set of (category, hits), where hits is the number of times that user
has visited a page belonging to "category".
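
(For concreteness, something like this with the standard Python redis
client; the key layout is made up:)

import redis

r = redis.Redis()

def track_visit(user_id, category):
    # one sorted set per user: member = category, score = visit count
    # (raw protocol order: ZINCRBY key increment member)
    r.execute_command('ZINCRBY', 'user:%s:cats' % user_id, 1, category)

def top_categories(user_id, n=10):
    # highest-scored categories first, with their hit counts
    return r.zrevrange('user:%s:cats' % user_id, 0, n - 1, withscores=True)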

I'm trying to decide between MySQL and Redis. Redis looks really good
so far, but I'm a bit concerned about memory requirements:
1) Assuming both are completely in-memory, how do Redis sorted sets
compare to an equivalent MySQL table in size? Is there any
significant difference?
2) In the case of MySQL, even if the table grows larger than RAM, I
could still use an SSD to maintain respectable performance. Is the
same true for Redis? Or is Redis pretty much unusable once the data
set grows larger than RAM, even if there's an SSD?

Thanks.

Andy

Jan 18, 2012, 3:53:56 AM
to Redis DB
Forgot to ask:

All your results are for Redis running on a single core, right?


ivan babrou

Jan 18, 2012, 4:22:31 AM
to redi...@googlegroups.com
Redis always runs on a single core; it's not multithreaded.


--
Regards, Ian Babrou
http://bobrik.name http://twitter.com/ibobrik skype:i.babrou

Salvatore Sanfilippo

Jan 18, 2012, 4:44:39 AM
to redi...@googlegroups.com
Hello,

A little-known feature of redis-benchmark, introduced recently by
Pieter Noordhuis, is that you can benchmark any command you want (it
is only available in the Redis unstable branch, but you can use
redis-benchmark from unstable to benchmark Redis 2.4.x):

$ ./redis-benchmark -q -n 100000 zadd sortedset 10 a
zadd sortedset 10 a: 152439.02 requests per second

To compare it with another command:

$ ./redis-benchmark -q -n 100000 set foo bar
set foo bar: 153374.23 requests per second

However, you could object that we are setting the same element again
and again. True... but there is a hidden feature of redis-benchmark
that allows you to randomize arguments:

$ ./redis-benchmark -q -r 100000 -n 100000 zadd sortedset 10
ele:rand:000000000000
zadd sortedset 10 ele:rand:000000000000: 104166.67 requests per second
$ redis-cli zcard sortedset
(integer) 63202

We got lower numbers with this more realistic test, but that is only
because the sorted set started small and had to grow to its final
size, resizing the internal hash table along the way. If we run the
test again, we'll see that updating an already existing sorted set is
almost as fast as a plain SET:

$ ./redis-benchmark -q -r 100000 -n 100000 zadd sortedset 10
ele:rand:000000000000
zadd sortedset 10 ele:rand:000000000000: 138888.89 requests per second

However, you may wonder: these are just around 100k elements. What
happens with millions of elements?

$ ./redis-benchmark -q -r 10000000 -n 10000000 zadd sortedset 10
ele:rand:000000000000
zadd sortedset 10 ele:rand:000000000000: 69898.79 requests per second
$ redis-cli zcard sortedset
(integer) 6345866

So creating a 6-million-element sorted set from scratch ran at 70k
insertions per second, but again this includes the overhead of
creating everything. Now let's check the case where all we do is
update scores:

zadd sortedset 10 ele:rand:000000000000: 132694.77

Now let's check incrementing the score of random elements:

zincrby sortedset 10 ele:rand:000000000000: 39511.39

Incrementing random elements is slower, at about 40k ops per second
(not sure why; profiling will probably reveal some trivial
optimization to perform).

What about ZRANK?

zrank sortedset ele:rand:000000000000: 70774.48

And finally, let's fetch a range from the middle of this 6-million-entry sorted set.

$ redis-cli zrange sortedset 3000000 3000010
1) "ele:rand:000005696312"
2) "ele:rand:000005696314"
3) "ele:rand:000005696315"
4) "ele:rand:000005696317"
5) "ele:rand:000005696318"
6) "ele:rand:000005696320"
7) "ele:rand:000005696321"
8) "ele:rand:000005696324"
9) "ele:rand:000005696326"
10) "ele:rand:000005696327"
11) "ele:rand:000005696328"
$ ./redis-benchmark -q -r 10000000 -n 10000000 zrange sortedset 3000000 3000010
zrange sortedset 3000000 3000010: 132718.59

132k/s to fetch the same 10 elements again and again from the middle.

Long story short: sorted sets are fast, and they stay fast even when
they contain millions of elements.

Cheers,
Salvatore


--
Salvatore 'antirez' Sanfilippo
open source developer - VMware

http://invece.org
"We are what we repeatedly do. Excellence, therefore, is not an act,
but a habit." -- Aristotle

Andy

Jan 18, 2012, 5:41:44 AM
to Redis DB
Hi,

Thanks for the data. That's very helpful.

1) All your results are comparable to (or even faster than)
Josiah's. But your result for ZINCRBY is a lot slower than his
(40K/s vs. 71K/s), even though you were using a C program (redis-
benchmark) while Josiah was using a Python script. Is that because you
were using an unstable version of Redis that has a performance
regression in ZINCRBY compared to the stable version?

2) Can you comment on whether SSD would be useful for Redis when the
data is bigger than RAM?

Thanks!

Tim Lossen

Jan 18, 2012, 6:27:59 AM
to redi...@googlegroups.com
> 2) Can you comment on whether SSD would be useful for Redis when the
> data is bigger than RAM?

sure, in this case the high IOPS (I/O operations per second) provided
by SSDs are great. we have been running redis diskstore in production
since last summer. our box has 24 gig ram plus 200 gig ssd storage
(4 disks in raid 10), and this setup works fine:

----------------
$ iostat
Linux 2.6.32-29-server (panama) 01/18/12 _x86_64_ (12 CPU)

avg-cpu: %user %nice %system %iowait %steal %idle
2.86 0.00 5.43 0.37 0.00 91.34

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 582.52 3239.27 3022.51 2416639950 2254927608
----------------

it also helps when you want to concurrently back up the diskstore
files to a second machine via rsync :)

cheers
tim

--
http://tim.lossen.de

Salvatore Sanfilippo

Jan 18, 2012, 6:31:58 AM
to redi...@googlegroups.com
On Wed, Jan 18, 2012 at 12:27 PM, Tim Lossen <t...@lossen.de> wrote:
> sure, in this case the high IOPS (i/o ops per second) provided
> by SSDs are great. we run redis diskstore in production since
> last summer. our box has 24 gig ram plus 200 gig ssd storage
> (4 disks in raid 10), and this setup works fine:

Thanks for sharing, but I feel the user could get confused:
diskstore, the technology Tim is using, is not part of any stable
Redis release, and there are currently no plans to include it.

So I guess the original poster plans to use the OS swap. An SSD may
help in this context, but Redis should be optimized to handle swapping
better (optionally using threads to serve queries, with key-level
locking). I have a new SSD that I'll use to run these tests in the
near future; once I have some data I'll surely share it with you.

As for normal persistence, an SSD helps, but with limited impact.
Because of the way Redis works, always writing data in append-only
fashion, normal rotating disks already work very well.
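
For reference, append-only persistence is controlled by the usual
redis.conf directives:

appendonly yes
# 'always' is safest, 'everysec' is the usual compromise, 'no' leaves
# flushing entirely to the OS
appendfsync everysec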

Cheers,
Salvatore

Josiah Carlson

Jan 18, 2012, 12:46:24 PM
to redi...@googlegroups.com
On Wed, Jan 18, 2012 at 2:41 AM, Andy <selfor...@gmail.com> wrote:
> Hi,
>
> Thanks for the data. That's very helpful.
>
> 1) All your results are comparable to (or even faster than)
> Josiah's. But your result for ZINCRBY is a lot slower than his
> (40K/s vs. 71K/s), even though you were using a C program (redis-
> benchmark) while Josiah was using a Python script. Is that because you
> were using an unstable version of Redis that has a performance
> regression in ZINCRBY compared to the stable version?

I use non-transactional pipelines, 1000 commands deep, on
pre-generated sequential keys/members. The keys I generate produce
sequential hashes, so there are effectively zero collisions in hash
buckets. Also, redis-benchmark (IIRC) doesn't use pipelining; it uses
50 concurrent connections to get its throughput. It's possible that
pipelines are able to utilize Redis better than the concurrent
connections in this scenario (especially because the responses are
uncomplicated and easy to parse).
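
To make the difference concrete, here is a sketch of the two client
patterns (keys are illustrative):

import redis

r = redis.Redis()

# Pattern A: one round trip per command; redis-benchmark approximates
# the aggregate effect with 50 concurrent connections doing this.
for i in range(1000):
    r.set('key:%d' % i, 'value')

# Pattern B: a non-transactional pipeline that sends 1000 commands in
# a single round trip and reads all the replies at once.
p = r.pipeline(transaction=False)
for i in range(1000):
    p.set('key:%d' % i, 'value')
p.execute()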

Regards,
- Josiah

Josiah Carlson

Jan 18, 2012, 1:07:03 PM
to redi...@googlegroups.com
On Wed, Jan 18, 2012 at 12:48 AM, Andy <selfor...@gmail.com> wrote:
> Thank you Josiah for the benchmark!
>
> Can you run ZRANGE again, but fetch just the first 10 elements this
> time? I just want to get a rough idea, as that closely mirrors what I
> want to do. Really appreciate it.

ZINCRBY: 70304.31 per second
ZRANGE (first 10): 66485.50 per second
ZRANGE (first 100): 24411.03 per second
ZRANGE (first 300): 9207.51 per second
ZRANGE (first 450): 5794.75 per second
ZRANGE (first 600): 4212.64 per second
SET: 92515.85 per second
LPUSH: 90825.36 per second
LRANGE (first 10): 78351.34 per second
LRANGE (first 100): 24739.61 per second
LRANGE (first 300): 9373.92 per second
LRANGE (first 450): 6053.34 per second
LRANGE (first 600): 4193.56 per second

> Also, there's about a 3x performance difference between LRANGE using
> redis-benchmark and LRANGE using your Python script. Is that because
> redis-benchmark is written in C while your script is in Python?

Yes. Parsing speed makes a huge difference.

>> What problem are you really trying to solve?
>
> 2 things:
>
> 1) Q&A system - all answers to a specific question can be represented
> as a sorted set, sorted by the score of each answer. User voting will
> be used to determine the scores.

This is exactly the kind of system that Redis does very well at.
Simple up/down voting works best for this, but if you need something
more complicated, you can store the up/down votes in an associated
hash, then update the score in the sorted set later. I've done the
complicated scoring thing in MySQL before, and it's not too bad.
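
Roughly, a sketch of that hash-plus-zset pattern in Python (key names
and the scoring formula are made up):

import redis

r = redis.Redis()

def vote(question_id, answer_id, up=True):
    vkey = 'votes:%s:%s' % (question_id, answer_id)
    r.hincrby(vkey, 'up' if up else 'down', 1)
    ups = int(r.hget(vkey, 'up') or 0)
    downs = int(r.hget(vkey, 'down') or 0)
    # net votes is the simplest score; plug in any formula here
    # (raw protocol order: ZADD key score member)
    r.execute_command('ZADD', 'answers:%s' % question_id,
                      ups - downs, answer_id)

def top_answers(question_id, n=10):
    return r.zrevrange('answers:%s' % question_id, 0, n - 1,
                       withscores=True)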

> 2) Tracking - every time a user visits a page, I need to keep track of
> which category that page belongs to. So each user will have a sorted
> set of (category, hits), where hits is the number of times that user
> has visited a page belonging to "category".

You may want to decay those scores over time, so that you can track
how user behavior changes. "ZINTERSTORE key 1 key WEIGHTS 0.5" will
halve the scores in place.
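
In redis-py terms (raw protocol order; the key name is illustrative):

import redis

r = redis.Redis()
# halve every score in the user's category zset, in place
r.execute_command('ZINTERSTORE', 'user:123:cats', 1, 'user:123:cats',
                  'WEIGHTS', 0.5)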

> All your results are for Redis running on a single core, right?

Redis is single-threaded. I don't do anything special to it, but the
Linux scheduler will keep it on a single core.

> I'm trying to decide between MySQL and Redis. Redis looks really good
> so far, but I'm a bit concerned about memory requirements:
> 1) Assuming both are completely in-memory, how do Redis sorted sets
> compare to an equivalent MySQL table in size? Is there any
> significant difference?

MySQL (like most databases) packs the data fairly efficiently for
on-disk storage. In-memory storage of the same data is essentially the
same size, and is cached at the page/large-chunk level. Redis will use
more memory than MySQL's on-disk or in-memory representation.

> 2) In the case of MySQL, even if the table grows larger than RAM, I
> could still use an SSD to maintain respectable performance. Is the
> same true for Redis? Or is Redis pretty much unusable once the data
> set grows larger than RAM, even if there's an SSD?

In the case of MySQL, if you structure your updates properly, it will
perform very respectably even on a spinning disk. SSDs will help both
of them, but I've not picked up an SSD for Redis testing (I've got a
Samsung SSD on my personal workstation, and it is amazing).

Regards,
- Josiah

Pieter Noordhuis

Jan 18, 2012, 4:32:03 PM
to redi...@googlegroups.com
Actually, it is not the raw parsing that makes the Python
implementation slower (since both implementations share the parser
code), but the fact that redis-benchmark merely walks over the
response, while the Python client builds actual arrays and string
objects. If you're looking at the performance a single client can get,
the Python benchmarks are more realistic. The redis-benchmark output
is more realistic for multiple, less intensive clients.

Cheers,
Pieter

Salvatore Sanfilippo

Jan 22, 2012, 10:49:05 AM
to redi...@googlegroups.com
On Wed, Jan 18, 2012 at 10:44 AM, Salvatore Sanfilippo
<ant...@gmail.com> wrote:

> zincrby sortedset 10 ele:rand:000000000000: 39511.39

A few emails ago we were wondering why zincrby was slower than zadd,
so today I used Instruments to profile the two commands.

The difference was due to (bad) benchmarking: with ZADD I was always
adding values with the same score they already had, and Redis has an
internal optimization that avoids touching the element at all if the
score is already equal to the new one. Once you start randomizing the
score of the ZADD operation, the performance is the same as ZINCRBY.

Further inspection of the profiling data shows that the additional
time when the score changes is mostly spent in zslDelete(), more
specifically in the line "while (x->level[i].forward &&", so these are
clearly cache misses.

Walking the sorted set is not needed when the score does not change,
since we obtain the score via the hash table lookup. This is also an
indirect hint that the hash table is not just O(1) on average,
compared to the skip list's O(log N), but also much better in terms of
constant factors. It suggests that we did not make a mistake in
keeping our incremental hash table implementation, rather than
switching to a tree-like structure, for the things that externally
expose a "Dictionary"-like interface.
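
To make the fast path concrete, here is a toy Python model of the
logic described above (a plain sorted list stands in for the real C
skip list):

class ToyZSet(object):
    def __init__(self):
        self.scores = {}   # member -> score: the hash table
        self.sorted = []   # (score, member) pairs: stand-in skip list

def zadd(z, member, new_score):
    old = z.scores.get(member)          # O(1) hash table lookup
    if old == new_score:
        return                          # fast path: sorted structure untouched
    if old is not None:
        z.sorted.remove((old, member))  # the costly step in real Redis
    z.sorted.append((new_score, member))
    z.sorted.sort()
    z.scores[member] = new_score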

Cheers,
Salvatore
