ZSET Overhead for smaller ZSETs, Measured...

6 views
Skip to first unread message

Jeremy Zawodny

unread,
Nov 4, 2010, 11:50:31 PM11/4/10
to redi...@googlegroups.com
Earlier today I was chatting a bit with antirez on IRC about a feature I suggested (ordered lists[1]) and we got to talking about the overhead of ZSETs.  I had been measuring it and decided to publish the tool[2] I used as part of a collection that I'm calling redis-size-bench[3] (soon to contain more of my little measurement scripts).

Anyway, here's a brief description of the redis_set_ram script and results I found:

This tool is used to demonstrate the memory overhead of sort sets (ZSETs) in Redis.  It connects to the local Redis instance, calls FLUSHALL, then creates 100,000 simple key/value pairs (to simulate having other stuff in the keyspace) and then creates 10,000 ZSETs with a number of members (integers).  It then produces a line of stats for that run and moves on to the next one.

It currently does runs for the following set sizes: 1, 2, 10, 50, 100, 500, 1000, 5000.

Here's a sample run on my notebook (64bit Redis 2.1.5 (master from a few days ago)):

10000 zsets of 1 members, 10000 total in 69979720 bytes ram, 6997/member
10000 zsets of 2 members, 20000 total in 70834600 bytes ram, 3541/member
10000 zsets of 10 members, 100000 total in 79337864 bytes ram, 793/member
10000 zsets of 50 members, 500000 total in 116593304 bytes ram, 233/member
10000 zsets of 100 members, 1000000 total in 164377880 bytes ram, 164/member
10000 zsets of 500 members, 5000000 total in 536431064 bytes ram, 107/member
10000 zsets of 1000 members, 10000000 total in 1004059720 bytes ram, 100/member
10000 zsets of 2000 members, 20000000 total in 1939253688 bytes ram, 96/member

Clearly once a ZSET has over 100 members (probably 200 or so--need to measure more in that range), the overhead approaches 100 bytes per member.  But if you have a lot of small ZSETs (10-50 members), ther'e a clear penalty involved.  This led to a discussion of optimizing ZSETs in a few ways that could be very helpful to those of us who'd like to have millions of mostly small ZSETs in memory.

Jeremy

Jak Sprats

unread,
Nov 5, 2010, 7:49:51 AM11/5/10
to Redis DB
Hi Jeremy

great methodology in those tests and nice writeup.

One question, does it make sense to do these tests for other data
types
1.) LISTs are constant in terms of memory usage per added item
2.) Hashes use ziplist (Pieter has done lots of memory benchmarking on
them)
3.) SETs ... use the same data structure as ZSETs, w/o storing values

So that covers all the bases, nice tests.

- Jak
> [2]https://github.com/jzawodn/redis-size-bench/blob/master/redis_zset_ra...
> [3]https://github.com/jzawodn/redis-size-bench

Jeremy Zawodny

unread,
Nov 5, 2010, 10:49:36 AM11/5/10
to redi...@googlegroups.com
Jak,

Yes, I believe it does make sense and probably will be doing some of those very soon to aid my own capacity planning.  I'll post here when I have new results to share.

Jeremy

--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To post to this group, send email to redi...@googlegroups.com.
To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.


Jeremy Zawodny

unread,
Nov 5, 2010, 12:36:47 PM11/5/10
to redi...@googlegroups.com
Here's the same test using hashes instead of zsets:

10000 hashes of 1 pairs, 10000 total in 11438000 bytes ram, 1143/pair
10000 hashes of 2 pairs, 20000 total in 11526128 bytes ram, 576/pair
10000 hashes of 10 pairs, 100000 total in 11926128 bytes ram, 119/pair
10000 hashes of 50 pairs, 500000 total in 14726128 bytes ram, 29/pair
10000 hashes of 100 pairs, 1000000 total in 59696072 bytes ram, 59/pair
10000 hashes of 200 pairs, 2000000 total in 106839016 bytes ram, 53/pair
10000 hashes of 500 pairs, 5000000 total in 213986608 bytes ram, 42/pair
10000 hashes of 1000 pairs, 10000000 total in 414240048 bytes ram, 41/pair

Jeremy

Jak Sprats

unread,
Nov 5, 2010, 1:29:30 PM11/5/10
to Redis DB

This is great, you are finding sweet spots and spots to optimise at.

Other thoughts:
1.) string w/ varying lengths
2.) ZSET, LIST, HASH w/ strings of varying lengths

The memory optimisation debate is probably advanced by now, one very
simple thing I like to point out: at a certain number (maybe 2
entries, maybe 8) arrays are just as fast as anything else, and they
have minimal memory overhead. Tries are also real good w/ memory. I am
a big fan of hybrid data-structures, because elegant data-structures
(hashes, dicts, etc...) are wasteful on small data-sets.

Post memory improvements, we can use the same tests for delta
measurements.

On Nov 5, 9:36 am, Jeremy Zawodny <Jer...@Zawodny.com> wrote:
> Here's the same test using hashes instead of zsets:
>
> 10000 hashes of 1 pairs, 10000 total in 11438000 bytes ram, 1143/pair
> 10000 hashes of 2 pairs, 20000 total in 11526128 bytes ram, 576/pair
> 10000 hashes of 10 pairs, 100000 total in 11926128 bytes ram, 119/pair
> 10000 hashes of 50 pairs, 500000 total in 14726128 bytes ram, 29/pair
> 10000 hashes of 100 pairs, 1000000 total in 59696072 bytes ram, 59/pair
> 10000 hashes of 200 pairs, 2000000 total in 106839016 bytes ram, 53/pair
> 10000 hashes of 500 pairs, 5000000 total in 213986608 bytes ram, 42/pair
> 10000 hashes of 1000 pairs, 10000000 total in 414240048 bytes ram, 41/pair
>
> Jeremy
>
> On Fri, Nov 5, 2010 at 7:49 AM, Jeremy Zawodny <Jer...@zawodny.com> wrote:
> > Jak,
>
> > Yes, I believe it does make sense and probably will be doing some of those
> > very soon to aid my own capacity planning.  I'll post here when I have new
> > results to share.
>
> > Jeremy
>
> >> redis-db+u...@googlegroups.com<redis-db%2Bunsu...@googlegroups.com>
> >> .

Salvatore Sanfilippo

unread,
Nov 5, 2010, 4:51:36 PM11/5/10
to redi...@googlegroups.com
Just to add a line of text in this discussion: there is probably a
case for optimizing zsets as well as we did with hashes, lists and set
of small integers.

As going out with 2.2 RC1 is so urgent this will probably be delayed
til 2.4 that may even just include this change as a major change and
other small things, but it seems like it's a good direction.

But I think I and Pieter could start to make our first guesses to an
efficient/compact implementation of sorted sets.

Cheers,
Salvatore

> To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.


> For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.
>
>

--
Salvatore 'antirez' Sanfilippo
http://invece.org

"We are what we repeatedly do. Excellence, therefore, is not an act,
but a habit." -- Aristotele

Reply all
Reply to author
Forward
0 new messages