decision of ziplist over zipmap

Fumin Wang

Apr 23, 2014, 6:53:29 AM

to redi...@googlegroups.com

We are currently optimizing our memory usage along the lines of

http://redis.io/topics/memory-optimization

and realized that the config option "hash-max-zipmap-entries" mentioned in that doc is outdated: currently both small ZSETs and small HASHes are encoded as ziplists.

Can anyone explain or point me to a discussion thread as to why this conversion to ziplists is made?

The best I could find related to this decision is this commit "encode small hashes with a ziplist" https://github.com/antirez/redis/commit/ebd85e9a455df689c9be02a93354f580df4cafd8 .

I searched the code and pull requests on github.com, and here on Google Groups, but to no avail.

By the way, I noticed that in https://github.com/antirez/redis/blob/unstable/redis.conf the default value for, say, "hash-max-ziplist-entries" is 512 instead of 1024. Is there a reason why the default is not 1024?

Michel Martens

Apr 23, 2014, 9:54:37 AM

to redi...@googlegroups.com

On 23 April 2014 07:53, Fumin Wang <awaw...@gmail.com> wrote:
> Can anyone explain or point me to a discussion thread as to why this
> conversion to ziplists is made?

Maybe this thread provides a good clarification:

https://groups.google.com/forum/?hl=en#!searchin/redis-db/Need$20some$20help$3A$20hash$20redis.conf$20settings/redis-db/1Mbd3V1ZA4g/JV_QyQ-NO-EJ

Fumin Wang

Apr 23, 2014, 10:43:17 AM

to redi...@googlegroups.com

Hi Michel

Yes, I understand that ziplists/zipmaps are the preferred data structures when the number of elements is small.

In fact, I've read both src/t_hash.c and src/t_zset.c and have a pretty clear idea of the implementation under the hood. For example, the "*-max-ziplist-entries" options are actually the names of variables in the `server` object, and "max-ziplist-value" is the maximum byte length of a value; a longer value triggers a conversion to a true hash table. The source code of Redis is highly readable, and we now have a firm understanding of why the ziplist approach results in a 3X improvement in memory usage in our experiments.
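For readers who have not looked at the source, the two thresholds described above can be modeled roughly as follows. This is only a sketch of the decision rule, not the Redis implementation: the `ZiplistLimits` class and `stays_ziplist` function are hypothetical names, the 512 default is the one cited earlier in this thread, and the 64-byte value limit and the strict "greater than" comparisons are assumptions.

```python
from dataclasses import dataclass


@dataclass
class ZiplistLimits:
    # Hypothetical container; the fields only mirror the redis.conf options.
    max_entries: int = 512  # hash-max-ziplist-entries (default cited in this thread)
    max_value: int = 64     # hash-max-ziplist-value in bytes (assumed default)


def stays_ziplist(items: dict, limits: ZiplistLimits = ZiplistLimits()) -> bool:
    """Return True if a hash with these byte-string fields and values
    would remain in the compact encoding under the modeled rule."""
    # Too many entries: the compact encoding is abandoned.
    if len(items) > limits.max_entries:
        return False
    # Any single field or value that is too long has the same effect.
    for field, value in items.items():
        if len(field) > limits.max_value or len(value) > limits.max_value:
            return False
    return True


small = {b"f%d" % i: b"v" for i in range(512)}
print(stays_ziplist(small))                  # at the entry limit
print(stays_ziplist({b"f": b"x" * 65}))      # one value over the byte limit
```

Note that the model only answers "would this hash fit"; as far as I know, once Redis has converted a hash to a real hash table it does not convert it back when entries are removed.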

Nevertheless, there remains the partly academic question: why the switch from zipmaps to ziplists around 2 years ago? From what I can see, zipmaps are actually not used anywhere except in zipmap.c/h and tests. I'm curious about the real world cases or profiling results that suggest the total deprecation of zipmaps.

Salvatore Sanfilippo

Apr 23, 2014, 10:48:21 AM

to Redis DB

On Wed, Apr 23, 2014 at 4:43 PM, Fumin Wang <awaw...@gmail.com> wrote:
> Nevertheless, there remains the partly academic question: why the switch
> from zipmaps to ziplists around 2 years ago? From what I can see, zipmaps
> are actually not used anywhere except in zipmap.c/h and tests. I'm curious
> about the real world cases or profiling results that suggest the total
> deprecation of zipmaps.

Hello,

the idea was that having two different encodings was not a good idea,
two times the bugs, and so forth. Also ziplists were more
space-efficient.
So if I remember correctly Pieter Noordhuis at some point rewrote the
code to use ziplists instead.
Only one format to deal with...

Salvatore

--
Salvatore 'antirez' Sanfilippo
open source developer - GoPivotal
http://invece.org

To "attack a straw man" is to create the illusion of having refuted a
proposition by replacing it with a superficially similar yet
unequivalent proposition (the "straw man"), and to refute it
— Wikipedia (Straw man page)

Michel Martens

Apr 23, 2014, 10:52:43 AM

to redi...@googlegroups.com

On 23 April 2014 11:43, Fumin Wang <awaw...@gmail.com> wrote:
>
> Nevertheless, there remains the partly academic question: why the switch
> from zipmaps to ziplists around 2 years ago? From what I can see, zipmaps
> are actually not used anywhere except in zipmap.c/h and tests. I'm curious
> about the real world cases or profiling results that suggest the total
> deprecation of zipmaps.

Sorry, I missed the nature of your question and replied with something else.

Fumin Wang

Apr 23, 2014, 12:22:23 PM

to redi...@googlegroups.com

Hi Michel and Salvatore

Got it. I guess having a single encoding to maintain is indeed a valid reason for refactoring.

By the way, in the meantime we also gained some additional insights about the ziplist optimization:

We discovered similar experiments conducted by Instagram, who reported 4X memory savings (70MB -> 16MB), and we were quite perplexed by our result of only 3X (1.47GB -> 505MB).

http://instagram-engineering.tumblr.com/post/12202313862/storing-hundreds-of-millions-of-simple-key-value-pairs

However, we eventually realized that this was because our key distribution was actually a bit sparse. Instead of every bucket being fully packed with 1000 members, some of our buckets had only 300 members, so more top-level keys were needed for the same total number of records to be hashed. Another way of understanding this is to consider the degenerate case in which every bucket stores only 1 member: the memory usage would then be identical to that of storing everything in a single HSET.
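The effect of sparse keys on bucket fill can be illustrated with a small sketch. It assumes numeric record ids that are bucketed by integer division, similar in spirit to the Instagram post; the `bucket:` key prefix and both helper functions are hypothetical and not taken from our actual code.

```python
from collections import Counter

BUCKET_SIZE = 1000  # maximum members per bucket, as in the experiments above


def bucket_and_field(record_id: int, bucket_size: int = BUCKET_SIZE):
    """Map a numeric id to a (top-level key, hash field) pair."""
    return f"bucket:{record_id // bucket_size}", str(record_id % bucket_size)


def bucket_stats(record_ids, bucket_size: int = BUCKET_SIZE):
    """Return (number of top-level keys, average members per bucket)."""
    fill = Counter(bucket_and_field(i, bucket_size)[0] for i in record_ids)
    return len(fill), sum(fill.values()) / len(fill)


dense = range(3000)            # contiguous ids: every bucket is full
sparse = range(0, 9000, 3)     # same number of records, ids spread 3x wider
degenerate = range(0, 3_000_000, 1000)  # one record per bucket

for name, ids in (("dense", dense), ("sparse", sparse), ("degenerate", degenerate)):
    keys, avg = bucket_stats(ids)
    print(f"{name}: {keys} top-level keys, {avg:.1f} members per bucket")
```

All three inputs hold 3000 records, but the number of top-level keys grows as the ids spread out, which is the overhead that reduced our savings.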
