Evaluating jemalloc using 2.2-jemalloc branch

Didier Spezia

May 11, 2011, 5:51:21 PM
to Redis DB

Hi,

I'm trying to evaluate jemalloc using the 2.2-jemalloc branch
on a 64-bit Linux platform.

My first results were far below my expectations, so I
investigated a bit more.

I believe the Makefile is broken regarding tcmalloc
and jemalloc (especially jemalloc), and that the glibc
malloc is actually used even when compiling with
USE_JEMALLOC=yes

This is because -ljemalloc must be used to override the
glibc symbols, instead of inlining the whole library, and
this option must be placed at the end of the link command
line.
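
As a rough sketch of the ordering described above, the snippet below only assembles and prints a link command; it does not run the linker. The object file names and the extra libraries are hypothetical placeholders, not taken from the Redis Makefile or from the patch.

```shell
# Hypothetical inputs, for illustration only.
OBJS="redis.o zmalloc.o"
LIBS="-lm -lpthread"

# Print a link command with the allocator library as the last argument.
link_cmd() {
  echo "cc -o redis-server $OBJS $LIBS -l$1"
}

link_cmd jemalloc
```

The point is only the position of -ljemalloc: it comes after the objects and after every other library.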

Patch can be found here: https://gist.github.com/967358
After fixing the Makefile, the results are much better.

CPU consumption is mostly similar to glibc malloc, with
a slight overhead in system CPU which can be partially
offset by tweaking MALLOC_CONF.

Memory footprint is much better especially when plenty
of very small objects are created. Here is a dummy
example:

redis-cli -r 1000000 RPUSH dummy x >/dev/null

With jemalloc:
used_memory:64725984
used_memory_human:61.73M
used_memory_rss:66998272
mem_fragmentation_ratio:1.04
mem_allocator:jemalloc

With glibc malloc:
used_memory:80798832
used_memory_human:77.06M
used_memory_rss:113664000
mem_fragmentation_ratio:1.41
mem_allocator:libc

The gains come from the fact that the HAVE_MALLOC_SIZE
workaround is no longer necessary, and from the finer
granularity of jemalloc's allocation classes for small
objects.
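
The mem_fragmentation_ratio values in the two outputs above are consistent with dividing used_memory_rss by used_memory. A small sketch that redoes the arithmetic with the figures from the two runs (the helper name frag_ratio is made up for this example):

```shell
# Ratio of resident set size to used memory, rounded to two decimals.
frag_ratio() {
  awk -v rss="$1" -v used="$2" 'BEGIN { printf "%.2f\n", rss / used }'
}

frag_ratio 66998272 64725984     # jemalloc run
frag_ratio 113664000 80798832    # glibc malloc run

# Difference in used_memory per pushed element, in whole bytes.
echo $(( (80798832 - 64725984) / 1000000 ))
```

This prints 1.04, 1.41 and 16, so the jemalloc run uses roughly 16 bytes less used_memory per pushed element in this test.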

Another interesting consequence of using jemalloc is that
it tends to give more deallocated memory back to the system
than glibc malloc does. For instance, after a flushall,
a lot of memory is given back.

Memory footprint could be even better if the tiny class
were extended up to 32 bytes. Unfortunately, it stops at 8.
For small objects, the classes are:

Tiny: 8
Quantum-spaced: 16 32 48 ...

So 24 bytes is not an allocation class with jemalloc
(and I think there is no way to enforce it using
parameters). It is a pity because 24 bytes is precisely
the size of some critical Redis structures (dictionary
entries, list entries, etc ...). This is one point where
tcmalloc is better (all classes are spaced 8 bytes apart).
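
To make the 24-byte case concrete, here is a small sketch that rounds a request up to the classes listed above (8, then multiples of 16) and to a purely 8-byte-spaced scheme. It models only the small sizes discussed here, not the full class tables of either allocator, and the function names are invented for the example.

```shell
# Classes as listed in this message: 8, then 16, 32, 48, ...
jemalloc_small_class() {
  if [ "$1" -le 8 ]; then
    echo 8
  else
    echo $(( ($1 + 15) / 16 * 16 ))
  fi
}

# Classes spaced 8 bytes apart: 8, 16, 24, 32, ...
spaced8_class() {
  echo $(( ($1 + 7) / 8 * 8 ))
}

jemalloc_small_class 24   # prints 32
spaced8_class 24          # prints 24
```

A 24-byte structure therefore occupies 32 bytes under the first scheme, 8 bytes more per object than under the second.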

Long term memory fragmentation is difficult to evaluate
with my workload (it is low even with glibc malloc), but
this is supposed to be a strong point of jemalloc, so
I am confident it will improve the behavior of Redis
for most situations.

Finally, I think jemalloc support is a useful addition
to Redis. If you want to play with it, please consider
applying the Makefile patch, otherwise your results
will be meaningless.

Regards,
Didier.

Javier Guerra Giraldez

May 11, 2011, 6:48:24 PM
to redi...@googlegroups.com
On Wed, May 11, 2011 at 4:51 PM, Didier Spezia <didi...@gmail.com> wrote:
> I believe the Makefile is broken regarding tcmalloc
> and jemalloc (especially jemalloc), and that the glibc
> malloc is actually used even when compiling with
> USE_JEMALLOC=yes

very interesting!

can you describe the brokenness on tcmalloc? did you manage to improve
that case too?

--
Javier

Didier Spezia

May 12, 2011, 5:18:44 AM
to Redis DB
Hi,

with tcmalloc the situation is a bit different because:
1. it is linked using a shared library
2. Redis uses the tc_ prefixed flavors of memory management functions

With a normal installation of google perftools (i.e. tcmalloc), the
original makefile works fine. However, if you want to link tcmalloc
or tcmalloc_minimal statically to avoid the shared library
dependency, then it fails (fortunately, no binary is generated,
so the failure is obvious).

To compile with a static version of tcmalloc, the patched makefile
has to be used, and g++ must be used to link the binaries (perftools
being a C++ package).

To quickly check that your jemalloc- or tcmalloc-enabled Redis is
actually using the correct allocator, you can set the following
variable before launching Redis:

export TCMALLOC_SKIP_SBRK=true
(no variable needed for jemalloc)

then, create some objects in Redis, and use pmap:

pmap `pidof redis-server`
...
000000000063e000 1716K 1156K 1156K 1156K 0K rw-p [heap]
00007fb5c721a000 62176K 61952K 61952K 61952K 0K rw-p [anon]
...

When the glibc malloc is used, memory is allocated in the heap.
When jemalloc, or tcmalloc with TCMALLOC_SKIP_SBRK set, is used
instead, memory is mmapped from an anonymous zone, as shown above.
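
The same check can be scripted. The sketch below sums the second column for the [heap] and [anon] mappings, assuming the pmap column layout shown above (other pmap versions or options may print different columns). The helper name heap_vs_anon is made up, and the sample input is just the two lines quoted above.

```shell
# Sum the size column (second field, in K) per mapping type.
# awk converts "1716K" to 1716 when the field is used as a number.
heap_vs_anon() {
  awk '$NF == "[heap]" { heap += $2 + 0 }
       $NF == "[anon]" { anon += $2 + 0 }
       END { printf "heap=%dK anon=%dK\n", heap, anon }'
}

printf '%s\n' \
  '000000000063e000 1716K 1156K 1156K 1156K 0K rw-p [heap]' \
  '00007fb5c721a000 62176K 61952K 61952K 61952K 0K rw-p [anon]' | heap_vs_anon
```

For the sample lines this prints heap=1716K anon=62176K. A large anon total next to a small heap total suggests the allocator is not going through sbrk.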

The footprint of tcmalloc looks similar to jemalloc.

With perftools 1.7 built like this:
./configure --enable-minimal --disable-cpu-profiler \
--disable-heap-profiler --disable-heap-checker \
--disable-debugalloc

And using the same dummy command:
redis-cli -r 1000000 RPUSH dummy x >/dev/null

I got:

used_memory:64725984
used_memory_human:61.73M
used_memory_rss:67346432
mem_fragmentation_ratio:1.04
mem_allocator:tcmalloc

so, very similar to jemalloc.
I cannot comment on long term fragmentation.

Regarding CPU consumption, it is hard to say which one
is better - they are both excellent.

IMO, jemalloc seems to be a better fit than tcmalloc for Redis
since it is pure C, easily embeddable, and it does not have
the stack unwinding issue of tcmalloc on Linux/64. For more
information on this last point, please refer to
http://j.mp/lEpFKg

Regards,
Didier.

