Hello, are you sure that this instance did not hold much more data
at some point?
This is the reason I added "peak memory" to INFO in the Redis unstable
branch: many of these reports are actually caused by instances that
were filled with more data and then partially emptied. The RSS does
not go down in such a case (and the fragmentation ratio shown is not
correct).
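A minimal sketch of why the ratio shown can mislead, assuming INFO
exposes the fields used_memory, used_memory_rss and used_memory_peak.
The sample values are invented for illustration and do not come from
the instance discussed in this thread. Dividing the RSS by the peak
instead of the current usage gives a figure closer to the real
fragmentation:

```python
def parse_info(text):
    """Parse 'key:value' lines from INFO output into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Memory".
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key] = value
    return fields


def fragmentation_ratios(info):
    """Return (RSS / current usage, RSS / peak usage)."""
    used = int(info["used_memory"])
    rss = int(info["used_memory_rss"])
    peak = int(info["used_memory_peak"])
    return rss / used, rss / peak


# Hypothetical INFO excerpt: the instance once held ~12 GB, now ~4 GB.
SAMPLE_INFO = """\
# Memory
used_memory:4000000000
used_memory_rss:12400000000
used_memory_peak:12000000000
"""

vs_current, vs_peak = fragmentation_ratios(parse_info(SAMPLE_INFO))
print(f"RSS / used_memory      = {vs_current:.2f}")
print(f"RSS / used_memory_peak = {vs_peak:.2f}")
```

With these made-up numbers the first ratio looks alarming while the
second stays close to 1, which is the situation described above.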
> 2. We didn't get to the point where Redis ran out of memory, but would
> it swap at that point, or is this RSS statistic a sort of 'upper
> bound' on how much memory is allocated to Redis?
The RSS in Redis is usually the maximum memory ever used multiplied by
the actual fragmentation ratio, which can be 1.3 or so.
So if you see a fragmentation of 3.0 because you used to have 12 GB of
data inside Redis and now have just 4 GB, adding 8 GB of data will
likely not change the RSS, as the same pages will be reused for the
new data.
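The arithmetic can be sketched as follows. This is a simplified model,
not how Redis computes anything: it assumes the RSS equals the peak
usage times a "true" fragmentation ratio, and the function names are
hypothetical. The true ratio is set to 1.0 only to keep the numbers
round.

```python
GB = 1024 ** 3


def estimated_rss(peak_bytes, true_ratio):
    """Simplified model: RSS follows the peak usage, not the current one."""
    return peak_bytes * true_ratio


def shown_ratio(rss_bytes, used_bytes):
    """Ratio obtained by dividing the RSS by the *current* usage."""
    return rss_bytes / used_bytes


TRUE_RATIO = 1.0  # assumption, chosen to keep the numbers round

# Before: the instance peaked at 12 GB and now holds 4 GB.
peak, used = 12 * GB, 4 * GB
rss_before = estimated_rss(peak, TRUE_RATIO)
ratio_before = shown_ratio(rss_before, used)

# After: 8 GB more data is added, reusing the already mapped pages.
used_after = used + 8 * GB
peak_after = max(peak, used_after)
rss_after = estimated_rss(peak_after, TRUE_RATIO)
ratio_after = shown_ratio(rss_after, used_after)

print(f"before: RSS = {rss_before / GB:.0f} GB, shown ratio = {ratio_before:.1f}")
print(f"after:  RSS = {rss_after / GB:.0f} GB, shown ratio = {ratio_after:.1f}")
```

Under this model the RSS stays the same while the ratio shown drops
from 3.0 to 1.0.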
Salvatore
--
Salvatore 'antirez' Sanfilippo
open source developer - VMware
http://invece.org
"We are what we repeatedly do. Excellence, therefore, is not an act,
but a habit." -- Aristotele
On Thu, Apr 28, 2011 at 1:41 AM, Mike K <mik...@instagram.com> wrote:
> http://d.pr/DoTm
More things that would certainly help:
- the output of CONFIG GET * on both the master and the slave.
- the full output of INFO. If you are using the unstable branch,
  please use "INFO ALL" instead.
Thanks!
Salvatore
Hello, no, sorry, the problem is not solved. I analyzed the data and at
first glance it does not make sense to me, for one reason: why does
the slave, which gets the same write pattern, not show the same
problem?
I updated lloogg.com, which has exactly the same lpush+ltrim workload,
to Redis unstable, which contains the ziplist implementation, but so
far there is no evidence of this problem. So for now I have no idea...
From the graphs that Mike provided it is also clear that the RSS was
growing linearly, so this really sounds like a fragmentation problem.
My only hypothesis is the following. Consider the ziplist lpush+ltrim
pattern. If you start with an empty DB, you often realloc to a bigger
ziplist block on lpush, so you end up with a lot of small freed
allocations that can't be reused when the ziplist reallocs. This goes
on until you reach the max number of elements in your lists.
So maybe the slave was not showing this problem because it was synced
with the master when the lists were already at max size?
Ziplists are different from all the other strings used inside Redis,
as they don't use the sds.c lib. Sds allocates in powers of two,
minimizing this kind of problem. So the fix would be to allocate
ziplists rounded up to the next power of two as well...
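To illustrate the proposed rounding, here is a small sketch. It is not
the Redis implementation: the header size, entry size and entry count
are arbitrary values picked for the example. It counts how many
distinct block sizes are requested while a list grows one entry at a
time, with exact sizing versus rounding up to the next power of two.

```python
HEADER_BYTES = 16  # hypothetical fixed overhead per list
ENTRY_BYTES = 20   # hypothetical size of one encoded entry


def next_power_of_two(n):
    """Smallest power of two that is >= n, for n >= 1."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return 1 << (n - 1).bit_length()


def block_sizes(max_entries, round_up):
    """Distinct block sizes requested while growing to max_entries."""
    sizes = []
    for count in range(1, max_entries + 1):
        needed = HEADER_BYTES + count * ENTRY_BYTES
        size = next_power_of_two(needed) if round_up else needed
        # Record a size only when it differs from the current block,
        # i.e. when a realloc to a different size would be needed.
        if not sizes or sizes[-1] != size:
            sizes.append(size)
    return sizes


exact = block_sizes(100, round_up=False)
rounded = block_sizes(100, round_up=True)
print(f"exact sizing:  {len(exact)} distinct block sizes")
print(f"power of two:  {len(rounded)} distinct block sizes: {rounded}")
```

With exact sizing every push asks for a slightly bigger block, leaving
behind freed blocks of many odd sizes. With rounding, freed blocks fall
into a handful of size classes that other lists can reuse, at the cost
of some unused space inside each block.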
I'll try an example script to verify if this is really the problem or
not and report back.
Salvatore
> --
> You received this message because you are subscribed to the Google Groups "Redis DB" group.
> To post to this group, send email to redi...@googlegroups.com.
> To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.
Hi Mike, we received past reports that tcmalloc (there is support for
it in 2.2.4) is better at fragmentation.
However, jemalloc support is trivial to backport to 2.2.x.
I would try jemalloc: if it works well for you, chances are (big) that
it will get integrated into the Redis source code and used when the
target is Linux... So if you agree, we are going to backport the
jemalloc support into the 2.2.x branch and release it as 2.2.7
tomorrow.
Thank you for assisting us with this issue, this is very much
appreciated.
If, time permitting, sooner or later you can send a description of the
workload of this instance, so that we can try to reproduce this
problem even in a long running (weeks) instance, that would be cool.
What would help most is the exact kind of write operations, like
LPUSH/RPUSH/LINSERT/LREM/..., and the average list size and list
element size.
If we can also get the output of
https://github.com/antirez/redis-sampler running against the slave,
that would be great.
Thank you!
Salvatore
Hello Mike, thank you for all the information provided!
The 2.2-jemalloc branch is online:
http://github.com/antirez/redis/tree/2.2-jemalloc
To compile just use:
make USE_JEMALLOC=yes
It will also build jemalloc itself (shipped together with the source
code of Redis) and link with it.
Everything seems fine and all tests are passing, but it is indeed
better to test all this on a slave as a first step.
Cheers,
Hello Mike, did you try my branch unaltered, or after applying the
patch proposed by Didier Spezia?
Unfortunately, because of a Makefile problem, my branch is *not* using
jemalloc at all. There was a topic about this issue here on the
mailing list, but I made the mistake of not informing you directly.
If you tried my branch, I'll update it today to fix this problem, and
I ask you to retry. In that case, sorry for the time you wasted with
the broken branch.
I can confirm there was a problem in the old build. I just applied the
patch provided by Didier to the 2.2-jemalloc branch, so now this
should be the safe one. Build it with:
make USE_JEMALLOC=yes
Apparently this is a huge improvement when I try to load big datasets,
but I can't test fragmentation unfortunately.
Hope you'll get good results with this.
Salvatore
Currently you can't make a 32 bit build using jemalloc, unfortunately,
at least not out of the box.
You may try to compile jemalloc under /deps with a 32 bit target by
tweaking the makefile, and then build Redis.
Another solution for these fragmentation issues is to turn zmalloc()
into a slab allocator, but this means a much bigger memory footprint
for the same data set... IMHO it is better to first try whether
jemalloc can solve our issues.
Salvatore
Can you please describe your workload? Is it also related to
LPUSH/LTRIM? Thanks.
Also I think that 53MB -> 58MB can be ok as long as the fragmentation
stops growing at some point.
What is your fragmentation ratio currently? As long as it is <= 1.4 it
makes sense.
Thank you for the interesting information.
So far both look good. Up to 1.3 / 1.4 fragmentation is ok as long as
it does not tend to increase monotonically, but stops when reaching
such a value.
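The rule of thumb above could be checked on periodic samples of the
fragmentation ratio with something like the following sketch. The
function name, the window of three samples and the 0.01 tolerance are
assumptions made for the example, not values taken from Redis or from
this thread; only the 1.4 limit comes from the text.

```python
def fragmentation_ok(samples, limit=1.4, window=3, tolerance=0.01):
    """True if the latest ratio is within the limit and no longer rising.

    "Rising" here means that each of the last `window` steps increased
    the ratio by more than `tolerance`.
    """
    if len(samples) < window + 1:
        raise ValueError("need at least window + 1 samples")
    recent = samples[-(window + 1):]
    rising = all(b - a > tolerance for a, b in zip(recent, recent[1:]))
    return samples[-1] <= limit and not rising


# Hypothetical series: one that levels off, one that keeps climbing.
levelled_off = [1.05, 1.15, 1.25, 1.30, 1.31, 1.31, 1.30]
still_climbing = [1.0, 1.1, 1.2, 1.3, 1.4]

print(fragmentation_ok(levelled_off))
print(fragmentation_ok(still_climbing))
```

The second series is rejected even though its last value is still
within the limit, because it shows no sign of stopping.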
I'm working right now on a modification of zmalloc.c that should
prevent fragmentation problems without requiring a non standard
allocator, but it is just an experiment. We'll see if it works... I
hope to push the branch today.
Cheers,
Salvatore
Hello Mike, that makes a lot of sense. Trimming at every push makes
the fluctuation in ziplist length very small: just the difference
between the list without the old element and the list with the new
one. When you trim only 20% of the time the difference is bigger, and
this opens spots for fragmentation.
But this in turn makes it a lot more likely that my zmalloc2 branch
will make a huge difference...
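A toy simulation of the effect described above, under stated
assumptions: the list is already at its cap, each step is one push
optionally followed by a trim back to the cap, and lengths are counted
in elements rather than bytes. The function name and the parameters
are hypothetical.

```python
import random


def length_spread(max_len, trim_probability, pushes, seed=0):
    """Return (longest - shortest) list length seen over `pushes` pushes."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    length = max_len           # start from a list already at its cap
    shortest = longest = length
    for _ in range(pushes):
        length += 1                            # the push adds one element
        longest = max(longest, length)
        if rng.random() < trim_probability:
            length = min(length, max_len)      # the trim cuts back to the cap
        shortest = min(shortest, length)
    return longest - shortest


always = length_spread(max_len=100, trim_probability=1.0, pushes=10_000)
sometimes = length_spread(max_len=100, trim_probability=0.2, pushes=10_000)
print(f"trim on every push: length varies by {always} element(s)")
print(f"trim 20% of pushes: length varies by {sometimes} element(s)")
```

Trimming on every push keeps the list within one element of its cap,
so it keeps asking for nearly the same block size. Trimming only some
of the time lets the list overshoot by several elements before being
cut back, so the sizes requested vary more.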
Ciao,
Salvatore
I know it is not trivial to test things in production environments,
but it would be awesome if you could also try the 2.2-zmalloc2 branch
under the same conditions to see if it also solves the problem. We
will probably go for jemalloc anyway, but it is interesting to see
whether or not there are alternatives.
Cheers,
Salvatore
Thank you Mike :)