maxmemory & disk space recommendation


ryanc...@gmail.com

Oct 22, 2013, 8:57:28 AM
to redi...@googlegroups.com
Hi,

We have a machine that currently has 4GB of memory. As I understand it, during an AOF rewrite (we use AOF only, no RDB) the Redis server process is forked and takes extra memory, so:

1. What value of maxmemory do you think makes sense to use? Is there a rule of thumb?
2. Do you think using 50% of system memory as maxmemory is safe?
3. For disk space, do you think having 200% of maxmemory as free space (excluding the max DB size) is safe?


Thanks

Josiah Carlson

Oct 22, 2013, 2:29:32 PM
to redi...@googlegroups.com
Replies inline.

On Tue, Oct 22, 2013 at 5:57 AM, <ryanc...@gmail.com> wrote:
Hi,

We have a machine that currently has 4GB of memory. As I understand it, during an AOF rewrite (we use AOF only, no RDB) the Redis server process is forked and takes extra memory, so:

1. What value of maxmemory do you think makes sense to use? Is there a rule of thumb?
2. Do you think using 50% of system memory as maxmemory is safe?

You should be fine with this.
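For reference, a minimal redis.conf sketch of what that could look like on a 4GB box (the 2gb cap and the eviction policy here are illustrative assumptions, not a recommendation for your specific workload):

    # Cap Redis at roughly half of system memory on a 4GB machine
    maxmemory 2gb
    # What happens when the limit is hit; noeviction (the default) makes
    # writes fail rather than silently evicting data -- pick the policy
    # that matches your use case
    maxmemory-policy noeviction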
 
3. For disk space, do you think having 200% of maxmemory as free space (excluding the max DB size) is safe?

No. Have as much disk space available for the Redis AOF as possible. If Redis is unable to rewrite its AOF, whether due to low disk space or for some other reason, you will stop getting updates to your AOF and your data will not be persisted. Better safe than sorry.
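If you want to catch that situation early rather than discovering it later, here's a rough monitoring sketch using the redis-py client; it checks the AOF status fields that reasonably recent Redis versions report via INFO, and leaves the actual alerting up to you:

    import redis

    r = redis.StrictRedis(host='localhost', port=6379)
    info = r.info()

    # "ok" means the last AOF write / background rewrite succeeded;
    # anything else usually means disk trouble and stalled persistence.
    if info.get('aof_last_bgrewrite_status', 'ok') != 'ok':
        print('last AOF rewrite failed -- check disk space')
    if info.get('aof_last_write_status', 'ok') != 'ok':
        print('AOF writes are failing -- recent data is not being persisted')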

 - Josiah


Matt Palmer

Oct 24, 2013, 4:16:54 PM
to redi...@googlegroups.com
On Tue, Oct 22, 2013 at 05:57:28AM -0700, ryanc...@gmail.com wrote:
> We have a machine that currently has 4GB of memory. As I understand it,
> during an AOF rewrite (we use AOF only, no RDB) the Redis server process
> is forked and takes extra memory, so:
>
> 3. For disk space, do you think having 200% of maxmemory as free space
> (excluding the max DB size) is safe?

Nope. My rule of thumb for AOF is to reserve *at least* 16x the maximum
amount of memory that your in-memory dataset could be, when the percentage
growth to trigger an automatic rewrite is 100%. For a dataset in which you
are expecting a lot of discards, you *might* be able to get away with less
(because your AOF would be full of a lot of DELs), but I wouldn't risk it.
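To make that concrete, the rewrite trigger I'm assuming is the stock configuration, something like this (these happen to be the shipped defaults, shown here for illustration):

    # Rewrite the AOF once it has grown 100% past its size after the last
    # rewrite, provided it is at least 64mb
    auto-aof-rewrite-percentage 100
    auto-aof-rewrite-min-size 64mb

So for a dataset that could grow to, say, 2GB in memory, I'd want on the order of 32GB of disk set aside for the AOF.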

- Matt

--
"Alas, slideware often reduces the analytical quality of presentations. In
particular, the popular PowerPoint templates (ready-made designs) usually
weaken verbal and spatial reasoning, and almost always corrupt statistical
analysis." -- http://www.edwardtufte.com/tufte/books_pp

Felix Gallo

Oct 24, 2013, 4:37:00 PM
to redi...@googlegroups.com
16x is pretty ...pessimistic.

On linux, copy-on-write plus fork scenario means that the absolute max worst case is that your entire database gets rewritten during the time that the rewrite is happening.  Forks on some virtualization platforms (e.g. amazon's) are very slow but that's a pretty aggressive case.  In that situation the kernel, the original process, and the new process are all potentially carrying rewrite info. 

So the worst case max is around 3x stable initial redis memory size, plus whatever headroom you think is appropriate.

This is generally alleviated in real production environments by sharding your redis instance (a good idea anyway) so that each individual aof rewrite is 1/N the size of a single large rewrite, happens O(N) times more frequently (= less latency variability); and there are more cores generally involved.  And further, it's rare that redis is in a use case where you rewrite the entire database fast enough that doing it during even a slow fork is possible.  The more normal case of changing say max 5% of an environment every minute means that only relatively few pages are dirtied during aof.
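As a minimal illustration of the kind of client-side sharding I mean (the ports and the hashing scheme are just assumptions for the sketch, not a recommendation):

    import hashlib
    import redis

    # Four small instances instead of one big one, so each instance's AOF
    # (and each AOF rewrite) is roughly a quarter of the size.
    shards = [redis.StrictRedis(host='localhost', port=p)
              for p in (6379, 6380, 6381, 6382)]

    def shard_for(key):
        # Hash the key and pick a shard deterministically.
        h = int(hashlib.md5(key.encode('utf-8')).hexdigest(), 16)
        return shards[h % len(shards)]

    shard_for('user:42').set('user:42', 'some value')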

F.



Matt Palmer

Oct 24, 2013, 6:29:14 PM
to redi...@googlegroups.com
On Thu, Oct 24, 2013 at 01:37:00PM -0700, Felix Gallo wrote:
> 16x is pretty ...pessimistic.

No, it's based on real usage scenarios where less than that has resulted in
ENOSPC (which Redis, until recently, handled pretty poorly).

> On linux, copy-on-write plus fork scenario means that the absolute max
> worst case is that your entire database gets rewritten during the time that
> the rewrite is happening. Forks on some virtualization platforms (e.g.
> amazon's) are very slow but that's a pretty aggressive case. In that
> situation the kernel, the original process, and the new process are all
> potentially carrying rewrite info.

I'm not sure what CoW/fork memory usage has to do with disk space required.

> So the worst case max is around 3x stable initial redis memory size, plus
> whatever headroom you think is appropriate.

No, the worst case is pathological... I'd be surprised if I couldn't
construct an in-memory dataset that took 20x the space to store as an AOF on
disk. 16x is a safe middle ground, given my experiences trying to stop
AOF-backed Redis from filling disk and falling over. Here's what I've found (a couple of worked examples follow the numbered points below):

1. A minimal AOF representation of an in-memory dataset is *significantly*
larger than the amount of memory required to store that dataset. There are
several reasons for this, including the protocol overhead (all those "*2"s,
"$27"s, and "\r\n"s add up...), and inefficient string representations of
data which is far more densely packed in memory (an integer stored in a
string, which takes up 4 bytes in memory, can take up to 10 bytes on disk,
then there's all sorts of fun ziplist-type optimisations). Our rule of
thumb -- again, taken from *extensive* real-world testing -- is that a
minimal AOF representation will take approximately 4x the space on disk that
it will occupy in memory.

2. With the default configuration, the minimal AOF will double in size
before a rewrite is triggered. That means that you'll end up with 8x memory
on disk before a rewrite even *starts*.

3. Your newly rewritten minimal AOF is once again going to take about 4x
memory on disk, alongside your overgrown AOF. We're now up to, at a
minimum, 12x memory worth of disk space -- and we're still not done.

4. While the rewrite is taking place, data is still being appended to the
original AOF, consuming more and more disk space. Also, before the mammoth
over-sized AOF can be nuked, all of the "AOF write queue" data *also* has to
be written to the new AOF, taking even *more* space. How much disk space
all this takes is dependent on how long your AOFs take to rewrite (which is
a function of the size of your Redis dataset), but if you're worrying about
disk space to begin with, chances are your Redises aren't small.

5. There's a final corner case that *probably* won't hit you if you've got a
stable-sized Redis (such as would be the case with a maxmemory-constrained
cache), and that's the risk that your new "minimal" AOF is actually about
the same size as the "oversized" AOF that triggered the rewrite with its
huge growth. This happens when you're storing new data all the time, and
not deleting / replacing anything, and let me tell you, it comes as a fairly
huge shock the first time it happens.
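
For a feel for the framing overhead mentioned in point 1, here is roughly what a single small SET ends up looking like in the AOF (each line is terminated with \r\n on disk, so this one command costs on the order of 40 bytes):

    *3
    $3
    SET
    $7
    user:42
    $5
    10500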
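And a back-of-the-envelope sketch of the sizing logic in points 1-4, using the original poster's 4GB box and our 4x rule of thumb (all of these numbers are estimates, not guarantees):

    # Rough AOF disk sizing, following the reasoning above.
    dataset_in_memory_gb = 2.0        # e.g. maxmemory on a 4GB box
    aof_expansion_factor = 4          # minimal AOF ~4x in-memory size (point 1)

    minimal_aof = dataset_in_memory_gb * aof_expansion_factor  # ~8 GB
    grown_aof = minimal_aof * 2       # doubles before a rewrite triggers (point 2)
    during_rewrite = grown_aof + minimal_aof  # old + new AOF coexist (point 3)
    headroom = minimal_aof            # growth and write-queue flush during rewrite (point 4)

    print('reserve at least %.0f GB' % (during_rewrite + headroom))  # ~32 GB, i.e. ~16x memory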

> This is generally alleviated in real production environments by sharding
> your redis instance (a good idea anyway) so that each individual aof
> rewrite is 1/N the size of a single large rewrite, happens O(N) times more
> frequently (= less latency variability); and there are more cores generally
> involved.

Sure, sharding's great, and it changes the space required calculations
somewhat, but implementing it is a separate issue from "how much space do I
need for an AOF-backed Redis of size X?".

> And further, it's rare that redis is in a use case where you
> rewrite the entire database fast enough that doing it during even a slow
> fork is possible. The more normal case of changing say max 5% of an
> environment every minute means that only relatively few pages are dirtied
> during aof.

Again, I'm not sure why you keep raising the issue of forks and memory
usage, since this thread is all about disk space usage.

- Matt

--
Software engineering: that part of computer science which is too difficult
for the computer scientist.
