Hey,
this post is meant to start a small discussion about the advantages and disadvantages of a rolling BGSAVE compared to (pre-)configured save points.
Imagine you have a large number of web servers that access a small number of Redis servers without pipelining or persistent connections.
You are running Redis in non-cluster mode with master-slave replication.
You persist your data via RDB files using configured save points, and you are not using AOF.
When such a save point is reached, Redis issues a BGSAVE.
During a BGSAVE, Redis forks a child process (the BG stands for background) that writes the dump.
The fork call itself is blocking: no other command is executed until the fork has finished.
Depending on how much memory the instance uses (and thus how large its page table is), this fork can take a while.
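To get a feeling for how long the fork actually blocks, you can look at the `latest_fork_usec` field in the output of `INFO stats`. Here is a minimal sketch in Python, assuming you fetch the stats as a dict (for example with redis-py's `client.info("stats")`). The function names and the 100 ms budget are made up for illustration, so pick a threshold that fits your client timeout.

```python
def fork_duration_ms(stats):
    """Return the duration of the most recent fork in milliseconds.

    `stats` is the dict form of INFO stats; Redis reports
    latest_fork_usec in microseconds.
    """
    return stats["latest_fork_usec"] / 1000.0


def fork_is_too_slow(stats, budget_ms=100.0):
    # budget_ms is a hypothetical threshold, not a Redis setting.
    return fork_duration_ms(stats) > budget_ms


def measure(host="localhost", port=6379):
    # Assumes the redis-py package is installed and a server is reachable.
    import redis
    client = redis.Redis(host=host, port=port)
    return fork_duration_ms(client.info("stats"))
```

If the reported value comes close to your client timeout, the timeouts during BGSAVE are probably caused by the fork itself.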
In our environment, we saw many requests run into timeouts whenever a preconfigured save point triggered a BGSAVE.
Hulu is using a kind of "Rolling BGSAVE":
Since Redis usually can only use one CPU core, and our boxes have 24 CPUs, we are running 16 shards per box trying to achieve a one to one CPU ratio to maximize our hardware utilization. However, when bgsave is enabled, Redis will fork a new process and may even use double the ram. To keep performance consistent on the shards, we disabled the writing to disk across all the shards, and we have a cron job that runs at 4am everyday doing a rolling “BGSAVE” command on each individual instance. We also use this rdb file for analyzing the Redis data without affecting the performance of the application.
So we tried the same: we disabled all save points and triggered BGSAVE at times when we know traffic is low (during the night, for example).
And voilà, we significantly reduced the number of connections running into timeouts.
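A rolling BGSAVE like this could be sketched roughly as follows. This is not the script Hulu or we actually run, just a minimal Python illustration: it assumes each client object offers `bgsave()` and `info(section)` the way redis-py's `redis.Redis` does, and that save points are already disabled (`save ""` in redis.conf or `CONFIG SET save ""`). The function name, polling interval and timeout are assumptions.

```python
import time


def rolling_bgsave(clients, poll_seconds=1.0, timeout_seconds=3600.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Run BGSAVE on one instance at a time, waiting for each to finish.

    Returns a list of (client, status) tuples, where status is the
    rdb_last_bgsave_status value reported by INFO persistence.
    """
    results = []
    for client in clients:
        client.bgsave()
        deadline = clock() + timeout_seconds
        # rdb_bgsave_in_progress is 1 while the child process writes the dump.
        while client.info("persistence")["rdb_bgsave_in_progress"]:
            if clock() > deadline:
                raise TimeoutError("BGSAVE did not finish in time")
            sleep(poll_seconds)
        status = client.info("persistence")["rdb_last_bgsave_status"]
        results.append((client, status))
    return results
```

Called from a nightly cron job with one client per instance, only one instance forks at any moment. Note that the sketch does not handle the case where a save is already running when `bgsave()` is called.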
This is the situation (in short).
Now I would like to discuss with you whether this is a good or a bad solution.
I have collected some reasons. I would like to hear your reasons and opinions as well :)
Advantages of a preconfigured save point:
* The save is triggered by the application itself (Redis)
* Persistence is triggered only when it is necessary (depending on the number of changed keys and the configured thresholds)
Disadvantages of a preconfigured save point:
* BGSAVE is not tied to a time of day; it is triggered by elapsed seconds + number of changes, so it can hit during peak traffic
* If only a small number of keys change, there can be a long timeframe without persistence (depending on your configuration)
* With replication, all servers may trigger a BGSAVE within the same timeframe
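To make the "seconds + changes" rule a bit more concrete, here is a simplified Python model of how save points are evaluated. It is only a sketch: the real check inside Redis differs in details (for example how it retries after a failed save), and the rule list below is just an example set in the style of `save 900 1`, `save 300 10`, `save 60 10000`.

```python
# Example save points as (seconds, changes) pairs; adjust to your redis.conf.
EXAMPLE_SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]


def save_point_due(save_points, seconds_since_save, changes_since_save):
    """Return True if any save point rule is satisfied.

    A rule fires when at least `seconds` have passed since the last
    save AND at least `changes` writes happened in that time.
    """
    return any(
        seconds_since_save >= seconds and changes_since_save >= changes
        for seconds, changes in save_points
    )
```

With these example rules, 5 changed keys stay unsaved for almost 15 minutes (`save_point_due(EXAMPLE_SAVE_POINTS, 899, 5)` is false), while a write burst can trigger a BGSAVE after only a minute, no matter how busy the web servers are at that moment.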
Advantages of a rolling BGSAVE:
* You can schedule it for times when traffic is low
* You know exactly when a dump is created / when your last dump was triggered
* You can configure which server fires a BGSAVE and when (rolling)
Disadvantages of a rolling BGSAVE:
* There may be a longer timeframe between dumps, so more changes can be lost on a crash (depends on how many write commands run in between)
* You have to make sure yourself that the BGSAVE is actually triggered
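For the last point, a small monitoring check can help. The following Python sketch assumes you pass in the dict form of `INFO persistence` (for example redis-py's `client.info("persistence")`) and uses its `rdb_last_save_time` and `rdb_last_bgsave_status` fields. The function name and the maximum age are hypothetical.

```python
import time


def dump_problem(persistence, max_age_seconds, now=None):
    """Return a problem description, or None if the last dump looks fine.

    rdb_last_save_time is the Unix timestamp of the last successful
    RDB save; rdb_last_bgsave_status is "ok" if the last BGSAVE worked.
    """
    now = time.time() if now is None else now
    if persistence["rdb_last_bgsave_status"] != "ok":
        return "last BGSAVE failed"
    age = now - persistence["rdb_last_save_time"]
    if age > max_age_seconds:
        return "last dump is %d seconds old" % age
    return None
```

With a nightly rolling BGSAVE, a maximum age of a bit more than 24 hours would alert you if the cron job silently stopped running.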
Thanks,
Andy