BGSAVE lost data, hangs for 20min on EC2

502 views
Skip to first unread message

Dane Miller

unread,
Jan 3, 2012, 3:01:54 PM1/3/12
to Redis DB
Hi,

We're seeing problems with BGSAVE on an EC2 instance running redis
2.4.1. Most recently, BGSAVE caused redis to hang for 20 minutes and
lose lots of data. This was on a redis master, and all clients,
including 3 slaves, lost their connection during this 20 min period of
time. From our monitoring graphs, the 20min BGSAVE corresponds to a
burst of swapping and a spike in CPU steal and iowait.

Number of keys before and after the BGSAVE...
before: 22M
after: 3k

Losing data sucks. Any ideas on where the mis-configuration might be?

Dane

redis.log (annotated)
-----------------------
21:00:15 - DB 0: 22039263 keys (22039263 volatile) in 33554432 slots
HT.
21:00:27 - BGSAVE starts
slaves get MASTER timeout
burst of swap activity 21H
21:19:30 - BGSAVE ends
connecttion errors from many clients
slaves ask for sync
21:19:53 - DB 0: 3377 keys (3377 volatile) in 4096 slots HT.

System
-----------------------
EC2 m1.large
Ubuntu 10.04 64bit
kernel: 2.6.38-8-virtual x86_64 GNU/Linux
redis 2.4.1

Filesystems and Swap
-----------------------
redis dir: /var/lib/redis (ext4)
swap: /dev/dm-0


redis.conf:
-----------------------
daemonize yes
pidfile /var/run/redis.pid
port 6379
timeout 300
loglevel verbose
logfile /mnt/redis-log/redis.log
databases 16
save 900 1
save 300 10
save 60 100000
rdbcompression yes
dbfilename dump.rdb
dir /var/lib/redis
slave-serve-stale-data yes
maxmemory 4000mb
maxmemory-policy volatile-ttl
maxmemory-samples 10
appendonly no
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
slowlog-log-slower-than 1000000
slowlog-max-len 1024
vm-enabled no
vm-swap-file /tmp/redis.swap
vm-max-memory 0
vm-page-size 32
vm-pages 134217728
vm-max-threads 4
hash-max-zipmap-entries 512
hash-max-zipmap-value 64
list-max-ziplist-entries 512
list-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
activerehashing yes

Josiah Carlson

unread,
Jan 3, 2012, 5:28:27 PM1/3/12
to redi...@googlegroups.com
Could you send the output of INFO?

Thank you,
- Josiah

> --
> You received this message because you are subscribed to the Google Groups "Redis DB" group.
> To post to this group, send email to redi...@googlegroups.com.
> To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.
>

Andy McCurdy

unread,
Jan 3, 2012, 5:32:43 PM1/3/12
to redi...@googlegroups.com
Ubuntu 10.04 and 10.10 have significant CPU compatibility issues with some of the EC2 hardware. While we never lost data, we did experience instance connectivity issues when performing heavy IO operations like BGSAVE or large network transfers.

Upgrading to Ubuntu 11.04 solved these issues for us.

-andy

Dvir Volk

unread,
Jan 3, 2012, 5:43:05 PM1/3/12
to redi...@googlegroups.com
yup, same here. we moved to debian then later to ubuntu 11.10 and it was solved in both OSes.
BGSAVE still slows things down a bit but nothing like that.
--
Dvir Volk
System Architect, DoAT, http://doat.com

Dane Miller

unread,
Jan 3, 2012, 6:28:49 PM1/3/12
to redi...@googlegroups.com
On Tue, Jan 3, 2012 at 2:28 PM, Josiah Carlson <josiah....@gmail.com> wrote:
> Could you send the output of INFO?

Here's the output of INFO from today. I've also created a page with
graphs covering the 20min BGSAVE outage. You'll find system metrics
(cpu, memory, processes) as well as redis counters:

Graphs from 2011-Dec-31:
https://s3.amazonaws.com/redis-bgsave-crash-graphs/index.html

redis INFO from today
--------------------
redis_version:2.4.1
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
process_id:31239
uptime_in_seconds:1143630
uptime_in_days:13
lru_clock:442739
used_cpu_sys:232277.86
used_cpu_user:98715.13
used_cpu_sys_children:16887.25
used_cpu_user_children:141665.44
connected_clients:186
connected_slaves:3
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:2625362720
used_memory_human:2.45G
used_memory_rss:2670764032
used_memory_peak:5861321848
used_memory_peak_human:5.46G
mem_fragmentation_ratio:1.02
mem_allocator:jemalloc-2.2.1
loading:0
aof_enabled:0
changes_since_last_save:17834
bgsave_in_progress:0
last_save_time:1325633123
bgrewriteaof_in_progress:0
total_connections_received:24560750
total_commands_processed:8111961900
expired_keys:15069
evicted_keys:98253446
keyspace_hits:3795854023
keyspace_misses:3684036115
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:1125426
vm_enabled:0
role:master
db0:keys=12593757,expires=12593757


Dane

Marty Weiner

unread,
Jan 3, 2012, 6:31:18 PM1/3/12
to redi...@googlegroups.com
This shows peak usage of 5.46G.  When you get close to the edge of your RAM limits, redis can start to misbehave on BGSAVE.

Something else to check -- find your .rdb file and make sure it's timestamp is getting updated (ls -l) after the bgsave completes.  While the bgsave processes, you'll see a temp rdb file next to it growing.  It should grow by 10's to 100's megs a second.

Dane Miller

unread,
Jan 3, 2012, 6:31:28 PM1/3/12
to redi...@googlegroups.com
On Tue, Jan 3, 2012 at 2:32 PM, Andy McCurdy <sed...@gmail.com> wrote:
> Ubuntu 10.04 and 10.10 have significant CPU compatibility issues with some
> of the EC2 hardware. While we never lost data, we did experience instance
> connectivity issues when performing heavy IO operations like BGSAVE or large
> network transfers.
>
> Upgrading to Ubuntu 11.04 solved these issues for us.

Oops, my mistake. We're using Ubuntu 11.04, not 10.04 as I indicated.

What kernel are you using with 11.04?

Dane

Andy McCurdy

unread,
Jan 3, 2012, 6:32:48 PM1/3/12
to redi...@googlegroups.com
Whatever the default one is with the official Ubuntu 11.04 AMI

Marc Gravell

unread,
Jan 4, 2012, 2:21:25 AM1/4/12
to redi...@googlegroups.com
Just to echo this.... we recently had some.... "fun" when trying to upgrade the physical RAM in a master, when that master was hitting the physical RAM limit. Trying to bgsave caused the master to crawl, which killed the calling application. Unfortunately, our plan was: upgrade a slave's memory first, get them talking, transfer load to the slave. But the first thing the slave does (when reconnecting) is: bgsave. So it all came crashing down.

No fault here levied at redis: just an advisory - do **not** let your redis servers get critical on RAM. Bad things. Very bad things.

Marc
--
Regards,

Marc

Dane Miller

unread,
Jan 4, 2012, 4:42:14 PM1/4/12
to redi...@googlegroups.com
On Tue, Jan 3, 2012 at 11:21 PM, Marc Gravell <marc.g...@gmail.com> wrote:
> Just to echo this.... we recently had some.... "fun" when trying to upgrade
> the physical RAM in a master, when that master was hitting the physical RAM
> limit. Trying to bgsave caused the master to crawl, which killed the calling
> application. Unfortunately, our plan was: upgrade a slave's memory first,
> get them talking, transfer load to the slave. But the first thing the slave
> does (when reconnecting) is: bgsave. So it all came crashing down.
>
> No fault here levied at redis: just an advisory - do **not** let your redis
> servers get critical on RAM. Bad things. Very bad things.

In our redis.conf, maxmemory is set to 4000mb on an EC2 m1.large
instance with 7.5GB memory. Given this maxmemory setting, why would
peak memory usage, as reported by redis INFO, show 5.4G?

More importantly, I'm still somewhat mystified that something caused
redis to lose 22M keys when the BGSAVE stalled.

Dane

Josiah Carlson

unread,
Jan 4, 2012, 5:05:04 PM1/4/12
to redi...@googlegroups.com
On Wed, Jan 4, 2012 at 1:42 PM, Dane Miller <da...@xa.net> wrote:
> On Tue, Jan 3, 2012 at 11:21 PM, Marc Gravell <marc.g...@gmail.com> wrote:
>> Just to echo this.... we recently had some.... "fun" when trying to upgrade
>> the physical RAM in a master, when that master was hitting the physical RAM
>> limit. Trying to bgsave caused the master to crawl, which killed the calling
>> application. Unfortunately, our plan was: upgrade a slave's memory first,
>> get them talking, transfer load to the slave. But the first thing the slave
>> does (when reconnecting) is: bgsave. So it all came crashing down.
>>
>> No fault here levied at redis: just an advisory - do **not** let your redis
>> servers get critical on RAM. Bad things. Very bad things.
>
> In our redis.conf, maxmemory is set to 4000mb on an EC2 m1.large
> instance with 7.5GB memory.  Given this maxmemory setting, why would
> peak memory usage, as reported by redis INFO, show 5.4G?

I have some ideas, but nothing that I've checked the code to verify.
Someone else may have an idea.

> More importantly, I'm still somewhat mystified that something caused
> redis to lose 22M keys when the BGSAVE stalled.

What are the TTLs on your keys, typically? If they are relatively
short, the following scenario is possible:

1. Redis forked for the BGSAVE
2. the child process started scanning through items, some of them had
expired, so the child Redis evicted them on it's way through
3. as the child process was churning through memory, expiring keys,
etc., the shared memory between the two processes dropped to 0
4. the parent process is now stalled while the child process fills the
remaining memory, inducing swap, etc.
5. at some point the parent process is able to continue, performs a
sampling of keys to evict, notices that more than a few should be
evicted, then scans the entire keyspace for keys to expire, expiring
the majority of the 22M keys.

Regards,
- Josiah

Marty Weiner

unread,
Jan 4, 2012, 7:26:58 PM1/4/12
to redi...@googlegroups.com
This could have happened if many past bgsave's failed.  BGSAVE starts saving to a temp file.  When finished, it replaces it's main dump.rdb file with the temp one.  Is there a chance bgsaves haven't been succeeding for a while?


Dane

Dane Miller

unread,
Jan 5, 2012, 1:50:26 PM1/5/12
to redi...@googlegroups.com
On Wed, Jan 4, 2012 at 4:26 PM, Marty Weiner <ma...@pinterest.com> wrote:
> This could have happened if many past bgsave's failed.  BGSAVE starts saving
> to a temp file.  When finished, it replaces it's main dump.rdb file with the
> temp one.  Is there a chance bgsaves haven't been succeeding for a while?

I see 4 temp rdb files in the redis dir -- all dating from about 1
week prior to the outage. I'm guessing this indicates that BGSAVE
failed several times in the past?

But in response to your question about whether BGSAVE has been
succeeding: is redis.log a reliable source for this info?

In the 15 minutes leading up to the outage, redis.log shows successful
BGSAVE events...
[31239] 31 Dec 20:46:33 * 10 changes in 300 seconds. Saving...
[31239] 31 Dec 20:46:34 * Background saving started by pid 29126
[29126] 31 Dec 20:47:49 * DB saved on disk
[31239] 31 Dec 20:47:49 * Background saving terminated with success
[31239] 31 Dec 20:52:50 * 10 changes in 300 seconds. Saving...
[31239] 31 Dec 20:52:51 * Background saving started by pid 29180
[29180] 31 Dec 20:54:08 * DB saved on disk
[31239] 31 Dec 20:54:09 * Background saving terminated with success
[31239] 31 Dec 20:59:10 * 10 changes in 300 seconds. Saving...
[31239] 31 Dec 20:59:11 * Background saving started by pid 29229
[29229] 31 Dec 21:00:27 * DB saved on disk
... 20 min outage + data loss here ...
[31239] 31 Dec 21:19:30 * Background saving terminated with success
[31239] 31 Dec 21:19:47 * Slave ask for synchronization


Does this offer any clues why the BGSAVE outage would have lost data?

Dane

Josiah Carlson

unread,
Jan 5, 2012, 2:19:38 PM1/5/12
to redi...@googlegroups.com
On Thu, Jan 5, 2012 at 10:50 AM, Dane Miller <da...@xa.net> wrote:
> On Wed, Jan 4, 2012 at 4:26 PM, Marty Weiner <ma...@pinterest.com> wrote:
>> This could have happened if many past bgsave's failed.  BGSAVE starts saving
>> to a temp file.  When finished, it replaces it's main dump.rdb file with the
>> temp one.  Is there a chance bgsaves haven't been succeeding for a while?
>
> I see 4 temp rdb files in the redis dir -- all dating from about 1
> week prior to the outage.  I'm guessing this indicates that BGSAVE
> failed several times in the past?

Yes.

> But in response to your question about whether BGSAVE has been
> succeeding: is redis.log a reliable source for this info?

Yes.

> In the 15 minutes leading up to the outage, redis.log shows successful
> BGSAVE events...
> [31239] 31 Dec 20:46:33 * 10 changes in 300 seconds. Saving...
> [31239] 31 Dec 20:46:34 * Background saving started by pid 29126
> [29126] 31 Dec 20:47:49 * DB saved on disk
> [31239] 31 Dec 20:47:49 * Background saving terminated with success
> [31239] 31 Dec 20:52:50 * 10 changes in 300 seconds. Saving...
> [31239] 31 Dec 20:52:51 * Background saving started by pid 29180
> [29180] 31 Dec 20:54:08 * DB saved on disk
> [31239] 31 Dec 20:54:09 * Background saving terminated with success
> [31239] 31 Dec 20:59:10 * 10 changes in 300 seconds. Saving...
> [31239] 31 Dec 20:59:11 * Background saving started by pid 29229
> [29229] 31 Dec 21:00:27 * DB saved on disk
> ... 20 min outage + data loss here ...
> [31239] 31 Dec 21:19:30 * Background saving terminated with success
> [31239] 31 Dec 21:19:47 * Slave ask for synchronization
>
>
> Does this offer any clues why the BGSAVE outage would have lost data?

It looks like there was a problem removing the old file and replacing
it with the new one. If you're in EC2, and you are storing data
locally, I bet you are using EBS-backed storage, right? Sometimes EBS
can hang. Sometimes EBS can disappear. Sometimes EBS is... well...
crap. Most of the time it's not bad, sometimes it's even quite good.
But sometimes, it's horrible. You should check some of your other
system logs for that time period.

It seems possible to me that during the replacement of the old file,
your EBS hung. When Redis was finally able to continue, it started
sampling your volatile keys, and evicting the hell out of them. Hence
22M -> 3k. You never answered the question about your TTLs. What do
your TTLs look like?

Regards,
- Josiah

Dane Miller

unread,
Jan 5, 2012, 3:27:39 PM1/5/12
to redi...@googlegroups.com
On Thu, Jan 5, 2012 at 11:19 AM, Josiah Carlson
<josiah....@gmail.com> wrote:
> It looks like there was a problem removing the old file and replacing
> it with the new one. If you're in EC2, and you are storing data
> locally, I bet you are using EBS-backed storage, right? Sometimes EBS
> can hang. Sometimes EBS can disappear. Sometimes EBS is... well...
> crap. Most of the time it's not bad, sometimes it's even quite good.
> But sometimes, it's horrible. You should check some of your other
> system logs for that time period.

We're using local ephemeral storage, not EBS devices. There was
definitely a burst of disk+swap IO at that time. Unfortunately the
system logs don't show anything unusual.

> It seems possible to me that during the replacement of the old file,
> your EBS hung. When Redis was finally able to continue, it started
> sampling your volatile keys, and evicting the hell out of them. Hence
> 22M -> 3k. You never answered the question about your TTLs. What do
> your TTLs look like?

Our keys are long-lived -- TTLs are 32 days or greater. So I don't
think normal key expiration is to blame. But perhaps
maxmemory-policy=volatile-ttl kicked in and went crazy evicting keys
because it thought it was out of memory due to a slow BGSAVE child
process. A bug?

More importantly, how should we configure redis to survive slow AWS
disk IO? Set maxmemory-policy=noeviction?

Is there a recommended way to run redis on EC2?

Dane

Marty Weiner

unread,
Jan 5, 2012, 4:26:21 PM1/5/12
to redi...@googlegroups.com, redi...@googlegroups.com
It's likely the disk io was fine. Ephemeral storage is fast enough on a large instance. Likely you were over your ram limit which is known to put redis into a funky high io state that it might or might not recover from.

M. David Peterson

unread,
Jan 10, 2012, 5:28:19 PM1/10/12
to redi...@googlegroups.com
On Tue, Jan 3, 2012 at 4:31 PM, Dane Miller <da...@xa.net> wrote:
Oops, my mistake.  We're using Ubuntu 11.04, not 10.04 as I indicated.

Just to inject a bit of potential context, the issue you're experiencing is possibly related to a bug related specifically to the the Intel Xeon E5507 processor running on EC2 and the Linux kernel originally used in 10.04 LTS (specifically the 2.6.32-314-ec2 kernel) release of Ubuntu.  While the base distro you're running might appear to be Ubuntu 11.04, due to the packaging process of a Linux AMI, if the AMI you're using was based on an AMI that was upgraded to a newer release (rather than rebuilding your software stack on top of a clean 11.04-based AMI) the kernel/ramdisk from the original distro could have inadvertently found its way into fresh instances of the upgraded AMI that were launched. As a result, to ensure the issue you're seeing is different you'll want to verify you're running a kernel/ramdisk >=  2.6.32-311-ec2.

As mentioned the issue is related specifically to the Intel Xeon E5507 CPU's (which is used extensively throughout the EC2 data centers.) AMI's running on top  of the Intel Xeon E5430 CPU's (also used extensively throughout the AWS data centers) were unaffected, though given you have no way of specifying the CPU at launch there is no way to circumvent the issue directly when launching new AMI's. A clear and well researched summary of the problem can be found at [1]. The historical 'discovery' breakdown can be at [2], and the specific bug related to the issue that contains the ultimate fix can be found at [3].

Reply all
Reply to author
Forward
0 new messages