cluster state failing


Apoorva Gaurav

Jun 17, 2015, 7:25:32 AM
to redi...@googlegroups.com
We have a 6 node cluster running on 3 EC2 m3.xlarge instances; these are test servers and are shared with multiple other services.


-This morning one of the masters started throwing these errors:
28400:M 17 Jun 03:34:18.584 # Connection with slave 10.174.31.118:7002 lost.
28400:M 17 Jun 03:34:19.501 * Slave 10.174.31.118:7002 asks for synchronization
28400:M 17 Jun 03:34:19.501 * Unable to partial resync with slave 10.174.31.118:7002 for lack of backlog (Slave request was: 777184558843).
28400:M 17 Jun 03:34:19.501 * Starting BGSAVE for SYNC with target: disk
28400:M 17 Jun 03:34:19.502 # Can't save in background: fork: Cannot allocate memory
28400:M 17 Jun 03:34:19.502 * Replication failed, can't BGSAVE
28400:M 17 Jun 03:34:20.503 * Slave 10.174.31.118:7002 asks for synchronization
28400:M 17 Jun 03:34:20.503 * Full resync requested by slave 10.174.31.118:7002
28400:M 17 Jun 03:34:20.503 * Starting BGSAVE for SYNC with target: disk
28400:M 17 Jun 03:34:20.503 # Can't save in background: fork: Cannot allocate memory
28400:M 17 Jun 03:34:20.503 * Replication failed, can't BGSAVE
-Disk I/O, page faults, etc. jumped to very high values at the same time
-As BGSAVE was failing, replication broke down
-The master kept trying to flush to disk for about half an hour, after which it started throwing "too many open files" (the ulimit on these instances was 1024; see the sketch below)
-The slave kept trying to replicate but couldn't
-After 5 hours, during which the master had written 51GB of append-only logs and 36GB of application logs, it went down
-Other instances declared the cluster state as fail, since there was no master for the shard and the slave was outdated

-Before this incident the same master occasionally threw exceptions as described in https://github.com/antirez/redis/issues/641 and https://github.com/antirez/redis/issues/368, which signifies that the disk is slow

-Interestingly, another master is on the same EC2 instance and it didn't show any signs of trouble. No other service running on the same instance showed any signs of trouble either.
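
A minimal sketch of checking and raising the file-descriptor limit the Redis process actually runs with (pid 28400 is taken from the logs above; the "redis" user in limits.conf is only an assumption, adjust it to however the service is started):

cat /proc/28400/limits | grep -i 'open files'    # the limit the running process actually got
ulimit -n                                        # the limit of the current shell only
# to raise it persistently, add entries like these to /etc/security/limits.conf (user name is hypothetical):
#   redis soft nofile 65535
#   redis hard nofile 65535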


We are thinking of upgrading our production Redis setups to cluster, but such issues are a red flag. What might be the root cause, and how can it be prevented?



Greg Andrews

Jun 17, 2015, 11:23:54 AM
to redi...@googlegroups.com
The logfile has the first indication of the root cause:

28400:M 17 Jun 03:34:19.502 # Can't save in background: fork: Cannot allocate memory

This master process tried to fork a child process to perform the background save of its database to disk, but was unable to do so. You should see errors logged by the kernel around the same time.
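
A minimal sketch of where to look for that kernel-side evidence, assuming a standard syslog setup on these instances:

dmesg | tail -n 100                                              # recent kernel messages, including failed allocations or OOM-killer activity
grep -iE 'out of memory|oom|cannot allocate' /var/log/messages   # the same, with timestamps to line up against the Redis log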

  -Greg

Apoorva Gaurav

Jun 17, 2015, 11:36:23 AM
to redi...@googlegroups.com
The "Cannot allocate memory" error kept coming for an hour, while other services, including another instance of the Redis cluster, kept working smoothly.
I didn't find anything interesting in /var/log/messages.




--
Thanks & Regards,
Apoorva

Greg Andrews

Jun 17, 2015, 11:55:45 AM
to redi...@googlegroups.com
What do the performance graphs for this machine say about the amount of free RAM available during that period?
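
If the graphs are coarse or missing, a rough sketch of what could be captured on the box while the problem is happening (port 7003 is the affected master from the logs):

free -m                       # overall used/free/cached memory
vmstat 1 5                    # swap activity and memory pressure over a few seconds
redis-cli -p 7003 info memory | grep -E 'used_memory_human|used_memory_rss|mem_fragmentation_ratio'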

  -Greg


Apoorva Gaurav

Jun 17, 2015, 12:06:17 PM
to redi...@googlegroups.com

No sudden jump in memory usage, though the disk I/O jumped dramatically.

Erdem ASLAN

Jun 18, 2015, 8:58:07 AM
to redi...@googlegroups.com
Somehow the Slave asked for a full sync, or the Master decided that a partial sync was not an option. A full sync generally requires the same amount of memory that your Master is using at that moment (it may vary depending on your Redis config, though).

So, two questions / comments:

 - Your other Master may be holding less data?
 - Again, your other Master may not be doing full syncs at all?

Also, keep in mind that allocating memory for the child process is not incremental. You may have plenty of free memory, but it has to be sufficient to cover the Master's current memory usage.
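
This is also the classic symptom of the kernel's overcommit heuristic refusing the fork: copy-on-write means the child touches very little memory in practice, but with the default vm.overcommit_memory=0 the kernel may still refuse to promise the parent's full address space. A minimal sketch of the setting Redis itself warns about at startup:

cat /proc/sys/vm/overcommit_memory        # 0 = heuristic (default), 1 = always allow
sudo sysctl -w vm.overcommit_memory=1     # lets BGSAVE / replication forks succeed under memory pressure
echo 'vm.overcommit_memory = 1' | sudo tee -a /etc/sysctl.conf    # persist across reboots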

Apoorva Gaurav

Jun 18, 2015, 9:10:43 AM
to redi...@googlegroups.com
On Thu, Jun 18, 2015 at 10:32 AM, Erdem ASLAN <erde...@gmail.com> wrote:
Somehow the Slave asked for a full sync, or the Master decided that a partial sync was not an option. A full sync generally requires the same amount of memory that your Master is using at that moment (it may vary depending on your Redis config, though).

So, two questions / comments:

 - Your other Master may be holding less data?
Yes
 - Again, your other Master may not be doing full syncs at all?
I can't see other slaves asking for full syncs in the logs, while for this one I can see that the slave lost the connection and asked for a full sync.
Logs at the slave:
2218:S 17 Jun 03:34:18.578 # Connection with master lost.
2218:S 17 Jun 03:34:18.578 * Caching the disconnected master state.
2218:S 17 Jun 03:34:19.494 * Connecting to MASTER 10.174.31.89:7003
2218:S 17 Jun 03:34:19.494 * MASTER <-> SLAVE sync started
2218:S 17 Jun 03:34:19.494 * Non blocking connect for SYNC fired the event.
2218:S 17 Jun 03:34:19.495 * Master replied to PING, replication can continue...
2218:S 17 Jun 03:34:19.495 * Trying a partial resynchronization (request e5d39aea32bd6c5959b20c0184458860227f0a6e:777184558843).
2218:S 17 Jun 03:34:19.495 * Full resync from master: e5d39aea32bd6c5959b20c0184458860227f0a6e:777602513489
2218:S 17 Jun 03:34:19.495 * Discarding previously cached master state.
2218:S 17 Jun 03:34:19.496 # MASTER aborted replication with an error: ERR Unable to perform background save
2218:S 17 Jun 03:34:20.495 * Connecting to MASTER 10.174.31.89:7003
2218:S 17 Jun 03:34:20.495 * MASTER <-> SLAVE sync started
2218:S 17 Jun 03:34:20.496 * Non blocking connect for SYNC fired the event.
2218:S 17 Jun 03:34:20.496 * Master replied to PING, replication can continue...
2218:S 17 Jun 03:34:20.496 * Partial resynchronization not possible (no cached master)
2218:S 17 Jun 03:34:20.497 * Full resync from master: e5d39aea32bd6c5959b20c0184458860227f0a6e:777602513489
2218:S 17 Jun 03:34:20.497 # MASTER aborted replication with an error: ERR Unable to perform background save
2218:S 17 Jun 03:34:21.496 * Connecting to MASTER 10.174.31.89:7003
2218:S 17 Jun 03:34:21.496 * MASTER <-> SLAVE sync started
2218:S 17 Jun 03:34:21.496 * Non blocking connect for SYNC fired the event.
2218:S 17 Jun 03:34:21.497 * Master replied to PING, replication can continue...
2218:S 17 Jun 03:34:21.497 * Partial resynchronization not possible (no cached master)
2218:S 17 Jun 03:34:21.497 * Full resync from master: e5d39aea32bd6c5959b20c0184458860227f0a6e:777602513489
2218:S 17 Jun 03:34:21.498 # MASTER aborted replication with an error: ERR Unable to perform background save
2218:S 17 Jun 03:34:22.497 * Connecting to MASTER 10.174.31.89:7003
2218:S 17 Jun 03:34:22.497 * MASTER <-> SLAVE sync started
2218:S 17 Jun 03:34:22.497 * Non blocking connect for SYNC fired the event.

Logs at the master. Clearly we can see that the slave lost the connection first and then asked the master for a resync. The question is: shouldn't the master be more resilient towards slave demands? Otherwise even a normal master-slave setup is not stable, let alone a cluster.
28400:M 17 Jun 03:34:18.584 # Client id=2253 addr=10.174.31.118:41859 fd=23 name= age=490192 idle=1 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=49 oll=1 omem=419430440 events=rw cmd=replconf scheduled to be closed ASAP for overcoming of output buffer limits.
28400:M 17 Jun 03:34:18.584 # Connection with slave 10.174.31.118:7002 lost.
28400:M 17 Jun 03:34:19.501 * Slave 10.174.31.118:7002 asks for synchronization
28400:M 17 Jun 03:34:19.501 * Unable to partial resync with slave 10.174.31.118:7002 for lack of backlog (Slave request was: 777184558843).
28400:M 17 Jun 03:34:19.501 * Starting BGSAVE for SYNC with target: disk
28400:M 17 Jun 03:34:19.502 # Can't save in background: fork: Cannot allocate memory
28400:M 17 Jun 03:34:19.502 * Replication failed, can't BGSAVE
28400:M 17 Jun 03:34:20.503 * Slave 10.174.31.118:7002 asks for synchronization
28400:M 17 Jun 03:34:20.503 * Full resync requested by slave 10.174.31.118:7002
28400:M 17 Jun 03:34:20.503 * Starting BGSAVE for SYNC with target: disk
28400:M 17 Jun 03:34:20.503 # Can't save in background: fork: Cannot allocate memory
28400:M 17 Jun 03:34:20.503 * Replication failed, can't BGSAVE
28400:M 17 Jun 03:34:21.003 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
28400:M 17 Jun 03:34:21.504 * Slave 10.174.31.118:7002 asks for synchronization
28400:M 17 Jun 03:34:21.504 * Full resync requested by slave 10.174.31.118:7002
28400:M 17 Jun 03:34:21.504 * Starting BGSAVE for SYNC with target: disk
28400:M 17 Jun 03:34:21.504 # Can't save in background: fork: Cannot allocate memory
28400:M 17 Jun 03:34:21.504 * Replication failed, can't BGSAVE
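
One possible mitigation for the "lack of backlog" lines above: a larger replication backlog gives the slave a better chance of a partial resync after a short disconnect, so the master never has to fork for a full BGSAVE. The 256MB below is only illustrative; size it to your write rate and the longest disconnect you want to survive:

redis-cli -p 7003 config get repl-backlog-size
redis-cli -p 7003 config set repl-backlog-size 268435456    # 256MB, applied at runtime
# mirror the change in redis.conf so it survives a restart:
#   repl-backlog-size 256mb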

Erdem ASLAN

Jun 18, 2015, 9:27:51 AM
to redi...@googlegroups.com
"Question is shouldn't master be more resilient towards slave demands, else even normal master slave is not stable, let alone cluster"

I think your concern is about the Slave / Master connection, since the issue you've encountered triggered the full sync. From experience (some of those experiences were pretty bad, actually...), networked systems are not merely prone to failure, they are guaranteed to fail. At any given moment you may lose your network switch, your eth interface, or other related network-layer software and/or hardware. In terms of data integrity and resilience, Redis is doing what it can do; your main issue is that you just don't provide enough resources for it to survive such a network failure.

A full sync may occur at any time in any type of deployment, even in clustered Redis, since it also has multiple masters paired with slaves.

As for the connection issue (the Slaves' connection demands towards the Master): Redis is single threaded at its core, and every event (with minor exceptions), including lifecycle events (sentinel pings / heartbeats, etc.), is handled within that single event loop. You have to be very careful not to block your Master with long blocking operations. Your Slave should also have enough resources in terms of memory.
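
A quick sketch of checking whether long-running commands are stalling that event loop (port 7003 again taken from the logs):

redis-cli -p 7003 slowlog get 10    # the ten slowest recently executed commands and their durations
redis-cli -p 7003 --latency         # continuously sampled round-trip latency to the server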

P.S.: Hitting the output buffer limit on the Master node is tricky; I'm not sure what caused it.
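
For what it's worth, the very first master log line in this thread shows the slave's replication client being dropped for exceeding the "slave" class of client-output-buffer-limit. A sketch of inspecting and raising it; the 512MB hard / 128MB soft / 120s values are illustrative and should be tuned to how quickly the slave can drain the stream:

redis-cli -p 7003 config get client-output-buffer-limit
redis-cli -p 7003 config set client-output-buffer-limit "slave 536870912 134217728 120"
# the equivalent redis.conf line:
#   client-output-buffer-limit slave 512mb 128mb 120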