Failover in replica sets when the master is hung but not fully dead


Spike Gronim

Nov 1, 2010, 8:21:35 PM
to mongodb-user
Hello,

I am using MongoDB on EC2 with three hosts in a replica set
configuration. Failover works well if I terminate the EC2 instance,
gracefully stop mongod, or kill -9 the mongod master.

Today my master encountered a known bug in the Linux kernel [1] that
caused the mongod process to hang. Any process that tries to access
mongod's logging directory hangs forever, and it looks like this is
what truly hung the process. The mongod process is still listening on
port 27017, it still has established TCP connections to clients, and
most importantly it is still actively heartbeating to its peers. When
I use the shell to connect to a secondary and do "rs.status()" I see
the hung master's heartbeat incrementing steadily, and it is still in
state 1. I forced failover to occur by using iptables to block port
27017 on the hung master. I have this machine available for further
testing if you need more information.

I would like to discuss the criteria for failover. This was not a
"halting failure" - the process was alive but non-functional. So it's
a Byzantine failure, and mongod can't practically detect every such
failure. I think that "I cannot perform IO" is a reasonable case to
handle.

Can the master give itself a more thorough health check before issuing
a heartbeat? What about giving it a private, system internal capped
collection that it reads/writes every time it heartbeats? That
collection could be the oplog or a new collection. The health check
could also use client requests in a similar fashion, but mongod has to
differentiate between client mistakes and internal errors as well as
handle the case when clients are all idle. We also need to make sure
that failover doesn't happen just because IO is slow; otherwise masters
would switch constantly under high IO load, along with similar
oscillating behaviors. This is not a trivial problem, and I am curious
what the mongo developers and users think.
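
To make the idea concrete, here is a minimal sketch of an IO-gated heartbeat. It is not how mongod works; probe_disk, probe_with_timeout, and HealthGate are hypothetical names invented for illustration, and the timeout and failure threshold are arbitrary. The probe writes and fsyncs a small file with a deadline, and the node is only treated as unhealthy after several consecutive failed probes, so a single slow IO does not trigger failover.

```python
import os
import tempfile
import threading


def probe_disk(directory, payload=b"heartbeat"):
    """Write and fsync a small temp file in `directory` (hypothetical probe)."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.write(fd, payload)
        os.fsync(fd)  # force the write through to the device
    finally:
        os.close(fd)
        os.unlink(path)


def probe_with_timeout(probe, timeout_s):
    """Return True only if `probe` finishes without OSError within `timeout_s`."""
    done = threading.Event()

    def run():
        try:
            probe()
        except OSError:
            return  # an IO error counts as a failed probe
        done.set()

    # Daemon thread: a probe stuck in the kernel must not block the caller.
    threading.Thread(target=run, daemon=True).start()
    return done.wait(timeout_s)


class HealthGate:
    """Report unhealthy only after `max_failures` consecutive failed probes."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.consecutive_failures = 0

    @property
    def healthy(self):
        return self.consecutive_failures < self.max_failures

    def record(self, probe_ok):
        """Record one probe result and return the current health verdict."""
        if probe_ok:
            self.consecutive_failures = 0  # any success resets the streak
        else:
            self.consecutive_failures += 1
        return self.healthy


if __name__ == "__main__":
    gate = HealthGate(max_failures=3)
    with tempfile.TemporaryDirectory() as data_dir:
        ok = probe_with_timeout(lambda: probe_disk(data_dir), timeout_s=5.0)
        print("send heartbeat" if gate.record(ok) else "withhold heartbeat")
```

A probe that hangs the way my logging directory did would leave its thread stuck, so a real implementation would need to cap the number of outstanding probes. The consecutive-failure threshold is the simplest form of damping; it trades detection latency for stability.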

$ uname -a
Linux my_host A 2.6.32-308-ec2 #16-Ubuntu SMP Thu Sep 16 15:25:39 UTC 2010 x86_64 GNU/Linux
$ mongo -version
MongoDB shell version: 1.6.3

[1] https://patchwork.kernel.org/patch/120327/

Eliot Horowitz

Nov 1, 2010, 11:01:32 PM
to mongod...@googlegroups.com
Are regular user ops stalling or having errors?
Can you send the log and the output of db.currentOp()?
If all disk access is just hanging with no errors, it's very hard to
tell the difference between saturated disks and an error condition...
If it's really a kernel bug, I'm not sure this exact case is something the
server should handle...


Spike Gronim

Nov 2, 2010, 12:31:44 PM
to mongod...@googlegroups.com
Sorry, the machine has hung to the point where I can't log back in and
ask it for the log or current operation.

It's fair to say that this exact case shouldn't be handled by the server.

--
    --William "Spike" Gronim
      william...@gmail.com

        "I'm a very technical boy."
