How is failover handled?

mo

Oct 21, 2011, 11:04:42 AM
to mongodb-csharp
I was under the impression that when connected to a replica set of
two nodes (plus one arbiter), the driver would automatically handle
one of the servers going down and direct all traffic to the remaining
server. Instead, I get unhandled IOExceptions ("Unable to write data
to the transport connection: An existing connection was forcibly
closed by the remote host.") every time I try to test this.

public partial class Form1 : Form
{
    MongoServer server;
    MongoDatabase database;
    MongoCollection<BsonDocument> books;

    public Form1()
    {
        InitializeComponent();
    }

    private void Form1_Load(object sender, EventArgs e)
    {
        server = MongoServer.Create("mongodb://localhost:27021/?replicaSet=foo;slaveOk=true");
        database = server["bar"];
        books = database["books"];
        BsonDocument book = new BsonDocument { { "Author", "Ernest Hemingway" }, { "Title", "For Whom The Bell Tolls" } };
        //SafeModeResult result = books.Insert(book);
    }

    private void button1_Click(object sender, EventArgs e)
    {
        BsonDocument book = books.FindOne(); // IOException happens here after I shut down a server.
        textBox1.Text = book[1].ToString();
    }
}


If I change my connection string to
"mongodb://localhost:27021,localhost:27022/?replicaSet=foo;slaveOk=true"
then I get a MongoConnectionException ("Unable to choose a server
instance.") instead.

What is the best practice for replica set failover?

Thanks,
Mo

Robert Stam

Oct 21, 2011, 11:40:49 AM
to mongodb-csharp
Failover is automatic but not instantaneous. There is a short period
of time during which you will receive exceptions.

If you keep retrying it will eventually succeed, but not until a new
primary has been elected.
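
One way to ride out that election window is a small retry wrapper
around each call. The following is only a minimal sketch under stated
assumptions: RetrySketch, Retry, and the isTransient predicate are
hypothetical names introduced here and are not part of the driver, and
the attempt count and delay are arbitrary.

```csharp
using System;
using System.Threading;

// Hypothetical helper for illustration; not part of the MongoDB C# driver.
public static class RetrySketch
{
    // Runs the operation, retrying up to maxAttempts times when isTransient
    // says the exception is worth retrying. The last failure is rethrown.
    public static T Retry<T>(Func<T> operation, int maxAttempts, TimeSpan delay,
                             Func<Exception, bool> isTransient)
    {
        if (operation == null) throw new ArgumentNullException("operation");
        if (isTransient == null) throw new ArgumentNullException("isTransient");
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException("maxAttempts");

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (Exception ex)
            {
                // Give up on non-transient errors or once attempts are exhausted.
                if (attempt >= maxAttempts || !isTransient(ex)) throw;
                Thread.Sleep(delay);
            }
        }
    }
}
```

With the code from the first post, a call might look like
RetrySketch.Retry(() => books.FindOne(), 5, TimeSpan.FromSeconds(1),
ex => ex is IOException || ex is MongoConnectionException), since those
are the two exception types reported in this thread. A retry loop only
helps once the driver actually notices the new primary.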

mo

Oct 21, 2011, 12:55:59 PM
to mongodb-csharp
Thanks for the reply, Robert. I put in a debug catch with a
messagebox, so I could test repeatedly.
After shutting down the primary server, my connected mongo shell
returns the following for rs.status()

{
    "set" : "foo",
    "date" : ISODate("2011-10-21T16:47:04Z"),
    "myState" : 1,
    "syncingTo" : "localhost:27021",
    "members" : [
        {
            "_id" : 0,
            "name" : "localhost:27021",
            "health" : 0,
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : {
                "t" : 1319208673000,
                "i" : 1
            },
            "optimeDate" : ISODate("2011-10-21T14:51:13Z"),
            "lastHeartbeat" : ISODate("2011-10-21T16:41:21Z"),
            "pingMs" : 0,
            "errmsg" : "socket exception"
        },
        {
            "_id" : 1,
            "name" : "localhost:27022",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "optime" : {
                "t" : 1319208673000,
                "i" : 1
            },
            "optimeDate" : ISODate("2011-10-21T14:51:13Z"),
            "self" : true
        },
        {
            "_id" : 2,
            "name" : "localhost:27023",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 602,
            "optime" : {
                "t" : 0,
                "i" : 0
            },
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2011-10-21T16:47:03Z"),
            "pingMs" : 0
        }
    ],
    "ok" : 1
}

However, over 5 minutes after the above rs.status(), I'm still getting
a MongoConnectionException ("Unable to choose a server instance.").
How long does this typically take?

Mo



// Current code
public partial class Form1 : Form
{
    MongoServer server;
    MongoDatabase database;
    MongoCollection<BsonDocument> books;

    public Form1()
    {
        InitializeComponent();
    }

    private void Form1_Load(object sender, EventArgs e)
    {
        server = MongoServer.Create("mongodb://localhost:27021,localhost:27022/?replicaSet=foo;slaveOk=true");
        database = server["bar"];
        books = database["books"];
        BsonDocument book = new BsonDocument { { "Author", "Ernest Hemingway" }, { "Title", "For Whom The Bell Tolls" } };
        //SafeModeResult result = books.Insert(book);
    }

    private void button1_Click(object sender, EventArgs e)
    {
        try
        {
            BsonDocument book = books.FindOne();
            MessageBox.Show(book[1].ToString());
            //textBox1.Text = book[1].ToString();
        }
        catch (Exception ex)
        {
            // Debug catch: show the exception type and message so the test can be repeated.
            MessageBox.Show(ex.GetType().ToString() + "::" + ex.Message);
        }
    }
}

Robert Stam

Oct 21, 2011, 1:17:41 PM
to mongodb-csharp
Five minutes is too long. Once rs.status() shows that a new primary has
been elected, you should stop getting exceptions.

The slaveOk=true message might be a separate issue. I'll investigate
and report back.

mo

Oct 21, 2011, 1:29:18 PM
to mongodb-csharp
Thanks for investigating. Me too :)

I'm now using the source projects (1.2 - I've downloaded 1.3, but
haven't looked in there yet) for Driver and Bson, so I can step
through. Looks like primary is null down in the middle of
MongoServer.ChooseServerInstance().
I'm continuing to look (I'll probably switch to 1.3, too), but thought
I'd share what I saw so far.

Thanks again,
Mo

mo

Oct 21, 2011, 1:44:06 PM
to mongodb-csharp
Oh, I'm actually already using the 1.3 codebase.
Sorry for the misinformation. I'm a little new to Git.

Mo

mo

Oct 21, 2011, 5:55:55 PM
to mongodb-csharp
I'm sure it's not the correct place to do this, but I saw that primary
was never getting set to the new primary after losing the old primary,
so I made the following change in
MongoServer.InstanceStateChanged(object sender, object args):

// This is the existing part
if (instance.IsPrimary && instance.State == MongoServerState.Connected && instances.Contains(instance)) {
    if (primary != instance) {
        primary = instance; // new primary
    }
} else {
    if (primary == instance) {
        primary = null; // no primary until we find one again
    }
}
// This is the new part I added
if (primary == null && instances.Find(x => x.IsPrimary) != null) {
    primary = instances.Find(x => x.IsPrimary);
}


I've been testing pretty rigorously since making this change, and
apart from an exception between TimerCallbacks immediately following
the change of primary, all reads and writes have worked perfectly, and
so did removing either of the servers from the connection string.

Like I said, it's a bit of a hack, but it works for me. Hope this
information is useful.

Thanks,
Mo


Robert Stam

Oct 22, 2011, 2:00:19 PM
to mongodb-csharp
Not sure why this would be necessary. When the other instance's
IsPrimary changes to true, it should also call InstanceStateChanged,
and the primary field will be assigned the correct value at that time.

mo

Oct 24, 2011, 11:41:21 AM
to mongodb-csharp
Okay, a little more research has revealed some (hopefully) useful
info:
When the primary goes down, the secondary becomes the new primary, but
the instance's "State" property never changes from connected, so the
State property setter's "if (state != value)" line prevents
InstanceStateChanged ever getting called on that instance after it
becomes the new primary.
It seems that once a write is attempted...

books.Insert(new BsonDocument { { "Author", "Ernest Hemingway" }, { "Title", "For Whom The Bell Tolls" } });

...the below call stack happens...

MongoDB.Driver.MongoServer.InstanceStateChanged(object sender = {MongoDB.Driver.MongoServerInstance}, object args = null)
MongoDB.Driver.MongoServerInstance.State.set(MongoDB.Driver.MongoServerState value = Connected)
MongoDB.Driver.MongoServerInstance.VerifyState(MongoDB.Driver.Internal.MongoConnection connection = {MongoDB.Driver.Internal.MongoConnection})
MongoDB.Driver.MongoServerInstance.Connect(bool slaveOk = true)
MongoDB.Driver.Internal.ReplicaSetConnector.ConnectWorkItem(object argsObject = {MongoDB.Driver.Internal.ReplicaSetConnector.ConnectArgs})
[External Code]

...which flips the instance's state from connected to connecting (and
eventually back again to connected), causing Instances.Primary to be
set to this new primary instance. However, this call stack ONLY
happens when a write is attempted. If I just continue attempting to
read (since SlaveOk is true), the state remains "Connected"
indefinitely, and no new Primary gets set.
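
The behaviour described above can be reproduced in a small standalone
model. This is a sketch only: ModelState, ModelInstance, ModelServer,
and FailoverDemo are hypothetical stand-ins written from the
description in this thread, not the driver's actual source. The one
detail carried over is the "if (state != value)" guard in the State
setter.

```csharp
using System;

// Hypothetical, simplified model; none of these types come from the driver.
public enum ModelState { Disconnected, Connecting, Connected }

public class ModelInstance
{
    private ModelState state = ModelState.Disconnected;

    // Changing this flag raises no event, as in the scenario described.
    public bool IsPrimary { get; set; }

    public event EventHandler StateChanged;

    public ModelState State
    {
        get { return state; }
        set
        {
            // The guard: assigning the same value never notifies listeners.
            if (state != value)
            {
                state = value;
                EventHandler handler = StateChanged;
                if (handler != null) handler(this, EventArgs.Empty);
            }
        }
    }
}

public class ModelServer
{
    public ModelInstance Primary { get; private set; }

    public void Track(ModelInstance instance)
    {
        instance.StateChanged += OnInstanceStateChanged;
    }

    // The primary is only re-evaluated when an instance reports a state change.
    private void OnInstanceStateChanged(object sender, EventArgs e)
    {
        ModelInstance instance = (ModelInstance)sender;
        if (instance.IsPrimary && instance.State == ModelState.Connected)
        {
            Primary = instance;
        }
        else if (Primary == instance)
        {
            Primary = null;
        }
    }
}

public static class FailoverDemo
{
    // Returns true when the server ends up tracking the newly elected primary.
    public static bool Run(bool cycleState)
    {
        ModelServer server = new ModelServer();
        ModelInstance oldPrimary = new ModelInstance { IsPrimary = true };
        ModelInstance secondary = new ModelInstance { IsPrimary = false };
        server.Track(oldPrimary);
        server.Track(secondary);
        oldPrimary.State = ModelState.Connected;
        secondary.State = ModelState.Connected;

        oldPrimary.State = ModelState.Disconnected; // primary goes down
        secondary.IsPrimary = true;                 // elected, but no event is raised
        secondary.State = ModelState.Connected;     // same value, so the guard suppresses the event

        if (cycleState)
        {
            // Stands in for the reconnect that a write attempt triggers.
            secondary.State = ModelState.Connecting;
            secondary.State = ModelState.Connected;
        }
        return ReferenceEquals(server.Primary, secondary);
    }
}
```

In this model FailoverDemo.Run(false) returns false, matching the
read-only case where Primary stays null, and FailoverDemo.Run(true)
returns true, matching the case where a write forces the state to
cycle. A real fix would presumably also need to notify on IsPrimary
changes rather than on State alone.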

Do you need me to create a JIRA ticket for this?

Thanks again,
Mo

P.S. This may have been difficult to catch: if there are multiple
secondaries when the primary goes down, one of them will become the
next primary and, although MongoServer.Primary will continue to be
null, the following part of MongoServer.ChooseServerInstance(bool
slaveOk) will ensure that reads continue to work fine against one of
the other secondaries:

if (instance.State == MongoServerState.Connected && (instance.IsSecondary || instance.IsPassive)) {
    return instance;
}

Robert Stam

Oct 24, 2011, 11:47:57 AM
to mongodb-csharp
Yes. Please create a JIRA ticket for this and copy the information
from here to the ticket.

Looks like a good analysis. Thanks for your efforts!