"Safemode detected an error 'not master'." when the primary switches from one instance to another

Demitri Baroutsos

Jul 3, 2012, 6:36:58 AM
to mongodb...@googlegroups.com
All our MongoDB instances currently run version 2.0.5 of the MongoDB server, configured as a replica set.
Each server is hosted on a separate machine to allow for failover.

We are using version 1.3.0.4309 of the C# driver and are seeing the following error. It occurs just after a network failure, when one of the other server instances takes over as primary in the replica set:

Safemode detected an error 'not master'. (Response was { "err" : "not master", "code" : 10058, "n" : 0, "lastOp" : NumberLong(0), "connectionId" : 140567, "ok" : 1.0 }).

Here's the stack-trace of that error occurrence:
   at MongoDB.Driver.Internal.MongoConnection.SendMessage(MongoRequestMessage message, SafeMode safeMode)
   at MongoDB.Driver.MongoCollection.InsertBatch(Type nominalType, IEnumerable documents, MongoInsertOptions options)
   at MongoDB.Driver.MongoCollection.Insert(Type nominalType, Object document, MongoInsertOptions options)
   at MongoDB.Driver.MongoCollection.Insert(Type nominalType, Object document, SafeMode safeMode)
   at MongoDB.Driver.MongoCollection.Insert(Type nominalType, Object document)
   at MongoDB.Driver.MongoCollection.Insert[TNominalType](TNominalType document)

Our connection string has all three server names and ports as per the documentation.

It appears that the MongoDB connection in the C# driver does not check whether the instance it believes to be primary is still the primary when performing a write operation. The only way we can recover is to restart the application pool the web application runs in. This forces the next request to re-create the database connection object, which then discovers which of the three server instances is now the primary.

Is this perhaps a known issue with the version of the server software and C# driver we are using? We are planning on updating the code-base to use the latest version of the C# driver shortly.

Thanks in advance

Regards,
Demitri

Scott Hernandez

Jul 3, 2012, 7:15:44 AM
to mongodb...@googlegroups.com
How are you connecting to the server? What connection string (URI) or constructor (MongoServerSettings) are you using?

Demitri Baroutsos

Jul 3, 2012, 8:01:34 AM
to mongodb...@googlegroups.com
Our connection string format is as follows:

mongodb://mongoDBUser:MongoDB...@10.40.123.4:27017,10.80.123.5:27017,10.80.123.6:27017/?safe=true;w=2;wtimeoutMS=2000;slaveOk=true

We keep a static instance of the server and database objects as follows:

        private static MongoServer _server;
        private static MongoServer DBServer
        {
            get
            {
                if (_server == null)
                {
                    _server = MongoServer.Create(ConfigurationManager.ConnectionStrings["MyMongoConnectionString"].ToString());
                }
                return _server;
            }
        }

        private static MongoDatabase _db;
        internal static MongoDatabase DB
        {
            get
            {
                if (_db == null)
                {
                    _db = DBServer.GetDatabase("MyMongoDBName");
                }
                return _db;
            }
        }

We fetch collections from the DB object using generic types, as follows:

internal static MongoCollection GetCollection<T>()
{
    return DB.GetCollection(typeof(T), typeof(T).Name);
}

Could it be that the issue is caused by us keeping static instances of the DB and Server objects in a web environment?

To reiterate the connectivity issue: the web server keeps its connection to the current primary DB server. However, the two other instances (according to their logs) lose connectivity to that primary, and one of them is elected as the new primary. When the web server then tries to write, it appears to still treat the original instance as the primary, so the write fails with the "Safemode detected an error 'not master'." error, when in fact one of the other servers has taken over as primary. Only restarting the app pool on the web server, or restarting the MongoDB instance that WAS the primary, resolves the error.

Regards,
Demitri

craiggwilson

Jul 3, 2012, 8:21:33 AM
to mongodb...@googlegroups.com
I'm not sure whether this is a known issue with that version, but I do know a lot of work has gone into replica set connections since that version was released, which was a long time ago. If you are planning on updating, then that should fix the issue.

Robert Stam

Jul 3, 2012, 8:50:04 AM
to mongodb...@googlegroups.com
Sounds like you are expecting the driver to handle the failover from one primary to another without throwing any exceptions at all, which is not the case for several reasons:

1. checking whether the member we think is primary is still primary before each operation would double the number of round trips to the server
2. failover is not instantaneous; until the election finishes there is no primary

The driver does not currently retry any operations, so you will see some exceptions in your application while the failover is in progress and your application should be prepared to handle these. We think the application should handle all exceptions because only the application knows:

1. whether it even wants to retry
2. what kinds of exceptions should be retried
3. how many times to retry
4. for how long to retry
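
As an illustration of those four decisions, here is a minimal sketch of an application-level retry wrapper. It is not part of the C# driver: `RetryHelper`, `Execute`, `IsNotMaster` and every parameter name are hypothetical, and matching on the message text "not master" is an assumption based on the error quoted earlier in this thread. The demo uses a stand-in operation rather than a real insert, so it runs without a server.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper (not a driver API): the caller decides what to retry,
// how many times, and for how long.
public static class RetryHelper
{
    public static T Execute<T>(
        Func<T> operation,                 // the work to attempt, e.g. an insert
        Func<Exception, bool> shouldRetry, // which exceptions are worth retrying
        int maxAttempts,                   // how many times to try in total
        TimeSpan maxDuration,              // overall time budget
        TimeSpan delay)                    // pause between attempts
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException("maxAttempts");

        var stopwatch = Stopwatch.StartNew();
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (Exception ex)
            {
                // Give up when the attempt or time budget is exhausted,
                // or when the caller does not consider this error retryable.
                bool outOfBudget = attempt >= maxAttempts
                    || stopwatch.Elapsed + delay > maxDuration;
                if (outOfBudget || !shouldRetry(ex))
                {
                    throw; // rethrow the original exception unchanged
                }
            }
            Thread.Sleep(delay); // give the replica set time to finish its election
        }
    }

    // Assumption: the failover error can be recognised by its message text.
    public static bool IsNotMaster(Exception ex)
    {
        return ex.Message.IndexOf("not master", StringComparison.OrdinalIgnoreCase) >= 0;
    }
}

public static class RetryDemo
{
    public static void Main()
    {
        int calls = 0;
        // Stand-in for a write: fails twice with a "not master" error, then succeeds.
        string result = RetryHelper.Execute(
            () =>
            {
                calls++;
                if (calls < 3)
                {
                    throw new InvalidOperationException("Safemode detected an error 'not master'.");
                }
                return "inserted";
            },
            RetryHelper.IsNotMaster,
            5,
            TimeSpan.FromSeconds(10),
            TimeSpan.FromMilliseconds(100));

        Console.WriteLine(result + " after " + calls + " attempts"); // inserted after 3 attempts
    }
}
```

Keeping the predicate, attempt count and time budget as parameters leaves each of the four choices with the calling code. A write that is not safe to repeat can simply be run without the wrapper.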

Keep in mind that you could also get other kinds of exceptions now and then (network down, etc.), so you need to be prepared to handle exceptions anyway. The exceptions that occur during failover from one primary to another are just one kind of exception you need to handle.

You should upgrade to a current version of the driver soon anyway, but in this case upgrading to 1.4.2 (or 1.5) won't change the behavior you are describing, which is working as designed.

Demitri Baroutsos

Jul 3, 2012, 12:04:36 PM
to mongodb...@googlegroups.com
Thanks for the reply, Robert.

We do handle exceptions and don't expect the driver to do all the work for us. However, every subsequent write request to the database fails until we restart the app pool/IIS or the MongoDB instance that used to be primary for that replica set.

Regards,
Demitri

Robert Stam

Jul 3, 2012, 12:21:07 PM
to mongodb...@googlegroups.com
Can you try with a current version of the C# driver? Either 1.4.2 or 1.5.

I can't find any existing JIRA ticket that would be similar to what you are describing.

Demitri Baroutsos

Jul 4, 2012, 3:18:10 AM
to mongodb...@googlegroups.com
Thanks Robert - we will try with the latest driver and see if the problem persists.

Regards,
Demitri

Demitri Baroutsos

Jul 10, 2012, 4:31:35 AM
to mongodb...@googlegroups.com
Just an update: we needed a fix quite urgently, so we deployed the latest build of our code together with the latest C# driver. The new build no longer uses static instances of the Database and Server contexts; we now create instances on demand.

So far we have not encountered any problems with failover.

Regards,
Demitri

Robert Stam

Jul 10, 2012, 9:48:59 AM
to mongodb...@googlegroups.com
Thanks for the update.

MongoServer.Create always returns the same instance of MongoServer when called with the same parameters, so even though you may be calling MongoServer.Create on demand, you're still reusing the single instance of MongoServer.
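
To make that behavior concrete, here is a small sketch that mimics it with stand-in types. It does not use or reproduce the driver's code: `FakeServer` and `FakeServerFactory` are hypothetical names, and keying the cache on the connection string is an assumption made purely for illustration.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for a server object; this is not the driver's MongoServer.
public sealed class FakeServer
{
    public string ConnectionString { get; private set; }

    public FakeServer(string connectionString)
    {
        ConnectionString = connectionString;
    }
}

// Hypothetical factory that hands back one shared instance per connection string.
public static class FakeServerFactory
{
    private static readonly ConcurrentDictionary<string, Lazy<FakeServer>> Cache =
        new ConcurrentDictionary<string, Lazy<FakeServer>>();

    public static FakeServer Create(string connectionString)
    {
        // Lazy<T> ensures only one FakeServer is constructed per key,
        // even if several threads call Create at the same time.
        return Cache.GetOrAdd(
            connectionString,
            key => new Lazy<FakeServer>(() => new FakeServer(key))).Value;
    }
}

public static class CacheDemo
{
    public static void Main()
    {
        FakeServer a = FakeServerFactory.Create("mongodb://host1,host2,host3/");
        FakeServer b = FakeServerFactory.Create("mongodb://host1,host2,host3/");
        FakeServer c = FakeServerFactory.Create("mongodb://otherhost/");

        Console.WriteLine(ReferenceEquals(a, b)); // True: same parameters, same instance
        Console.WriteLine(ReferenceEquals(a, c)); // False: different parameters
    }
}
```

With a factory that behaves this way, calling `Create` on every request and holding the result in a static field both end up sharing one underlying object.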

Demitri Baroutsos

Jul 10, 2012, 9:58:19 AM
to mongodb...@googlegroups.com
Thanks for the confirmation and the prompt responses, Robert.

Demitri