Why does rs.reconfig always take down primary briefly?

idris

Apr 25, 2012, 11:17:05 AM
to mongod...@googlegroups.com
For example, I just had a slaveDelay member and I removed the slaveDelay, then saved the config using rs.reconfig(), and the primary went down briefly, and apparently dropped all connections to clients.  Why does it need to do this?  Shouldn't there be an option for "don't go down upon reconfig" or something?

-Idris
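
For readers following along, the change described above can be sketched roughly as follows. This is only an illustration, not the exact commands used in this thread: `withoutSlaveDelay`, the sample config, and the 3600-second delay are all hypothetical. In the mongo shell, the current config comes from `rs.conf()` and the result would be passed to `rs.reconfig()`.

```javascript
// Hypothetical helper: return a copy of a replica set config in which the
// member with the given _id no longer carries a slaveDelay field.
function withoutSlaveDelay(cfg, memberId) {
  var copy = {};
  for (var k in cfg) {
    copy[k] = cfg[k];
  }
  // Copy each member field by field (shallow), skipping slaveDelay on the target.
  copy.members = cfg.members.map(function (m) {
    var out = {};
    for (var key in m) {
      if (!(m._id === memberId && key === "slaveDelay")) {
        out[key] = m[key];
      }
    }
    return out;
  });
  return copy;
}

// Made-up two-member config, for illustration only.
var sample = {
  _id: "myrepl",
  version: 18,
  members: [
    { _id: 0, host: "10.0.51.1:27017" },
    { _id: 3, host: "10.0.51.20:27017", priority: 0, hidden: true, slaveDelay: 3600 }
  ]
};

var updated = withoutSlaveDelay(sample, 3);

// In the mongo shell (not runnable outside it), the equivalent would be:
//   rs.reconfig(withoutSlaveDelay(rs.conf(), 3))
```

The helper leaves the original object untouched, so the old and new configs can be compared before anything is saved.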

Kyle Banker

Apr 25, 2012, 12:15:36 PM
to mongod...@googlegroups.com
Which version of MongoDB are you running? This was an issue in v1.8 but should not be an issue post 2.0.

idris

Apr 25, 2012, 2:00:11 PM
to mongod...@googlegroups.com
Running version 2.0.4.  I'll try to dig up some logs.  Is there any documentation on which configuration changes will and will not force an election or a stepDown?

Kyle Banker

Apr 25, 2012, 3:03:48 PM
to mongod...@googlegroups.com
How many total nodes were you running when you removed the slavedelay node?

idris

Apr 25, 2012, 5:40:45 PM
to mongod...@googlegroups.com
Running 7 nodes total:
1 primary, 3 secondaries, 1 arbiter, 1 backup (hidden, priority:0), 1 off-site secondary (hidden, priority:0, votes:0)

Here are the logs of the primary and a secondary when I ran the reconfig command. In this reconfig, I had set votes:0 on the backup node.

I stripped out all of the noise (connection accepted, end connection) from those logs as well.

Primary definitely relinquishes its primary state...
Wed Apr 25 17:20:42 [conn417285] replSet replSetReconfig config object parses ok, 7 members specified
Wed Apr 25 17:20:42 [conn417285] replSet replSetReconfig [2]
Wed Apr 25 17:20:42 [conn417285] replSet info saving a newer config version to local.system.replset
Wed Apr 25 17:20:42 [conn417285] replSet saveConfigLocally done
Wed Apr 25 17:20:42 [conn417285] replSet relinquishing primary state
Wed Apr 25 17:20:42 [conn417285] replSet SECONDARY
Wed Apr 25 17:20:42 [conn417285] replSet closing client sockets after reqlinquishing primary
Wed Apr 25 17:20:42 [conn417285] replSet PRIMARY

idris

Apr 25, 2012, 6:35:38 PM
to mongod...@googlegroups.com
By the way, the docs (http://www.mongodb.org/display/DOCS/Reconfiguring+when+Members+are+Up) also say that the primary steps down and closes sockets from clients upon a reconfig, but only "in certain circumstances". It would be helpful to know which circumstances will cause a stepdown.

"In certain circumstances, the primary steps down (perhaps transiently) on a reconfiguration. On a step-down, the primary closes sockets from clients to assure the clients know quickly that the server is no longer primary. Thus, your shell session may experience a disconnect on a reconfig command."
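
Since a reconfig may drop the caller's connection, one way to cope on the calling side is to retry the next command a few times. A minimal sketch, assuming a synchronous caller such as the mongo shell; `retry` and `flaky` are hypothetical names, and nothing here is part of MongoDB's API:

```javascript
// Hypothetical helper: call fn(), retrying up to maxAttempts times if it throws.
function retry(fn, maxAttempts) {
  var lastError;
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return fn();
    } catch (e) {
      lastError = e; // e.g. a socket error right after the primary steps down
    }
  }
  throw lastError;
}

// Stand-in for a command that fails twice (as if disconnected), then succeeds.
var calls = 0;
function flaky() {
  calls++;
  if (calls < 3) {
    throw new Error("socket exception");
  }
  return "ok";
}

var result = retry(flaky, 5);
```

In the shell, `fn` could wrap something like `rs.status()`. A real version would probably also pause between attempts rather than retrying immediately.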

idris

Apr 28, 2012, 10:33:50 PM
to mongod...@googlegroups.com
*bump*  Is this worthy of a bug report in Jira?

Scott Hernandez

Apr 29, 2012, 12:14:31 AM
to mongod...@googlegroups.com
Can you post the config -- rs.conf() -- before and after as well as
the status -- rs.status()?

Also, please post the logs from before the save showing the state of
the world in terms of who was up/down via the replica status messages.
> --
> You received this message because you are subscribed to the Google Groups
> "mongodb-user" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/mongodb-user/-/8pRjE2tPn7MJ.
>
> To post to this group, send email to mongod...@googlegroups.com.
> To unsubscribe from this group, send email to
> mongodb-user...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/mongodb-user?hl=en.

idris

Apr 29, 2012, 10:08:53 PM
to mongod...@googlegroups.com
rs.conf() before:
{
    "_id" : "myrepl",
    "version" : 19,
    "members" : [
        {
            "_id" : 0,
            "host" : "10.0.51.1:27017",
            "priority" : 10,
            "tags" : {
                "dc" : "va"
            }
        },
        {
            "_id" : 1,
            "host" : "10.0.51.2:27017",
            "priority" : 20,
            "tags" : {
                "dc" : "va"
            }
        },
        {
            "_id" : 2,
            "host" : "10.0.51.3:27017",
            "tags" : {
                "dc" : "va"
            }
        },
        {
            "_id" : 3,
            "host" : "10.0.51.20:27017",
            "priority" : 0,
            "hidden" : true,
            "tags" : {
                "dc" : "va"
            }
        },
        {
            "_id" : 4,
            "host" : "10.0.51.21:27017",
            "arbiterOnly" : true
        },
        {
            "_id" : 5,
            "host" : "10.0.51.4:27017",
            "tags" : {
                "dc" : "va"
            }
        },
        {
            "_id" : 6,
            "host" : "mongo901:27017",
            "votes" : 0,
            "priority" : 0,
            "hidden" : true,
            "tags" : {
                "dc" : "ca"
            }
        }
    ]
}


rs.conf() after:
{
    "_id" : "myrepl",
    "version" : 20,
    "members" : [
        {
            "_id" : 0,
            "host" : "10.0.51.1:27017",
            "priority" : 10,
            "tags" : {
                "dc" : "va"
            }
        },
        {
            "_id" : 1,
            "host" : "10.0.51.2:27017",
            "priority" : 20,
            "tags" : {
                "dc" : "va"
            }
        },
        {
            "_id" : 2,
            "host" : "10.0.51.3:27017",
            "tags" : {
                "dc" : "va"
            }
        },
        {
            "_id" : 3,
            "host" : "10.0.51.20:27017",
            "votes" : 0,
            "priority" : 0,
            "hidden" : true,
            "tags" : {
                "dc" : "va"
            }
        },
        {
            "_id" : 4,
            "host" : "10.0.51.21:27017",
            "arbiterOnly" : true
        },
        {
            "_id" : 5,
            "host" : "10.0.51.4:27017",
            "tags" : {
                "dc" : "va"
            }
        },
        {
            "_id" : 6,
            "host" : "mongo901:27017",
            "votes" : 0,
            "priority" : 0,
            "hidden" : true,
            "tags" : {
                "dc" : "ca"
            }
        }
    ]
}
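
Apart from the version number, the two configs above differ only in member 3, which gained "votes" : 0. That takes the set from 6 votes to 5. The sketch below checks this in plain JavaScript. The helper names are hypothetical, the member objects are trimmed to the fields that matter (hosts and tags omitted), and it assumes a member with no explicit "votes" field has one vote. Whether this change is what triggers the step-down is not established in this thread.

```javascript
// Sum the votes in a config; a member with no "votes" field counts as 1.
function totalVotes(cfg) {
  return cfg.members.reduce(function (sum, m) {
    return sum + (m.votes === undefined ? 1 : m.votes);
  }, 0);
}

// Return the _ids of members whose settings differ between two configs.
// The comparison is a simple string compare, so it is sensitive to key order.
function changedMembers(before, after) {
  var byId = {};
  before.members.forEach(function (m) {
    byId[m._id] = JSON.stringify(m);
  });
  return after.members
    .filter(function (m) { return byId[m._id] !== JSON.stringify(m); })
    .map(function (m) { return m._id; });
}

// Trimmed copies of the version 19 and version 20 configs posted above.
var before = {
  version: 19,
  members: [
    { _id: 0, priority: 10 },
    { _id: 1, priority: 20 },
    { _id: 2 },
    { _id: 3, priority: 0, hidden: true },
    { _id: 4, arbiterOnly: true },
    { _id: 5 },
    { _id: 6, votes: 0, priority: 0, hidden: true }
  ]
};

var after = {
  version: 20,
  members: [
    { _id: 0, priority: 10 },
    { _id: 1, priority: 20 },
    { _id: 2 },
    { _id: 3, votes: 0, priority: 0, hidden: true },
    { _id: 4, arbiterOnly: true },
    { _id: 5 },
    { _id: 6, votes: 0, priority: 0, hidden: true }
  ]
};

var votesBefore = totalVotes(before);        // 6
var votesAfter = totalVotes(after);          // 5
var changed = changedMembers(before, after); // [3]
```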


I don't see anything useful in the logs regarding up/down state before the reconfig, but I know that all nodes were up and reachable from all other nodes. This was not a one-time event; the same thing happens every time I do a reconfig.


Kyle Banker

Apr 30, 2012, 11:00:03 AM
to mongod...@googlegroups.com
Can you please post rs.status() before and after as well?

idris

Apr 30, 2012, 3:18:09 PM
to mongod...@googlegroups.com
Unfortunately, I don't have those from this past time. I can tell you that before the reconfig, everything was normal (_id=1 was primary, no lag on any secondaries). After the reconfig, the host that had been primary was now secondary, and there was no primary. This lasted a couple of minutes, until a primary was finally elected, resulting in a normal status again (_id=1 primary, no lag on secondaries).

If I remember correctly, while there was no primary, rs.status() reported that the other nodes were unreachable.  You can see that in the primary's logs I posted before: 
[rsHealthPoll] replSet info 10.0.51.3:27017 is down (or slow to respond): socket exception
[rsHealthPoll] replSet info 10.0.51.1:27017 is down (or slow to respond): socket exception
[rsHealthPoll] replSet info 10.0.51.20:27017 is down (or slow to respond): socket exception
[rsHealthPoll] replSet info 10.0.51.21:27017 is down (or slow to respond): socket exception
[rsHealthPoll] replSet info 10.0.51.4:27017 is down (or slow to respond): socket exception
[rsHealthPoll] replSet info mongo901:27017 is down (or slow to respond): socket exception

Barrie

May 30, 2012, 3:27:40 PM
to mongod...@googlegroups.com
I think that you should assume that any reconfig will cause a stepdown.
