Problem electing new PRIMARY after rs.stepDown()

Victor Merino

Sep 19, 2016, 11:57:08 AM
to mongodb-user
Hello,

I was upgrading all our servers from MongoDB 3.0.9 to 3.0.12, and something weird happened. Once all the SECONDARY and ARBITER members were up to date, I tried to step down the primary in order to upgrade it, but it was unable to step down, or even shut down, and failed with the following message:


assert failed : unexpected error: Error: stepDown failed: No electable secondaries caught up as of ...

After some investigation and some configuration changes, I realized that the priority of the PRIMARY was 3 and that of the other members was 1. Once I reduced the current PRIMARY's priority to 1, I was able to step down.

Reading about priority in the official docs:

The priority settings of replica set members affect both the timing and the outcome of elections for primary. Higher-priority members are more likely to call elections, and are more likely to win.

I understand the priority settings, but was that behavior what prevented the PRIMARY from stepping down? I am particularly interested in that server becoming primary as soon as an election happens, but the set should also be able to elect another primary if something happens to the current one.

Thanks for your help!

William Hagan

Sep 20, 2016, 2:18:06 PM
to mongodb-user
Hello Victor,

The priority setting shouldn't prevent a stepdown.

My suspicion is that the step down failed because the secondary nodes took a long time to catch up.

A quote from the rs.stepDown() documentation reads:

To avoid rollbacks, rs.stepDown(), by default, only steps down the primary if an electable secondary is completely caught up with the primary. The command will wait up to either 10 seconds or the secondaryCatchUpPeriodSecs for a secondary to catch up.

If no electable secondary meets this criterion by the waiting period, the primary does not step down and the method throws an exception.


You may confirm this by comparing the optime of the primary and secondary nodes.
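
As a rough illustration of that comparison, here is a minimal sketch that takes a document shaped like the output of rs.status() and reports how many seconds each SECONDARY is behind the PRIMARY, based on the optimeDate field. The function name secondaryLagSeconds, the host names, and the sample values are made up for the example; in the mongo shell you would pass rs.status() instead of the sample document.

```javascript
// Sketch only: compare optimeDate of each SECONDARY against the PRIMARY.
// Written in ES5 style so it should also load in older mongo shells.
function secondaryLagSeconds(status) {
  // Locate the member currently reporting itself as PRIMARY.
  var primaries = status.members.filter(function (m) {
    return m.stateStr === "PRIMARY";
  });
  if (primaries.length === 0) {
    throw new Error("no PRIMARY found in status.members");
  }
  var primary = primaries[0];

  // Arbiters hold no data, so only SECONDARY members are compared.
  return status.members
    .filter(function (m) { return m.stateStr === "SECONDARY"; })
    .map(function (m) {
      return {
        name: m.name,
        // Subtracting two Date objects yields milliseconds.
        lagSeconds: (primary.optimeDate - m.optimeDate) / 1000
      };
    });
}

// Hypothetical sample shaped like rs.status() output (values are invented).
var sampleStatus = {
  members: [
    { name: "db1.example.net:27017", stateStr: "PRIMARY",
      optimeDate: new Date("2016-09-19T11:57:08Z") },
    { name: "db2.example.net:27017", stateStr: "SECONDARY",
      optimeDate: new Date("2016-09-19T11:56:53Z") },
    { name: "arb.example.net:27017", stateStr: "ARBITER" }
  ]
};

var lagReport = secondaryLagSeconds(sampleStatus);
```

If every secondary shows a lag larger than the catch-up window, that would be consistent with the "No electable secondaries caught up" error.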

Regards,

Lukas Lehner

Sep 22, 2016, 2:19:51 AM
to mongod...@googlegroups.com
You can find more information about replication lag at http://blog.mlab.com/2013/03/replication-lag-the-facts-of-life/

Important commands:

db.printSlaveReplicationInfo()

db.printReplicationInfo()


Chris Cunningham

Sep 22, 2016, 4:39:54 AM
to mongodb-user

Hi Victor,

Having a higher priority setting should not have an effect on the rs.stepDown() operation.

Based on the error you initially reported,

assert failed : unexpected error: Error: stepDown failed: No electable secondaries caught up as of ...

the most likely cause is that the Secondary was not yet caught up with the Primary after your upgrade.

As suggested earlier, next time you should check your replica set status (rs.status()) as well as your replication lag to verify the optime of your Primary and Secondaries.

You may also find the following documents useful:

Thanks,

Chris

Jonas Courteau

Jun 28, 2017, 6:02:03 PM
to mongodb-user
Worth noting that if you've had zero writes replicated in the last 10 seconds, you may hit this too. It is unlikely in most production environments, but in our lab, when testing various upgrades through 3.2, we found that unless _something_ had been replicated within the last 10 seconds, we couldn't step down a primary (unless we used force or changed the maximum delay, as noted above).

Hope that helps others seeing this!
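
For anyone wanting to try the two workarounds mentioned here (a longer catch-up window, or force), below is a small sketch that builds the replSetStepDown command document. The helper name buildStepDownCommand is made up for the example; the field names replSetStepDown, secondaryCatchUpPeriodSecs, and force are the ones used by the replSetStepDown command. The check that the step-down period exceeds the catch-up period reflects my reading of the documentation, so treat it as an assumption.

```javascript
// Sketch only: build a replSetStepDown command document.
// stepDownSecs: how long the old primary stays ineligible for election.
// catchUpSecs:  how long to wait for an electable secondary to catch up.
// force:        step down even if no secondary has caught up.
function buildStepDownCommand(stepDownSecs, catchUpSecs, force) {
  var cmd = { replSetStepDown: stepDownSecs };

  if (typeof catchUpSecs === "number") {
    // Assumption: the step-down period must be longer than the catch-up period.
    if (stepDownSecs <= catchUpSecs) {
      throw new Error("stepDownSecs must be greater than catchUpSecs");
    }
    cmd.secondaryCatchUpPeriodSecs = catchUpSecs;
  }

  if (force === true) {
    cmd.force = true;
  }
  return cmd;
}

// Wait up to 60 seconds for a secondary to catch up, then step down for 120.
var patient = buildStepDownCommand(120, 60);

// Step down for 60 seconds regardless of secondary state (risks rollbacks).
var forced = buildStepDownCommand(60, undefined, true);

// In the mongo shell, connected to the primary (not run here):
//   db.adminCommand(patient);
// The shell helper rs.stepDown(120, 60) takes the same two periods.
```

Note that forcing a step down while no secondary is caught up can lead to rollbacks, so the longer catch-up window is usually the safer first thing to try.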
