Primary steps down when secondaries stopped

Jeremy Wilson

Jun 15, 2012, 11:03:51 AM
to mongod...@googlegroups.com
Yesterday, during our day of hell with MongoDB and sharding, I discovered another issue besides the terrible way it balances chunks.

In our case, the secondaries were really far behind the primary, so we thought bringing down some of the secondaries might free up resources and speed up the replication. However, when I did this it forced an election and the Primary - which was fine - decided to stop being primary, so I had 2 servers set to secondary. I had to bring up ALL the secondaries to get the primary to be elected again.

Since there's no way to force a server to be primary, why does the server that is already primary step down?

Kristina Chodorow

Jun 15, 2012, 1:47:42 PM
to mongodb-user
This sounds like expected behavior: a primary needs to be able to
reach a majority of the members of the set to stay primary. This
prevents a "split brain" situation when you have a network partition.
Imagine a situation where you have a primary, A, and a secondary, B,
in one data center and then three servers, C, D, and E, in another.
If the link between the two goes down, A will stop being primary and
one of C, D, or E will be elected. If A did not step down, you would
end up with two primaries accepting writes at the same time.
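
To make the arithmetic concrete, here is a minimal Python sketch of the majority rule described above. It is an illustration only, not MongoDB's election code; the function name `has_majority` is made up, and it assumes every member holds exactly one vote.

```python
def has_majority(reachable_votes: int, total_votes: int) -> bool:
    """True if the reachable votes are a strict majority of all votes in the set."""
    return reachable_votes > total_votes // 2


# The five-member set from the example: A and B in one data center,
# C, D, and E in the other. One vote each is an assumption.
votes = {"A": 1, "B": 1, "C": 1, "D": 1, "E": 1}
total = sum(votes.values())

# After the link between the data centers goes down, each side only sees itself.
side_ab = ["A", "B"]
side_cde = ["C", "D", "E"]

ab_can_hold_primary = has_majority(sum(votes[m] for m in side_ab), total)
cde_can_elect_primary = has_majority(sum(votes[m] for m in side_cde), total)

print("A/B side keeps a primary:", ab_can_hold_primary)
print("C/D/E side can elect one:", cde_can_elect_primary)
```

Only one side of any partition can hold a strict majority, which is why at most one primary can exist at a time.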

Jeremy Wilson

Jun 15, 2012, 2:45:45 PM
to <mongodb-user@googlegroups.com>

On 2012-06-15, at 1:47 PM, Kristina Chodorow wrote:

> This sounds like expected behavior, a primary needs to be able to
> reach a majority of the members of the set to stay primary.

Which is why I set "priority : 0" in the remote DCs, so they could never be elected primary, which is how I thought that worked.

All this automated behaviour did was make my system stop working, since then there's no manual way to promote a server to primary. Are there any plans to allow admins to set things themselves to prevent this kind of thing from happening?


Kristina Chodorow

Jun 15, 2012, 3:25:48 PM
to mongodb-user
I think that what you want is to make all of the members in the other
data center have 0 votes (votes:0 in the member configs). Then the
primary will only need votes from its own DC to become primary.
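
As a rough sketch of what that suggestion means for the vote count, the Python below models a replica set config as a plain dict and counts votes. The set name, hostnames, and the three-local/two-remote layout are hypothetical; only the `priority` and `votes` member fields come from this thread. It does not talk to a server.

```python
# Hypothetical layout: the set name and hostnames are invented for illustration.
config = {
    "_id": "rs0",
    "members": [
        {"_id": 0, "host": "local1.example.net:27017"},
        {"_id": 1, "host": "local2.example.net:27017"},
        {"_id": 2, "host": "local3.example.net:27017"},
        {"_id": 3, "host": "remote1.example.net:27017", "priority": 0, "votes": 0},
        {"_id": 4, "host": "remote2.example.net:27017", "priority": 0, "votes": 0},
    ],
}


def member_votes(member):
    # Assumption: a member without an explicit "votes" field counts as one vote.
    return member.get("votes", 1)


def votes_needed(cfg):
    # Strict majority of all votes in the set.
    return sum(member_votes(m) for m in cfg["members"]) // 2 + 1


def can_be_primary(cfg, reachable_ids):
    # reachable_ids includes the candidate itself.
    seen = sum(member_votes(m) for m in cfg["members"] if m["_id"] in reachable_ids)
    return seen >= votes_needed(cfg)


# Link to the remote data center is down: the three locals still see each other.
remote_link_down = can_be_primary(config, {0, 1, 2})
# Two locals are down: the remotes are reachable but carry no votes.
two_locals_down = can_be_primary(config, {0, 3, 4})

print(votes_needed(config), remote_link_down, two_locals_down)
```

With the remote members at zero votes, losing the remote data center no longer costs the local primary its majority, but losing most of the local voters still does.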

Jeremy Wilson

Jun 15, 2012, 3:33:14 PM
to <mongodb-user@googlegroups.com>

On 2012-06-15, at 3:25 PM, Kristina Chodorow wrote:

> I think that what you want is to make all of the members in the other
> data center have 0 votes (votes:0 in the member configs). Then the
> primary will only need votes from its own DC to become primary.

But that doesn't help when two of the other locals are down and it demotes itself.

Are there any plans to allow admins to control this manually? As it stands having NO primary is worse than split-brain issues, at least for us.

Kristina Chodorow

Jun 15, 2012, 3:44:18 PM
to mongodb-user
Well... you could make it the only voting member. That is basically
the same as controlling it manually. You'll have to do failover
manually, too, though. There are no plans to make this a manual
process as most people prefer failover.

Jeremy Wilson

Jun 15, 2012, 4:11:03 PM
to <mongodb-user@googlegroups.com>

On 2012-06-15, at 3:44 PM, Kristina Chodorow wrote:

> Well... you could make it the only voting member. That is basically
> the same as controlling it manually. You'll have to do failover
> manually, too, though. There are no plans to make this a manual
> process as most people prefer failover.

I think you're missing my point. The auto failover is fine for local machines. But if a server is set for "votes : 0" as you say, it should never become master, this handles DBs in remote DCs. But in the local DC, where I have four machines and the three replicas go down, the last remaining server is *already* primary, shouldn't it *stay* primary?

I'm not talking about promotion, I'm talking about *not demoting*.

Kristina Chodorow

Jun 15, 2012, 4:59:02 PM
to mongodb-user
Theoretically, we could do something with data center awareness to get
something like you're talking about, but there are no plans to do so.

In general, you can't have it both ways. Either you get automatic
failover or you can control who's primary (or you can get both, but
then you have to handle multiple masters and write conflicts). We
went with auto failover.

Jeremy Wilson

Jun 15, 2012, 6:13:27 PM
to <mongodb-user@googlegroups.com>

On 2012-06-15, at 4:59 PM, Kristina Chodorow wrote:

> In general, you can't have it both ways.

Why can't the current primary *stay primary*? I don't see any problem with this. The disconnected slaves in the remote DCs were in secondary mode and have a setting saying "don't run an election", so they stay secondary even if there are enough members for an election.

Back in the main DC, there are now X servers and one is primary. Why must it go into secondary if it was *already* primary?

I can see your point if the primary dies and there aren't enough slaves to elect a primary, but if there's already one going it should safely continue being primary.


Karl Seguin

Jun 15, 2012, 9:25:45 PM
to mongod...@googlegroups.com
Personally, I think you are both right.

Jeremy is right that allowing DBAs to manually force a primary, regardless of the state of the set, is reasonable. The split-brain concern seems irrelevant when explicit actions are taken by admins.

It also makes sense that the primary _must_ go into secondary. It isn't enough to be primary in order to stay primary. The system has to be paranoid and constantly check "am I still the master?". This is simply not possible to do, automatically, without a consensus. If you don't have a consensus, how do you know another server doesn't?

So: it worked as expected, but admins should be able to manually force a server into primary.

Glenn Maynard

Jun 16, 2012, 10:12:21 AM
to mongod...@googlegroups.com
On Fri, Jun 15, 2012 at 3:11 PM, Jeremy Wilson <JWi...@keek.com> wrote:
> I think you're missing my point. The auto failover is fine for local machines. But if a server is set for "votes : 0" as you say, it should never become master, this handles DBs in remote DCs. But in the local DC, where I have four machines and the three replicas go down, the last remaining server is *already* primary, shouldn't it *stay* primary?

No, because the last server doesn't know that the other servers are down.  It has to assume that *it's* the one down--that it's disconnected from the world, and the other three are alive and electing a new primary without it.  A server can only stay primary if it can reach enough servers to maintain the consensus--only so long as it would be possible for it to be re-elected from scratch.
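
A small Python sketch of that argument, under the assumption of four members with one vote each (the names `P`, `S1`, `S2`, `S3` and the helper `sees_majority` are invented for illustration). The point is that the primary's own view is the same whether the others crashed or it was cut off itself.

```python
def sees_majority(reachable: set, all_members: set) -> bool:
    """True if the reachable members are a strict majority of the whole set."""
    return len(reachable) > len(all_members) // 2


# Hypothetical four-member local set: one primary and three secondaries.
all_members = {"P", "S1", "S2", "S3"}

# World 1: the three secondaries really are down.
# World 2: P is cut off from the network and the secondaries are healthy.
# In both worlds P's view is identical: it can reach only itself.
view_from_p = {"P"}
p_keeps_primary = sees_majority(view_from_p, all_members)

# In world 2 the secondaries can still see each other and hold an election.
view_from_secondaries = {"S1", "S2", "S3"}
others_can_elect = sees_majority(view_from_secondaries, all_members)

print("P may stay primary:", p_keeps_primary)
print("Secondaries could elect a new primary in world 2:", others_can_elect)
```

Because P cannot tell the two worlds apart, staying primary in world 1 would also mean staying primary in world 2, where a second primary could be elected.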

On Fri, Jun 15, 2012 at 3:59 PM, Kristina Chodorow <kris...@10gen.com> wrote:
> In general, you can't have it both ways. Either you get automatic
> failover or you can control who's primary (or you can get both, but
> then you have to handle multiple masters and write conflicts). We
> went with auto failover.

Well, I think what he was asking for is a way to temporarily switch off automatic failover, and force a node to become primary.  That does make sense; the administrator just has to be careful to do it correctly so you never end up with multiple primaries.  This could also be used to disable voting entirely and create an external failover/voting mechanism, which would be analogous to turning off the balancer and handling chunk migrations yourself.

--
Glenn Maynard
