I've been experimenting with network partitions in a multi-site
replication scenario (much like the one described at
http://www.mongodb.org/display/DOCS/Data+Center+Awareness)
before putting my first Mongo deployment into production, and I've run
into some undesirable behaviour under one particular kind of network
partition.
Being new at all this, I'd value some community input - after all,
this is my first foray into Mongo replication, so I could be going
about this entirely the wrong way :)
Three hosts (all running v1.6.5) in a replica set:
config = {_id: 'test1', members: [
{_id: 0, host: 'sf1'},
{_id: 1, host: 'ny1'},
{_id: 2, host: 'uk1'}]
}
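For reference, the set was brought up with nothing fancier than the
standard shell calls, roughly as follows (run in the mongo shell on
sf1, using the config object above):
// initiate the three-member set, then check that sf1 comes up
// PRIMARY and ny1/uk1 come up SECONDARY
rs.initiate(config)
rs.status()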
sf1 is master.
A 'routing issue' occurs ( root@uk1:~# route add -host sf1 reject ),
such that:
sf1 can talk to ny1.
ny1 can talk to uk1.
sf1 cannot talk to uk1.
sf1 notices uk1 has gone quiet, and remains a master. (it's a master,
it can see a majority, so that's reasonable)
uk1 votes for itself. (it can see a majority, but no master, so
that's also reasonable)
ny1 votes for uk1. (that's probably less sensible, given that it can
already see a master)
ny1 then bemoans the fact that there are two primaries.
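While the set is in this state, a quick way to check who each node
believes is primary is to ask each member's own shell, with something
like the snippet below (the results line up with the logs that follow):
// run in the local mongo shell on each of sf1, ny1 and uk1
rs.status().members.forEach(function (m) {
    print(m.name + " -> " + m.stateStr);   // e.g. PRIMARY or SECONDARY
});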
Log entries:
sf1:
Thu Feb 10 17:29:39 [conn2] end connection uk1:35740
Thu Feb 10 17:29:57 [ReplSetHealthPollTask] replSet info uk1 is now
down (or slow to respond)
ny1:
Thu Feb 10 17:29:37 [conn4] replSet info voting yea for 2
Thu Feb 10 17:29:39 [ReplSetHealthPollTask] replSet uk1 PRIMARY
Thu Feb 10 17:29:39 [rs Manager] replSet warning DIAG two primaries
(transiently)
Thu Feb 10 17:29:45 [rs Manager] replSet warning DIAG two primaries
(transiently)
Thu Feb 10 17:29:51 [rs Manager] replSet warning DIAG two primaries
(transiently)
(etc.; the situation doesn't resolve until uk1 is un-partitioned again)
uk1:
Thu Feb 10 17:29:37 [ReplSetHealthPollTask] replSet info sf1 is now
down (or slow to respond)
Thu Feb 10 17:29:37 [rs Manager] replSet info electSelf 2
Thu Feb 10 17:29:37 [rs Manager] replSet PRIMARY
The impact of this is probably mitigated somewhat in the real world:
if I repeat the scenario with frequent writes going to sf1, then uk1,
when partitioned in this way, holds back, saying "[rs Manager] replSet
info not electing self, we are not freshest".
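The "frequent writes" are nothing clever - just a trivial loop along
these lines run against sf1 (the database and collection names here
are throwaway ones I made up for the test):
// in the mongo shell connected to sf1, while the partition is in place
for (var i = 0; i < 100000; i++) {
    db.getSiblingDB("parttest").ping.insert({i: i, ts: new Date()});
    sleep(100);   // ~10 writes/sec keeps sf1 ahead, so uk1 sees itself as not freshest
}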
So I guess my question is: is this a reasonable topology, and should I
be logging this behaviour as a bug?