Fault Tolerance and Availability?


Wes Gibbs

Nov 28, 2017, 5:39:02 PM
to mongodb-user
For a 3-member replica set, the fault tolerance is 1. If two members go offline indefinitely but the last node, "the primary", remains online indefinitely, is the primary still healthy for reads/writes indefinitely? Besides the replica set being considered unhealthy, are there any other problems with two nodes being offline for a week or two?

Kevin Adistambha

Dec 4, 2017, 11:12:34 PM
to mongodb-user

Hi,

If two members come offline indefinitely but the last node “the primary” remains online indefinitely, is the primary still healthy for reads/writes indefinitely

In a 3-member replica set, if 2 members are offline, the remaining node will be a secondary. If it was the primary before, it will step down once it can no longer contact a majority of the set. At that point the set is read-only, and applications must be configured to read from a secondary.
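As a rough sketch of the "configured to read from a secondary" part: the standard connection string accepts a `readPreference` option, and `primaryPreferred` lets reads fall back to a secondary when no primary is available. The helper name `build_replica_set_uri`, the host names, and the set name `rs0` below are made up for illustration; only the `replicaSet` and `readPreference` options come from the connection string format itself.

```python
from urllib.parse import urlencode


def build_replica_set_uri(hosts, replica_set, read_preference="primaryPreferred"):
    """Build a MongoDB connection string whose reads can fall back to a secondary.

    hosts and replica_set are supplied by the caller; the values used in the
    example below are placeholders, not taken from any real deployment.
    """
    options = urlencode({"replicaSet": replica_set, "readPreference": read_preference})
    return "mongodb://" + ",".join(hosts) + "/?" + options


# Hypothetical hosts and set name, for illustration only.
uri = build_replica_set_uri(
    ["db1.example.net:27017", "db2.example.net:27017", "db3.example.net:27017"],
    "rs0",
)
print(uri)
```

With a read preference like this, reads can keep working against the surviving secondary, but writes will still fail until a primary can be elected again.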

This is by design, since the remaining node may be unable to see the two other nodes because of a network partition. If the two other members are not actually offline but are simply unreachable, the set could potentially have two primaries, and the data on the two primaries could diverge to the point where it cannot be reconciled (i.e. a “split-brain” situation). This is the main reason why a member that cannot see the majority of the set will step down.
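The majority rule can be sketched with a little arithmetic. This is a simplified model, not the server's election code: it assumes every member has one vote and ignores priorities, arbiters, and timeouts. All function names are invented for the example.

```python
def majority(voting_members):
    """Smallest number of votes that is more than half of the set."""
    return voting_members // 2 + 1


def fault_tolerance(voting_members):
    """How many members can be lost while a majority is still reachable."""
    return voting_members - majority(voting_members)


def can_hold_primary(reachable_members, voting_members):
    """True if a node that can see reachable_members (itself included)
    still sees a majority of the whole set."""
    return reachable_members >= majority(voting_members)


def sides_able_to_elect(partition_sizes):
    """Given the sizes of the groups a partition splits the set into,
    return the sizes of the groups that could still have a primary."""
    total = sum(partition_sizes)
    return [size for size in partition_sizes if can_hold_primary(size, total)]


print(majority(3), fault_tolerance(3))   # 2 1
print(can_hold_primary(1, 3))            # False: a lone node must step down
print(sides_able_to_elect([1, 2]))       # [2]: only the two-node side qualifies
```

Because two disjoint groups can never both hold more than half of the votes, at most one side of a partition can have a primary, which is what rules out split-brain.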

The page Replica Set Elections has the relevant details that may be of interest.

On the other hand, if only one member is offline for an extended period of time, it is possible that the member will “fall off” the oplog, i.e. its data is so far behind the rest of the set that it can no longer catch up. In this situation, an initial sync is necessary to bring the affected member back into the set.
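To make "falling off the oplog" concrete, here is a minimal model. It assumes plain wall-clock timestamps and made-up numbers; the real oplog is a capped collection with its own timestamp type, and the actual window on a member can be checked in the mongo shell with `rs.printReplicationInfo()`. The function names are hypothetical.

```python
from datetime import datetime, timedelta


def oplog_window(oldest_entry, newest_entry):
    """Time span currently covered by the sync source's oplog."""
    return newest_entry - oldest_entry


def can_catch_up(member_last_applied, oldest_entry_on_source):
    """A lagging member can resume replication only if the last operation it
    applied is still present in the source's oplog; otherwise it needs an
    initial sync."""
    return member_last_applied >= oldest_entry_on_source


# Made-up numbers: the source keeps roughly 48 hours of operations.
newest = datetime(2017, 12, 4, 12, 0)
oldest = newest - timedelta(hours=48)

offline_one_day = newest - timedelta(days=1)
offline_one_week = newest - timedelta(days=7)

print(oplog_window(oldest, newest))            # 2 days, 0:00:00
print(can_catch_up(offline_one_day, oldest))   # True
print(can_catch_up(offline_one_week, oldest))  # False: initial sync needed
```

So whether a member that has been down for a week or two can rejoin on its own depends on how much write activity the oplog can hold, not on the calendar time alone.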

Best regards,
Kevin
