Hi,
I am running a fairly usual, "by the book" 3-node replica set configuration on 3 different nodes.
It runs well; now I am testing some failure scenarios.
I noticed that if I stop all 3 mongod processes and then start mongod only on the former primary,
it prints many errors like this:
2016-08-17T13:39:46.992+0200 W NETWORK [ReplicationExecutor] Failed to connect to 10.139.123.7:27017, reason: errno:111 Connection refused
2016-08-17T13:39:46.993+0200 W NETWORK [ReplicationExecutor] Failed to connect to 10.139.123.5:27017, reason: errno:111 Connection refused
2016-08-17T13:39:46.993+0200 I REPL [ReplicationExecutor] transition to STARTUP2
2016-08-17T13:39:46.993+0200 I REPL [ReplicationExecutor] Starting replication applier threads
2016-08-17T13:39:46.994+0200 I REPL [ReplicationExecutor] transition to RECOVERING
2016-08-17T13:39:46.996+0200 I REPL [ReplicationExecutor] Error in heartbeat request to 10.139.123.5:27017; HostUnreachable: Connection refused
2016-08-17T13:39:46.997+0200 I REPL [ReplicationExecutor] transition to SECONDARY
2016-08-17T13:40:19.126+0200 I REPL [ReplicationExecutor] Not starting an election, since we are not electable

I would expect that the primary would at least start as PRIMARY, but no, it just remains SECONDARY.
Is this behavior OK?
Thinking it through, I believe this is OK per the election logic: the mongod process sees it is alone, the other 2 members are not reachable, so there is no majority, no election, and no PRIMARY.
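For reference, this is easy to confirm from the mongo shell on the surviving node (just how I checked it; nothing here is specific to my setup):

// the lone surviving member reports itself as SECONDARY
rs.status().myState          // 2 == SECONDARY
// 3 members are configured, so an election needs a majority of 2 votes
rs.status().members.length   // 3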
I am wondering what I should do if my other 2 nodes go down for a longer time.
Is the best solution in such a scenario to use a force reconfigure and configure just the one surviving node into a new replica set? (See the sketch after the quote below.)
Here is a copy/paste of another thread:
"You can force reconfigure the replica set to drop Server
2/secondary and Server 2/arbiter from the rs.config(). Essentially
creating a 1 node replica set.
As per the documentation:
"Use this procedure only to recover from catastrophic interruptions."
"
Thanks!