I have a 3-node replica set (X.X.X.X1, X.X.X.X2 and X.X.X.X3). When the primary goes down, the other nodes do not become primary. We observed that whenever a primary timeout occurs, no other node takes over; every time, the same node (X.X.X.X2) becomes primary again.
> Current replica set status:
> X.X.X.X1: Secondary
> X.X.X.X2: Primary
> X.X.X.X3: Secondary
> We got the following errors:
> {noformat}
> Error in heartbeat request to X.X.X.X3:27017; HostUnreachable: Connection timed out
> 2016-07-26T23:05:27.607-0400 I REPL [ReplicationExecutor] Error in heartbeat request to X.X.X.X1:27017; ExceededTimeLimit: Couldn't get a connection within the time limit
> 2016-07-26T23:04:16.962-0400 I REPL [ReplicationExecutor] Error in heartbeat request to X.X.X.X3:27017; HostUnreachable: Connection refused
> 2016-07-26T23:04:16.962-0400 I REPL [ReplicationExecutor] Error in heartbeat request to X.X.X.X3:27017; HostUnreachable: Connection refused
> 2016-07-26T23:04:16.963-0400 I REPL [ReplicationExecutor] Error in heartbeat request to 10.220.20.X3:27017; HostUnreachable: Connection refused
> 2016-07-26T23:04:27.195-0400 I REPL [ReplicationExecutor] Member X.X.X.X3:27017 is now in state SECONDARY
> 2016-07-26T23:04:28.897-0400 I REPL [ReplicationExecutor] Error in heartbeat request to X.X.X.X1:27017; ExceededTimeLimit: Couldn't get a connection within the time limit
> 2016-07-26T23:04:29.264-0400 I REPL [ReplicationExecutor] Starting an election, since we've seen no PRIMARY in the past 10000ms
> 2016-07-26T23:04:29.264-0400 I REPL [ReplicationExecutor] conducting a dry run election to see if we could be elected
> 2016-07-26T23:04:29.266-0400 I REPL [ReplicationExecutor] dry election run succeeded, running for election
> 2016-07-26T23:04:29.267-0400 I REPL [ReplicationExecutor] election succeeded, assuming primary role in term 188
> 2016-07-26T23:04:29.267-0400 I REPL [ReplicationExecutor] transition to PRIMARY
> 2016-07-26T23:04:29.768-0400 I COMMAND [conn51] command staging.DataJobs command: find { find: "DataJobs", filter: { Flag: { $exists: false }, __$operation: 1 }, sort: { JobDateCreated: 1 }, projection: { _id: 0 }, limit: 500 } planSummary: IXSCAN { __$operation: 1.0 } keysExamined:89334 docsExamined:89334 hasSortStage:1 cursorExhausted:1 keyUpdates:0 writeConflicts:0 numYields:756 nreturned:0 reslen:122 locks:{ Global: { acquireCount: { r: 1514 } }, Database: { acquireCount: { r: 757 } }, Collection: { acquireCount: { r: 757 } } } protocol:op_query 4945ms
> {noformat}
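To see at a glance which members the heartbeats are failing against, the log lines above can be grouped by host and error. The following is only a rough sketch using the Python standard library; the names `HEARTBEAT_ERROR` and `summarize_heartbeat_errors` are made up for illustration, and the regular expression assumes the log line layout shown in the excerpt above.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt above:
#   "... Error in heartbeat request to <host>; <ErrorCode>: <reason>"
HEARTBEAT_ERROR = re.compile(
    r"Error in heartbeat request to (?P<host>[^;\s]+); "
    r"(?P<code>\w+): (?P<reason>.*)"
)


def summarize_heartbeat_errors(lines):
    """Count heartbeat failures per (host, error code, reason)."""
    counts = Counter()
    for line in lines:
        match = HEARTBEAT_ERROR.search(line)
        if match:
            key = (match["host"], match["code"], match["reason"].strip())
            counts[key] += 1
    return counts


# A few lines copied from the log excerpt above.
sample_log = [
    "2016-07-26T23:05:27.607-0400 I REPL [ReplicationExecutor] Error in heartbeat request to X.X.X.X1:27017; ExceededTimeLimit: Couldn't get a connection within the time limit",
    "2016-07-26T23:04:16.962-0400 I REPL [ReplicationExecutor] Error in heartbeat request to X.X.X.X3:27017; HostUnreachable: Connection refused",
    "2016-07-26T23:04:16.962-0400 I REPL [ReplicationExecutor] Error in heartbeat request to X.X.X.X3:27017; HostUnreachable: Connection refused",
    "2016-07-26T23:04:27.195-0400 I REPL [ReplicationExecutor] Member X.X.X.X3:27017 is now in state SECONDARY",
]

if __name__ == "__main__":
    for (host, code, reason), count in summarize_heartbeat_errors(sample_log).items():
        print(f"{host}  {code}  {reason}  x{count}")
```

Running this over the full mongod log of each member should show whether the failures are concentrated on one host (pointing at that node or its network path) or spread across all members.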
> Replica set configuration:
> {noformat}
> {
> "_id" : "rs1",
> "version" : 3,
> "protocolVersion" : NumberLong(1),
> "members" : [
> {
> "_id" : 0,
> "host" : "X.X.X.X1.:27017",
> "arbiterOnly" : false,
> "buildIndexes" : true,
> "hidden" : false,
> "priority" : 1,
> "tags" : {
>
> },
> "slaveDelay" : NumberLong(0),
> "votes" : 1
> },
> {
> "_id" : 1,
> "host" : "X.X.X.X2:27017",
> "arbiterOnly" : false,
> "buildIndexes" : true,
> "hidden" : false,
> "priority" : 1,
> "tags" : {
>
> },
> "slaveDelay" : NumberLong(0),
> "votes" : 1
> },
> {
> "_id" : 2,
> "host" : "X.X.X.X3:27017",
> "arbiterOnly" : false,
> "buildIndexes" : true,
> "hidden" : false,
> "priority" : 1,
> "tags" : {
>
> },
> "slaveDelay" : NumberLong(0),
> "votes" : 1
> }
> ],
> "settings" : {
> "chainingAllowed" : true,
> "heartbeatIntervalMillis" : 2000,
> "heartbeatTimeoutSecs" : 10,
> "electionTimeoutMillis" : 10000,
> "getLastErrorModes" : {
>
> },
> "getLastErrorDefaults" : {
> "w" : 1,
> "wtimeout" : 0
> },
> "replicaSetId" : ObjectId("xxxxxxxxxx")
> }
> }