Observed [rsHealthPoll] replSet info <fqdn>:27018 is down (or slow to respond)

Gajanan Chandgadkar

Jul 18, 2013, 2:02:10 AM
to mongod...@googlegroups.com
Hi,

We have a production setup with a 3-member replica set running MongoDB 2.4.4, hosted on AWS EC2.

Today we had an issue with one of the three replica set servers; the AWS console showed "Instance reachability check failed". So we did the following:

1. Stopped and started the instance from the AWS console.
2. Once the server was up, reassigned the EIP.
3. MongoDB started successfully on this instance, but its status is STARTUP2 and the MongoDB logs show:

Thu Jul 18 05:15:13.155 [rsHealthPoll] replset info <FQDN>:27018 thinks that we are down

Thu Jul 18 05:15:14.535 [rsHealthPoll] replset info <FQDN>:27018 thinks that we are down


Also, connecting to the mongo shell on that instance takes about 30 seconds.


On the other two replica set members (primary and secondary), the MongoDB logs show:


Thu Jul 18 05:14:34.306 [rsHealthPoll] DBClientCursor::init call() failed

Thu Jul 18 05:14:34.306 [rsHealthPoll] replSet info <FQDN>:27018 is down (or slow to respond):

Thu Jul 18 05:14:34.306 [rsHealthPoll] replSet member <FQDN>:27018 is now in state DOWN


I am not sure why we are seeing this. Any advice on how to fix it?

Please let me know.




William Zola

Jul 19, 2013, 2:03:00 AM
to mongod...@googlegroups.com
The most likely cause of your problem is network connectivity. Please check connectivity between all replica set members using the procedure described here: http://docs.mongodb.org/manual/tutorial/troubleshoot-replica-sets/#test-connections-between-all-members

If you find a name resolution or DNS connectivity problem, work with your networking folks to resolve it.
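
As a rough illustration of that kind of check (not the procedure from the linked page, which uses the mongo shell), here is a minimal Python sketch that times name resolution plus a TCP connect to each member. The host names in the list are hypothetical placeholders, and the function name check_member is made up for this example; port 27018 is taken from the logs above. Run it from every member against every other member.

```python
import socket
import time


def check_member(host, port, timeout=5.0):
    """Resolve host and open a TCP connection to host:port.

    Returns (ok, seconds, detail). The timeout applies to the connect
    attempt; a slow DNS lookup can still make the call take longer.
    """
    start = time.monotonic()
    try:
        # create_connection performs the name lookup and then connects.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, "connected"
    except OSError as exc:
        # Covers DNS failures, refused connections, and timeouts.
        return False, time.monotonic() - start, str(exc)


if __name__ == "__main__":
    # Hypothetical member names; replace with the FQDNs from rs.conf().
    members = ["member1.example.net", "member2.example.net", "member3.example.net"]
    for host in members:
        ok, secs, detail = check_member(host, 27018)
        status = "OK" if ok else "FAILED"
        print(f"{host}:27018 {status} after {secs:.2f}s ({detail})")
```

A member that fails in only one direction, or that connects but takes many seconds, would point toward the kind of name resolution or routing problem described above.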

Cheers!

 -William 