switchover in replica set due to socket exception

Yogesh Jadhav

Sep 4, 2018, 6:05:02 PM
to mongodb-user
Hi,

My cluster runs MongoDB Community version 3.6, and I am seeing switchovers between replica set members: the primary becomes a secondary and the secondaries become primary.

I need help identifying all the socket-related possibilities that I can work through with my application team.

Kindly suggest all the scenarios that can cause socket issues.

Thanks
Yogesh J

Yogesh Jadhav

Sep 5, 2018, 3:16:28 AM
to mongodb-user
My cluster runs MongoDB Community version 3.6, and I am seeing switchovers between replica set members: the primary becomes a secondary and the secondaries become primary.

I need help with a socket-related issue. The logs are shown below.

Logs from the primary that became a secondary:

2018-09-02T17:28:33.593+0530 I NETWORK  [ReplicaSetMonitor-TaskExecutor-0] Socket recv() timeout  x.x.x.204:27018
2018-09-02T17:28:33.601+0530 I NETWORK  [ReplicaSetMonitor-TaskExecutor-0] SocketException: remote: (NONE):0 error: SocketException socket exception [RECV_TIMEOUT] server [x.x.x.204:27018]
2018-09-02T17:28:33.603+0530 I NETWORK  [ReplicaSetMonitor-TaskExecutor-0] Detected bad connection created at 1534748836340312 microSec, clearing pool for mongodb9.co.in:27018 of 0 connections
2018-09-02T17:28:33.603+0530 I NETWORK  [ReplicaSetMonitor-TaskExecutor-0] Dropping all pooled connections to mongodb9.co.in:27018(with timeout of 5 seconds)
2018-09-02T17:28:33.603+0530 I NETWORK  [ReplicaSetMonitor-TaskExecutor-0] Ending connection to host Hostname9.co.in:27018(with timeout of 5 seconds) due to bad connection status; 0 connections to that host remain open
2018-09-02T17:28:33.605+0530 I NETWORK  [ReplicaSetMonitor-TaskExecutor-0] Marking host HOSTNAME9:27018 as failed :: caused by :: HostUnreachable: network error while attempting to run command 'ismaster' on host Hostname9.co.in:27018'

At the same time, take a look at the output below:


(mtools) [tushar@NDL4176 mongologs_20180902]$ mloginfo mongodb9.log --rsstate
     source: mongodb9
       host: unknown
      start: 2018 Sep 02 17:28:02.354
        end: 2018 Sep 02 17:28:59.135
date format: iso8601-local
     length: 714
     binary: unknown
    version: >= 3.0 (iso8601 format, level, component)
    storage: unknown

RSSTATE
date               host                              state/message
()
Sep 02 17:28:36    mongodb6.co.in:27018      RS_DOWN
Sep 02 17:28:36    lomongodb3.co.in:27018    RS_DOWN
Sep 02 17:28:36    lomongodb3.co.in:27018    SECONDARY
Sep 02 17:28:36    mongodb6.co.in:27018      PRIMARY

(mtools) [tushar@NDL4176 mongologs_20180902]$ mloginfo mongodb6 --rsstate
     source: mongodb6
       host: unknown
      start: 2018 Sep 02 17:00:03.071
        end: 2018 Sep 02 17:59:52.524
date format: iso8601-local
     length: 18154
     binary: unknown
    version: >= 3.0 (iso8601 format, level, component)
    storage: unknown

RSSTATE
date               host                            state/message
()
Sep 02 17:28:26    mongodb9.co.in:27018    RS_DOWN
Sep 02 17:28:37    mongodb9.co.in:27018    SECONDARY



In the logs above, I observed that the three members of the replica set were not able to reach or ping each other. Is this a network issue,

or is there a problem on the DBA side? Kindly help me with this.

Kevin Adistambha

Sep 7, 2018, 2:09:46 AM
to mongodb-user

Hi Yogesh

I think the main reason for the failover is, as you have suspected, network issues:

Marking host HOSTNAME9:27018 as failed :: caused by :: HostUnreachable: network error while attempting to run command 'ismaster' on host Hostname9.co.in:27018'

This log line says that the node could not ping one of the replica set members, and has therefore marked it as offline.
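
If you want to see how often this happens and against which members, the failure lines can be pulled out of the mongod log with a small script. Here is a minimal sketch in Python, assuming the log line layout shown in your excerpt (timestamp, severity, component, [context], message). The function name `find_failed_hosts` is my own and is not part of mtools or MongoDB.

```python
import re

# Layout assumed from the excerpt in this thread:
# <timestamp> <severity> <component>  [<context>] <message>
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<sev>[A-Z])\s+(?P<component>\S+)\s+"
    r"\[(?P<context>[^\]]+)\]\s+(?P<msg>.*)$"
)

# Matches the "Marking host ... as failed" message quoted above.
FAILED_RE = re.compile(
    r"Marking host (?P<host>\S+) as failed :: caused by :: (?P<reason>\w+)"
)

# Sample line copied from the log excerpt in this thread.
SAMPLE = (
    "2018-09-02T17:28:33.605+0530 I NETWORK  [ReplicaSetMonitor-TaskExecutor-0] "
    "Marking host HOSTNAME9:27018 as failed :: caused by :: HostUnreachable: "
    "network error while attempting to run command 'ismaster' "
    "on host Hostname9.co.in:27018'"
)


def find_failed_hosts(lines):
    """Return (timestamp, host, reason) for every 'marked as failed' line."""
    events = []
    for line in lines:
        parsed = LINE_RE.match(line.strip())
        if not parsed:
            continue  # skip lines that do not follow the assumed layout
        failed = FAILED_RE.search(parsed.group("msg"))
        if failed:
            events.append(
                (parsed.group("ts"), failed.group("host"), failed.group("reason"))
            )
    return events
```

Passing it the lines of each member's log file (for example `find_failed_hosts(open(path))`) gives a list of failure events that you can line up by timestamp across members.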

If that host was actually online at that time, one likely possibility is that you're experiencing a network partition. Replica sets were designed with this problem in mind, and will automatically fail over to a new primary if required to maintain availability. Note that this is by design, as described in the Replication documentation, in particular the section on Automatic Failover.
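
One rough way to check connectivity between members is to attempt a plain TCP connection to each member's port from each of the other hosts. The following is only a sketch using the Python standard library. The host names are placeholders taken from your anonymized output, and the 5 second timeout is an arbitrary choice of mine. A successful connect only shows that the port was reachable at that moment, so it will not catch intermittent drops unless it is run repeatedly.

```python
import socket
import time

# Placeholder member addresses based on the anonymized names in this thread.
MEMBERS = [
    ("mongodb9.co.in", 27018),
    ("mongodb6.co.in", 27018),
    ("lomongodb3.co.in", 27018),
]


def probe(host, port, timeout=5.0):
    """Attempt one TCP connect.

    Returns (True, elapsed_seconds) on success,
    or (False, error_text) if the connect fails or times out.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError as exc:  # covers timeouts, refused connections, DNS errors
        return False, str(exc)


def probe_all(members, timeout=5.0):
    """Probe every (host, port) pair and return a dict of results."""
    return {
        f"{host}:{port}": probe(host, port, timeout) for host, port in members
    }
```

Running `probe_all(MEMBERS)` from each of the three hosts around the time of a failover, and comparing the results, can help show whether one link or one host is the common factor.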

To take advantage of this automatic failover, your driver needs to connect to the deployment using a Connection String URI that describes the replica set. See the replica set option in the Connection String URI documentation for examples.
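
As an illustration, a replica set connection string lists the members as seeds and names the set with the `replicaSet` option. The sketch below builds such a URI in Python. The host names are placeholders based on your anonymized output, and the set name `rs0` is hypothetical, since your actual replica set name does not appear in this thread. The helper `replica_set_uri` is my own, not a driver function.

```python
from urllib.parse import urlencode


def replica_set_uri(hosts, replica_set, **options):
    """Build a mongodb:// URI that lists seed hosts and names the replica set.

    hosts        -- list of "host:port" strings used as the seed list
    replica_set  -- replica set name (must match the name the set was initiated with)
    options      -- any further URI options, passed through unchanged
    """
    params = {"replicaSet": replica_set, **options}
    return "mongodb://" + ",".join(hosts) + "/?" + urlencode(params)


# Placeholder hosts and a hypothetical set name, for illustration only.
URI = replica_set_uri(
    ["mongodb9.co.in:27018", "mongodb6.co.in:27018", "lomongodb3.co.in:27018"],
    "rs0",
)
# URI is now:
# mongodb://mongodb9.co.in:27018,mongodb6.co.in:27018,lomongodb3.co.in:27018/?replicaSet=rs0
```

The resulting string is what you would hand to your driver's client constructor (for example `MongoClient(URI)` in PyMongo). With the set named in the URI, the driver monitors the members and routes writes to whichever member is currently primary.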

Best regards,
Kevin
