Hey,
We are running a small MongoDB 2.2.0 cluster on AWS (2 shards, each with
2 nodes and 1 arbiter). Each of our app servers runs its own mongos
process. The 3 config server processes run on the nodes.
Recently, our mongos instances have randomly been losing connectivity to
the entire cluster (see attached logs). It appears mongos cannot connect to
any machine in the cluster, even though DNS still resolves and the machine
is still able to receive HTTP requests. After about 30 minutes of this,
connectivity is restored.
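For anyone who wants to check the same thing on their side, a probe along
these lines could be run from an affected app server during an outage to
separate DNS resolution from plain TCP reachability. This is only a rough
sketch: the hostnames and ports in MEMBERS are placeholders (not our real
ones), and probe() is just an illustrative helper name.

```python
import socket

# Placeholder cluster members; substitute the real host:port pairs.
# These names and ports are assumptions for illustration only.
MEMBERS = [
    ("shard1-a.example.internal", 27017),
    ("shard1-b.example.internal", 27017),
    ("shard2-a.example.internal", 27017),
    ("shard2-b.example.internal", 27017),
]


def probe(host, port, timeout=3.0):
    """Return (resolved_ip_or_None, tcp_ok) for one host:port pair."""
    try:
        ip = socket.gethostbyname(host)  # DNS step only
    except socket.gaierror:
        return None, False
    try:
        # TCP step: succeeds only if something accepts the connection
        with socket.create_connection((ip, port), timeout=timeout):
            return ip, True
    except OSError:  # refused, timed out, unreachable, etc.
        return ip, False


if __name__ == "__main__":
    for host, port in MEMBERS:
        ip, ok = probe(host, port)
        status = "tcp ok" if ok else "tcp FAILED"
        print(f"{host}:{port} dns={ip or 'FAILED'} {status}")
```

If DNS succeeds but the TCP step fails for every member at once, that
would point at the network path from the app server rather than at any
single mongod or config server.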
The logs show, after a clean run of Balancer messages, a spike in errors
from ReplicaSetMonitorWatcher, CheckConfigServers, LockPinger and Balancer.
Has anyone encountered this before?
Thanks in advance,
Adam