Yes, we are experiencing something similar after a compact operation on a secondary node: it now cannot catch up (optimeDate has been stuck at the same timestamp for hours) and shows a "not reachable/healthy" status. Any help figuring this out would be appreciated.
On Monday, May 27, 2013 7:50:53 PM UTC-7, Tsz Ming Wong wrote:
Hello,
I remember that some time ago, when running a long compact() on a secondary, it would show as "RECOVERING" in the primary's rs.status(). Recently, however, we are seeing this error instead:
"name": "my-secondary-node:27017",
"stateStr": "(not reachable/healthy)",
"optime": Timestamp(1369707789000,
"optimeDate": ISODate("2013-05-28T02:23:09Z"),
"lastHeartbeat": ISODate("2013-05-28T02:23:11Z"),
"errmsg": "DBClientBase::findN: transport error: my-secondary-node:27017 ns: admin.$cmd query: { replSetHeartbeat: \"Mongo\", v: 57, pv: 1, checkEmpty: false, from: \"my-primary-node:27017\", $auth: {} }"
However, when I connect to the secondary from the primary via the shell, it shows "RECOVERING":
mongo --host my-secondary-node
MongoDB shell version: 2.2.4
connecting to: my-secondary-node:27017/test
Has anyone encountered this before?
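For anyone comparing notes, here is a minimal sketch of how the member states and replication lag could be summarized from the primary's rs.status() output. The field names (name, stateStr, optimeDate, errmsg) are the ones shown in the output above. The function name summarizeMembers is hypothetical, and the sample document is an assumption: the primary's entry is invented for illustration and the errmsg is abbreviated.

```javascript
// Hypothetical helper: summarize each member of an rs.status()-style document.
// A hand-written sample is used below so the snippet runs outside the shell.
function summarizeMembers(status) {
  // Locate the primary so lag can be measured against its optimeDate
  // (stays null if no member reports PRIMARY).
  var primary = null;
  status.members.forEach(function (m) {
    if (m.stateStr === "PRIMARY") { primary = m; }
  });

  return status.members.map(function (m) {
    var lagSeconds = null;
    if (primary && primary.optimeDate && m.optimeDate) {
      // Difference between the primary's and this member's last applied op time.
      lagSeconds = (primary.optimeDate.getTime() - m.optimeDate.getTime()) / 1000;
    }
    return {
      name: m.name,
      stateStr: m.stateStr,
      lagSeconds: lagSeconds,
      errmsg: m.errmsg || null
    };
  });
}

// Sample data (assumption): the secondary's values come from the post,
// the primary's entry is made up to give the lag calculation a reference point.
var sampleStatus = {
  members: [
    {
      name: "my-primary-node:27017",
      stateStr: "PRIMARY",
      optimeDate: new Date("2013-05-28T03:23:09Z")
    },
    {
      name: "my-secondary-node:27017",
      stateStr: "(not reachable/healthy)",
      optimeDate: new Date("2013-05-28T02:23:09Z"),
      errmsg: "DBClientBase::findN: transport error"
    }
  ]
};

var summary = summarizeMembers(sampleStatus);
```

In the mongo shell on the primary, the same function could be run against live data with printjson(summarizeMembers(rs.status())). A lag that keeps growing while stateStr stays at "(not reachable/healthy)" would match the stuck optimeDate described above.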