Unclear lastHeartbeat from not reachable/healthy member

Astro

Apr 22, 2016, 1:34:06 AM
to mongodb-user
I have a replica set of 3 members running on a single machine, mongod version 3.0.8. One of the members went down, and I need to find out since when it has been down.
The lastHeartbeat and lastHeartbeatRecv fields seem helpful. But I would expect lastHeartbeat for the down member to be as old as the time when the member went down.

From the docs:
The lastHeartbeat value provides an ISODate formatted date and time of the transmission time of last heartbeat received from this member.

{
"_id" : 1,
"name" : "ip-1-2-3-4:27018",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 155956,
"optime" : Timestamp(1461301526, 10),
"optimeDate" : ISODate("2016-04-22T05:05:26Z"),
"lastHeartbeat" : ISODate("2016-04-22T05:05:26.508Z"),
"lastHeartbeatRecv" : ISODate("2016-04-22T05:05:26.508Z"),
"pingMs" : 0,
"electionTime" : Timestamp(1460972678, 1),
"electionDate" : ISODate("2016-04-18T09:44:38Z"),
"configVersion" : 4
},
{
"_id" : 2,
"name" : "ip-1-2-3-4:27019",
"health" : 0,
"state" : 8,
"stateStr" : "(not reachable/healthy)",
"uptime" : 0,
"optime" : Timestamp(0, 0),
"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
"lastHeartbeat" : ISODate("2016-04-22T05:05:25.812Z"),
"lastHeartbeatRecv" : ISODate("2016-04-21T12:45:54.178Z"),
"pingMs" : 0,
"lastHeartbeatMessage" : "Failed attempt to connect to ip-1-2-3-4:27019; couldn't connect to server ip-1-2-3-4:27019 (1.2.3.4), connection attempt failed",
"configVersion" : -1
}

Any help on this?

Kevin Adistambha

May 5, 2016, 3:15:02 AM
to mongodb-user

Hi,

I have a replica set of 3 members running on a single machine, mongod version 3.0.8. One of the members went down, and I need to find out since when it has been down.
The lastHeartbeat and lastHeartbeatRecv fields seem helpful. But I would expect lastHeartbeat for the down member to be as old as the time when the member went down.

The documentation is incorrect. Assume we have two nodes, A and B, and we are logged into node A. In A's output, the lastHeartbeat entry for B records the time when node A last sent a heartbeat to node B, and lastHeartbeatRecv records the time when node B last sent a heartbeat to node A. Node A keeps sending heartbeats to B even while B is unreachable, which is why lastHeartbeat stays recent. Therefore, if B is down, the value in lastHeartbeatRecv shows the last time A heard from B, which is the closest indication of when B went down.
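
As a rough illustration of the above, the sketch below picks out the members that the current node reports with health 0 and returns lastHeartbeatRecv for each of them. The function name unreachableMembers and the lastHeardFrom field are made up for this example, and the sample object only mimics the two members from the output in this thread, with plain Date values in place of ISODate so that it also runs outside the mongo shell.

```javascript
// Sketch: list the members this node currently sees as unhealthy, together
// with the last time it received a heartbeat from each of them.
// `status` is assumed to have the same shape as the rs.status() output.
function unreachableMembers(status) {
  return status.members
    .filter(function (m) { return m.health === 0; })
    .map(function (m) {
      return { name: m.name, lastHeardFrom: m.lastHeartbeatRecv };
    });
}

// Sample data modelled on the two members shown in this thread.
var sample = {
  members: [
    { name: "ip-1-2-3-4:27018", health: 1,
      lastHeartbeatRecv: new Date("2016-04-22T05:05:26.508Z") },
    { name: "ip-1-2-3-4:27019", health: 0,
      lastHeartbeatRecv: new Date("2016-04-21T12:45:54.178Z") }
  ]
};

var down = unreachableMembers(sample);
```

In the mongo shell, the same function could presumably be called as unreachableMembers(rs.status()) on a healthy member. The result reflects only that member's view of the set, so it is an approximation of when the other node went down, not an exact timestamp.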

Thank you for bringing this to our attention. I have created DOCS-7810 to track this issue.

Best regards,
Kevin
