ReplicaSet: node status does not change on recover


Christoph Preissner

Oct 13, 2010, 8:27:19 AM
to mongodb-user
Hi,

We are still running load tests. We have a replica set with 3 nodes,
and the primary node received so much data that the two other nodes
became stale ("replSet error RS102 too stale to catch up").

In the mongo shell, the status of the stale nodes still stays at
"state 2" (secondary, according to this documentation page:
http://www.mongodb.org/display/DOCS/Replica+Set+Commands).

We restarted the third node, and it went to state 3, as expected.

rs.status()
{
    "set" : "testRS",
    "date" : "Wed Oct 13 2010 13:44:41 GMT+0200 (CEST)",
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "debian1:27017",
            "health" : 1,
            "state" : 1,
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "10.10.40.112:27017",
            "health" : 1,
            "state" : 2,
            "uptime" : 13516,
            "lastHeartbeat" : "Wed Oct 13 2010 13:44:40 GMT+0200 (CEST)",
            "errmsg" : "error RS102 too stale to catch up"
        },
        {
            "_id" : 2,
            "name" : "10.10.40.113:27017",
            "health" : 1,
            "state" : 3,
            "uptime" : 5856,
            "lastHeartbeat" : "Wed Oct 13 2010 13:44:40 GMT+0200 (CEST)",
            "errmsg" : "error RS102 too stale to catch up"
        }
    ],
    "ok" : 1
}

Maybe the state is only checked on startup?

Greets, Christoph

Kristina Chodorow

Oct 13, 2010, 9:04:25 AM
to mongod...@googlegroups.com
I think state 2 is what you're going for (that's secondary). State 3 is recovering; you might want to look at the web console (10.10.40.113:28017/_replSet), which can show you how far behind the node is and more about what's going on. State 3 should eventually transition to state 2, and rs.status() is updated on every heartbeat (hence the uptime calculations).
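
To make the state codes easier to read, here is a minimal sketch that turns an rs.status()-shaped document into one line per member. It only names the three codes mentioned in this thread (1, 2, 3); any other code is printed as-is. The function name `describeMembers` and the trimmed sample document are illustrative assumptions, not part of the shell.

```javascript
// State codes mentioned in this thread; other codes exist but are not named here.
const STATE_NAMES = { 1: "PRIMARY", 2: "SECONDARY", 3: "RECOVERING" };

// Hypothetical helper: one summary line per member of an rs.status()-shaped document.
function describeMembers(status) {
  return status.members.map((m) => {
    const stateName = STATE_NAMES[m.state] || "OTHER(" + m.state + ")";
    const note = m.errmsg ? " [" + m.errmsg + "]" : "";
    return m.name + ": " + stateName + note;
  });
}

// Sample trimmed from the rs.status() output posted earlier in the thread.
const statusForSummary = {
  members: [
    { name: "debian1:27017", state: 1 },
    { name: "10.10.40.112:27017", state: 2, errmsg: "error RS102 too stale to catch up" },
    { name: "10.10.40.113:27017", state: 3, errmsg: "error RS102 too stale to catch up" },
  ],
};

console.log(describeMembers(statusForSummary).join("\n"));
```

In the mongo shell the same function could presumably be called as describeMembers(rs.status()).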



--
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To post to this group, send email to mongod...@googlegroups.com.
To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.


Christoph Preissner

Oct 13, 2010, 9:15:39 AM
to mongodb-user
The node "10.10.40.113:27017" behaves as expected, but the second node
(10.10.40.112:27017) does not change to state 3,
although it is stale ("error RS102 too stale to catch up").
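
A minimal sketch of how this mismatch could be spotted from the status document: it lists members that still report state 2 while carrying the RS102 message. The helper name `findStaleSecondaries` is an assumption for illustration, and matching on the "RS102" text in `errmsg` is based only on the output posted above.

```javascript
// Hypothetical helper: names of members that report SECONDARY (state 2)
// but whose errmsg contains the RS102 "too stale" marker.
function findStaleSecondaries(status) {
  return status.members
    .filter((m) => m.state === 2 && /RS102/.test(m.errmsg || ""))
    .map((m) => m.name);
}

// Sample trimmed from the rs.status() output posted earlier in the thread.
const postedStatus = {
  members: [
    { name: "debian1:27017", state: 1 },
    { name: "10.10.40.112:27017", state: 2, errmsg: "error RS102 too stale to catch up" },
    { name: "10.10.40.113:27017", state: 3, errmsg: "error RS102 too stale to catch up" },
  ],
};

console.log(findStaleSecondaries(postedStatus));
```

For the posted status, only 10.10.40.112:27017 is reported, since 10.10.40.113:27017 already shows state 3.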

Kristina Chodorow

Oct 13, 2010, 9:45:56 AM
to mongod...@googlegroups.com
Oh, I see. That's probably a bug. I created a case you can watch/track: http://jira.mongodb.org/browse/SERVER-1933.


Christoph Preissner

Oct 13, 2010, 9:55:01 AM
to mongodb-user
Thank you. Maybe state 3 is wrong anyway, because the node does not
actually try to recover. But the current Java driver treats this node
as a secondary, and I do not want that :)
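
Until this is fixed, one possible application-side check is to treat a member as readable only if it is healthy, reports state 2, and carries no error message. This is just a sketch under those assumptions; it does not change how the Java driver selects hosts, and the helper name `readableSecondaries` is made up for illustration.

```javascript
// Hypothetical helper: names of members that look safe to read from,
// i.e. healthy, state 2 (secondary), and no errmsg in the status document.
function readableSecondaries(status) {
  return status.members
    .filter((m) => m.health === 1 && m.state === 2 && !m.errmsg)
    .map((m) => m.name);
}

// Sample trimmed from the rs.status() output posted earlier in the thread.
const statusForReads = {
  members: [
    { name: "debian1:27017", health: 1, state: 1 },
    { name: "10.10.40.112:27017", health: 1, state: 2, errmsg: "error RS102 too stale to catch up" },
    { name: "10.10.40.113:27017", health: 1, state: 3, errmsg: "error RS102 too stale to catch up" },
  ],
};

// With the posted status no member qualifies, so the result is an empty list.
console.log(readableSecondaries(statusForReads));
```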
