Hi, I'm having an issue with my replica set. I have 3 servers (1 primary and 2 secondaries), and one of the secondaries cannot be restored.
I tried removing its data directory and restarting it. After one day it is still in RECOVERING state, with the following messages in the logs:
Thu Feb 14 09:28:55 [rsBackgroundSync] replSet not trying to sync from e-00000XXX.cloud.com:27017, it is vetoed for 195 more seconds
Thu Feb 14 09:28:55 [rsBackgroundSync] replSet not trying to sync from e-00000ZZZ.cloud.com:27017, it is vetoed for 0 more seconds
rs.status() output:
IndexShard1:RECOVERING> rs.status()
{
"set" : "IndexShard1",
"date" : ISODate("2013-02-14T13:33:24Z"),
"myState" : 3,
"members" : [
{
"_id" : 1,
"name" : "e-00000XXX.cloud.com:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 85903,
"optime" : Timestamp(1360848802000, 115),
"optimeDate" : ISODate("2013-02-14T13:33:22Z"),
"lastHeartbeat" : ISODate("2013-02-14T13:33:23Z"),
"pingMs" : 1
},
{
"_id" : 2,
"name" : "e-00000ZZZ.cloud.com:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 85903,
"optime" : Timestamp(1360848802000, 92),
"optimeDate" : ISODate("2013-02-14T13:33:22Z"),
"lastHeartbeat" : ISODate("2013-02-14T13:33:24Z"),
"pingMs" : 1
},
{
"_id" : 3,
"name" : "e-00000YYY.cloud.com:27017",
"health" : 1,
"state" : 3,
"stateStr" : "RECOVERING",
"uptime" : 86030,
"optime" : Timestamp(1360762916000, 93),
"optimeDate" : ISODate("2013-02-13T13:41:56Z"),
"errmsg" : "error RS102 too stale to catch up",
"self" : true
}
],
"ok" : 1
}
IndexShard1:RECOVERING> db.printReplicationInfo()
configured oplog size: 9393.956640625MB
log length start to end: 0secs (0hrs)
oplog first event time: Wed Feb 13 2013 09:41:56 GMT-0400
oplog last event time: Wed Feb 13 2013 09:41:56 GMT-0400
now: Thu Feb 14 2013 09:34:46 GMT-0400
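For reference, here is a rough calculation of how far this member is behind the primary, using the two optime values copied from the rs.status() output above. It is only a sketch in plain JavaScript (run with Node, not the mongo shell); lagSeconds is a helper name made up for illustration and is not part of any MongoDB API.

```javascript
// Optime values in milliseconds, copied from the rs.status() output above.
const primaryOptimeMs = 1360848802000; // e-00000XXX (PRIMARY)
const staleOptimeMs = 1360762916000;   // e-00000YYY (RECOVERING, self)

// Hypothetical helper (illustration only): gap between two optimes, in seconds.
function lagSeconds(aheadMs, behindMs) {
  return (aheadMs - behindMs) / 1000;
}

const lag = lagSeconds(primaryOptimeMs, staleOptimeMs);

// Prints: 85886 seconds, about 23.9 hours
console.log(lag + " seconds, about " + (lag / 3600).toFixed(1) + " hours");
```

So this member's last applied operation is almost a full day older than the primary's.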
Any ideas? Should I increase the oplog size and resync this member?
Thanks
Matias