Hey all,
Our config servers got out of sync. We brought two of them down, migrated data to the bad one, and then brought everything back up. Everything seems to be running smoothly now, except that something is wrong with two collections, in two different databases. This is out of 489 databases, each with between four and eight collections.
In both collections, a find (or a count, or a findOne) returns this error:
mongos> db.repriceBatch.findOne()
Fri Feb 28 08:42:55.195 error: {
    "$err" : "too many retries of stale version info ( ns : dealer_589.repriceBatch, received : 0|0||000000000000000000000000, wanted : 1|0||53053ed71aa735c9f8351cf8, send )",
    "code" : 13388
} at src/mongo/shell/query.js:128
getShardDistribution fails with a different error:
mongos> db.repriceBatch.getShardDistribution()
Fri Feb 28 08:39:56.002 error: {
    "$err" : "stale config on lazy receive :: caused by :: $err: \"[dealer_589.repriceBatch] shard version not ok in Client::Context: this shard contains versioned chunks for dealer_589.repriceBatch, but no version se...\" ( ns : dealer_589.repriceBatch, received : 0|0||000000000000000000000000, wanted : 1|0||53053ed71aa735c9f8351cf8, recv )",
    "code" : 9996
} at src/mongo/shell/query.js:128
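In case it helps with reading the two errors: as far as I can tell, the version strings have the form major|minor||epoch, so both messages say the same thing about versions. Here is a minimal sketch in plain JavaScript that pulls the pieces apart. The helper names parseVersion and parseStaleConfig are my own, not part of MongoDB, and the message layout is assumed from the two errors above rather than from any documented format.

```javascript
// Split a version string of the assumed form "major|minor||epoch".
function parseVersion(text) {
  var match = /^(\d+)\|(\d+)\|\|([0-9a-fA-F]{24})$/.exec(text.trim());
  if (match === null) {
    throw new Error("unrecognized version string: " + text);
  }
  var major = Number(match[1]);
  var minor = Number(match[2]);
  return {
    major: major,
    minor: minor,
    epoch: match[3],
    // All-zero version and epoch: no version was set at all.
    isUnset: major === 0 && minor === 0 && /^0+$/.test(match[3])
  };
}

// Pull the namespace and both versions out of a stale-config "$err" string.
function parseStaleConfig(message) {
  var match = /ns : ([^\s,]+), received : ([^\s,]+), wanted : ([^\s,]+)/.exec(message);
  if (match === null) {
    throw new Error("no version info found in message");
  }
  return {
    ns: match[1],
    received: parseVersion(match[2]),
    wanted: parseVersion(match[3])
  };
}

// The "$err" string from the findOne failure above.
var example = parseStaleConfig(
  "too many retries of stale version info ( ns : dealer_589.repriceBatch, " +
  "received : 0|0||000000000000000000000000, " +
  "wanted : 1|0||53053ed71aa735c9f8351cf8, send )"
);
```

For both errors this gives a received version that is entirely unset (0|0 with an all-zero epoch) and a wanted version of 1|0 with epoch 53053ed71aa735c9f8351cf8.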
It doesn't matter which mongos I connect to; the behavior is the same. I've also tried restarting all the mongos processes and running flushRouterConfig, to no avail.
I've also tried dropping the collections, but the error persists even when the collection doesn't (appear to) exist.
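One thing that might be worth checking after the drop is what the config database still records for the namespace. This is only a sketch: leftoverMetadata is a hypothetical helper of mine, and it assumes a version where config.collections is keyed by namespace in _id (with a dropped flag) and config.chunks documents carry an ns field. The configDb argument is meant to be the config database handle from a mongos.

```javascript
// Report what the config database still records for a namespace.
// Assumes config.collections uses the namespace as _id and that
// config.chunks documents have an "ns" field.
function leftoverMetadata(configDb, ns) {
  var entry = configDb.collections.findOne({ _id: ns });
  var chunkCount = configDb.chunks.count({ ns: ns });
  return {
    ns: ns,
    hasCollectionEntry: entry !== null,
    markedDropped: entry !== null && entry.dropped === true,
    chunkCount: chunkCount
  };
}

// In the mongo shell, connected to a mongos, something like:
//   printjson(leftoverMetadata(db.getSiblingDB("config"), "dealer_589.repriceBatch"));
```

If chunks are still listed for a collection that no longer appears to exist, that would at least be consistent with the "this shard contains versioned chunks" part of the second error, though I can't say for sure that it is the cause.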
Basically, the only hits on Google for this error are source code.