Stale config on lazy receive (error code 9996)


Justin Foote

Feb 28, 2014, 11:49:27 AM
to mongod...@googlegroups.com
Hey all,
So, our config servers got out of sync. We brought two of them down, migrated data to the bad one, and then brought things back up. Everything seems to be running smoothly now, except that something is wrong with two collections in two different databases. That's out of 489 databases, each with between four and eight collections.

In both collections, doing a find (or a count or a findOne) returns this error:

mongos> db.repriceBatch.findOne()
Fri Feb 28 08:42:55.195 error: {
    "$err" : "too many retries of stale version info ( ns : dealer_589.repriceBatch, received : 0|0||000000000000000000000000, wanted : 1|0||53053ed71aa735c9f8351cf8, send )",
    "code" : 13388
} at src/mongo/shell/query.js:128


getShardDistribution returns this error:

mongos> db.repriceBatch.getShardDistribution()
Fri Feb 28 08:39:56.002 error: {
    "$err" : "stale config on lazy receive :: caused by :: $err: \"[dealer_589.repriceBatch] shard version not ok in Client::Context: this shard contains versioned chunks for dealer_589.repriceBatch, but no version se...\" ( ns : dealer_589.repriceBatch, received : 0|0||000000000000000000000000, wanted : 1|0||53053ed71aa735c9f8351cf8, recv )",
    "code" : 9996
} at src/mongo/shell/query.js:128
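For anyone reading the two errors above: the `received` and `wanted` values look like chunk versions printed as `major|minor||epoch`, where the epoch is a 24-character hex ObjectId. The sketch below assumes that layout. `parseChunkVersion` and `isUnsetVersion` are hypothetical helper names, not part of the shell. It only pulls the pieces apart so the two sides of the mismatch are easier to compare.

```javascript
// Hypothetical helper: split "major|minor||epoch" into its parts.
// Assumes the epoch is a 24-character hex string (an ObjectId).
function parseChunkVersion(text) {
  var m = /^(\d+)\|(\d+)\|\|([0-9a-fA-F]{24})$/.exec(String(text).trim());
  if (!m) {
    return null; // not in the assumed format
  }
  return {
    major: parseInt(m[1], 10),
    minor: parseInt(m[2], 10),
    epoch: m[3].toLowerCase()
  };
}

// Hypothetical helper: true when the version is all zeros, which is
// what the "received" side shows in both errors above.
function isUnsetVersion(v) {
  return v !== null && v.major === 0 && v.minor === 0 && /^0{24}$/.test(v.epoch);
}

var received = parseChunkVersion("0|0||000000000000000000000000");
var wanted = parseChunkVersion("1|0||53053ed71aa735c9f8351cf8");
```

Here `received` is the all-zero version and `wanted` carries a real epoch, so the two sides disagree about whether the collection has a shard version at all.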

It doesn't matter which mongos I connect through; the behavior is the same. I've also tried restarting all the mongos processes and running flushRouterConfig, to no avail.
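For reference, a minimal sketch of the flushRouterConfig step, written so it can be repeated against each mongos. `flushRouterConfig` is the real admin command. The wrapper function and the `adminDb` parameter are assumptions for illustration: `adminDb` stands for an admin-database handle such as `db.getSiblingDB("admin")` in a shell connected to a mongos.

```javascript
// Sketch: ask one mongos to discard its cached routing metadata.
// `adminDb` is assumed to be an admin-database handle that exposes
// runCommand(), e.g. db.getSiblingDB("admin") in the mongo shell.
function flushRouterConfig(adminDb) {
  var res = adminDb.runCommand({ flushRouterConfig: 1 });
  if (!res || res.ok !== 1) {
    throw new Error("flushRouterConfig failed: " + JSON.stringify(res));
  }
  return res;
}
```

This only clears the cache on the mongos it is run against, so it would need to be run once per mongos. It does not touch the cached metadata on the shards themselves.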

I've also tried dropping the collections, but the error persists even when the collection doesn't (appear to) exist.  

Basically the only hits on Google for this error are source code.


Justin Foote

Feb 28, 2014, 5:02:50 PM
to mongod...@googlegroups.com
Weirdly, this was fixed by restarting all of the mongod processes in the cluster.  

Asya Kamsky

Mar 2, 2014, 8:38:09 AM
to mongodb-user
I was going to suggest just that.

Sometimes when metadata changes in the cluster (and replacing two of the config servers with the contents of the third is more or less equivalent to that), either the mongos or the mongod processes can end up with a stale cached config. With mongos you can run a command to make it flush the router config, but with shards, the way to make them refresh their config is to step down the primary of the replica set. The new primary will read the fresh config info from the config DB.
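A minimal sketch of the step-down, to be run against the current primary of each shard's replica set. `replSetStepDown` is the real admin command (the shell helper `rs.stepDown()` wraps it). The function name, the `adminDb` parameter, and the 60-second default are assumptions for illustration. On servers of this era the primary typically closes client connections when it steps down, so a thrown error is reported here as a disconnect instead of a failure.

```javascript
// Sketch: ask the current primary to step down so that a new primary
// is elected. `adminDb` is assumed to be an admin-database handle on
// the primary, e.g. db.getSiblingDB("admin") in the mongo shell.
function stepDownPrimary(adminDb, seconds) {
  var secs = seconds || 60; // assumed default: stay secondary for 60s
  try {
    var res = adminDb.runCommand({ replSetStepDown: secs });
    return { disconnected: false, result: res };
  } catch (e) {
    // Assumption: an exception here usually means the primary dropped
    // the connection while stepping down. Verify the new primary
    // (e.g. with rs.status()) before treating this as success.
    return { disconnected: true, error: String(e) };
  }
}
```

Restarting every mongod, as was done in this thread, has the same effect of forcing the shards to reload their config, but a step-down avoids taking the processes down.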

Asya


