mongos> sh.status()
2016-09-05T09:49:15.645+0000 E QUERY [thread1] Error: error: { "code" : 50, "ok" : 0, "errmsg" : "Operation timed out" } :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
DBCommandCursor@src/mongo/shell/query.js:689:1
DBQuery.prototype._exec@src/mongo/shell/query.js:118:28
DBQuery.prototype.hasNext@src/mongo/shell/query.js:276:5
DBCollection.prototype.findOne@src/mongo/shell/collection.js:289:10
printShardingStatus@src/mongo/shell/utils_sh.js:540:19
sh.status@src/mongo/shell/utils_sh.js:78:5
@(shell):1:1
cfg:PRIMARY> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5784eeaef6b7baafd8311861")
}
shards:
{ "_id" : "rs1", "host" : "rs1/mongodbreplicaset1_mongodb-rs1-srv1_1:27017,mongodbreplicaset2_mongodb-rs1-srv2_1:27017" }
{ "_id" : "rs2", "host" : "rs2/mongodbreplicaset1_mongodb-rs2-srv2_1:27017,mongodbreplicaset2_mongodb-rs2-srv1_1:27017" }
active mongoses:
"3.2.7" : 1
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 5
Last reported error: could not get updated shard list from config server due to Operation timed out
Time of Reported error: Tue Sep 06 2016 13:05:53 GMT+0000 (UTC)
Migration Results for the last 24 hours:
No recent migrations
2016-09-05T10:09:35.549+0000 I COMMAND [conn1243] Command on database config timed out waiting for read concern to be satisfied. Command: { find: "shards", readConcern: { level: "majority", afterOpTime: { ts: Timestamp 1472281864000|2, t: 30 } }, maxTimeMS: 30000 }
2016-09-05T10:09:35.551+0000 I COMMAND [conn1243] command config.$cmd command: find { find: "shards", readConcern: { level: "majority", afterOpTime: { ts: Timestamp 1472281864000|2, t: 30 } }, maxTimeMS: 30000 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:92 locks:{} protocol:op_command 30409ms
Hi Lukas, Thiago,
On Sunday I had a server issue that left 2 of the 3 config servers in one replica set bricked and the third one stuck in recovery ...
It has been a while since you posted this question. Have you had any success in fixing the issue?
The main issue is this line in the log you posted:
2016-09-05T10:09:35.549+0000 I COMMAND [conn1243] Command on database config timed out waiting for read concern to be satisfied.
Because the config servers are deployed as a replica set, MongoDB requires that any reads from and writes to them be committed to a majority of the replica set members. This ensures that the config data is permanent and will not be rolled back for any reason (see Read and Write Operations on Config Servers).
The timeout message you are seeing in the logs reflects this lack of a majority. That is, a majority of the config server replica set members were not online at that point in time, so the config servers could not reach a quorum on the latest data written to them. In this situation, MongoDB opts to return a timeout error instead of returning potentially wrong data.
To restore operation to the cluster, you would need to ensure that a majority of the config server replica set members are online.
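As a rough illustration of the majority requirement described above, here is a minimal sketch that takes a document shaped like the output of rs.status() and reports whether enough members are healthy to form a majority. The function name majorityStatus and the sample host names are hypothetical, and the sketch assumes every member has one vote (in a real deployment, check the votes field in rs.conf()). It only counts healthy PRIMARY/SECONDARY members; it does not inspect replication lag or the majority commit point.

```javascript
// Sketch: check whether a replica set has enough healthy members for a majority.
// `status` is expected to look like the document returned by rs.status().
// Assumption: every member has exactly one vote.
function majorityStatus(status) {
  const total = status.members.length;
  const needed = Math.floor(total / 2) + 1; // strict majority of members
  // Member states: 1 = PRIMARY, 2 = SECONDARY (3 = RECOVERING, 8 = DOWN, ...)
  const available = status.members.filter(
    (m) => m.health === 1 && (m.state === 1 || m.state === 2)
  ).length;
  return { total, needed, available, hasMajority: available >= needed };
}

// Hypothetical example: two config servers down, the third stuck in RECOVERING.
const sample = {
  members: [
    { name: "cfg1.example:27019", health: 0, state: 8 },
    { name: "cfg2.example:27019", health: 0, state: 8 },
    { name: "cfg3.example:27019", health: 1, state: 3 },
  ],
};
console.log(JSON.stringify(majorityStatus(sample)));
// prints {"total":3,"needed":2,"available":0,"hasMajority":false}
```

In a mongo shell connected to one of the config servers, the same function could be called as majorityStatus(rs.status()), provided that member is still able to answer the command.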
For more information regarding reading/writing settings in a replica set, please see:
Best regards,
Kevin
Hi Enric,
Please note that I have replied in your own thread here: https://groups.google.com/forum/#!topic/mongodb-user/b9okvbIS_A4.
The telltale sign of what’s happening in your deployment seems to be the growth of the WiredTigerLAS.wt file, which is reported in SERVER-26592, and you may be experiencing a similar issue.
Let’s keep the discussion in that thread.
Best regards,
Kevin