mongos> sh.shardCollection("test.events", { "publisher": 1, "reported": 1})
{ "code" : 50, "ok" : 0, "errmsg" : "Operation timed out" }
2016-04-08T10:45:11.461-0400 I NETWORK [initandlisten] connection accepted from 9.80.193.174:64048 #2 (2 connections now open)
2016-04-08T10:46:23.478-0400 I COMMAND [conn2] command admin.$cmd command: checkShardingIndex { checkShardingIndex: "test.events", keyPattern: { publisher: 1.0, reported: 1.0 } } keyUpdates:0 writeConflicts:0 numYields:784390 reslen:74 locks:{ Global: { acquireCount: { r: 1568782 } }, Database: { acquireCount: { r: 784391 } }, Collection: { acquireCount: { r: 784391 } } } protocol:op_command 72013ms
2016-04-08T10:46:23.506-0400 I SHARDING [conn1] request split points lookup for chunk test.events { : MinKey, : MinKey } -->> { : MaxKey, : MaxKey }
2016-04-08T10:46:33.459-0400 I NETWORK [initandlisten] connection accepted from 9.80.193.174:64063 #3 (3 connections now open)
2016-04-08T10:47:12.843-0400 W SHARDING [conn1] Finding the split vector for test.events over { publisher: 1.0, reported: 1.0 } keyCount: 27639 numSplits: 3632 lookedAt: 13520 took 49336ms
2016-04-08T10:47:12.846-0400 I COMMAND [conn1] command admin.$cmd command: splitVector { splitVector: "test.events", keyPattern: { publisher: 1.0, reported: 1.0 }, min: { publisher: MinKey, reported: MinKey }, max: { publisher: MaxKey, reported: MaxKey }, maxChunkSizeBytes: 67108864, maxSplitPoints: 0, maxChunkObjects: 0 } keyUpdates:0 writeConflicts:0 numYields:784390 reslen:236463 locks:{ Global: { acquireCount: { r: 1568782 } }, Database: { acquireCount: { r: 784391 } }, Collection: { acquireCount: { r: 784391 } } } protocol:op_command 49339ms
2016-04-08T10:47:12.846-0400 I NETWORK [conn1] end connection 9.80.193.174:64033 (2 connections now open)
mongos> sh.shardCollection("test.events2", { "publisher": 1, "reported": 1})
{ "collectionsharded" : "test.events2", "ok" : 1 }
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("5707b69d7dd2b2c5d33f8b9e")
  }
  shards:
    { "_id" : "shard0000", "host" : "9.80.193.174:27017" }
  active mongoses:
    "3.2.4" : 1
  balancer:
    Currently enabled: yes
    Currently running: no
    Failed balancer rounds in last 5 attempts: 0
    Migration Results for the last 24 hours:
      No recent migrations
  databases:
    { "_id" : "test", "primary" : "shard0000", "partitioned" : true }
      test.events2
        shard key: { "publisher" : 1, "reported" : 1 }
        unique: false
        balancing: true
        chunks:
          shard0000  1
        { "publisher" : { "$minKey" : 1 }, "reported" : { "$minKey" : 1 } } -->> { "publisher" : { "$maxKey" : 1 }, "reported" : { "$maxKey" : 1 } } on : shard0000 Timestamp(1, 0)
Hi,
{ "code" : 50, "ok" : 0, "errmsg" : "Operation timed out" }
Error code 50 is a generic error indicating that the operation exceeded a time limit. In this case:
2016-04-08T10:46:23.478-0400 I COMMAND [conn2] command admin.$cmd command: checkShardingIndex { checkShardingIndex: "test.events", keyPattern: { publisher: 1.0, reported: 1.0 } } keyUpdates:0 writeConflicts:0 numYields:784390 reslen:74 locks:{ Global: { acquireCount: { r: 1568782 } }, Database: { acquireCount: { r: 784391 } }, Collection: { acquireCount: { r: 784391 } } } protocol:op_command 72013ms
this checkShardingIndex operation took 72 seconds, and
2016-04-08T10:47:12.846-0400 I COMMAND [conn1] command admin.$cmd command: splitVector { splitVector: "test.events", keyPattern: { publisher: 1.0, reported: 1.0 }, min: { publisher: MinKey, reported: MinKey }, max: { publisher: MaxKey, reported: MaxKey }, maxChunkSizeBytes: 67108864, maxSplitPoints: 0, maxChunkObjects: 0 } keyUpdates:0 writeConflicts:0 numYields:784390 reslen:236463 locks:{ Global: { acquireCount: { r: 1568782 } }, Database: { acquireCount: { r: 784391 } }, Collection: { acquireCount: { r: 784391 } } } protocol:op_command 49339ms
this splitVector operation took 49 seconds. In combination, the two operations took 121 seconds to finish, so the mongos returned a timeout error because the sh.shardCollection() command took too long to complete.
The checkShardingIndex command is an internal command that performs a collection scan to ensure that every document in the collection contains the shard key. For example, if the shard key is {a:1, b:1}, it will check every document for the existence of the fields a and b.
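As a rough illustration (plain JavaScript, not MongoDB internals — the field names come from the {a:1, b:1} example above), the check amounts to:

```javascript
// Sketch of what checkShardingIndex verifies: every document must
// contain every field of the shard key before the collection can be sharded.
function hasShardKeyFields(doc, keyFields) {
  return keyFields.every(field => doc[field] !== undefined);
}

const keyFields = ["a", "b"];
const docs = [
  { a: 1, b: 2, c: 3 },  // contains both key fields
  { a: 1, c: 3 }         // missing "b": would block sharding
];

console.log(docs.every(doc => hasShardKeyFields(doc, keyFields))); // false
```

Because this is a full collection scan, its runtime grows with the number of documents, which is why it dominated the 72 seconds in your log.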
The splitVector command is also an internal command. It uses the average object size and the number of objects in the collection to split the collection into chunks, based on the shard key and the maximum chunk size (which defaults to 64 MB).
The issue that leads to your timeout error is not only the number of documents, but also the size of the collection itself. Instead of importing the whole collection and sharding it afterward, a better solution is to pre-split the collection before importing. This will allow MongoDB to import documents directly into the chunks as efficiently as possible.
To pre-split a collection:
1. Create the empty collection using db.createCollection(), and also create the necessary indexes using db.collection.createIndex(), including the index that corresponds to the shard key.
2. Shard the empty collection using sh.shardCollection().
3. Split the empty collection into chunks using the split command through mongos.
Since the empty chunks are already distributed and balanced, the imported data will go straight into the proper shard.
Please note that you should only pre-split an empty collection.
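A minimal sketch of that workflow in the mongos shell (the collection name and split points are illustrative — real split points should come from the distribution of your own publisher/reported values):

```
// 1. Create the empty collection and the shard-key index
db.createCollection("events3")
db.events3.createIndex({ publisher: 1, reported: 1 })

// 2. Shard the still-empty collection
sh.shardCollection("test.events3", { publisher: 1, reported: 1 })

// 3. Pre-split at chosen shard-key values (hypothetical publishers)
sh.splitAt("test.events3", { publisher: "pubA", reported: MinKey })
sh.splitAt("test.events3", { publisher: "pubM", reported: MinKey })

// With more than one shard, the balancer then distributes the empty
// chunks; once that settles, run the import.
```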
a compound index on "publisher" and "reported" (a date)
Choosing the correct shard key is extremely important for a sharded cluster deployment. If the shard key involves a date element, the key will be monotonically increasing. This can result in a "hot shard": one shard that is significantly more active than the others. That limits your insert rate, and the cluster will constantly need to split chunks and rebalance, since all inserts go into a single chunk. Please see Sharding Pitfalls, Shard Keys, and Considerations for Selecting Shard Keys for more information on choosing a shard key.
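One option to evaluate (a general alternative, not something specific to your deployment): a hashed shard key on a high-cardinality field spreads monotonically increasing inserts across chunks, at the cost of turning range queries on that field into scatter-gather operations:

```
// Hashed shard key on _id (in 3.2, only single-field hashed keys are
// supported): inserts are spread evenly across shards, but range
// queries no longer target a single shard. Collection name illustrative.
sh.shardCollection("test.events3", { _id: "hashed" })
```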
Best regards,
Kevin