"want to split chunk, but can't find split point chunk"


drzebra

Oct 3, 2011, 4:09:55 PM
to mongodb-user
Sharding appears to be stuck on a collection ("mydb.tba"), giving an
error: "want to split chunk, but can't find split point chunk"

mongos log: http://pastebin.com/tH1qd2A9

There are 9 total shards as replica sets (including the original
replica set).

Running mongo 2.0.0

Any advice?

Thanks!

Greg Studer

Oct 4, 2011, 11:18:34 AM
to mongod...@googlegroups.com
Hmm... The range in question seems to be :

min: { urlmd5: "16c19bd4b2a01d516282e819010aa015" } max: { urlmd5:
"16c19d86bee89084d9221e4cf213f9b7" }

This can happen if there are docs with the same shard key, preventing
the range from being split. Are there many docs in this range, and do
any have identical keys?
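
To illustrate why identical shard keys block a split, here is a minimal sketch. It is not MongoDB's actual split logic; `findSplitPoint` is a hypothetical helper that only models the constraint that a split key must be strictly greater than the chunk's min key.

```javascript
// Illustrative only: NOT MongoDB's real split algorithm.
// Picks a split key near the middle of a sorted list of shard key values.
// A usable split key must be strictly greater than the chunk's min key,
// otherwise one side of the split would be empty.
function findSplitPoint(sortedKeys) {
  if (sortedKeys.length < 2) return null;
  const minKey = sortedKeys[0];
  const mid = Math.floor(sortedKeys.length / 2);
  // Start at the median and walk forward to the first key above minKey.
  for (let i = mid; i < sortedKeys.length; i++) {
    if (sortedKeys[i] > minKey) return sortedKeys[i];
  }
  return null; // every document shares one key: no split point exists
}

// Every document has the same shard key value.
const allSame = Array(5).fill("16c19bd4b2a01d516282e819010aa015");
// Made-up keys with a few duplicates but more than one distinct value.
const mixed = ["a1", "a1", "a1", "b2", "c3"];

console.log(findSplitPoint(allSame)); // null
console.log(findSplitPoint(mixed));   // b2
```

If every document in the range shares one key value, the helper returns null, which corresponds to the "can't find split point" situation.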


drzebra

Oct 4, 2011, 1:09:09 PM
to mongodb-user
Thanks for the reply. Yes, many docs share the same shard key. In that
range, there are 85860095 docs.

Here are some of the counts:

db.tba.find().min({ urlmd5: "16c19bd4b2a01d516282e819010aa015" }).max({ urlmd5: "16c19d86bee89084d9221e4cf213f9b7" }).count()
85860095
[this is the total # of docs in the collection...]


db.tba.find({ urlmd5: "16c19bd4b2a01d516282e819010aa015" }).count()
182622


I know it's exclusive of the max value, but here's the count for that
anyway:
db.tba.find({ urlmd5: "16c19d86bee89084d9221e4cf213f9b7"}).count()
1


db.tba.stats() shows:
"avgObjSize" : 589.44470750935,
"nchunks" : 1310


db.printShardingStatus() shows:
mydb.tba chunks:
rset1 1194
rset2 17
rset3 17
rset4 17
rset5 17
rset6 16
rset7 16
rset8 16
It's been stuck like that showing the error in the logs every 3min.


So, I guess I can try to make the shard key a compound key. Since this
collection is already partially sharded, I have to dump the
collection, remove() the collection, then import the collection onto
rset1, then shard again... correct?

Greg Studer

Oct 5, 2011, 10:00:10 AM
to mongodb-user
Yes, that's one option - the other option, if there's a split point
mongos is missing after 16c19bd4b2a01d516282e819010aa015 (there may
not be?), is to manually split the chunk using the split command with
"middle : <key>".

If you decide to re-import, I'd also recommend pre-sharding and
distributing your ranges beforehand, to make the initial insert
faster.
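
Both suggestions above come down to issuing the split command with an explicit "middle" key. Here is a rough sketch of building those command documents, including evenly spaced boundaries for pre-splitting. It assumes urlmd5 holds lowercase 32-character hex MD5 strings spread roughly uniformly. `buildSplitCommand` and `md5SplitPoints` are hypothetical helper names, and the chunk count of 4 is arbitrary.

```javascript
// Builds the document for the "split" admin command with an explicit
// split key ("middle"), as mentioned in the reply above.
function buildSplitCommand(ns, middleKey) {
  return { split: ns, middle: middleKey };
}

// Returns numChunks - 1 evenly spaced 32-character hex strings that divide
// the MD5 keyspace [0, 2^128) into numChunks ranges.
// Assumption: keys are lowercase hex MD5 strings, roughly uniform.
function md5SplitPoints(numChunks) {
  const space = 1n << 128n;
  const points = [];
  for (let i = 1; i < numChunks; i++) {
    const boundary = (space * BigInt(i)) / BigInt(numChunks);
    points.push(boundary.toString(16).padStart(32, "0"));
  }
  return points;
}

// One split command per boundary. "mydb.tba" and "urlmd5" come from the
// thread; the chunk count of 4 is an arbitrary illustration.
const commands = md5SplitPoints(4).map((p) =>
  buildSplitCommand("mydb.tba", { urlmd5: p })
);
for (const c of commands) {
  console.log(JSON.stringify(c));
  // In a mongo shell connected to mongos, each would be sent with:
  //   db.adminCommand(c)
}
```

After pre-splitting, the resulting chunks would still need to be distributed across the shards before the import to get the benefit described above.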

gen liu

Oct 9, 2011, 1:14:01 AM
to mongodb-user
I ran into the same problem.
Sharding appears to have failed on a collection
("mongo.gridfs.file.chunks"), giving the
error: "want to split chunk, but can't find split point chunk".
Log:
Sun Oct 9 11:00:18 [Balancer] forced split results: {}
Sun Oct 9 11:00:18 [Balancer] distributed lock 'balancer/
test10:27000:1318124739:1804289383' unlocked.
Sun Oct 9 11:00:28 [Balancer] distributed lock 'balancer/
test10:27000:1318124739:1804289383' acquired, ts :
4e910e4cad9b95b90ea04d72
Sun Oct 9 11:00:28 [Balancer] chose [shard0000] to [shard0002] { _id:
"mongo.gridfs.file.chunks-
files_id_ObjectId('4e90fde1dffef339cede7fee')", lastmod: Timestamp
2000|2, ns: "mongo.gridfs.file.chunks", min: { files_id:
ObjectId('4e90fde1dffef339cede7fee') }, max: { files_id:
ObjectId('4e91001cdffe608bb3f98baf') }, shard: "shard0000" }
Sun Oct 9 11:00:28 [Balancer] moving chunk ns:
mongo.gridfs.file.chunks moving ( ns:mongo.gridfs.file.chunks at:
shard0000:localhost:21100 lastmod: 2|2 min: { files_id:
ObjectId('4e90fde1dffef339cede7fee') } max: { files_id:
ObjectId('4e91001cdffe608bb3f98baf') }) shard0000:localhost:21100 ->
shard0002:localhost:23300
Sun Oct 9 11:00:28 [Balancer] moveChunk result: { chunkTooBig: true,
estimatedChunkSize: 432819556, errmsg: "chunk too big to move", ok:
0.0 }
Sun Oct 9 11:00:28 [Balancer] balancer move failed: { chunkTooBig:
true, estimatedChunkSize: 432819556, errmsg: "chunk too big to move",
ok: 0.0 } from: shard0000 to: shard0002 chunk: { _id:
"mongo.gridfs.file.chunks-
files_id_ObjectId('4e90fde1dffef339cede7fee')", lastmod: Timestamp
2000|2, ns: "mongo.gridfs.file.chunks", min: { files_id:
ObjectId('4e90fde1dffef339cede7fee') }, max: { files_id:
ObjectId('4e91001cdffe608bb3f98baf') }, shard: "shard0000" }
Sun Oct 9 11:00:28 [Balancer] forcing a split because migrate failed
for size reasons
Sun Oct 9 11:00:28 [Balancer] want to split chunk, but can't find
split point chunk ns:mongo.gridfs.file.chunks at: shard0000:localhost:
21100 lastmod: 2|2 min: { files_id:
ObjectId('4e90fde1dffef339cede7fee') } max: { files_id:
ObjectId('4e91001cdffe608bb3f98baf') } got: <empty>
Sun Oct 9 11:00:28 [Balancer] forced split results: {}
Sun Oct 9 11:00:28 [Balancer] distributed lock 'balancer/
test10:27000:1318124739:1804289383' unlocked.
Sun Oct 9 11:00:38 [Balancer] distributed lock 'balancer/
test10:27000:1318124739:1804289383' acquired, ts :
4e910e56ad9b95b90ea04d73
Sun Oct 9 11:00:38 [Balancer] chose [shard0000] to [shard0002] { _id:
"mongo.gridfs.file.chunks-
files_id_ObjectId('4e90fde1dffef339cede7fee')", lastmod: Timestamp
2000|2, ns: "mongo.gridfs.file.chunks", min: { files_id:
ObjectId('4e90fde1dffef339cede7fee') }, max: { files_id:
ObjectId('4e91001cdffe608bb3f98baf') }, shard: "shard0000" }
Sun Oct 9 11:00:38 [Balancer] moving chunk ns:
mongo.gridfs.file.chunks moving ( ns:mongo.gridfs.file.chunks at:
shard0000:localhost:21100 lastmod: 2|2 min: { files_id:
ObjectId('4e90fde1dffef339cede7fee') } max: { files_id:
ObjectId('4e91001cdffe608bb3f98baf') }) shard0000:localhost:21100 ->
shard0002:localhost:23300
Sun Oct 9 11:00:38 [Balancer] moveChunk result: { chunkTooBig: true,
estimatedChunkSize: 432819556, errmsg: "chunk too big to move", ok:
0.0 }
Sun Oct 9 11:00:38 [Balancer] balancer move failed: { chunkTooBig:
true, estimatedChunkSize: 432819556, errmsg: "chunk too big to move",
ok: 0.0 } from: shard0000 to: shard0002 chunk: { _id:
"mongo.gridfs.file.chunks-
files_id_ObjectId('4e90fde1dffef339cede7fee')", lastmod: Timestamp
2000|2, ns: "mongo.gridfs.file.chunks", min: { files_id:
ObjectId('4e90fde1dffef339cede7fee') }, max: { files_id:
ObjectId('4e91001cdffe608bb3f98baf') }, shard: "shard0000" }
Sun Oct 9 11:00:38 [Balancer] forcing a split because migrate failed
for size reasons
Sun Oct 9 11:00:38 [Balancer] want to split chunk, but can't find
split point chunk ns:mongo.gridfs.file.chunks at: shard0000:localhost:
21100 lastmod: 2|2 min: { files_id:
ObjectId('4e90fde1dffef339cede7fee') } max: { files_id:
ObjectId('4e91001cdffe608bb3f98baf') } got: <empty>
Sun Oct 9 11:00:38 [Balancer] forced split results: {}
Sun Oct 9 11:00:38 [Balancer] distributed lock 'balancer/
test10:27000:1318124739:1804289383' unlocked.

sharding status:
mongos> db.printShardingStatus()
--- Sharding Status ---
sharding version: { "_id" : 1, "version" : 3 }
shards:
{ "_id" : "shard0000", "host" : "localhost:21100",
"maxSize" : NumberLong(10000) }
{ "_id" : "shard0001", "host" : "localhost:22200",
"maxSize" : NumberLong(10000) }
{ "_id" : "shard0002", "host" : "localhost:23300",
"maxSize" : NumberLong(10000) }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" :
"config" }
{ "_id" : "mongo", "partitioned" : true, "primary" :
"shard0000" }
mongo.gridfs.file.chunks chunks:
shard0001 1
shard0000 11
{ "files_id" : { $minKey : 1 } } -->>
{ "files_id" : ObjectId("4e90fde1dffef339cede7fee") } on : shard0001
{ "t" : 2000, "i" : 0 }
{ "files_id" :
ObjectId("4e90fde1dffef339cede7fee") } -->> { "files_id" :
ObjectId("4e91001cdffe608bb3f98baf") } on : shard0000 { "t" : 2000,
"i" : 2 }
{ "files_id" :
ObjectId("4e91001cdffe608bb3f98baf") } -->> { "files_id" :
ObjectId("4e91009cdffeade5b4498957") } on : shard0000 { "t" : 2000,
"i" : 3 }
{ "files_id" :
ObjectId("4e91009cdffeade5b4498957") } -->> { "files_id" :
ObjectId("4e9101b3dffed5c07438a477") } on : shard0000 { "t" : 1000,
"i" : 5 }
{ "files_id" :
ObjectId("4e9101b3dffed5c07438a477") } -->> { "files_id" :
ObjectId("4e910225dffed5c07438b15f") } on : shard0000 { "t" : 1000,
"i" : 7 }
{ "files_id" :
ObjectId("4e910225dffed5c07438b15f") } -->> { "files_id" :
ObjectId("4e9102badffed5c07438be47") } on : shard0000 { "t" : 1000,
"i" : 9 }
{ "files_id" :
ObjectId("4e9102badffed5c07438be47") } -->> { "files_id" :
ObjectId("4e91037edffe8736e9f0f2d4") } on : shard0000 { "t" : 1000,
"i" : 11 }
{ "files_id" :
ObjectId("4e91037edffe8736e9f0f2d4") } -->> { "files_id" :
ObjectId("4e9104e1dffe8736e9f0ffbc") } on : shard0000 { "t" : 1000,
"i" : 13 }
{ "files_id" :
ObjectId("4e9104e1dffe8736e9f0ffbc") } -->> { "files_id" :
ObjectId("4e91059edffe8736e9f10ca4") } on : shard0000 { "t" : 1000,
"i" : 15 }
{ "files_id" :
ObjectId("4e91059edffe8736e9f10ca4") } -->> { "files_id" :
ObjectId("4e910658dffe8736e9f1198c") } on : shard0000 { "t" : 1000,
"i" : 17 }
{ "files_id" :
ObjectId("4e910658dffe8736e9f1198c") } -->> { "files_id" :
ObjectId("4e91072cdffe8736e9f12674") } on : shard0000 { "t" : 1000,
"i" : 19 }
{ "files_id" :
ObjectId("4e91072cdffe8736e9f12674") } -->> { "files_id" : { $maxKey :
1 } } on : shard0000 { "t" : 1000, "i" : 20 }

db stats:
mongos> db.stats()
{
"raw" : {
"localhost:21100" : {
"db" : "mongo",
"collections" : 4,
"objects" : 33053,
"avgObjSize" : 261894.77142770702,
"dataSize" : 8656407880,
"storageSize" : 10330087424,
"numExtents" : 38,
"indexes" : 5,
"indexSize" : 3572912,
"fileSize" : 17105420288,
"nsSizeMB" : 16,
"ok" : 1
},
"localhost:22200" : {
"db" : "mongo",
"collections" : 3,
"objects" : 8,
"avgObjSize" : 64,
"dataSize" : 512,
"storageSize" : 20480,
"numExtents" : 3,
"indexes" : 3,
"indexSize" : 24528,
"fileSize" : 201326592,
"nsSizeMB" : 16,
"ok" : 1
}
},
"objects" : 33061,
"avgObjSize" : 261831.4144157769,
"dataSize" : 8656408392,
"storageSize" : 10330107904,
"numExtents" : 41,
"indexes" : 8,
"indexSize" : 3597440,
"fileSize" : 17306746880,
"ok" : 1

There are 3 shards in total, none of them replica sets.
The shard key is {"files_id":1}.
Running mongo 2.0.0

Greg Studer

Oct 9, 2011, 10:37:04 PM
to mongodb-user
The first diagnostic steps should be the same as before - are there
duplicate values in:

min: { files_id:ObjectId('4e90fde1dffef339cede7fee') } max:
{ files_id:ObjectId('4e91001cdffe608bb3f98baf') } ?

Another thing to check is the total chunk size in terms of number of
documents.

gen liu

Oct 11, 2011, 9:42:38 PM
to mongodb-user
Maybe the problem is my shard key.
My file is 400 MB, so the chunk will be very large.
When I use the new shard key {"files_id":1,"_id":1}, everything is OK.

Greg Studer

Oct 12, 2011, 10:16:20 AM
to mongodb-user
Yes, if you add the always-unique _id to the shard key, you'll be sure
not to get duplicates.
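
A small sketch of why the compound key helps: with only files_id, every chunk document of one file has the same shard key value, while adding _id makes each value distinct. The documents below are made up, with plain strings standing in for ObjectIds, and `compareKeys` and `distinctKeyCount` are hypothetical helpers.

```javascript
// Compare two shard key values field by field, in the given field order.
function compareKeys(a, b, fields) {
  for (const f of fields) {
    if (a[f] < b[f]) return -1;
    if (a[f] > b[f]) return 1;
  }
  return 0;
}

// Count distinct shard key values among docs. A range can only be split
// if it contains at least two distinct values.
function distinctKeyCount(docs, fields) {
  const sorted = [...docs].sort((a, b) => compareKeys(a, b, fields));
  let count = sorted.length > 0 ? 1 : 0;
  for (let i = 1; i < sorted.length; i++) {
    if (compareKeys(sorted[i - 1], sorted[i], fields) !== 0) count++;
  }
  return count;
}

// Hypothetical GridFS chunk documents: one file, three chunks.
// Strings stand in for ObjectId values.
const docs = [
  { _id: "c1", files_id: "f1", n: 0 },
  { _id: "c2", files_id: "f1", n: 1 },
  { _id: "c3", files_id: "f1", n: 2 },
];

console.log(distinctKeyCount(docs, ["files_id"]));        // 1
console.log(distinctKeyCount(docs, ["files_id", "_id"])); // 3
```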

gen liu

Nov 14, 2011, 4:43:51 AM
to mongodb-user
Can I use the shard key {"files_id":1,"_id":1} on the collection
mongo.file.chunks?
The MongoDB docs say:
The reason to use "files_id" is so that all chunks of a given file
live on the same shard, which is safer and allows "filemd5" command to
work (required
by certain drivers).

If I use this shard key, the "filemd5" command could report an error in
the future, right?



Greg Studer

Nov 14, 2011, 4:33:43 PM
to mongodb-user
Yes, it could. Actually, you'd probably want to use "n" as the second
shard key field, but as you said, this may cause problems with md5,
depending on the driver you're using. Which driver are you using?
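
As a rough sketch of the {files_id, n} idea: in GridFS, "n" is the sequence number of a chunk within its file, so each chunk document of a file gets a distinct compound key. The 256 KB chunk size, the helper names, and the "<files_id>" placeholder below are assumptions for illustration. The namespace is the one from the earlier log.

```javascript
// Number of GridFS chunk documents needed for a file of fileBytes bytes.
function chunkCount(fileBytes, chunkBytes) {
  return Math.ceil(fileBytes / chunkBytes);
}

// Shard key values {files_id, n} for every chunk of one file. n is the
// chunk's sequence number, so each key is unique within the file.
function chunkKeys(filesId, fileBytes, chunkBytes) {
  const total = chunkCount(fileBytes, chunkBytes);
  const keys = [];
  for (let n = 0; n < total; n++) {
    keys.push({ files_id: filesId, n });
  }
  return keys;
}

const CHUNK_BYTES = 256 * 1024; // assumed GridFS chunk size
const keys = chunkKeys("<files_id>", 400 * 1024 * 1024, CHUNK_BYTES);
console.log(keys.length); // 1600 distinct keys for one 400 MB file

// Command document for sharding on the compound key. In a mongo shell
// connected to mongos it would be sent with db.adminCommand(shardCmd).
const shardCmd = {
  shardCollection: "mongo.gridfs.file.chunks",
  key: { files_id: 1, n: 1 },
};
console.log(JSON.stringify(shardCmd));
```

With this key a large file can be split across chunk ranges, which is also why its chunks are no longer guaranteed to live on one shard, the concern raised about "filemd5".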