Pre-splitting question


Patrick Scott

Apr 25, 2012, 9:11:08 AM
to mongodb-user
I want to shard a collection of around 230 million documents. It has been suggested that I pre-split the collection before enabling sharding. I ran a local test using db.runCommand({split: "my_collection", middle: { a: prefix, b : 1}}). I am using a compound key for sharding and just want to split on the first portion.
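
For readers following along, here is a minimal sketch of how a batch of such split commands could be built for a compound key { a, b }. The helper name buildSplitCommands, the namespace "mydb.my_collection", and the prefixes are hypothetical and used only for illustration. It assumes the split command is given the full "<database>.<collection>" namespace, is sent to the admin database through a mongos, and targets a namespace for which sharding has already been set up.

```javascript
// Build one `split` command document per prefix of a compound shard key { a, b }.
// `ns` is assumed to be the full "<database>.<collection>" namespace.
function buildSplitCommands(ns, prefixes, bLowerBound) {
  return prefixes.map(function (prefix) {
    // `middle` names the full shard key value at which a chunk is cut in two.
    return { split: ns, middle: { a: prefix, b: bLowerBound } };
  });
}

// Hypothetical namespace and prefixes, for illustration only.
var commands = buildSplitCommands("mydb.my_collection", ["g", "n", "t"], 1);

// In a mongo shell connected to a mongos, each command could then be sent with:
//   commands.forEach(function (c) { printjson(db.adminCommand(c)); });
```

Building the command documents separately from sending them makes it possible to review the split points before anything touches the cluster.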

How can I check that pre-splitting works? db.printShardingStatus() shows nothing since I haven't actually sharded the collection yet. I'm worried that if I attempt to shard the collection, the pre-split will not have taken effect or will not be quite right, and that this will severely impact performance.

Thoughts?

Patrick

Scott Hernandez

Apr 25, 2012, 9:13:29 AM
to mongod...@googlegroups.com
Each time you do a split you should see the chunk count increase in
the output of printShardingStatus().

Can you please post the output of printShardingStatus()?


Patrick Scott

Apr 25, 2012, 9:26:11 AM
to mongod...@googlegroups.com
Hmmm, I'm afraid to try it on my production setup. I may need to get my test cluster back up and running.

Should the chunk count increase before I actually shard the collection?

Scott Hernandez

Apr 25, 2012, 9:29:26 AM
to mongod...@googlegroups.com
Yes, the chunks are just metadata in the config servers. There does
not need to be any actual data to do the splits/migrates; it is purely
a logical (metadata) operation.
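
Since the chunks live as metadata on the config servers, one way to confirm a split is to count the chunk documents before and after it. The sketch below is only an illustration: the helper names countChunks and splitAddedOneChunk are hypothetical, and it assumes chunk documents in config.chunks carry an `ns` field holding the namespace, which may not hold on newer server releases. The output of printShardingStatus() remains the simpler check.

```javascript
// Count the chunk metadata documents recorded for one namespace.
// `configDb` is a handle to the cluster's "config" database, for example
// db.getSiblingDB("config") in a shell connected to a mongos.
// Assumption: each chunk document has an `ns` field with the namespace.
function countChunks(configDb, ns) {
  return configDb.getCollection("chunks").find({ ns: ns }).toArray().length;
}

// A split cuts one chunk into two, so the count should grow by exactly one.
function splitAddedOneChunk(countBefore, countAfter) {
  return countAfter === countBefore + 1;
}

// Possible use in a shell (hypothetical namespace):
//   var cfg = db.getSiblingDB("config");
//   var before = countChunks(cfg, "mydb.my_collection");
//   ... run one split command ...
//   var after = countChunks(cfg, "mydb.my_collection");
//   print(splitAddedOneChunk(before, after));
```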

Patrick Scott

Apr 25, 2012, 9:40:44 AM
to mongod...@googlegroups.com
Ok, I'll get another test cluster running and try a couple of split operations and let you know if I run into more trouble.

Eliot Horowitz

Apr 25, 2012, 1:18:08 PM
to mongod...@googlegroups.com
Pre-splitting is for before you have any data, to prep for a bulk load
or something like that.

Once you have data, by definition you can't pre-split, so you'll just
want to enable sharding, and that will do the splitting for you.
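
As a rough sketch of what "enable sharding" involves for a collection that already has data, the two admin commands could be built as below. The helper name buildShardingCommands and the names "mydb" and "my_collection" are hypothetical; the shard key { a: 1, b: 1 } mirrors the compound key from the original question.

```javascript
// Build the two admin commands that turn on sharding for a database
// and then shard one of its collections on the given key.
function buildShardingCommands(dbName, collName, key) {
  return [
    { enableSharding: dbName },
    { shardCollection: dbName + "." + collName, key: key }
  ];
}

// Hypothetical database and collection names, for illustration only.
var steps = buildShardingCommands("mydb", "my_collection", { a: 1, b: 1 });

// For a collection that already holds data, an index that starts with the
// shard key has to exist before shardCollection is run.
// In a mongo shell connected to a mongos, the steps could be sent with:
//   steps.forEach(function (c) { printjson(db.adminCommand(c)); });
```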