Sharding a database so that some collections are on one shard and the remainder on another

37 views
Skip to first unread message

NoelD

unread,
May 1, 2012, 6:46:35 PM5/1/12
to mongod...@googlegroups.com
We have a Mongo database in a three-node replicaset and we are investigating sharding it as a way to improve write performance, the biggest bottleneck in our setup. At first, I thought that we would be able to shard the database so that some collections would go to shard A and the others would go to shard B. However, after reading the documentation, and discovering https://jira.mongodb.org/browse/SERVER-939, I do not think that it is possible to do this. Can anyone confirm that it is not possible to do what we want and shard a single database in this manner? Is it likely that this feature will be implemented soon?

If sharding cannot be implemented as outlined in SERVER-939, what do others do when they find themselves in this situation? One option might be to put each collection into its own database, thus allowing the data to be spread across the shards, but this complicates queries. Another option might be to put all data into a single collection using the current collection names as shard keys and then split the collection.

Scott Hernandez

unread,
May 1, 2012, 7:10:10 PM5/1/12
to mongod...@googlegroups.com
Yes it is not possible, so people do put things in different databases
so they can be spread out to different shards.

Can you explain why you need to do this? Is it just that you have many
collections which are small and don't need to be sharded?

There is nothing more complicated about doing a query in one database
or another for collection. Queries can only be done in a single
collection anyway -- so moving them between databases shouldn't be an
issue.
> --
> You received this message because you are subscribed to the Google Groups
> "mongodb-user" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/mongodb-user/-/ur0GP4ISE8IJ.
> To post to this group, send email to mongod...@googlegroups.com.
> To unsubscribe from this group, send email to
> mongodb-user...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/mongodb-user?hl=en.

NoelD

unread,
May 1, 2012, 7:37:16 PM5/1/12
to mongod...@googlegroups.com
On Wednesday, May 2, 2012 11:10:10 AM UTC+12, Scott Hernandez wrote:
Yes it is not possible, so people do put things in different databases so
they can be spread out to different shards.
 
Thanks for confirming this.

Can you explain why you need to do this? Is it just that you have many
collections which are small and don't need to be sharded?

We have a single database with some twenty collections, although this number is growing. We have been slow to shard the collections because once we choose a shard key, we cannot change it. The problem we are trying to solve is write performance - we write lots of data from mapreduce jobs and the single-threaded write is a bottleneck. We hoped that by splitting the database into two shards that we could increase the write performance by having reducers writing data to different collections in different shards at the same time.
 
There is nothing more complicated about doing a query in one database
or another for collection. Queries can only be done in a single
collection anyway -- so moving them between databases shouldn't be an
issue.

Good point.

Reply all
Reply to author
Forward
0 new messages