Is it possible?
thanks.
Is it then possible to know what shard the result id came from and
retrieve it from only that shard?
thanks.
On Dec 1, 10:22 am, Sergei Tulentsev <sergei.tulent...@gmail.com>
wrote:
The MongoDB documentation page "Choosing a Shard Key" provides some good tips and
best practices on the subject.
http://www.mongodb.org/display/DOCS/Choosing+a+Shard+Key
A big pitfall of choosing a shard key that has the same cardinality as
the number of servers is what to do when those servers fill up.
Consider this example:
"I have four servers, so I am going to create a key, "goto_shard",
with the possible value of 1-4 and shard on that. Each new document
will be assigned a "goto_shard" value of 1-4 in a round-robin
fashion."
This may be okay in the beginning, as each server fills at the
same rate. However, what will happen when these four servers get
full? A fifth shard can be added to the cluster, and the range of
"goto_shard" can be expanded to 1-5. However, the chunks on shards
1-4 cannot be split any further, because each one holds a single
"goto_shard" value, so the balancer has nothing smaller to move to the
new shard. In order to rebalance the documents, the value
of "goto_shard" will have to be manually changed on all of the
documents that you wish to move to the new shard. To quote "Scaling
MongoDB" by Kristina Chodorow, "If you want to manually distribute
your data, do not use MongoDB's built-in sharding. You'll be fighting
it all the way." This book is a terrific resource for learning about
how sharding works in MongoDB and provides good examples of the "best
practices" to use when creating sharded collections.
http://shop.oreilly.com/product/0636920018308.do
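
To make the "goto_shard" pitfall concrete, here is a rough sketch in plain JavaScript. It is a toy model and not MongoDB code: it only assumes that a chunk covers a range of shard key values and cannot be split inside a single value. The helper names (countByKey, maxChunks) and the sample documents are made up for illustration.

```javascript
// Toy model, not MongoDB code: a chunk is a contiguous range of shard key
// values, and a chunk can only be split between two distinct key values.

// Count documents per distinct value of the given key field.
function countByKey(docs, keyField) {
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc[keyField], (counts.get(doc[keyField]) || 0) + 1);
  }
  return counts;
}

// The smallest possible chunk holds exactly one key value, so the number of
// distinct key values is an upper bound on the number of chunks.
function maxChunks(docs, keyField) {
  return countByKey(docs, keyField).size;
}

// Hypothetical data: 1000 documents, "goto_shard" assigned round-robin 1-4,
// plus a "name" field with many distinct values for comparison.
const sampleDocs = [];
for (let i = 0; i < 1000; i++) {
  sampleDocs.push({ _id: i, goto_shard: (i % 4) + 1, name: "user" + i });
}

const shardCount = 5; // four original shards plus one newly added
const lowCardinality = maxChunks(sampleDocs, "goto_shard");
const highCardinality = maxChunks(sampleDocs, "name");

console.log("max chunks when sharding on goto_shard:", lowCardinality);
console.log("max chunks when sharding on name:", highCardinality);
console.log("enough chunks for every shard (goto_shard):", lowCardinality >= shardCount);
console.log("enough chunks for every shard (name):", highCardinality >= shardCount);
```

In this model the "goto_shard" key can never produce more than four chunks, so five shards cannot each hold one, while the higher-cardinality "name" key leaves plenty of chunks to spread around.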
All of that being said, there is a feature request in the pipeline to
allow the user to specify some affinity for which shard the balancer
sends each chunk to. (This is mostly for location awareness, but
could probably be adapted for round-robin style writes.) The Jira case
is here:
https://jira.mongodb.org/browse/SERVER-2545
It is tentatively scheduled for version 2.1.
Other users have asked similar questions regarding round-robin style
writes. You may find the discussion titled "Forcing a Round Robin on
Sharded Collections" beneficial:
http://groups.google.com/group/mongodb-user/browse_thread/thread/b943d60caf3d6347
On the topic of choosing a shard key, for read efficiency, the best
practice is to choose a shard key that your application will be
querying on. This is what Sergei Tulentsev was talking about in
regard to the drawbacks of using a random shard key. If you shard on
the key "name", and run a query on that key, such as:
> db.collection.find({name:"Marc"})
the mongos process will only talk to the shard containing the
document(s) which match the query. However, if you shard on the key
"name" and run a query on the "_id" key, all shards will be hit, because
the mongos will not know which shard each specific _id resides on.
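
The difference between the two queries can be sketched with a small simulation in plain JavaScript. This is a toy model of the routing decision, not how mongos is actually implemented: the chunk table, the shard names, and the shardsForQuery helper are all invented for illustration, and only exact-match queries are handled.

```javascript
// Toy model of the routing decision, not MongoDB code. Chunk ranges on the
// shard key "name" are [min, max), and each range is owned by one shard.
const chunkTable = [
  { min: "", max: "H", shard: "shard0" },
  { min: "H", max: "P", shard: "shard1" },
  { min: "P", max: "\uffff", shard: "shard2" },
];

// Return the list of shards that an exact-match query must be sent to.
function shardsForQuery(query, shardKey, chunks) {
  const allShards = [...new Set(chunks.map((c) => c.shard))];
  if (!(shardKey in query)) {
    // No shard key in the query: the router cannot rule out any shard.
    return allShards;
  }
  const value = query[shardKey];
  const owner = chunks.find((c) => c.min <= value && value < c.max);
  // Fall back to every shard if no chunk range covers the value.
  return owner ? [owner.shard] : allShards;
}

// Query on the shard key: one shard is enough.
const targeted = shardsForQuery({ name: "Marc" }, "name", chunkTable);
// Query on a different field: every shard has to be asked.
const scattered = shardsForQuery({ _id: 42 }, "name", chunkTable);

console.log("find({name: 'Marc'}) goes to:", targeted.join(", "));
console.log("find({_id: 42}) goes to:", scattered.join(", "));
```

With this chunk table, the query on "name" is sent only to shard1, while the query on "_id" is sent to all three shards.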
In conclusion, it is important to choose a shard key wisely so that
writes will be evenly distributed, and trying to force which shard
each new document is written to is strongly discouraged.
Hopefully the above has provided you with some insight about how
sharding works. If you have any questions regarding choosing a shard
key or anything else, the MongoDB community is here to help!
Good luck!