Don't use hashed shard keys (unless you really have to and know what you're doing)


Jacek Radzikowski

Sep 5, 2013, 3:40:41 PM9/5/13
to mongod...@googlegroups.com
This is a followup to the topic "High load on "old" shards after adding a new one" from a week ago.
My problem was that after adding a new node to a MongoDB cluster sharded on a hashed key, the entire cluster was brought to the point where it was unusable. The cause was the high disk load generated by the balancing procedure. When, after three days of waiting, only about half of the data (~70GB) had been transferred to the new node, I decided to give an unhashed shard key another chance. The old cluster configuration was wiped and I "replayed" exactly the same scenario as with the previous cluster. The initial config used three nodes, and the data was sharded on a unique attribute with low entropy.
The distribution of data was not as even as with the hashed key, and there were very long periods of time when only one node was accepting all incoming records, but thanks to the hard work the balancer was doing almost all the time, the data ended up more or less evenly distributed over all nodes.
The first thing I noticed was that the cluster was much faster: the sustained processing rate for the new cluster was over 400 records per second (each transaction consists of a query and an insert or update), while for the hashed configuration it was closer to 200. After inserting ~95 million documents (a bit over 500GB of data), I added a fourth node. The new node was populated in less than 7 hours, and the balancing had very little impact on the performance of the entire cluster. The identical operation for the configuration with the hashed shard key did not complete in 3 days and brought the entire cluster to the ground (the processing rate dropped to 5 records per second).

Lesson learned: unless the cluster must be perfectly balanced, or it is absolutely impossible to choose a decent shard key, hashed keys have no advantages. Any unique field, even one with very low entropy, will perform much better as a shard key than a hash.

j.

Asya Kamsky

Sep 5, 2013, 8:11:58 PM9/5/13
to mongodb-user
Can you clarify this part for me: "each transaction consists of a query and an insert or update" - is the query on a different collection, with the insert or update going into the sharded collection?

If you are going to be loading a large amount of data into an empty cluster, you should pre-split and balance empty chunks *before* loading any data - that way the impact of the balancer would only be felt when you add a new shard (and therefore *must* balance).

As far as hashed shard keys being useless - this is simply not the case. If your alternative is sharding on a monotonically increasing value (like an _id generated as an ObjectId()), then that provides *no* write distribution for inserts and could absolutely *kill* performance.
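For example (a minimal mongos-shell sketch - "mydb.docs" is just a
placeholder namespace, not your collection), sharding on the hash of
_id spreads inserts across chunks even though the raw _id values keep
increasing:

// enable sharding for the database, then shard the collection on the
// hash of _id rather than on _id itself
sh.enableSharding("mydb")
sh.shardCollection("mydb.docs", { _id: "hashed" })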

I'm not sure what the issue was in your case, as I know there was testing done comparing hashed and unhashed shard key performance. But I'm a little unsure what you mean by "unique attribute, but with low entropy" - is that something like a monotonically increasing value? Is it the same field that you previously sharded on, but hashed?

Asya




Jacek Radzikowski

Sep 6, 2013, 1:32:49 AM9/6/13
to mongod...@googlegroups.com
Let me explain: In the incoming stream of documents, some may be duplicates. The duplicates do not have to be exact copies, they may contain some additional information that is not present in the docs in the collection. I can't discard the extra information, I can overwrite the document in the collection either. The only reasonable way to handle such situations is to query the DB and insert new document, or add the new information to the existing one using update. Pre-splitting the data is not an option, because I need to record data coming from a streaming interface at the rate of 10-100 docs per second.
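Roughly, the per-document logic looks like this (just a sketch; the collection name, key field and merged fields are placeholders, not my real schema):

// look up the incoming document by its unique key, then either insert it
// or merge the extra fields into the document that is already stored
var existing = db.docs.findOne({ key: doc.key });
if (existing === null) {
    db.docs.insert(doc);
} else {
    db.docs.update({ key: doc.key }, { $set: { extraInfo: doc.extraInfo } });
}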
The sharding key is almost monotonic. There is some randomness at lower bits, but the overall trend is increasing. That's why I decided to use hashing for the first experiment. Unfortunately, the overall performance of the cluster was much worse than I expected. Hashing did not eliminate migrations and each migration was causing a very significant drop in the performance. Since this is an experimental setup, I could replay the scenario using non-hashed key. With the same set of servers, the same data and the same set of indexes, the cluster with non-hashed key performed much better. The collection is sharded on the same field, the only difference is that this time I do not hash it. The distribution of writes was much worse (as expected), but not only it did not kill the performance, but actually increased it. While each chunk transfer in the collection sharded on hashed key was bringing the DB to a crawl, now migrations are noticeable only as messages in the log file and as spikes in disk IOs on graphs in MMS.
My experience is very similar to what described Joerek Van Gaalen (http://www.marshut.com/nyhvn/sharding-movechunks-very-slow.html). Apparently I'm not the only one having problems with hashed shard keys.

j.

Jeff Lee

Sep 6, 2013, 11:40:55 AM9/6/13
to mongod...@googlegroups.com
Jacek,

I posted in your other thread - the bottom line is that I think your hardware may be undersized for your dataset, and the hashed index is just exposing problems a little earlier than normal since the entire index is hot.

Have you just run insert tests, or have you tried "real-world" queries as well? My guess would be that with 8GB of RAM for a 150+ GB dataset, you're going to have issues as the indexes grow regardless of whether or not you use a hashed shard key.

Asya Kamsky

Sep 8, 2013, 11:40:10 AM9/8/13
to mongodb-user
Three points about this:

1) "The only reasonable way to handle such situations is to query the
DB and insert a new document, or add the new information to the
existing one using an update."

Adding the new information using an update would be more efficient
than a query followed by an insert (you would use the "upsert" flag to
avoid potential race conditions). Query-then-insert-or-update in the
application is not only less efficient, it is also prone to race
conditions if two threads try to insert the same document at once.
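As a sketch (mongo shell; the collection and field names are
placeholders - use whatever your unique key actually is), a single
upsert does the find-or-insert-or-merge in one atomic operation:

// if a document with this key exists, merge the new fields into it;
// otherwise insert a new document - no separate query needed
db.docs.update(
    { key: doc.key },
    {
        $set: { extraInfo: doc.extraInfo },      // applied on insert and update
        $setOnInsert: { firstSeen: new Date() }  // applied only when inserting
    },
    { upsert: true }
)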

2) "Pre-splitting the data is not an option, because I need to record
data coming from a streaming interface at a rate of 10-100 docs per
second."

Pre-splitting refers to splitting the shard key range into many
chunks and distributing them evenly across your cluster *before* you
load any data - moving empty chunks is very inexpensive. Hashed shard
keys are already supposed to do this for you, so there should have
been no migrations during your data load *except* when you add a new
shard to an already populated cluster.

So for your second test with the unhashed shard key, I'm saying that
splitting the chunks and distributing them over the shards before
starting the data load would have avoided the "very long periods of
time when only one node was accepting all incoming records" while
competing for resources with the balancer.
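As a rough sketch (mongos shell; the namespace, split points and
shard names are made up for illustration - you would pick split
points that cover your actual key range):

// create empty chunks across the expected key range before loading any data
for (var i = 1; i <= 20; i++) {
    sh.splitAt("mydb.docs", { key: i * 5000000 });
}
// moving empty chunks is very cheap - spread them over the shards up
// front so the balancer has nothing to do while the load is running
sh.moveChunk("mydb.docs", { key: 0 },        "shard0000")
sh.moveChunk("mydb.docs", { key: 35000000 }, "shard0001")
sh.moveChunk("mydb.docs", { key: 70000000 }, "shard0002")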

3) "The shard key is almost monotonic."

This means that inserts are almost always going to go into a single
chunk (the highest one), and the only reason the distribution ends up
even is that there is "edge case" handling code in sharding: when it
detects that all the inserts are going into the highest chunk, on the
next split it splits near the max and migrates the new highest (empty)
chunk to another shard. But if the reason you need sharding is to
distribute the write load, then a monotonically increasing shard key
is a poor choice.


Another thing I would say is that I wish you would file a Jira SERVER
ticket - if balancing with hashed shard keys is slower than balancing
with unhashed shard keys, then I feel it is possible there is simply a
bug (which hopefully could be fixed if it's identified), and since
your dataset seems to have triggered this behavior it would be very
valuable to figure out why. We did some performance testing before
releasing the hashed shard keys feature and didn't see a noticeable
performance difference (but I know that different hashing algorithms
were considered, and it's possible that on your shard key values a
different hashing algorithm would be measurably faster).

I just ran a very quick test comparing moving a chunk in two identical
collections, one sharded on _id (regular) and the other on _id hashed,
and here are my results for one and the other:

mongos> sh.moveChunk("test.test",{_id:1},"shard0001")
{ "millis" : 1086, "ok" : 1 }
mongos> sh.moveChunk("test.test",{_id:1},"shard0000")
{ "millis" : 1090, "ok" : 1 }
mongos> sh.moveChunk("test.test",{_id:1},"shard0001")
{ "millis" : 1077, "ok" : 1 }
mongos> sh.moveChunk("test.test",{_id:1},"shard0000")
{ "millis" : 1068, "ok" : 1 }
mongos> sh.moveChunk("test.test",{_id:1},"shard0001")
{ "millis" : 1076, "ok" : 1 }

AND the other:

mongos> sh.moveChunk("test.test",{_id:1},"shard0000")
{ "millis" : 1085, "ok" : 1 }
mongos> sh.moveChunk("test.test",{_id:1},"shard0001")
{ "millis" : 2164, "ok" : 1 }
mongos> sh.moveChunk("test.test",{_id:1},"shard0000")
{ "millis" : 1087, "ok" : 1 }
mongos> sh.moveChunk("test.test",{_id:1},"shard0001")
{ "millis" : 1180, "ok" : 1 }
mongos> sh.moveChunk("test.test",{_id:1},"shard0000")
{ "millis" : 1169, "ok" : 1 }
mongos>


You can see that other than the one outlier (the second move in my
_UNHASHED_ key test) they were essentially identical.

Since you have a test setup, I would love to see the numbers from your
migrations with and without hashing the shard key. Since they show
very different performance in your case, what exactly differs? I'm
guessing the key that's being sharded, but maybe there is something
else in the environment that I haven't thought of yet.
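One way to pull those numbers (a sketch - I'm assuming the standard
config.changelog entries that migrations write, so double-check the
field names on your version):

// each completed migration leaves an entry with per-step timings in
// the config database's changelog collection
db.getSiblingDB("config").changelog.find(
    { what: "moveChunk.from" },
    { time: 1, ns: 1, details: 1 }
).sort({ time: -1 }).pretty()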

Asya

