Unbalanced shards and balancing error

Nenad Merdanovic

unread,

Nov 5, 2014, 11:45:11 AM11/5/14

to mongod...@googlegroups.com

I am running MongoDB 2.6.4 with 4 servers and 8 shards (each server has four shards, and +1 replication). I have several mongos routers and three config servers.

While balancing a collection I ended up with this (it was not pre-sharded):

{ "_id" : "rty", "partitioned" : true, "primary" : "rty" }

rty.users

shard key: { "_id" : "hashed" }

chunks:

rty-sh4 880

rty-sh5 880

rty-sh7 880

rty-sh6 880

rty-sh8 880

rty-sh1 881

rty-sh3 880

rty 881

As you can see, the data distribution looks very wrong:

Totals

data : 218.57GiB docs : 631746259 chunks : 7042

Shard rty contains 19.7% data, 19.21% docs in cluster, avg obj size on shard : 380B

Shard rty-sh1 contains 40.58% data, 45.32% docs in cluster, avg obj size on shard : 332B

Shard rty-sh3 contains 5.86% data, 4.98% docs in cluster, avg obj size on shard : 437B

Shard rty-sh4 contains 7.39% data, 6.89% docs in cluster, avg obj size on shard : 398B

Shard rty-sh5 contains 5.89% data, 5.04% docs in cluster, avg obj size on shard : 434B

Shard rty-sh6 contains 7.32% data, 6.73% docs in cluster, avg obj size on shard : 404B

Shard rty-sh7 contains 5.89% data, 5.04% docs in cluster, avg obj size on shard : 434B

Shard rty-sh8 contains 7.33% data, 6.75% docs in cluster, avg obj size on shard : 403B

Going through the logs, I found the following:

27019/ded1326:27017:1409603898:1804289383', sleeping for 30000ms

2014-10-08T06:02:05.959-0500 [LockPinger] cluster 10.30.1.162:27019,10.30.1.166:27019,10.30.1.78:27019 pinged successfully at Wed Oct 8 06:02:05 2014 by distributed lock pinger '10.30.1.162:27019,10.30.1.166:27019,10.30.1.78:27019/ded1326:27017:1409603898:1804289383', sleeping for 30000ms

2014-10-08T06:02:07.087-0500 [conn474] SyncClusterConnection connecting to [10.30.1.162:27019]

2014-10-08T06:02:07.088-0500 [conn474] SyncClusterConnection connecting to [10.30.1.166:27019]

2014-10-08T06:02:07.088-0500 [conn474] SyncClusterConnection connecting to [10.30.1.78:27019]

2014-10-08T06:02:07.112-0500 [conn475] ChunkManager: time to load chunks for rty.users: 24ms sequenceNumber: 2265652 version: 3523|1||5406025d06417b544f8da8a4 based on: 3522|1||5406025d06417b544f8da8a4

2014-10-08T06:02:07.566-0500 [Balancer] couldn't find database [rty_test] in config db

2014-10-08T06:02:07.695-0500 [Balancer] put [rty_test] on: rty-sh4:rty-sh4/10.30.1.170:27019,10.30.1.66:27019

2014-10-08T06:02:07.735-0500 [Balancer] warning: could not move chunk min: { _id: MinKey } max: { _id: -9223075039127251768 }, continuing balancing round :: caused by :: 10181 not sharded:rty_test.users_test

2014-10-08T06:02:07.735-0500 [Balancer] moving chunk ns: perf_test.test3 moving ( ns: perf_test.test3, shard: rty:rty/10.30.1.122:27017,10.30.1.158:27017, lastmod: 2|50||000000000000000000000000, min: { _id: MinKey }, max: { _id: -8849721541900570023 }) rty:rty/10.30.1.122:27017,10.30.1.158:27017 -> rty-sh3:rty-sh3/10.30.1.122:27019,10.30.1.158:27019

2014-10-08T06:02:08.661-0500 [Balancer] moveChunk result: { ok: 0.0, errmsg: "ns not found, should be impossible" }

2014-10-08T06:02:08.662-0500 [Balancer] balancer move failed: { ok: 0.0, errmsg: "ns not found, should be impossible" } from: rty to: rty-sh3 chunk: min: { _id: MinKey } max: { _id: -8849721541900570023 }

2014-10-08T06:02:08.852-0500 [conn477] ChunkManager: time to load chunks for rty.users: 104ms sequenceNumber: 2265653 version: 3523|1||5406025d06417b544f8da8a4 based on: (empty)

2014-10-08T06:02:08.872-0500 [Balancer] distributed lock 'balancer/srv0001:27017:1409603898:1804289383' unlocked.

After that, there are numerous entries about various moves failing with the same error message. Can someone shed some light on how this can happen, what to do to balance it (if there are options other than dump annd re-import because there is over 300M documents) and how to prevent this from happening in the future.

Thanks!

Will Berkeley

unread,

Nov 6, 2014, 12:01:08 AM11/6/14

to mongod...@googlegroups.com

The error messages don't have to do with the namespaces rty.users, which is what you've shown the chunk information for. rty.users looks to be fine from what you've shown here. The balancer tries to even up the number of chunks between the shards, and it looks like it was successful. Are the namespaces perf_test.test3 or rty_test.users_test present in the config database or on any of the shards?

-Will

Nenad Merdanovic

unread,

Nov 6, 2014, 3:06:14 AM11/6/14

to mongod...@googlegroups.com

Those were dropped on purpose as a result of testing and I thought the log entries could point to something being wrong with the config servers, so I pasted the messages even though it's not related to the same namespace. You are saying that it's normal to have such discrepancy between amount of data in shards?

Thanks,

Nenad

Will Berkeley

unread,

Nov 6, 2014, 3:44:43 AM11/6/14

to mongod...@googlegroups.com

The balancer balances based on number of chunks, and chunks can be different sizes, so it can happen that the amount of data on shards differs, especially if some chunks are forced to be large by the nature of the shard key. It is odd to have this happen so extremely, and also for it to happen for a hashed shard key. Where did the statistics of the distribution of data come from? The documents in the shards with "too much" data are noticeably smaller on average than the docs in the other shards. Maybe you are counting more than just the docs in the sharded collection, including some from unsharded collections whose primary shard is rty or rty-sh1?

-Will

--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.

For other MongoDB technical support options, see: http://www.mongodb.org/about/support/.
---
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user...@googlegroups.com.
To post to this group, send email to mongod...@googlegroups.com.
Visit this group at http://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/a4fe2f77-4d96-426a-ac87-e9be19ec865a%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Nenad Merdanovic

unread,

Nov 6, 2014, 7:35:26 AM11/6/14

to mongod...@googlegroups.com

The situation is like this:

rty:PRIMARY> show dbs;

admin (empty)

local 24.066GB

rty 61.924GB

rty2 15.946GB

rty-sh5:PRIMARY> show dbs;

admin (empty)

local 12.072GB

rty 25.941GB

rty2 15.946GB

If we ignore 'local' difference, you will see that rty is much larger on the 'rty' shard than on 'rty-sh5'. We have repaired the databases on 'rty' shard after the balancing, as they were even bigger.