Collection is imbalanced across the shards

Sandeep Nemuri

Apr 24, 2015, 4:28:02 AM
to mongod...@googlegroups.com
Hi All,

We are using MongoDB 2.4.2 and have a five-shard cluster.

There is a sharded collection with a date as the shard key.

We have observed that this collection is not balanced across the shards even though the balancer is running.

Shard distribution is as follows:

 Shard shard00 contains 16.85% data, 17.82% docs in cluster, avg obj size on shard : 461B
 Shard shard01 contains 25.8% data, 25.33% docs in cluster, avg obj size on shard : 496B
 Shard shard02 contains 23.76% data, 24.17% docs in cluster, avg obj size on shard : 479B
 Shard shard03 contains 7.3% data, 6.85% docs in cluster, avg obj size on shard : 519B
 Shard shard04 contains 26.27% data, 25.8% docs in cluster, avg obj size on shard : 496B

Why is the data distributed unevenly across the shards even though the balancer is running?

Thanks in advance.

Regards
Sandeep Nemuri

Asya Kamsky

Apr 25, 2015, 10:27:17 PM
to mongodb-user
It would be helpful to see how the *chunks* are split - the output of sh.status() will have that information.

Balancing is done by chunks; if the chunks are unevenly sized, that can cause a data imbalance.

Do you by any chance ever delete old data? That would definitely cause this issue with a date shard key.

Asya
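
Since balancing works on chunk counts, a per-shard tally of chunks is a quick way to see what the balancer sees. The following is a minimal sketch, assuming chunk metadata documents shaped like those in the config.chunks collection of that era (with `ns` and `shard` fields). The namespace "mydb.events" and the helper name chunksPerShard are placeholders, not taken from this thread.

```javascript
// Count chunks per shard from an array of chunk metadata documents.
// Assumes each document carries a `shard` field, as config.chunks documents do.
function chunksPerShard(chunks) {
  var counts = {};
  chunks.forEach(function (chunk) {
    counts[chunk.shard] = (counts[chunk.shard] || 0) + 1;
  });
  return counts;
}

// In the mongo shell, the input could come from something like:
//   db.getSiblingDB("config").chunks.find({ ns: "mydb.events" }).toArray()
// ("mydb.events" is a placeholder namespace.)
var sampleChunks = [
  { ns: "mydb.events", shard: "shard00" },
  { ns: "mydb.events", shard: "shard01" },
  { ns: "mydb.events", shard: "shard00" }
];
console.log(chunksPerShard(sampleChunks));
```

Comparing this tally with the data percentages reported per shard shows whether the imbalance is in the number of chunks or in how full they are.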



Sandeep Nemuri

Apr 28, 2015, 8:05:45 AM
to mongod...@googlegroups.com, ankitsinghal59, sandeep sandeep
Hi Asya,

Thanks for your reply.

Please find below the distribution of chunks across shards for this collection.

shard key: { "date" : 1, "sg_id" : 1, "eg_id" : 1 }
chunks:
shard04 20884
shard03 20884
shard01 20885
shard02 20884
shard00 20889

Yes, we have deleted the historical data, but we are unable to run compaction or rebuild the indexes because of a storage issue.

Could you please suggest how we can proceed further?

Thanks
Sandeep Nemuri
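
Putting the two sets of figures from this thread side by side makes the mismatch concrete: the chunk counts are almost identical, while the data shares are not. The sketch below only does arithmetic on the numbers posted above; the function names chunkSpread and relativeChunkFill are made up for illustration.

```javascript
// Figures copied from the posts above: share of data and chunk count per shard.
var shards = [
  { name: "shard00", dataPct: 16.85, chunks: 20889 },
  { name: "shard01", dataPct: 25.8,  chunks: 20885 },
  { name: "shard02", dataPct: 23.76, chunks: 20884 },
  { name: "shard03", dataPct: 7.3,   chunks: 20884 },
  { name: "shard04", dataPct: 26.27, chunks: 20884 }
];

// Difference between the largest and smallest chunk count.
function chunkSpread(list) {
  var counts = list.map(function (s) { return s.chunks; });
  return Math.max.apply(null, counts) - Math.min.apply(null, counts);
}

// Share of data divided by share of chunks, per shard.
// 1.0 means the shard's chunks are average-sized; lower means emptier chunks.
function relativeChunkFill(list) {
  var totalChunks = list.reduce(function (sum, s) { return sum + s.chunks; }, 0);
  return list.map(function (s) {
    var chunkPct = (100 * s.chunks) / totalChunks;
    return { name: s.name, fill: s.dataPct / chunkPct };
  });
}

console.log("chunk spread:", chunkSpread(shards));
relativeChunkFill(shards).forEach(function (r) {
  console.log(r.name, r.fill.toFixed(2));
});
```

The spread in chunk counts is only a handful out of more than 100,000 chunks, so from the balancer's point of view there is nothing to move. The fill ratio for shard03 comes out well under half of the average, which is consistent with many of its chunks being empty or nearly empty.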

Sandeep Nemuri

May 5, 2015, 10:09:01 AM
to mongod...@googlegroups.com, ankitsinghal59, sandeep sandeep
Hi Asya,

Did you get a chance to look at this issue?

Thanks
Sandeep Nemuri

Stephen Dillon

May 5, 2015, 10:56:30 AM
to mongod...@googlegroups.com
As Asya said, people would really need to see the chunk distribution. If you look at how sharding works in MongoDB, it is the chunks (not necessarily the data) that are spread evenly across the shards. One chunk may hold much less data than another.

Dwight, from MongoDB, does a good job explaining this simply in the MongoDB DBA course.

Asya Kamsky

May 5, 2015, 6:26:56 PM
to mongodb-user
Running compaction would not help you. The key point is that if you have empty chunks because the historical data they held has been deleted, that would create this imbalance.
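
One way to confirm this is to check which chunk ranges no longer hold any documents. The following is a rough sketch, not a tested procedure: findEmptyChunks and the countInRange callback are hypothetical names, and the sample ranges and counts are stand-ins so the snippet runs on its own. On a live cluster the callback might wrap a range query on the shard key or the dataSize command, and that part depends on the deployment.

```javascript
// Return the chunks whose key range holds no documents.
// `countInRange(min, max)` is a caller-supplied function (hypothetical here)
// that returns the number of documents with min <= shard key < max.
function findEmptyChunks(chunks, countInRange) {
  return chunks.filter(function (chunk) {
    return countInRange(chunk.min, chunk.max) === 0;
  });
}

// Stand-in data: real ranges would come from config.chunks, and real counts
// from a query against the sharded collection.
var chunkRanges = [
  { shard: "shard03", min: { date: 1 }, max: { date: 2 } },
  { shard: "shard03", min: { date: 2 }, max: { date: 3 } },
  { shard: "shard04", min: { date: 3 }, max: { date: 4 } }
];
var fakeCounts = { 1: 0, 2: 0, 3: 500 };
function stubCount(min, max) {
  return fakeCounts[min.date];
}

var emptyChunks = findEmptyChunks(chunkRanges, stubCount);
console.log(emptyChunks.length + " empty chunk(s) found");
```

With more than 100,000 chunks, running a count per chunk is slow, so it may be more practical to check only the oldest date ranges, where the deleted data used to live.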

What is your shard key?  Is it time based?

If it is, then you could merge the lower ranges of the chunks using the mergeChunks command (in 2.6 and later).

Asya
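
For reference, mergeChunks takes a namespace and a pair of bounds, and it combines contiguous chunk ranges that sit on the same shard. The sketch below only builds the command document and checks those two conditions. buildMergeCommand is a made-up helper, "mydb.events" is a placeholder namespace, and the key values are invented; only the field names follow the shard key posted earlier in the thread.

```javascript
// Build a mergeChunks command document for two adjacent chunks.
// Throws if the chunks are on different shards or are not contiguous.
function buildMergeCommand(namespace, lower, upper) {
  if (lower.shard !== upper.shard) {
    throw new Error("chunks are on different shards");
  }
  // Simplified contiguity check: compares the JSON form of the boundary keys,
  // which assumes both keys list their fields in the same order.
  if (JSON.stringify(lower.max) !== JSON.stringify(upper.min)) {
    throw new Error("chunks are not contiguous");
  }
  return { mergeChunks: namespace, bounds: [lower.min, upper.max] };
}

// Placeholder chunks using the field names of the thread's shard key.
var lowerChunk = {
  shard: "shard03",
  min: { date: 1, sg_id: 0, eg_id: 0 },
  max: { date: 2, sg_id: 0, eg_id: 0 }
};
var upperChunk = {
  shard: "shard03",
  min: { date: 2, sg_id: 0, eg_id: 0 },
  max: { date: 3, sg_id: 0, eg_id: 0 }
};

var mergeCmd = buildMergeCommand("mydb.events", lowerChunk, upperChunk);
console.log(JSON.stringify(mergeCmd));
// In the mongo shell, connected to a mongos, this would be sent with:
//   db.adminCommand(mergeCmd)
```

Note that the cluster in this thread runs 2.4.2, so an upgrade would be needed before relying on the command as described above.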




