Shard Imbalance in MongoDB 2.0.4

amuseme

Apr 28, 2012, 3:22:13 AM
to mongod...@googlegroups.com
Hi.
I am using MongoDB 2.0.4 with 3 shards; each shard is a 3-member replica set.

mongos> db.settings.find()
{ "_id" : "chunksize", "value" : 200 }
mongos> db.version()
2.0.4

This is the printShardingStatus() output:

mongos> db.printShardingStatus()
--- Sharding Status ---
  sharding version: { "_id" : 1, "version" : 3 }
  shards:
        {  "_id" : "shard62-1",  "host" : "shard62-1/192.168.166.62:50110,192.168.166.63:50110",  "maxSize" : NumberLong(6144) }
        {  "_id" : "shard63-1",  "host" : "shard63-1/192.168.166.62:50111,192.168.166.63:50111",  "maxSize" : NumberLong(6144) }
        {  "_id" : "shard64-1",  "host" : "shard64-1/192.168.166.64:50112,192.168.166.65:50112",  "maxSize" : NumberLong(6144) }
        {  "_id" : "shard65-1",  "host" : "shard65-1/192.168.166.64:50113,192.168.166.65:50113",  "maxSize" : NumberLong(6144) }
  databases:
        {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
        {  "_id" : "xunuu_test",  "partitioned" : true,  "primary" : "shard62-1" }
                xunuu_test.WebDb chunks:
                                shard62-1       11
                                shard63-1       3
                                shard64-1       1
                        { "URI" : { $minKey : 1 } } -->> { "URI" : "http://club.360buy.com/review/1000468567-1-1.html" } on : shard62-1 { "t" : 2000, "i" : 2 }
                        { "URI" : "http://club.360buy.com/review/1000468567-1-1.html" } -->> { "URI" : "http://club.360buy.com/userreview/123810-1-1.html" } on : shard62-1 { "t" : 3000, "i" : 6 }
                        { "URI" : "http://club.360buy.com/userreview/123810-1-1.html" } -->> { "URI" : "http://detail.zol.com.cn/271/270667/pic.shtml" } on : shard62-1 { "t" : 3000, "i" : 7 }
                        { "URI" : "http://detail.zol.com.cn/271/270667/pic.shtml" } -->> { "URI" : "http://greatwall-rubber.en.alibaba.com/" } on : shard62-1 { "t" : 3000, "i" : 8 }
                        { "URI" : "http://greatwall-rubber.en.alibaba.com/" } -->> { "URI" : "http://mobile.zol.com.cn/289/2891317.html" } on : shard62-1 { "t" : 3000, "i" : 9 }
                        { "URI" : "http://mobile.zol.com.cn/289/2891317.html" } -->> { "URI" : "http://product.it168.com/detail/doc/230589/pic.shtml" } on : shard62-1 { "t" : 2000, "i" : 6 }
                        { "URI" : "http://product.it168.com/detail/doc/230589/pic.shtml" } -->> { "URI" : "http://product.it168.com/detail/doc/417553/2/1/ddp.shtml" } on : shard62-1 { "t" : 2000, "i" : 8 }
                        { "URI" : "http://product.it168.com/detail/doc/417553/2/1/ddp.shtml" } -->> { "URI" : "http://product.it168.com/detail/doc/73653/trait.shtml" } on : shard62-1 { "t" : 2000, "i" : 10 }
                        { "URI" : "http://product.it168.com/detail/doc/73653/trait.shtml" } -->> {
} on : shard62-1 { "t" : 2000, "i" : 14 }
} -->> { "URI" : "http://www.360buy.com/product/1000469946.html" } on : shard62-1 { "t" : 2000, "i" : 15 }
                        { "URI" : "http://www.360buy.com/product/1000469946.html" } -->> { "URI" : "http://www.360buy.com/product/595755.html" } on : shard62-1 { "t" : 1000, "i" : 3 }
                        { "URI" : "http://www.360buy.com/product/595755.html" } -->> {
} on : shard63-1 { "t" : 3000, "i" : 2 }
                        {
} -->> { "URI" : "http://www.newegg.com.cn/Product/02-c07-040.htm" } on : shard63-1 { "t" : 3000, "i" : 4 }
                        { "URI" : "http://www.newegg.com.cn/Product/02-c07-040.htm" } -->> { "URI" : "https://www.alipay.com/user/reg_select.htm" } on : shard63-1 { "t" : 3000, "i" : 5 }
                        { "URI" : "https://www.alipay.com/user/reg_select.htm" } -->> { "URI" : { $maxKey : 1 } } on : shard64-1 { "t" : 3000, "i" : 0 }
        {  "_id" : "test",  "partitioned" : false,  "primary" : "shard63-1" }
        {  "_id" : "bbj",  "partitioned" : false,  "primary" : "shard63-1" }
        {  "_id" : "bbj_test",  "partitioned" : false,  "primary" : "shard64-1" }
        {  "_id" : "xunuu",  "partitioned" : false,  "primary" : "shard65-1" }
        {  "_id" : "EOO",  "partitioned" : false,  "primary" : "shard64-1" }

I think my shards are imbalanced: shard62-1 has 11 chunks while shard64-1 has only 1, and shard65-1 has no chunks at all.
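(As a side note, the per-shard chunk counts can also be pulled straight from the config database; the group() call below is only a sketch, and works here because config.chunks is not itself sharded:)

mongos> use config
mongos> db.chunks.group({ key: { shard: 1 }, cond: { ns: "xunuu_test.WebDb" }, reduce: function(doc, out) { out.count += 1; }, initial: { count: 0 } })  // one result document per shard, with its chunk count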

I see the error below in the mongos log:

Sat Apr 28 00:00:53 [conn27772] end connection 192.168.166.65:60724
Sat Apr 28 00:00:53 [conn27773] end connection 192.168.166.65:60727
Sat Apr 28 00:00:59 [Balancer] distributed lock 'balancer/xunuu62:50124:1335497117:1804289383' acquired, ts : 4f9ac2bbbd09b0063f21850a
Sat Apr 28 00:00:59 [Balancer] no available shards to take chunks
Sat Apr 28 00:00:59 [Balancer] distributed lock 'balancer/xunuu62:50124:1335497117:1804289383' unlocked. 
Sat Apr 28 00:01:09 [Balancer] distributed lock 'balancer/xunuu62:50124:1335497117:1804289383' acquired, ts : 4f9ac2c5bd09b0063f21850b
Sat Apr 28 00:01:09 [Balancer] no available shards to take chunks
Sat Apr 28 00:01:09 [Balancer] distributed lock 'balancer/xunuu62:50124:1335497117:1804289383' unlocked. 

My WebDb.stats() output looks like this:
mongos> db.WebDb.stats()
{
        "sharded" : true,
        "flags" : 1,  
        "warning" : "indexes don't all match - ok if ensureIndex is running",
        "ns" : "xunuu_test.WebDb",
        "count" : 1308387,
        "numExtents" : 36,
        "size" : 719689268,
        "storageSize" : 828092416,
        "totalIndexSize" : 788068288,
        "indexSizes" : {
                "URI_1" : 161459648,
                "_id_" : 46529616,
                "crawlState_1" : 42883120,
                "expiredTime_1" : 42384384,
                "fetchInterval_1" : 28444304,
                "fetchTime_1" : 30888928,
                "isInfoPage_1" : 29956864,
                "isStructured_1" : 29956864,
                "objectType_1" : 42384384,
                "pageType_1" : 42196336,
                "parentURI_1" : 36636656,
                "score_1" : 38827824,
                "siteKey_1" : 56201824,
                "siteScore_1" : 40912704,
                "siteScore_1_score_1" : 52449040,
                "threadId_1" : 35132272,
                "updateNumber_1" : 30823520
        },
        "avgObjSize" : 550.0584062666474,
        "nindexes" : 17,
        "nchunks" : 15,
        "shards" : {  
        "shard62-1" : {
                        "ns" : "xunuu_test.WebDb",
                        "count" : 891721,
                        "size" : 531580824,
                        "avgObjSize" : 596.1290852183587,
                        "storageSize" : 607436800,
                        "numExtents" : 20,
                        "nindexes" : 17,
                        "lastExtentSize" : 107782144,
                        "paddingFactor" : 1.0099999999852463,
                        "flags" : 1,
                        "totalIndexSize" : 716193072,
                        "indexSizes" : {
                                "_id_" : 32671296,
                                "URI_1" : 103442752,
                                "score_1" : 38827824,
                                "siteScore_1" : 40912704,
                                "siteScore_1_score_1" : 52449040,
                                "threadId_1" : 35132272,
                                "expiredTime_1" : 42384384,
                                "isStructured_1" : 29956864,
                                "crawlState_1" : 42883120,
                                "isInfoPage_1" : 29956864,
                                "siteKey_1" : 56201824,
                                "fetchTime_1" : 30888928,
                                "fetchInterval_1" : 28444304,
                                "parentURI_1" : 36636656,
                                "updateNumber_1" : 30823520,
                                "objectType_1" : 42384384,
                                "pageType_1" : 42196336
                        },
                        "ok" : 1
                },
        "shard63-1" : {
                        "ns" : "xunuu_test.WebDb",
                        "count" : 416659,
                        "size" : 188106548,
                        "avgObjSize" : 451.46402213800735,
                        "storageSize" : 220647424,
                        "numExtents" : 15,
                        "nindexes" : 2,
                        "lastExtentSize" : 43311104,
                        "paddingFactor" : 1.009999999997084,
                        "flags" : 1,
                        "totalIndexSize" : 71858864,
                        "indexSizes" : {
                                "_id_" : 13850144,
                                "URI_1" : 58008720
                        },
                        "ok" : 1
                },
                "shard64-1" : {
                        "ns" : "xunuu_test.WebDb",
                        "count" : 7,
                        "size" : 1896,
                        "avgObjSize" : 270.85714285714283,
                        "storageSize" : 8192,
                        "numExtents" : 1,
                        "nindexes" : 2,
                        "lastExtentSize" : 8192,
                        "paddingFactor" : 1,
                        "flags" : 1,
                        "totalIndexSize" : 16352,
                        "indexSizes" : {
                                "_id_" : 8176,
                                "URI_1" : 8176
                        },
                        "ok" : 1
                }
        },
        "ok" : 1
}

Also, why is the index count different on each shard?

Could you help me out?

Thanks 
lemo lu

gregor

Apr 30, 2012, 7:35:31 AM
to mongod...@googlegroups.com
It looks like your shards are maxed out. Can you check that you have free disk space on your shards, and also turn up the log level, please?
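For example, something along these lines from the mongos shell (logLevel 2 is just an illustration; free disk space on each shard host is best checked with your OS tools against the dbpath):

mongos> db.adminCommand({ setParameter: 1, logLevel: 2 })  // raise verbosity at runtime
mongos> use xunuu_test
mongos> db.stats()  // compare the per-shard sizes against the 6144 MB maxSize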

Adam C

Apr 30, 2012, 9:54:28 AM
to mongod...@googlegroups.com
How recently were these shards added?

It looks like the indexes for the collection have not yet been created on the other shards, or have had some sort of problem (interrupted during an index build, perhaps?). It would be a good idea to log into the mongod and get db.currentOp() output to see what is going on for those two shards. If there is no ensureIndex build in progress, then to rectify the problem try creating the indexes (ensureIndex) yourself.
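For example, connected directly to the primary of one of the affected shards (the index keys are copied from the stats output above; repeat the ensureIndex for each index that is missing):

> use xunuu_test
> db.currentOp()                           // look for an operation building an index on xunuu_test.WebDb
> db.WebDb.ensureIndex({ crawlState: 1 })  // recreate one of the missing indexes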

The log message you are getting from the balancer can kick in if the candidate shards to move chunks to are unavailable - either maxed out, as Gregor mentioned, or draining, etc. I will have to check whether a blocking operation of some sort could also cause this to trigger.
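A quick way to see the shard state the balancer considers (just a sketch):

mongos> use config
mongos> db.shards.find()  // look for a low "maxSize" or "draining" : true on the candidate shards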

As a complete aside: the siteScore_1_score_1 index makes siteScore_1 redundant and it can be removed.
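That is, roughly (run through mongos, which should propagate the drop to each shard; verify afterwards with getIndexes() on the shard primaries):

mongos> use xunuu_test
mongos> db.WebDb.dropIndex("siteScore_1")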

Adam.

amuseme

May 2, 2012, 1:24:47 AM
to mongod...@googlegroups.com
Yes, you are right: the problem is that shard65-1 has no space left to take chunks, which is what causes the balancer to warn "no available shards to take chunks". I lowered the shard maxSize to limit memory usage, so the current maxSize is 6144.
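(For reference, this is roughly how the maxSize is configured; the value is in MB and the shard name below is just an example:)

mongos> db.adminCommand({ addShard : "shard65-1/192.168.166.64:50113,192.168.166.65:50113", maxSize : 6144 })
// or, for a shard that already exists, adjust it in the config database:
mongos> use config
mongos> db.shards.update({ _id : "shard65-1" }, { $set : { maxSize : 6144 } })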

Thanks, Gregor.

amuseme

May 2, 2012, 1:57:35 AM
to mongod...@googlegroups.com
These shards were added in advance, when I started the cluster.

Yes, you are right: the indexes for that collection were not created successfully on the other shards because of some sort of problem during the index build. I have already rebuilt the indexes myself and dropped the redundant index.

Thanks for your advice.