I have a thorny problem with adding new shards; could you please help me?

Xuguang

Sep 3, 2012, 1:48:04 AM
to mongod...@googlegroups.com, xz...@cisco.com

Hi guys, I have a thorny problem with adding new shards. Could you please help me?

The setup:
Previously, I had three machines: 10.10.10.5, 10.10.10.6, and 10.10.10.7.
10.10.10.5 runs the config server, a mongos, and two replica-set shards, shard3 and shard4, both as primary. 10.10.10.6 runs a mongos and the same two shards, shard3 and shard4, both as secondary.
10.10.10.7 runs a mongos and the same two shards, shard3 and shard4, both only as arbiter.
My application connects to the mongos on 10.10.10.6.
Everything worked well, but a year later 10.10.10.5 and 10.10.10.6 were under very heavy load, especially 10.10.10.6, where CPU usage and load average were very high. So I planned to add two new machines to the cluster and created two new shards, shard1 and shard2. The new machine 10.10.10.8 runs shard1 (primary) and shard2 (secondary); the new machine 10.10.10.9 runs shard1 (secondary) and shard2 (primary). The existing member 10.10.10.7 also joined shard1 and shard2, again as arbiter. I also started a separate mongos on each of the two new machines.
The problem: after I added the two new shards (with the addShard command), the migration finished in about 5 hours (though I can't be sure it did). Then the load on 10.10.10.6 became very, very high, with a load average of about 90.5 (on 4 CPUs). Meanwhile, the application was sending many read and write requests to the mongos on 10.10.10.6, but little or no data was being written to the two new machines; iostat showed almost no I/O on either of them. Why did the load on 10.10.10.6 become so high? Previously, even at peak time, the highest load average was about 30.5.
So could you guys kindly help me? :)
Warm regards

gregor

Sep 4, 2012, 3:45:44 AM
to mongod...@googlegroups.com, xz...@cisco.com
Can you connect to mongos with the mongo shell and run sh.status(true)?
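
If the sh.status(true) output is too long to paste, a per-shard chunk count is a quick way to see whether the balancer actually moved data onto shard1 and shard2. Here is a minimal sketch, not a definitive diagnostic: `countChunksByShard` is a hypothetical helper, the `ns` and `shard` field names are assumed from the 2.x `config.chunks` document format, and the `app.events` namespace and sample data are made up for illustration.

```javascript
// Hypothetical helper: tally chunks per shard, optionally for one namespace.
// `chunks` is an array of documents shaped like those in config.chunks
// (assumed fields: `ns` and `shard`).
function countChunksByShard(chunks, ns) {
  var counts = {};
  for (var i = 0; i < chunks.length; i++) {
    var chunk = chunks[i];
    // Skip chunks of other collections when a namespace filter is given.
    if (ns !== undefined && chunk.ns !== ns) {
      continue;
    }
    counts[chunk.shard] = (counts[chunk.shard] || 0) + 1;
  }
  return counts;
}

// In the mongo shell, connected to a mongos, the input could come from:
//   var chunks = db.getSiblingDB("config").chunks.find().toArray();
//   printjson(countChunksByShard(chunks));

// Made-up sample data, for illustration only.
var sample = [
  { ns: "app.events", shard: "shard3" },
  { ns: "app.events", shard: "shard4" },
  { ns: "app.events", shard: "shard3" },
  { ns: "app.events", shard: "shard1" },
  { ns: "app.users", shard: "shard4" }
];

var result = countChunksByShard(sample, "app.events");
// result is { shard3: 2, shard4: 1, shard1: 1 }
```

If the counts for shard1 and shard2 stay at or near zero while shard3 and shard4 hold nearly everything, that would suggest the migration did not really complete.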

andre.defrere

Sep 4, 2012, 3:47:27 AM
to mongod...@googlegroups.com, xz...@cisco.com
There is a lot of discussion on this in the Stack Overflow thread http://stackoverflow.com/questions/12250638/

Xuguang

Sep 4, 2012, 3:52:44 AM
to mongod...@googlegroups.com, xz...@cisco.com
Yeah, I saw the discussion; it doesn't seem to have dug out the root cause yet. I'll keep watching it.

On Tuesday, September 4, 2012 at 3:47:28 PM UTC+8, andre.defrere wrote: