I added a third shard to my cluster last week and no data has moved to
it at all. I've been digging for answers and I'm finding related
things but nothing that quite addresses the issue -- my first thought
was that there is a stale lock somewhere, but I don't see it. Any
help would be much appreciated, because I've GOT to get this sorted
soon -- the RS1 shard is about 80% on disk.
DB is v2.0; we have been using v2.0 for several months now and have
not made the jump to 2.2 yet.
Below is everything I think might help someone figure this out. Sorry
for the wall o' text; I wasn't sure how much to include.
---------------------------------------------------------------------------
[The following done from the RS1 primary unless otherwise indicated,
if that matters]
After some discussion it seems like the new shard has registered but isn't being included properly in the balancing.
Would you be willing to execute the flushRouterConfig command to try and clean out the cached router configurations and load them from the config servers, in order to get the router to acknowledged the new shard. The command is db.adminCommand( 'flushRouterConfig' ) on the mongos instance.
On Tuesday, September 4, 2012 7:59:42 AM UTC+10, David K Storrs wrote:
> I added a third shard to my cluster last week and no data has moved to > it at all. I've been digging for answers and I'm finding related > things but nothing that quite addresses the issue -- my first thought > was that there is a stale lock somewhere, but I don't see it. Any > help would be much appreciated, because I've GOT to get this sorted > soon -- the RS1 shard is about 80% on disk.
> DB is v2.0; we have been using v2.0 for several months now and have > not made the jump to 2.2 yet.
> Below is everything I think might help someone figure this out. Sorry > for the wall o' text; I wasn't sure how much to include.
Hi David, I work with David Storrs, here's what he got from the flushRouterConfig. I also poked in the mongos logs on the new shard machine and saw that the balancer was having issues migrating. After all of dave's stuff is the mongos log with the messages.
Thanks for the response. I tried it from mongos on the RS3 instance and got the following:
mongos> db.adminCommand( 'flushRouterConfig' ) { "flushed" : true, "ok" : 1 } mongos> use cm_prod switched to db cm_prod mongos> db.printShardingStatus() --- Sharding Status --- sharding version: { "_id" : 1, "version" : 3 } shards: { "_id" : "RS1", "host" : "RS1/xxx:27018" } { "_id" : "RS2", "host" : "RS2/xxx:27018" } { "_id" : "RS3", "host" : "RS3/xxx:27018" } databases: { "_id" : "admin", "partitioned" : false, "primary" : "config" } { "_id" : "demo", "partitioned" : false, "primary" : "RS1" } { "_id" : "cm_prod", "partitioned" : true, "primary" : "RS1" } cm_prod.profile_daily_stats chunks: RS2 16 RS1 16 too many chunks to print, use verbose if you want to force print cm_prod.profile_raw_stats chunks: RS2 160 RS1 158 too many chunks to print, use verbose if you want to force print cm_prod.video_latest_stats chunks: RS1 152 RS2 153 too many chunks to print, use verbose if you want to force print cm_prod.video_raw_stats chunks: RS1 3564 RS2 3533 too many chunks to print, use verbose if you want to force print { "_id" : "test", "partitioned" : false, "primary" : "RS2" } { "_id" : "admin; db", "partitioned" : false, "primary" : "RS2" } { "_id" : "system", "partitioned" : false, "primary" : "RS2" } { "_id" : "geoff", "partitioned" : false, "primary" : "RS2" } { "_id" : "cm_demo_data", "partitioned" : false, "primary" : "RS2" } { "_id" : "cm_develop_data", "partitioned" : false, "primary" : "RS2" } { "_id" : "cm", "partitioned" : false, "primary" : "RS2" }
On Monday, September 3, 2012 10:19:43 PM UTC-7, David Hows wrote:
> Hi David,
> After some discussion it seems like the new shard has registered > but isn't being included properly in the balancing.
> Would you be willing to execute the flushRouterConfig command to try and > clean out the cached router configurations and load them from the config > servers, in order to get the router to acknowledged the new shard. The > command is db.adminCommand( 'flushRouterConfig' ) on the mongos instance.
> On Tuesday, September 4, 2012 7:59:42 AM UTC+10, David K Storrs wrote:
>> I added a third shard to my cluster last week and no data has moved to >> it at all. I've been digging for answers and I'm finding related >> things but nothing that quite addresses the issue -- my first thought >> was that there is a stale lock somewhere, but I don't see it. Any >> help would be much appreciated, because I've GOT to get this sorted >> soon -- the RS1 shard is about 80% on disk.
>> DB is v2.0; we have been using v2.0 for several months now and have >> not made the jump to 2.2 yet.
>> Below is everything I think might help someone figure this out. Sorry >> for the wall o' text; I wasn't sure how much to include.
I've seen that you have also been speaking with Adam.
Looking further it seems that the balancer lock could be stale, can you confirm this? Please query as follows: use config db.locks.find({ _id : 'balancer'})
On Wednesday, September 5, 2012 4:56:09 AM UTC+10, Geoff L wrote:
> Hi David, > I work with David Storrs, here's what he got from the flushRouterConfig. I > also poked in the mongos logs on the new shard machine and saw that the > balancer was having issues migrating. After all of dave's stuff is the > mongos log with the messages.
> Thanks for the response. I tried it from mongos on the RS3 instance > and got the following:
> mongos> db.adminCommand( 'flushRouterConfig' ) > { "flushed" : true, "ok" : 1 } > mongos> use cm_prod > switched to db cm_prod > mongos> db.printShardingStatus() > --- Sharding Status --- > sharding version: { "_id" : 1, "version" : 3 } > shards: > { "_id" : "RS1", "host" : "RS1/xxx:27018" } > { "_id" : "RS2", "host" : "RS2/xxx:27018" } > { "_id" : "RS3", "host" : "RS3/xxx:27018" } > databases: > { "_id" : "admin", "partitioned" : false, "primary" : "config" } > { "_id" : "demo", "partitioned" : false, "primary" : "RS1" } > { "_id" : "cm_prod", "partitioned" : true, "primary" : "RS1" } > cm_prod.profile_daily_stats chunks: > RS2 16 > RS1 16 > too many chunks to print, use verbose if you > want to force print > cm_prod.profile_raw_stats chunks: > RS2 160 > RS1 158 > too many chunks to print, use verbose if you > want to force print > cm_prod.video_latest_stats chunks: > RS1 152 > RS2 153 > too many chunks to print, use verbose if you > want to force print > cm_prod.video_raw_stats chunks: > RS1 3564 > RS2 3533 > too many chunks to print, use verbose if you > want to force print > { "_id" : "test", "partitioned" : false, "primary" : "RS2" } > { "_id" : "admin; db", "partitioned" : false, "primary" : "RS2" > } > { "_id" : "system", "partitioned" : false, "primary" : "RS2" } > { "_id" : "geoff", "partitioned" : false, "primary" : "RS2" } > { "_id" : "cm_demo_data", "partitioned" : false, "primary" : > "RS2" } > { "_id" : "cm_develop_data", "partitioned" : false, > "primary" : "RS2" } > { "_id" : "cm", "partitioned" : false, "primary" : "RS2" }
> On Monday, September 3, 2012 10:19:43 PM UTC-7, David Hows wrote:
>> Hi David,
>> After some discussion it seems like the new shard has registered >> but isn't being included properly in the balancing.
>> Would you be willing to execute the flushRouterConfig command to try and >> clean out the cached router configurations and load them from the config >> servers, in order to get the router to acknowledged the new shard. The >> command is db.adminCommand( 'flushRouterConfig' ) on the mongos instance.
>> On Tuesday, September 4, 2012 7:59:42 AM UTC+10, David K Storrs wrote:
>>> I added a third shard to my cluster last week and no data has moved to >>> it at all. I've been digging for answers and I'm finding related >>> things but nothing that quite addresses the issue -- my first thought >>> was that there is a stale lock somewhere, but I don't see it. Any >>> help would be much appreciated, because I've GOT to get this sorted >>> soon -- the RS1 shard is about 80% on disk.
>>> DB is v2.0; we have been using v2.0 for several months now and have >>> not made the jump to 2.2 yet.
>>> Below is everything I think might help someone figure this out. Sorry >>> for the wall o' text; I wasn't sure how much to include.
Ya, I was talking to Adam looked like he might have been working on a problem similar to ours.
It doesn't seem like the balancer lock is stale. There are locks in the collection that are old, but refer to collections that no longer exist, so I don't believe they are relevant. I have also noticed that balancing has completely stopped when the new shard is in the cluster. If I remove the new shard from the cluster using the removeShard command, the cluster goes back to working fine. Is there something I'm supposed to do with the balancer when I add a new shard?
On Tuesday, September 4, 2012 7:53:12 PM UTC-7, David Hows wrote:
> Hi Geoff,
> I've seen that you have also been speaking with Adam.
> Looking further it seems that the balancer lock could be stale, can you > confirm this? Please query as follows: > use config > db.locks.find({ _id : 'balancer'})
> Would you be able to please check the balancer lock again to see if they > are indeed stale and if so try deleting the balancer lock.
> Thanks
> David
> On Wednesday, September 5, 2012 4:56:09 AM UTC+10, Geoff L wrote:
>> Hi David, >> I work with David Storrs, here's what he got from the flushRouterConfig. >> I also poked in the mongos logs on the new shard machine and saw that the >> balancer was having issues migrating. After all of dave's stuff is the >> mongos log with the messages.
>> Thanks for the response. I tried it from mongos on the RS3 instance >> and got the following:
>> mongos> db.adminCommand( 'flushRouterConfig' ) >> { "flushed" : true, "ok" : 1 } >> mongos> use cm_prod >> switched to db cm_prod >> mongos> db.printShardingStatus() >> --- Sharding Status --- >> sharding version: { "_id" : 1, "version" : 3 } >> shards: >> { "_id" : "RS1", "host" : "RS1/xxx:27018" } >> { "_id" : "RS2", "host" : "RS2/xxx:27018" } >> { "_id" : "RS3", "host" : "RS3/xxx:27018" } >> databases: >> { "_id" : "admin", "partitioned" : false, "primary" : "config" >> } >> { "_id" : "demo", "partitioned" : false, "primary" : "RS1" } >> { "_id" : "cm_prod", "partitioned" : true, "primary" : "RS1" } >> cm_prod.profile_daily_stats chunks: >> RS2 16 >> RS1 16 >> too many chunks to print, use verbose if you >> want to force print >> cm_prod.profile_raw_stats chunks: >> RS2 160 >> RS1 158 >> too many chunks to print, use verbose if you >> want to force print >> cm_prod.video_latest_stats chunks: >> RS1 152 >> RS2 153 >> too many chunks to print, use verbose if you >> want to force print >> cm_prod.video_raw_stats chunks: >> RS1 3564 >> RS2 3533 >> too many chunks to print, use verbose if you >> want to force print >> { "_id" : "test", "partitioned" : false, "primary" : "RS2" } >> { "_id" : "admin; db", "partitioned" : false, "primary" : >> "RS2" } >> { "_id" : "system", "partitioned" : false, "primary" : "RS2" } >> { "_id" : "geoff", "partitioned" : false, "primary" : "RS2" } >> { "_id" : "cm_demo_data", "partitioned" : false, "primary" : >> "RS2" } >> { "_id" : "cm_develop_data", "partitioned" : false, >> "primary" : "RS2" } >> { "_id" : "cm", "partitioned" : false, "primary" : "RS2" }
>> On Monday, September 3, 2012 10:19:43 PM UTC-7, David Hows wrote:
>>> Hi David,
>>> After some discussion it seems like the new shard has registered >>> but isn't being included properly in the balancing.
>>> Would you be willing to execute the flushRouterConfig command to try and >>> clean out the cached router configurations and load them from the config >>> servers, in order to get the router to acknowledged the new shard. The >>> command is db.adminCommand( 'flushRouterConfig' ) on the mongos instance.
>>> On Tuesday, September 4, 2012 7:59:42 AM UTC+10, David K Storrs wrote:
>>>> I added a third shard to my cluster last week and no data has moved to >>>> it at all. I've been digging for answers and I'm finding related >>>> things
Your problem is an interesting one. Its clear that the balancer isn't working, but there is no stale lock on the balancer. Which is normally the cause of these kinds of issues.
Would you have any issue with removing the old locks on the collections that dont exist?
Additionally, would you be able to add back the third shard and enable logging on the mongos. Mongos is saying its aware of a chunk migration already in progress and the extra logging should give us an idea of what the cause is. You would only need to turn the logging up for a short time (1 minute or so) and just need to see the error message in the log around balancing.
To change the logging level execute the following command "db.adminCommand({ setParameter: 1, logLevel: 5 })"
On Thursday, September 6, 2012 3:49:42 AM UTC+10, Geoff L wrote:
> Hi Dave,
> Ya, I was talking to Adam looked like he might have been working on a > problem similar to ours.
> It doesn't seem like the balancer lock is stale. There are locks in the > collection that are old, but refer to collections that no longer exist, so > I don't believe they are relevant. I have also noticed that balancing has > completely stopped when the new shard is in the cluster. If I remove the > new shard from the cluster using the removeShard command, the cluster goes > back to working fine. Is there something I'm supposed to do with the > balancer when I add a new shard?
> Thanks,
> Geoff
> On Tuesday, September 4, 2012 7:53:12 PM UTC-7, David Hows wrote:
>> Hi Geoff,
>> I've seen that you have also been speaking with Adam.
>> Looking further it seems that the balancer lock could be stale, can you >> confirm this? Please query as follows: >> use config >> db.locks.find({ _id : 'balancer'})
>> Would you be able to please check the balancer lock again to see if they >> are indeed stale and if so try deleting the balancer lock.
>> Thanks
>> David
>> On Wednesday, September 5, 2012 4:56:09 AM UTC+10, Geoff L wrote:
>>> Hi David, >>> I work with David Storrs, here's what he got from the flushRouterConfig. >>> I also poked in the mongos logs on the new shard machine and saw that the >>> balancer was having issues migrating. After all of dave's stuff is the >>> mongos log with the messages.
>>> Thanks for the response. I tried it from mongos on the RS3 instance >>> and got the following:
>>> mongos> db.adminCommand( 'flushRouterConfig' ) >>> { "flushed" : true, "ok" : 1 } >>> mongos> use cm_prod >>> switched to db cm_prod >>> mongos> db.printShardingStatus() >>> --- Sharding Status --- >>> sharding version: { "_id" : 1, "version" : 3 } >>> shards: >>> { "_id" : "RS1", "host" : "RS1/xxx:27018" } >>> { "_id" : "RS2", "host" : "RS2/xxx:27018" } >>> { "_id" : "RS3", "host" : "RS3/xxx:27018" } >>> databases: >>> { "_id" : "admin", "partitioned" : false, "primary" : >>> "config" } >>> { "_id" : "demo", "partitioned" : false, "primary" : "RS1" } >>> { "_id" : "cm_prod", "partitioned" : true, "primary" : "RS1" } >>> cm_prod.profile_daily_stats chunks: >>> RS2 16 >>> RS1 16 >>> too many chunks to print, use verbose if you >>> want to force print >>> cm_prod.profile_raw_stats chunks: >>> RS2 160 >>> RS1 158 >>> too many chunks to print, use verbose if you >>> want to force print >>> cm_prod.video_latest_stats chunks: >>> RS1 152 >>> RS2 153 >>> too many chunks to print, use verbose if you >>> want to force print >>> cm_prod.video_raw_stats chunks: >>> RS1 3564 >>> RS2 3533 >>> too many chunks to print, use verbose if you >>> want to force print >>> { "_id" : "test", "partitioned" : false, "primary" : "RS2" } >>> { "_id" : "admin; db", "partitioned" : false, "primary" : >>> "RS2" } >>> { "_id" : "system", "partitioned" : false, "primary" : "RS2" } >>> { "_id" : "geoff", "partitioned" : false, "primary" : "RS2" } >>> { "_id" : "cm_demo_data", "partitioned" : false, "primary" : >>> "RS2" } >>> { "_id" : "cm_develop_data", "partitioned" : false, >>> "primary" : "RS2" } >>> { "_id" : "cm", "partitioned" : false, "primary" : "RS2" }
On Wednesday, September 5, 2012 4:07:56 PM UTC-7, David Hows wrote:
> Hi Geoff,
> Your problem is an interesting one. Its clear that the > balancer isn't working, but there is no stale lock on the balancer. Which > is normally the cause of these kinds of issues.
> Would you have any issue with removing the old locks on the collections > that dont exist?
> Additionally, would you be able to add back the third shard and enable > logging on the mongos. Mongos is saying its aware of a chunk migration > already in progress and the extra logging should give us an idea of what > the cause is. You would only need to turn the logging up for a short time > (1 minute or so) and just need to see the error message in the log around > balancing.
> To change the logging level execute the following command > "db.adminCommand({ setParameter: 1, logLevel: 5 })"
> Cheers,
> David
Well, we found the issue: we had mongo 2.2 on shard 3 and 2.0 on the shards 1 and 2, exactly the way we were certain we didn't.
We installed the new shard the day AFTER 2.2 came out, and when we did 'yum install ...' we got the 2.2 instance without realizing it. When we started debugging this, the first thing we checked was "is this v2.0.6?", but we accidentally connected to the mongod on the dev server, not the prod server. Everything after that was therefore obviously futile thrashing around, since we had already tested and "eliminated" the correct answer. Once we found the problem, it took 20 minutes to fix, after nearly a week of forehead pounding.
On Tuesday, September 4, 2012 7:59:42 AM UTC+10, David K Storrs wrote:
> I added a third shard to my cluster last week and no data has moved to > it at all. I've been digging for answers and I'm finding related > things but nothing that quite addresses the issue -- my first thought > was that there is a stale lock somewhere, but I don't see it. Any > help would be much appreciated, because I've GOT to get this sorted > soon -- the RS1 shard is about 80% on disk.
> DB is v2.0; we have been using v2.0 for several months now and have > not made the jump to 2.2 yet.
> Below is everything I think might help someone figure this out. Sorry > for the wall o' text; I wasn't sure how much to include.
Wasn't sure whether to make a new topic or not, but I would have named my problem the same thing.
It seems like moving chunks has stopped happening recently and I'm not sure why. I have a bunch of collections that were previously sharding fine. Recently, I upgraded all my boxes to Mongo 2.2 (I verified that the mongod's are 2.2) but I noticed that my chunks weren't getting moved very much anymore.
I grepped my mongo logs for the words "aborted" and see a lot of these messages: Thu Oct 4 22:09:12 [conn128724] about to log metadata event: { _id: "ip-10-188-20-57-2012-10-04T22:09:12-302899", server: "ip-10-188-20-57", clientAddr: "10.166.29.83:44133", time: new Date(1349388552424), what: "moveChunk.from", ns: "db_name.collection1", details: { min: { did: MinKey, version: MinKey, error_hash: MinKey }, max: { did: "0000460e-b541-3e73-92f6-45644b1a7951", version: "3.2.1", error_hash: "e035313ce9ebbec5f9effb44b4550d0a3d4ad77d4a1e796a9a08acf0" }, step1 of 6: 1, step2 of 6: 209, note: "aborted" } } Thu Oct 4 22:09:14 [conn115874] about to log metadata event: { _id: "ip-10-188-20-57-2012-10-04T22:09:14-302901", server: "ip-10-188-20-57", clientAddr: "10.166.47.149:36241", time: new Date(1349388554635), what: "moveChunk.from", ns: "db_name.collection2", details: { min: { unsymbolized_hash: "bad232c81412ae200d88cda6ca0bf7ed4cfcab6514895b061f2c5526", _id: ObjectId('4fe6e574b8d7d56f7e000064') }, max: { unsymbolized_hash: "bad232c81412ae200d88cda6ca0bf7ed4cfcab6514895b061f2c5526", _id: ObjectId('4fea3be08a43cb5505000343') }, step1 of 6: 0, step2 of 6: 242, note: "aborted" } } Thu Oct 4 22:09:16 [conn115874] about to log metadata event: { _id: "ip-10-188-20-57-2012-10-04T22:09:16-302903", server: "ip-10-188-20-57", clientAddr: "10.166.47.149:36241", time: new Date(1349388556107), what: "moveChunk.from", ns: "db_name.collection3", details: { min: { hash: "e364999fc1280323e3f1070c6322c818c2d27758049a4a3133170979", _id: ObjectId('500b46d7e230974770000054') }, max: { hash: "e364999fc1280323e3f1070c6322c818c2d27758049a4a3133170979", _id: ObjectId('500bf0351644321276000477') }, step1 of 6: 0, step2 of 6: 182, note: "aborted" } } Thu Oct 4 22:09:20 [conn115874] about to log metadata event: { _id: "ip-10-188-20-57-2012-10-04T22:09:20-302905", server: "ip-10-188-20-57", clientAddr: "10.166.47.149:36241", time: new Date(1349388560847), what: "moveChunk.from", ns: "db_name.collection1", details: { min: { did: MinKey, version: MinKey, error_hash: MinKey }, max: { did: "0000460e-b541-3e73-92f6-45644b1a7951", version: "3.2.1", error_hash: "e035313ce9ebbec5f9effb44b4550d0a3d4ad77d4a1e796a9a08acf0" }, step1 of 6: 1, step2 of 6: 3140, note: "aborted" } } Thu Oct 4 22:09:32 [conn115852] about to log metadata event: { _id: "ip-10-188-20-57-2012-10-04T22:09:32-302906", server: "ip-10-188-20-57", clientAddr: "10.166.39.153:46072", time: new Date(1349388572097), what: "moveChunk.from", ns: "db_name.collection2", details: { min: { unsymbolized_hash: "bad232c81412ae200d88cda6ca0bf7ed4cfcab6514895b061f2c5526", _id: ObjectId('4fe6e574b8d7d56f7e000064') }, max: { unsymbolized_hash: "bad232c81412ae200d88cda6ca0bf7ed4cfcab6514895b061f2c5526", _id: ObjectId('4fea3be08a43cb5505000343') }, step1 of 6: 0, note: "aborted" } } Thu Oct 4 22:09:45 [conn115852] about to log metadata event: { _id: "ip-10-188-20-57-2012-10-04T22:09:45-302908", server: "ip-10-188-20-57", clientAddr: "10.166.39.153:46072", time: new Date(1349388585515), what: "moveChunk.from", ns: "db_name.collection3", details: { min: { hash: "e364999fc1280323e3f1070c6322c818c2d27758049a4a3133170979", _id: ObjectId('500b46d7e230974770000054') }, max: { hash: "e364999fc1280323e3f1070c6322c818c2d27758049a4a3133170979", _id: ObjectId('500bf0351644321276000477') }, step1 of 6: 0, step2 of 6: 3554, note: "aborted" } }
So it looks like a bunch of the moveChunk.from commands are being aborted.
I tried the flushRouter command, and checking for stale locks, as well as making sure my mongod's are of the correct version. I apologize for not giving enough detail here, but I'm really not sure what would be useful to paste here.
Appreciate any help I can get in what I should be looking at to try to figure this out!
On Thursday, October 4, 2012 3:13:47 PM UTC-7, Kev wrote:
> Wasn't sure whether to make a new topic or not, but I would have named my > problem the same thing.
> It seems like moving chunks has stopped happening recently and I'm not > sure why. I have a bunch of collections that were previously sharding > fine. Recently, I upgraded all my boxes to Mongo 2.2 (I verified that the > mongod's are 2.2) but I noticed that my chunks weren't getting moved very > much anymore.
> I grepped my mongo logs for the words "aborted" and see a lot of these > messages: > Thu Oct 4 22:09:12 [conn128724] about to log metadata event: { _id: > "ip-10-188-20-57-2012-10-04T22:09:12-302899", server: "ip-10-188-20-57", > clientAddr: "10.166.29.83:44133", time: new Date(1349388552424), what: > "moveChunk.from", ns: "db_name.collection1", details: { min: { did: MinKey, > version: MinKey, error_hash: MinKey }, max: { did: > "0000460e-b541-3e73-92f6-45644b1a7951", version: "3.2.1", error_hash: > "e035313ce9ebbec5f9effb44b4550d0a3d4ad77d4a1e796a9a08acf0" }, step1 of 6: > 1, step2 of 6: 209, note: "aborted" } } > Thu Oct 4 22:09:14 [conn115874] about to log metadata event: { _id: > "ip-10-188-20-57-2012-10-04T22:09:14-302901", server: "ip-10-188-20-57", > clientAddr: "10.166.47.149:36241", time: new Date(1349388554635), what: > "moveChunk.from", ns: "db_name.collection2", details: { min: { > unsymbolized_hash: > "bad232c81412ae200d88cda6ca0bf7ed4cfcab6514895b061f2c5526", _id: > ObjectId('4fe6e574b8d7d56f7e000064') }, max: { unsymbolized_hash: > "bad232c81412ae200d88cda6ca0bf7ed4cfcab6514895b061f2c5526", _id: > ObjectId('4fea3be08a43cb5505000343') }, step1 of 6: 0, step2 of 6: 242, > note: "aborted" } } > Thu Oct 4 22:09:16 [conn115874] about to log metadata event: { _id: > "ip-10-188-20-57-2012-10-04T22:09:16-302903", server: "ip-10-188-20-57", > clientAddr: "10.166.47.149:36241", time: new Date(1349388556107), what: > "moveChunk.from", ns: "db_name.collection3", details: { min: { hash: > "e364999fc1280323e3f1070c6322c818c2d27758049a4a3133170979", _id: > ObjectId('500b46d7e230974770000054') }, max: { hash: > "e364999fc1280323e3f1070c6322c818c2d27758049a4a3133170979", _id: > ObjectId('500bf0351644321276000477') }, step1 of 6: 0, step2 of 6: 182, > note: "aborted" } } > Thu Oct 4 22:09:20 [conn115874] about to log metadata event: { _id: > "ip-10-188-20-57-2012-10-04T22:09:20-302905", server: "ip-10-188-20-57", > clientAddr: "10.166.47.149:36241", time: new Date(1349388560847), what: > "moveChunk.from", ns: "db_name.collection1", details: { min: { did: MinKey, > version: MinKey, error_hash: MinKey }, max: { did: > "0000460e-b541-3e73-92f6-45644b1a7951", version: "3.2.1", error_hash: > "e035313ce9ebbec5f9effb44b4550d0a3d4ad77d4a1e796a9a08acf0" }, step1 of 6: > 1, step2 of 6: 3140, note: "aborted" } } > Thu Oct 4 22:09:32 [conn115852] about to log metadata event: { _id: > "ip-10-188-20-57-2012-10-04T22:09:32-302906", server: "ip-10-188-20-57", > clientAddr: "10.166.39.153:46072", time: new Date(1349388572097), what: > "moveChunk.from", ns: "db_name.collection2", details: { min: { > unsymbolized_hash: > "bad232c81412ae200d88cda6ca0bf7ed4cfcab6514895b061f2c5526", _id: > ObjectId('4fe6e574b8d7d56f7e000064') }, max: { unsymbolized_hash: > "bad232c81412ae200d88cda6ca0bf7ed4cfcab6514895b061f2c5526", _id: > ObjectId('4fea3be08a43cb5505000343') }, step1 of 6: 0, note: "aborted" } } > Thu Oct 4 22:09:45 [conn115852] about to log metadata event: { _id: > "ip-10-188-20-57-2012-10-04T22:09:45-302908", server: "ip-10-188-20-57", > clientAddr: "10.166.39.153:46072", time: new Date(1349388585515), what: > "moveChunk.from", ns: "db_name.collection3", details: { min: { hash: > "e364999fc1280323e3f1070c6322c818c2d27758049a4a3133170979", _id: > ObjectId('500b46d7e230974770000054') }, max: { hash: > "e364999fc1280323e3f1070c6322c818c2d27758049a4a3133170979", _id: > ObjectId('500bf0351644321276000477') }, step1 of 6: 0, step2 of 6: 3554, > note: "aborted" } }
> So it looks like a bunch of the moveChunk.from commands are being aborted.
> I tried the flushRouter command, and checking for stale locks, as well as > making sure my mongod's are of the correct version. I apologize for not > giving enough detail here, but I'm really not sure what would be useful to > paste here.
> Appreciate any help I can get in what I should be looking at to try to > figure this out!
In terms of how I checked for the migrations. I basically added a new collection within the past few days. I noticed it wasn't sharding right away but I figured it might take some time to move the data. Checking db.printShardingStatus() a few days later revealed that while I had chunks, they weren't moving properly:
Fri Oct 5 04:29:32 [Balancer] moving chunk ns: crittercism.handled_exception_log moving ( ns:crittercism.handled_exception_log at: crittercism10:crittercism10/tomato-3.crittercism.com:27018,tomato-4.critter cism.com:27018 lastmod: 7215|1||000000000000000000000000 min: { unsymbolized_hash: "225440a5b53f8b748b036fd9f739d05b96b32cc5143d992eb24b7aa9", _id: ObjectId('5049340ccd2b6e79a6000298') } max: { unsymbolized_hash: "225440a5b53f8b748b036fd9f739d05b96b32cc5143d992eb24b7aa9", _id: ObjectId('50493b1bdcd8b022d600009e') }) crittercism10:crittercism10/tomato-3.crittercism.com:27018,tomato-4.critter cism.com:27018 -> crittercism4:crittercism4/grape-3.crittercism.com:27018,grape-4.crittercism .com:27018 Fri Oct 5 04:29:33 [Balancer] moveChunk result: { cause: { errmsg: "migrate already in progress", ok: 0.0 }, errmsg: "moveChunk failed to engage TO-shard in the data transfer: migrate already in progress", ok: 0.0
}
Fri Oct 5 04:29:33 [Balancer] balancer move failed: { cause: { errmsg: "migrate already in progress", ok: 0.0 }, errmsg: "moveChunk failed to engage TO-shard in the data transfer: migrate already in progress", ok: 0.0
} from: crittercism10 to: crittercism4 chunk: min: { unsymbolized_hash:
Fri Oct 5 04:29:33 [Balancer] balancer move failed: { cause: { errmsg: "migrate already in progress", ok: 0.0 }, errmsg: "moveChunk failed to engage TO-shard in the data transfer: migrate already in progress", ok: 0.0
} from: crittercism to: crittercism1 chunk: min: { hashed_device_id:
MinKey, app_version: MinKey, error_hash: MinKey } max: { hashed_device_id: MinKey, app_version: MinKey, error_hash: MinKey } Fri Oct 5 04:29:33 [Balancer] distributed lock 'balancer/ip-10-161-6-230:27017:1348603845:1804289383' unlocked. Fri Oct 5 04:33:12 [LockPinger] cluster silva.crittercism.com:27019,kompany.crittercism.com:27019,balotelli.critter cism.com:27019 pinged successfully at Fri Oct 5 04:33:12 2012 by distributed lock pinger 'silva.crittercism.com:27019,kompany.crittercism.com:27019,balotelli.critte rcism.com:27019/ip-10-161-6-230:27017:1348603845:1804289383', sleeping for 30000ms Fri Oct 5 04:37:42 [PeriodicTask::Runner] task: DBConnectionPool-cleaner took: 29ms Fri Oct 5 04:38:17 [LockPinger] cluster silva.crittercism.com:27019,kompany.crittercism.com:27019,balotelli.critter cism.com:27019 pinged successfully at Fri Oct 5 04:38:16 2012 by distributed lock pinger 'silva.crittercism.com:27019,kompany.crittercism.com:27019,balotelli.critte rcism.com:27019/ip-10-161-6-230:27017:1348603845:1804289383', sleeping for 30000ms
That does seem a bit more informative, though I'm still not quite sure how to solve.
Hey David, Doesn't look like migration has started again. I'll try the balancer turn off for a longer period of time later tonight, but should I expect to have to wait an hour for it to clear?
Mongos.log output Tue Oct 9 00:49:56 [LockPinger] cluster silva.crittercism.com:27019,kompany.crittercism.com:27019,balotelli.critter cism.com:27019 pinged successfully at Tue Oct 9 00:49:56 2012 by distributed lock pinger 'silva.crittercism.com:27019,kompany.crittercism.com:27019,balotelli.critte rcism.com:27019/ip-10-174-183-254:27017:1349418400:1804289383', sleeping for 30000ms Tue Oct 9 00:50:53 [conn7726] end connection 127.0.0.1:36687 (6 connections now open) Tue Oct 9 00:51:16 [conn7727] end connection 127.0.0.1:37396 (5 connections now open) Tue Oct 9 00:51:19 [conn7721] end connection 127.0.0.1:55807 (4 connections now open) Tue Oct 9 00:51:20 [mongosMain] connection accepted from 127.0.0.1:43635 #7729 (5 connections now open) Tue Oct 9 00:51:43 [mongosMain] connection accepted from 127.0.0.1:44458 #7730 (6 connections now open) Tue Oct 9 00:51:57 [conn7728] end connection 127.0.0.1:40274 (5 connections now open) Tue Oct 9 00:51:57 [mongosMain] connection accepted from 127.0.0.1:45427 #7731 (6 connections now open) Tue Oct 9 00:52:54 [Balancer] warning: distributed lock 'balancer/ip-10-174-183-254:27017:1349418400:1804289383 did not propagate properly. :: caused by :: 8017 update not consistent ns: config.locks query: { _id: "balancer", state: 0, ts: ObjectId('507375615b5aac7653e146e7') } update: { $set: { state: 1, who: "ip-10-174-183-254:27017:1349418400:1804289383:Balancer:1793813364", process: "ip-10-174-183-254:27017:1349418400:1804289383", when: new Date(1349743974432), why: "doing balance round", ts: ObjectId('507375660fa8ef3a64f0699e') } } gle1: { updatedExisting: true, n: 1, connectionId: 163862, waited: 2, err: null, ok: 1.0 } gle2: { updatedExisting: false, n: 0, connectionId: 213070, waited: 20, err: null, ok: 1.0 } Tue Oct 9 00:52:54 [Balancer] lock update lost, lock 'balancer/ip-10-174-183-254:27017:1349418400:1804289383' not propagated. Tue Oct 9 00:52:54 [Balancer] distributed lock 'balancer/ip-10-174-183-254:27017:1349418400:1804289383' was not acquired. Tue Oct 9 00:53:00 [LockPinger] trying to delete 1 old lock entries for process ip-10-174-183-254:27017:1349418400:1804289383 Tue Oct 9 00:53:00 [LockPinger] handled late remove of old distributed lock with ts 507375660fa8ef3a64f0699e Tue Oct 9 00:53:35 [conn7730] end connection 127.0.0.1:44458 (5 connections now open) Tue Oct 9 00:53:40 [mongosMain] connection accepted from 127.0.0.1:49666 #7732 (6 connections now open) Tue Oct 9 00:53:52 [conn7729] end connection 127.0.0.1:43635 (5 connections now open) Tue Oct 9 00:54:01 [mongosMain] connection accepted from 127.0.0.1:50570 #7733 (6 connections now open) Tue Oct 9 00:54:26 [conn7731] end connection 127.0.0.1:45427 (5 connections now open) Tue Oct 9 00:54:31 [mongosMain] connection accepted from 127.0.0.1:51793 #7734 (6 connections now open) Tue Oct 9 00:54:55 [mongosMain] connection accepted from 127.0.0.1:52447 #7735 (7 connections now open) Tue Oct 9 00:55:02 [LockPinger] cluster silva.crittercism.com:27019,kompany.crittercism.com:27019,balotelli.critter cism.com:27019 pinged successfully at Tue Oct 9 00:55:02 2012 by distributed lock pinger 'silva.crittercism.com:27019,kompany.crittercism.com:27019,balotelli.critte rcism.com:27019/ip-10-174-183-254:27017:1349418400:1804289383', sleeping for 30000ms Tue Oct 9 00:56:11 [conn7732] end connection 127.0.0.1:49666 (6 connections now open) Tue Oct 9 00:56:30 [mongosMain] connection accepted from 127.0.0.1:56323 #7736 (7 connections now open) Tue Oct 9 00:56:59 [conn7734] end connection 127.0.0.1:51793 (6 connections now open) Tue Oct 9 00:57:00 [mongosMain] connection accepted from 127.0.0.1:57791 #7737 (7 connections now open) Tue Oct 9 00:58:02 [conn7735] end connection 127.0.0.1:52447 (6 connections now open) Tue Oct 9 00:58:06 [mongosMain] connection accepted from 127.0.0.1:60607 #7738 (7 connections now open) Tue Oct 9 00:58:49 [conn7736] end connection 127.0.0.1:56323 (6 connections now open) Tue Oct 9 00:58:58 [mongosMain] connection accepted from 127.0.0.1:34244 #7739 (7 connections now open) Tue Oct 9 00:59:04 [conn7737] end connection 127.0.0.1:57791 (6 connections now open) Tue Oct 9 00:59:04 [conn7733] end connection 127.0.0.1:50570 (5 connections now open) Tue Oct 9 00:59:04 [conn7738] end connection 127.0.0.1:60607 (4 connections now open) Tue Oct 9 00:59:04 [conn7739] end connection 127.0.0.1:34244 (3 connections now open) Tue Oct 9 00:59:04 [conn7717] end connection 127.0.0.1:49790 (2 connections now open) Tue Oct 9 00:59:06 [mongosMain] connection accepted from 127.0.0.1:34583 #7740 (3 connections now open)
I tried messing around again with restarting the balancer so I did the following: 1. sh.setBalancerState(false) 2. Waited about 15 min 3. Restarted every mongos 4. Waited another 15 min 5. sh.setBalancerState(true)
The logs look a little different, though still nothing migrating.
Tue Oct 9 18:19:20 [conn28] end connection 127.0.0.1:32917 (5 connections now open) Tue Oct 9 18:19:52 [mongosMain] connection accepted from 127.0.0.1:39823 #31 (6 connections now open) Tue Oct 9 18:20:24 [Balancer] distributed lock 'balancer/ip-10-161-7-17:27017:1349806072:1144614846' acquired, ts : 50746ae70f3bc9ac81018ad5 Tue Oct 9 18:20:26 [conn29] resetting shard version of crittercism.app on acai-3.crittercism.com:27018, version is zero Tue Oct 9 18:20:26 [conn29] resetting shard version of crittercism.app on blueberry-3.crittercism.com:27018, version is zero Tue Oct 9 18:20:26 [conn29] resetting shard version of crittercism.app on citron-3.crittercism.com:27018, version is zero Tue Oct 9 18:20:26 [conn29] resetting shard version of crittercism.app on tomato-3.crittercism.com:27018, version is zero Tue Oct 9 18:20:26 [conn29] resetting shard version of crittercism.app on olive-3.crittercism.com:27018, version is zero Tue Oct 9 18:20:26 [conn29] resetting shard version of crittercism.app on mango-3.crittercism.com:27018, version is zero Tue Oct 9 18:20:26 [Balancer] ns: crittercism.rawcrashlog going to move { _id: "crittercism.rawcrashlog-unsymbolized_hash_"bad232c81412ae200d88cda6ca0bf7e d4cfcab6514895b061f2c5526"_id_ObjectId('503ad114f2318d1995000297')", lastmod: Timestamp 28866000|1, lastmodEpoch: ObjectId('000000000000000000000000'), ns: "crittercism.rawcrashlog", min: { unsymbolized_hash: "bad232c81412ae200d88cda6ca0bf7ed4cfcab6514895b061f2c5526", _id: ObjectId('503ad114f2318d1995000297') }, max: { unsymbolized_hash: "bad232c81412ae200d88cda6ca0bf7ed4cfcab6514895b061f2c5526", _id: ObjectId('503e047d209996162c0002b0') }, shard: "crittercism" } from: crittercism to: crittercism4 tag [] Tue Oct 9 18:20:26 [Balancer] ns: crittercism.symbolicatedcrashlog going to move { _id: "crittercism.symbolicatedcrashlog-unsymbolized_hash_"0a52c208a1c2320bdf179b 7f2d4352a71b39fb9f4f9ac5111ca1bc7f"", lastmod: Timestamp 573000|1, ns: "crittercism.symbolicatedcrashlog", min: { unsymbolized_hash: "0a52c208a1c2320bdf179b7f2d4352a71b39fb9f4f9ac5111ca1bc7f" }, max: { unsymbolized_hash: "0a9f8f71b18aa8aa073b9af9bc220a67dfec2b9614c3f0afc026cdc4" }, shard: "crittercism4" } from: crittercism4 to: crittercism10 tag [] Tue Oct 9 18:20:27 [Balancer] ns: crittercism.crash_log going to move { _id: "crittercism.crash_log-hash_"e364999fc1280323e3f1070c6322c818c2d27758049a4a 3133170979"_id_ObjectId('500b46d7e230974770000054')", lastmod: Timestamp 2400000|1, lastmodEpoch: ObjectId('000000000000000000000000'), ns: "crittercism.crash_log", min: { hash: "e364999fc1280323e3f1070c6322c818c2d27758049a4a3133170979", _id: ObjectId('500b46d7e230974770000054') }, max: { hash: "e364999fc1280323e3f1070c6322c818c2d27758049a4a3133170979", _id: ObjectId('500bf0351644321276000477') }, shard: "crittercism" } from: crittercism to: crittercism9 tag [] Tue Oct 9 18:20:27 [Balancer] ns: crittercism.crash_analytics going to move { _id: "crittercism.crash_analytics-day_id_ObjectId('4f779a800000000000000000')has h_"ffb66412df4cf23d23a51aa4340911861e8da00875464b5df2e79ece"", lastmod: Timestamp 162000|1, lastmodEpoch: ObjectId('000000000000000000000000'), ns: "crittercism.crash_analytics", min: { day_id: ObjectId('4f779a800000000000000000'), hash: "ffb66412df4cf23d23a51aa4340911861e8da00875464b5df2e79ece" }, max: { day_id: ObjectId('4f7a3d800000000000000000'), hash: "6fdccd2109ad06bb6c162e42b1d17f2d995f905602ecbabf831624fa" }, shard: "crittercism7" } from: crittercism7 to: crittercism9 tag [] Tue Oct 9 18:20:27 [Balancer] ns: crittercism.handled_exception_log going to move { _id: "crittercism.handled_exception_log-unsymbolized_hash_"225440a5b53f8b748b036 fd9f739d05b96b32cc5143d992eb24b7aa9"_id_ObjectId('5049340ccd2b6e79a6000298' )", lastmod: Timestamp 7215000|1, lastmodEpoch: ObjectId('000000000000000000000000'), ns: "crittercism.handled_exception_log", min: { unsymbolized_hash: "225440a5b53f8b748b036fd9f739d05b96b32cc5143d992eb24b7aa9", _id: ObjectId('5049340ccd2b6e79a6000298') }, max: { unsymbolized_hash: "225440a5b53f8b748b036fd9f739d05b96b32cc5143d992eb24b7aa9", _id: ObjectId('50493b1bdcd8b022d600009e') }, shard: "crittercism10" } from: crittercism10 to: crittercism4 tag [] Tue Oct 9 18:20:27 [Balancer] ns: crittercism.user_crash going to move { _id: "crittercism.user_crash-hashed_device_id_MinKeyapp_version_MinKeyerror_hash _MinKey", lastmod: Timestamp 2000|12, lastmodEpoch: ObjectId('506a1657d5dc72ee5f03460c'), ns: "crittercism.user_crash", min: { hashed_device_id: MinKey, app_version: MinKey, error_hash: MinKey }, max: { hashed_device_id: "0000460e-b541-3e73-92f6-45644b1a7951", app_version: "3.2.1", error_hash: "e035313ce9ebbec5f9effb44b4550d0a3d4ad77d4a1e796a9a08acf0" }, shard: "crittercism" } from: crittercism to: crittercism1 tag [] Tue Oct 9 18:20:27 [Balancer] moving chunk ns: crittercism.rawcrashlog moving ( ns:crittercism.rawcrashlog at: crittercism:crittercism/apple-2.crittercism.com:27018,apple-3.crittercism.c om:27018,apple-4.crittercism.com:27018 lastmod: 28866|1||000000000000000000000000 min: { unsymbolized_hash: "bad232c81412ae200d88cda6ca0bf7ed4cfcab6514895b061f2c5526", _id: ObjectId('503ad114f2318d1995000297') } max: { unsymbolized_hash: "bad232c81412ae200d88cda6ca0bf7ed4cfcab6514895b061f2c5526", _id: ObjectId('503e047d209996162c0002b0') }) crittercism:crittercism/apple-2.crittercism.com:27018,apple-3.crittercism.c om:27018,apple-4.crittercism.com:27018 -> crittercism4:crittercism4/grape-2.crittercism.com:27018,grape-3.crittercism .com:27018,grape-4.crittercism.com:27018 Tue Oct 9 18:20:29 [Balancer] moveChunk result: { cause: { errmsg: "migrate already in progress", ok: 0.0 }, errmsg: "moveChunk failed to engage TO-shard in the data transfer: migrate already in progress", ok: 0.0
}
Tue Oct 9 18:20:30 [Balancer] ChunkManager: time to load chunks for crittercism.rawcrashlog: 684ms sequenceNumber: 18 version: 28866|2801||000000000000000000000000 based on: 28866|2797||000000000000000000000000 Tue Oct 9 18:20:30 [Balancer] Assertion failure n == a src/mongo/s/shard.h 105 0x552271 0x5fcf6d 0x6d0a79 0x6e5b88 0x5ae60f 0x6cf4f6 0x6e09ac 0x807c15 0x80fd76 0x6b5f3e 0x6b75c4 0x61e189 0x7fefc6a72efc 0x7fefc5e03f4d /usr/local/mongodb/bin/mongos(_ZN5mongo15printStackTraceERSo+0x21) [0x552271] /usr/local/mongodb/bin/mongos(_ZN5mongo12verifyFailedEPKcS1_j+0xfd) [0x5fcf6d] /usr/local/mongodb/bin/mongos(_ZN5mongo17ChunkRangeManager12_insertRangeESt 23_Rb_tree_const_iteratorISt4pairIKNS_7BSONObjEN5boost10shared_ptrIKNS_5Chu nkEEEEESB_+0x539) [0x6d0a79] /usr/local/mongodb/bin/mongos(_ZN5mongo12ChunkManager18loadExistingRangesER KSs+0x788) [0x6e5b88] /usr/local/mongodb/bin/mongos(_ZN5mongo8DBConfig15getChunkManagerERKSsbb+0x 4ef) [0x5ae60f] /usr/local/mongodb/bin/mongos(_ZNK5mongo12ChunkManager6reloadEb+0x96) [0x6cf4f6] /usr/local/mongodb/bin/mongos(_ZNK5mongo5Chunk13moveAndCommitERKNS_5ShardEx bRNS_7BSONObjE+0xf2c) [0x6e09ac] /usr/local/mongodb/bin/mongos(_ZN5mongo8Balancer11_moveChunksEPKSt6vectorIN 5boost10shared_ptrINS_11MigrateInfoEEESaIS5_EEb+0x415) [0x807c15] /usr/local/mongodb/bin/mongos(_ZN5mongo8Balancer3runEv+0x976) [0x80fd76] /usr/local/mongodb/bin/mongos(_ZN5mongo13BackgroundJob7jobBodyEN5boost10sha red_ptrINS0_9JobStatusEEE+0xbe) [0x6b5f3e] /usr/local/mongodb/bin/mongos(_ZN5boost6detail11thread_dataINS_3_bi6bind_tI vNS_4_mfi3mf1IvN5mongo13BackgroundJobENS_10shared_ptrINS7_9JobStatusEEEEENS 2_5list2INS2_5valueIPS7_EENSD_ISA_EEEEEEE3runEv+0x74) [0x6b75c4] /usr/local/mongodb/bin/mongos() [0x61e189] /lib/x86_64-linux-gnu/libpthread.so.0(+0x7efc) [0x7fefc6a72efc] /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7fefc5e03f4d] Tue Oct 9 18:20:31 [Balancer] distributed lock 'balancer/ip-10-161-7-17:27017:1349806072:1144614846' unlocked. Tue Oct 9 18:20:31 [Balancer] scoped connection to silva.crittercism.com:27019,kompany.crittercism.com:27019,balotelli.critter cism.com:27019 not being returned to the pool Tue Oct 9 18:20:31 [Balancer] caught exception while doing balance: assertion src/mongo/s/shard.h:105 Tue Oct 9 18:20:37 [Balancer] SyncClusterConnection connecting to [silva.crittercism.com:27019] Tue Oct 9 18:20:37 [Balancer] SyncClusterConnection connecting to [kompany.crittercism.com:27019] Tue Oct 9 18:20:37 [Balancer] SyncClusterConnection connecting to [balotelli.crittercism.com:27019]
On Monday, October 8, 2012 6:01:02 PM UTC-7, Kev wrote:
> Hey David, > Doesn't look like migration has started again. I'll try the balancer > turn off for a longer period of time later tonight, but should I expect to > have to wait an hour for it to clear?
> Mongos.log output > Tue Oct 9 00:49:56 [LockPinger] cluster silva.crittercism.com:27019, > kompany.crittercism.com:27019,balotelli.crittercism.com:27019 pinged > successfully at Tue Oct 9 00:49:56 2012 by distributed lock pinger ' > silva.crittercism.com:27019,kompany.crittercism.com:27019, > balotelli.crittercism.com:27019/ip-10-174-183-254:27017:1349418400:18042893 83', > sleeping for 30000ms > Tue Oct 9 00:50:53 [conn7726] end connection 127.0.0.1:36687 (6 > connections now open) > Tue Oct 9 00:51:16 [conn7727] end connection 127.0.0.1:37396 (5 > connections now open) > Tue Oct 9 00:51:19 [conn7721] end connection 127.0.0.1:55807 (4 > connections now open) > Tue Oct 9 00:51:20 [mongosMain] connection accepted from 127.0.0.1:43635#7729 (5 connections now open) > Tue Oct 9 00:51:43 [mongosMain] connection accepted from 127.0.0.1:44458#7730 (6 connections now open) > Tue Oct 9 00:51:57 [conn7728] end connection 127.0.0.1:40274 (5 > connections now open) > Tue Oct 9