Account Options

  1. Sign in
The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Google Groups Home
« Groups Home
Balancing does not work
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  18 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Daniel Schlegel  
View profile  
 More options Aug 30 2012, 6:20 am
From: Daniel Schlegel <firmas...@gmail.com>
Date: Thu, 30 Aug 2012 03:20:08 -0700 (PDT)
Local: Thurs, Aug 30 2012 6:20 am
Subject: Balancing does not work

Hello
i've upgraded to mongo 2.2.0 yesterday. all went like charme and i could
shard my collection.
In the documentation i read:
Both splits and migrates are performed automatically.

But unfortunately this doesn't work on our setup.
I get error messages like these:

Thu Aug 30 12:16:42 [Balancer]  ns: production.people going to move { _id:
"production.people-_id_MinKey", lastmod: Timestamp 1000|0, lastmodEpoch:
ObjectId('503e9d5ef940d75c2de07f8e'), ns: "production.people", min: { _id:
MinKey }, max: { _id: 304836 }, shard: "s1" } from: s1 to: s2 tag []

Thu Aug 30 12:16:42 [Balancer] moving chunk ns: production.people moving (
ns:production.people at:
s1:s1/mongo11.example.com:20011,mongo12.example.com:20012 lastmod:
1|0||000000000000000000000000 min: { _id: MinKey } max: { _id: 304836 })
s1:s1/mongo11.example.com:20011,mongo12.example.com:20012 ->
s2:s2/mongo21.example.com:20021,mongo22.example.com:20022

Thu Aug 30 12:16:43 [Balancer] moveChunk result: { cause: { errmsg:
"migrate already in progress", ok: 0.0 }, errmsg: "moveChunk failed to
engage TO-shard in the data transfer: migrate already in progress", ok: 0.0

}

Thu Aug 30 12:16:43 [Balancer] balancer move failed: { cause: { errmsg:
"migrate already in progress", ok: 0.0 }, errmsg: "moveChunk failed to
engage TO-shard in the data transfer: migrate already in progress", ok: 0.0

} from: s1 to: s2 chunk:  min: { _id: MinKey } max: { _id: MinKey }

After 10h of waiting there is still all data on the first shard and the
second is empty.
I think about manually splitting but as it's not so easy to find the right
splitpoint in our database i've not done this yet.

Thanks for your Help
Dani


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Adam C  
View profile  
 More options Aug 30 2012, 6:32 am
From: Adam C <ad...@10gen.com>
Date: Thu, 30 Aug 2012 03:32:06 -0700 (PDT)
Local: Thurs, Aug 30 2012 6:32 am
Subject: Re: Balancing does not work

Dani,

You might have a stale lock lying around, or the mongos may have a stale
view of things - can you do a couple of things for me:

1. Bounce (restart) all of your mongos
2. Once the bounce is complete, log into the mongos and run:

use config;
db.locks.find();

And post the results here.

Thanks,

Adam


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Daniel Schlegel  
View profile  
 More options Aug 30 2012, 7:37 am
From: Daniel Schlegel <firmas...@gmail.com>
Date: Thu, 30 Aug 2012 04:37:02 -0700 (PDT)
Local: Thurs, Aug 30 2012 7:37 am
Subject: Re: Balancing does not work

Hi Adam
Here's the output:

mongos> db.locks.find();

{ "_id" : "admin-movePrimary", "process" :
"Web1:27069:1343571761:1804289383", "state" : 0, "ts" :
ObjectId("501555d74439248d85dc8867"), "when" :
ISODate("2012-07-29T15:25:11.099Z"), "who" :
"Web1:27069:1343571761:1804289383:conn122:1714636915", "why" : "Moving
primary shard of admin" }

{ "_id" : "example_production-movePrimary", "process" :
"Web1:27069:1343571761:1804289383", "state" : 0, "ts" :
ObjectId("501553614439248d85dc885a"), "when" :
ISODate("2012-07-29T15:14:41.616Z"), "who" :
"Web1:27069:1343571761:1804289383:conn1:1681692777", "why" : "Moving
primary shard of example_production" }

{ "_id" : "example_production_vanity-movePrimary", "process" :
"Web1:27069:1343571761:1804289383", "state" : 0, "ts" :
ObjectId("501552fb4439248d85dc8855"), "when" :
ISODate("2012-07-29T15:12:59.598Z"), "who" :
"Web1:27069:1343571761:1804289383:conn1:1681692777", "why" : "Moving
primary shard of example_production_vanity" }

{ "_id" : "balancer", "process" : "web1:27069:1346283357:314909341",
"state" : 2, "ts" : ObjectId("503f4f2a3c113ffbd8e4a7e9"), "when" :
ISODate("2012-08-30T11:31:54.320Z"), "who" :
"web1:27069:1346283357:314909341:Balancer:1842493053", "why" : "doing
balance round" }

{ "_id" : "example_production.people", "process" :
"mongo11:20011:1346282264:758785138", "state" : 0, "ts" :
ObjectId("503f4f2ce69a6c2009e22331"), "when" :
ISODate("2012-08-30T11:31:56.182Z"), "who" :
"mongo11:20011:1346282264:758785138:conn37208:1670912857", "why" :
"migrate-{ _id: MinKey }" }

{ "_id" : "example_production.new_coll", "process" :
"web1:27069:1346283357:314909341", "state" : 0, "ts" :
ObjectId("503f36eb3c113ffbd8e4a6a1"), "when" :
ISODate("2012-08-30T09:48:27.208Z"), "who" :
"web1:27069:1346283357:314909341:conn37665:149759223", "why" : "drop" }

{ "_id" : "example_production_vanity.metrics", "process" :
"web1:27069:1346283357:314909341", "state" : 0, "ts" :
ObjectId("503f37833c113ffbd8e4a6aa"), "when" :
ISODate("2012-08-30T09:50:59.474Z"), "who" :
"web1:27069:1346283357:314909341:conn37665:149759223", "why" : "drop" }

Thanks!

Dani

Am Donnerstag, 30. August 2012 12:32:06 UTC+2 schrieb Adam C:


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Daniel Schlegel  
View profile  
 More options Aug 30 2012, 7:37 am
From: Daniel Schlegel <firmas...@gmail.com>
Date: Thu, 30 Aug 2012 04:37:57 -0700 (PDT)
Local: Thurs, Aug 30 2012 7:37 am
Subject: Re: Balancing does not work

example_production.people is the important collection.

Am Donnerstag, 30. August 2012 13:37:02 UTC+2 schrieb Daniel Schlegel:


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Adam C  
View profile  
 More options Aug 30 2012, 7:48 am
From: Adam C <ad...@10gen.com>
Date: Thu, 30 Aug 2012 04:48:51 -0700 (PDT)
Local: Thurs, Aug 30 2012 7:48 am
Subject: Re: Balancing does not work

You have several movePrimary locks dating back to last month, including the
admin database, which should *always* be on "config" - did you drain a
shard, or use movePrimary for some other reason?

Can you also post the output of:

use config;
db.databases.find();

Adam.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Daniel Schlegel  
View profile  
 More options Aug 30 2012, 8:17 am
From: Daniel Schlegel <firmas...@gmail.com>
Date: Thu, 30 Aug 2012 05:17:15 -0700 (PDT)
Local: Thurs, Aug 30 2012 8:17 am
Subject: Re: Balancing does not work

Yes i did drain a shard because i had a problem removing a replica:
1. I was not able to remove a replica but had to move the server.
2. So ive added a new shard, drained the old one and remove the old shard.

Heres the discussion about it:
https://groups.google.com/forum/?fromgroups=#!searchin/mongodb-user/r...

mongos> db.databases.find();
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "silp_production", "partitioned" : true, "primary" : "s1" }

Thank you so much
Dani

Am Donnerstag, 30. August 2012 13:48:51 UTC+2 schrieb Adam C:


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Adam C  
View profile  
 More options Aug 30 2012, 10:36 am
From: Adam C <ad...@10gen.com>
Date: Thu, 30 Aug 2012 07:36:23 -0700 (PDT)
Local: Thurs, Aug 30 2012 10:36 am
Subject: Re: Balancing does not work

Well, let's clear up the old entries first, then we can bounce the mongos
and see if that helps things.  It may help if you restart the mongod's too
- clear out anything stale on their side.  Each primary can only take part
in a single migration at a time, so if they believe they are actually doing
a migration that may be part of the problem too.

To remove the stale entries:

use config;
db.locks.remove({"_id" : "admin-movePrimary"});
db.locks.remove({"_id" : "example_production-movePrimary"});
db.locks.remove({"_id" : "example_production_vanity-movePrimary"});

Then bounce the mongos, see if the migrations kick off.  If not, bounce the
mongod's also to clear out anything stale there.

Adam


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Daniel Schlegel  
View profile  
 More options Aug 30 2012, 11:26 am
From: Daniel Schlegel <firmas...@gmail.com>
Date: Thu, 30 Aug 2012 08:26:36 -0700 (PDT)
Local: Thurs, Aug 30 2012 11:26 am
Subject: Re: Balancing does not work

Hi Adam
i removed all the entries and restartet both, mongos and mongod.
The Problem is still here:

Thu Aug 30 17:18:34 [conn502] resetting shard version of
example_production.people on mongo22.example.com:20022, version is zero
Thu Aug 30 17:18:34 [conn503] resetting shard version of
example_production.people on mongo22.example.com:20022, version is zero
...

Thu Aug 30 17:18:41 [Balancer]  ns: example_production.people going to move
{ _id: "example_production.people-_id_MinKey", lastmod: Timestamp 1000|0,
lastmodEpoch: ObjectId('503e9d5ef940d75c2de07f8e'), ns:
"example_production.people", min: { _id: MinKey }, max: { _id: 304836 },
shard: "s1" } from: s1 to: s2 tag []
Thu Aug 30 17:18:41 [Balancer] moving chunk ns: example_production.people
moving ( ns:example_production.people at:
s1:s1/mongo11.example.com:20011,mongo12.example.com:20012 lastmod:
1|0||000000000000000000000000 min: { _id: MinKey } max: { _id: 304836 })
s1:s1/mongo11.example.com:20011,mongo12.example.com:20012 ->
s2:s2/mongo21.example.com:20021,mongo22.example.com:20022
...
Thu Aug 30 17:18:44 [Balancer] moveChunk result: { cause: { errmsg:
"migrate already in progress", ok: 0.0 }, errmsg: "moveChunk failed to
engage TO-shard in the data transfer: migrate already in progress", ok: 0.0

}

Thu Aug 30 17:18:44 [Balancer] balancer move failed: { cause: { errmsg:
"migrate already in progress", ok: 0.0 }, errmsg: "moveChunk failed to
engage TO-shard in the data transfer: migrate already in progress", ok: 0.0

} from: s1 to: s2 chunk:  min: { _id: MinKey } max: { _id: MinKey }

mongos> sh.status()
--- Sharding Status ---
  sharding version: { "_id" : 1, "version" : 3 }
  shards:
{  "_id" : "s1",  "host" :
"s1/mongo11.example.com:20011,mongo12.example.com:20012" }
{  "_id" : "s2",  "host" :
"s2/mongo21.example.com:20021,mongo22.example.com:20022" }
  databases:
{  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
{  "_id" : "example_production",  "partitioned" : true,  "primary" : "s1" }
example_production.people chunks:
s1 3079
too many chunks to print, use verbose if you want to force print

these locks appeared again(in think this is like it should be):

mongos> db.locks.find();
{ "_id" : "example_production.people", "process" :
"mongo11:20011:1346339905:604514035", "state" : 2, "ts" :
ObjectId("503f85a8b61cd58923b1fcea"), "when" :
ISODate("2012-08-30T15:24:24.524Z"), "who" :
"mongo11:20011:1346339905:604514035:conn46:460093839", "why" : "migrate-{
_id: MinKey }" }
{ "_id" : "balancer", "process" : "worker:27069:1346339673:1804289383",
"state" : 2, "ts" : ObjectId("503f85a740d7cd22f64415dc"), "when" :
ISODate("2012-08-30T15:24:23.583Z"), "who" :
"worker:27069:1346339673:1804289383:Balancer:846930886", "why" : "doing
balance round" }

i really appreciate <http://www.dict.cc/englisch-deutsch/appreciate.html> your
help, i'm pretty stuck with my current know how.
Dani

Am Donnerstag, 30. August 2012 16:36:23 UTC+2 schrieb Adam C:

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Adam C  
View profile  
 More options Aug 30 2012, 11:43 am
From: Adam C <ad...@10gen.com>
Date: Thu, 30 Aug 2012 08:43:50 -0700 (PDT)
Local: Thurs, Aug 30 2012 11:43 am
Subject: Re: Balancing does not work

What it is saying is that the shard it is attempting to migrate to, namely
the primary from:

s2/mongo21.example.com:20021,mongo22.example.com:20022

Thinks it is already taking part in a migration (as I mentioned, it can
only take part in one at a time, hence the error).  Are you sure those
mongod instances were restarted?

Can you show me rs.status() from that s2 replica set, and a db.currentOp()
from the primary?

In addition - are there any migration messages (or other errors) visible in
the logs for the primary for that set?

Adam

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Daniel Schlegel  
View profile  
 More options Aug 30 2012, 12:04 pm
From: Daniel Schlegel <firmas...@gmail.com>
Date: Thu, 30 Aug 2012 09:04:51 -0700 (PDT)
Local: Thurs, Aug 30 2012 12:04 pm
Subject: Re: Balancing does not work

Im pretty sure i restarted all the mongod's in our system. I made it again
for the s2 mongod's.
Not sure if it was exactly on point of restarting the mongod's but the
exception seam to have disappeared.
Is it possible, that i had to bounce first the primary of s1 and then the
primary of s2? I did it in reverse order.

Here are the outputs:
s2:PRIMARY> rs.status()
{
"set" : "s2",
"date" : ISODate("2012-08-30T15:52:40Z"),
"myState" : 1,
"members" : [
{
"_id" : 0,
"name" : "mongo21.example.com:20021",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 57,
"optime" : Timestamp(1346341931000, 3),
"optimeDate" : ISODate("2012-08-30T15:52:11Z"),
"lastHeartbeat" : ISODate("2012-08-30T15:52:40Z"),
"pingMs" : 0,
"errmsg" : "syncing to: mongo22.example.com:20022"

},

{
"_id" : 1,
"name" : "arbiter23.example.com:20023",
"health" : 1,
"state" : 7,
"stateStr" : "ARBITER",
"uptime" : 57,
"lastHeartbeat" : ISODate("2012-08-30T15:52:40Z"),
"pingMs" : 0
},

{
"_id" : 2,
"name" : "mongo22.example.com:20022",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 57,
"optime" : Timestamp(1346341931000, 3),
"optimeDate" : ISODate("2012-08-30T15:52:11Z"),
"self" : true
}

],
"ok" : 1

}

s2:PRIMARY> db.currentOp()
{
"inprog" : [
{
"opid" : 748,
"active" : true,
"secs_running" : 2,
"op" : "getmore",
"ns" : "local.oplog.rs",
"query" : {
 },
"client" : "6.23.38.199:46472",
"desc" : "conn35",
"threadId" : "0x7fad895e7700",
"connectionId" : 35,
"waitingForLock" : false,
"numYields" : 0,
"lockStats" : {
"timeLockedMicros" : {
"r" : NumberLong(97),
"w" : NumberLong(0)
},

"timeAcquiringMicros" : {
"r" : NumberLong(11),
"w" : NumberLong(0)
}
}
},

{
"opid" : 24,
"active" : false,
"op" : "none",
"ns" : "",
"query" : {
 },
"desc" : "repl writer worker 1",
"threadId" : "0x7fddb837b700",
"waitingForLock" : false,
"numYields" : 0,
"lockStats" : {
"timeLockedMicros" : {
"r" : NumberLong(0),
"w" : NumberLong(33)
},

"timeAcquiringMicros" : {
 }

}
}
]
}

The really good looking part of it is the sharding status :)
mongos> sh.status()
--- Sharding Status ---
  sharding version: { "_id" : 1, "version" : 3 }
  shards:
{  "_id" : "s1",  "host" :
"s1/mongo11.example.com:20011,mongo12.example.com:20012" }
{  "_id" : "s2",  "host" :
"s2/mongo21.example.com:20021,mongo22.example.com:20022" }
  databases:
{  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
{  "_id" : "example_production",  "partitioned" : true,  "primary" : "s1" }
example_production.people chunks:
s2 1
s1 3078
too many chunks to print, use verbose if you want to force print

it's about 10min ago and there didn't appear another chunk on s2 but it
seams really better.

Thank you very much for your help. I will get back if it's not going on
with sharding.

Dani

Am Donnerstag, 30. August 2012 17:43:50 UTC+2 schrieb Adam C:

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Adam C  
View profile  
 More options Aug 30 2012, 12:12 pm
From: Adam C <ad...@10gen.com>
Date: Thu, 30 Aug 2012 09:12:25 -0700 (PDT)
Local: Thurs, Aug 30 2012 12:12 pm
Subject: Re: Balancing does not work

The primaries are the ones responsible for the migrations, at the behest of
the mongos.  I would be surprised if the order mattered, but at least the
chunks have started moving.  Let us know if it breaks down or has issues
again.

Adam

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Geoff L  
View profile  
 More options Aug 31 2012, 9:58 pm
From: Geoff L <ge...@channelmeter.com>
Date: Fri, 31 Aug 2012 18:58:37 -0700 (PDT)
Local: Fri, Aug 31 2012 9:58 pm
Subject: Re: Balancing does not work

I've got a similar problem with my database, I had 2 shards, I added a 3rd,
and now the third one won't accept any data. the mongos logs show the same
"moveChunk failed to engage TO-shard in the data transfer: migrate already
in progress" message that Daniel got. I've tried bouncing just the
mongoses, then both the mongoses and the mongods, but I still get the same
message. I also get

[Balancer] distributed lock 'balancer/dbs3a:27017:1346462580:1804289383'
unlocked. but I think that's the balancer giving up it's lock.

Any advice would be great and greatly appreciated.

Geoff

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Adam C  
View profile  
 More options Sep 1 2012, 8:46 am
From: Adam C <ad...@10gen.com>
Date: Sat, 1 Sep 2012 05:46:33 -0700 (PDT)
Local: Sat, Sep 1 2012 8:46 am
Subject: Re: Balancing does not work

Geoff, are you also using 2.2 or are you on the 2.0 branch?

Can you post the output of the information I requested above on your
system? In particular, from the mongos:

use config;
db.locks.find();

Thanks,

Adam

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Adam C  
View profile  
 More options Sep 3 2012, 6:27 pm
From: Adam C <ad...@10gen.com>
Date: Mon, 3 Sep 2012 15:27:47 -0700 (PDT)
Local: Mon, Sep 3 2012 6:27 pm
Subject: Re: Balancing does not work

For reference, it seems this is a consequence of this bug:

https://jira.mongodb.org/browse/SERVER-6902

The upgrade sequence needed to ensure this does not happen is as follows:

1) turn off balancer
2) upgrade all mongos's
3) upgrade config dabases
4) upgrade each shard (using procedure for replica sets or standalone,
whichever applies)
5) turn on balancer

Adam

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Geoff L  
View profile  
 More options Sep 4 2012, 6:37 pm
From: Geoff L <ge...@channelmeter.com>
Date: Tue, 4 Sep 2012 15:37:52 -0700 (PDT)
Local: Tues, Sep 4 2012 6:37 pm
Subject: Re: Balancing does not work

The system is running 2.0, the output from db.locks.find() is

{ "_id" : "balancer", "process" : "dbs3a:27017:1346462580:1804289383",
"state" : 0, "ts" : ObjectId("5046807a4bbb07ad987074bb"), "when" :
ISODate("2012-09-04T22:28:10.373Z"), "who" :
"dbs3a:27017:1346462580:1804289383:Balancer:846930886", "why" : "doing
balance round" }
{ "_id" : "cm_prod.video_latest_stats", "process" :
"dbs2a:27018:1346463433:367275570", "state" : 0, "ts" :
ObjectId("5046808228145fb74ca67041"), "when" :
ISODate("2012-09-04T22:28:18.167Z"), "who" :
"dbs2a:27018:1346463433:367275570:conn19:96364934", "why" : "migrate-{ _id:
\"-OkLQldVbhI\" }" }
{ "_id" : "cm_prod.video_descriptions", "process" :
"dbs1a:27017:1343957738:1804289383", "state" : 0, "ts" :
ObjectId("5022feeaf7448e409c940963"), "when" :
ISODate("2012-08-09T00:06:02.618Z"), "who" :
"dbs1a:27017:1343957738:1804289383:conn9530:1058323149", "why" : "drop" }
{ "_id" : "cm_prod.video_raw_stats", "process" :
"dbs1a:27018:1346463419:1483632176", "state" : 0, "ts" :
ObjectId("5046807843ad3b2149394704"), "when" :
ISODate("2012-09-04T22:28:08.185Z"), "who" :
"dbs1a:27018:1346463419:1483632176:conn19:2127675310", "why" : "migrate-{
vt: MinKey, h: MinKey }" }
{ "_id" : "cm_prod.profile_raw_stats", "process" :
"dbs2a:27018:1346463433:367275570", "state" : 0, "ts" :
ObjectId("5046808228145fb74ca67042"), "when" :
ISODate("2012-09-04T22:28:18.682Z"), "who" :
"dbs2a:27018:1346463433:367275570:conn19:96364934", "why" : "migrate-{ u:
MinKey, h: MinKey }" }
{ "_id" : "cm_prod.profile_daily_stats", "process" :
"dbs1a:27018:1346463419:1483632176", "state" : 0, "ts" :
ObjectId("5046807643ad3b2149394703"), "when" :
ISODate("2012-09-04T22:28:06.602Z"), "who" :
"dbs1a:27018:1346463419:1483632176:conn19:2127675310", "why" : "migrate-{
u: \"lionsgatelive\", h: 1345939325 }" }
{ "_id" : "cm_prod.channel_engagements", "process" :
"dbs1a:27017:1342673255:1804289383", "state" : 0, "ts" :
ObjectId("5007927048e98d9778948bb9"), "when" :
ISODate("2012-07-19T04:52:00.791Z"), "who" :
"dbs1a:27017:1342673255:1804289383:conn2:972605807", "why" : "drop" }
{ "_id" : "cm_prod.channel_video_grouped", "process" :
"dbs1a:27017:1343540831:1804289383", "state" : 0, "ts" :
ObjectId("501adff96dac5fc19aab36e4"), "when" :
ISODate("2012-08-02T20:15:53.401Z"), "who" :
"dbs1a:27017:1343540831:1804289383:conn1093:258764218", "why" : "drop" }

Thanks,

Geoff

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
David Hows  
View profile  
 More options Sep 17 2012, 1:30 am
From: David Hows <david.h...@10gen.com>
Date: Sun, 16 Sep 2012 22:30:35 -0700 (PDT)
Local: Mon, Sep 17 2012 1:30 am
Subject: Re: Balancing does not work

Hi Geoff,

Were you able to update to 2.2 following Adams steps?

Did this resolve your issue?

Cheers,

David

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Adam C  
View profile  
 More options Oct 6 2012, 12:28 pm
From: Adam C <ad...@10gen.com>
Date: Sat, 6 Oct 2012 09:28:43 -0700 (PDT)
Local: Sat, Oct 6 2012 12:28 pm
Subject: Re: Balancing does not work

For anyone else running into similar issues here on 2.2.0, the root cause
is very likely SERVER-7003 <https://jira.mongodb.org/browse/SERVER-7003> which
will be fixed in 2.2.1

There is some evidence that the bad state on the target shard can be eventually
resolved<http://stackoverflow.com/questions/12207059/mongodb-new-shard-appears...> and
restarts may help clear it out, but the fix in 2.2.1 is the real cure here.

Adam.

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Vasyl  
View profile  
 More options Feb 9, 4:39 pm
From: Vasyl <shumsk...@gmail.com>
Date: Sat, 9 Feb 2013 13:39:51 -0800 (PST)
Local: Sat, Feb 9 2013 4:39 pm
Subject: Re: Balancing does not work

Thanks Adam, it answers my questions in
https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/Rcee...

<code>
Sat Feb  9 04:26:19 [Balancer] balancer move failed: { cause: { errmsg:
"migrate already in progress", ok: 0.0 }, errmsg: "moveChunk failed to
engage TO-shard in the data transfer: migrate already in progress", ok: 0.0

} from: replicaSet2 to: replicaSet1 chunk:  min: { User: MinKey } max: {

User: MinKey }
Sat Feb  9 04:26:59 [Balancer] moveChunk result: { cause: { errmsg:
"migrate already in progress", ok: 0.0 }, errmsg: "moveChunk failed to
engage TO-shard in the data transfer: migrate already in progress", ok: 0.0
}

Sat Feb  9 04:27:01 [Balancer] balancer move failed: { cause: { errmsg:
"migrate already in progress", ok: 0.0 }, errmsg: "moveChunk failed to
engage TO-shard in the data transfer: migrate already in progress", ok: 0.0
} from: replicaSet2 to: replicaSet1 chunk:  min: { User: MinKey } max: {

User: MinKey }
Sat Feb  9 04:27:36 [Balancer] moveChunk result: { cause: { errmsg:
"migrate already in progress", ok: 0.0 }, errmsg: "moveChunk failed to
engage TO-shard in the data transfer: migrate already in progress", ok: 0.0
}

Sat Feb  9 04:27:38 [Balancer] balancer move failed: { cause: { errmsg:
"migrate already in progress", ok: 0.0 }, errmsg: "moveChunk failed to
engage TO-shard in the data transfer: migrate already in progress", ok: 0.0
} from: replicaSet2 to: replicaSet1 chunk:  min: { UserID: MinKey } max: {

UserID: MinKey }
Sat Feb  9 04:28:16 [Balancer] moveChunk result: { cause: { errmsg:
"migrate already in progress", ok: 0.0 }, errmsg: "moveChunk failed to
engage TO-shard in the data transfer: migrate already in progress", ok: 0.0
}

Sat Feb  9 04:28:22 [Balancer] balancer move failed: { cause: { errmsg:
"migrate already in progress", ok: 0.0 }, errmsg: "moveChunk failed to
engage TO-shard in the data transfer: migrate already in progress", ok: 0.0
} from: replicaSet2 to: replicaSet1 chunk:  min: { OutgoingMessagesID:

MinKey } max: { OutgoingMessagesID: MinKey }</code>

Also, recently we got
DBException in process: socket exception [SEND_ERROR] for 127.0.0.1:41725 (bug
https://jira.mongodb.org/browse/SERVER-7008)
which brokes mongoses connection permanently with cursor timing out on
application side (timeout: 30000, time left: 0:0, status: 0)

Full condition state shows that socket exception [SEND_ERROR] for localhost
occurs once  ChunkManager with autosplitting kicks in:

<code>Sat Feb  9 09:41:49 [conn7]  authenticate db: que_db { authenticate:
1, user: "user", nonce: "xxxx", key: "xxx" }
Sat Feb  9 09:41:49 [conn7] resetting shard version of que_db.exp on
m1.10001, version is zero
Sat Feb  9 11:01:38 [conn7] SyncClusterConnection connecting to
[conf1:10003]
Sat Feb  9 11:01:38 [conn7] SyncClusterConnection connecting to
[conf2:10003]
Sat Feb  9 11:01:39 [conn7] SyncClusterConnection connecting to
[conf3:10003]
Sat Feb  9 11:01:41 [conn7] ChunkManager: time to load chunks for
que_db.exp: 1ms sequenceNumber: 9 version: 1|40||50f5369065d3598b74ff2841
based on: 1|38||50f5369065d3598b74ff2841
Sat Feb  9 11:01:41 [conn7] autosplitted que_db.exp shard: ns:que_db.exp
at: replicaSet2:replicaSet2/m2:10001,m2rpl:10001 lastmod:
1|24||000000000000000000000000 min: { UserID: 21596 } max: { UserID: 21744

} on: { UserID: 21608 } (splitThreshold 67108864)

Sat Feb  9 11:02:36 [conn7] DBException in process: socket exception
[SEND_ERROR] for 127.0.0.1:41725
Sat Feb  9 11:02:36 [conn7] SocketException handling request, closing
client connection: 9001 socket exception [2] server [127.0.0.1:41725]
</code>

Thanks
Vasyl

Субота, 6 жовтня 2012 р. 19:28:43 UTC+3 користувач Adam C написав:

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »