Can't release balancer lock

den...@pixleeteam.com

Jan 18, 2017, 8:45:52 AM
to mongodb-user
Hey guys,

For some reason, the balancer on my 3-node mongo cluster has stopped running.

mongos> sh.status()
--- Sharding Status ---
  sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("57ca0d2267fe3532e2da079f")
}
  shards:
{  "_id" : "shard0000",  "host" : "<redacted>:27017" }
{  "_id" : "shard0001",  "host" : "<redacted>:27017" }
{  "_id" : "shard0002",  "host" : "<redacted>:27017" }
  active mongoses:
"3.2.11" : 3
  balancer:
Currently enabled:  no
Currently running:  yes
Balancer lock taken at Fri Jan 13 2017 20:00:08 GMT-0800 (PST) by <redacted>:27018:1481066386:350584635:Balancer:548012130
Balancer active window is set between 04:00 and 23:00 server local time
Failed balancer rounds in last 5 attempts:  5
Last reported error:  Connection refused
Time of Reported error:  Tue Jan 17 2017 11:12:26 GMT-0800 (PST)
Migration Results for the last 24 hours:
4 : Success
1 : Failed with error 'aborted', from shard0000 to shard0002
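
To make the "enabled: no, running: yes" combination above concrete: as far as I understand the 3.2 shell, "enabled" reflects the `stopped` flag on the `balancer` document in `config.settings`, while "running" reflects a non-zero `state` on the `balancer` document in `config.locks`. A minimal sketch of that reading follows. `balancerSummary` is a hypothetical helper name, not a shell built-in, and the field names are assumptions based on the 3.2 config schema.

```javascript
// Hypothetical helper (not part of the mongo shell): summarize the two
// documents that, as I understand it, drive the balancer lines in sh.status().
// Assumed 3.2 schema: config.settings {_id: "balancer", stopped: <bool>}
// and config.locks {_id: "balancer", state: <0|1|2>, who: <string>}.
function balancerSummary(configDb) {
    var setting = configDb.settings.findOne({_id: "balancer"});
    var lock = configDb.locks.findOne({_id: "balancer"});
    var held = !!(lock && lock.state > 0);
    return {
        // Enabled unless a settings document explicitly says stopped: true.
        enabled: !(setting && setting.stopped),
        // Running if a lock document exists with a non-zero state.
        running: held,
        // Who holds the lock, if anyone.
        holder: held ? lock.who : null
    };
}

// In the mongo shell, connected to a mongos:
//   printjson(balancerSummary(db.getSiblingDB("config")));
```

With the status shown above, this would presumably report `enabled: false` and `running: true`, i.e. the balancer is switched off in settings but a lock document still claims to be held.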


I've tried deleting the balancer lock in the config db, but no matter what I do, I can't get sh.stopBalancer() or sh.startBalancer() to work:


mongos> sh.startBalancer()
assert.soon failed, msg:Waited too long for lock balancer to change to state undefined
doassert@src/mongo/shell/assert.js:15:14
assert.soon@src/mongo/shell/assert.js:199:13
sh.waitForDLock@src/mongo/shell/utils_sh.js:198:1
sh.waitForBalancer@src/mongo/shell/utils_sh.js:291:9
sh.startBalancer@src/mongo/shell/utils_sh.js:167:5
@(shell):1:1

2017-01-17T17:49:15.705-0800 E QUERY    [thread1] Error: assert.soon failed, msg:Waited too long for lock balancer to change to state undefined :
doassert@src/mongo/shell/assert.js:15:14
assert.soon@src/mongo/shell/assert.js:199:13
sh.waitForDLock@src/mongo/shell/utils_sh.js:198:1
sh.waitForBalancer@src/mongo/shell/utils_sh.js:291:9
sh.startBalancer@src/mongo/shell/utils_sh.js:167:5
@(shell):1:1




mongos> sh.stopBalancer()
Waiting for active hosts...
Waiting for the balancer lock...
assert.soon failed, msg:Waited too long for lock balancer to unlock
doassert@src/mongo/shell/assert.js:15:14
assert.soon@src/mongo/shell/assert.js:199:13
sh.waitForDLock@src/mongo/shell/utils_sh.js:198:1
sh.waitForBalancerOff@src/mongo/shell/utils_sh.js:264:9
sh.waitForBalancer@src/mongo/shell/utils_sh.js:294:9
sh.stopBalancer@src/mongo/shell/utils_sh.js:161:5
@(shell):1:1

Balancer still may be active, you must manually verify this is not the case using the config.changelog collection.
2017-01-17T17:35:49.288-0800 E QUERY    [thread1] Error: Error: assert.soon failed, msg:Waited too long for lock balancer to unlock :
sh.waitForBalancerOff@src/mongo/shell/utils_sh.js:268:15
sh.waitForBalancer@src/mongo/shell/utils_sh.js:294:9
sh.stopBalancer@src/mongo/shell/utils_sh.js:161:5
@(shell):1:1
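
The message above points at `config.changelog` for confirming that no migration is still in flight. A rough sketch of such a check, assuming changelog entries carry a `what` field beginning with `moveChunk` and a `time` timestamp (my understanding of the 3.2 layout). `recentMoveChunkEvents` is a made-up helper name, not a shell built-in.

```javascript
// Hypothetical helper: list the most recent moveChunk-related changelog
// entries, newest first. Assumes config.changelog documents look like
// {what: "moveChunk.start" | "moveChunk.commit" | ..., time: <Date>, ns: <string>}.
function recentMoveChunkEvents(configDb, since, limit) {
    return configDb.changelog
        .find({what: /^moveChunk/, time: {$gte: since}})
        .sort({time: -1})
        .limit(limit || 10)
        .toArray();
}

// In the mongo shell, connected to a mongos:
//   var hourAgo = new Date(Date.now() - 60 * 60 * 1000);
//   recentMoveChunkEvents(db.getSiblingDB("config"), hourAgo, 10).forEach(printjson);
```

If the newest entry for a namespace is a start event with no later commit or error, a migration may still be in progress; an empty result for a recent window would suggest the balancer is not actually moving anything.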


I've also tried updating NTP on all the nodes and restarting mongod, the config servers, and mongos.

Also, I've been tailing the logs for mongod, the config servers, and mongos, and I don't see anything about the balancer or locks.

I've also tried moving chunks manually, and that works.

It seems that, for whatever reason, the sh.* helper commands can't modify the lock, even though I can modify it manually.
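
One way to tell a genuinely held lock from a stale one, sketched under assumptions: compare the lock's `process` against the matching document in `config.lockpings` and see how old the last `ping` is. The field names (`state`, `process`, `ping`) reflect my understanding of the 3.2 config schema, and `lockLooksStale` and the threshold passed to it are illustrative choices, not anything MongoDB defines under those names.

```javascript
// Hypothetical helper: decide whether a held lock appears stale, i.e. its
// owning process has not pinged recently. Pure function, so it can be fed
// documents fetched by hand from config.locks and config.lockpings.
function lockLooksStale(lockDoc, pingDoc, now, maxAgeMs) {
    if (!lockDoc || !(lockDoc.state > 0)) {
        return false; // lock is not held at all
    }
    if (!pingDoc || !pingDoc.ping) {
        return true; // held, but the owner has no ping on record
    }
    return (now.getTime() - pingDoc.ping.getTime()) > maxAgeMs;
}

// In the mongo shell, connected to a mongos:
//   var cfg = db.getSiblingDB("config");
//   var lock = cfg.locks.findOne({_id: "balancer"});
//   var ping = lock && cfg.lockpings.findOne({_id: lock.process});
//   print(lockLooksStale(lock, ping, new Date(), 15 * 60 * 1000));
```

A `true` result would suggest the process named in the lock is no longer alive or no longer pinging, which would be consistent with a lock taken days ago that never gets released.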

Is there anything else I might have missed that might help with this issue?

