balancer is not working or communication is not working


Andre Mantei

Jul 23, 2014, 4:34:36 AM
to mongod...@googlegroups.com
Hello,

I have a sharded cluster with a partitioned (sharded) database and collection, but the balancer does not work.
The following error message is from the mongos:
 distributed lock 'balancer/vmDBTest2:27017:1406103726:41' acquired, ts : 53cf714f795ac1604d54d7be
 ns: gbCopy.nodesWays going to move { _id: "gbCopy.nodesWays-id_MinKey", lastmod: Timestamp 1000|0, lastmodEpoch: ObjectId('53cf6738c971077bd3f61d8c'), ns: "gbCopy.nodesWays", min: { id: MinKey }, max: { id: "101922540" }, shard: "shard0001" } from: shard0001 to: shard0000 tag []
 moving chunk ns: gbCopy.nodesWays moving ( ns: gbCopy.nodesWays, shard: shard0001:localhost:4003, lastmod: 1|0||000000000000000000000000, min: { id: MinKey }, max: { id: "101922540" }) shard0001:localhost:4003 -> shard0000:localhost:4000
 moveChunk result: { errmsg: "exception: socket exception [CONNECT_ERROR] for localhost:27019", code: 11002, ok: 0.0 }
 balancer move failed: { errmsg: "exception: socket exception [CONNECT_ERROR] for localhost:27019", code: 11002, ok: 0.0 } from: shard0001 to: shard0000 chunk:  min: { id: MinKey } max: { id: "101922540" }
 distributed lock 'balancer/vmDBTest2:27017:1406103726:41' unlocked.
 distributed lock 'balancer/vmDBTest2:27017:1406103726:41' acquired, ts : 53cf7157795ac1604d54d7bf


This is the output from the shard where the collection is stored:
2014-07-23T10:32:49.815+0200 [conn2] not logging config change: DBTest-2014-07-23T08:32:48-53cf7330026c8b31953d75c4 socket exception [CONNECT_ERROR] for localhost:27019
2014-07-23T10:32:49.815+0200 [conn2] command admin.$cmd command: moveChunk { moveChunk: "gbCopy.nodesWays", from: "localhost:4003", to: "localhost:4000", fromShard: "shard0001", toShard: "shard0000", min: { id: MinKey }, max: { id: "101922540" }, maxChunkSizeBytes: 67108864, shardId: "gbCopy.nodesWays-id_MinKey", configdb: "localhost:27019", secondaryThrottle: true, waitForDelete: false, maxTimeMS: 0 } ntoreturn:1 keyUpdates:0 numYields:0  reslen:123 2042ms
2014-07-23T10:32:55.847+0200 [conn2] warning: secondaryThrottle selected but no replication
2014-07-23T10:32:56.878+0200 [conn2] warning: Failed to connect to 127.0.0.1:27019, reason: errno:10061 No connection could be made because the target machine actively refused it.

Does anyone know why this does not work? The sharded cluster itself is working: I can see all the databases on the different shards of the cluster. But the distribution of the chunks is not working :(

Ciao, Andre

Andre Mantei

Jul 23, 2014, 9:51:13 AM
to mongod...@googlegroups.com
Here is some further information:

mongos> sh._lastMigration()
2014-07-23T15:47:52.551+0200 Socket say send() errno:10054 An existing connection was forcibly closed by the remote host. 127.0.0.1:27017
2014-07-23T15:47:52.567+0200 Error: socket exception [SEND_ERROR] for 127.0.0.1:27017 at src/mongo/shell/query.js:81
2014-07-23T15:47:52.582+0200 trying reconnect to localhost:27017 (127.0.0.1) failed
2014-07-23T15:47:52.598+0200 reconnect localhost:27017 (127.0.0.1) ok

Asya Kamsky

Jul 24, 2014, 3:33:13 AM
to mongodb-user
You're getting connection errors, possibly to the config server. You need to check that the mongod at port 27019 is running. Are you running all the shards and the config DB locally (on the same machine)?
The other error you show indicates that connectivity to port 27017 is also having a problem.

What exactly is the topology of your cluster?  (and also what version are you running?)
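
One quick way to run the check suggested above is to attempt a TCP connection to each address from the machine that reports the error. The following is a minimal Python sketch using only the standard library. The function name `can_connect` and the two-second timeout are illustrative choices, not part of MongoDB. A refused connection only shows that nothing is listening at that address from the point of view of the machine where the script runs.

```python
import socket

def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False

if __name__ == "__main__":
    # Addresses taken from the log output quoted in this thread.
    for host, port in [("localhost", 27019), ("localhost", 27017)]:
        state = "reachable" if can_connect(host, port) else "NOT reachable"
        print(f"{host}:{port} is {state}")
```

Running it on each node separately matters here, because `localhost` refers to a different machine on each of them.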




--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.
 
For other MongoDB technical support options, see: http://www.mongodb.org/about/support/.
---
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user...@googlegroups.com.
To post to this group, send email to mongod...@googlegroups.com.
Visit this group at http://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/5577a81d-2338-4c81-89e8-2d0e84ff5a13%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Andre Mantei

Jul 24, 2014, 5:54:38 AM
to mongod...@googlegroups.com
Version: 2.6.3
I have 3 nodes:
first node: config server (port 27019) and mongos (port 27017) are running here
second node: mongod (port 27017)
third node: mongod (port 27017)

I have only one config server running.

From the first node there are two tunnels:
localhost:4003 to second node
localhost:4000 to third node

Andre Mantei

Jul 24, 2014, 11:17:22 AM
to mongod...@googlegroups.com
I created tunnels from the second and third nodes to port 27019 on the first node, so that both shard servers can reach the config server.
I also started the mongod processes with mongod --shardsvr, so they now listen on port 27018.

I now get the following error message:
balancer move failed: { errmsg: "exception: socket exception [CONNECT_ERROR] for localhost:4003", code: 11002, ok: 0.0 } from: shard0000 to: shard0001 chunk:  min: { id: MinKey } max: { id: "101922540" }

localhost:4003 is the tunnel to shard0001.
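
When reading messages like this one, it helps to pull out exactly which address could not be reached, because `localhost` is resolved by whichever process attempted the connection, which may be a shard rather than the mongos that reports the failure. A minimal Python sketch, assuming the `[CONNECT_ERROR] for host:port` wording seen in the 2.6.3 output quoted in this thread (the helper name `failed_address` is hypothetical):

```python
import re

# Matches the "socket exception [CONNECT_ERROR] for host:port" fragment
# as it appears in the log lines quoted in this thread.
CONNECT_ERROR = re.compile(r'\[CONNECT_ERROR\] for ([^\s:"]+):(\d+)')

def failed_address(log_line):
    """Return (host, port) for the first CONNECT_ERROR in log_line, or None."""
    match = CONNECT_ERROR.search(log_line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))

line = ('balancer move failed: { errmsg: "exception: socket exception '
        '[CONNECT_ERROR] for localhost:4003", code: 11002, ok: 0.0 } '
        'from: shard0000 to: shard0001')
print(failed_address(line))  # ('localhost', 4003)
```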

Asya Kamsky

Jul 24, 2014, 3:47:27 PM
to mongodb-user

I'm sorry, I'm not seeing what your configuration is.

What are the mongos command line options? And what is the output of sh.status() when run on mongos?

Asya

Andre Mantei

Jul 24, 2014, 4:37:08 PM
to mongod...@googlegroups.com
Output of sh.status():

mongos> sh.status()
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "version" : 4,
        "minCompatibleVersion" : 4,
        "currentVersion" : 5,
        "clusterId" : ObjectId("53d113ccc999ccbee480e4c2")
}
  shards:
        {  "_id" : "shard0000",  "host" : "localhost:4001" }
        {  "_id" : "shard0001",  "host" : "localhost:4003" }
  databases:
        {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
        {  "_id" : "dbNodes",  "partitioned" : false,  "primary" : "shard0000" }
        {  "_id" : "gb",  "partitioned" : false,  "primary" : "shard0000" }
        {  "_id" : "gbCopy",  "partitioned" : true,  "primary" : "shard0000" }
                gbCopy.nodesWays
                        shard key: { "id" : 1 }
                        chunks:
                                shard0000       175
                        too many chunks to print, use verbose if you want to force print
        {  "_id" : "gbNodes",  "partitioned" : false,  "primary" : "shard0000" }
        {  "_id" : "h171",  "partitioned" : false,  "primary" : "shard0000" }
        {  "_id" : "h22",  "partitioned" : false,  "primary" : "shard0000" }
        {  "_id" : "h280",  "partitioned" : false,  "primary" : "shard0000" }
        {  "_id" : "h3905",  "partitioned" : false,  "primary" : "shard0000" }
        {  "_id" : "h595",  "partitioned" : false,  "primary" : "shard0000" }
        {  "_id" : "h80",  "partitioned" : false,  "primary" : "shard0000" }
        {  "_id" : "hannoverdb",  "partitioned" : false,  "primary" : "shard0000" }
        {  "_id" : "test",  "partitioned" : false,  "primary" : "shard0000" }

mongos>

What are "mongos command line options"?

Asya Kamsky

Jul 24, 2014, 6:31:05 PM
to mongodb-user
> What are "mongos command line options"?

That's the command line you started the mongos process with. You can get it from the same mongos shell where you ran sh.status():

mongos>  db.serverCmdLineOpts()

Asya




Andre Mantei

Jul 25, 2014, 2:11:46 AM
to mongod...@googlegroups.com
mongos> db.serverCmdLineOpts()
{
        "argv" : [
                "mongos",
                "--configdb",
                "localhost:27019"
        ],
        "parsed" : {
                "sharding" : {
                        "configDB" : "localhost:27019"
                }
        },
        "ok" : 1
}

Andre Mantei

Jul 25, 2014, 3:23:20 AM
to mongod...@googlegroups.com
OK, I solved the problem.
I did not know that all components need to communicate with each other, so the two shards also need to reach each other directly.
I created two tunnels, one from shard0 to shard1 and one from shard1 to shard0.
On shard0:
4003 -> shard1:27018
On shard1:
4001 -> shard0:27018

Now it is working :)

Asya Kamsky

Jul 25, 2014, 4:20:51 PM
to mongodb-user
Great!  Glad you figured it out.

Yes, all the components in the cluster must be able to reach each other. In a sharded cluster, when data migrates from one shard to another, the two shards communicate directly (otherwise the data would have to take a long detour :) ).

Asya
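
The connectivity requirement described above can be written down as a small checklist. The following Python sketch is a simplified model of this thread's topology, not a description of MongoDB internals. The component names and the `required_links` helper are hypothetical, and the set of required connections is an assumption based on what had to be fixed here: mongos to the config server and to each shard, each shard to the config server, and each shard to every other shard.

```python
from itertools import permutations

def required_links(shards, mongos="mongos", config="config"):
    """Directed (source, target) connections assumed necessary in this model."""
    links = {(mongos, config)}
    for shard in shards:
        links.add((mongos, shard))   # mongos sends operations to every shard
        links.add((shard, config))   # shards contact the config server during moveChunk
    # Chunk migration moves data directly between shards, in both directions.
    links.update(permutations(shards, 2))
    return links

shards = ["shard0000", "shard0001"]

# State after the tunnels to the config server were added, before the final fix.
available = {
    ("mongos", "config"),
    ("mongos", "shard0000"),
    ("mongos", "shard0001"),
    ("shard0000", "config"),
    ("shard0001", "config"),
}

missing = sorted(required_links(shards) - available)
print(missing)
# [('shard0000', 'shard0001'), ('shard0001', 'shard0000')]
```

The two missing entries correspond to the two shard-to-shard tunnels that resolved the problem in this thread.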


