Uneven distribution of data across shards

Mongo Adopter

Nov 29, 2010, 5:03:37 AM
to mongodb-user
Hello,

I'm using MongoDB (1.6.3) on 64-bit Linux boxes with autosharding and replica sets. I have set up 3 shards, with the data on each shard replicated across the 3 nodes of its replica set.

I've been running performance and load tests with the above setup over
a few days and am observing that the data distribution across shards
is highly skewed.

The collection has 158.7 million documents: shard1 holds 144 million, while shard2 and shard3 hold around 7 million each. Since the shard key is unique, I was expecting the data to be more or less equally distributed across the shards. I even stopped the writes to check whether data migration across shards takes place when there are no writes, but the distribution remained as is.

Am I missing something here?

Also, is there a way to check the replication lag between the members of a replica set, so that we can determine whether all of its nodes are in sync?

Thanks in advance.

Alberto Lerner

Nov 29, 2010, 8:46:59 AM
to mongod...@googlegroups.com
Can you post the chunk distribution by issuing the following against the config database:
db.printShardingStatus()

Also in config, can you check that there isn't a migration in progress:
db.locks.find()
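
For reference, a minimal shell session for both checks might look like the sketch below (the mongos host and port are placeholders for your own setup):

// $ mongo mongos-host:27017
use config
db.printShardingStatus()   // chunk ranges and which shard owns each
db.locks.find()            // a non-zero "state" means that lock is currently held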

Alberto.





MongoDB User

Nov 29, 2010, 10:13:48 AM
to mongod...@googlegroups.com
Hi Alberto,

Please find the output of db.printShardingStatus() and db.locks.find() pasted below.


--- Sharding Status ---
  sharding version: { "_id" : 1, "version" : 3 }
  shards:
      {
        "_id" : "shard1",
        "host" : "sh1/192.168.1.1:25000,192.168.1.2:25000,192.168.1.3:25000"
}
      {
        "_id" : "shard2",
        "host" : "sh2/192.168.1.1:25001,192.168.1.2:25001,192.168.1.3:25001"
}
      {
        "_id" : "shard3",
        "host" : "sh3/192.168.1.1:25002,192.168.1.2:25002,192.168.1.3:25002"
}
  databases:
        { "_id" : "admin", "partitioned" : false, "primary" : "config" }
        { "_id" : "local", "partitioned" : false, "primary" : "shard2" }
        { "_id" : "test", "partitioned" : true, "primary" : "shard2" }
                test.foo chunks:
                        { "uid" : { $minKey : 1 } } -->> { "uid" : "id100011" } on : shard3 { "t" : 13000, "i" : 0 }
                        { "uid" : "id100011" } -->> { "uid" : "id1069970404" } on : shard3 { "t" : 14000, "i" : 0 }
                        { "uid" : "id1069970404" } -->> { "uid" : "id1139991848" } on : shard3 { "t" : 15000, "i" : 0 }
                        { "uid" : "id1139991848" } -->> { "uid" : "id1209802727" } on : shard3 { "t" : 17000, "i" : 0 }
                        { "uid" : "id1209802727" } -->> { "uid" : "id127992" } on : shard3 { "t" : 18000, "i" : 0 }
                        { "uid" : "id127992" } -->> { "uid" : "id1349890250" } on : shard3 { "t" : 19000, "i" : 0 }
                        { "uid" : "id1349890250" } -->> { "uid" : "id1420035163" } on : shard3 { "t" : 20000, "i" : 0 }
                        { "uid" : "id1420035163" } -->> { "uid" : "id1489861623" } on : shard3 { "t" : 21000, "i" : 0 }
                        { "uid" : "id1489861623" } -->> { "uid" : "id156025" } on : shard3 { "t" : 22000, "i" : 0 }
                        { "uid" : "id156025" } -->> { "uid" : "id1630535128" } on : shard3 { "t" : 23000, "i" : 0 }
                        { "uid" : "id1630535128" } -->> { "uid" : "id1700748282" } on : shard3 { "t" : 24000, "i" : 0 }
                        { "uid" : "id1700748282" } -->> { "uid" : "id1770992126" } on : shard3 { "t" : 26000, "i" : 0 }
                        { "uid" : "id1770992126" } -->> { "uid" : "id1841361139" } on : shard2 { "t" : 27000, "i" : 0 }
                        { "uid" : "id1841361139" } -->> { "uid" : "id1911606781" } on : shard3 { "t" : 28000, "i" : 0 }
                        { "uid" : "id1911606781" } -->> { "uid" : "id1982172220" } on : shard2 { "t" : 29000, "i" : 0 }
                        { "uid" : "id1982172220" } -->> { "uid" : "id2052620341" } on : shard3 { "t" : 16000, "i" : 0 }
                        { "uid" : "id2052620341" } -->> { "uid" : "id21230" } on : shard3 { "t" : 25000, "i" : 0 }
                        { "uid" : "id21230" } -->> { "uid" : "id264128750" } on : shard2 { "t" : 4000, "i" : 10 }
                        { "uid" : "id264128750" } -->> { "uid" : "id32538" } on : shard2 { "t" : 9000, "i" : 1 }
                        { "uid" : "id32538" } -->> { "uid" : "id38183137" } on : shard2 { "t" : 12000, "i" : 6 }
                        { "uid" : "id38183137" } -->> { "uid" : "id438115" } on : shard2 { "t" : 12000, "i" : 7 }
                        { "uid" : "id438115" } -->> { "uid" : "id494248322" } on : shard2 { "t" : 12000, "i" : 12 }
                        { "uid" : "id494248322" } -->> { "uid" : "id550385" } on : shard2 { "t" : 12000, "i" : 13 }
                        { "uid" : "id550385" } -->> { "uid" : "id60664471" } on : shard2 { "t" : 12000, "i" : 8 }
                        { "uid" : "id60664471" } -->> { "uid" : "id663012" } on : shard2 { "t" : 12000, "i" : 9 }
                        { "uid" : "id663012" } -->> { "uid" : "id719149639" } on : shard2 { "t" : 12000, "i" : 16 }
                        { "uid" : "id719149639" } -->> { "uid" : "id775253" } on : shard2 { "t" : 25000, "i" : 1 }
                        { "uid" : "id775253" } -->> { "uid" : "id83150" } on : shard2 { "t" : 12000, "i" : 14 }
                        { "uid" : "id83150" } -->> { "uid" : "id88769" } on : shard2 { "t" : 12000, "i" : 15 }
                        { "uid" : "id88769" } -->> { "uid" : "id943887986" } on : shard2 { "t" : 12000, "i" : 10 }
                        { "uid" : "id943887986" } -->> { "uid" : "id999999" } on : shard2 { "t" : 12000, "i" : 11 }
                        { "uid" : "id999999" } -->> { "uid" : "id_1120461401" } on : shard1 { "t" : 24000, "i" : 2 }
                        { "uid" : "id_1120461401" } -->> { "uid" : "id_1179036038" } on : shard1 { "t" : 28000, "i" : 4 }
                        { "uid" : "id_1179036038" } -->> { "uid" : "id_1210017887" } on : shard1 { "t" : 29000, "i" : 24 }
                        { "uid" : "id_1210017887" } -->> { "uid" : "id_1241059340" } on : shard1 { "t" : 29000, "i" : 25 }
                        { "uid" : "id_1241059340" } -->> { "uid" : "id_1268101190" } on : shard1 { "t" : 29000, "i" : 14 }
                        { "uid" : "id_1268101190" } -->> { "uid" : "id_1296435793" } on : shard1 { "t" : 29000, "i" : 15 }
                        { "uid" : "id_1296435793" } -->> { "uid" : "id_1325715097" } on : shard1 { "t" : 29000, "i" : 20 }
                        { "uid" : "id_1325715097" } -->> { "uid" : "id_1354923856" } on : shard1 { "t" : 29000, "i" : 21 }
                        { "uid" : "id_1354923856" } -->> { "uid" : "id_141575432" } on : shard1 { "t" : 29000, "i" : 16 }
                        { "uid" : "id_141575432" } -->> { "uid" : "id_1482813527" } on : shard1 { "t" : 29000, "i" : 17 }
                        { "uid" : "id_1482813527" } -->> { "uid" : "id_1596029315" } on : shard1 { "t" : 24000, "i" : 4 }
                        { "uid" : "id_1596029315" } -->> { "uid" : "id_1724457053" } on : shard1 { "t" : 24000, "i" : 5 }
                        { "uid" : "id_1724457053" } -->> { "uid" : "id_1782847770" } on : shard1 { "t" : 26000, "i" : 4 }
                        { "uid" : "id_1782847770" } -->> { "uid" : "id_1811750782" } on : shard1 { "t" : 29000, "i" : 6 }
                        { "uid" : "id_1811750782" } -->> { "uid" : "id_1841342187" } on : shard1 { "t" : 29000, "i" : 7 }
                        { "uid" : "id_1841342187" } -->> { "uid" : "id_1901626692" } on : shard1 { "t" : 26000, "i" : 12 }
                        { "uid" : "id_1901626692" } -->> { "uid" : "id_1966016452" } on : shard1 { "t" : 28000, "i" : 1 }
                        { "uid" : "id_1966016452" } -->> { "uid" : "id_2021958597" } on : shard1 { "t" : 26000, "i" : 10 }
                        { "uid" : "id_2021958597" } -->> { "uid" : "id_2080277229" } on : shard1 { "t" : 26000, "i" : 11 }
                        { "uid" : "id_2080277229" } -->> { "uid" : "id_213794087" } on : shard1 { "t" : 29000, "i" : 22 }
                        { "uid" : "id_213794087" } -->> { "uid" : "id_274641821" } on : shard1 { "t" : 29000, "i" : 23 }
                        { "uid" : "id_274641821" } -->> { "uid" : "id_326816397" } on : shard1 { "t" : 29000, "i" : 12 }
                        { "uid" : "id_326816397" } -->> { "uid" : "id_385873769" } on : shard1 { "t" : 29000, "i" : 13 }
                        { "uid" : "id_385873769" } -->> { "uid" : "id_41695371" } on : shard1 { "t" : 29000, "i" : 10 }
                        { "uid" : "id_41695371" } -->> { "uid" : "id_449316195" } on : shard1 { "t" : 29000, "i" : 11 }
                        { "uid" : "id_449316195" } -->> { "uid" : "id_516397835" } on : shard1 { "t" : 29000, "i" : 3 }
                        { "uid" : "id_516397835" } -->> { "uid" : "id_637357731" } on : shard1 { "t" : 19000, "i" : 2 }
                        { "uid" : "id_637357731" } -->> { "uid" : "id_697693697" } on : shard1 { "t" : 29000, "i" : 18 }
                        { "uid" : "id_697693697" } -->> { "uid" : "id_758246324" } on : shard1 { "t" : 29000, "i" : 19 }
                        { "uid" : "id_758246324" } -->> { "uid" : "id_812467606" } on : shard1 { "t" : 29000, "i" : 8 }
                        { "uid" : "id_812467606" } -->> { "uid" : "id_87336525" } on : shard1 { "t" : 29000, "i" : 9 }
                        { "uid" : "id_87336525" } -->> { "uid" : "id_934488461" } on : shard1 { "t" : 29000, "i" : 4 }
                        { "uid" : "id_934488461" } -->> { "uid" : "id_999999218" } on : shard1 { "t" : 29000, "i" : 5 }
                        { "uid" : "id_999999218" } -->> { "uid" : { $maxKey : 1 } } on : shard1 { "t" : 14000, "i" : 3 }
        { "_id" : "foo", "partitioned" : false, "primary" : "shard3" }

       
------------------------------------------------------------------------------------

db.locks.find() indicates the following:
{ "_id" : "balancer", "process" : "QuadCore1:1289596920:1804289383", "state" : 1, "ts" : ObjectId("4ce00165e27dd14bf96767c0"), "when" : "Mon Nov 15 2010 00:33:57 GMT+0900 (TLT)", "who" : "QuadCore1:1289596920:1804289383:Balancer:846930886", "why" : "doing balance round" }
{ "_id" : "test.foo", "process" : "QuadCore1:1289596920:1804289383", "state" : 0, "ts" : ObjectId("4ce2a266e27dd14bf96767cc"), "when" : "Wed Nov 17 2010 00:25:26 GMT+0900 (TLT)", "who" : "QuadCore1:1289596920:1804289383:conn551:2145174067", "why" : "split-ns:test.foo at: shard1:sh1/192.168.1.1:25000,192.168.1.2:25000,192.168.1.3:25000 lastmod: 29|1 min: { uid: \"id_1179036038\" } max: { uid: \"id_1241059340\" }" }

Does the above indicate that locks have been held for many days?

Thanks!

Alberto Lerner

Nov 29, 2010, 11:11:27 AM
to mongod...@googlegroups.com
The state attribute indicates that the balancer lock is taken. But, as you noticed, it has been held long enough for a lock take-over to have happened. Could you also issue a find() over config.changelog? (If the output gets too long, you can post just a link.) I'm looking for the last time a migration occurred.

If that was too long ago, can you check the mongos log to see whether the balancer is running or a migration is in progress?
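
For example, a query along these lines (just a sketch; "time" and "what" are the fields the changelog entries carry) would show the most recent events and let you tell splits and migrations apart:

use config
db.changelog.find().sort({time: -1}).limit(10)
// or only the migrations:
db.changelog.find({what: /moveChunk/}).sort({time: -1}).limit(5)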

Alberto.

Mongo Adopter

Nov 30, 2010, 11:20:10 AM
to mongodb-user
Hi Alberto,

db.changelog.find().sort({time:-1}) indicates that the last split and migration took place over 2 weeks ago.

The balancer currently appears to be running, but no migration seems to be in progress. While going through the mongos logs, I noticed a lot of errors similar to the one below on the day the last chunk was moved according to the changelog:

Sun Nov 14 08:55:51 [Balancer] moving chunk ns: test.foo moving ( ns:test.foo at: lastmod: 9|3 min: { uid: "id1770992126" } max: { uid: "id1841361139" }) shard1:sh1/192.168.1.1:25000,192.168.1.2:25000,192.168.1.3:25000 -> shard2:sh2/192.168.1.1:25001,192.168.1.2:25001,192.168.1.3:25001
Sun Nov 14 08:56:25 [Balancer] MOVE FAILED **** db assertion failure { assertion: "assertion s/d_migrate.cpp:189", errmsg: "db assertion failure", ok: 0.0 }

The current mongos logs contain the following:

[Balancer] dist_lock forcefully taking over from: { _id: "balancer", process: "QuadCore1:1289596920:1804289383", state: 1, ts: ObjectId('4ce00165e27dd14bf96767c0'), when: new Date(1289748837094), who: "QuadCore1:1289596920:1804289383:Balancer:846930886", why: "doing balance round" } elapsed minutes: 20557


As part of our tests, we had taken down one of the nodes in the replica set and one of the 3 config servers on that day and brought it back online a little while later. I'm not sure whether that could be related to the above behavior in any way (we had 3 nodes in the replica set, of which two were always active).

Snippets from config.changelog (db.changelog.find().sort({time:-1})):
{ "_id" : "QuadCore1-2010-11-16T15:25:31-32", "server" : "QuadCore1",
"time" : "Wed Nov 17 2010 00:25:31 GMT+0900 (TLT)", "what" : "split",
"ns" : "test.foo", "details" : { "before" : { "min" : { "uid" :
"id_1179036038" }, "max" : { "uid" : "id_1241059340" } }, "left" :
{ "min" : { "uid" : "id_1179036038" }, "max" : { "uid" :
"id_1210017887" } }, "right" : { "min" : { "uid" : "id_1210017887" },
"max" : { "uid" : "id_1241059340" } } } }

{ "_id" : "test1.mongodb-2010-11-14T12:51:58-683", "server" :
"test1.mongodb", "time" : "Sun Nov 14 2010 21:51:58 GMT+0900 (TLT)",
"what" : "moveChunk.from", "ns" : "test.foo", "details" : { "step1" :
0, "step2" : 8244, "step3" : 260 } }


Thanks a lot for your help!

Alberto Lerner

Nov 30, 2010, 11:57:00 AM
to mongod...@googlegroups.com
It seems that the lock take-over is not going through.

Can you check whether everything is fine in the log of one of the config servers that came back up? If that config server was refusing updates, that would be a reason.
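
Something as simple as this should surface anything suspicious (the log path is only a placeholder for wherever your config servers write their logs):

# adjust the path to your setup
grep -iE "assert|error|exception" /var/log/mongodb/configsvr.log | tail -n 50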

Alberto.

Mongo Adopter

Dec 1, 2010, 4:42:27 AM
to mongodb-user
Hi Alberto,

Two of the three config servers were up at every point in time. Can such issues occur even if only one of the config servers goes down, is down for an extended period of time, or cannot be recovered at all?

Is there any way to distribute the data across the shards in the
current scenario?

The config logs seem fine on all the config servers. The following messages appear in the logs and don't seem to be errors:
fsync from getlasterror
CMD fsync: sync:1 lock:0

Is there anything specific that should be checked for in the logs?

Thanks for your help!

Alberto Lerner

Dec 1, 2010, 7:37:42 AM
to mongod...@googlegroups.com
> Two of the three config servers were up at every point in time. Can such
> issues occur even if only one of the config servers goes down, is down for
> an extended period of time, or cannot be recovered at all?

All three config servers must be up for metadata changes (splits or migrations) to go through.

We also do some lock bookkeeping in the config database, and changes to that likewise require all three servers to be up.

> Is there any way to distribute the data across the shards in the
> current scenario?

With all three config servers up, everything should go back to normal.

To bring the third one back, you'd take a mongodump from one of the other config servers and restore it onto the third one.
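
Roughly along these lines (the hosts, ports, and paths below are placeholders, not your actual setup):

# dump the config data from a healthy config server
mongodump --host config1.example.com --port 27019 --out /backup/configdump
# restore it onto the rebuilt third config server
mongorestore --host config3.example.com --port 27019 /backup/configdump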

If the third config server uses the same address you originally gave to the mongos processes, then you should be good. Otherwise, you'd have to bounce the mongos processes (to give them the new config server address) and the mongod's.

Alberto.
