remove shard problem


lwolf

May 29, 2011, 5:13:37 AM
to mongodb-user
Hi.
When I try to remove shard0007 from my cluster, all processes
freeze and the config server returns this log.
What's wrong?

Sun May 29 11:59:08 [initandlisten] connection accepted from
127.0.0.1:57519 #76
Sun May 29 11:59:09 [Balancer] chose [shard0007] to [shard0000] { _id:
"ptwdb.profiles-id_MinKey", lastmod: Timestamp 9808000|0, ns:
"ptwdb.profiles", min: { id: MinKey }, max: { id: 136 }, shard:
"shard0007" }
Sun May 29 11:59:09 [Balancer] Assertion: 10181:not
sharded:ptwdb.profiles
0x51f4a9 0x5ecf19 0x681f69 0x688671 0x5035ab 0x504e64 0x69ec30
0x7fc892b9b8ba 0x7fc89215702d
mongos(_ZN5mongo11msgassertedEiPKc+0x129) [0x51f4a9]
mongos(_ZN5mongo8DBConfig15getChunkManagerERKSsb+0x119) [0x5ecf19]

mongos(_ZN5mongo8Balancer11_moveChunksEPKSt6vectorIN5boost10shared_ptrINS_14BalancerPolicy9ChunkInfoEEESaIS6_EE
+0xe9) [0x681f69]
mongos(_ZN5mongo8Balancer3runEv+0x761) [0x688671]

mongos(_ZN5mongo13BackgroundJob7jobBodyEN5boost10shared_ptrINS0_9JobStatusEEE
+0x12b) [0x5035ab]

mongos(_ZN5boost6detail11thread_dataINS_3_bi6bind_tIvNS_4_mfi3mf1IvN5mongo13BackgroundJobENS_10shared_ptrINS7_9JobStatusEEEEENS2_5list2INS2_5valueIPS7_EENSD_ISA_EEEEEEE3runEv
+0x74) [0x504e64]
mongos(thread_proxy+0x80) [0x69ec30]
/lib/libpthread.so.0(+0x68ba) [0x7fc892b9b8ba]
/lib/libc.so.6(clone+0x6d) [0x7fc89215702d]
Sun May 29 11:59:09 [Balancer] ~ScopedDbConnection: _conn != null
Sun May 29 11:59:09 [Balancer] caught exception while doing balance:
not sharded:ptwdb.profiles
Sun May 29 11:59:09 [conn75] end connection 127.0.0.1:57518
Sun May 29 11:59:09 [DataFileSync] flushing diag log

Eliot Horowitz

May 29, 2011, 1:27:31 PM
to mongod...@googlegroups.com
That message should be transient.
What do you mean by locked up?
What version are you on?


lwolf

May 29, 2011, 1:45:49 PM
to mongodb-user
After removing a node and restarting mongos, I get this in
the log, and no balancer activity is happening.
I'm using the 1.8.1 x64 version on Debian Linux.

Eliot Horowitz

May 29, 2011, 1:47:11 PM
to mongod...@googlegroups.com
Can you bounce mongos again?
If that doesn't resolve it, can you send db.printShardingStatus()?

lwolf

May 29, 2011, 2:08:47 PM
to mongodb-user
I have restarted several times already.
Here is the sharding status. As you can see, there is no shard0005 among the
shards, but we still have its chunks.

--- Sharding Status ---
sharding version: { "_id" : 1, "version" : 3 }
shards:
{ "_id" : "shard0000", "host" : "localhost:27023" }
{ "_id" : "shard0001", "host" : "localhost:27024" }
{ "_id" : "shard0002", "host" : "localhost:27025" }
{ "_id" : "shard0003", "host" : "localhost:27026" }
{ "_id" : "shard0004", "host" : "localhost:27027" }
{ "_id" : "shard0006", "host" : "localhost:27029" }
{ "_id" : "shard0007", "host" : "localhost:27028" }
{ "_id" : "shard0008", "host" : "localhost:27030" }
{ "_id" : "shard0009", "host" : "localhost:27031" }
{ "_id" : "shard0010", "host" : "localhost:27032" }
{ "_id" : "shard0011", "host" : "localhost:27033" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" :
"config" }
{ "_id" : "ptwdb", "partitioned" : true, "primary" :
"shard0000" }
ptwdb.profiles chunks:
shard0003 6
shard0006 5
shard0005 2
shard0002 5
shard0004 5
shard0000 5
shard0001 5
too many chunksn to print, use verbose if you
want to force print

Eliot Horowitz

May 29, 2011, 4:08:37 PM
to mongod...@googlegroups.com
How did you remove it?

lwolf

May 30, 2011, 1:33:10 AM
to mongodb-user
db.runCommand({removeShard : "localhost:27025"})

We tried to delete four shards and then add another four.
The original shard list was:
shard0000
shard0001
shard0002
shard0003
shard0004
shard0005
shard0006
shard0007
shard0008
shard0009
All chunks were reallocated successfully. Then we added four shards,
but when adding them mongo skipped shard0005 and created shard0010. So we got:
shard0000
shard0001
shard0002
shard0003
shard0004
shard0006
shard0007
shard0008
shard0009
shard0010
Then we restarted the whole system and got these errors in the logs.
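
For reference, removeShard is normally re-run and polled until it reports state
"completed" before the shard process is taken down. A minimal shell sketch (run
against the admin database via mongos; the port is the one used above):

db.runCommand({ removeShard: "localhost:27025" })  // first call: state "started", draining begins
db.runCommand({ removeShard: "localhost:27025" })  // later calls: state "ongoing", with a remaining chunk count
// ...repeat until...
db.runCommand({ removeShard: "localhost:27025" })  // state "completed": safe to shut the shard down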

On May 29, 23:08, Eliot Horowitz <eliothorow...@gmail.com> wrote:
> How did you remove it?
>

Nat

May 30, 2011, 1:41:15 AM
to mongod...@googlegroups.com
Can you do

db.chunks.find()
db.shards.find()
db.collections.find()
db.databases.find()


from the config database?
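
For example, from a mongo shell connected to the mongos (assuming the 27021
mongos port used elsewhere in this thread):

mongo localhost:27021
> use config
> db.chunks.find()
...and so on for the other collections.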

lwolf

May 30, 2011, 3:08:26 AM
to mongodb-user
> db.chunks.find()
{ "_id" : "ptwdb.profiles-id_MinKey", "lastmod" : { "t" : 11000, "i" :
0 }, "ns" : "ptwdb.profiles", "min" : { "id" : { $minKey : 1 } },
"max" : { "id" : 10000232 }, "shard" : "shard0003"}
{ "_id" : "ptwdb.profiles-id_10000232", "lastmod" : { "t" : 12000,
"i" : 8 }, "ns" : "ptwdb.profiles", "min" : { "id" : 10000232 },
"max" : { "id" : 12965542 }, "shard" : "shard0003" }
{ "_id" : "ptwdb.profiles-id_19999812", "lastmod" : { "t" : 3000,
"i" : 0 }, "ns" : "ptwdb.profiles", "min" : { "id" : 19999812 },
"max" : { "id" : { $maxKey : 1 } }, "shard" : "shard0002"}
{ "_id" : "ptwdb.profiles-id_16545025", "lastmod" : { "t" : 26000,
"i" : 0 }, "ns" : "ptwdb.profiles", "min" : { "id" : 16545025 },
"max" : { "id" : 16919222 }, "shard" : "shard0004" }
{ "_id" : "ptwdb.profiles-id_18247542", "lastmod" : { "t" : 15000,
"i" : 10 }, "ns" : "ptwdb.profiles", "min" : { "id" : 18247542 },
"max" : { "id" : 18411115 }, "shard" : "shard0004" }
{ "_id" : "ptwdb.profiles-id_14969559", "lastmod" : { "t" : 36000,
"i" : 1 }, "ns" : "ptwdb.profiles", "min" : { "id" : 14969559 },
"max" : { "id" : 15316620 }, "shard" : "shard0000" }
{ "_id" : "ptwdb.profiles-id_19087159", "lastmod" : { "t" : 16000,
"i" : 1 }, "ns" : "ptwdb.profiles", "min" : { "id" : 19087159 },
"max" : { "id" : 19280812 }, "shard" : "shard0000" }
{ "_id" : "ptwdb.profiles-id_15686124", "lastmod" : { "t" : 25000,
"i" : 0 }, "ns" : "ptwdb.profiles", "min" : { "id" : 15686124 },
"max" : { "id" : 15869399 }, "shard" : "shard0003" }
{ "_id" : "ptwdb.profiles-id_19531793", "lastmod" : { "t" : 35000,
"i" : 0 }, "ns" : "ptwdb.profiles", "min" : { "id" : 19531793 },
"max" : { "id" : 19720352 }, "shard" : "shard0001" }
{ "_id" : "ptwdb.profiles-id_17808967", "lastmod" : { "t" : 14000,
"i" : 1 }, "ns" : "ptwdb.profiles", "min" : { "id" : 17808967 },
"max" : { "id" : 18010241 }, "shard" : "shard0001" }
{ "_id" : "ptwdb.profiles-id_18630821", "lastmod" : { "t" : 13000,
"i" : 0 }, "ns" : "ptwdb.profiles", "min" : { "id" : 18630821 },
"max" : { "id" : 18830527 }, "shard" : "shard0001" }
{ "_id" : "ptwdb.profiles-id_16065814", "lastmod" : { "t" : 12000,
"i" : 14 }, "ns" : "ptwdb.profiles", "min" : { "id" : 16065814 },
"max" : { "id" : 16300044 }, "shard" : "shard0006" }
{ "_id" : "ptwdb.profiles-id_14562189", "lastmod" : { "t" : 24000,
"i" : 0 }, "ns" : "ptwdb.profiles", "min" : { "id" : 14562189 },
"max" : { "id" : 14760990 }, "shard" : "shard0002" }
{ "_id" : "ptwdb.profiles-id_15316620", "lastmod" : { "t" : 22000,
"i" : 0 }, "ns" : "ptwdb.profiles", "min" : { "id" : 15316620 },
"max" : { "id" : 15466665 }, "shard" : "shard0000" }
{ "_id" : "ptwdb.profiles-id_16919222", "lastmod" : { "t" : 27000,
"i" : 0 }, "ns" : "ptwdb.profiles", "min" : { "id" : 16919222 },
"max" : { "id" : 17092785 }, "shard" : "shard0006" }
{ "_id" : "ptwdb.profiles-id_12965542", "lastmod" : { "t" : 12000,
"i" : 9 }, "ns" : "ptwdb.profiles", "min" : { "id" : 12965542 },
"max" : { "id" : 14242854 }, "shard" : "shard0003" }
{ "_id" : "ptwdb.profiles-id_19280812", "lastmod" : { "t" : 15000,
"i" : 6 }, "ns" : "ptwdb.profiles", "min" : { "id" : 19280812 },
"max" : { "id" : 19408360 }, "shard" : "shard0000" }
{ "_id" : "ptwdb.profiles-id_19720352", "lastmod" : { "t" : 17000,
"i" : 1 }, "ns" : "ptwdb.profiles", "min" : { "id" : 19720352 },
"max" : { "id" : 19859338 }, "shard" : "shard0002" }
{ "_id" : "ptwdb.profiles-id_16300044", "lastmod" : { "t" : 12000,
"i" : 15 }, "ns" : "ptwdb.profiles", "min" : { "id" : 16300044 },
"max" : { "id" : 16545025 }, "shard" : "shard0006" }
{ "_id" : "ptwdb.profiles-id_18830527", "lastmod" : { "t" : 36000,
"i" : 2 }, "ns" : "ptwdb.profiles", "min" : { "id" : 18830527 },
"max" : { "id" : 18949176 }, "shard" : "shard0003" }
has more

db.shards.find()
{ "_id" : "shard0000", "host" : "localhost:27023" }
{ "_id" : "shard0001", "host" : "localhost:27024" }
{ "_id" : "shard0002", "host" : "localhost:27025" }
{ "_id" : "shard0003", "host" : "localhost:27026" }
{ "_id" : "shard0004", "host" : "localhost:27027" }
{ "_id" : "shard0006", "host" : "localhost:27029" }
{ "_id" : "shard0007", "host" : "localhost:27028" }
{ "_id" : "shard0008", "host" : "localhost:27030" }
{ "_id" : "shard0009", "host" : "localhost:27031" }
{ "_id" : "shard0010", "host" : "localhost:27032" }
{ "_id" : "shard0011", "host" : "localhost:27033" }
> db.collections.find()
{ "_id" : "ptwdb.profiles", "lastmod" :
ISODate("1970-01-16T02:57:43.459Z"), "dropped" : false, "key" :
{ "id" : 1 }, "unique" : true }
> db.databases.find()
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "ptwdb", "partitioned" : true, "primary" : "shard0000" }

In db.chunks.find() there were 2 chunks from shard0005; I removed them
manually.

Nat

May 30, 2011, 3:21:15 AM
to mongod...@googlegroups.com
The data in the configdb looks correct. Did you make sure to restart all mongos routers and the config servers as well?

lwolf

May 30, 2011, 3:34:50 AM
to mongodb-user
Yes, I've restarted the entire machine.

Nat

May 30, 2011, 5:07:36 AM
to mongod...@googlegroups.com
Can you post mongos log since restart then? Did you see any error message in mongos?

lwolf

May 30, 2011, 2:00:04 PM
to mongodb-user
After some manipulations and several reboots mongo finally started, but
it's not working right.

Before adding/removing shards, every single shard had an equal number of
documents; now all documents are going only to the primary shard.
Here are the stats:
localhost:27021 10390269 - mongos
localhost:27023 10390269 - primary shard
localhost:27024 198012
localhost:27025 131840
localhost:27026 182045
localhost:27027 193034
localhost:27028 59021
localhost:27029 176164
localhost:27030 0 - new shard
localhost:27031 0 - new shard
localhost:27032 0 - new shard

How can I re-enable sharding?

Scott Hernandez

May 30, 2011, 3:15:04 PM
to mongod...@googlegroups.com
Can you run db.printShardingStatus()?

lwolf

May 31, 2011, 1:23:28 AM
to mongodb-user
> db.printShardingStatus()
--- Sharding Status ---
sharding version: { "_id" : 1, "version" : 3 }
shards:
{ "_id" : "shard0000", "host" : "localhost:27023" }
{ "_id" : "shard0001", "host" : "localhost:27024" }
{ "_id" : "shard0002", "host" : "localhost:27025" }
{ "_id" : "shard0003", "host" : "localhost:27026" }
{ "_id" : "shard0004", "host" : "localhost:27027" }
{ "_id" : "shard0007", "host" : "localhost:27028" }
{ "_id" : "shard0008", "host" : "localhost:27030" }
{ "_id" : "shard0009", "host" : "localhost:27031" }
{ "_id" : "shard0010", "host" : "localhost:27032" }
{ "_id" : "shard0011", "host" : "localhost:27033" }
{ "_id" : "shard0006", "host" : "localhost:27029" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" :
"config" }
{ "_id" : "ptwdb", "partitioned" : true, "primary" :
"shard0000" }
ptwdb.profiles chunks:
shard0003 6
shard0006 5
shard0002 5
shard0004 5
shard0000 5
shard0001 5
too many chunksn to print, use verbose if you
want to force print



Nat

May 31, 2011, 2:01:24 AM
to mongod...@googlegroups.com
Something still doesn't feel right.
- Can you check your mongos? Did you start it with the right config server?
- Can you run the following from the config database again?

db.mongos.find()
db.chunks.find()
db.shards.find()
db.collections.find()
db.databases.find()

- Can you send the mongos log file?

lwolf

May 31, 2011, 2:22:13 AM
to mongodb-user
mongo localhost:27021
MongoDB shell version: 1.8.1
connecting to: localhost:27021/test

> show dbs
admin (empty)
config 0.1875GB
ptwdb 42.1552734375GB

> use config
switched to db config

> db.collections.find()
{ "_id" : "ptwdb.profiles", "lastmod" :
ISODate("1970-01-16T02:57:43.459Z"), "dropped" : false, "key" :
{ "id" : 1 }, "unique" : true }

> db.databases.find()
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "ptwdb", "partitioned" : true, "primary" : "shard0000" }

> db.shards.find()
{ "_id" : "shard0000", "host" : "localhost:27023" }
{ "_id" : "shard0001", "host" : "localhost:27024" }
{ "_id" : "shard0002", "host" : "localhost:27025" }
{ "_id" : "shard0003", "host" : "localhost:27026" }
{ "_id" : "shard0004", "host" : "localhost:27027" }
{ "_id" : "shard0007", "host" : "localhost:27028" }
{ "_id" : "shard0008", "host" : "localhost:27030" }
{ "_id" : "shard0009", "host" : "localhost:27031" }
{ "_id" : "shard0010", "host" : "localhost:27032" }
{ "_id" : "shard0011", "host" : "localhost:27033" }
{ "_id" : "shard0006", "draining" : true, "host" : "localhost:
27029" }
> db.mongos.find()
{ "_id" : "woodpecker:27021", "ping" :
ISODate("2011-05-31T06:15:15.407Z"), "up" : 71455 }


Mongos log - http://dl.dropbox.com/u/18466806/mongos.log.bz2

Scott Hernandez

May 31, 2011, 2:34:34 AM
to mongod...@googlegroups.com
It seems like you have a lot of data which is not in the sharded "ptwdb.profiles" collection. Can you run the second script here: http://www.mongodb.org/display/DOCS/Excessive+Disk+Space#ExcessiveDiskSpace-Helpfulscripts  It will print out the stats for each collection.

Can you also connect directly to localhost:27029 and run "use ptwdb; db.stats()", and the script above on this shard?

Does it still show any chunks left on that shard when you run db.printShardingStatus()?

lwolf

May 31, 2011, 2:54:07 AM
to mongodb-user
{
"ns" : "ptwdb.profiles",
"sharded" : false,
"primary" : "shard0000",
"ns" : "ptwdb.profiles",
"count" : 20685870,
"size" : 28973297464,
"avgObjSize" : 1400.6322897707469,
"storageSize" : 31225715312,
"numExtents" : 45,
"nindexes" : 2,
"lastExtentSize" : 2146426864,
"paddingFactor" : 1,
"flags" : 1,
"totalIndexSize" : 2005968768,
"indexSizes" : {
"_id_" : 1133298624,
"id_1" : 872670144
},
"ok" : 1
}
{
"ns" : "ptwdb.system.indexes",
"sharded" : false,
"primary" : "shard0000",
"ns" : "ptwdb.system.indexes",
"count" : 2,
"size" : 156,
"avgObjSize" : 78,
"storageSize" : 4608,
"numExtents" : 1,
"nindexes" : 0,
"lastExtentSize" : 4608,
"paddingFactor" : 1,
"flags" : 0,
"totalIndexSize" : 0,
"indexSizes" : {

},
"ok" : 1
}
{
"ns" : "config.changelog",
"sharded" : false,
"primary" : "config",
"ns" : "config.changelog",
"count" : 176,
"size" : 49580,
"avgObjSize" : 281.70454545454544,
"storageSize" : 10486016,
"numExtents" : 1,
"nindexes" : 0,
"lastExtentSize" : 10486016,
"paddingFactor" : 1,
"flags" : 0,
"totalIndexSize" : 0,
"indexSizes" : {

},
"capped" : 1,
"max" : 2147483647,
"ok" : 1
}
{
"ns" : "config.chunks",
"sharded" : false,
"primary" : "config",
"ns" : "config.chunks",
"count" : 31,
"size" : 5076,
"avgObjSize" : 163.74193548387098,
"storageSize" : 8192,
"numExtents" : 1,
"nindexes" : 4,
"lastExtentSize" : 8192,
"paddingFactor" : 1.0099999999999996,
"flags" : 1,
"totalIndexSize" : 32768,
"indexSizes" : {
"_id_" : 8192,
"ns_1_min_1" : 8192,
"ns_1_shard_1_min_1" : 8192,
"ns_1_lastmod_1" : 8192
},
"ok" : 1
}
{
"ns" : "config.collections",
"sharded" : false,
"primary" : "config",
"ns" : "config.collections",
"count" : 1,
"size" : 88,
"avgObjSize" : 88,
"storageSize" : 5376,
"numExtents" : 1,
"nindexes" : 1,
"lastExtentSize" : 5376,
"paddingFactor" : 1,
"flags" : 1,
"totalIndexSize" : 8192,
"indexSizes" : {
"_id_" : 8192
},
"ok" : 1
}
{
"ns" : "config.databases",
"sharded" : false,
"primary" : "config",
"ns" : "config.databases",
"count" : 3,
"size" : 172,
"avgObjSize" : 57.333333333333336,
"storageSize" : 3328,
"numExtents" : 1,
"nindexes" : 1,
"lastExtentSize" : 3328,
"paddingFactor" : 1,
"flags" : 1,
"totalIndexSize" : 8192,
"indexSizes" : {
"_id_" : 8192
},
"ok" : 1
}
{
"ns" : "config.lockpings",
"sharded" : false,
"primary" : "config",
"ns" : "config.lockpings",
"count" : 18,
"size" : 1132,
"avgObjSize" : 62.888888888888886,
"storageSize" : 3840,
"numExtents" : 1,
"nindexes" : 2,
"lastExtentSize" : 3840,
"paddingFactor" : 1,
"flags" : 1,
"totalIndexSize" : 16384,
"indexSizes" : {
"_id_" : 8192,
"ping_1" : 8192
},
"ok" : 1
}
{
"ns" : "config.locks",
"sharded" : false,
"primary" : "config",
"ns" : "config.locks",
"count" : 2,
"size" : 656,
"avgObjSize" : 328,
"storageSize" : 2816,
"numExtents" : 1,
"nindexes" : 1,
"lastExtentSize" : 2816,
"paddingFactor" : 1.1799999999999997,
"flags" : 1,
"totalIndexSize" : 8192,
"indexSizes" : {
"_id_" : 8192
},
"ok" : 1
}
{
"ns" : "config.mongos",
"sharded" : false,
"primary" : "config",
"ns" : "config.mongos",
"count" : 1,
"size" : 56,
"avgObjSize" : 56,
"storageSize" : 3328,
"numExtents" : 1,
"nindexes" : 1,
"lastExtentSize" : 3328,
"paddingFactor" : 1,
"flags" : 1,
"totalIndexSize" : 8192,
"indexSizes" : {
"_id_" : 8192
},
"ok" : 1
}
{
"ns" : "config.settings",
"sharded" : false,
"primary" : "config",
"ns" : "config.settings",
"count" : 1,
"size" : 36,
"avgObjSize" : 36,
"storageSize" : 2048,
"numExtents" : 1,
"nindexes" : 1,
"lastExtentSize" : 2048,
"paddingFactor" : 1,
"flags" : 1,
"totalIndexSize" : 8192,
"indexSizes" : {
"_id_" : 8192
},
"ok" : 1
}
{
"ns" : "config.shards",
"sharded" : false,
"primary" : "config",
"ns" : "config.shards",
"count" : 11,
"size" : 820,
"avgObjSize" : 74.54545454545455,
"storageSize" : 8192,
"numExtents" : 1,
"nindexes" : 2,
"lastExtentSize" : 8192,
"paddingFactor" : 1.5,
"flags" : 1,
"totalIndexSize" : 16384,
"indexSizes" : {
"_id_" : 8192,
"host_1" : 8192
},
"ok" : 1
}
{
"ns" : "config.system.indexes",
"sharded" : false,
"primary" : "config",

"ns" : "config.system.indexes",
"count" : 14,
"size" : 1096,
"avgObjSize" : 78.28571428571429,
"storageSize" : 3840,
"numExtents" : 1,
"nindexes" : 0,
"lastExtentSize" : 3840,
"paddingFactor" : 1,
"flags" : 0,
"totalIndexSize" : 0,
"indexSizes" : {

},
"ok" : 1
}

// Prints stats for every collection in every database:
db._adminCommand("listDatabases").databases.forEach(function (d) {
    var mdb = db.getSiblingDB(d.name);
    mdb.getCollectionNames().forEach(function (c) {
        printjson(mdb[c].stats());
    });
});



mongo localhost:27029
MongoDB shell version: 1.8.1
connecting to: localhost:27029/test
> use ptwdb;
switched to db ptwdb
> db.stats()
{
"db" : "ptwdb",
"collections" : 3,
"objects" : 176170,
"avgObjSize" : 1579.4102514616563,
"dataSize" : 278244704,
"storageSize" : 334815488,
"numExtents" : 19,
"indexes" : 2,
"indexSize" : 17113088,
"fileSize" : 1006632960,
"ok" : 1
}




Scott Hernandez

May 31, 2011, 3:02:29 AM
to mongod...@googlegroups.com
I assume you ran this directly on the shard, is that correct? What does db.printShardingStatus() show?



lwolf

May 31, 2011, 3:11:22 AM
to mongodb-user
The script was run on mongos and db.stats() on the shard, if that's what you're
asking.

And here is the sharding status:

> db.printShardingStatus()
--- Sharding Status ---
sharding version: { "_id" : 1, "version" : 3 }
shards:
{ "_id" : "shard0000", "host" : "localhost:27023" }
{ "_id" : "shard0001", "host" : "localhost:27024" }
{ "_id" : "shard0002", "host" : "localhost:27025" }
{ "_id" : "shard0003", "host" : "localhost:27026" }
{ "_id" : "shard0004", "host" : "localhost:27027" }
{ "_id" : "shard0007", "host" : "localhost:27028" }
{ "_id" : "shard0008", "host" : "localhost:27030" }
{ "_id" : "shard0009", "host" : "localhost:27031" }
{ "_id" : "shard0010", "host" : "localhost:27032" }
{ "_id" : "shard0011", "host" : "localhost:27033" }
{ "_id" : "shard0006", "draining" : true, "host" : "localhost:
27029" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" :
"config" }
{ "_id" : "ptwdb", "partitioned" : true, "primary" :
"shard0000" }
ptwdb.profiles chunks:
shard0003 6
shard0006 5
shard0002 5
shard0004 5
shard0000 5
shard0001 5
too many chunksn to print, use verbose if you
want to force print
{ "_id" : "test", "partitioned" : false, "primary" :
"shard0008" }

Nat

May 31, 2011, 3:13:21 AM
to mongod...@googlegroups.com
It looks like you dropped certain chunks manually. That leaves some holes in the chunks collection. That's why you are getting the error.

lwolf

May 31, 2011, 3:33:15 AM
to mongodb-user
Yes, I wrote about this before, but I deleted those chunks after a
successful rebalance.
Is it possible to fix my db?

Nat

May 31, 2011, 3:47:41 AM
to mongod...@googlegroups.com
You would need to update the chunks collection to cover the entire $minKey to $maxKey range without any gaps. Make sure it's consistent with your data in each shard.

lwolf

May 31, 2011, 3:54:10 AM
to mongodb-user
Glad that the case is not hopeless.

Can you please tell me how to do this?

Nat

May 31, 2011, 3:58:06 AM
to mongod...@googlegroups.com
You can iterate through db.chunks.find().sort({"min":1}), see where the holes are, and make the changes needed to ensure that you don't have any holes anymore.
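
A rough, read-only sketch of such a scan from the mongo shell (assuming the
ptwdb.profiles namespace and the id shard key used in this thread):

var cfg = db.getSiblingDB("config");
var prev = null;
cfg.chunks.find({ ns: "ptwdb.profiles" }).sort({ min: 1 }).forEach(function (c) {
    // each chunk's min must equal the previous chunk's max; otherwise there is a gap or overlap
    if (prev && tojson(prev.max) != tojson(c.min)) {
        print("hole or overlap between " + tojson(prev.max) + " and " + tojson(c.min));
    }
    prev = c;
});
// Also check the endpoints: the first chunk's min should be { id: MinKey }
// and the last chunk's max should be { id: MaxKey }.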

lwolf

May 31, 2011, 8:14:24 AM
to mongodb-user
I fixed the issue with the gaps but it still doesn't work...
It looks like something is wrong with the balancer. Is it possible to reinitialize
the balancer manually?

Here is the mongos log:

Tue May 31 15:11:42 [Balancer] Assertion: 10181:not
sharded:ptwdb.profiles
0x51f4a9 0x5ecf19 0x681f69 0x688671 0x5035ab 0x504e64 0x69ec30
0x7f55c2bf78ba 0x7f55c21b302d
mongos(_ZN5mongo11msgassertedEiPKc+0x129) [0x51f4a9]
mongos(_ZN5mongo8DBConfig15getChunkManagerERKSsb+0x119) [0x5ecf19]

mongos(_ZN5mongo8Balancer11_moveChunksEPKSt6vectorIN5boost10shared_ptrINS_14BalancerPolicy9ChunkInfoEEESaIS6_EE
+0xe9) [0x681f69]
mongos(_ZN5mongo8Balancer3runEv+0x761) [0x688671]

mongos(_ZN5mongo13BackgroundJob7jobBodyEN5boost10shared_ptrINS0_9JobStatusEEE
+0x12b) [0x5035ab]

mongos(_ZN5boost6detail11thread_dataINS_3_bi6bind_tIvNS_4_mfi3mf1IvN5mongo13BackgroundJobENS_10shared_ptrINS7_9JobStatusEEEEENS2_5list2INS2_5valueIPS7_EENS
$
mongos(thread_proxy+0x80) [0x69ec30]
/lib/libpthread.so.0(+0x68ba) [0x7f55c2bf78ba]
/lib/libc.so.6(clone+0x6d) [0x7f55c21b302d]
Tue May 31 15:11:42 [Balancer] ~ScopedDbConnection: _conn != null
Tue May 31 15:11:42 [Balancer] caught exception while doing balance:
not sharded:ptwdb.profiles
Tue May 31 15:12:12 [Balancer] chose [shard0006] to [shard0007] { _id:
"ptwdb.profiles-id_14380840", lastmod: Timestamp 33000|0, ns:
"ptwdb.profiles", min: $
Tue May 31 15:12:12 [Balancer] Assertion: 10181:not
sharded:ptwdb.profiles
0x51f4a9 0x5ecf19 0x681f69 0x688671 0x5035ab 0x504e64 0x69ec30
0x7f55c2bf78ba 0x7f55c21b302d
mongos(_ZN5mongo11msgassertedEiPKc+0x129) [0x51f4a9]
mongos(_ZN5mongo8DBConfig15getChunkManagerERKSsb+0x119) [0x5ecf19]

mongos(_ZN5mongo8Balancer11_moveChunksEPKSt6vectorIN5boost10shared_ptrINS_14BalancerPolicy9ChunkInfoEEESaIS6_EE
+0xe9) [0x681f69]
mongos(_ZN5mongo8Balancer3runEv+0x761) [0x688671]

mongos(_ZN5mongo13BackgroundJob7jobBodyEN5boost10shared_ptrINS0_9JobStatusEEE
+0x12b) [0x5035ab]

mongos(_ZN5boost6detail11thread_dataINS_3_bi6bind_tIvNS_4_mfi3mf1IvN5mongo13BackgroundJobENS_10shared_ptrINS7_9JobStatusEEEEENS2_5list2INS2_5valueIPS7_EENS
$
mongos(thread_proxy+0x80) [0x69ec30]
/lib/libpthread.so.0(+0x68ba) [0x7f55c2bf78ba]
/lib/libc.so.6(clone+0x6d) [0x7f55c21b302d]
Tue May 31 15:12:12 [Balancer] ~ScopedDbConnection: _conn != null
Tue May 31 15:12:12 [Balancer] caught exception while doing balance:
not sharded:ptwdb.profiles

Nat

May 31, 2011, 8:23:52 AM
to mongod...@googlegroups.com
There should be one more assert before that. Did you see it in the log file? Can you also list your chunks now?
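
If it helps while repairing the metadata, the balancer can also be paused and
resumed by flipping the stopped flag in config.settings (a generic sketch, not
specific to this cluster):

> use config
> db.settings.update({ _id: "balancer" }, { $set: { stopped: true } }, true)   // pause balancing
// ...fix config.chunks...
> db.settings.update({ _id: "balancer" }, { $set: { stopped: false } }, true)  // resume balancing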

lwolf

May 31, 2011, 8:55:47 AM
to mongodb-user

What exactly should I look for?
BTW, here is the full log - http://dl.dropbox.com/u/18466806/mongos.log.bz2

I also tried to move chunks manually but got an error:

> db.adminCommand({ moveChunk: "ptwdb.profiles", find: {id:14380841}, to: "shard008"})
{
"ok" : 0,
"errmsg" : "ns not sharded. have to shard before can move a
chunk"

Nat

May 31, 2011, 10:15:25 AM
to mongod...@googlegroups.com
Now your chunks config has overlapping data. I think it's best to dump the data from each shard and import it into a new shard setup instead of messing around with the config.
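
A rough outline of that approach (the ports follow the ones in this thread;
the dump paths are illustrative):

# dump the collection from each shard's mongod, one dump directory per shard
mongodump --host localhost:27023 --db ptwdb --collection profiles --out /backup/shard0000
mongodump --host localhost:27024 --db ptwdb --collection profiles --out /backup/shard0001
# ...repeat for the remaining shards...

# after building the new cluster and sharding ptwdb.profiles again,
# restore each dump through a mongos so inserts are routed by the new chunk map
mongorestore --host localhost:27021 /backup/shard0000
mongorestore --host localhost:27021 /backup/shard0001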

lwolf

May 31, 2011, 1:16:39 PM
to mongodb-user
Actually this is the second setup. When I faced this problem the first
time, I thought I had done something wrong and created another setup.
I will try to update to 1.8.2 or 1.9 and test this situation again.

Guys, I appreciate your time.
Thanks.