saving chunks failed when trying to split, and now db throws assertion s/chunk.cpp:768


oferfort

Oct 22, 2010, 3:16:46 AM
to mongodb-user
please help,
i tried splitting my biggest chunk in one of my collections, and i got
this reply:

> db.runCommand({split:"storage.object_interaction", middle:{_id:877070867}})
{
"assertion" : "saving chunks failed. cmd: { applyOps: [ { op: \"u\",
b: true, ns: \"config.chunks\", o: { _id:
\"storage.object_interaction-_id_832352546\", lastmod: Timestamp 179000|4,
ns: \"storage.object_interaction\", min: { _id: 832352546 }, max:
{ _id: 832352546.0 }, shard: \"shard0001\" }, o2: { _id:
\"storage.object_interaction-_id_832352546\" } }, { op: \"u\", b: true,
ns: \"config.chunks\", o: { _id:
\"storage.object_interaction-_id_832352546.0\", lastmod: Timestamp
179000|5, ns: \"storage.object_interaction\", min: { _id: 832352546.0 },
max: { _id: 832352547.0 }, shard: \"shard0001\" }, o2: { _id:
\"storage.object_interaction-_id_832352546.0\" } }, { op: \"u\", b: true,
ns: \"config.chunks\", o: { _id:
\"storage.object_interaction-_id_877070867\", lastmod: Timestamp 179000|6,
ns: \"storage.object_interaction\", min: { _id: 877070867 }, max:
{ _id: 877070867.0 }, shard: \"shard0001\" }, o2: { _id:
\"storage.object_interaction-_id_877070867\" } }, { op: \"u\", b: true,
ns: \"config.chunks\", o:
{ _id: \"storage.object_interaction-_id_877070867.0\", lastmod:
Timestamp 179000|7, ns: \"storage.object_interaction\", min: { _id:
877070867.0 }, max: { _id: 877217527 }, shard: \"shard0001\" }, o2:
{ _id: \"storage.object_interaction-_id_877070867.0\" } } ],
preCondition: [ { ns: \"config.chunks\", q: { query: { ns:
\"storage.object_interaction\" }, orderby: { lastmod: -1 } }, res:
{ lastmod: Timestamp 179000|3 } } ] } result: { got: { _id:
\"storage.object_interaction-_id_832352546\", lastmod: Timestamp
179000|4, ns: \"storage.object_interaction\", min: { _id: 832352546 },
max: { _id: 832352546.0 }, shard: \"shard0001\" }, whatFailed: { ns:
\"config.chunks\", q: { query: { ns: \"storage.object_interaction\" },
orderby: { lastmod: -1 } }, res: { lastmod: Timestamp 179000|3 } },
errmsg: \"pre-condition failed\", ok: 0.0 }",
"assertionCode" : 13327,
"errmsg" : "db assertion failure",
"ok" : 0
}

after that i tried it this way:

> db.runCommand({split:"storage.object_interaction", find:{_id:854711706}})
{
"assertion" : "Couldn't load a valid config for
storage.object_interaction after 3 tries. Giving up",
"assertionCode" : 13282,
"errmsg" : "db assertion failure",
"ok" : 0
}

and now i can't do anything, it seems like the collection is corrupted.
when i try a find i get:
> db.object_interaction.findOne({_id:854711706})
Fri Oct 22 00:04:24 uncaught exception: error { "$err" : "assertion s/chunk.cpp:768", "code" : 0 }


help please, this is the production DB!!!

Ofer Fort

Oct 22, 2010, 3:42:13 AM
to mongodb-user
i think that what caused it might be that the id i gave for the split was the id that was in the end and start of a chunk?

                        { "_id" : NumberLong(811572880) } -->> { "_id" : NumberLong(832352546) } on : shard0001 { "t" : 146000, "i" : 56 }
                        { "_id" : NumberLong(832352546) } -->> { "_id" : NumberLong(877070867) } on : shard0001 { "t" : 148000, "i" : 2 }

is there anything i can do now?
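
The guess above can be checked before running a split. Below is a minimal sketch, assuming chunks are half-open ranges `[min, max)` on a numeric `_id` shard key. The helper `chunkForSplit` is hypothetical and not part of MongoDB; the two chunks are copied from the listing above.

```javascript
// Hypothetical helper (not a MongoDB API): returns the chunk that a split at
// `key` would divide, or null when `key` is not strictly inside any chunk.
// A key equal to a chunk's min or max would leave an empty chunk behind.
function chunkForSplit(chunks, key) {
  for (const c of chunks) {
    if (key > c.min && key < c.max) {
      return c;
    }
  }
  return null;
}

// The two chunks quoted in this message.
const chunks = [
  { min: 811572880, max: 832352546 },
  { min: 832352546, max: 877070867 },
];

console.log(chunkForSplit(chunks, 877070867)); // boundary key: no chunk qualifies
console.log(chunkForSplit(chunks, 854711706)); // interior key: the second chunk
```

Under this check, the key used with `middle` (877070867) sits exactly on a chunk boundary, while the key used with `find` (854711706) lies inside the second chunk.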




Eliot Horowitz

Oct 22, 2010, 7:51:18 AM
to mongod...@googlegroups.com
Can you send the entire chunk list?
We can manually fix it if that's the issue.

Erez Zarum

Oct 22, 2010, 8:04:45 AM
to mongod...@googlegroups.com
Hello Eliot,
I work with Ofer,
I'm attaching the entire chunklist.
Please note the chunk on line 1779; it seems different from the others.
That seems to be the issue. If it's possible to fix it we would be very, very
happy: our website is currently down because of this and we have shut down our
service for the moment.

Thanks!,
Erez.

chunklist

Eliot Horowitz

Oct 22, 2010, 8:06:41 AM
to mongod...@googlegroups.com
Yeah - seems like it.

So you can do

db.chunks.update( { "min._id" : 832352546 } ,{ $set : { "max._id" :
877070867 } } )
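
A minimal sketch of why this update repairs the chunk list, assuming the stored chunk had `min` and `max` both at 832352546 (as the `got` document in the failed split suggests) and that the next chunk starts at 877070867. The helper `findChunkProblems` is hypothetical, and the sample ranges are illustrative values taken from this thread rather than the full chunk list.

```javascript
// Hypothetical checker (not a MongoDB API): reports zero-width chunks and
// neighbours whose ranges do not meet (a gap or an overlap).
function findChunkProblems(chunks) {
  const sorted = [...chunks].sort((a, b) => a.min - b.min);
  const problems = [];
  sorted.forEach((c, i) => {
    if (c.min >= c.max) {
      problems.push(`empty or inverted chunk at min ${c.min}`);
    }
    const next = sorted[i + 1];
    if (next && c.max !== next.min) {
      problems.push(`chunk ending at ${c.max} does not meet chunk starting at ${next.min}`);
    }
  });
  return problems;
}

// Assumed state before the manual update: the middle chunk is empty.
const broken = [
  { min: 811572880, max: 832352546 },
  { min: 832352546, max: 832352546 },
  { min: 877070867, max: 877217527 },
];

// State after setting max._id to 877070867 on the middle chunk.
const fixed = [
  { min: 811572880, max: 832352546 },
  { min: 832352546, max: 877070867 },
  { min: 877070867, max: 877217527 },
];

console.log(findChunkProblems(broken)); // one empty chunk and one gap
console.log(findChunkProblems(fixed));  // no problems
```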

Erez Zarum

Oct 22, 2010, 8:12:40 AM
to mongod...@googlegroups.com
I ran this, and it didn't return any output.
How can I know if it affected anything?

Thanks,
Erez.

Eliot Horowitz

Oct 22, 2010, 8:12:42 AM
to mongod...@googlegroups.com
Look at the chunk list and see if it changed anything.
Did you do that on the config db?

Erez Zarum

Oct 22, 2010, 8:15:42 AM
to mongod...@googlegroups.com
The chunk list is the same, on that line especially.
I ran it through mongos; should I connect specifically to the mongod
config server on port 27019 (in our case)?

Thanks,
Erez.

Eliot Horowitz

Oct 22, 2010, 8:15:03 AM
to mongod...@googlegroups.com
When you ran it through mongos, did you do "use config" first?
You can also run it directly against the config server.
Assuming you have 3, you just have to make sure you run it on all of them.

Erez Zarum

Oct 22, 2010, 8:21:36 AM
to mongod...@googlegroups.com
I tried "use config" through mongos, and I also tried logging in to the
config server directly and running "use config" there.
We currently have only one config server.
Neither seemed to do anything except produce:
Fri Oct 22 02:27:03 [conn45] update config.locks query: { _id: "balancer" } byid 223ms
Fri Oct 22 02:28:05 [conn259] update config.locks query: { _id: "balancer" } byid 571ms
Fri Oct 22 02:53:56 [conn45] update config.locks query: { _id: "balancer" } byid 366ms
Fri Oct 22 02:56:58 [conn259] update config.locks query: { _id: "balancer" } byid 216ms
Fri Oct 22 03:05:35 [conn270] update config.locks query: { _id: "balancer" } byid 1194ms
Fri Oct 22 03:08:08 [conn45] update config.locks query: { _id: "balancer" } byid 177ms
Fri Oct 22 03:11:46 [conn270] update config.locks query: { _id: "balancer" } byid 107ms
Fri Oct 22 03:15:50 [conn45] update config.locks query: { _id: "balancer" } byid 382ms

in the config mongodb process log file.

Thanks,
Erez.

Eliot Horowitz

Oct 22, 2010, 8:23:04 AM
to mongod...@googlegroups.com
Can you do
db.chunks.findOne( { "min._id" : 832352546 } )

Erez Zarum

Oct 22, 2010, 8:26:10 AM
to mongod...@googlegroups.com
Connected through mongos, "use config"
> db.chunks.findOne( { "min._id" : 832352546 } )
{
"_id" : "storage.object_interaction-_id_832352546",
"lastmod" : {
"t" : 179000,
"i" : 4
},
"ns" : "storage.object_interaction",
"min" : {
"_id" : NumberLong(832352546)
},
"max" : {
"_id" : 877070867
},
"shard" : "shard0001"
}

Eliot Horowitz

Oct 22, 2010, 8:26:29 AM
to mongod...@googlegroups.com
It looks like the update succeeded.
Can you now restart a mongos and try normal queries?

Erez Zarum

Oct 22, 2010, 8:31:18 AM
to mongod...@googlegroups.com
Heh :) that seems to work!
But why is there still no NumberLong on this specific max._id?

> db.chunks.findOne( { "min._id" : 832352546 } )
{
"_id" : "storage.object_interaction-_id_832352546",
"lastmod" : {
"t" : 179000,
"i" : 4
},
"ns" : "storage.object_interaction",
"min" : {
"_id" : NumberLong(832352546)
},
"max" : {
"_id" : 877070867
},
"shard" : "shard0001"
}

Thanks,
Erez.

Eliot Horowitz

Oct 22, 2010, 8:30:19 AM
to mongod...@googlegroups.com
When you fixed it via the shell it became an int.
That's fine: the number space is all intermingled, so the type doesn't much matter.

Erez Zarum

Oct 22, 2010, 8:37:10 AM
to mongod...@googlegroups.com
Wow, I really appreciate your help.
Can you explain what we did wrong? We really need to split chunks so that the
data is almost evenly spread between the two shard servers, so it's important
to us to understand how to do it properly.

Thanks,
Erez.

Eliot Horowitz

Oct 22, 2010, 8:36:09 AM
to mongod...@googlegroups.com
Can you send all of the commands you ran and the mongos logs?

Erez Zarum

Oct 22, 2010, 8:39:39 AM
to mongod...@googlegroups.com
The problematic mongos log is around 2.2M. Is it possible to upload it to
this mailing list?

Thanks,
Erez.

Eliot Horowitz

Oct 22, 2010, 8:39:33 AM
to mongod...@googlegroups.com
Probably not the best idea :)
You can open a case @ http://jira.mongodb.org/

Ofer Fort

Oct 23, 2010, 3:34:27 AM
to mongod...@googlegroups.com
http://jira.mongodb.org/browse/SERVER-1989

We would appreciate it if you could tell us what we did wrong, because we
have to do a split. We have index sizes of 30GB and 20GB and we
must find a way to balance them.
