I heavily use findandmodify to increment a userid counter in a one-
document collection (to create ascending unique userids as needed),
and I assume that this is where the JS execution comes from, since
other queries are very simple. This is the relevant application code:
inc_command = pymongo.son.SON([('findandmodify',
self.user_collection), ('update', {'$inc':{'userid':1}}), ('new',
True)])
result = mdb.command(inc_command)
return result['value']['userid']
The mongod environment had ulimit -n 8192 and ulimit -v unlimited.
Timeline:
14:00:39 May 24: Server started after upgrade to 1.4.2 from 1.4.0.
Immediately ran database repair after startup since server had shut
down uncleanly before (same pattern of JS Errors followed by segfault
killed 1.4.0).
06:59:53 May 25: First "JS Error: out of memory" message in the log.
This time roughly corresponds to our servers starting to receive heavy
morning traffic after the nighttime quiet period.
09:59:57 May 25: Logged signal 11 (segmentation fault) after a total
of 31475 JS Errors.
09:59:58 May 25: Mongo really exited.
I was able to observe mongod in the 15 minutes or so before it died.
During this time CPU use was pinned at 100%. Top showed 30.6 GB memory
virtual, only 1.5 GB resident. The output of db.currentOp() showed
dozens of operations waiting for the write lock, all of them like
queries like this:
"query" : {"$eval" : function cf__1_anon() {return
db.runCommand({findandmodify:"users", update:{$inc:{userid:1}},
'new':true});
The function names were function cf__1_anon(), cf__2_anon(), etc. At
one point I saw up to cf__72_anon.
I tried db.eval( "gc()" ) a few minutes before the crash as suggested
in another thread. It just returned "null" which appears to be the
same response I get from an unloaded brand-new mongod instance.
Unfortunately I neglected to save the output of db.serverStatus(). I
remember from it that I had slightly over 2000 connections open; the
number seemed a bit high but was well short of the file descriptor
limit. I do have a web admin interface page saved from a few minutes
before it died. One weird thing is that the SecsRunning column for
several waiting queries showed values over 1.2 million, but the server
had been running for less than a day when it crashed.
I would be more than happy to provide a full copy of logs, our data
directory, and the saved currentOp and status pages I've mentioned
above, as long as it can be uploaded somewhere private. I am also
happy to provide any further details that would be helpful in
diagnosing this problem.
Startup info:
Mon May 24 14:00:39 Mongo DB : starting : pid = 6326 port = 27017
dbpath = /usr/mongodb/data/ master = 0 slave = 0
64-bit
Mon May 24 14:00:40 db version v1.4.2, pdfile version 4.5
Mon May 24 14:00:40 git version:
53749fc2d547a3139fcf169d84d58442778ea4b0
Mon May 24 14:00:40 sys info: Linux domU-12-31-39-06-79-A1
2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST
2009 x86_64 BOOST_LIB_VERSION=1_41
First logged error:
Tue May 25 06:59:53 JS Error: out of memory
Tue May 25 06:59:53 Assertion: 10433:js init failed
0x4fe064 0x54e621 0x542973 0x537307 0x536139 0x5dd7c4 0x5deaef
0x67e6a1 0x67f5a7 0x5565bf 0x55a65c 0x5f309a 0x5f80c7
0x6a9af4 0x6bc1f0 0x3b36c06307 0x3b364d1ded
/usr/local/mongodb142/bin/mongod(_ZN5mongo11msgassertedEiPKc+0x204)
[0x4fe064]
/usr/local/mongodb142/bin/mongod(_ZN5mongo7SMScopeC1Ev+0x231)
[0x54e621]
/usr/local/mongodb142/bin/mongod(_ZN5mongo8SMEngine11createScopeEv
+0x23) [0x542973]
/usr/local/mongodb142/bin/mongod(_ZN5mongo12ScriptEngine8newScopeEv
+0x17) [0x537307]
/usr/local/mongodb142/bin/
mongod(_ZN5mongo12ScriptEngine14getPooledScopeERKSs+0x249) [0x536139]
/usr/local/mongodb142/bin/
mongod(_ZN5mongo6dbEvalEPKcRNS_7BSONObjERNS_14BSONObjBuilderERSs+0xb4)
[0x5dd7c4]
/usr/local/mongodb142/bin/
mongod(_ZN5mongo7CmdEval3runEPKcRNS_7BSONObjERSsRNS_14BSONObjBuilderEb
+0x1bf) [0x5deaef]
/usr/local/mongodb142/bin/
mongod(_ZN5mongo11execCommandEPNS_7CommandERNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuil
derEb+0x821) [0x67e6a1]
/usr/local/mongodb142/bin/
mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_10BufBuilderERNS_14BSONObjBuilderEbi
+0
x4d7) [0x67f5a7]
/usr/local/mongodb142/bin/
mongod(_ZN5mongo11runCommandsEPKcRNS_7BSONObjERNS_5CurOpERNS_10BufBuilderERNS_14BSONObjBu
ilderEbi+0x3f) [0x5565bf]
/usr/local/mongodb142/bin/
mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpE
+0x116c) [0x55a65c]
/usr/local/mongodb142/bin/mongod [0x5f309a]
/usr/local/mongodb142/bin/
mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERK11sockaddr_in
+0x1227)
[0x5f80c7]
/usr/local/mongodb142/bin/mongod(_ZN5mongo10connThreadEv+0x244)
[0x6a9af4]
/usr/local/mongodb142/bin/mongod(thread_proxy+0x80) [0x6bc1f0]
/lib64/libpthread.so.0 [0x3b36c06307]
/lib64/libc.so.6(clone+0x6d) [0x3b364d1ded]
Dying:
Tue May 25 09:59:57 Got signal: 11 (Segmentation fault).
Tue May 25 09:59:57 Backtrace:
0x6a8569 0x3b364301b0 0x752d90 0x75344c 0x751500 0x783e95 0x78538c
0x7c5df8 0x78868d 0x771565 0x7571d0 0x757619 0x71
4fa2 0x5540e2 0x5ddab2 0x5deaef 0x67e6a1 0x67f5a7 0x5565bf 0x55a65c
/usr/local/mongodb142/bin/mongod(_ZN5mongo10abruptQuitEi+0x399)
[0x6a8569]
/lib64/libc.so.6 [0x3b364301b0]
/usr/local/mongodb142/bin/mongod(js_MarkStackFrame+0x191) [0x752d90]
/usr/local/mongodb142/bin/mongod(js_GC+0x3b8) [0x75344c]
/usr/local/mongodb142/bin/mongod(js_NewGCThing+0xd7) [0x751500]
/usr/local/mongodb142/bin/mongod [0x783e95]
/usr/local/mongodb142/bin/mongod(js_AllocSlot+0x1a0) [0x78538c]
/usr/local/mongodb142/bin/mongod(js_AddScopeProperty+0x837)
[0x7c5df8]
/usr/local/mongodb142/bin/mongod(js_SetProperty+0x6df) [0x78868d]
/usr/local/mongodb142/bin/mongod(js_Interpret+0x1885b) [0x771565]
/usr/local/mongodb142/bin/mongod(js_Invoke+0xef2) [0x7571d0]
/usr/local/mongodb142/bin/mongod(js_InternalInvoke+0x189) [0x757619]
/usr/local/mongodb142/bin/mongod(JS_CallFunction+0x56) [0x714fa2]
/usr/local/mongodb142/bin/
mongod(_ZN5mongo7SMScope6invokeEP10JSFunctionRKNS_7BSONObjEib+0x262)
[0x5540e2]
/usr/local/mongodb142/bin/
mongod(_ZN5mongo6dbEvalEPKcRNS_7BSONObjERNS_14BSONObjBuilderERSs
+0x3a2) [0x5ddab2]
/usr/local/mongodb142/bin/
mongod(_ZN5mongo7CmdEval3runEPKcRNS_7BSONObjERSsRNS_14BSONObjBuilderEb
+0x1bf) [0x5deaef]
/usr/local/mongodb142/bin/
mongod(_ZN5mongo11execCommandEPNS_7CommandERNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuil
derEb+0x821) [0x67e6a1]
/usr/local/mongodb142/bin/
mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_10BufBuilderERNS_14BSONObjBuilderEbi
+0
x4d7) [0x67f5a7]
/usr/local/mongodb142/bin/
mongod(_ZN5mongo11runCommandsEPKcRNS_7BSONObjERNS_5CurOpERNS_10BufBuilderERNS_14BSONObjBu
ilderEbi+0x3f) [0x5565bf]
/usr/local/mongodb142/bin/
mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpE
+0x116c) [0x55a65c]
Tue May 25 09:59:57 dbexit: ...
Matt
I believe that's the only JS. As above, the relevant application code
looks like this:
inc_command = pymongo.son.SON([('findandmodify',
self.user_collection), ('update', {'$inc':{'userid':1}}), ('new',
True)])
result = mdb.command(inc_command)
return result['value']['userid']
If there's a better way to use findandmodify from pymongo, one that
doesn't involve JS, I'm happy to use it. I grepped the pymongo source
and the only mention of findandmodify was in the database command
method, and I didn't find alternatives after a few minutes searching
the web, so I went with the command method. The findandmodify wiki
page in fact suggests running findandmodify as a command if your
driver does not have a helper function for it:
http://www.mongodb.org/display/DOCS/findandmodify+Command
Matt
were you using an eval before? or something else?
Scratch that.
Looks like there's some out-of-date code in place on the production
nodes.
I actually saw this bug quite some time ago and fixed it by changing
the eval to the pymongo command, and all was well. When I saw this
sort of crash return recently I thought there must be some new factor
at work since I "knew" explicit evals were not being used anymore.
On May 25, 11:16 pm, Eliot Horowitz <eliothorow...@gmail.com> wrote:
> That looks ok - but something caused this:
> {"$eval" : function cf__1_anon() {return
> db.runCommand({findandmodify:"users", update:{$inc:{userid:1}},
> 'new':true});
>
> were you using an eval before? or something else?
>