mongo server crash without log

sirpy

Oct 12, 2010, 6:30:40 AM
to mongodb-user
I'm running MongoDB 1.6.2 on Ubuntu. I use the /etc/init.d/mongod
script to start the database. The log file is configured correctly in
/etc/mongod.conf.
The server seems to crash in an environment where there are many
simultaneous requests.
There is no trace of the crash in the log files. I just see that the
drivers stop getting a response from the server and the mongod process
is gone.

What could be the reason for the crash, and how can I make mongo log
the crash reason, or show some kind of trace?

Markus Gattol

Oct 12, 2010, 6:42:56 AM
to mongod...@googlegroups.com

[skipping a lot of lines ...]

sirpy> There is no trace of the crash in the log files. I just see that
sirpy> the drivers stop getting a response from the server and the
sirpy> mongod process is gone.

Hm ... can you try strace mongod please. I am on Debian/Ubuntu too and I
have never seen this (segfault) or anything like it.

Eliot Horowitz

Oct 12, 2010, 9:39:22 AM
to mongod...@googlegroups.com
There was a bug in 1.6.2 that removed some logging.
Can you upgrade to 1.6.3?

> --
> You received this message because you are subscribed to the Google Groups "mongodb-user" group.
> To post to this group, send email to mongod...@googlegroups.com.
> To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.

Tejaswi

Oct 14, 2010, 9:30:33 AM
to mongodb-user
Apologies for jumping in, but I am facing the same problems; and it's
with 1.6.3

I have 20 or so connections, doing both reads and writes to capped and
non-capped collections. The reads use both tailable and regular
cursors from pymongo.

I used a -vvvv level of verbosity in logging; but the last log
messages just look like this:

Thu Oct 14 04:06:25 [conn22] getmore stream_db_p.np_stats_15m cid:4536908767758877386 getMore: { $query: { time: { $gt: new Date(1287012257698) } } } bytes:20 nreturned:0 0ms
Thu Oct 14 04:06:25 [conn30] getmore stream_db_p.at_stats_15m_p_trans cid:793766897252288317 getMore: { $query: { time: { $gt: new Date(1287024412774) } } } bytes:20 nreturned:0 0ms
Thu Oct 14 04:06:25 [conn13] getmore stream_db_p.article_stats_1m cid:987110231666301169 getMore: { $query: { time: { $gt: new Date(1287015557458) } } } bytes:20 nreturned:0 0ms
Thu Oct 14 04:06:25 [conn12] getmore stream_db_p.np_stats_1m cid:3264274421394671960 getMore: { $query: { time: { $gt: new Date(1287015557447) } } } bytes:20 nreturned:0 0ms
Thu Oct 14 04:06:25 [conn7] getmore stream_db_p.np_stats cid:4701366243798518183 getMore: { $query: { time: { $gt: new Date(1287015796894) } } } bytes:20 nreturned:0 0ms
Thu Oct 14 04:06:25 [conn20] getmore stream_db_p.at_stats_5m cid:455373033945667476 getMore: { $query: { time: { $gt: new Date(1287014957506) } } } bytes:20 nreturned:0 0ms

And that's it. The mongod process is dead. I am trying to think of
anything out of the ordinary that I do, beyond typical use cases, but
can't think of anything. One possibility: I issue create_collection
commands every time I start my processes, and if the collection exists,
I ignore the exception. This should be relatively harmless though.

Any help would be appreciated.

-T

Eliot Horowitz

Oct 14, 2010, 9:50:33 AM
to mongod...@googlegroups.com
Can you look in the system log for anything related to mongo?

Tejaswi

Oct 14, 2010, 11:19:23 AM
to mongodb-user
Well, bingo :-)

Oct 14 04:06:25 ip-10-245-210-112 kernel: [823453.107884] 61087 pages reserved
Oct 14 04:06:25 ip-10-245-210-112 kernel: [823453.107885] 17232 pages shared
Oct 14 04:06:25 ip-10-245-210-112 kernel: [823453.107886] 1892425 pages non-shared
Oct 14 04:06:25 ip-10-245-210-112 kernel: [823453.107889] Out of memory: kill process 5659 (mongod) score 584557 or a child
Oct 14 04:06:25 ip-10-245-210-112 kernel: [823453.107905] Killed process 5659 (mongod)

I know exactly what parameters to tweak for this. If the server dies
again, I will let you know. But knowing the true reason, I am sure
it's unlikely to happen.

Thanks a lot again. A great weight off my mind.

-T
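
For anyone else whose mongod disappears without a trace in its own log, here is a minimal Python sketch of the check that found the cause here: scanning system log lines for OOM-killer messages. The two regular expressions are modeled only on the kernel messages quoted in this thread; other kernel versions word these lines differently, so treat the patterns (and the helper name find_oom_kills) as assumptions, not a general parser.

```python
import re

# Patterns modeled on the kernel messages quoted in this thread.
# Assumption: other kernels may phrase these lines differently.
OOM_DECISION = re.compile(r"Out of memory: kill process (\d+) \(([^)]+)\) score (\d+)")
OOM_KILLED = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(lines, process_name="mongod"):
    """Return a list of (pid, score) tuples for OOM kills of process_name.

    score is None when no matching "kill process" line was seen for the pid.
    """
    scores = {}  # pid -> score taken from the "kill process" line
    kills = []
    for line in lines:
        match = OOM_DECISION.search(line)
        if match and match.group(2) == process_name:
            scores[int(match.group(1))] = int(match.group(3))
            continue
        match = OOM_KILLED.search(line)
        if match and match.group(2) == process_name:
            pid = int(match.group(1))
            kills.append((pid, scores.get(pid)))
    return kills


# The two relevant lines from the log above, each on a single line.
sample = [
    "Oct 14 04:06:25 ip-10-245-210-112 kernel: [823453.107889] "
    "Out of memory: kill process 5659 (mongod) score 584557 or a child",
    "Oct 14 04:06:25 ip-10-245-210-112 kernel: [823453.107905] "
    "Killed process 5659 (mongod)",
]
print(find_oom_kills(sample))
```

In practice the lines would come from iterating over the system log file instead of the sample list; which file holds kernel messages depends on the distribution's syslog configuration.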

Ferrari

Oct 14, 2010, 12:42:17 PM
to mongodb-user
Hello,

I solved my problems of killed mongod when I applied the patch from
http://jira.mongodb.org/browse/SERVER-1827 on the 1.6.3 version.
Now it's not dying anymore.

Fabio

Tejaswi

Oct 15, 2010, 12:35:44 PM
to mongodb-user
The OOM server kill is still happening.

To get the latest patch for http://jira.mongodb.org/browse/SERVER-1827,
I am using 1.7.1, which has the patch.

0 pages in swap cache
Oct 15 16:19:25 ip-10-245-210-112 kernel: [953826.690217] Swap cache stats: add 0, delete 0, find 0/0
Oct 15 16:19:25 ip-10-245-210-112 kernel: [953826.690218] Free swap = 0kB
Oct 15 16:19:25 ip-10-245-210-112 kernel: [953826.690219] Total swap = 0kB
Oct 15 16:19:25 ip-10-245-210-112 kernel: [953826.723613] 1968128 pages RAM
Oct 15 16:19:25 ip-10-245-210-112 kernel: [953826.723616] 61087 pages reserved
Oct 15 16:19:25 ip-10-245-210-112 kernel: [953826.723617] 11320 pages shared
Oct 15 16:19:25 ip-10-245-210-112 kernel: [953826.723618] 1892048 pages non-shared
Oct 15 16:19:25 ip-10-245-210-112 kernel: [953826.723621] Out of memory: kill process 11276 (mongod) score 852049 or a child
Oct 15 16:19:25 ip-10-245-210-112 kernel: [953826.723634] Killed process 11276 (mongod)

I am on an EC2 large instance (m1.large).

Do I have to increase swap space?

-T
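
To put the page counts in the kernel output above into more familiar units, here is a small Python sketch of the arithmetic. It assumes a 4 KiB page size, which is the usual default on x86 Linux but is not stated anywhere in the log, so treat PAGE_SIZE_BYTES as an assumption. Under that assumption, 1968128 pages is roughly 7.5 GiB, and nearly all of it is non-shared while total swap is 0kB.

```python
PAGE_SIZE_BYTES = 4096  # assumption: 4 KiB pages, not confirmed by the log


def pages_to_gib(pages, page_size=PAGE_SIZE_BYTES):
    """Convert a kernel page count to GiB."""
    return pages * page_size / 2**30


total = pages_to_gib(1968128)       # "1968128 pages RAM"
non_shared = pages_to_gib(1892048)  # "1892048 pages non-shared"

print(f"total RAM:  {total:.2f} GiB")
print(f"non-shared: {non_shared:.2f} GiB ({non_shared / total:.0%} of RAM)")
```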

Eliot Horowitz

Oct 15, 2010, 1:00:51 PM
to mongod...@googlegroups.com
Can you run mongostat from start to when it finishes?
I want to see if it's leaking or if something weird is happening.
Also db.serverStatus()
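
One way to read the numbers once they are collected: a rough Python sketch that summarizes a series of resident-memory readings taken over the life of the process. The helper name memory_trend and the sample values are hypothetical, and it assumes the readings were noted down by hand from mongostat or serverStatus output at regular intervals. It only describes the trend and does not by itself prove a leak.

```python
def memory_trend(samples_mb):
    """Summarize resident-memory samples (in MB, oldest first).

    Returns (total_growth_mb, fraction_of_intervals_that_grew).
    """
    if len(samples_mb) < 2:
        raise ValueError("need at least two samples")
    steps = [later - earlier for earlier, later in zip(samples_mb, samples_mb[1:])]
    grew = sum(1 for step in steps if step > 0)
    return samples_mb[-1] - samples_mb[0], grew / len(steps)


# Hypothetical readings, not taken from this thread.
readings = [100, 150, 150, 300]
growth, fraction = memory_trend(readings)
print(f"grew {growth} MB overall; {fraction:.0%} of intervals showed growth")
```

Steady growth across most intervals that never levels off would point toward a leak; growth that plateaus near the size of the working set is more likely normal behavior.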

sirpy

Oct 18, 2010, 5:23:31 AM
to mongodb-user
This happened again; this time there is some kind of log:

Mon Oct 18 04:37:41 [conn329] running multiple plans
Mon Oct 18 04:37:41 [conn324] query caches.objects ntoreturn:1 idhack reslen:4078 24ms
Mon Oct 18 04:37:43 [conn323] JS Error: out of memory nofile_a:0
Mon Oct 18 04:37:43 [conn323] Assertion: 13072:JS_NewObject failed: NumberLong1
0x54098e 0x5f9883 0x5f8b42 0x5ecc07 0x9015c9 0x9011a6 0x902906 0x8e28be 0x8d2030 0x8d2479 0x88fe02 0x5fa1bc 0x5ba301 0x6eb872 0x6164b1 0x73a5a6 0x7423c7 0x7468b2 0x74740f 0x747e57
/usr/bin/mongod(_ZN5mongo11msgassertedEiPKc+0x1de) [0x54098e]
/usr/bin/mongod(_ZN5mongo9Convertor5tovalERKNS_11BSONElementE+0x17a3) [0x5f9883]
/usr/bin/mongod(_ZN5mongo9Convertor5tovalERKNS_11BSONElementE+0xa62) [0x5f8b42]
/usr/bin/mongod(_ZN5mongo16resolveBSONFieldEP9JSContextP8JSObjectljPS3_+0x377) [0x5ecc07]
/usr/bin/mongod(js_LookupPropertyWithFlags+0x421) [0x9015c9]
/usr/bin/mongod(js_LookupProperty+0x49) [0x9011a6]
/usr/bin/mongod(js_GetProperty+0xff) [0x902906]
/usr/bin/mongod(js_Interpret+0xed54) [0x8e28be]
/usr/bin/mongod(js_Invoke+0xef2) [0x8d2030]
/usr/bin/mongod(js_InternalInvoke+0x189) [0x8d2479]
/usr/bin/mongod(JS_CallFunction+0x56) [0x88fe02]
/usr/bin/mongod(_ZN5mongo7SMScope6invokeEP10JSFunctionRKNS_7BSONObjEib+0x2ec) [0x5fa1bc]
/usr/bin/mongod(_ZN5mongo7Matcher7matchesERKNS_7BSONObjEPNS_12MatchDetailsE+0xfc1) [0x5ba301]
/usr/bin/mongod(_ZN5mongo19CoveredIndexMatcher7matchesERKNS_7BSONObjERKNS_7DiskLocEPNS_12MatchDetailsE+0xe2) [0x6eb872]
/usr/bin/mongod(_ZN5mongo11UserQueryOp4nextEv+0x2a1) [0x6164b1]
/usr/bin/mongod(_ZN5mongo12QueryPlanSet6Runner6nextOpERNS_7QueryOpE+0x56) [0x73a5a6]
/usr/bin/mongod(_ZN5mongo12QueryPlanSet6Runner3runEv+0x6f7) [0x7423c7]
/usr/bin/mongod(_ZN5mongo12QueryPlanSet5runOpERNS_7QueryOpE+0x232) [0x7468b2]
/usr/bin/mongod(_ZN5mongo16MultiPlanScanner9runOpOnceERNS_7QueryOpE+0x5f) [0x74740f]
/usr/bin/mongod(_ZN5mongo16MultiPlanScanner5runOpERNS_7QueryOpE+0x17) [0x747e57]
Mon Oct 18 04:37:44 Got signal: 11 (Segmentation fault).

Mon Oct 18 04:37:44 Backtrace:
0x8212f9 0x7f2459b7d040 0x8cdbf0 0x8ce2ac 0x8cc360 0x8fefe7 0x88b2c0 0x5f792d 0x5f7f5a 0x5ba25d 0x6eb872 0x6164b1 0x73a5a6 0x7423c7 0x7468b2 0x74740f 0x747e57 0x5ff6aa 0x70547a 0x708ab6
/usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0x8212f9]
/lib/libc.so.6 [0x7f2459b7d040]
/usr/bin/mongod(js_MarkStackFrame+0x191) [0x8cdbf0]
/usr/bin/mongod(js_GC+0x3b8) [0x8ce2ac]
/usr/bin/mongod(js_NewGCThing+0xd7) [0x8cc360]
/usr/bin/mongod(js_NewObject+0xe4) [0x8fefe7]
/usr/bin/mongod(JS_NewObject+0x3f) [0x88b2c0]
/usr/bin/mongod(_ZN5mongo9Convertor10toJSObjectEPKNS_7BSONObjEb+0x8d) [0x5f792d]
/usr/bin/mongod(_ZN5mongo7SMScope7setThisEPKNS_7BSONObjE+0x6a) [0x5f7f5a]
/usr/bin/mongod(_ZN5mongo7Matcher7matchesERKNS_7BSONObjEPNS_12MatchDetailsE+0xf1d) [0x5ba25d]
/usr/bin/mongod(_ZN5mongo19CoveredIndexMatcher7matchesERKNS_7BSONObjERKNS_7DiskLocEPNS_12MatchDetailsE+0xe2) [0x6eb872]
/usr/bin/mongod(_ZN5mongo11UserQueryOp4nextEv+0x2a1) [0x6164b1]
/usr/bin/mongod(_ZN5mongo12QueryPlanSet6Runner6nextOpERNS_7QueryOpE+0x56) [0x73a5a6]
/usr/bin/mongod(_ZN5mongo12QueryPlanSet6Runner3runEv+0x6f7) [0x7423c7]
/usr/bin/mongod(_ZN5mongo12QueryPlanSet5runOpERNS_7QueryOpE+0x232) [0x7468b2]
/usr/bin/mongod(_ZN5mongo16MultiPlanScanner9runOpOnceERNS_7QueryOpE+0x5f) [0x74740f]
/usr/bin/mongod(_ZN5mongo16MultiPlanScanner5runOpERNS_7QueryOpE+0x17) [0x747e57]
/usr/bin/mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0x103a) [0x5ff6aa]
/usr/bin/mongod [0x70547a]
/usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_8SockAddrE+0x14d6) [0x708ab6]

Mon Oct 18 04:37:44 dbexit:

Dwight Merriman

Oct 18, 2010, 9:04:12 AM
to mongod...@googlegroups.com
I made a JIRA for this log (it shouldn't end in this ugly manner), but the real problem is running out of memory. If you can isolate it down to a small script that reproduces the problem, that would be most helpful.

http://jira.mongodb.org/browse/SERVER-1962
