mongod segmentation fault


Mike Richmond

Sep 6, 2010, 12:39:22 PM
to mongod...@googlegroups.com
We have a MongoDB 1.6.2 auto-sharding installation running across 16
servers divided into 8 replica sets (each replica set consists of 2 of
these servers plus an arbiter).

The same issue occurred on 2 of the shards: the replica set primary
crashed and a new primary was elected (the failover appears to have
gone smoothly). Then, within 24 hours, the newly elected primary also
crashed. We did not have monitoring in place on the mongod processes,
so we were unaware of the crashes until both members of the replica set
had gone down.
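
A minimal sketch of the kind of process monitoring that was missing here, assuming it is enough to check that each mongod still accepts TCP connections. The function names `is_listening` and `down_members` are hypothetical, the host/port pairs are supplied by the caller, and a successful connection only shows that the process is up, not that the replica set is healthy.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def down_members(members, timeout: float = 2.0):
    """Return the (host, port) pairs that did not accept a connection."""
    return [(host, port) for host, port in members
            if not is_listening(host, port, timeout)]
```

Run from cron or a monitoring agent against the list of replica set members, a non-empty result from `down_members` would flag the first crash instead of the second.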

Here are the segmentation fault details from the log files on one of the
machines:
http://gist.github.com/567238

Some details on the traffic hitting mongo:
- Approx. 300 total inserts per second, split across 8 separate sharded
  databases (each containing only one collection)
- Once every 10 minutes we run simple map/reduce queries against the
  databases; each one takes about 40 seconds to complete


Please let me know if you need more details or context from the log
files. Willing to help where I can.


--Mike

Eliot Horowitz

Sep 6, 2010, 12:57:56 PM
to mongod...@googlegroups.com
Can you send the entire log for one of the servers?
You can attach it to a JIRA case if it's big.

Couple of other questions:
 - Did you build this with 1.6.2, or upgrade a 1.6.0 or 1.6.1 cluster?
 - There were some issues in versions before 1.6.2, since fixed, that theoretically could have caused this, but I would be a bit surprised.
 
-Eliot




Mike Richmond

Sep 6, 2010, 1:14:22 PM
to mongod...@googlegroups.com, Eliot Horowitz
The cluster and all nodes were built with 1.6.2.

I will create a JIRA ticket with the log attached soon. I need to scrub IPs from the log first.


--Mike
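
Scrubbing IPs from a log before attaching it can be sketched as below. This is only an illustration, assuming the addresses are plain IPv4 dotted quads; the name `scrub_ips` and the `HOSTn` placeholder format are hypothetical. Each distinct address maps to the same placeholder throughout, so connections can still be correlated after scrubbing.

```python
import re

# Four dot-separated groups of 1-3 digits. A three-part version string
# such as "1.6.2" does not match, but any other four-part dotted number would.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def scrub_ips(text: str) -> str:
    """Replace each distinct IPv4 address in text with a stable placeholder."""
    placeholders = {}

    def replace(match):
        ip = match.group(0)
        if ip not in placeholders:
            # Number placeholders in order of first appearance.
            placeholders[ip] = f"HOST{len(placeholders) + 1}"
        return placeholders[ip]

    return IPV4_RE.sub(replace, text)
```

Hostnames and IPv6 addresses are not covered by this pattern and would need separate handling.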