mongodb://IP: Transport endpoint is not connected


Marcin

May 13, 2012, 4:12:24 PM
to mongodb-user
Hello,

We are having an issue with MongoDB which happens once every 2-3
months.
For no apparent reason it stops working and all the websites using it
go offline. Here is what the server log says:

/# tail /var/log/mongodb/mongodb.log
/usr/bin/mongod(_ZN5mongo10MongoMutex4lockEv+0x17c) [0x7d9e6c]
/usr/bin/mongod(_ZN5mongo14receivedUpdateERNS_7MessageERNS_5CurOpE+0x206) [0x886516]
/usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x1105) [0x8897d5]
/usr/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x76) [0xa9c576]
/usr/bin/mongod(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x287) [0x638937]
/lib/libpthread.so.0 [0x2b843c26dfc7]
/lib/libc.so.6(clone+0x6d) [0x2b843ccf964d]

Logstream::get called in uninitialized state
Fri May 11 15:19:02 ERROR: Client::shutdown not called: conn

OS: Debian
The MongoDB version was 2.0 or 2.0.1; it was updated to 2.0.5 after the
last crash.

Any hints appreciated. If it happens again we will have to drop mongo
in favour of a more stable database :(

Eliot Horowitz

May 14, 2012, 12:33:21 AM
to mongod...@googlegroups.com
The key part of the log is above that.
Can you send the whole thing?

Marcin Gil

May 14, 2012, 3:37:46 AM
to mongod...@googlegroups.com
Dear Eliot! Thanks for the reply! I noticed many entries like the one below in the log, starting one day before the crash. Please let me know if I can help any further:

Fri May 11 15:17:40 [conn2076909]  authenticate: { authenticate: 1, user: "kXXXXX", nonce: "4874a879bb220b35", key: "d1c806f98b28209d440ef75e66df3d42" }
Fri May 11 15:17:40 [conn2076909] ERROR: mmap private failed with out of memory. (64 bit build)
Fri May 11 15:17:40 [conn2076909] Assertion: 13636:file /var/lib/mongodb/kXXXXX_cache.4 open/create failed in createPrivateMap (look in log for more information)
0x588cb2 0x75b95e 0x75c853 0x8a07cb 0x89c0e2 0x89c7c5 0x89ca3f 0x8a1b52 0x8a1d57 0x8aede9 0x8b04bb 0x947ae9 0x94c395 0x88678c 0x8897d5 0xa9c576 0x638937 0x2b843c26dfc7 0x2b843ccf964d
 /usr/bin/mongod(_ZN5mongo11msgassertedEiPKc+0x112) [0x588cb2]
 /usr/bin/mongod(_ZN5mongo8MongoMMF13finishOpeningEv+0x1be) [0x75b95e]
 /usr/bin/mongod(_ZN5mongo8MongoMMF6createESsRyb+0x63) [0x75c853]
 /usr/bin/mongod(_ZN5mongo13MongoDataFile4openEPKcib+0x14b) [0x8a07cb]
 /usr/bin/mongod(_ZN5mongo8Database7getFileEiib+0x102) [0x89c0e2]
 /usr/bin/mongod(_ZN5mongo8Database12suitableFileEPKcibb+0x55) [0x89c7c5]
 /usr/bin/mongod(_ZN5mongo8Database11allocExtentEPKcibb+0x7f) [0x89ca3f]
 /usr/bin/mongod(_ZN5mongo10outOfSpaceEPKcPNS_16NamespaceDetailsEibNS_7DiskLocE+0x112) [0x8a1b52]
 /usr/bin/mongod(_ZN5mongo26allocateSpaceForANewRecordEPKcPNS_16NamespaceDetailsEib+0x77) [0x8a1d57]
 /usr/bin/mongod(_ZN5mongo11DataFileMgr6insertEPKcPKvibbPb+0x4a9) [0x8aede9]
 /usr/bin/mongod(_ZN5mongo11DataFileMgr16insertWithObjModEPKcRNS_7BSONObjEb+0x4b) [0x8b04bb]
 /usr/bin/mongod(_ZN5mongo14_updateObjectsEbPKcRKNS_7BSONObjES2_bbbRNS_7OpDebugEPNS_11RemoveSaverE+0x829) [0x947ae9]
 /usr/bin/mongod(_ZN5mongo13updateObjectsEPKcRKNS_7BSONObjES2_bbbRNS_7OpDebugE+0x125) [0x94c395]
 /usr/bin/mongod(_ZN5mongo14receivedUpdateERNS_7MessageERNS_5CurOpE+0x47c) [0x88678c]
 /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x1105) [0x8897d5]
 /usr/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x76) [0xa9c576]

Marcin Gil

May 14, 2012, 3:44:55 AM
to mongod...@googlegroups.com
I should probably add that memory usage was as below at the time of the crash:


Before the crash:
             total       used       free     shared    buffers     cached
Mem:          8096       7175        920          0          0          0
-/+ buffers/cache:       7175        920
Swap:            0          0          0
Total:        8096       7175        920

After the crash:
             total       used       free     shared    buffers     cached
Mem:          8096        160       7935          0          0          0
-/+ buffers/cache:        160       7935
Swap:            0          0          0
Total:        8096        160       7935

We have another MongoDB server with 24GB of RAM and the same thing happened there: memory usage was >95% and the database size was 50GB.

Marcin Gil

May 14, 2012, 4:02:35 AM
to mongod...@googlegroups.com
ulimit -v
unlimited

Journaling was off when it happened the first time; we then turned it on.
The server has also been hooked up to MMS for some months.

/proc/sys/vm/overcommit_memory = 0
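
For anyone who wants to collect the same two settings in one go, here is a minimal Python sketch. It assumes a Linux host; `collect_settings` and the other function names are made up for illustration. `RLIMIT_AS` is the limit that `ulimit -v` reports (the shell shows kilobytes, `getrlimit` returns bytes).

```python
import resource
from pathlib import Path

# Meanings of vm.overcommit_memory as documented for the Linux kernel.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting, never overcommit",
}


def describe_overcommit(raw: str) -> str:
    """Translate the raw file content (e.g. "0\n") into a readable label."""
    mode = int(raw.strip())
    return OVERCOMMIT_MODES.get(mode, f"unknown mode {mode}")


def describe_limit(value: int) -> str:
    """Render a resource limit the way `ulimit` does for the unlimited case."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def collect_settings(proc_root: str = "/proc") -> dict:
    """Gather the address-space limit and the overcommit mode of this host."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_AS)
    path = Path(proc_root) / "sys" / "vm" / "overcommit_memory"
    overcommit = describe_overcommit(path.read_text()) if path.exists() else "not available"
    return {
        "address_space_limit": describe_limit(soft),
        "overcommit_memory": overcommit,
    }


if __name__ == "__main__":
    for name, value in collect_settings().items():
        print(f"{name}: {value}")
```

Run it as the same user that starts mongod, since limits are per process and inherited from the launching shell or init script.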

Adam C

May 14, 2012, 12:55:10 PM
to mongod...@googlegroups.com
Marcin,

Does the 24GB machine have 0 swap too?

Any sign of OOM Killer messages in the syslog?


Because MongoDB memory-maps its database files, none of that data will ever end up in swap, so on a healthy system running only MongoDB the swap space will rarely be used.  Having some swap, though, can keep the kernel from killing the mongod process when the memory limits are reached.
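
A quick way to check for OOM killer activity is to scan the syslog for the phrases the kernel usually logs. This is only a rough sketch: the log path `/var/log/syslog` is an assumption (it varies by distribution), the exact kernel wording differs between versions, and `find_oom_lines` / `scan_log` are illustrative names.

```python
import re

# Phrases the Linux kernel commonly logs when the OOM killer fires.
# The wording varies between kernel versions, so treat this as a heuristic.
OOM_PATTERN = re.compile(
    r"invoked oom-killer|Out of memory: Kill(ed)? process",
    re.IGNORECASE,
)


def find_oom_lines(lines):
    """Return the log lines that look like OOM killer messages."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


def scan_log(path="/var/log/syslog"):
    """Scan one log file; the default path is an assumption."""
    with open(path, errors="replace") as handle:
        return find_oom_lines(handle)


if __name__ == "__main__":
    try:
        hits = scan_log()
    except OSError as exc:
        print(f"could not read log: {exc}")
    else:
        print(f"{len(hits)} OOM killer line(s) found")
        for hit in hits:
            print(hit)
```

An empty result only means these phrases are absent from that one file; rotated logs and `dmesg` output are worth checking too.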

Adam

Marcin Gil

May 16, 2012, 6:20:31 AM
to mongod...@googlegroups.com
Hello,

Sorry for the late reply; I had to check with our admin.

Both servers have swap 0.
No sign of oom-killer in logs.

What would you suggest we could do?

Thank you,
Marcin

Adam C

May 16, 2012, 6:43:10 AM
to mongod...@googlegroups.com
It's odd, because based on your memory profile (or at least the limited view of it here) it looks like the OOM killer, and yet you are not seeing the classic messages we associate with it.  Assuming this is happening more frequently on the 8GB host (unless it has significantly less usage), I would be curious to see whether adding swap there stops the crashes in future.

Are these hosts in MMS for us to take a look? (I would just need the group name to find them - if you don't want to share on the group, just send it to me via e-mail)

Adam

Marcin Gil

May 16, 2012, 8:31:38 AM
to mongod...@googlegroups.com
Dear Adam,

I sent our MMS group name to you via e-mail.
We can try setting swap to 1 if you feel this might help. These servers are dedicated mongo hosts.

As you guessed correctly the 24GB host serves a more demanding application.

Please let me know if I can help you investigate this issue further. I am greatly motivated to solve it.

Thank you,
Marcin

Adam C

May 16, 2012, 8:40:26 AM
to mongod...@googlegroups.com
Marcin - are you running in OpenVZ?  If so, you are probably running into the issues described here:


If you are running OpenVZ on CentOS/RHEL 6.x or newer, there are several suggested tweaks to the container that try to mitigate the issues, though as yet we have no corroboration that they are completely successful, and OpenVZ functionality is at best speculative.  On older versions (5.x) it simply will not work, for the reasons stated in the thread.  If you are running on the later versions and have applied the suggested tweaks, please add your experience to the issue.  If not, trying out the suggestions and reporting on their efficacy would be appreciated.
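
If you are not sure whether a machine is an OpenVZ container, one common heuristic is to look at `/proc`: `/proc/vz` exists on both OpenVZ hosts and containers, while `/proc/bc` exists only on the host. A minimal sketch of that check follows; `detect_openvz` is an illustrative name, and the heuristic applies to legacy OpenVZ kernels and should be confirmed with the hosting provider.

```python
from pathlib import Path


def detect_openvz(proc_root="/proc"):
    """Classify the machine as an OpenVZ 'container', 'host', or 'none'.

    Heuristic: /proc/vz is present under OpenVZ; /proc/bc is only
    visible on the hardware node (host), not inside a container.
    """
    root = Path(proc_root)
    has_vz = (root / "vz").is_dir()
    has_bc = (root / "bc").is_dir()
    if has_vz and not has_bc:
        return "container"
    if has_vz and has_bc:
        return "host"
    return "none"


if __name__ == "__main__":
    print(f"OpenVZ role of this machine: {detect_openvz()}")
```

The `proc_root` parameter exists only so the check can be tried against a copy of the directory layout.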

Adam.

Marcin

May 16, 2012, 8:58:41 AM
to mongodb-user
Dear Adam,

I have to check on that with our admin. I will write here when I know
the answer.

Thank you,
Marcin

Marcin Gil

May 24, 2012, 7:28:45 AM
to mongodb-user
Dear Adam,

I can confirm now that we are using CentOS5 with OpenVZ on all servers (2 hosting providers).
The admins are not willing to update it. Is this the only solution? What about setting some limits, as described here: http://hachiari.com/blog/2011/03/31/getting-mongodb-to-work-on-openvz-without-out-of-memory-problem/

Thank you,
Marcin

Adam C

May 24, 2012, 9:26:41 AM
to mongod...@googlegroups.com
Actually it is the limitation mechanism in OpenVZ that is at issue here.  This comment specifically addresses it:


If you set the limits you mention inside your container, then it might stop the crashing, but may cause issues when trying to mmap large files.

The only real solution pre-RHEL6 that is mentioned is to remove the limit for the container, but I doubt the admins are going to want to do that either, since it puts other containers at risk.
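
To see whether the container limits are what is being hit, the fail counters in `/proc/user_beancounters` are worth a look: a non-zero `failcnt` for a resource such as `privvmpages` means allocations were refused at that limit, which would also fit with mmap failing while no OOM killer message is logged. Below is a rough Python sketch of reading that file. It assumes the usual column layout (`held maxheld barrier limit failcnt`) and a single container's view; the function names are illustrative, and reading the file normally requires root.

```python
FIELDS = ("held", "maxheld", "barrier", "limit", "failcnt")


def parse_beancounters(text):
    """Parse /proc/user_beancounters text into {resource: {field: int}}.

    Assumes the view from inside one container; on a host with several
    containers, rows with the same resource name would overwrite each other.
    """
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # The first row of a container is prefixed with "<uid>:".
        if parts and parts[0].endswith(":") and parts[0][:-1].isdigit():
            parts = parts[1:]
        # Skip the version line, the header, and anything unexpected.
        if len(parts) != 6 or not all(p.isdigit() for p in parts[1:]):
            continue
        counters[parts[0]] = dict(zip(FIELDS, map(int, parts[1:])))
    return counters


def failed_resources(counters):
    """Return {resource: failcnt} for every resource that hit its limit."""
    return {name: row["failcnt"] for name, row in counters.items() if row["failcnt"] > 0}


if __name__ == "__main__":
    try:
        with open("/proc/user_beancounters") as handle:
            failures = failed_resources(parse_beancounters(handle.read()))
    except OSError as exc:
        print(f"could not read beancounters: {exc}")
    else:
        print(failures or "no resource limits were hit")
```

Checking the output shortly after a crash, and comparing `failcnt` before and after, shows which limit (if any) grew.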

Adam