Mongod stuck in stop/waiting on Ubuntu 14.04


james33

Jan 4, 2017, 8:44:36 AM
to mongodb-user

I installed MongoDB on Ubuntu 14.04 as per the documentation. It runs and works fine, except that it mysteriously stops every few hours and requires a manual restart with `service mongod restart` to get going again. There's nothing in the mongod.log when this happens.

In addition, when I restart and then check the status with `service mongod status`, it reports `mongod stop/waiting` even though the application is able to connect to it. I've been searching for ways to debug this, but I'm stuck at this point.

Tom Li

Jan 11, 2017, 10:57:32 PM
to mongodb-user

Hi James,

> I installed MongoDB on Ubuntu 14.04 as per the documentation. It runs and works fine, except that it mysteriously stops every few hours and requires a manual restart with `service mongod restart` to get going again. There's nothing in the mongod.log when this happens.

Could you please provide the following relevant information?

  1. Did you follow the installation documentation, "Install MongoDB Community Edition on Ubuntu"?
  2. What MongoDB version are you running?
  3. It is possible that the mongod process was killed by the OOM killer. Could you check the system log for any hint that this is the case?
  4. Could you please elaborate on what you mean by "nothing in the logs"? Could you also verify the mongod log path in /etc/mongod.conf and then check the mongod.log for any notable event, e.g. "Detected unclean shutdown"?

> In addition, when I restart and then check the status with `service mongod status`, it reports `mongod stop/waiting` even though the application is able to connect to it.

If your mongod service is stopped, you should not be able to connect to it. Another possibility is that a mongod instance is already running on the default port, so the service cannot be started. Do you usually have other mongod processes running on this machine? Your application may be connecting to a mongod instance that was not started by the service. Please check the relevant logs to find out whether this is the case.
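One way to check for a stray instance is sketched below. The helper name `filter_mongod` is made up for illustration; it only filters a process listing and does not touch the service itself.

```shell
# Read "PID COMMAND [ARGS...]" lines on stdin and keep only those whose
# command is exactly mongod (not mongodump, not "grep mongod").
# filter_mongod is a hypothetical helper name, not part of MongoDB.
filter_mongod() {
  awk '{ n = split($2, parts, "/"); if (parts[n] == "mongod") print }'
}
```

Typical use would be `ps -eo pid=,args= | filter_mongod` to list every running mongod with its command line, and `sudo netstat -tlnp | grep 27017` to see which PID is listening on the default port (27017). If a PID shows up there while `service mongod status` says stop/waiting, that process was most likely not started by the service.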

Regards,

Tom

james33

Jan 12, 2017, 11:10:36 AM
to mongodb-user
1. Yes, that is what I followed.
2. 3.2.11
3. I don't see anything in syslog about OOMkiller, but maybe I'm looking in the wrong place.
4. The following does show in the log, but this only gets logged when I run `service mongod restart`: `[initandlisten] Detected unclean shutdown - /var/lib/mongodb/mongod.lock is not empty.`. I don't see anything in the logs about the server shutting down or any errors.

Right, when I start it, it seems to be running and it does work for 12-24 hours, but every single day it silently stops (it always seems to happen in the middle of the night; not sure whether that is related). This is the only instance of mongod running or being connected to.

Tom Li

Jan 15, 2017, 11:54:27 PM
to mongodb-user

Hi James,

To verify whether your mongod was killed by the OOM killer, you can run `dmesg | egrep -i 'killed process'` and look for output like the following:

Jan 13 02:13:49 ip-xxx-xxx-xxx-xxx kernel: [xxxxxx.xxxxxx] Out of memory: Kill process 'PID' (Process name) score xxx or sacrifice child
Jan 13 02:13:49 ip-xxx-xxx-xxx-xxx kernel: [xxxxxx.xxxxxx] Killed process PID (Process name) total-vm:937032kB, anon-rss:887396kB, file-rss:4kB
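The kernel ring buffer read by dmesg is limited in size, so an older kill may have scrolled out of it. A minimal sketch for searching the on-disk logs as well, assuming the default Ubuntu rsyslog setup that writes kernel messages to /var/log/kern.log and /var/log/syslog (the function name `find_oom_kills` is made up for illustration):

```shell
# Print OOM-killer lines from the given log files, or from stdin when no
# files are given. find_oom_kills is a hypothetical helper name.
find_oom_kills() {
  grep -hiE 'out of memory|killed process' "$@"
}
```

For example, `dmesg | find_oom_kills` checks the ring buffer and `find_oom_kills /var/log/kern.log /var/log/syslog` checks the log files (sudo may be needed to read them). No output means no matching lines were found in those sources.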

> [initandlisten] Detected unclean shutdown - /var/lib/mongodb/mongod.lock is not empty.

This message typically appears if the mongod process was killed with SIGKILL (kill -9); please refer to Unix_signal for more details about SIGKILL. The OOM killer sends SIGKILL to any process it kills. Please note that this signal is very disruptive and could result in data loss (see https://docs.mongodb.com/v3.2/tutorial/manage-mongodb-processes/#use-kill).

If you find the messages above, then the OOM killer is the reason your MongoDB was terminated. You may want to review the Production Notes, specifically the section titled Disk and Storage Systems - Swap.

If you’re still having issues with your deployment, could you please provide more information:

  1. What is your hardware configuration, e.g. RAM, CPU, storage? Or, if your server is on AWS, what is the instance type?
  2. How much swap has been configured? You can run `free -m` to get this information.
  3. What is the output of the `db.stats()` command?
  4. Is there any process configured to run in the middle of the night, when this issue happens?
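As a rough sketch of how to collect the swap figure, the helper below pulls the total out of `free -m` output (the name `swap_total_mb` is made up for illustration):

```shell
# Read `free -m` output on stdin and print the total swap size in MB.
# swap_total_mb is a hypothetical helper name.
swap_total_mb() {
  awk '$1 == "Swap:" { print $2 }'
}
```

Running `free -m | swap_total_mb` prints 0 when no swap is configured. For item 3, `mongo --quiet --eval 'printjson(db.stats())'` should print the statistics, assuming the mongo shell is installed and mongod is listening on localhost on the default port.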

Regards,

Tom

james33

Jan 17, 2017, 2:44:49 PM
to mongodb-user
Thanks for the help. The OOM killer was the culprit: the machine had no swap file, so mongod was being killed when memory ran out. Adding a swap file seems to have fixed this.
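For anyone landing here later, a minimal sketch of the usual way to add a swap file on Ubuntu follows. The thread does not say what size or path was used, so the 2G default and /swapfile below are assumptions, and `print_swap_setup` is a made-up helper that only prints the commands so they can be reviewed before running.

```shell
# Print (do not run) the commands that create and enable a swap file.
# $1 = size (assumed default 2G), $2 = path (assumed default /swapfile).
# print_swap_setup is a hypothetical helper name.
print_swap_setup() {
  size="${1:-2G}"
  path="${2:-/swapfile}"
  cat <<EOF
sudo fallocate -l $size $path
sudo chmod 600 $path
sudo mkswap $path
sudo swapon $path
echo '$path none swap sw 0 0' | sudo tee -a /etc/fstab
EOF
}
```

Calling `print_swap_setup 2G /swapfile` lists the five commands; the last one adds an /etc/fstab entry so the swap file is enabled again after a reboot. Afterwards, `free -m` should show a non-zero Swap total.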