MongoDB Sync Problems


Robin Tang

Dec 7, 2016, 7:11:46 PM
to mongodb-user
Hello!

I am currently experiencing some MongoDB issues:
  1. Query times often spike for one of our collections, "instances"
    1. All the queries we perform are indexed
  2. Because of these spikes, our replica set keeps switching primaries, and one of the hosts actually stopped running MongoDB
    1. This caused it to fall outside the oplog window, so it can no longer catch up through normal replication

I am currently trying to re-sync that machine, and the sync has failed fifteen times :(


Just some specs:
  • Primary:
    • RAM 210G
    • MongoDB v 3.0.12, WiredTiger
    • OS: Centos 7.2
  • Secondary, currently in STARTUP2 (trying to sync this)
    • RAM 225G
    • MongoDB v 3.0.14, WiredTiger
    • OS: Centos 7.2
  • MongoDB Collection (353G)

Symptoms I am facing when trying to sync:
  • "unable to fork cannot allocate memory" - this happens very frequently; when diagnosing it with free -m:
    • MongoDB seems to be using ~140, Free ~ 1G, the rest was in buff/cache
  • Unexpected Query Spikes
    • Usually happens in the 3-7 am window, which is really odd, as we get very little traffic around that time
If anyone can shed some light on this, that would be great!!
- Robin


Tom Li

Dec 15, 2016, 10:26:25 PM
to mongodb-user

Hi Robin,

Could you post more details regarding your deployment, such as:

  • Are the MongoDB servers running inside a container or virtualized environment, e.g. Docker or a virtual machine?
  • Are there other processes running on the machines besides MongoDB that could create resource contention during busy periods (e.g. application servers, web servers, etc.)?
  • The output of rs.status() and rs.conf().
  • The slow queries and their explain() results (see https://docs.mongodb.com/v3.0/reference/explain-results/).
  • The output of rs.printReplicationInfo() and rs.printSlaveReplicationInfo().
  • Any log entries that correspond to the 3-7AM window. Is there maybe a backup process running during that time?
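
To illustrate why the rs.printReplicationInfo() output matters here: it reports the first and last event times in the oplog, and the span between them is the window a stopped member has to come back within. Below is a minimal Python sketch of that comparison. The function names and all timestamps are hypothetical, chosen only for illustration and not taken from your deployment. The last check assumes that on 3.0 an initial sync also needs the data copy to finish before the sync source's oplog rolls over, which is my understanding rather than something measured on your setup.

```python
from datetime import datetime, timedelta


def oplog_window(first_event: datetime, last_event: datetime) -> timedelta:
    """Time span covered by the oplog (last entry minus first entry)."""
    return last_event - first_event


def can_catch_up(first_event: datetime, member_last_applied: datetime) -> bool:
    """A member can resume replication only if the last operation it applied
    is still present in the sync source's oplog."""
    return member_last_applied >= first_event


def initial_sync_fits(window: timedelta, estimated_sync: timedelta) -> bool:
    """Assumption: the data copy must finish within the oplog window,
    otherwise the entries needed to catch up afterwards are already gone."""
    return estimated_sync < window


# Hypothetical values, for illustration only.
first = datetime(2016, 12, 6, 9, 0, 0)          # "oplog first event time"
last = datetime(2016, 12, 7, 19, 0, 0)          # "oplog last event time"
stale_member = datetime(2016, 12, 5, 23, 0, 0)  # last op applied by the stopped host

window = oplog_window(first, last)
print("oplog window:", window)
print("can catch up:", can_catch_up(first, stale_member))
print("12h initial sync fits:", initial_sync_fits(window, timedelta(hours=12)))
```

If the real numbers show that the time an initial sync takes is close to or longer than the oplog window, that alone could explain repeated sync failures, independent of the memory errors.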

I noticed that you are running different versions of MongoDB on your primary (3.0.12) and secondary (3.0.14) nodes. Is there a reason why you are using different versions within the same replica set? Please note that running different versions of MongoDB is only supported during an upgrade process, and is not recommended in a long-term production setup.

Also, if you are having an issue with initial sync, you may find the following links helpful:

Having said that, please note that the latest MongoDB version is 3.4.0, which contains improvements to initial sync. You may want to determine if upgrading is applicable to your use case. However, before making any major changes to your deployment, please ensure that all data is backed up and all procedures are thoroughly tested.

Regards,
Tom

