Primary Mongod RAM Usage growing high


anil neeluru

Feb 4, 2020, 2:30:54 PM
to mongodb-user
Hi,

Environment details:
1) Mongo Java Driver 3.7.1 
2) Mongo DB server 3.6.9
3) Approximately 4 GB of records on each mongod instance.
4) Continuous operations (update/read/delete) against the primary and secondaries of the replica set.
5) Storage Engine - MMAP

I configured a 7-member shard replica set in my setup:
3 arbiters and 4 non-arbiter members.

I observe that RAM usage on the primary mongod instance is very high and keeps increasing over time when any of the 4 non-arbiter members goes down. Eventually, it consumes all the RAM on my host.
If the replica members that are down are brought back, RAM usage drops and then stays stable over time.

I looked at mongostat to confirm the RAM usage of the mongod instance, and I see that RES memory keeps increasing while one or more replica members are down.

I am attaching the logs I collected during the observation (mongostat and db.serverStatus()).

How can I check why RAM usage keeps growing on the primary mongod instance?
Does mongo maintain a separate cache when its replica members are down?
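
One way to approach the first question is to sample db.serverStatus() at intervals and compare resident memory with the connection count. The following is a minimal Python sketch, assuming the documents have already been fetched (for example with PyMongo's `client.admin.command("serverStatus")`). The helper names and the sample values are hypothetical and are not taken from the attached logs; only the `mem.resident`, `mem.virtual`, and `connections.current` fields of the serverStatus output are read.

```python
def summarize_status(status):
    """Pick the memory and connection counters out of a serverStatus document."""
    mem = status.get("mem", {})
    conns = status.get("connections", {})
    return {
        "resident_mib": mem.get("resident"),  # resident memory in MiB (RES in mongostat)
        "virtual_mib": mem.get("virtual"),
        "connections": conns.get("current"),
    }


def resident_growth(samples):
    """Change in resident memory (MiB) between the first and last sample."""
    summaries = [summarize_status(s) for s in samples]
    return summaries[-1]["resident_mib"] - summaries[0]["resident_mib"]


# Hypothetical samples for illustration only.
samples = [
    {"mem": {"resident": 2100, "virtual": 9800}, "connections": {"current": 500}},
    {"mem": {"resident": 2900, "virtual": 9800}, "connections": {"current": 500}},
]
print(resident_growth(samples))
```

If resident memory grows while the connection count stays flat, as in the made-up samples, connection overhead alone is an unlikely explanation.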


Thanks,
Anil

mongo_stats_ram_usage

Cahaba Data

Feb 4, 2020, 10:02:49 PM
to mongodb-user
Sorry, I didn't look at the log attachment.

Not your question, but I don't know why you have 3 arbiters; an odd total of 5 members, which needs only 1 arbiter, is enough for elections.

RAM going up when a data-bearing node is down makes one wonder whether a write/read concern value is causing connections to hang open.

Kevin Adistambha

Feb 5, 2020, 7:50:22 PM
to mongodb-user

Hi Anil,

When you say it “consumes all RAM on the host”, do you see any issue, e.g. the mongod getting killed by the OOM killer, slow performance, etc.? Or does the server stay running normally, with RAM usage going up but no effect on operation?

> Does mongo maintain a separate cache when its replica members are down?

No. However, I would rehash Cahaba’s question: do your clients/apps use secondary reads? If yes, that could be the cause. Each client connection to mongod uses ~1 MB of RAM. If you have many clients doing secondary reads, taking replica nodes offline means the remaining online nodes must pick up clients that were previously spread throughout the replica set, so they would see a jump in connected clients.

If you’re planning to use replica nodes for scaling purposes, I would recommend checking out this blog post: “Can I use more replica nodes to scale”.

In addition:

> Mongo DB server 3.6.9
> Storage Engine - MMAP

Note that MongoDB 3.6.9 was released in Nov 2018. The latest in the 3.6 series is currently 3.6.17. Please consider upgrading, since there are many improvements from 3.6.10 to 3.6.17.

Also, the MMAPv1 storage engine was removed in MongoDB 4.2 and has been deprecated for some time. Please consider upgrading to WiredTiger. WiredTiger has many advantages vs. MMAPv1 and is now the only officially available storage engine in modern versions of mongod.

Best regards,
Kevin

anil neeluru

Feb 6, 2020, 6:49:06 AM
to mongodb-user
Hi Kevin,

Thanks for your response; see my comments below.

> When you say it “consumes all RAM on the host”, do you see any issue, e.g. the mongod getting killed by the OOM killer, slow performance, etc.? Or does the server stay running normally, with RAM usage going up but no effect on operation?
 
    We are not seeing any issue with the mongod instances, and performance was good until all RAM was exhausted. The only errors in the mongod logs are heartbeat errors for the replica members that are down.

> No. However, I would rehash Cahaba’s question: do your clients/apps use secondary reads? If yes, that could be the cause. Each client connection to mongod uses ~1 MB of RAM. If you have many clients doing secondary reads, taking replica nodes offline means the remaining online nodes must pick up clients that were previously spread throughout the replica set, so they would see a jump in connected clients.
> If you’re planning to use replica nodes for scaling purposes, I would recommend checking out this blog post: “Can I use more replica nodes to scale”.

    Yes, our app does secondary reads, but it prefers the nearest secondary. In this case, the replica members that were brought down are at remote sites (with latency involved), so not many reads go to them. We use secondaries mostly for high availability.

Overall connection counts on the available mongod instances are as follows:

PRIMARY mongod - around 500 connections
SECONDARY mongod - around 130 connections

The primary and secondary mongod run on two different hosts. The counts above were taken under a high rate of update/read/delete queries, and the numbers are stable; we do not see the connections increasing further.
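
Applying the estimate of roughly 1 MB per connection quoted above to these counts gives a rough idea of the connection overhead. The following is a back-of-the-envelope Python sketch; the 1 MB figure is the approximation from the reply, not a measured value, and the function name is hypothetical.

```python
MB_PER_CONNECTION = 1  # approximate per-connection cost quoted in the thread


def estimated_connection_mb(connection_count):
    """Rough memory attributable to client connections, in MB."""
    return connection_count * MB_PER_CONNECTION


for role, count in {"PRIMARY": 500, "SECONDARY": 130}.items():
    print(f"{role}: ~{estimated_connection_mb(count)} MB for {count} connections")
```

By this estimate, a stable count of about 500 connections corresponds to a roughly constant overhead of about 500 MB on the primary, so connections alone would not explain memory that keeps growing.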

As I mentioned in my first post, we are trying to understand why RAM usage on the primary mongod grows only when 2 of its replica members are down.

We see the system recover RAM in only 2 scenarios:
1) Bring back the replica members that are down.
2) Remove the down members from the replica set and reconfigure it with only the 2 available non-arbiter members and the 3 arbiters.

Since the app/clients use the same connections across the tests, mongod itself appears to be releasing the memory for some reason, and we are trying to understand why.

> In addition:
> Mongo DB server 3.6.9
> Storage Engine - MMAP
> Note that MongoDB 3.6.9 was released in Nov 2018. The latest in the 3.6 series is currently 3.6.17. Please consider upgrading, since there are many improvements from 3.6.10 to 3.6.17.
> Also, the MMAPv1 storage engine was removed in MongoDB 4.2 and has been deprecated for some time. Please consider upgrading to WiredTiger. WiredTiger has many advantages vs. MMAPv1 and is now the only officially available storage engine in modern versions of mongod.

    Yes, we will look into moving to 3.6.17 and running a test. Since our production deployment is planned for next month, we don't really have time to repeat all the tests on a new mongod version, so we are looking for a solution on 3.6.9 if possible.

 



anil neeluru

Feb 6, 2020, 6:51:59 AM
to mongodb-user
We have deployments across geographic regions and need more backup sites, so we configured 3 arbiters.
Regarding connections, the connection count stays stable throughout our test under high load.

Cahaba Data

Feb 7, 2020, 5:45:19 PM
to mongodb-user

Ah, yes, sorry. Generally, if any queries that normally go to secondaries (secondaryPreferred read preference) now go to the primary, they are likely using the extra RAM.


----- I described your scenario at the new MongoDB community forum as a way to test-drive the forum, and the reply above came back. You may also note my first post about write concerns.

What may be coming into focus is that when a secondary goes down, its users are not out of service; they move their activity to the primary. That makes sense, since offloading the primary was probably the reason they were set up to work directly against the secondary in the first place.

If there is an assumption that direct-secondary activity is out of service when the secondary goes down, that assumption should be re-examined in light of how it is implemented, per Asya's input. Hope this helps.



anil neeluru

Feb 12, 2020, 8:44:48 AM
to mongodb-user
Thank you Cahaba for the detailed response. 

In our case, even if we assume that queries from the out-of-service secondaries shifted to the available primary and in-service secondaries, resident RAM usage on the primary keeps growing and never stabilizes. That is the actual concern.

We see the system recover RAM in only 2 scenarios:
1) Bring back the replica members that are down.
2) Remove the down members from the replica set and reconfigure it with only the 2 available non-arbiter members and the 3 arbiters.

Currently, we are looking at option 2:
At run time, detect whether any members have been out of service for a considerable amount of time (say 30 minutes), then reconfigure the replica set accordingly so that the available primary and secondary members stop consuming more RAM.
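
The detection half of option 2 could be sketched as follows in Python. This is only an illustration, assuming a replSetGetStatus document has already been fetched (for example with PyMongo's `client.admin.command("replSetGetStatus")`). It relies on the `health` and `lastHeartbeatRecv` fields of each entry in `members`; the function name, host names, and timestamps are hypothetical, and the reconfiguration step itself is not shown.

```python
from datetime import datetime, timedelta, timezone


def members_down_longer_than(status, threshold, now):
    """Names of members reported unhealthy whose last heartbeat is older than threshold.

    `now` and the `lastHeartbeatRecv` values must both be naive or both be
    timezone-aware, otherwise the subtraction raises TypeError.
    """
    stale = []
    for member in status.get("members", []):
        if member.get("health") != 0:
            continue  # member is reachable (or is the node reporting the status)
        last_seen = member.get("lastHeartbeatRecv")
        if last_seen is not None and now - last_seen > threshold:
            stale.append(member["name"])
    return stale


# Hypothetical status document for illustration only.
now = datetime(2020, 2, 12, 9, 0, tzinfo=timezone.utc)
status = {
    "members": [
        {"name": "host-a:27017", "health": 1, "stateStr": "PRIMARY"},
        {"name": "host-b:27017", "health": 0,
         "lastHeartbeatRecv": now - timedelta(minutes=45)},
        {"name": "host-c:27017", "health": 0,
         "lastHeartbeatRecv": now - timedelta(minutes=5)},
    ]
}
print(members_down_longer_than(status, timedelta(minutes=30), now))
```

Only the member that has been silent for more than the 30-minute threshold is returned; a member that dropped out a few minutes ago is left alone, which avoids reconfiguring on short outages.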

Thanks,
Anil

Frank Shimizu

Feb 20, 2020, 5:34:46 AM
to mongodb-user
Hi,

I stumbled upon something today and it made me remember this thread:

It says that in a three member Primary-Secondary-Arbiter cluster, the cache pressure will increase if any Secondary data bearing node is down, if readConcern:majority is enabled, which seems to be the default. Therefore it recommends disabling readConcern:majority in such a cluster setup. I understand from your original post that you have more than three members, but I was wondering if this could also be affecting you, since the symptom sounds similar to what the article describes. Maybe somebody with more knowledge can chime in?

Regards

Frank Shimizu

Feb 20, 2020, 5:38:54 AM
to mongodb-user
Sorry for the double-post, but in this issue there is additional information:
> The risk is greatest in a 3-member PSA, since you only need one data-bearing node to go down in order to lose majority writes. However, the risk exists in any set with an arbiter, since they allow you to accept writes when the majority commit point cannot move forward. We are going to recommend all sets with arbiters use enableMajorityReadConcern:"false".

So it seems that it's not exclusive to a 3-member replica set. It might be worth investigating this further.
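
The arithmetic behind the quoted statement can be sketched in Python for the 7-member layout from the original post. This is a simplified model under two assumptions: every member votes, and arbiters can vote in elections but cannot acknowledge writes. The function names are hypothetical, and whether this mechanism explains the memory growth reported in this thread is not established here.

```python
def majority(voting_members):
    """Smallest number of voting members that forms a strict majority."""
    return voting_members // 2 + 1


def describe(data_bearing, arbiters, data_bearing_down):
    """Model whether a primary can be elected and whether majority commit can advance."""
    voting = data_bearing + arbiters
    needed = majority(voting)
    up_voters = voting - data_bearing_down      # arbiters still vote
    up_data = data_bearing - data_bearing_down  # only these can acknowledge writes
    return {
        "majority_needed": needed,
        "can_elect_primary": up_voters >= needed,
        "majority_commit_can_advance": up_data >= needed,
    }


# 4 data-bearing members + 3 arbiters, with 1 data-bearing member down.
print(describe(4, 3, 1))
```

In this model a majority of 7 voters is 4, so all 4 data-bearing members are needed for the majority commit point to advance. With a single data-bearing member down, a primary can still be elected, but only 3 members can acknowledge writes.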

Abhay Mukewar

Mar 1, 2020, 1:40:25 PM
to mongod...@googlegroups.com
Hi Frank,
Thanks for pointing that out. We are checking the code for any places that may have been overlooked, where there is a read against the DB but the read concern is not specified.
I understand that specifying the read concern as either "secondary" or "secondaryPreferred" will help avoid problems when there is an arbiter present in the replica set. Please correct me if I am wrong.

Regards,
Abhay


--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.
 
For other MongoDB technical support options, see: https://docs.mongodb.com/manual/support/
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/51861217-fd98-432e-b9b4-7337189dd93e%40googlegroups.com.

