Best Practices | MongoDB on Virtual machine


Tarun Kumar Jaiswal

Mar 28, 2014, 6:57:28 AM
to mongod...@googlegroups.com
Hello,

I wanted to check whether there is documentation on configuring CentOS for better MongoDB performance. Below is what I have so far:


  • CentOS 6.x 

  • MongoDB 2.4.9 installed via Yum 

  • Individual persistent disks for data, journal, and log 

  • Updated read-ahead values for each disk 

  • Updated ulimit settings 

  • Updated TCP keep-alive settings

  • MongoDB replica set with 1 secondary and 1 arbiter



1) Ulimit settings 

Every deployment may have unique requirements and settings; however, the following thresholds and settings are particularly important for mongod and mongos deployments: 

 

  • -f (file size): unlimited 

  • -t (cpu time): unlimited 

  • -v (virtual memory): unlimited [1] 

  • -n (open files): 64000 

  • -m (memory size): unlimited [1] 

  • -u (processes/threads): 64000 

Always remember to restart your mongod and mongos instances after changing the ulimit settings so that the change takes effect. 
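
As a quick check of the thresholds above, here is a rough sketch that compares the limits of the current shell with the recommended values. The helper name check_limit is made up for illustration and bash is assumed. It reports the limits of the shell it runs in, which are not necessarily those of an already running mongod.

```shell
#!/bin/bash
# check_limit CURRENT WANTED  (hypothetical helper, not part of MongoDB)
# Prints "ok" if CURRENT satisfies WANTED, "low" if it does not,
# and "unknown" if CURRENT is empty or not a number.
check_limit() {
    case "$1" in
        unlimited) echo ok ;;
        ''|*[!0-9]*) echo unknown ;;
        *)
            if [ "$2" = unlimited ]; then
                echo low
            elif [ "$1" -ge "$2" ]; then
                echo ok
            else
                echo low
            fi ;;
    esac
}

# Compare each limit of this shell with the value listed above.
echo "file size         (-f): $(check_limit "$(ulimit -f 2>/dev/null)" unlimited)"
echo "cpu time          (-t): $(check_limit "$(ulimit -t 2>/dev/null)" unlimited)"
echo "virtual memory    (-v): $(check_limit "$(ulimit -v 2>/dev/null)" unlimited)"
echo "open files        (-n): $(check_limit "$(ulimit -n 2>/dev/null)" 64000)"
echo "memory size       (-m): $(check_limit "$(ulimit -m 2>/dev/null)" unlimited)"
echo "processes/threads (-u): $(check_limit "$(ulimit -u 2>/dev/null)" 64000)"

# For a process that is already running, the effective limits can be read with:
#   cat /proc/$(pidof mongod)/limits
```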

 

2) Make sure the time zone (TZ) is set to UTC 

 

ls -al /etc/localtime 

lrwxrwxrwx 1 root root 23 Mar 14 03:31 /etc/localtime -> /usr/share/zoneinfo/UTC 

 

 

3) Change the default TCP keep-alive time to 300 seconds 

echo 300 > /proc/sys/net/ipv4/tcp_keepalive_time 
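
A value written to /proc this way does not survive a reboot. The sketch below reads the current value and shows, in comments, one way to apply and persist it with sysctl. The helper name keepalive_status is made up for illustration, and the sysctl commands need root.

```shell
#!/bin/bash
# keepalive_status CURRENT TARGET  (hypothetical helper)
# Prints "ok" if CURRENT is at or below TARGET, otherwise "too high".
keepalive_status() {
    if [ "$1" -le "$2" ]; then
        echo "ok"
    else
        echo "too high"
    fi
}

target=300
f=/proc/sys/net/ipv4/tcp_keepalive_time

# Report the value currently in effect, if the file is readable.
if [ -r "$f" ]; then
    current=$(cat "$f")
    echo "tcp_keepalive_time=$current: $(keepalive_status "$current" "$target")"
fi

# To apply the value now and keep it across reboots (as root):
#   sysctl -w net.ipv4.tcp_keepalive_time=300
#   echo 'net.ipv4.tcp_keepalive_time = 300' >> /etc/sysctl.conf
```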

 

4) Disable the overcommit feature on VMware machines 

Overcommitting resources, particularly memory, is harmful. VMware moves the overcommitted memory around between the guest VMs on the host, so a guest VM can suddenly find that its memory is not available.   

   

5) Mount the data volume with noatime 

noatime - Adding the noatime option eliminates the need for the system to write to the file system for files that are simply being read. In other words: faster file access and less disk wear.  

From <http://blog.softlayer.com/2012/mongodb-architectural-best-practices/> 
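
To see whether a volume is already mounted with noatime, something like the following sketch could be used. The mount point /data, the device /dev/sdb1 and the helper name has_noatime are assumptions for illustration only.

```shell
#!/bin/bash
# has_noatime OPTIONS  (hypothetical helper)
# OPTIONS is a comma-separated mount option string, e.g. "rw,noatime".
# Prints "yes" if noatime is one of the options, otherwise "no".
has_noatime() {
    case ",$1," in
        *,noatime,*) echo yes ;;
        *) echo no ;;
    esac
}

mount_point=/data   # assumed location of the MongoDB data volume

# The fourth field of /proc/mounts holds the active mount options.
if [ -r /proc/mounts ]; then
    opts=$(awk -v mp="$mount_point" '$2 == mp { print $4 }' /proc/mounts)
    if [ -n "$opts" ]; then
        echo "$mount_point noatime: $(has_noatime "$opts")"
    fi
fi

# Example /etc/fstab entry (device name is a placeholder):
#   /dev/sdb1  /data  ext4  defaults,noatime  0 0
# To apply without a reboot (as root):
#   mount -o remount,noatime /data
```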


6) MongoDB preallocates its database files before using them and often creates large files. As such, you should use the Ext4 or XFS file system. 

7) Run MongoDB as a forked process.

8) Set MongoDB to start automatically at machine start-up.

9) Disk: RAID 1+0 (this may change depending on the type of files you want to keep).

10) Run mongod as a separate user (not root).


Any suggestions/inputs from production deployment experience?


Note: I referenced the official MongoDB documentation for many of the above recommendations and have tried to provide links to the blogs I used excerpts from.

Eoin Brazil

Apr 1, 2014, 1:49:49 PM
to mongod...@googlegroups.com
Hi Tarun,

There are a few documents on getting the best performance. The primary source is the Production Notes (http://docs.mongodb.org/manual/administration/production-notes/), followed by the Administration guide (http://docs.mongodb.org/manual/MongoDB-administration-guide.pdf). After these, the most useful documents are the platform-specific notes (http://docs.mongodb.org/ecosystem/platforms/), which cover EC2, RackSpace, Azure, Cloud Foundry, GCE, Joyent, OpenShift and dotCloud.

Where possible, it is recommended that you install with a package manager such as Yum, as the package also includes the necessary control script(s), which you can modify so that MongoDB restarts automatically when the machine reboots.

You should configure your system to use swap.
Using RAID-10 is recommended.
Disable NUMA on your machine.
Disable transparent huge pages (THP), as MongoDB performs better with normal (4096-byte) virtual memory pages.
Use only 64-bit builds in production; if possible, use the 64-bit build everywhere.
Use the Network Time Protocol (NTP) to synchronize time among your hosts.
Make sure you run your MongoDB instances in a secure network, and configure the necessary firewall rules to prevent unauthorised access (http://docs.mongodb.org/manual/administration/production-notes/#use-trusted-networking-environments).
It is recommended to use EXT4 or XFS for your storage layer.
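
As an illustration of the THP item above, here is a small sketch that reports the active THP mode. The kernel marks the active mode with square brackets, for example "[always] madvise never". The helper name thp_active is made up, and the second path is the location I would expect on RHEL/CentOS 6 kernels, so treat it as an assumption for your system.

```shell
#!/bin/bash
# thp_active CONTENTS  (hypothetical helper)
# CONTENTS is the text of the THP "enabled" file; prints the bracketed mode.
thp_active() {
    echo "$1" | sed -n 's/.*\[\([a-z]*\)].*/\1/p'
}

# Check the mainline path and the path used by RHEL/CentOS 6 kernels.
for f in /sys/kernel/mm/transparent_hugepage/enabled \
         /sys/kernel/mm/redhat_transparent_hugepage/enabled; do
    if [ -r "$f" ]; then
        echo "$f: $(thp_active "$(cat "$f")")"
    fi
done

# To disable THP until the next reboot (as root), write "never" to the
# file that exists on your system, for example:
#   echo never > /sys/kernel/mm/transparent_hugepage/enabled
```

The setting is lost on reboot, so it would normally also go into an init script that runs before mongod starts.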

Before selecting your operating system there are a number of choices to make around RAM and disk storage. For the best performance we recommend sufficient RAM to hold at least your working data set and the related indexes. For disk storage, it is generally best to use local disks, and if you can, SSDs will give better performance than spinning disks. If you are considering a SAN, you should make sure that it has sufficiently low latency and enough IOPS to meet your requirements. In situations where IOPS is variable, you might consider provisioned options, such as our example of how to do this on EC2 with PIOPS (http://docs.mongodb.org/ecosystem/platforms/amazon-ec2/). As you mentioned, it is best to mount your data files, your journal and your log on separate volumes on dedicated disks.

There are various operating system settings to consider, including network and process limit configurations. The ulimits for processes are important and typically need to be adjusted for MongoDB (see http://docs.mongodb.org/manual/reference/ulimit), as does the read-ahead, depending on your document sizes and your specific workload (the generic recommendation is typically 32 blocks / 16KB, but this varies). The TCP keep-alive time should be lowered to 300 seconds (five minutes) from a common default setting of 7200 seconds (2 hours). As you highlighted, atime should be turned off on the MongoDB volume that holds your database files, to avoid writes when data is only being read.
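
To make the read-ahead figure concrete: assuming the "blocks" above are the 512-byte sectors that blockdev reports, 32 of them come to 16 KB. The sketch below does that conversion and lists the current read-ahead of each block device from sysfs. The helper name ra_sectors_to_kb and the device /dev/sdX are placeholders.

```shell
#!/bin/bash
# ra_sectors_to_kb SECTORS  (hypothetical helper)
# Converts a read-ahead value in 512-byte sectors to kilobytes.
ra_sectors_to_kb() {
    echo $(( $1 * 512 / 1024 ))
}

echo "32 sectors = $(ra_sectors_to_kb 32) KB"

# sysfs exposes the current read-ahead per device, already in KB.
for f in /sys/block/*/queue/read_ahead_kb; do
    if [ -r "$f" ]; then
        echo "$f: $(cat "$f") KB"
    fi
done

# blockdev works in 512-byte sectors (as root; /dev/sdX is a placeholder):
#   blockdev --getra /dev/sdX
#   blockdev --setra 32 /dev/sdX
```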

The recommended deployment architectures for replica sets (http://docs.mongodb.org/manual/core/replica-set-architectures/) and sharded clusters (http://docs.mongodb.org/manual/core/sharded-cluster-architectures-production/) are covered in our docs.

Hope this helps clarify your questions.

Thanks,
Eoin