Dear All,
We would like to keep the indexes entirely in RAM at all times so that the system can avoid reading them from disk.
JFYI, our database box has 64 GB of RAM, of which 50 GB is allocated to the WiredTiger cache (wiredTigerCacheSizeGB=50).
The total index size is only 13 GB. We want to keep the indexes in RAM permanently.
In our system, we update entire collections every day (~300 million documents, about 40 GB) in parallel (i.e. a full COLLSCAN, not an IXSCAN).
During that period, indexes that were in memory may be evicted because no query is using them or because the memory is needed for the updates.
Subsequently, if any query needs those indexes, it may have to read them from disk, and the memory may not be available.
To handle these situations, does MongoDB have any option for pinning objects in memory permanently?
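For reference, the index-size figure above can be double-checked against the server itself; here is a minimal pymongo sketch, where the connection string, database, and collection names are placeholders:

    from pymongo import MongoClient

    # Placeholder connection details -- adjust to the actual deployment.
    client = MongoClient("mongodb://localhost:27017")
    db = client["mydb"]

    # Database-wide index size in bytes (dbStats command).
    db_stats = db.command("dbstats")
    print(f"Total index size: {db_stats['indexSize'] / 1024**3:.1f} GB")

    # Per-collection breakdown (collStats command).
    for name in db.list_collection_names():
        coll_stats = db.command("collstats", name)
        print(f"{name}: {coll_stats.get('totalIndexSize', 0) / 1024**2:.1f} MB of indexes")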
Thanks & Regards,
Racki
Changed in version 3.0.0.
If the current storage engine does not support touch, the touch command will return an error.
The MMAPv1 storage engine supports touch.
The WiredTiger storage engine does not support touch.
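For completeness, this is roughly how the touch command is issued (shown here via pymongo; the collection name is a placeholder). On an MMAPv1 deployment it asks the server to load the collection's data and/or index files into memory; under WiredTiger the same call simply returns an error, as noted above.

    from pymongo import MongoClient
    from pymongo.errors import OperationFailure

    client = MongoClient("mongodb://localhost:27017")
    db = client["mydb"]

    try:
        # Ask the server to pre-load this collection's indexes into memory.
        # Supported by MMAPv1 only; WiredTiger rejects the command.
        result = db.command("touch", "mycollection", data=False, index=True)
        print("touch result:", result)
    except OperationFailure as exc:
        print("touch not supported by this storage engine:", exc)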
On Fri, Feb 5, 2016 at 7:10 PM, Racki wrote:
We would like to keep the indexes entirely in RAM at all times so that the system can avoid reading them from disk.
JFYI, our database box has 64 GB of RAM, of which 50 GB is allocated to the WiredTiger cache (wiredTigerCacheSizeGB=50).
The total index size is only 13 GB. We want to keep the indexes in RAM permanently.
Hi Racki,
As of MongoDB 3.2 there is no mechanism for permanently pinning indexes in memory or configuring a separate cache for indexes. Collection data and indexes are paged out of memory based on a Least Recently Used (LRU) algorithm.
If you suspect your indexes are being frequently evicted from RAM, obvious strategies include removing unused/unnecessary indexes, adding missing indexes to avoid collection scans, sharding, or adding more RAM. You could also consider using the WiredTiger engine’s directoryForIndexes option and storing your indexes on higher-IOPS drives than your collection data.
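One way to find unused indexes is the $indexStats aggregation stage, new in MongoDB 3.2, which reports how often each index has been used since the server last started. A minimal pymongo sketch, with placeholder connection and collection names:

    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    coll = client["mydb"]["mycollection"]

    # $indexStats (MongoDB 3.2+) returns one document per index with an
    # access counter accumulated since the last server restart.
    for stats in coll.aggregate([{"$indexStats": {}}]):
        accesses = stats["accesses"]
        print(f'{stats["name"]}: {accesses["ops"]} ops since {accesses["since"]}')
        # Indexes showing zero ops over a representative period are candidates
        # for review; avoid dropping anything based on a short observation window.

Note that directoryForIndexes is a startup-time setting (storage.wiredTiger.engineConfig.directoryForIndexes) that applies when the data files are created, so moving an existing deployment's indexes onto separate drives generally means rebuilding the data files (for example via an initial sync).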
However, there’s another recommendation which may be less obvious: you should try reducing your cacheSizeGB back to the default size and test the effects for your workload. WiredTiger’s cacheSizeGB defaults to 50% of RAM (MongoDB 3.0) or 60% of RAM (MongoDB 3.2) and we generally recommend you avoid increasing this. For some workloads, an even smaller setting can be very beneficial.
The cacheSizeGB setting only limits the memory directly used by the WiredTiger storage engine (not the total memory used by mongod). For example: open connections, aggregations, and server-side JavaScript can all consume memory. Memory which is not used by mongod or other processes is available for filesystem cache.
Data in the WiredTiger cache is generally uncompressed (although indexes do support prefix compression), whereas blocks in the filesystem cache will match the on-disk representation. If your data has significant compression, the filesystem cache will effectively keep more of your working set in memory. Misses from the filesystem cache (involving disk I/O) are much more expensive than misses from the WiredTiger cache. For best performance you want to allow sufficient memory for both caches to work efficiently for your workload.
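If you want to see how the WiredTiger cache is actually being used while experimenting with cacheSizeGB, the cache counters in serverStatus are a reasonable starting point. A minimal pymongo sketch (connection details are placeholders; the labels below are the statistic names WiredTiger reports, and any that are missing are simply skipped):

    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")

    # serverStatus exposes the WiredTiger cache counters under wiredTiger.cache.
    cache = client["admin"].command("serverStatus")["wiredTiger"]["cache"]

    for label in (
        "maximum bytes configured",          # effective cacheSizeGB limit
        "bytes currently in the cache",      # uncompressed data + indexes held by WiredTiger
        "tracked dirty bytes in the cache",  # modified data not yet written out
    ):
        value = cache.get(label)
        if value is not None:
            print(f"{label}: {value / 1024**3:.1f} GB")

    # Whatever memory mongod and other processes leave free is what the OS can
    # use for the filesystem cache, which holds the compressed on-disk blocks.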
In our system, we update entire collections every day (~300 million documents, about 40 GB) in parallel (i.e. a full COLLSCAN, not an IXSCAN).
How (and why) are you updating collections in parallel with a collection scan?
Regards,
Stephen