Deletes make index grow

eborrell...@gmail.com

Mar 7, 2018, 6:43:59 AM
to mongodb-user

Hi,

We have a replica set with 3 nodes. We have noticed that when we perform a lot of deletes, the index grows a lot (1GB). We do not understand why a delete would make an index grow. We would like to understand how an index grows: does it reserve space in advance, or does it grow continuously? Does anyone have information about index growth?



Regards

Eugenia

Kevin Adistambha

Apr 17, 2018, 9:17:35 PM
to mongodb-user

Hi Eugenia

You posted this question some time ago. Have you found a satisfactory explanation yet?

If by “index grow” you mean growth in the space used by the data files, and you’re using the WiredTiger storage engine (the default in MongoDB 3.2 and newer), it is possible in some cases for a data file to increase in size even after deletes.

Part of the reason is that WiredTiger is a no-overwrite storage engine. That is, it will never overwrite your data on disk (e.g. updating a document will not overwrite the current on-disk document). Instead, it will create a new copy of the updated document in a different place, and then mark the old document as unused. This is also the case with deletion.

Another reason is that WiredTiger favors speed over disk space usage to some extent, and shrinking or extending an on-disk file is an expensive process. Unused space (from superseded or removed documents) is reused eventually, but likely not immediately.

What likely happened in your case is that you have a previous or currently running insert/update workload that requires the extra space. The deletes would return the space to WiredTiger, but not to the OS, except under very specific circumstances (e.g. when WiredTiger can truncate the data file without any performance penalty). To force the space to be returned to the OS, you may be able to run the compact command. However, please note that there is no guarantee that space will be returned, and the command itself may require additional disk space to run.
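As a rough illustration of the compact step, here is a minimal sketch that runs the command and compares the reported sizes before and after. The helper name compactAndReport is hypothetical, and it assumes db is a mongo shell database handle (anything exposing getCollection(name).stats() and runCommand() would do). A result of zero bytes returned is entirely possible, for the reasons given above.

```javascript
// Sketch only: `compactAndReport` is a hypothetical helper, not a MongoDB API.
// `db` is assumed to be a mongo shell database handle, or any object that
// exposes getCollection(name).stats() and runCommand(cmd).
function compactAndReport(db, collName) {
  // Read the on-disk sizes reported by collStats for data and indexes.
  function sizes() {
    const s = db.getCollection(collName).stats();
    return { data: s.storageSize, indexes: s.totalIndexSize };
  }

  const before = sizes();
  // The compact command takes the collection name as its value.
  const result = db.runCommand({ compact: collName });
  const after = sizes();

  return {
    ok: result.ok === 1,
    // These may be 0: compact does not guarantee space is released.
    dataBytesReturned: before.data - after.data,
    indexBytesReturned: before.indexes - after.indexes,
  };
}
```

In the shell this could be called as compactAndReport(db, "mycollection"), where the collection name is a placeholder. The compact command is not replicated, so in a replica set it would need to be run on each member separately, and depending on the server version it may need the force option to run on a primary.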

You can determine how much space is available for reuse if you run:

db.collection.stats()['wiredTiger']['block-manager']

The relevant metric is listed as file bytes available for reuse. If you see a high number on this metric, then it is possible that you can insert a lot of documents without growing the disk usage at all, since it’s all reused from your previous deletes.
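Since the original question was about indexes, the same metric can be read per index as well. The following is a minimal sketch, assuming the stats document comes from db.collection.stats({ indexDetails: true }) on WiredTiger, so that each entry under indexDetails carries its own block-manager section. The function name reusableSpaceReport is hypothetical.

```javascript
// Sketch only: `reusableSpaceReport` is a hypothetical helper.
// `stats` is assumed to be the document returned by
// db.collection.stats({ indexDetails: true }) on WiredTiger.
function reusableSpaceReport(stats) {
  const REUSE = "file bytes available for reuse";
  const SIZE = "file size in bytes";

  // Pull the two block-manager metrics out of one WiredTiger stats section.
  function summarise(section) {
    const bm = (section && section["block-manager"]) || {};
    const reusable = Number(bm[REUSE] || 0);
    const fileSize = Number(bm[SIZE] || 0);
    return {
      reusableBytes: reusable,
      fileSizeBytes: fileSize,
      reusableFraction: fileSize > 0 ? reusable / fileSize : 0,
    };
  }

  const report = { collection: summarise(stats.wiredTiger), indexes: {} };
  const details = stats.indexDetails || {};
  for (const name of Object.keys(details)) {
    report.indexes[name] = summarise(details[name]);
  }
  return report;
}
```

In the shell, something like reusableSpaceReport(db.collection.stats({ indexDetails: true })) would then show, for the collection and for each index, how much of the file is free for reuse. A high fraction for an index suggests its file is mostly holding space that later inserts can fill without further growth.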

Since you’re running a replica set, you can also perform an initial sync on the relevant node. The initial sync process will allow MongoDB to start clean with a new set of data files. However, this is a major maintenance procedure that could take some time and is not without risk.

Best regards
Kevin
