Tracking unused space and disabling mongo preallocation


Ratheesh R

Jan 16, 2014, 9:25:01 AM
to mongod...@googlegroups.com, Prabu RM
Hi,

To confirm how MongoDB preallocation behaves, we ran some tests in our environment, and we need help understanding results that differ from what we expected. We disabled preallocation by setting noprealloc=true in the config file and restarted the server instance. We then built a typical MongoDB database (named sgserver) of about 330 MB, which we expected to span three data files, sgserver.0, sgserver.1, and sgserver.2, with sizes of 64 MB, 128 MB, and 256 MB respectively, plus the namespace file sgserver.ns.

Instead, four data files were created: sgserver.0, sgserver.1, sgserver.2, and sgserver.3, with sizes of 64 MB, 128 MB, 256 MB, and 512 MB respectively. Why does preallocation happen even when noprealloc is set to true? We calculated the database usage as follows:

DB Data size = datasize + totalindexsize = 330 MB

330 MB should fit within the first three data files, whose sizes sum to 448 MB (64 + 128 + 256).

Because we believed the fourth file (512 MB) was unused, we deleted it and then queried the database through our application. mongod restarted without any issue, but queries failed with the error below, which we captured by enabling logging.

Wed Jan 15 23:52:13.242 [conn23] error: getFile() called in a read lock, yet file to return is not yet open
Wed Jan 15 23:52:13.242 [conn23] getFile(3) _files.size:3 /home/MongoDB/MongoDB/srv/db/mongodb/sgserver.3
Wed Jan 15 23:52:13.242 [conn23] context ns: sgserver.

How can we identify unused files, and how can we resolve the errors above after deleting those files?

Please shed some light on this. We await your reply.

Thanks & Regards,
Ratheesh R.

s.molinari

Jan 16, 2014, 10:47:42 AM
to mongod...@googlegroups.com, Prabu RM
From what I understand, once the database starts using the 256MB file (sgserver.2), MongoDB allocates the next file automatically. That would be sgserver.3 at 512MB. So if your data is 330MB, the DB would have taken up 960MB of disk space.
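The arithmetic behind the 448MB and 960MB figures can be sketched as follows. This is only an illustration, assuming the data files double in size from 64 MB up to a 2 GB cap and that one extra file is allocated ahead of use, as described above. The function names are hypothetical and are not MongoDB APIs, and the 330 MB figure ignores any storage overhead.

```python
# Sketch of data file sizing: files double from 64 MB up to a 2 GB cap.
# Function and variable names are illustrative, not MongoDB APIs.

def data_file_sizes_mb(count, first_mb=64, cap_mb=2048):
    """Return the sizes (in MB) of the first `count` numbered data files."""
    return [min(first_mb * 2 ** i, cap_mb) for i in range(count)]


def files_needed(used_mb, first_mb=64, cap_mb=2048):
    """Return how many data files are needed before `used_mb` fits."""
    count, total = 0, 0
    while total < used_mb:
        total += min(first_mb * 2 ** count, cap_mb)
        count += 1
    return count


needed = files_needed(330)                   # files that actually hold 330 MB
in_use = data_file_sizes_mb(needed)          # [64, 128, 256]
with_next = data_file_sizes_mb(needed + 1)   # one extra file allocated ahead

print(needed, sum(in_use), sum(with_next))   # prints: 3 448 960
```

Under these assumptions, three files cover 330 MB, and the fourth file accounts for the jump from 448 MB to 960 MB on disk.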

Scott

s.molinari

Jan 16, 2014, 10:49:54 AM
to mongod...@googlegroups.com, Prabu RM
Ah, and it is also advised to not turn off preallocation, as this will cause performance issues.

Scott

Ratheesh

Jan 17, 2014, 8:28:59 AM
to mongod...@googlegroups.com, Prabu RM
Thanks for your reply. I know that preallocation should not be turned off. I am not sure why MongoDB allocates the next file automatically even after noprealloc is set to true. Is there any reason why Mongo creates the file automatically? If you know the reason, could you please let me know? Awaiting your reply.


Thanks & Regards,
Ratheesh R.

s.molinari

Jan 17, 2014, 10:32:39 AM
to mongod...@googlegroups.com, Prabu RM
I am not sure why noprealloc isn't working for you, but preallocating a data file before it is used is just a matter of doing the work ahead of time, so that the file is ready when the database needs the space. In other words, it is done for performance reasons. This is what the MongoDB manual says:

Preallocating data files in the background prevents significant delays when a new database file is next allocated.

Worded differently, the preallocation prevents significant delays, which would happen at the time the database would need the space, had the allocation not been done previously.

Scott

William Zola

Jan 18, 2014, 9:13:52 AM
to mongod...@googlegroups.com, Prabu RM
Hi Ratheesh!

If you're running with '--noprealloc', then the 'mongod' process will only allocate the next file when it has run out of space in the current files. There's a fair amount of overhead in MongoDB relative to the actual amount of data stored in the collection and in the indexes. Unfortunately, the db.<collection>.stats() and db.stats() commands don't account for that overhead. Here's a pair of good presentations on how the storage engine works:

Once you've understood the content of those presentations, then you'll be able to understand the output of the db.<collection>.validate() command, which will likely show you where the "extra" space is being used.
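As a rough illustration of where the "extra" space can sit, the sketch below splits the total file size into data, indexes, slack inside allocated extents, and space not yet assigned to anything. It assumes a dictionary shaped like the db.stats() output, using its dataSize, indexSize, storageSize and fileSize fields. The sample numbers are hypothetical, the helper name is made up for this example, and the split is only approximate.

```python
# Sketch: compare what db.stats() reports as data with what is allocated on disk.
# Field names follow db.stats(); the sample values below are hypothetical.

MB = 1024 * 1024


def space_breakdown_mb(stats):
    """Split the allocated file space into rough categories, in MB."""
    data = stats["dataSize"]
    indexes = stats["indexSize"]
    storage = stats["storageSize"]   # extents allocated to collections
    files = stats["fileSize"]        # total size of the numbered data files
    return {
        "data_mb": data / MB,
        "indexes_mb": indexes / MB,
        # Space inside collection extents that holds no documents (padding, deletions).
        "extent_slack_mb": (storage - data) / MB,
        # Space in the data files not assigned to any collection or index.
        "unassigned_mb": (files - storage - indexes) / MB,
    }


sample_stats = {
    "dataSize": 250 * MB,
    "indexSize": 80 * MB,
    "storageSize": 300 * MB,
    "fileSize": 960 * MB,
}

breakdown = space_breakdown_mb(sample_stats)
for name, value in breakdown.items():
    print(f"{name}: {value:.0f}")
```

With these made-up numbers, data plus indexes come to 330 MB, while most of the 960 MB on disk is not yet assigned to anything.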

Since you've removed the data files at the OS level and restarted 'mongod', you have corrupted your database.  At this point, you need to either:
 - re-sync from another node in the replica set
 - repair the database, and accept that whatever data was in the files that you removed is now permanently lost.
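The getFile() log line quoted earlier in the thread can be read mechanically. The sketch below pulls out the requested file number, the number of open files, and the path. The interpretation is an assumption on my part: a requested number that is not below the open-file count means the server wanted a file it no longer has. The regular expression and helper name are illustrative only.

```python
import re

# Log line copied from the original post.
LOG_LINE = (
    "Wed Jan 15 23:52:13.242 [conn23] getFile(3) _files.size:3 "
    "/home/MongoDB/MongoDB/srv/db/mongodb/sgserver.3"
)

# Hypothetical pattern for this one message shape; not an official log format spec.
GETFILE_RE = re.compile(r"getFile\((\d+)\) _files\.size:(\d+) (\S+)")


def parse_getfile(line):
    """Return the requested file number, open file count and path, or None."""
    match = GETFILE_RE.search(line)
    if match is None:
        return None
    return {
        "requested": int(match.group(1)),
        "open_files": int(match.group(2)),
        "path": match.group(3),
    }


info = parse_getfile(LOG_LINE)
# File numbers start at 0, so number 3 does not exist among 3 open files.
out_of_range = info["requested"] >= info["open_files"]
print(info["path"], out_of_range)
```

Here the server asked for file number 3 while only files 0 through 2 were open, which matches the deleted sgserver.3.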

In the future, if you want to reduce your disk file usage, you should use the '--repair' flag to 'mongod'.  This will reduce your disk space usage to the minimum possible.   If disk usage is an issue for you, you can consider and evaluate the following:
 - Use a smaller number of databases: there is unavoidable disk space overhead for each additional database you use
 - Use the '--smallfiles' option in addition to the '--noprealloc' option
 - Modify your schema to use the least amount of space possible
 - Run --repair on a regular basis
 - Buy bigger disks (often the easiest and cheapest option!)
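To get a feel for what '--smallfiles' changes, here is a small sketch comparing the space allocated for a few data sizes. It assumes default files start at 64 MB and double up to 2 GB, while '--smallfiles' starts at 16 MB and caps at 512 MB. It also assumes no file is allocated ahead of use, and it ignores the .ns file, the journal, and storage overhead. The function name is hypothetical.

```python
# Sketch: allocated data file space with and without --smallfiles.
# Assumed sizing: default 64 MB doubling to 2 GB; smallfiles 16 MB doubling to 512 MB.

def allocated_mb(used_mb, first_mb, cap_mb):
    """Return the total size of the files needed to hold `used_mb`."""
    total, size = 0, first_mb
    while total < used_mb:
        total += size
        size = min(size * 2, cap_mb)
    return total


for used in (10, 100, 330):
    default_total = allocated_mb(used, first_mb=64, cap_mb=2048)
    small_total = allocated_mb(used, first_mb=16, cap_mb=512)
    print(f"{used} MB used: default {default_total} MB, smallfiles {small_total} MB")
```

Under these assumptions the saving is largest for small databases, which fits the advice about per-database overhead. For a single database of around 330 MB the two settings end up close to each other.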


 -William 

s.molinari

Jan 18, 2014, 11:24:36 AM
to mongod...@googlegroups.com, Prabu RM
Nice post. Thanks William. Learning every day!

Scott

Ratheesh

Jan 20, 2014, 12:01:34 AM
to mongod...@googlegroups.com, Prabu RM
Thanks for your valuable information.


Thanks & Regards,
Ratheesh R.
