Pre-allocation in MongoDB

Abhi

Jun 10, 2014, 8:55:15 AM
to mongod...@googlegroups.com
Hi,
I am using MongoDB with default settings. I inserted a 100 GB dataset and it takes up 154 GB on disk, which seems like too much.
I tried to work out how it reached this value:

avg object size = 496 bytes
dataSize = 107374210896 bytes -- 100 GB
storage size = 112334441712 bytes -- 104.6 GB
total index size = 38482903088 bytes -- 35.8 GB
fileSize = 165209440256 bytes -- 154 GB

Still, I am not able to explain the extra space left over after storage and indexes (about 13 GB by these stats). It is probably due to pre-allocation.
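
For reference, the gap can be checked directly from the figures above. This is a minimal Python sketch, assuming the GB values quoted in the stats are binary gigabytes (2^30 bytes), which is what the conversions above suggest. The variable names are illustrative only.

```python
# Figures copied from the stats above, in bytes.
storage_size = 112334441712
total_index_size = 38482903088
file_size = 165209440256

GIB = 2 ** 30  # assumption: "GB" in the stats means 2^30 bytes

# Space in the data files that neither collection storage nor indexes explain.
accounted = storage_size + total_index_size
unexplained = file_size - accounted

print(f"storage + indexes: {accounted / GIB:.1f} GB")
print(f"unexplained:       {unexplained / GIB:.1f} GB")
```

By these figures, storage plus indexes come to about 140.5 GB, leaving roughly 13.4 GB of file space unaccounted for.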

Can someone explain how pre-allocation works, when it is triggered, and how many files it allocates in advance?

Thanks,
Abhi



s.molinari

Jun 10, 2014, 9:06:32 AM
to mongod...@googlegroups.com
Hi Abhi,

If you are using 2.6.0+, then take a look at this.

http://docs.mongodb.org/manual/core/storage/#power-of-2-sized-allocations

which is now standard in these versions.

Scott

Abhi

Jun 10, 2014, 11:14:32 AM
to mongod...@googlegroups.com
Hi Scott,
I have looked at that section, but I am not asking about how records are allocated. My question is about the extra space taken by the data files and how pre-allocation of data files works.

Thanks,
Abhi

s.molinari

Jun 11, 2014, 9:12:38 AM
to mongod...@googlegroups.com
But record allocation also affects how much of the allocated space is taken up, AFAIK.

Scott

Abhi

Jun 12, 2014, 2:26:45 AM
to mongod...@googlegroups.com
Hi Scott,

I agree that record allocation determines how much space is allocated to each document. But my question is more about how pre-allocation of data files works.
How many data files are pre-allocated? At what point does pre-allocation happen? Is there a condition that triggers it?


Thanks,
Abhi

s.molinari

Jun 12, 2014, 4:07:55 AM
to mongod...@googlegroups.com
AFAIK, pre-allocation is always on with powerOf2Sizes as of 2.6.0. I am also still not sure exactly how it works.

I also know that MongoDB pre-allocates the data files themselves in the file system. The first two data files are 64 MB and 128 MB, so a new database in Mongo will take up 192 MB for data. There is also 16 MB for the .ns file and, depending on your journal settings, the journal could take up several gigabytes.

But yeah, I'd also like to understand pre-allocation of documents better. 


Scott

s.molinari

Jun 12, 2014, 4:09:46 AM
to mongod...@googlegroups.com
Also interesting to note are the efficiency differences between the padding modes shown here.


Scott

Abhi

Jun 13, 2014, 2:20:11 AM
to mongod...@googlegroups.com
Can someone from the MongoDB dev team explain a little bit about pre-allocation of data files: when it is triggered, how many files are pre-allocated, etc.?

Thanks,
Abhi

William Zola

Jun 14, 2014, 6:53:52 PM
to mongod...@googlegroups.com
Hi Abhi!

MongoDB pre-allocates data files by default, unless the --noprealloc flag is set (storage.preallocDataFiles in the release 2.6 configuration file format), either on the command line or in the configuration file [1] [2].

MongoDB pre-allocates files on a per-database basis.  Pre-allocation is done when the database is initially created: MongoDB will create a namespace file and two data files.  These are called "foo.ns", "foo.0", and "foo.1", where "foo" is replaced with the name of the database you've created.  

Subsequent data files are called "foo.n" where "n" is the next number in the sequence.

MongoDB will always start allocating new records from the lowest-numbered data file. So the first allocations will come from the "foo.0" data file. The instant it goes to allocate a new record from the "foo.1" data file, it will pre-allocate the "foo.2" data file. This will continue: as soon as MongoDB starts allocating data from the highest-numbered file, it will then pre-allocate the next file. This behavior is designed to ensure that no write operation has to wait for the data file to be allocated.
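
As a rough illustration of the trigger rule just described, here is a minimal Python sketch. It is a simplified model, not MongoDB's actual code: the function name `data_files_allocated` and its parameters are hypothetical, and it assumes records fill each data file completely, in order, before the next file is touched.

```python
def data_files_allocated(bytes_written, file_sizes):
    """Return how many data files exist after writing bytes_written bytes.

    file_sizes[i] is the size of data file i (foo.0, foo.1, ...).
    Simplification: each file is filled completely before the next is used.
    """
    allocated = 2  # a new database starts with foo.0 and foo.1
    start = 0      # offset at which the current file begins
    for index, size in enumerate(file_sizes):
        if bytes_written <= start:
            break  # nothing has been written into this file yet
        # Writing into the highest-numbered file pre-allocates the next one.
        if index == allocated - 1:
            allocated += 1
        start += size
    return allocated


sizes = [64, 128, 256, 512]  # arbitrary units, for illustration only
for written in (0, 10, 64, 65, 193):
    print(written, data_files_allocated(written, sizes))
```

In this model, writing anything past the end of "foo.0" means the first record lands in "foo.1", which is the moment "foo.2" gets created.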

MongoDB will pre-allocate a file with the next file size, using either the default file sizes or the "smallfiles" file sizes.

The default data file sizes are:
 - The default size of the namespace (.ns) file is 16 MB.  You cannot change the size of an existing .ns file.  You can control the size of new .ns files with the nsSize [3] option.
 - On a 64-bit system, the default size of the first data file is 64 MB, and the second data file is 128 MB
 - Each subsequent data file is 2x the previous file.  By default on a 64-bit system, the 3rd data file is 256 MB, the 4th 512 MB, etc.  
 - If you are not using the smallfiles option, the largest data file size is 2 GB
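
The default sizes listed above can be written as a small function. This is a sketch in Python under stated assumptions: the name `default_data_file_size` is hypothetical, file numbers are 0-based ("foo.0" is file 0), and the 2 GB cap is taken to be exactly 2048 MB.

```python
MB = 1024 * 1024
NS_FILE_SIZE = 16 * MB      # default .ns file size
MAX_FILE_SIZE = 2048 * MB   # assumption: the 2 GB cap is exactly 2048 MB


def default_data_file_size(file_number):
    """Size in bytes of data file `file_number` with default settings on 64-bit."""
    # 64 MB for the first file, doubling each time, capped at 2 GB.
    return min((64 * MB) << file_number, MAX_FILE_SIZE)


sizes_mb = [default_data_file_size(n) // MB for n in range(7)]
print(sizes_mb)

# Footprint of a brand-new database: the .ns file plus the first two data files.
new_db = NS_FILE_SIZE + default_data_file_size(0) + default_data_file_size(1)
print(new_db // MB, "MB")
```

Under these assumptions the first seven data files are 64, 128, 256, 512, 1024, 2048, and 2048 MB, and a new database occupies 208 MB before any journal files are counted.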

The smallfiles data file sizes are:
 - The size of the .ns file is unaffected
 - On a 64-bit system the size of the first data file is 16 MB, and the second data file is 128 MB [4] 
 - On subsequent data files, the data file size doubles every other file: the 3rd data file is also 128 MB, the 4th and 5th are 256 MB, and all subsequent files are 512 MB

NOTE: pre-allocation is on a per-database basis.   MongoDB will separately pre-allocate files for each database; if you have 5 different databases each one will be pre-allocated according to the number of files that exist for that database.

Given this algorithm, and using the default data file sizes, up to 4 GB of the file space allocated for each database can be essentially unused and exist only because of pre-allocation:
 - 2 GB preallocated for the last file
 - A 2 GB "previous" file with only a small portion in use -- say 512 bytes for the one and only document in that file
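
The worst case above can be reproduced with a short Python sketch. It is a simplified model under stated assumptions: the helper names are hypothetical, the default file sizes are 64 MB doubling up to a cap of exactly 2048 MB, records fill each file completely before the next is used, and the .ns file, journal, and record padding are ignored.

```python
MB = 1024 * 1024
MAX_FILE_SIZE = 2048 * MB  # assumption: the 2 GB cap is exactly 2048 MB


def file_size(file_number):
    """Default size of data file `file_number` (0-based): 64 MB doubling to 2 GB."""
    return min((64 * MB) << file_number, MAX_FILE_SIZE)


def allocated_bytes(bytes_written):
    """Total size of all data files on disk after writing bytes_written bytes."""
    capacity = 0
    in_use = 0  # number of files holding at least some data
    while capacity < bytes_written:
        capacity += file_size(in_use)
        in_use += 1
    # One file is always pre-allocated ahead; a new database has at least two.
    on_disk = max(2, in_use + 1)
    return sum(file_size(n) for n in range(on_disk))


# Fill the first six files (64 MB .. 2 GB), then write one 512-byte document.
written = sum(file_size(n) for n in range(6)) + 512
overhead = allocated_bytes(written) - written
print(f"{overhead / (1024 * MB):.2f} GB allocated but unused")
```

Here the single 512-byte document lands in a fresh 2 GB file, which in turn triggers pre-allocation of another 2 GB file, so the unused space comes to just under 4 GB.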

I hope this helps.

 -William 


Abhi

Jun 17, 2014, 3:57:45 AM
to mongod...@googlegroups.com
Thanks for your response William. It was extremely helpful.

Thanks,
Abhi