Managing database file sizes

Brad Chapman

May 7, 2009, 7:52:04 PM
to mongodb-user
Hi all;
I have been learning MongoDB and doing some initial testing with my
data sets. These are ~2.8 million simple key/value pairs that I have
been storing in Berkeley DB. The performance on loading and querying
is excellent and the overall design ideas are very exciting; you all
are doing some great work.

I wanted to find out more details about managing database file size.
Storing my test data set takes ~2 gigs of space across 5 files:

64M reads_090504.0
128M reads_090504.1
256M reads_090504.2
512M reads_090504.3
1.0G reads_090504.4

Reading through the logs, it looks as if progressively larger files
are added as a database grows, with each new file doubling in size.
Since each file is zeroed out to its full size when it is created, the
entire space is taken on disk even if the file is not yet filled with
real data.

Thinking forward to using MongoDB for larger data sets, I was
wondering if you had any tips for minimizing file size and flushing
"empty" storage space in files after creation. Are the Capped
Collections a solution? Are there other directions I should be
looking?

For comparison's sake, the same data in Tokyo Cabinet is 24M and in
Berkeley DB is 191M. I appreciate the added expressiveness of document
storage over strict key/value storage and understand there will be some
trade-offs in terms of space.

Thanks for any pointers you can offer,
Brad

Eliot

May 7, 2009, 7:59:10 PM
to mongod...@googlegroups.com
The file size doubles until 2gb, and then each file is just 2gb, so
you're only going to waste at most 2gb per database - which for a huge
database is going to be a very small percentage.

I am very interested in why the data is so much bigger than in other
databases; that doesn't quite make sense to me. Capped collections
won't really help make the data any smaller - they're really meant for
when you want a fixed-size collection where, once it's full, the oldest
data gets removed.
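
For reference, a capped collection is created with its fixed size up front; from pymongo that looks roughly like the sketch below (the collection name and size are made up, and how the options are passed can vary between driver versions):

from pymongo.connection import Connection

db = Connection("server_machine")["reads_090504"]
# Fixed-size collection: once the 50MB is used, the oldest documents are
# dropped as new ones are inserted.
capped = db.create_collection("recent_reads", capped=True, size=50 * 1024 * 1024)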

Would it be possible to send a sample data set along with how much
storage it's taking up in other engines? Also, could you send the
output of validate()? That will tell us a bit more about the usage.
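
In case it is easier than the shell, validate can also be run from pymongo, roughly like this (the validate_collection helper is assumed from current drivers; the 2009-era driver may not have exposed it):

from pymongo.connection import Connection

db = Connection("server_machine")["reads_090504"]
# Returns a dict of validation details, much like the shell's validate() output.
print(db.validate_collection("read_to_freq"))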

-Eliot

Brad Chapman

May 8, 2009, 8:17:17 AM
to mongodb-user
Eliot;
Thanks for the reply. Capping the file doubling at 2gb works great;
that alleviates my fears of partially filled 64gb files.

> Would it be possible to send a sample data set along with how much
> storage it's taking up in other engines?

The data is a basic set of key/value pairs where the keys are
incremental indexes and the values are frequency counts. Nothing fancy,
so if you create a file with 2.8 million random integers in it, that
will be identical to what I am loading. Below is the Python code that
loads the file and indexes the collection on the ID:

from pymongo.connection import Connection
from pymongo import ASCENDING

def main(in_file):
    conn = Connection("server_machine")
    db = conn["reads_090504"]
    col = db["read_to_freq"]

    # One document per line: the line number is the read ID, the value the count.
    with open(in_file) as in_handle:
        for read_index, freq in enumerate(in_handle):
            col.insert(dict(read_id=read_index, frequency=int(freq)))
    # Index the collection on read_id for lookups.
    col.create_index("read_id", ASCENDING)

For the Tokyo Cabinet comparison, the data set is the same; the keys
are the indexes and the values are JSON string dictionaries of the form
dict(frequency=freq), to be as similar as possible to what I am loading
in MongoDB. The database is a BTree with compression:
test.tcb#opts=ld#bnum=1000000#lcnum=10000

> Also, could you send the output of validate()? That will tell us a
> bit more about the usage.

validate
details: 0x2aaac13c8c80 ofs:8c8c80
firstExtent:0:2a00 ns:reads_090504.read_to_freq
lastExtent:2:58f1000 ns:reads_090504.read_to_freq
# extents:18
datasize?:146157464 nrecords?:2810718 lastExtentSize:33205248
padding:1
first extent:
loc:0:2a00 xnext:0:24a00 xprev:null
ns:reads_090504.read_to_freq
size:3072 firstRecord:0:2ab0 lastRecord:0:3594
2810718 objects found, nobj:2810718
191128952 bytes data w/headers
146157464 bytes data wout/headers
deletedList: 0100000000000000010
deleted: n: 8 size: 3294744
nIndexes:2
reads_090504.read_to_freq.$_id_ keys:2810718
reads_090504.read_to_freq.$read_id_1 keys:2810718

Thanks again for taking a look. Let me know if I can provide any other
information,
Brad

dwight_10gen

May 8, 2009, 9:29:52 AM
to mongodb-user
Hi Brad,

1) One thing that is making datafiles bigger is that Mongo
automatically inserts an _id field in each object, and automatically
builds an index on that field. If you would like to optimize the size
a little, I would suggest the following. Store your read_id in _id
instead. As long as they are unique, that will be fine. Then, there
will be one index instead of two, which saves space. Further, the
auto generated _id field won't be in the object anymore either, and
its size is significant given these are tiny objects.

2) In BSON, each fieldname is stored in each object. So the longer
the dictionary key (fieldname), the bigger the objects. Changing
"frequency" to "freq" would therefore save 5 bytes per record too.
(This is a much less dramatic savings than the _id change above.)
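
Put together, the two suggestions amount to inserting documents like this (a sketch reusing col, read_index and freq from the loading script above):

# Read index as _id and a short field name: one index (the automatic _id
# index) instead of two, and no auto-generated ObjectId in each tiny document.
col.insert({"_id": read_index, "freq": int(freq)})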

P.S. This may go without saying, but do a db.dropDatabase() from the
mongo shell before rerunning -- while mongo will reuse deleted space
in datafiles, it won't ever free up excess space at the filesystem
level (unless you do, say, a repairDatabase() operation, which compacts
everything).
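
For completeness, rough pymongo equivalents of those shell operations (a sketch; the repairDatabase command invocation is an assumption, and running it from the shell works just as well):

from pymongo.connection import Connection

conn = Connection("server_machine")
# Drop everything before a reload; this frees the datafiles on disk.
conn.drop_database("reads_090504")
# Compact an existing database instead of dropping it (slow; rewrites the files).
# conn["reads_090504"].command("repairDatabase")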

Brad Chapman

May 8, 2009, 1:35:42 PM
to mongodb-user
Dwight;
Great tips -- thanks very much for the ideas.

> 1) One thing that is making datafiles bigger is that Mongo
> automatically inserts an _id field in each object, and automatically
> builds an index on that field.  If you would like to optimize the size
> a little, I would suggest the following.  Store your read_id in _id
> instead.  As long as they are unique, that will be fine.  Then, there
> will be one index instead of two, which saves space.  Further, the
> auto generated _id field won't be in the object anymore either, and
> its size is significant given these are tiny objects.

This makes good sense. I am used to key/value databases, so am happy
generating meaningful unique IDs. Using the Python interface, I did the
inserts with:

col.save(dict(_id=read_index, freq=int(freq)))

Is this correct in terms of passing a raw integer (or string) as the
_id? Digging around in the pymongo code, it looks like there are some
checks that _id is an ObjectId, and ObjectIds appear to be 12-byte
values. Is there a more correct way to put this into an object first?
Beyond that worry, loading and retrieving using the default _id worked
great.

Making this change along with your suggested frequency->freq change
speeds up the loading and does reduce the file size. We no longer need
that 1gb file:

-rw------- 1 chapman users 64M 2009-05-08 12:27 reads_090504.0
-rw------- 1 chapman users 128M 2009-05-08 12:27 reads_090504.1
-rw------- 1 chapman users 256M 2009-05-08 12:33 reads_090504.2
-rw------- 1 chapman users 512M 2009-05-08 10:38 reads_090504.3
-rw------- 1 chapman users 16M 2009-05-08 12:33 reads_090504.ns

Thanks again for the help,
Brad

aaron

May 8, 2009, 1:54:07 PM
to mongodb-user
Hi Brad,

Just a brief note for you in addition to what Dwight and Eliot have
already written. For performance reasons mongo does quite a bit of
allocation up front. You are correct in stating that the newest file
on disk may not be completely full, but it's also true that the second
newest file may not be full. I don't have enough info to let you know
how much space in the files you've recently listed is actually being
utilized, but it could be as little as 64M + 128M + 1 byte. As we've
mentioned, the overhead of this sort of preallocation is minimal for
large databases because the max file size is 2gb.

Aaron

Michael Dirolf

May 8, 2009, 1:59:27 PM
to mongod...@googlegroups.com
> This makes good sense. I am used to key/value databases, so am happy
> generating meaningful unique IDs. Using the python interface, I did
> the inserts with:
>
> col.save(dict(_id=read_index, freq=int(freq)))
>
> Is this correct in terms of passing a raw integer (or string) as the
> _id? Digging around in the pymongo code, it looks like there are some
> checks that _id is an ObjectId, and ObjectIds appear to be 12-byte
> values. Is there a more correct way to put this into an
> object first? Beyond that worry, loading and retrieving using the
> default _id worked great.

Yes, this should work - the driver will accept values for _id that are
not ObjectIds. If you don't give a value for _id, the driver will
generate a 12-byte ObjectId instance.
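
A small sketch of what that looks like in practice (collection and database names as in Brad's script; the values are made up):

# Any unique value works as _id, and lookups can use it directly.
col.save({"_id": 42, "freq": 7})
print(col.find_one({"_id": 42}))   # -> something like {u'_id': 42, u'freq': 7}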


dwight_10gen

May 8, 2009, 3:31:48 PM
to mongodb-user
Good point - I forgot that we now preallocate the next file even if the
current one isn't full, so that we can pre-zero it and there is no
latency when it first comes into use. One day, if this is a problem
for anyone, we could make this optional.

Also: the reason we preallocate the files is to avoid the potential
filesystem fragmentation that would come from growing the files
gradually.

Kunthar

May 8, 2009, 4:13:26 PM
to mongod...@googlegroups.com
I really don't want to hijack the topic, but I do want to ask: have you
tried different file systems yet? I mean xfs, jfs, and traditional ext3.
Is there any known performance difference between them?

Peace
\|/ Kunth

dwight_10gen

May 8, 2009, 5:25:28 PM
to mongodb-user
We have not really experimented with that. My *guess* is that you
won't see a big difference in performance, as we aren't creating all
that many files, but I could be wrong.

I would recommend using noatime for the data volume.

Brad Chapman

May 10, 2009, 11:54:40 AM
to mongodb-user
Aaron, Michael and Dwight;
Thanks for the additional feedback. Those points definitely answer the
questions I had about storage scaling. Disk space is not at a premium
for my work, so squeezing out every byte is not a top priority.

Overall, I am very happy with MongoDB and will be doing more with it
in the future. You all are doing some great work. I wrote up the
details of my simple evaluation here:

http://bcbio.wordpress.com/2009/05/10/evaluating-key-value-and-document-stores-for-short-read-data/

Thanks again,
Brad