Archiving MongoDB Data


Ian White

Mar 6, 2011, 11:40:14 PM
to mongod...@googlegroups.com
I want to archive a fairly large amount of MongoDB data (billions of documents containing subobjects and varied fields). I will need some availability (looking up individual documents by _id), but load should be low. I do need to index by a couple different fields for occasional analysis of the archived data, so stashing in a pure key-value store doesn't work.

I could of course use MongoDB itself, but I'm concerned about storage size. It'd probably be better to trade off some performance for disk space in this instance. It'd be nice to use an engine with compression.

I could use MySQL or similar, store the fields I need to index in their own columns, and store a JSON-encoded string of the whole document in a MEDIUMTEXT field.
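
Roughly what I have in mind (just a sketch; the table, column names and driver are placeholders for whatever I'd actually use):

# Sketch of the MySQL fallback, using mysql-connector-python.
# archived_docs, created_at and user_id are placeholder names.
import json
import mysql.connector

conn = mysql.connector.connect(user="archive", database="archive")
cur = conn.cursor()

cur.execute("""
    CREATE TABLE IF NOT EXISTS archived_docs (
        _id        CHAR(24) PRIMARY KEY,   -- MongoDB ObjectId as hex
        created_at DATETIME,               -- example indexed field
        user_id    BIGINT,                 -- example indexed field
        doc        MEDIUMTEXT NOT NULL,    -- full document, JSON-encoded
        INDEX idx_created_at (created_at),
        INDEX idx_user_id (user_id)
    )
""")

def archive(doc):
    cur.execute(
        "INSERT INTO archived_docs (_id, created_at, user_id, doc) "
        "VALUES (%s, %s, %s, %s)",
        (str(doc["_id"]), doc.get("created_at"), doc.get("user_id"),
         json.dumps(doc, default=str)),
    )
    conn.commit()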

Are there some other good options I might not be thinking of?

roger

Mar 6, 2011, 11:54:48 PM
to mongodb-user
MongoDB will add some compression-related features sometime in the future. How much data are we talking about?

Depending on how often you need to access your data, you could save backups on S3 or the like. Given that you want to do analytics, it might be best to keep your data accessible with the same schema as your production app. You could also trade off performance against price by using different storage infrastructure (SATA instead of SAS, for example) for your archive.
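
For example, something along these lines pushes a compressed dump to S3 (a rough sketch using mongodump and boto3; the bucket name and paths are made up):

# Sketch: dump, compress and ship the archive to S3.
# Bucket name and paths are placeholders.
import subprocess
import boto3

subprocess.check_call(["mongodump", "--out", "/backups/dump"])
subprocess.check_call(
    ["tar", "-czf", "/backups/dump.tar.gz", "-C", "/backups", "dump"])

s3 = boto3.client("s3")
s3.upload_file("/backups/dump.tar.gz", "my-archive-bucket",
               "mongodb/dump.tar.gz")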

-Roger

Ken Egozi

Mar 7, 2011, 12:07:23 AM
to mongod...@googlegroups.com
I'd guess that if you put it on a MongoDB server with very low load, it will keep (almost) everything paged out to disk and won't use much memory. So I'd first go with the easy path: zip it and back it up to an off-site location for archive/backup, and keep a MongoDB server on a machine with a large disk and little RAM for the occasional query.
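
The occasional query then only needs something like this (a pymongo sketch; the host, database/collection names and the ObjectId are made up):

# Sketch: look up a single archived document by _id on the low-spec box.
from pymongo import MongoClient
from bson import ObjectId

archive = MongoClient("archive-box.example.com")["archive"]["events"]
doc = archive.find_one({"_id": ObjectId("4d73f1a2e4b0c8a1b2c3d4e5")})
print(doc)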



Ken Egozi.
http://www.kenegozi.com/blog
http://www.delver.com
http://www.musicglue.com
http://www.castleproject.org
http://www.idcc.co.il - the first community conference for .NET developers - come one, come all



Gaetan Voyer-Perrault

Mar 7, 2011, 12:23:21 AM
to mongod...@googlegroups.com
@Ian:

> I could of course use MongoDB itself, but I'm concerned about storage size. It'd probably be better to trade off some performance for disk space in this instance.

What type of disk space trade-off are you trying to make?
How much space are you trying to save here?

I know that compression is on the roadmap, but MongoDB already supports sparse indexes, which will help keep your index sizes down.
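
For example (a pymongo sketch; the field, database and collection names are placeholders):

# Sketch: a sparse index only contains entries for documents that actually
# have the field, which keeps the index small when the field is rare.
from pymongo import MongoClient

coll = MongoClient()["archive"]["events"]
coll.create_index("campaign_id", sparse=True)

# Rough check of what indexes vs. data are costing you on disk:
stats = coll.database.command("collstats", "events")
print(stats["totalIndexSize"], stats["storageSize"])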

Where else are you trying to make up space?

Ian White

Mar 7, 2011, 12:52:06 AM
to mongod...@googlegroups.com, Gaetan Voyer-Perrault
It's about 350GB on MongoDB right now and growing fairly fast. Basically it'd be nice to put it on a 1TB drive and not have to worry about it for a while. With compression, that would be no problem.

Eliot Horowitz

Mar 7, 2011, 3:18:40 AM
to mongod...@googlegroups.com
I've only played with this a little, but you could use ZFS. If you use ZFS, compression is seamless under mongo.
Note: I've only done very limited testing, so I can't judge the stability at this point.
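
Something like the following is all it takes (the dataset name is made up, and this is only as far as my limited testing goes):

# Sketch: enable ZFS compression on the dataset that holds the dbpath,
# then check the achieved ratio. Dataset name is a placeholder.
import subprocess

subprocess.check_call(["zfs", "set", "compression=on", "tank/mongodb"])
# mongod just uses the dataset's mountpoint as usual, e.g.:
#   mongod --dbpath /tank/mongodb
ratio = subprocess.check_output(
    ["zfs", "get", "-H", "-o", "value", "compressratio", "tank/mongodb"])
print(ratio)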