Collection creation to hold bson artifacts

Alex Smith

Aug 31, 2017, 7:40:21 PM
to mongodb-user
This is my first post and my first time using Mongo, so forgive my question if it seems simplistic. We have set up a Docker Mongo instance and are planning to move files generated by an application (doc and pdf) into the database. Currently, the file space grows by around 1 to 2GB per month. Our testing using Grails to create the objects within the collection has been very successful. However, a question came up. The total growth of 1 to 2GB per month is spread across different parts of the application (several directories for different parts of the program). As an example, we might have ten directories with varying amounts of pdf and docx files in them. The developers are questioning whether to use a single collection for everything, which would keep growing, or to break the structure down into one collection per directory. I am personally a fan of the directory breakdown approach, that is, one collection per directory, but I wanted to know if I am way off base here. I know Mongo can handle either approach, but what is the best methodology to use for BSON objects like these? Thank you all for your help!
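
For readers following along, the two layouts being weighed can be sketched roughly as below. This is only an illustration of the question, not a recommendation: the collection name `artifacts`, the `artifacts_` prefix, and the `directory` and `filename` field names are all hypothetical, and the sketch uses plain Python dictionaries rather than a real driver call.

```python
# Sketch only: where one file would land under each of the two layouts.
# All names here ("artifacts", "directory", "filename") are hypothetical.


def single_collection_target(directory, filename):
    """Option 1: one shared collection; the directory becomes a field."""
    collection_name = "artifacts"
    document = {"directory": directory, "filename": filename}
    return collection_name, document


def per_directory_target(directory, filename):
    """Option 2: one collection per directory; the directory picks the collection."""
    # Turn a path such as "/reports/2017/" into "artifacts_reports_2017".
    collection_name = "artifacts_" + directory.strip("/").replace("/", "_")
    document = {"filename": filename}
    return collection_name, document
```

Under the first layout, a query for one directory filters on the `directory` field; under the second, the application chooses the collection before it queries.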

Kevin Adistambha

Sep 11, 2017, 2:13:44 AM
to mongodb-user

Hi Alex

> We have set up a docker mongo instance and are planning to move files generated by an application (doc and pdf) to the database

Are you planning to move the files by storing them in the database itself, e.g. by using a blob object (BinData)?

> I am personally a fan of the directory breakdown approach so one collection per directory

> what is the best methodology to use on bson objects like these?

It’s hard to say without knowing the details of your use case. However, please note that MongoDB limits a single BSON document to 16MB (see MongoDB Limits and Thresholds), so any file that exceeds this limit cannot be stored as a single document.

If you need to store more than 16MB of data, you may want to consider GridFS, which is a convention designed to store files larger than 16MB in MongoDB by splitting them into chunks. Officially supported drivers support GridFS natively.
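
As a rough sketch of how the 16MB limit and GridFS fit together, the helper below picks a storage strategy from a file's size and estimates how many GridFS chunks it would need. The function names and the `metadata_headroom_bytes` margin are hypothetical and only for illustration; the 255 KiB figure is the GridFS default chunk size, which drivers allow you to change.

```python
# Sketch only: pick a storage strategy for one file based on its size.
MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024  # MongoDB's 16MB per-document limit
GRIDFS_DEFAULT_CHUNK_BYTES = 255 * 1024     # GridFS default chunk size (255 KiB)


def choose_storage(file_size_bytes, metadata_headroom_bytes=64 * 1024):
    """Return "inline" if the file fits in one document, else "gridfs".

    metadata_headroom_bytes is a hypothetical safety margin for the other
    fields stored alongside the file (filename, timestamps, BSON overhead).
    """
    if file_size_bytes < 0:
        raise ValueError("file size cannot be negative")
    if file_size_bytes + metadata_headroom_bytes <= MAX_BSON_DOCUMENT_BYTES:
        return "inline"
    return "gridfs"


def gridfs_chunk_count(file_size_bytes, chunk_bytes=GRIDFS_DEFAULT_CHUNK_BYTES):
    """Number of chunk documents GridFS would need for a file of this size."""
    # Ceiling division without floating point.
    return (file_size_bytes + chunk_bytes - 1) // chunk_bytes
```

With PyMongo, for example, the "inline" case would correspond to storing the bytes in a binary field of an ordinary document, and the "gridfs" case to something like `gridfs.GridFS(db).put(data, filename=...)`. If some files may approach the limit, using GridFS for everything keeps the code path uniform.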

Arguments can be made for either a single collection or multiple collections, and the right choice depends heavily on your use case. If you feel that the database could be managed more efficiently using multiple collections, then that's probably the right solution for you. Regarding general pointers for schema design, you may find the following links helpful:

Best regards,
Kevin
