Pros & Cons of storing images in MongoDB vs. GFS, NFS, ...

Alexey Petrushin

Apr 19, 2012, 6:27:01 PM
to mongod...@googlegroups.com
I have read a couple of topics here about using Mongo as file storage for a "standard" web app, but I still have some questions.

By "standard" web app I mean news sites, blogs, e-commerce, organizers, social networks and similar. The majority of files are small-to-medium images.
The load is big enough that MongoDB would be sharded across multiple servers.
Special cases like file/video hosting are outside this scope.

What are the pros & cons of using GridFS versus "standard" NFS and other options?

In terms of ease of development, it seems easier to store everything in Mongo than to program and manage one more database for files.

But, here are some questions:

- How good is its performance versus "standard" distributed file systems & nginx, in terms of throughput, CPU and memory usage?

- Does it affect the speed of queries against ordinary documents that are stored in the same database? Or is it better to have 2 Mongo databases, one for objects and the other for files?

- How to serve files? Using the nginx-gridfs module? I am also especially interested in Node.js: would it be enough to serve files with Node with a cache in front of it? Or would Node.js be too slow for serving files, even with a cache?

Other potential problems?

Thanks.

Adam C

Apr 20, 2012, 6:41:45 AM
to mongod...@googlegroups.com
- How good is its performance versus "standard" distributed file systems & nginx, in terms of throughput, CPU and memory usage?

There's no reason you can't use both. I have seen solutions where the distributed filesystem is still used and MongoDB stores the metadata about the files rather than the binary data itself. How the performance compares with another implementation using completely different technologies and approaches is too vague a question to answer meaningfully, and the answer will vary massively with your use case and your application. I know you mention "standard" web applications, but really there is no agreed standard to comment on.
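As a rough illustration of that split, here is a minimal Python sketch of the metadata record such a setup might keep in MongoDB while the image bytes stay on the shared filesystem. The helper name `build_file_metadata` and the field names are assumptions made up for this example (loosely modelled on the fields GridFS keeps for a file), not something prescribed by MongoDB.

```python
import hashlib
import os
from datetime import datetime, timezone


def build_file_metadata(path: str, content_type: str) -> dict:
    """Describe a file that lives on a filesystem as a plain dict.

    Hypothetical helper: only this small document would be stored in
    MongoDB, the binary data itself is never loaded into the database.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Hash in 64 KiB blocks so large files are not read into memory at once.
        for block in iter(lambda: fh.read(64 * 1024), b""):
            digest.update(block)
    return {
        "path": path,                      # where the bytes actually live
        "length": os.path.getsize(path),   # size in bytes
        "contentType": content_type,       # e.g. "image/png"
        "sha256": digest.hexdigest(),      # lets the app detect corruption
        "uploadDate": datetime.now(timezone.utc),
    }
```

The resulting dict could then be handed to a driver call such as PyMongo's `insert_one`, and the web server would serve the file from `path` directly.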

What are the pros & cons of using GridFS versus "standard" NFS and other options?

It's not a comparison anyone would make - NFS and GridFS are not directly comparable.

- Does it affect the speed of queries against ordinary documents that are stored in the same database? Or is it better to have 2 Mongo databases, one for objects and the other for files?

This will depend on your load and the available resources on your MongoDB hosts. If your working set and indexes fit into RAM, you have spare RAM (and disk IO), and your lock contention is low, then you don't need separate instances for the GridFS implementation. If you need to scale the "regular" database and GridFS independently, it might be good to separate them anyway. In the end, from a MongoDB perspective it's just more queries and data. If GridFS and the normal queries are competing for resources, it will impact performance; otherwise it will not.

- How to serve files? Using the nginx-gridfs module? I am also especially interested in Node.js: would it be enough to serve files with Node with a cache in front of it? Or would Node.js be too slow for serving files, even with a cache?

There is a big difference in that the nginx-gridfs module is not officially supported, while the Node.js driver is, so from that perspective I think Node.js would win. But if the nginx module is active and mature it might be a good fit; that's a judgement call really. As for speed, how fast is fast enough for you? There are no ready-made benchmarks here; the only way to be sure something is fast enough for you is to test it. There are too many moving parts to make a statement one way or the other.
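For the "test it" part, a small timing harness is usually enough to get first numbers. The following is only a sketch: `measure` and its `fetch` argument are hypothetical names, and `fetch` stands in for whatever actually retrieves a file in your setup (a GridFS read through a driver, an HTTP request to nginx, a plain file read).

```python
import statistics
import time
from typing import Callable, Sequence


def measure(fetch: Callable[[str], bytes], names: Sequence[str], rounds: int = 3) -> dict:
    """Call fetch(name) for every name, `rounds` times, and summarise the timings.

    `fetch` is a placeholder for the code path under test; it must return
    the file body as bytes.
    """
    latencies = []
    total_bytes = 0
    started = time.perf_counter()
    for _ in range(rounds):
        for name in names:
            t0 = time.perf_counter()
            body = fetch(name)
            latencies.append(time.perf_counter() - t0)
            total_bytes += len(body)
    elapsed = time.perf_counter() - started
    return {
        "requests": len(latencies),
        "bytes": total_bytes,
        "median_s": statistics.median(latencies),
        "max_s": max(latencies),
        # Guard against a zero elapsed time on very coarse clocks.
        "bytes_per_s": total_bytes / elapsed if elapsed > 0 else float("inf"),
    }
```

Running the same harness against each candidate with a realistic set of image names, and with a cold and a warm cache, gives comparable figures for your own workload rather than someone else's.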

Adam.

Sam Millman

Apr 20, 2012, 7:52:19 AM
to mongod...@googlegroups.com
"What are the pros & cons of using GridFS versus "standard" NFS and other options?"

I suppose what you mean by this is a comparison of flexibility.

I suppose one problem with NFS is that an NFS store can only have a max bandwidth of 100MB/s (if I remember right), which means mass deletion is not only going to take time but be a pain in the ass. GridFS would be easier since it is not tied to such things; deletions would occur at a much greater pace. This also applies to changes and additions.

NFS is not a good choice for physical file storage at all in my opinion; I have had nothing but constant problems with it. If you need a hard file system you would want to look at cloudfiles, a CDN or S3.

I suppose the downside of GridFS shows up if you require a folder structure. You would need to use something like materialised paths: unlike on Linux or Windows NFS, folders on GridFS cannot be (well, they can actually, but in my scenario I'll say they aren't) rows describing the files under them. Instead you would overcome this problem by storing a materialised path tree string with each row.
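To make the materialised path idea concrete, here is a small in-memory Python sketch. Everything in it is an assumption for illustration: the helper names `folder_path` and `list_folder`, the comma-delimited path format, and keeping the path under a `metadata.path` field of each file document.

```python
import re


def folder_path(*folders: str) -> str:
    """Encode a folder chain as a materialised path, e.g. ',images,avatars,'."""
    return "," + "".join(f"{name}," for name in folders)


def list_folder(docs, path, recursive=False):
    """Return filenames stored directly in `path`, or anywhere below it if recursive."""
    # A subtree is everything whose path starts with the folder's own path.
    prefix = re.compile("^" + re.escape(path))
    names = []
    for doc in docs:
        doc_path = doc["metadata"]["path"]
        if doc_path == path or (recursive and prefix.match(doc_path)):
            names.append(doc["filename"])
    return names


# Hypothetical file documents; there are no folder rows, only a path per file.
FILES = [
    {"filename": "logo.png", "metadata": {"path": folder_path("images")}},
    {"filename": "alice.jpg", "metadata": {"path": folder_path("images", "avatars")}},
    {"filename": "terms.pdf", "metadata": {"path": folder_path("docs")}},
]

print(list_folder(FILES, folder_path("images")))
print(list_folder(FILES, folder_path("images"), recursive=True))
```

Against a real collection the same prefix match could be expressed as an anchored regular expression query, for example `{"metadata.path": {"$regex": "^,images,"}}`, which can use an index on that field because the pattern is anchored at the start.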

But at the end of the day NFS is physical file storage, whereas GridFS is a database file storage mechanism.

As Adam says, NFS and GridFS are completely different and are suited to very different tasks. Do not make the mistake of mixing them up for the same task like so many do; if you need a physical file system, try using one. Simple as that, in my opinion.

