How should we store images and videos in MongoDB?


Projapati

Apr 1, 2011, 2:40:03 PM
to mongodb-user
We store photo and video info in the database (SQL Server) and store
the file itself in the file system.
The problem is that the file is disconnected from the database. Any update
or delete has to ensure that the file system and DB are in sync.

Should we follow the same approach with Mongo? We want to take the
approach that will make it fast and scalable.

Thanks

Scott Hernandez

Apr 1, 2011, 2:43:19 PM
to mongod...@googlegroups.com, Projapati
That is one option, or you can store the data in mongodb with gridfs
so you don't have that problem:
http://www.mongodb.org/display/DOCS/GridFS


Projapati

Apr 1, 2011, 2:49:48 PM
to mongodb-user
There is no C# GridFS driver according to this link.

Scott Hernandez

Apr 1, 2011, 3:02:09 PM
to mongod...@googlegroups.com
It is here:
http://api.mongodb.org/csharp/1.0/html/0e461cba-c217-b8a4-b03f-cf05cf...

Sam Millman

Apr 1, 2011, 5:10:04 PM
to mongod...@googlegroups.com
You shouldn't store videos in Mongo; you should put them on a CDN.

This will be much more scalable and healthier for Mongo. Plus, doc sizes are 16MB max, and most videos are larger than that (the average on YouTube is about 300MB-600MB). Just mark a video as inactive, then use a cron job to delete it later.

When searching, make sure inactive videos do not show.
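
The "mark inactive, hide from search, delete later from a cron job" idea could look roughly like the sketch below. Everything here is a stand-in made up for illustration: `videos` is a plain dict playing the role of the metadata collection, and `add_video`, `soft_delete`, `search` and `purge` are hypothetical helper names, not part of any driver.

```python
import os
import tempfile
import time

# Hypothetical in-memory stand-in for the video metadata collection.
videos = {}

def add_video(video_id, path):
    """Register a video record that points at a file on disk."""
    videos[video_id] = {"path": path, "active": True, "deleted_at": None}

def soft_delete(video_id, now=None):
    """Mark the record inactive; the file itself is left alone for now."""
    record = videos[video_id]
    record["active"] = False
    record["deleted_at"] = time.time() if now is None else now

def search():
    """Return the ids of active videos only, so inactive ones never show."""
    return sorted(vid for vid, rec in videos.items() if rec["active"])

def purge(older_than_seconds, now=None):
    """The cron job: drop files and records that have been inactive long enough."""
    now = time.time() if now is None else now
    purged = []
    for vid, rec in list(videos.items()):
        if not rec["active"] and now - rec["deleted_at"] >= older_than_seconds:
            if os.path.exists(rec["path"]):
                os.remove(rec["path"])
            del videos[vid]
            purged.append(vid)
    return purged
```

The point of the grace period is that a failed file delete never leaves a visible record pointing at a missing file: the record is already hidden, and the purge can simply be retried on the next run.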

Scott Hernandez

Apr 1, 2011, 5:45:14 PM
to mongod...@googlegroups.com
Sam, maybe you didn't read the links I sent. GridFS is meant to store
arbitrarily sized (binary) files in MongoDB.

You can front it with a CDN if you like, but those are different issues and needs.

http://www.mongodb.org/display/DOCS/GridFS
http://www.mongodb.org/display/DOCS/GridFS+Specification
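
To make the "arbitrarily sized" part concrete, here is a rough in-memory imitation of the layout the GridFS spec describes: one small document per file, plus many chunk documents that each hold a slice of the bytes. It does not talk to MongoDB or use any driver. The field names (`length`, `chunkSize`, `files_id`, `n`, `data`) follow the spec; the function names and the 256KB default are assumptions for this sketch.

```python
# Assumed default chunk size for this sketch; real drivers let you change it.
DEFAULT_CHUNK_SIZE = 256 * 1024

def split_into_gridfs_docs(file_id, filename, data, chunk_size=DEFAULT_CHUNK_SIZE):
    """Build one 'files' document and a list of 'chunks' documents for `data`."""
    files_doc = {
        "_id": file_id,
        "filename": filename,
        "length": len(data),
        "chunkSize": chunk_size,
    }
    chunk_docs = [
        # `n` is the chunk's position in the file, starting at 0.
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunk_docs

def reassemble(chunk_docs):
    """Put the bytes back together by sorting the chunks on `n`."""
    ordered = sorted(chunk_docs, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)
```

Because every chunk is its own small document, the per-document size limit applies to a chunk, not to the file as a whole.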

Sam Millman

Apr 1, 2011, 6:23:57 PM
to mongod...@googlegroups.com
So you would recommend chunking a file which (going by YouTube's limits) could be up to 2GB?

Not all players (DivX and even Flash) accept files served through a language-specific server. I have tried and failed with PHP servers: it works fine when you browse there in a web browser, but for some reason it won't work in the player, and I don't know why. They must be served as raw files, which is why YouTube serves them raw straight over a CDN.

Also, from testing I have found that Mongo just is not up to the job for video streaming. It creates a bottleneck that really shows up and hurts your users' viewing experience, causing constant stutter and frame tearing.

That's why I proposed the CDN-only solution: just throw the file up to a cloud-based CDN and it'll do all the hard work for you, and it will scale nicely.

mohammad musa

Apr 1, 2011, 6:26:48 PM
to mongodb-user
Scott,
If I read this correctly, does it mean GridFS is still highly
scalable without a CDN setup?
Setting up a CDN is out of scope for our app right now. Sam mentioned
YouTube videos.

Given their sizes and traffic, what type of settings do you recommend for
that type of site? Our problem is very similar.

Thanks for the driver link. I will check it out.

On 1 April 2011 20:02, Scott Hernandez <scotthernan...@gmail.com> wrote:
> It is here:
> http://api.mongodb.org/csharp/1.0/html/0e461cba-c217-b8a4-b03f-cf05cf...

Scott Hernandez

Apr 1, 2011, 8:56:12 PM
to mongod...@googlegroups.com
I think you are conflating streaming, chunked HTTP requests and
progressive downloads when it comes to video. This really has little
to do with MongoDB. The only thing that differentiates them is how
the files are served. MongoDB is not a web server; you must provide
the code that bridges the database and the web.

You can store multi-GB files in GridFS.
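
As one illustration of what that bridging code has to do: players commonly ask for byte ranges rather than the whole file, so the server has to translate an HTTP `Range` header into the chunks it needs to read. The sketch below only does that translation and assumes a single `bytes=start-end` range. `parse_range` and `chunks_for_range` are hypothetical helper names, and whether range handling explains the player problems described in this thread is a guess, not something established here.

```python
import re

def parse_range(header, file_length):
    """Turn a single 'bytes=start-end' header into inclusive (start, end).

    Returns None for anything this sketch does not handle or cannot satisfy.
    """
    match = re.fullmatch(r"bytes=(\d*)-(\d*)", header.strip())
    if not match or (match.group(1) == "" and match.group(2) == ""):
        return None
    if match.group(1) == "":
        # Suffix form 'bytes=-N': the last N bytes of the file.
        suffix = int(match.group(2))
        if suffix == 0:
            return None
        start, end = max(file_length - suffix, 0), file_length - 1
    else:
        start = int(match.group(1))
        end = int(match.group(2)) if match.group(2) else file_length - 1
        end = min(end, file_length - 1)
    if start > end or start >= file_length:
        return None
    return start, end

def chunks_for_range(start, end, chunk_size):
    """Return (first chunk n, last chunk n, offset of `start` inside the first chunk)."""
    first = start // chunk_size
    last = end // chunk_size
    return first, last, start - first * chunk_size
```

With 256KB chunks, a request for bytes 300000 to 600000 only needs chunks 1 and 2, so the server never has to load the whole file to answer a seek.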

Scott Hernandez

Apr 1, 2011, 8:57:28 PM
to mongod...@googlegroups.com
It can be used in a replicated setup, and you can host replicas in
different data centers. The big thing CDNs have is
routing requests to local (closest) servers. That is not something
MongoDB offers.

Sam Millman

Apr 2, 2011, 5:08:55 AM
to mongod...@googlegroups.com
CDNs can also store files and then serve and route them.

What I meant by chunking was on the Mongo side. As we know, Mongo stores a file ID in the files collection and then houses the chunks of that file in another collection.

Now, videos can be 2GB in size and BSON objects are 16MB max, which means you could have 125 documents per video and 12,500 per 100 videos. Also, at 200GB you would need to shard the collection to keep it fast, which increases the complexity of retrieving a file from the DB and turns fast into slow very quickly. Not only that, but you'll find it costs a lot of money to house that amount of data (housing 1,000 videos would be beyond a normal budget).
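
The document-count arithmetic can be checked with a few lines. The 125 figure corresponds to treating 2GB as 2,000MB with one 16MB document per chunk. The 256KB row is an added assumption for comparison: it is the GridFS default chunk size as far as I know, not a number from this thread, and it would give a much higher count per video. `chunk_count` is a name made up for this sketch.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

def chunk_count(file_bytes, chunk_bytes):
    """Number of chunk documents needed, rounding up for a partial last chunk."""
    return -(-file_bytes // chunk_bytes)

# One 16MB document per chunk, with 2GB read as 2,000MB (the figure quoted above).
per_video_decimal = chunk_count(2000 * MB, 16 * MB)   # 125
per_hundred_videos = 100 * per_video_decimal          # 12500

# The same file size read as 2GiB.
per_video_binary = chunk_count(2 * GB, 16 * MB)       # 128

# Assumed GridFS default chunk size of 256KB.
per_video_default = chunk_count(2 * GB, 256 * KB)     # 8192
```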

Also, as I said, most players do not accept files served in the manner you are describing; you must serve them raw. You must test this with your own player, but as I said, I couldn't get it to work properly using a PHP file server (it worked fine when I browsed to it; it is just that players access the file slightly differently).

rackspacecloud.com's CDN (once it gets CNAMEs) will be not only a cheap solution for me but also a feasible one for scaling.

I wrote a couple of white papers based upon my feasibility tests for housing videos.

But you must find out what you've got so you can see what you have to play with.