Serving images from GridFS

Joshua Partogi

Sep 18, 2010, 11:56:30 PM
to mongod...@googlegroups.com
Hi all,

I want to serve images from GridFS instead of from the filesystem. But many people say that static files should be served by the web server because they can be cached. How bad would the trade-off be for serving images from GridFS? Would there be a performance penalty for doing this? Or does GridFS handle this use case well?

Thanks for your help.

Regards,

Andreas Jung

Sep 19, 2010, 12:20:10 AM
to mongod...@googlegroups.com
huh?

The basic point of GridFS is having a distributed filesystem and being
able to serve content/data from a large distributed database without
having every piece of data on the local filesystem and without having to
use some kind of network filesystem (NFS & friends).

-aj

--
ZOPYX Limited | zopyx group
Charlottenstr. 37/1 | The full-service network for Zope & Plone
D-72070 Tübingen | Produce & Publish
www.zopyx.com | www.produce-and-publish.com
------------------------------------------------------------------------
E-Publishing, Python, Zope & Plone development, Consulting


Eliot Horowitz

Sep 19, 2010, 12:21:27 AM
to mongod...@googlegroups.com
When you say caching, do you mean server side or client side?
Both GridFS and the filesystem can keep data in RAM.
Also, you can control client-side caching with either.

Some ideas on when to use GridFS over the filesystem:
http://www.mongodb.org/display/DOCS/When+to+use+GridFS
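
As an illustration of the client-side caching point, here is a minimal sketch of the HTTP headers an application could send regardless of where the image bytes come from. The function names `cache_headers` and `respond`, and the one-day `max_age_seconds` default, are assumptions made for this example; nothing here is GridFS-specific.

```python
import hashlib


def cache_headers(image_bytes: bytes, max_age_seconds: int = 86400) -> dict:
    """Build caching headers from the image content itself."""
    # A content hash makes a stable validator: same bytes, same ETag.
    etag = '"' + hashlib.sha256(image_bytes).hexdigest() + '"'
    return {
        "ETag": etag,
        "Cache-Control": f"public, max-age={max_age_seconds}",
    }


def respond(image_bytes: bytes, if_none_match: str = None):
    """Return (status, headers, body) for a possibly conditional request."""
    headers = cache_headers(image_bytes)
    if if_none_match == headers["ETag"]:
        # The client already holds this exact version: send no body.
        return 304, headers, b""
    return 200, headers, image_bytes
```

A browser that revalidates with the `If-None-Match` header gets a 304 and no body, so the image bytes only travel once per client until the content changes.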

Markus Gattol

Sep 19, 2010, 7:03:15 AM
to mongod...@googlegroups.com
In addition to what has been said already, I think a major benefit of
having stuff in GridFS is that you do not have to come up with unique
identifiers yourself, e.g. by combining filesystem paths with filenames
such as ../2010/07/party_0001.jpg

Note, however, that if you want to cache static files (.css, .js,
...) from your web application, then GridFS is the wrong horse (as
mentioned). In that case you might want to look at things like
http://djangopackages.com/packages/p/django-mediagenerator if you
are using Django. If GridFS vs. filesystem is nonetheless what you are
considering, this might help you a bit:

http://www.markus-gattol.name/ws/mongodb.html#why_use_gridfs_over_ordinary_filesystem_storage


Fitz H. Agard

Sep 19, 2010, 8:32:22 AM
to mongod...@googlegroups.com
With GridFS you also get a free MD5. I use it to ensure that files aren't loaded into Mongo more than once.

http://www.lightcubesolutions.com/blog/?p=352
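
A minimal sketch of the deduplication idea, assuming the MD5 digest is used as the lookup key. `DedupStore` is a hypothetical in-memory stand-in written for this example, not the GridFS API, and the digest is computed here with `hashlib` rather than read back from the database.

```python
import hashlib
from typing import Tuple


class DedupStore:
    """In-memory stand-in for a file store that refuses duplicate content."""

    def __init__(self):
        self._by_md5 = {}  # digest -> file bytes

    def put(self, data: bytes) -> Tuple[str, bool]:
        """Store data unless identical bytes exist; return (digest, stored)."""
        digest = hashlib.md5(data).hexdigest()
        if digest in self._by_md5:
            return digest, False  # same content already stored, skip it
        self._by_md5[digest] = data
        return digest, True

    def __len__(self) -> int:
        return len(self._by_md5)
```

Uploading the same photo twice returns the same digest and leaves one copy in the store. MD5 is adequate for spotting accidental duplicates, but it should not be relied on where someone might craft colliding files on purpose.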

Joshua Partogi

Sep 19, 2010, 8:35:30 AM
to mongodb-user
Hi Markus,

Thanks for the feedback. By static files I don't really mean CSS or
JavaScript; think of photo albums where all the photos are served
from GridFS. Is this a ridiculous thing to do?

Cheers,
Joshua

Joshua Partogi

Sep 19, 2010, 8:39:22 AM
to mongodb-user
Thanks Eliot,

That really helps me reduce the FUD. What I meant is caching on the
server side. In the past, people would not really store images/files
as BLOBs in an SQL database because serving them directly from the DB
is slow. I guess that's how the FUD started. I guess GridFS is not
like a BLOB in an SQL database and thus performs better? CMIIW


Cheers,
Joshua.

Eliot Horowitz

Sep 19, 2010, 8:40:29 AM
to mongod...@googlegroups.com
Yes - GridFS is designed for things like serving images, so it should work well.

Tim Hawkins

Sep 19, 2010, 8:58:33 AM
to mongod...@googlegroups.com
http://www.101gg.com/

A test site built on MongoDB that serves images directly from GridFS. It is Zend Framework based and runs on a standard size EC2 instance.

Fitz H. Agard

Sep 19, 2010, 9:08:55 AM
to mongod...@googlegroups.com
Oh btw, the images on www.totsy.com are being served out of mongo.

bingomanatee

Sep 20, 2010, 1:14:42 PM
to mongodb-user
I would look at what your overall goals are:

GridFS provides:
* a place to hook metadata (request counts, etc.)
* a way to enforce archiving
* (as mentioned) an MD5 mechanism
* a way to manage replication across servers
* an in-memory transport for files that is quicker than disk hits ...
but see below.
* a secure (replicated), larger (sharded) store for a huge collection
of important files.

However, many of these things can be accomplished by attaching metadata
in Mongo to a traditional file bank. Also keep in mind that in
large-scale operations you're not just going up against a LAMP server -
you are going up against a content delivery network (CDN) and/or
cloud-banked filesystems as your point of reference.

No matter how fast Mongo is, it will never beat physical file system
retrieval unless the files are so small and few that they can be
cached in memory. Keep in mind your files will be using the same
memory space that other Mongo data uses, so if you have a lot of
activity in MongoDB proper AND GridFS, you might end up with GridFS
reducing the effectiveness of other uses of Mongo. If in-memory
retrieval of commonly asked for files is your goal, you'd be best off
with a dedicated server.

You might consider using GridFS as a kind of SVN for files; it could,
for instance, be a great vehicle for storing, archiving, and rapidly
deploying files on clouds. If you wanted, you could go through GridFS
only when a file is missing, and deploy it to disk when it is asked
for, or do some sort of intelligent front-loading of the file system
based on popularity or the content of pages you know will be hit.
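
The "go through GridFS only when a file is missing" idea can be sketched as a read-through disk cache. This is only an illustration under stated assumptions: `fetch_with_disk_cache` and the `load_from_store` callback are hypothetical names, the callback stands in for whatever call actually reads the file out of GridFS, and file names are assumed to be flat (no directory separators).

```python
import os
import tempfile
from typing import Callable


def fetch_with_disk_cache(
    name: str,
    cache_dir: str,
    load_from_store: Callable[[str], bytes],
) -> bytes:
    """Serve a file from local disk, falling back to the backing store."""
    path = os.path.join(cache_dir, name)
    if os.path.exists(path):
        # Cache hit: no database round trip at all.
        with open(path, "rb") as f:
            return f.read()

    # Cache miss: read from the store, then deploy the file to disk.
    data = load_from_store(name)
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    # Rename into place so readers never see a half-written file.
    os.replace(tmp_path, path)
    return data
```

After the first request the file lives on local disk, where an ordinary web server could also serve it directly. The sketch never invalidates or evicts anything, so a real deployment would need a policy for updated or rarely used files.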

In short, as far as I can see, unlike most MongoDB use cases, speed is
likely to be the last reason you'd want to use GridFS. Deployment
optimization, security and metadata are the benefits you'll likely
see, so your use cases will revolve around those needs. In tandem with
selective filesystem deployment, though, GridFS is likely to be not
that much slower than traditional content systems.