A BILLION pictures...how do they do it?


cbmeeks

Jun 12, 2007, 8:27:11 AM
I am writing my own family photo sharing site that I hope to take
public (like so many others). Anyway, currently, when the user
uploads a picture, I store the picture outside my htdocs folder and
record the image details in a MySQL db. When you browse the picture,
I read the record and build the image by sending an image/jpeg header.
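
A rough sketch of the setup described above, written in Python rather than PHP and with sqlite3 standing in for MySQL. The table layout, the function names and the storage directory are all assumptions made for illustration, not the poster's actual code:

```python
import sqlite3
from pathlib import Path


def open_catalog():
    """Create an in-memory catalog (a stand-in for the MySQL table of image details)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE images ("
        "id INTEGER PRIMARY KEY, path TEXT NOT NULL, mime TEXT NOT NULL)"
    )
    return db


def store_image(db, storage_root, name, data, mime="image/jpeg"):
    """Write the bytes to a directory outside the web root and record the details.

    `name` is assumed to be a safe, already-sanitized file name.
    """
    path = Path(storage_root) / name
    path.write_bytes(data)
    cur = db.execute(
        "INSERT INTO images (path, mime) VALUES (?, ?)", (str(path), mime)
    )
    db.commit()
    return cur.lastrowid


def serve_image(db, image_id):
    """Look up the record and return (headers, body), or None if it does not exist."""
    row = db.execute(
        "SELECT path, mime FROM images WHERE id = ?", (image_id,)
    ).fetchone()
    if row is None:
        return None
    path, mime = row
    body = Path(path).read_bytes()
    headers = {"Content-Type": mime, "Content-Length": str(len(body))}
    return headers, body
```

Every request in this design costs one database lookup plus one file read in the application process, which is one place the performance concern could come from.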

Seems to work but I am a little disappointed with performance.
Granted I am running on a really old machine which might be the
reason. lol

Seriously though, if I take this public and get extremely lucky and
millions of photos are uploaded, would this be the best method?

I've read pros and cons of storing images in a database. I've read
about Flickr, SmugMug, Photobucket having HUNDREDS of millions to over
a BILLION images stored!

Obviously, load balancing plays into this but what other secrets do
you think they use?

One thing I worry about is my file system. I have something like:

pix
-----user1
-------------thumbs
-----user2
-------------thumbs

etc...

Any pointers would be appreciated.
Thanks

cbmeeks

Jerry Stuckle

Jun 12, 2007, 11:09:30 AM

First of all, you should be asking this in a database newsgroup, not a
PHP one. And preferably a newsgroup aimed at the database you're using.

I store pictures in databases. It works quite well. Takes some tuning,
but I find it provides good performance.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstu...@attglobal.net
==================

cbmeeks

Jun 12, 2007, 11:28:52 AM
> First of all, you should be asking this in a database newsgroup, not a
> PHP one. And preferably a newsgroup aimed at the database you're using.

Well, that's assuming I would only use MySQL and not PHP to serve my
files. :-)

> I store pictures in databases. It works quite well. Takes some tuning,
> but I find it provides good performance.

Yeah, I'm not surprised you replied. I have been reading some of your
posts about images in db's. You really have me thinking about images
in db's. I have to admit, I am walking on top of the fence and could
jump to either side when it comes to file system/db for storing
images. I agree with your postings about actually doing it instead of
quoting theories.

Scalability is very important but it's not the only thing.
Portability is also important. I am thinking of using Amazon's S3
(which I believe is a flat file system). But the bad thing about
using Amazon is that I put all of my eggs in one basket. They just
recently had a price change that made a lot of people happy but not
all...point is, they did that because they can.

I would love to be the fly on the wall at Amazon, eBay, Google, etc
and see how they store images. I know Google has their BigTable.

I guess I should follow by example. SmugMug uses their own internal
system that is helped along with S3. But I have no idea of how much
they serve from S3 or if they just use S3 as a backup.

Oh well, sorry for the rambling.

cbmeeks
http://www.eblarg.com

max.s...@googlemail.com

Jun 12, 2007, 11:30:53 AM
> jstuck...@attglobal.net
> ==================

You should read the database docs. In the case of MySQL, if you index
your table and use the right MySQL database type, you will get better
performance by storing images in the database.
Also, if you run a very large site, your database servers will run on
SCSI machines, which means your database hard drives are often faster
than your web server's.

Jerry Stuckle

Jun 12, 2007, 12:17:09 PM

Either way you're going to have to use PHP (or Perl or some language) to
serve the images up. But the database design and configuration are the
more important things here. That's why I suggested a database newsgroup.
It's a better place to discuss these things.

cbmeeks

Jun 12, 2007, 12:50:45 PM
Max/Jerry:

Oh believe me, I would like to use the DB and I will certainly try it
and run some performance testing.


> serve the images up. But the database design and configuration is the
> more important thing here. That's why I suggested a database newsgroup.
> It's a better place to discuss these things.
>

Agreed. I just don't like to cross post and I knew that PHP and MySQL
would be involved. That's why I started here first.

NC

Jun 12, 2007, 1:52:43 PM
On Jun 12, 5:27 am, cbmeeks <cbme...@gmail.com> wrote:
>
> I am writing my own family photo sharing site that
> I hope to take public (like so many others). Anyway,
> currently, when the user uploads a picture, I store
> the picture outside my htdocs folder and record the
> image details in a MySQL db. When you browse the
> picture, I read the record and build the image by
> sending an image/jpeg header.
>
> Seriously though, if I take this public and get
> extremely lucky and millions of photos are uploaded,
> would this be the best method?
>
> I've read pros and cons of storing images in a database.
> I've read about Flickr, SmugMug, Photobucket having
> HUNDREDS of millions to over a BILLION images stored!
>
> Obviously, load balancing plays into this but what
> other secrets do you think they use?

Separating (static) pictures from other (dynamic) content. Say, you
have two servers, one with PHP/MySQL (let's call it www.yoursite.com),
another with nothing but Apache (content.yoursite.com), optimized for
serving static images. The application residing on www.yoursite.com
saves images onto content.yoursite.com and records their full URLs
(http://content.yoursite.com/path/file.jpg) in its database. When
content.yoursite.com gets low on available disk space, you put up a
new server (content2.yoursite.com) for writing and start filling it up
with pictures, while content.yoursite.com still remains accessible for
reading. You can continue to add new content*.yoursite.com servers as
you go. Dynamically generated HTML gets served from www.yoursite.com
(which may eventually outgrow a single server and morph into a server
cluster), static images, from content*.yoursite.com.
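
The scheme above could be sketched roughly as follows, in Python rather than PHP. The class names, the byte-based capacity check and the in-memory `urls` dict (standing in for the URL column in the application's database) are assumptions made for illustration:

```python
from dataclasses import dataclass


@dataclass
class ContentServer:
    host: str            # e.g. "content.yoursite.com"
    capacity_bytes: int  # assumed usable disk space on this server
    used_bytes: int = 0

    def has_room(self, size):
        return self.used_bytes + size <= self.capacity_bytes


class ContentPool:
    """One server is open for writing at a time; earlier ones stay readable."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.write_index = 0  # index of the server currently taking uploads
        self.urls = {}        # image id -> full URL (stand-in for the database)

    def add_server(self, server):
        self.servers.append(server)

    def save(self, image_id, path, size):
        # Move on to the next server once the current one cannot fit the file.
        while self.write_index < len(self.servers):
            server = self.servers[self.write_index]
            if server.has_room(size):
                server.used_bytes += size
                url = f"http://{server.host}/{path}"
                self.urls[image_id] = url  # record the full URL, not just the path
                return url
            self.write_index += 1  # this server is now effectively read-only
        raise RuntimeError("all content servers are full; add another one")
```

Because the full URL is stored per image, pages built on www.yoursite.com can link to an image without knowing which content server it ended up on.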

A slight variation of this approach is that multiple servers are open
for writing at any given time; images are written onto a randomly
chosen server. This helps ensure that highly popular content will be
spread between multiple servers and can thus be served faster.
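
A minimal sketch of that variation, again in Python, assuming a hypothetical list of writable hosts and a hypothetical helper name; the random choice is the only real point here:

```python
import random


def spread_uploads(hosts, filenames, seed=None):
    """Assign each upload to a randomly chosen writable host.

    Returns a dict mapping file name to the full URL it would be stored under.
    `seed` exists only to make the sketch repeatable.
    """
    rng = random.Random(seed)
    return {name: f"http://{rng.choice(hosts)}/{name}" for name in filenames}
```

With enough uploads, each host ends up with roughly an equal share, so a burst of traffic to recently uploaded pictures is not concentrated on one machine.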

Yet another possibility is to hide your application behind a layer of
caching proxies...

> One thing I worry about is my file system. I have
> something like:
>
> pix
> -----user1
> -------------thumbs
> -----user2
> -------------thumbs

There's absolutely no need for the file structure to replicate your
database structure...
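
One common alternative, offered here only as an illustration and not as anything the post spells out, is to derive the directory from a hash of the image id so that no single folder grows without bound. The function name, the `pix` root and the two-level depth are assumptions:

```python
import hashlib
from pathlib import PurePosixPath


def storage_path(image_id, root="pix", levels=2, ext="jpg"):
    """Map an image id to a path like pix/ab/cd/<id>.jpg.

    The two-character directory names come from the hex digest of the id,
    so files spread evenly across at most 256 directories per level.
    """
    digest = hashlib.sha256(str(image_id).encode("utf-8")).hexdigest()
    parts = [digest[i * 2:(i + 1) * 2] for i in range(levels)]
    return PurePosixPath(root, *parts, f"{image_id}.{ext}")
```

Which user owns a picture then lives only in the database, and the path can always be recomputed from the id.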

Cheers,
NC

Jerry Stuckle

Jun 12, 2007, 5:14:27 PM

Ah, but cross-posting is the ONLY way to fly! :-)

cbmeeks

Jun 18, 2007, 2:17:28 PM
That makes sense. I see many of the big sites use
"static123.example.com".