Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Difference between storing files on folder and in mysql db

12 views
Skip to first unread message

missmoo

unread,
Jan 8, 2008, 11:56:02 AM1/8/08
to
Hey,

I would like to know what the differences are in terms of security,
reliability and resources between storing user-uploaded images in a
file or storing them in a mySQL database.

Our company is starting an image-competition soon, and I am not sure
if I should write the php script to insert the binary code into a
mySQL database or if I should just store the files in a dedicated
folder and the data about them in the mySQL database.

We are expecting relatively big loads of files, at least 10,000 in the
first 2 months.

Which system should I use? Is there a real difference between the two?
our server is apache2, if that has any effects on the answer..

Thank you!

Moriel Schottlender

Rik Wasmus

unread,
Jan 8, 2008, 11:59:11 AM1/8/08
to

Security & reliability => data also in MySQL (or at least a database).
Granted, the load would be somewhat higher, but it the database machine is
the same as the webserver, not very siginificant.

If MySQL is allready stretched then offcourse go for the files solution.
--
Rik Wasmus

The Natural Philosopher

unread,
Jan 8, 2008, 12:08:22 PM1/8/08
to

I thought it would be a big issue to stuff them in the database, but it
was - with one notable issue* - easy.

Once indsider the database, they can only be accessed BY the database,
and the root user,. A considerable extra bit of security, and, since the
databasse is backed up regularly, a Good Thing in that sense as well..
no need to explicitly back up a directory of uploaded files, nor worry
about access permnissions to it.

* the one notable exception was a 'feature' of PHP file handling, I
wanted to use SQL LOAD_FILE to load the temporary file into the
database. BUT it does not actually exist on disk until php closes..a tad
too late to be useful. move_uploaded_file() did not create a disk file
either,. But copy() did. The bug and workaround are on the php docs site
somewhere.

The other gotcha was maximum file and memory limits on php. I upped
these from 16Mbyte to 64Mbytes, and did accidentally upload a 60Mbyte
PDF succesfully.


Jerry Stuckle

unread,
Jan 8, 2008, 1:30:40 PM1/8/08
to

Not a whole lot of difference. I've been doing images in RDB's since we
were putting scanned documents in DB2 back in the 80's. It works well.

Security isn't that much different. If the images are in files, you
need a system userid and password with write access to the files. If
they're in a database, you need the database userid and password. (Of
course, if the database is not accessible from the network, you need both).

The downside is that you cannot server images from a MySQL database with
a simple <img> tag. You need to write a script with will fetch the
image and display it, then put the name of that script in the <img> tag.
Not hard to do - it's only a few lines of code. And you can have a
single file with the image to be used passed as a query parameter.

I do both, just depending on what's needed. For instance, if the image
is tied to other data in the database (i.e. a product image for a
shopping cart), I'll typically store the image in the database. But if
it's unrelated to anything else (i.e. a common header image), I'll just
put it in the file system.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstu...@attglobal.net
==================

faulkes

unread,
Jan 8, 2008, 4:11:33 PM1/8/08
to
On Jan 8, 11:56 am, missmoo <mor...@gmail.com> wrote:
> I would like to know what the differences are in terms of security,
> reliability and resources between storing user-uploaded images in a
> file or storing them in a mySQL database.

Few, as has been pointed out, the less secure (IMO) being handling
files to disk, although any number of ways to mitigate that.

> Our company is starting an image-competition soon, and I am not sure
> if I should write the php script to insert the binary code into a
> mySQL database or if I should just store the files in a dedicated
> folder and the data about them in the mySQL database.

What is the intended end use? security & reliability both have
intended consequences (well, benefits). If reliability is the
more important of the two you have to consider what happens if
the system housing the images in the db grows too large to deal
with? what if the db dies and needs to be restored? how much
downtime is acceptable? Are you doing any sort of replication?

> We are expecting relatively big loads of files, at least 10,000 in the
> first 2 months.
>
> Which system should I use? Is there a real difference between the two?
> our server is apache2, if that has any effects on the answer..

How big do you expect each image to be? how often do you expect each
image to be viewed? All important questions you need to ask before
saying the db or the fs is the best place to store the image.

Based on what you have stated, that there will be an expected 10k of
images, I would say you want to go with the filesystem, with an
appropriate
directory split-up (so you aren't dumping 100k files into one
directory).

The Natural Philosopher

unread,
Jan 8, 2008, 4:24:19 PM1/8/08
to
Agreed on all counts.

Norman Peelman

unread,
Jan 8, 2008, 10:45:36 PM1/8/08
to

Not true... my website does this all day long. Tested with a sleep
function and watched temp file get created and processed in designated
directory. Check your open_basedir and upload_tmp_dir directives.


> The other gotcha was maximum file and memory limits on php. I upped
> these from 16Mbyte to 64Mbytes, and did accidentally upload a 60Mbyte
> PDF succesfully.
>
>
>
>


--
Norman
Registered Linux user #461062

Rik Wasmus

unread,
Jan 8, 2008, 10:55:06 PM1/8/08
to
On Wed, 09 Jan 2008 04:45:36 +0100, Norman Peelman <npee...@cfl.rr.com>
wrote:

>> I thought it would be a big issue to stuff them in the database, but
>> it was - with one notable issue* - easy.
>> Once indsider the database, they can only be accessed BY the database,
>> and the root user,. A considerable extra bit of security, and, since
>> the databasse is backed up regularly, a Good Thing in that sense as
>> well.. no need to explicitly back up a directory of uploaded files, nor
>> worry about access permnissions to it.
>> * the one notable exception was a 'feature' of PHP file handling, I
>> wanted to use SQL LOAD_FILE to load the temporary file into the
>> database. BUT it does not actually exist on disk until php closes..a
>> tad too late to be useful. move_uploaded_file() did not create a disk
>> file either,. But copy() did. The bug and workaround are on the php
>> docs site somewhere.
>>
>
> Not true... my website does this all day long. Tested with a sleep
> function and watched temp file get created and processed in designated
> directory. Check your open_basedir and upload_tmp_dir directives.

It's a known problem, and not related to those 2 settings. It seems to
depend on some OS settings regarding the tmp dir. Most boxes don't have
this problems, others have. Sadly I cannot tell you exactly what the
settings/parameter on which this problem depends actually are, as far as I
can remember it could be quite complex.
--
Rik Wasmus

NC

unread,
Jan 8, 2008, 10:55:49 PM1/8/08
to
On Jan 8, 8:56 am, missmoo <mor...@gmail.com> wrote:
>
> I would like to know what the differences are in terms of security,
> reliability and resources between storing user-uploaded images in a
> file or storing them in a mySQL database.

The only significant difference is the DB server load. Since <img
data="*"> tags are not (yet?) commonly supported by browsers, you need
a separate instance of an image display script (and a separate
connection to the DB server) to display each image. So if your Web
page has 100 images on it, it will require 101 nearly simultaneous
connections to display itself and the images, as opposed to one
connection if images were stored in the file system. Granted, image
retrieval connections would be very short, but at high loads, this
architecture would be patently inferior to disk-based alternative.

Also, if you use MyISAM tables, storing large images in them will
bring you to the limit of table file size much more quickly. If your
average image is 1 MB and your file system's maximum file size is 2 GB
(there are still some older Linux servers whose file systems have this
constraint), you will only be able to store about 2,000 images until
your table exceeds the maximum allowed file size.

Additionally, you'll need to make sure that your MySQL server is
configured with a packet size large enough to transmit the entire
record, including the image. Up to MySQL 3.23, this limit was 16MB
for MyISAM tables. As of MySQL 4.0, situation changed; the effective
maximum length of LONGBLOB is determined by maximum packet size and
available memory.

> Our company is starting an image-competition soon, and I am not sure
> if I should write the php script to insert the binary code into a
> mySQL database or if I should just store the files in a dedicated
> folder and the data about them in the mySQL database.

Large image sharing sites invariably stick to the second approach.
Moreover, images may be stored on separate servers, optimized for
serving static content.

I answered a similar question here a few months ago, take a look:

http://groups.google.com/group/comp.lang.php/msg/e44b33db1a37543b

> Which system should I use? Is there a real difference between the two?

Only at high loads...

Cheers,
NC

Jerry Stuckle

unread,
Jan 8, 2008, 11:05:21 PM1/8/08
to
NC wrote:
> On Jan 8, 8:56 am, missmoo <mor...@gmail.com> wrote:
>> I would like to know what the differences are in terms of security,
>> reliability and resources between storing user-uploaded images in a
>> file or storing them in a mySQL database.
>
> The only significant difference is the DB server load. Since <img
> data="*"> tags are not (yet?) commonly supported by browsers, you need
> a separate instance of an image display script (and a separate
> connection to the DB server) to display each image. So if your Web
> page has 100 images on it, it will require 101 nearly simultaneous
> connections to display itself and the images, as opposed to one
> connection if images were stored in the file system. Granted, image
> retrieval connections would be very short, but at high loads, this
> architecture would be patently inferior to disk-based alternative.
>

Not true. <img> tags are handled identically by the client, whether the
image comes from the database or the file system. The client doesn't
know or care if the image is from a database or not.

> Also, if you use MyISAM tables, storing large images in them will
> bring you to the limit of table file size much more quickly. If your
> average image is 1 MB and your file system's maximum file size is 2 GB
> (there are still some older Linux servers whose file systems have this
> constraint), you will only be able to store about 2,000 images until
> your table exceeds the maximum allowed file size.
>

Few Linux implementations have a 2GB limit any more. Most are at least
4GB, and some implementations are much larger.

Also, the typical image size is *much smaller* than 1MB.

> Additionally, you'll need to make sure that your MySQL server is
> configured with a packet size large enough to transmit the entire
> record, including the image. Up to MySQL 3.23, this limit was 16MB
> for MyISAM tables. As of MySQL 4.0, situation changed; the effective
> maximum length of LONGBLOB is determined by maximum packet size and
> available memory.
>

MySQL easily handles this for you.

>> Our company is starting an image-competition soon, and I am not sure
>> if I should write the php script to insert the binary code into a
>> mySQL database or if I should just store the files in a dedicated
>> folder and the data about them in the mySQL database.
>
> Large image sharing sites invariably stick to the second approach.
> Moreover, images may be stored on separate servers, optimized for
> serving static content.
>

An over-generalization. Many have images stored in the database itself.

> I answered a similar question here a few months ago, take a look:
>
> http://groups.google.com/group/comp.lang.php/msg/e44b33db1a37543b
>
>> Which system should I use? Is there a real difference between the two?
>
> Only at high loads...
>
> Cheers,
> NC
>

Rik Wasmus

unread,
Jan 8, 2008, 11:08:58 PM1/8/08
to
On Wed, 09 Jan 2008 05:05:21 +0100, Jerry Stuckle
<jstu...@attglobal.net> wrote:

> NC wrote:
>> On Jan 8, 8:56 am, missmoo <mor...@gmail.com> wrote:
>>> I would like to know what the differences are in terms of security,
>>> reliability and resources between storing user-uploaded images in a
>>> file or storing them in a mySQL database.
>> The only significant difference is the DB server load. Since <img
>> data="*"> tags are not (yet?) commonly supported by browsers, you need
>> a separate instance of an image display script (and a separate
>> connection to the DB server) to display each image. So if your Web
>> page has 100 images on it, it will require 101 nearly simultaneous
>> connections to display itself and the images, as opposed to one
>> connection if images were stored in the file system. Granted, image
>> retrieval connections would be very short, but at high loads, this
>> architecture would be patently inferior to disk-based alternative.
>>
>
> Not true. <img> tags are handled identically by the client, whether the
> image comes from the database or the file system. The client doesn't
> know or care if the image is from a database or not.

I think NC is talking about 'in this scenario, for every image request a
database connection is opened and closed again'. Which is true, unless one
uses persistent connections.
--
Rik Wasmus

Jerry Stuckle

unread,
Jan 8, 2008, 11:12:40 PM1/8/08
to

True - a database connection is opened and closed. But as discreet
files, a file system connection must be opened and close - something a
lot of people ignore.

My point being that a network connection does not need to be opened and
close.

And either way, if it's a commonly used image, the cache (database or
file system) can produce the image. In fact, the database cache could
be more efficient, as it doesn't have to cache unrelated files, such as
scripts.

The Natural Philosopher

unread,
Jan 9, 2008, 3:34:53 AM1/9/08
to

It created the temp files in /tmp.

They wouldn't load.

I moved them to another filename in /tmp.

They wouldn't load.

I copied them to another filke in /tmp.

They did load.

I'd be happy to know what I did wrong. but what you suggest ain't it.

The Natural Philosopher

unread,
Jan 9, 2008, 3:38:40 AM1/9/08
to

Thats true, but what overhead is that? Compared with dishing out all
that data?

NC

unread,
Jan 9, 2008, 1:43:29 PM1/9/08
to
On Jan 8, 8:05 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:

> NC wrote:
>
> > The only significant difference is the DB server load. Since <img
> > data="*"> tags are not (yet?) commonly supported by browsers, you need
> > a separate instance of an image display script (and a separate
> > connection to the DB server) to display each image. So if your Web
> > page has 100 images on it, it will require 101 nearly simultaneous
> > connections to display itself and the images, as opposed to one
> > connection if images were stored in the file system. Granted, image
> > retrieval connections would be very short, but at high loads, this
> > architecture would be patently inferior to disk-based alternative.
>
> Not true. <img> tags are handled identically by the client, whether the
> image comes from the database or the file system. The client doesn't
> know or care if the image is from a database or not.

The client doesn't indeed, but the DB server does. Let's say we have
a script img.php (error handling omitted for brevity):

$id = (int) $_GET['id'];
mysql_connect('server', 'user', 'password');
mysql_select_db('images');
$result = mysql_query("SELECT imageData FROM images WHERE id = $id");
$record = mysql_fetch_row($result);
header('Content-type: image/jpeg');
echo record[0];

Let's further say that there is a script index.php that includes this
fragment:

for ($i = 1; $i <= 100; $i++) {
echo "<img src='img.php?id=$i'>";
}

So when index.php runs, it outputs 100 <img src='img.php?id=*'> tags,
so 100 instances of img.php are also run and each makes a connection
to the DB server.

Please feel free to point out any errors in the above argument.

> > Also, if you use MyISAM tables, storing large images in them will
> > bring you to the limit of table file size much more quickly. If your
> > average image is 1 MB and your file system's maximum file size is 2 GB
> > (there are still some older Linux servers whose file systems have this
> > constraint), you will only be able to store about 2,000 images until
> > your table exceeds the maximum allowed file size.
>
> Few Linux implementations have a 2GB limit any more. Most are at least
> 4GB, and some implementations are much larger.

Yes, but it's an issue the OP needs to check on before commencing
development.

> Also, the typical image size is *much smaller* than 1MB.

Depends on where you are and what you do. The OP mentioned an "image
competition", which could be en event where average image size is much
LARGER than 1 MB. A digital photo shot by a professional (or even top-
of-the line amateur) camera can easily be 10 MB, so can a piece of
digital artwork or a magazine page layout.

> > > Our company is starting an image-competition soon, and I am not sure
> > > if I should write the php script to insert the binary code into a
> > > mySQL database or if I should just store the files in a dedicated
> > > folder and the data about them in the mySQL database.
>
> > Large image sharing sites invariably stick to the second approach.
> > Moreover, images may be stored on separate servers, optimized for
> > serving static content.
>
> An over-generalization. Many have images stored in the database itself.

Please name one. I, meanwhile, will follow my own advice and do the
same. In his book, "Building Scalable Web Sites", Cal Henderson,
chief software architect of Flickr, describes Flickr's image storage
facility (p.152): disk-based and redundant, with synchronous writes to
primary storage and asynchronous writes to multiple backup storage
locations.

Cheers,
NC

NC

unread,
Jan 9, 2008, 2:07:19 PM1/9/08
to
On Jan 8, 8:12 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:
>
> My point being that a network connection does not need to
> be opened and close.

Which network connection are we talking about -- the one between HTTP
client and HTTP server or the one between MySQL client and MySQL
server?

> And either way, if it's a commonly used image, the cache
> (database or file system) can produce the image. In fact,
> the database cache could be more efficient, as it doesn't
> have to cache unrelated files, such as scripts.

This is definitely true for content applications, where most recent
items account for the lion's share of traffic, but it is generally
false for social networking applications, where there are VERY few
"commonly used" images, but rather, most of the content is used
uncommonly -- my friends view my pictures, your friends view your
pictures...

Cheers,
NC

Jerry Stuckle

unread,
Jan 9, 2008, 3:31:51 PM1/9/08
to
NC wrote:
> On Jan 8, 8:12 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:
>> My point being that a network connection does not need to
>> be opened and close.
>
> Which network connection are we talking about -- the one between HTTP
> client and HTTP server or the one between MySQL client and MySQL
> server?
>

My misstatement. What I meant to say was that no additional network
connections need to be opened or closed (between the client and the
server).

>> And either way, if it's a commonly used image, the cache
>> (database or file system) can produce the image. In fact,
>> the database cache could be more efficient, as it doesn't
>> have to cache unrelated files, such as scripts.
>
> This is definitely true for content applications, where most recent
> items account for the lion's share of traffic, but it is generally
> false for social networking applications, where there are VERY few
> "commonly used" images, but rather, most of the content is used
> uncommonly -- my friends view my pictures, your friends view your
> pictures...
>
> Cheers,
> NC
>

In that case either way they won't be in the cache and will need to be
retrieved. However, it's still more likely it will be in the database
cache than the file system cache. There's a lot more going on in the
file system.

Jerry Stuckle

unread,
Jan 9, 2008, 3:39:15 PM1/9/08
to

No errors at all. But if it's serving from the file system, then the
server has to make the connection to the file system, also. Sure, it is
less handling. But it's not that significant in most site.

Additionally, how many sites do you know which serve 100 images in a
single page? Not many - the load time would be so slow it wouldn't be
funny.

>>> Also, if you use MyISAM tables, storing large images in them will
>>> bring you to the limit of table file size much more quickly. If your
>>> average image is 1 MB and your file system's maximum file size is 2 GB
>>> (there are still some older Linux servers whose file systems have this
>>> constraint), you will only be able to store about 2,000 images until
>>> your table exceeds the maximum allowed file size.
>> Few Linux implementations have a 2GB limit any more. Most are at least
>> 4GB, and some implementations are much larger.
>
> Yes, but it's an issue the OP needs to check on before commencing
> development.
>

Yes, and they need to ensure they aren't running under MS DOS, with it's
640K limit, also :-)

Seriously - if they're running that old of a kernel, it's time to
upgrade anyway.

>> Also, the typical image size is *much smaller* than 1MB.
>
> Depends on where you are and what you do. The OP mentioned an "image
> competition", which could be en event where average image size is much
> LARGER than 1 MB. A digital photo shot by a professional (or even top-
> of-the line amateur) camera can easily be 10 MB, so can a piece of
> digital artwork or a magazine page layout.
>

Sure, but they typically aren't uploaded to the server at that size, and
if they are, I hope the developer will trim them down to size. Load
time for 10MB files would be terrible, especially on DSL.

>>>> Our company is starting an image-competition soon, and I am not sure
>>>> if I should write the php script to insert the binary code into a
>>>> mySQL database or if I should just store the files in a dedicated
>>>> folder and the data about them in the mySQL database.
>>> Large image sharing sites invariably stick to the second approach.
>>> Moreover, images may be stored on separate servers, optimized for
>>> serving static content.
>> An over-generalization. Many have images stored in the database itself.
>
> Please name one. I, meanwhile, will follow my own advice and do the
> same. In his book, "Building Scalable Web Sites", Cal Henderson,
> chief software architect of Flickr, describes Flickr's image storage
> facility (p.152): disk-based and redundant, with synchronous writes to
> primary storage and asynchronous writes to multiple backup storage
> locations.
>
> Cheers,
> NC
>

Sure, that's one way. But it's not the only way.

IBM's document management system (don't know what it's called nowadays)
has been doing it since the mid 80's, at least. I do it on some of my
sites. And I've seen other people do it.

You don't see it as much on the web because too many people who have
never tried it claim it's a bad way to do things.

In fact, a database can often be the better way to go. It can, for
instance, handle 100K images in one table quite easily. There aren't
many file systems which can handle that many images in a single directory.

The Natural Philosopher

unread,
Jan 9, 2008, 3:54:49 PM1/9/08
to
..but its generally larger..in fact its probably true to say that all
ram NOT being used for programs,is used for file system cache in most
systems.


Any machine that isn't using every last ounce of RAM to cache
everything, is an overspecified machine IMHO ;-)

Jerry Stuckle

unread,
Jan 9, 2008, 5:49:27 PM1/9/08
to

Not necessarily. I've seen some MySQL systems with huge caches. You
have direct control over the MySQL cache size. You don't have as much
control over the file system cache.

0 new messages