Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Sotring images in SQL Server 2005

5 views
Skip to first unread message

John Shwon

unread,
Dec 1, 2007, 5:11:00 AM12/1/07
to
My application extracts approximatly 5000-8000 PDF's per day from mobile
devices. The PDF's are mostly between 100K-500K and never bigger then 6MB. I
currently store these images in SQL Server 2005 in column of type image. (I
think I should change this to varbinary(MAX) but I originnaly did the design
on MS SQL 2000) The table has over 3 million reconds and is just over 270GB.
Our C# line-of-business application accesses these images and displayes them
for capture to users.

My new manager is not happy with the PDF's being in the database and would
like us to migrate then to a folder structure on NTFS. He reasons are mostly
the following:

1. Backup take too long.

2. If we have to restore it's going to take very long.

3. SQL Server resource are being wasted on storing/retrieving images.

4. He experienced corruption in previous SQL Server versions when storing
binary data.

My reasons for prefering storing the images in SQL Server is the following:

1. The filesystem does not give my any consistency checking abilities like
DBCC CHECKDB.

2. If I save the images to the filesystem and need to rollback a transaction
it becomes a major problem, so bye-bye referential integrity.

3. I know long restores is a potential risk but if we upgrade to enterprise
edition I can manage this easilty with partitioning. (I would only retore the
newest images and restore other filegroups later.)

Any comments? How would you prefer to store so much images?

Has anybody experienced any specific problems with storing images in the
database?

Madhivanan

unread,
Dec 1, 2007, 7:21:52 AM12/1/07
to

Dan Guzman

unread,
Dec 1, 2007, 9:01:21 AM12/1/07
to
> My new manager is not happy with the PDF's being in the database and would
> like us to migrate then to a folder structure on NTFS. He reasons are
> mostly
> the following:
>
> 1. Backup take too long.

Rather than "too long", perhaps your boss means longer than without the
images stored in the database. This is of course true but keep in mind that
the PDF files must also be backed if stored separately and that time needs
to be considered for comparison. Furthermore, you'll need to ensure the
files are synchronized with the database.

> 2. If we have to restore it's going to take very long.

Same as #1 and note that you'll need to restore both the database and file
system data.

> 3. SQL Server resource are being wasted on storing/retrieving images.

I think "used" is more correct than "wasted". It is true that without the
overhead of storing and retrieving image data, SQL Server can use resources
exclusively for serving relational data instead.

> 4. He experienced corruption in previous SQL Server versions when storing
> binary data.
>

I'm not sure how is that relevant under SQL 2005. FWIW, we have heavily
used third-party apps that use image data under both SQL 2000 and SQL 2005
without data corruption.

> My reasons for prefering storing the images in SQL Server is the
> following:
>
> 1. The filesystem does not give my any consistency checking abilities like
> DBCC CHECKDB.
>

The Windows CHKDSK command can be used to verify/correct file system
integrity.

> 2. If I save the images to the filesystem and need to rollback a
> transaction
> it becomes a major problem, so bye-bye referential integrity.
>

Data integrity is one of the main reasons people who choose to store images
in the database do so.

> 3. I know long restores is a potential risk but if we upgrade to
> enterprise
> edition I can manage this easilty with partitioning. (I would only retore
> the
> newest images and restore other filegroups later.)
>

You are right that well-designed designed partitions in general can improve
manageability and availability.

> Any comments? How would you prefer to store so much images?
>

If I were in your situation, I would consider migrating to SQL 2008 next
year.
Plans are to introduce a FILSTREAM data type that is mostly the best of both
worlds,
although there are considerations with the piecemeal restore you mentioned.

You can download the latest CTP from
https://connect.microsoft.com/SQLServer/content/content.aspx?ContentID=5395.

--
Hope this helps.

Dan Guzman
SQL Server MVP

--
Hope this helps.

Dan Guzman
SQL Server MVP

"John Shwon" <John...@discussions.microsoft.com> wrote in message
news:7E96BCB9-418E-4B3D...@microsoft.com...

Erland Sommarskog

unread,
Dec 1, 2007, 11:42:07 AM12/1/07
to
John Shwon (John...@discussions.microsoft.com) writes:
> My new manager is not happy with the PDF's being in the database and
> would like us to migrate then to a folder structure on NTFS. He reasons
> are mostly the following:
>
> 1. Backup take too long.
>
> 2. If we have to restore it's going to take very long.

Why don't you simply suggest him that you stop storing the images
altogther? As Dan pointed out, the files needs to be backed up in
the file system as well. And restored. Plus that the restore would
be a headache to get data and files in sync.

One thing you can do keep down the database in size is to compress
the files before you store them, but maybe you do this already.

If the files are largely static, you could save backup time and size
by using differential backups and only run full backups occasionally.
Of course, this is not going to help RESTORE operations, but rather
they will be a little more complex.

You mention partitionong and Enterprise Edition, but you could also
use partitioned views and move old documents to a different database
and set up a partitioned view to access these. You could then
backup that database once and then set it read-only. Again, this idea
assumes that documents are static. If any document can be replaced at any
time, this may not be viable at all.

> 3. SQL Server resource are being wasted on storing/retrieving images.

True, but someone's gotta do the job. :-)



> 4. He experienced corruption in previous SQL Server versions when storing
> binary data.

And he has never encountered corruption in the file system? And as you
point out yourself, you are much more in danger to get corruption in
your data when it's stored in two places.



> 1. The filesystem does not give my any consistency checking abilities like
> DBCC CHECKDB.

Dan suggested CHKDSK, but you would still need your checker to verify
that database and file system is in sync.



> Any comments? How would you prefer to store so much images?

varbinary(MAX), hands down!

As Dan mentioned, SQL 2008 brings us a new feature, FILESTREAM where you
can say that a varbinary(MAX) column is FILESTREAM. A filestream column
is stored in the file system, but you can still access it from SQL Server
as any other column. But there is also an API that permits you to retrieve
a file handle, so that you can access the file directly.

But I don't think that FILESTREAM is of interest to you. Books Online
says that FILSTREAM becomes an option of interest when the size exceeds
1MB. Below that traditional column storage is more efficient to access.
Furthermore, when you back up the database, the FILESTREAM data is
included in the backup, so your backups will not be any smaller. Well,
since FILESTREAM data is on a separate filegroup, you can back it up
seaprately.

I should also point out that some of the issues with NTFS storage exists
with FILESTREAM as well, at least as it behaves in CTP5 of SQL 2008. The
files are exposed in Explorer, and you can easily go in and create a
mess if you don't know what you are doing.

--
Erland Sommarskog, SQL Server MVP, esq...@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

0 new messages