Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

MySQL as document storage?

0 views
Skip to first unread message

Andy Eastham

unread,
Jan 8, 2004, 5:09:04 AM1/8/04
to
Steve,

I know you've been pointed at an interesting resource, and others are saying
they've done this successfully, but I think you should get some balance.
I've worked in document management for years, and I think it's a very bad
idea. Have you ever had to restore a system from loads of incremental
backups?

I'd recommend you look at storing the files on the files system in a proper
storage array that can be expanded when needed. Backing up and restoring
files is much easier in this scenario, as when someone deletes 1 document by
mistake or a file gets corrupted, you only have to restore 1 file. How do
you do this in your database system? The storage array will give much more
flexible backup options.

It's also much easier to use a full text indexer such as Lucene on the raw
files.

In the database, you store all of the document metadata and a unique id. On
the file system, name the files as the same as the id. That way, it's easy
to get between the file system and the database record, or the full text
index and the database record.

I have to say that reading the PHP article made me feel very uneasy - it all
seemed fundamentally flawed to me.

However, good luck if you decide to go that way. I'd be interested to hear
if you genuinely feel it has been a success, especially after you've
recovered from your first major data loss :-)
(This is not meant to be sarcastic)

Best regards,

Andy


-----Original Message-----
From: Steve Folly [mailto:my...@spfweb.co.uk]
Sent: 07 January 2004 20:56
To: MySQL MySQL
Subject: MySQL as document storage?

Hi,

(disclaimer - this thread could easily go off topic; I'm interested
only in the MySQL aspects of what follows...)

At work we are currently investigating ways of filing all our
electronic documents.

There is commercial software that will do this I know, but I was
wondering whether MySQL would be suitable for this type of thing.

The 'documents' could be literally any binary file. My idea would be to
create a table with a blob column for the document itself, and document
title, reference number, keywords, other meta-data. And a web-based
front-end to search and serve documents.

Although the documents could be any file, the majority would be textual
documents (Word documents, PDF, etc). How would one go about indexing
such data, since full text searches operate on textual columns?

How to cope with columns exceeding the max packet length? Why is there
a max_packet_length setting; surely this is low-level stuff that
shouldn't affect query and result sizes?

Is storing the actual documents in the database such a good idea
anyway? Perhaps store the file in a file system somwhere and just store
the filename?


If anyone has experience in doing (or been dissuaded from doing) this
kind of application your thoughts and comments would be appreciated.
(If only to tell me "don't be so stupid, it'll never work" :)

Thanks.

--
Regards,
Steve.


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:
http://lists.mysql.com/mysql?unsub=an...@barllama.demon.co.uk


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=myo...@freebsd.csie.nctu.edu.tw

Harald Fuchs

unread,
Jan 8, 2004, 7:34:09 AM1/8/04
to
In article <200401072222....@tollyboy.demon.co.uk>,
Richard Davies <pers...@tollyboy.demon.co.uk> writes:

> I find at present from reasons I can't work out if the image to be stored is a
> jpg then there is no problem but if it is a tiff them MySQL won't store it.

> insert into charts values(5,'No
> comment',(LOAD_FILE("/home/richard/fred.jpg")));

> works fine.

> insert into charts values(5,'No
> comment',(LOAD_FILE("/home/richard/fred.tiff")));

> inserts NULL instead of the image. fred.jpg and fred.tiff both exist in the
> same directory.

Maybe fred.tiff is larger than max_allowed_packet?

Richard Davies

unread,
Jan 8, 2004, 9:58:47 AM1/8/04
to
On Thursday 08 Jan 2004 12:33, Harald Fuchs wrote:
> In article <200401072222....@tollyboy.demon.co.uk>,

> Richard Davies <pers...@tollyboy.demon.co.uk> writes:
> > I find at present from reasons I can't work out if the image to be stored
> > is a jpg then there is no problem but if it is a tiff them MySQL won't
> > store it.

> > insert into charts values(5,'No
> > comment',(LOAD_FILE("/home/richard/fred.jpg")));

> > works fine.

> > insert into charts values(5,'No
> > comment',(LOAD_FILE("/home/richard/fred.tiff")));

> > inserts NULL instead of the image. fred.jpg and fred.tiff both exist in
> > the same directory.

> Maybe fred.tiff is larger than max_allowed_packet?

No it isn't. fred.tiff is just fred.jpg copied and renamed. Apart from the
name the files are identical at about 400k and the max_allowed_packet is set
in /etc/my.cnf to be 10Meg.

It is extemely odd but this install of MySQL doesn't seem to like the
extension tiff.

--
Regards

Richard

Troy T. Hall

unread,
Jan 8, 2004, 10:26:35 AM1/8/04
to
pardon me for asking a stupid question, but when your storing data into
MySQL or any database for that matter, isn't "data" data? Does this have
something to do with the BLOB type? ( Which I've never understood anyway ).
I'm very new at this and am wanting to store documents that have database
fields in them so I am following this discussion closely trying to learn
something.

Troy
"Richard Davies" <pers...@tollyboy.demon.co.uk> wrote in message
news:200401081447....@tollyboy.demon.co.uk...

http://lists.mysql.com/mysql?unsub=gcdmg...@m.gmane.org
>
>


---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.558 / Virus Database: 350 - Release Date: 1/2/04

Fred van Engen

unread,
Jan 8, 2004, 11:52:12 AM1/8/04
to
On Thu, Jan 08, 2004 at 02:47:51PM +0000, Richard Davies wrote:
> On Thursday 08 Jan 2004 12:33, Harald Fuchs wrote:
> > In article <200401072222....@tollyboy.demon.co.uk>,
>
> > Richard Davies <pers...@tollyboy.demon.co.uk> writes:
> > > I find at present from reasons I can't work out if the image to be stored
> > > is a jpg then there is no problem but if it is a tiff them MySQL won't
> > > store it.
>
> > > insert into charts values(5,'No
> > > comment',(LOAD_FILE("/home/richard/fred.jpg")));
>
> > > works fine.
>
> > > insert into charts values(5,'No
> > > comment',(LOAD_FILE("/home/richard/fred.tiff")));
>
> > > inserts NULL instead of the image. fred.jpg and fred.tiff both exist in
> > > the same directory.
>
> > Maybe fred.tiff is larger than max_allowed_packet?
>
> No it isn't. fred.tiff is just fred.jpg copied and renamed. Apart from the
> name the files are identical at about 400k and the max_allowed_packet is set
> in /etc/my.cnf to be 10Meg.
>

Are the file permissions identical? Especially the third set is important.
Mysqld probably runs under an account (mysql) that is not yourself or in
your group, so the file needs to be world-readable. Use 'ls -l' to show
these permissions.

Also, did the copying or renaming change the case of the filename, e.g.
because you accessed it through a Samba share?


Regards,

Fred.

--
Fred van Engen XB Networks B.V.
email: fred.va...@xbn.nl Televisieweg 2
tel: +31 36 5462400 1322 AC Almere
fax: +31 36 5462424 The Netherlands

Richard Davies

unread,
Jan 8, 2004, 12:57:08 PM1/8/04
to
On Thursday 08 Jan 2004 16:50, you wrote:
> On Thu, Jan 08, 2004 at 02:47:51PM +0000, Richard Davies wrote:
> > On Thursday 08 Jan 2004 12:33, Harald Fuchs wrote:
> > > In article <200401072222....@tollyboy.demon.co.uk>,
> > >
> > > Richard Davies <pers...@tollyboy.demon.co.uk> writes:
> > > > I find at present from reasons I can't work out if the image to be
> > > > stored is a jpg then there is no problem but if it is a tiff them
> > > > MySQL won't store it.
> > > >
> > > > insert into charts values(5,'No
> > > > comment',(LOAD_FILE("/home/richard/fred.jpg")));
> > > >
> > > > works fine.
> > > >
> > > > insert into charts values(5,'No
> > > > comment',(LOAD_FILE("/home/richard/fred.tiff")));
> > > >
> > > > inserts NULL instead of the image. fred.jpg and fred.tiff both exist
> > > > in the same directory.
> > >
> > > Maybe fred.tiff is larger than max_allowed_packet?
> >
> > No it isn't. fred.tiff is just fred.jpg copied and renamed. Apart from
> > the name the files are identical at about 400k and the max_allowed_packet
> > is set in /etc/my.cnf to be 10Meg.
>
> Are the file permissions identical? Especially the third set is important.

They should have been exactly the same as the only thing I did to the file was
mv fred.jpg fred.tiff
I can no longer check this as I deleted the directory.

> Mysqld probably runs under an account (mysql) that is not yourself or in
> your group, so the file needs to be world-readable. Use 'ls -l' to show
> these permissions.

You must be right though as I have just done some extensive testing with a
range of 17 files ranging from 100k to 1.7meg and have no problems so long as
I chmod 666 *

> Also, did the copying or renaming change the case of the filename, e.g.
> because you accessed it through a Samba share?

Purely linux system no Samba but a very good thought.

Thank you all for you helpful suggestions I still have no idea how the file
permissions got changed but it seems to be the answer.

>
>
> Regards,
>
> Fred.

--
Regards

Richard

0 new messages