Backup strategy for Resourcespace?

Benjamin Bailes

Oct 12, 2010, 4:45:17 PM
to resour...@googlegroups.com
I've been working with Resourcespace for a while and am starting to
get comfortable with the idea of deploying it at the museum where I
work, where we're in desperate need of a DAM.

I have two questions that I need to try to answer and haven't gotten
anywhere with the documentation wiki or searching the Google group, so
I'm seeking some input from all of you.

1. What is the prevailing method for backup and recovery? I
understand that I can export the database, and theoretically restore
it if necessary, though I'm not aware of any method beyond the manual
one (Team Center, Export Data). I can also make a regular backup of my
/.../filestore directory, which I plan to do. My question is: am I
limited to backing up in this way, or is there a more robust method
that I just haven't found out about?

Also wondering what happens if I need to recover a specific file for
some reason? Let's say someone accidentally deletes an important
file. Assuming that I have a backup tape and all of the files in
/.../filestore are on it, how do I find the individual file that I
need? Even if I wanted to upload it again how would I find it so I
could recover it and then sign into Resourcespace and resubmit it?

2. What is the best way to get all of the files out of Resourcespace
if, say, 5 years from now we need to migrate all of our digital
resources to XYZ platform? Is there a way to do this? I'll need to
prove that there is a viable exit strategy if I'm going to be
successful deploying Resourcespace.

Many thanks for your attention to these questions and many thanks for
developing such a great system!

Ben Bailes

David Dwiggins

Oct 12, 2010, 5:38:33 PM
to resour...@googlegroups.com
Hi, Ben,

1. Backing up ResourceSpace is pretty easy, since it is just a MySQL database and a massive collection of organized files. I can't say what the prevailing method is, but I can say a bit about what we do.

I do periodic MySQL dumps of the database (I keep meaning to automate this, but haven't gotten around to it yet). This preserves all of the metadata. The metadata is also written to XML files in each resource directory, so if you back up the filestore you also have that as a safety net (although if you had to restore from there, you obviously wouldn't get your collections, themes, etc.). You could easily set up a timed job to dump the database every night or every week, and then back this up along with the other files.
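
A timed dump job of the kind described above might look roughly like this. It is only a sketch and not part of ResourceSpace: the database name, the backup directory and the function names are assumptions made for illustration, and it expects the standard mysqldump client to pick up its credentials from an option file such as ~/.my.cnf.

```python
import datetime
import pathlib
import subprocess

# Hypothetical values: adjust to your own installation.
DB_NAME = "resourcespace"
BACKUP_DIR = pathlib.Path("/var/backups/resourcespace")


def dump_path(backup_dir, db_name, day):
    """Return a dated file name such as resourcespace-2010-10-12.sql."""
    return pathlib.Path(backup_dir) / f"{db_name}-{day.isoformat()}.sql"


def build_dump_command(db_name, out_file):
    """Build the mysqldump invocation. Credentials are expected to come
    from an option file (e.g. ~/.my.cnf), not from the command line."""
    return [
        "mysqldump",
        "--single-transaction",  # consistent snapshot, InnoDB tables only
        f"--result-file={out_file}",
        db_name,
    ]


def run_dump(backup_dir=BACKUP_DIR, db_name=DB_NAME):
    """Dump the database to today's dated file and return its path."""
    out_file = dump_path(backup_dir, db_name, datetime.date.today())
    out_file.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_dump_command(db_name, out_file), check=True)
    return out_file
```

Calling run_dump() from a nightly scheduled job would leave one dated .sql file per day in the backup directory, which can then be copied off the server with the rest of the files.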

For the files themselves, we keep a mirror of our filestore in an Amazon S3 bucket off site. This bucket serves as a source for the images we use on our website, and also provides a backup should the system be lost. I have a script that looks for new resources and sends the small previews immediately, and then goes along behind and starts adding bigger files. Since there is sometimes a large backlog, I've also used Amazon's data loading service to copy sets of files to a hard drive and ship it to them.

I have also enabled Amazon's S3 versioning feature, which means that if we copy a file to them, and then it changes on our system and is recopied, Amazon actually retains both versions. Right now finding and retrieving the old versions would be a manual process, but eventually I'd like to integrate this into ResourceSpace.

I am working on developing my S3 backup tool into a plugin that can be dropped into any ResourceSpace installation, but it's not quite ready for prime time yet. However, the code is in the plugin repository, and I'm still actively working on it.

Any other backup method would also work -- you could back up to tape, to hard drive, etc. There's nothing exotic about the ResourceSpace system. It can be backed up just like any other system that stores data in standard files.

Re: finding a file, this depends on what you knew about it. If you knew the resource ID, finding it would be simple, since the filestore is organized into folders by resource number. If you did not know this but had a backup of the database, you could load the database backup into a separate MySQL database and then search the resource data table to find the ID of the one you want. Or, since all the metadata exists in (plain text) XML files in the filestore, if you had a lot of time you could simply do a search across the complete filestore by keyword.
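
The keyword search across the metadata XML files could be sketched as follows. This assumes only what is said above, namely that the metadata sits in plain-text XML files somewhere under the filestore; the function name is hypothetical, and it does a simple substring match rather than parsing the XML.

```python
import pathlib


def find_metadata_files(filestore, keyword):
    """Yield paths of XML files under `filestore` whose text contains
    `keyword` (case-insensitive substring match)."""
    needle = keyword.lower()
    for xml_file in sorted(pathlib.Path(filestore).rglob("*.xml")):
        # errors="replace" keeps the scan going if a file has odd encoding.
        text = xml_file.read_text(encoding="utf-8", errors="replace")
        if needle in text.lower():
            yield xml_file
```

Each hit's parent folder is the resource directory, so the file to restore should be sitting next to the XML file in the backup copy.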

2. Getting files out of ResourceSpace would be easy for anyone with a bit of scripting experience. In addition to the database itself (which would likely be the easiest source), all of the metadata is dumped to XML files along with the resource files. So you would simply be able to crawl the filestore and ingest any file that wasn't a preview, using the XML file for metadata. Having worked with various other DAM/CMS systems in the past, I can say that ResourceSpace is one of the easiest to read data out of. I would not be concerned about the ability to migrate in the future.
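
As a rough illustration of the crawl described above, the sketch below pairs every non-preview file with the XML file in the same folder. The rule for spotting previews is an assumption (a size code such as "scr" or "thm" directly after the resource number in the file name) and should be checked against a real filestore before use; the function name is hypothetical.

```python
import pathlib
import re

# Assumption: preview files carry a size code right after the resource
# number (e.g. "123scr_..."). Verify this against your own filestore.
PREVIEW_PATTERN = re.compile(r"^\d+(thm|col|pre|scr|lpr|hpr)_")


def export_manifest(filestore, preview_pattern=PREVIEW_PATTERN):
    """Return a list of (original_file, metadata_xml) pairs found by
    crawling `filestore`. Folders without an XML file are skipped."""
    pairs = []
    folders = sorted(p for p in pathlib.Path(filestore).rglob("*") if p.is_dir())
    for folder in folders:
        xml_files = sorted(folder.glob("*.xml"))
        if not xml_files:
            continue
        for item in sorted(folder.iterdir()):
            if not item.is_file() or item.suffix.lower() == ".xml":
                continue
            if preview_pattern.match(item.name):
                continue  # skip generated previews
            pairs.append((item, xml_files[0]))
    return pairs
```

The resulting list could then be fed to whatever import routine the target system provides, reading each XML file for the metadata.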

As someone who's using the system in a museum setting, I'd be happy to talk to you further. Feel free to contact me at ddwiggins [at] historicnewengland [dot] org, or 617-994-5948.

-David Dwiggins



--
You received this message because you are subscribed to the Google Groups "ResourceSpace" group.
To post to this group, send email to resour...@googlegroups.com.
To unsubscribe from this group, send email to resourcespac...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/resourcespace?hl=en.


dmisaacs

Oct 12, 2010, 5:40:20 PM
to ResourceSpace
Hi Benjamin,

I'm sure someone more knowledgeable than me will chime in on this, but
I had a couple of thoughts for you:

1) I believe the way to back up the database would be to either
use a SQL dump to create a backup copy (e.g. mysqldump), or to use
phpMyAdmin to generate the backups. (In either case, you would simply
make a copy of the database and save it somewhere else.) That, along
with the backup of filestore (as well as /include/config.php, which is
VERY important!), should put you in a position to recover your work.

I am not sure about recovering a file that is accidentally deleted. I
would recommend that you set permissions so that general users cannot
permanently delete anything -- you can configure permissions on the
user groups so that when a user "deletes" a resource it actually gets
put into a queue that requires review by a system administrator before
it is permanently deleted.

2) Hopefully you won't need to migrate from ResourceSpace to another
platform, but I understand the need to have it as a backup plan.
Since the data is stored in a MySQL database, you should be able to
move from RS to another DAM. The details are beyond me, as I have
never done this, but if you dig around a bit you should be able to
craft a migration plan. Perhaps if you or someone you know is a
database wiz...

You should get more responses from the group, but I hope these
preliminary thoughts are helpful to you.

Danny

dmisaacs

Oct 12, 2010, 5:42:30 PM
to ResourceSpace
See, David beat me to the punch with way better answers!

David Dwiggins

Oct 12, 2010, 5:50:27 PM
to resour...@googlegroups.com
Well, I did manage to leave out the importance of the config file -- which, as you point out, is VERY important!

Benjamin Bailes

Oct 13, 2010, 2:11:40 PM
to ResourceSpace
Thanks, David and Danny, for your thoughtful replies. I was the first to
leave out the config.php file, certainly an oversight. Recreating the
customizations in /.../include/config.php would be a pain, but a minor
one in the scheme of things. Best to just back it up!

Based on your answers, I think Backup and Recovery look like they're
an area for some improvement in the system overall. I wish I were a
developer and could contribute something, but hopefully identifying
and calling attention to the need is a small contribution of a sort.

For example, David, you said you'd been meaning to script your MySQL
dump. Perhaps such a capability could be written into the
functionality on the Export Data page? My Barracuda spam firewall
does this (mounts a share and copies its database contents at
configurable intervals).

Perhaps there could also be other tools created to simplify the
recovery process (such as a way to pull in the MySQL data from a
file), etc?

I see that there is an experimental feature (looking through
config.default.php) to create checksums. I'd really like to know more
about that and how it could help us.
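
This thread doesn't describe how the experimental checksum option works, so the following is independent of it. It is only a sketch of the general idea: checksums let you confirm that a backup copy of the filestore still matches the live one. All function names here are hypothetical.

```python
import hashlib
import pathlib


def file_checksum(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_manifest(root):
    """Map each file's path (relative to `root`) to its checksum."""
    root = pathlib.Path(root)
    return {
        path.relative_to(root).as_posix(): file_checksum(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_manifests(live, backup):
    """Return (missing, changed): files absent from the backup, and files
    whose contents differ between the two trees."""
    missing = sorted(name for name in live if name not in backup)
    changed = sorted(
        name for name in live if name in backup and live[name] != backup[name]
    )
    return missing, changed
```

Comparing the manifest of the live filestore with the manifest of a backup would flag files that never made it into the backup, as well as files that differ, whether through a legitimate edit or through corruption.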

All in all it seems that backup is easy enough, but *restoring* a
small number of files that were deleted or corrupted is completely
manual. In a disaster recovery (or even a migration) scenario a fresh
install, importing of the database, files into /filestore, and config
should work fine, though still a fairly manual process. In other
words, recovery seems like an easier proposition than simple restore,
and from an archival perspective, Resourcespace may still have a
little growing to do.

Having said all of this, I should add that the feature set is awesome
and it seems totally usable to me. Not interested in anything but
making my implementation solid so I can put it into production.

If anything I've mentioned here sounds like I'm not quite thinking
about this clearly, please, anyone, let me know.

Thanks again for your kind and thoughtful replies,
Ben

Tom Gleason

Oct 13, 2010, 2:26:22 PM
to resour...@googlegroups.com
It doesn't take much work to develop a script that will back up the database and even the whole RS installation. I'm not sure this is a job appropriate for RS base code, as it's just a general IT issue.

A cron job can run mysqldump nightly and rsync the whole thing to another server. It's really not that difficult, but it involves issues outside of RS code, such as how to set up a remote SSH server to rsync to (or an Amazon S3 bucket mounted with s3fs, as David and I were recently discussing). I do agree it might be an area that could use documentation.
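
The rsync half of that job might be sketched like this. The paths, the host and the function names are placeholders rather than anything ResourceSpace ships; the only rsync options used are the standard --archive, --compress and --delete.

```python
import subprocess


def build_rsync_command(source_dir, destination, delete=False):
    """Build an rsync call that mirrors `source_dir` to `destination`
    (a local path, or a user@host:/path target reached over SSH)."""
    command = ["rsync", "--archive", "--compress"]
    if delete:
        # Mirrors deletions too, so an accidental delete propagates.
        command.append("--delete")
    # Trailing slash: copy the directory's contents, not the directory itself.
    command += [source_dir.rstrip("/") + "/", destination]
    return command


def mirror(source_dir, destination, delete=False):
    """Run the rsync command and raise if it exits with an error."""
    subprocess.run(build_rsync_command(source_dir, destination, delete), check=True)
```

A crontab entry along the lines of "30 2 * * * /usr/bin/python3 /usr/local/bin/rs_backup.py" (a hypothetical script path) would then run it every night at 2:30. Leaving delete off by default means a file removed by mistake on the live server stays in the mirror.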

Recovery would always be in a disaster situation with unknown variables, but is easy enough if you have a copy of all the files and the database.

--
Tom Gleason, PHP Developer
DBA Impressive Design

Exploring ResourceSpace at:
http://resourcespace.blogspot.com

Paul Manno

Oct 13, 2010, 2:39:33 PM
to resour...@googlegroups.com
Although cron and rsync are great tools, I agree that a more robust
backup/import/export utility would be great and is needed.

For example, an installation that has multiple clients may want to
flush the system of one particular client's resources in a way that
would allow them to easily import them again. In my case, I manage
multiple "productions". When a production is over, I would like to be
able to get all of the resources/collections/metadata for that given
production out of the system and backed up onto tape for long-term
storage. This way, I wouldn't have to hold all the resources on disk
forever. I could dump them out and reclaim that storage space for
future productions. At the same time, I'd like to be able to import
all those items again if, for some reason, that production gets going
again.

So, it's a big project, and it's probably quite difficult (what to do
about collections, themes, etc...) but it's worth looking into for
future development projects.

Tom Gleason

Oct 13, 2010, 2:49:26 PM
to resour...@googlegroups.com
I don't disagree... we should certainly pursue the possibilities for easing this process in as many use cases as necessary, even if it's only sample bash scripts and cron jobs on the wiki.

All ResourceSpace development needs to be motivated in some practical way, either by code contributions that can be built upon or by funding us developers to solve specific problems.

Benjamin Bailes

Oct 13, 2010, 3:30:59 PM
to ResourceSpace
Thanks Tom and Paul.

I would imagine a little additional documentation, sample bash scripts
and cron jobs up on the wiki would be a big help for people in my
situation.

Would someone be able to direct me to some more information about the
(currently experimental) checksums? Then I'll pipe down. Thanks
again to everyone who replied to my posts. All of your replies were
most helpful.

Ben


jledhead

Oct 21, 2010, 8:58:22 PM
to ResourceSpace
For MySQL I always recommend this:
http://sourceforge.net/projects/automysqlbackup/

It will take daily backups, rotate them, and give you weekly, monthly, and
yearly backups. We then use our backup software to grab them off the
server. You could use your backup software or rsync them somewhere so
they're in 2 places.
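
To make the idea of rotation concrete, here is a small sketch of one possible retention policy. It is not automysqlbackup's actual logic, just an illustration; the function name and the default counts are assumptions.

```python
import datetime


def backups_to_keep(dates, today, daily=7, weekly=4, monthly=12):
    """Return the backup dates worth keeping:
    - every backup from the last `daily` days,
    - the newest backup in each of the `weekly` most recent ISO weeks
      that have a backup,
    - the newest backup in each of the `monthly` most recent calendar
      months that have a backup.
    """
    keep = set()
    weeks_seen, months_seen = [], []
    for day in sorted(set(dates), reverse=True):  # newest first
        if day > today:
            continue
        if (today - day).days < daily:
            keep.add(day)
        week = tuple(day.isocalendar())[:2]  # (ISO year, ISO week)
        if week not in weeks_seen and len(weeks_seen) < weekly:
            weeks_seen.append(week)
            keep.add(day)
        month = (day.year, day.month)
        if month not in months_seen and len(months_seen) < monthly:
            months_seen.append(month)
            keep.add(day)
    return sorted(keep)
```

Any dated dump that is not in the returned list could be deleted, which keeps recent history dense and older history sparse.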

Tom Gleason

Oct 27, 2010, 3:06:06 PM
to resour...@googlegroups.com
http://www.marksanborn.net/howto/use-rsync-for-daily-weekly-and-full-monthly-backups/

Here's how to make daily/weekly/monthly backups. Remember you also have
to back up the MySQL database.

I'd also suggest creating a script that mounts the backup drive only
when the backup is being performed.

Benjamin Bailes

Oct 28, 2010, 5:03:58 PM
to ResourceSpace
Thanks Jason and Tom,

Very clear instructions that help flesh out a general method for
backing up Resourcespace.

It takes a little while working with the system to come to understand
that what you need to back it up covers several different areas:

/filestore
config.php
the database

As long as you have these you should certainly be OK from a DR
perspective. As far as restoring a file here and there... perhaps
keeping multiple copies of backups over time can also help.

Thanks again!

Ben

