digital object storage


GR Mulcaster

Jan 6, 2017, 1:02:48 AM
to AtoM Users
The UTAS project team is about to upload several large batches of digital objects. 
In budgeting for likely future storage requirements, it is good to know where/how images are stored. 
Another long-term digital preservation environment will be rolled out under a future project.  
But to what extent do AtoM users regard AtoM as a database or purely a descriptive/discovery layer?
Some images will reside on external repositories and be only "linked digital objects" but some will reside on the AtoM server database as "uploaded digital objects".


Glenn Mulcaster
Project Archivist, 
UTAS Library Special and Rare Materials Collection
Launceston, TAS

 


Dan Gillean

Jan 6, 2017, 4:20:38 PM
to ICA-AtoM Users
Hi Glenn

AtoM stores digital objects in the uploads directory - if you have command-line access to your AtoM instance, I suggest you explore it. If you are at the root AtoM directory, simply change directories to uploads.

The uploads directory usually has one main subdirectory in it called r. Inside this, the next subdirectories are based on the authorized form of name of the institutions in your AtoM instance. If you have themed your repository, then that repository's directory will contain a conf subdirectory, in which you'll find uploaded banners and logos. It gets more difficult to trace the exact path of the digital objects below this, however. For each uploaded object, AtoM will generate a SHA-256 checksum, and then, to avoid collisions and aid in retrieval, it will create a specific set of subdirectories based on this checksum. I believe it will first create a directory based on the first character of the checksum, then another subdirectory based on the second checksum character, and finally another with the full checksum value as the directory name, in which the digital object is placed.



Note that more recent versions of AtoM might actually include a third level of nesting based on the third checksum character - I was looking at our old demo data uploads directory, so it might not reflect all the changes. Here's an example of the code doing this in our current 2.4 development branch: https://github.com/artefactual/atom/blob/qa/2.4.x/lib/model/QubitDigitalObject.php#L1839

return '/'.QubitSetting::getByName('upload_dir')->__toString().'/r/'.$repoDir.'/'.$checksum[0].'/'.$checksum[1].'/'.$checksum[2].'/'.$checksum;

So depending on your AtoM version, the full path to an uploaded digital object is likely going to be either:

uploads/r/repository-name/first-checksum-character/second-checksum-character/full-checksum/

or

uploads/r/repository-name/first-checksum-character/second-checksum-character/third-checksum-character/full-checksum/
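
As a rough illustration of the two layouts above, here is a minimal shell sketch that assembles the expected directory from a checksum. The function name atom_object_dir, its arguments, and the example uploads location are all hypothetical; the checksum is passed in rather than computed, because how AtoM derives it should be confirmed against your own version.

```shell
# Sketch only: build the directory where an uploaded digital object is
# expected to live. Names and defaults here are assumptions, not AtoM APIs.
atom_object_dir() {
  uploads_dir="$1"   # e.g. /usr/share/nginx/atom/uploads (assumed location)
  repo_dir="$2"      # repository directory name under uploads/r/
  checksum="$3"      # checksum value AtoM recorded for the object
  levels="${4:-3}"   # 2 for the older layout, 3 for the newer one

  # Pick out the first three characters of the checksum.
  c1=$(printf '%s' "$checksum" | cut -c1)
  c2=$(printf '%s' "$checksum" | cut -c2)
  c3=$(printf '%s' "$checksum" | cut -c3)

  if [ "$levels" -eq 2 ]; then
    printf '%s/r/%s/%s/%s/%s\n' "$uploads_dir" "$repo_dir" "$c1" "$c2" "$checksum"
  else
    printf '%s/r/%s/%s/%s/%s/%s\n' "$uploads_dir" "$repo_dir" "$c1" "$c2" "$c3" "$checksum"
  fi
}
```

If the checksum does turn out to be the SHA-256 of the uploaded file's contents (an assumption worth testing on one known object), running sha256sum on the original file and passing the result to this function should point at the right directory.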



When you link to external digital objects, AtoM will still generate a local thumbnail and reference copy for use in search/browse results, but the master will not be pulled into AtoM.

You can use a tool such as rsync and some kind of cron job to regularly back up your uploads directory - or you could regularly add it to a zipped tar file. You might find the instructions in our upgrade documentation and our data backup suggestions helpful for figuring this out.
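
A minimal sketch of the tar option, assuming the AtoM root and the backup destination shown in the comments (both are placeholders, and backup_uploads is a hypothetical helper name):

```shell
# Sketch only: archive the uploads directory into a dated, gzipped tar file.
backup_uploads() {
  atom_root="$1"    # e.g. /usr/share/nginx/atom (assumed install path)
  backup_dir="$2"   # e.g. /srv/backups/atom (assumed destination)
  archive="$backup_dir/uploads-$(date +%Y%m%d).tar.gz"

  mkdir -p "$backup_dir" || return 1
  # -C switches into the AtoM root first, so the archive holds relative
  # paths that all start with "uploads/".
  tar -czf "$archive" -C "$atom_root" uploads || return 1
  printf '%s\n' "$archive"
}
```

For the rsync route, something along the lines of `rsync -a /usr/share/nginx/atom/uploads/ backup-host:/srv/backups/atom/uploads/` run from cron would mirror the directory to another machine; the host name and paths there are illustrative only.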

We do NOT consider AtoM to be suitable for long-term digital preservation - it is a system for archival description and access. If you are interested in standards-based digital preservation workflows that can be used in conjunction with AtoM, I would strongly recommend you look into our sibling project, Archivematica, which can perform DIP digital object uploads (with DC metadata) to AtoM, while also generating standards-based AIPs for long-term preservation and storage.

Regards,



Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory

--
You received this message because you are subscribed to the Google Groups "AtoM Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ica-atom-users+unsubscribe@googlegroups.com.
To post to this group, send email to ica-atom-users@googlegroups.com.
Visit this group at https://groups.google.com/group/ica-atom-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/ica-atom-users/6c3cd991-9762-4a0b-9671-4c48a252fc05%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

GR Mulcaster

Jan 9, 2017, 5:56:09 AM
to ica-ato...@googlegroups.com
Dan
This is so thorough and cleverly covers several more potential follow-up questions.
Regards
Glenn Mulcaster





--
AtoM Project Archivist
SPARC
University of Tasmania Library Special and Rare Materials Collection

Level 5, Morris Miller Library
Sandy Bay Campus
Sandy Bay, TAS 7005
&
Launceston Campus Library
Newnham Road
Newnham, TAS 7248







Fernando Fernández de Aranguiz

Jan 24, 2020, 1:57:38 AM
to AtoM Users
Hi,

What happens if the file system is almost full?
Can you "add" other filesystems?

Thanks

Dan Gillean

Jan 24, 2020, 2:39:40 PM
to ICA-AtoM Users
  Hi Fernando, 

Unfortunately, we don't know of any method of pointing the uploads directory to multiple storage locations. Your best strategy is probably to back up the uploads directory, upgrade the size of the datastore (perhaps using a network attached storage device?) and then migrate the uploads directory back to the new storage location. 
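
Before resizing anything, it may help to see how full the filesystem is and which repositories take the most room. A small sketch using standard df and du (the function names are hypothetical, and the uploads path you pass in depends on your installation):

```shell
# Sketch only: print how full the filesystem holding a path is, as a whole
# number of percent. df -P keeps each filesystem on a single output line.
fs_used_percent() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Sketch only: list each repository directory under uploads/r with its size
# in KiB, largest first.
uploads_by_repo() {
  du -sk "$1"/r/*/ | sort -rn
}
```

For example, `fs_used_percent /usr/share/nginx/atom/uploads` and `uploads_by_repo /usr/share/nginx/atom/uploads`, assuming that is where your uploads directory lives.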

Regards, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him



Fernando Fernández de Aranguiz

Jan 27, 2020, 1:05:00 AM
to AtoM Users
Thank you very much, Dan. :-)



Karl Goetz

Jan 27, 2020, 8:02:40 PM
to ica-ato...@googlegroups.com
On Fri, 24 Jan 2020 14:39:24 -0500
Dan Gillean <d...@artefactual.com> wrote:

> Hi Fernando,
>
> Unfortunately, we don't know of any method of pointing the uploads
> directory to multiple storage locations. Your best strategy is probably to
> back up the uploads directory, upgrade the size of the datastore (perhaps
> using a network attached storage device?) and then migrate the uploads
> directory back to the new storage location.
>


Hi,

Unix-like systems support mounting filesystems ("drives") at arbitrary mount points, so the following is an entirely valid way of mounting filesystems:

# First drive, two partitions
/dev/sda1 /
/dev/sda2 /usr/share/nginx/atom/uploads

# Another drive just for big-project
/dev/sdc1 /usr/share/nginx/atom/uploads/r/big-project

You could also, AtoM permitting (and I haven't researched the AtoM side of this, sorry), symlink big-project to /mnt/some-other-mountpoint.
In these examples the filesystems could be local or remote.
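
A sketch of the symlink variant, with the caveat that nothing in this thread confirms AtoM follows symlinks inside uploads. The helper name relocate_with_symlink and the example paths are placeholders:

```shell
# Sketch only: move a repository's upload directory elsewhere and leave a
# symlink in its place. Whether AtoM follows the link is unverified.
relocate_with_symlink() {
  src="$1"      # e.g. /usr/share/nginx/atom/uploads/r/big-project
  target="$2"   # e.g. /mnt/some-other-mountpoint/big-project (must not exist yet)

  # Refuse to run if src is missing or already a link, or if target exists.
  [ -d "$src" ] && [ ! -L "$src" ] || return 1
  [ ! -e "$target" ] || return 1

  mv "$src" "$target" || return 1
  ln -s "$target" "$src"
}
```

The web server user would still need read and write access at the new location, so check ownership and permissions after the move.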

For doing the move to another filesystem, you'll do something like this:

# Tell people to stop editing and uploading!
cd /usr/share/nginx/atom/uploads/r/
mv big-project big-project.backup
mkdir big-project
mount /your/new/filesystem big-project
mv big-project.backup/* big-project/
# just in case
mv big-project.backup/.??* big-project/
# update fstab and other system configuration as required.
# tell everyone they can edit again.

Your choice of per-project filesystems or symlinks to external locations will depend on AtoM's support for symlinks and your local site requirements.
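
A more cautious variant of the move steps above is to copy rather than move, so the old directory stays untouched until the copy has been checked. This is only a sketch with a hypothetical helper name; it also sidesteps the `.??*` glob, which skips hidden names that have a single character after the dot (such as `.a`).

```shell
# Sketch only: copy everything (hidden files included) from the old
# directory into the newly mounted one, then compare the two trees.
copy_and_verify() {
  old_dir="$1"   # e.g. big-project.backup
  new_dir="$2"   # e.g. big-project (already created and mounted)

  # "old_dir/." copies the directory's contents, dotfiles too; -a keeps
  # ownership, permissions and timestamps where possible.
  cp -a "$old_dir"/. "$new_dir"/ || return 1

  # diff -r exits non-zero if any file differs or is missing on either side.
  diff -r "$old_dir" "$new_dir" > /dev/null
}
```

Once copy_and_verify succeeds and AtoM is serving objects from the new location, the old copy can be removed.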
thanks,
kk



--
Karl Goetz
Technical Services Officer - eResearch, Information Technology Services
University of Tasmania & Tasmanian Partnership for Advanced Computing

Mail: University of Tasmania, Private Bag 69, Hobart, Tasmania 7001
Delivery: TT Flynn Street, Sandy Bay, Tasmania 7005




Fernando Fernández de Aranguiz

Jan 28, 2020, 1:55:24 AM
to AtoM Users
Thank you very much, Karl.

Matthew Bruton

Feb 3, 2020, 4:41:34 AM
to ica-ato...@googlegroups.com
Karl,
I'm with you so far as moving directories and mounting drives is concerned, but I don't know what you are talking about when it comes to fstab and configuration. Is this how you tell AtoM where the new location of the files is? Do you know where I can get more info on how to do this?
Thanks,
Matthew

Karl Goetz

Feb 3, 2020, 4:39:49 PM
to ica-ato...@googlegroups.com
On Mon, 3 Feb 2020 01:41:33 -0800 (PST)
Matthew Bruton <matthewb...@gmail.com> wrote:

> Karl,
> I'm with you so far as moving directories and mounting drives is concerned,
> but don't know what you are talking about when it comes to fstab and
> configuration. Is this how you tell atom where the new info is? Do you know
> where I can get more info on this?
> Thanks,
> Matthew

Hi Matthew,
fstab is a system configuration file (located at /etc/fstab) which tells your operating system what to mount. You will almost certainly need entries added there if you are doing something like I described, to ensure the filesystems are in place when needed.

Running `man 5 fstab` on your server will give you a technical description of the file's format, and I'm sure there are some 'friendly' references to be found online.

A word of warning - if you mess up that file, your system may stop booting.
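
To make that concrete, here is a sketch of adding one entry while keeping a backup copy first. The helper name add_fstab_entry is hypothetical, and the "defaults 0 2" options are just a common choice for a non-root local filesystem, not something specific to AtoM. The file path is a parameter so you can try it on a scratch copy before touching the real /etc/fstab.

```shell
# Sketch only: back up an fstab file, then append one mount entry.
# The six fstab fields are: device, mount point, type, options, dump, pass.
add_fstab_entry() {
  fstab="$1"        # /etc/fstab on a real system; a scratch copy when testing
  device="$2"       # e.g. /dev/sdc1
  mountpoint="$3"   # e.g. /usr/share/nginx/atom/uploads/r/big-project
  fstype="$4"       # e.g. ext4 (assumed; use whatever the filesystem is)

  # Keep the previous version so it can be restored if something goes wrong.
  cp -p "$fstab" "$fstab.bak" || return 1
  printf '%s %s %s defaults 0 2\n' "$device" "$mountpoint" "$fstype" >> "$fstab"
}
```

After editing the real file as root, running `mount -a` tries to mount everything listed that is not already mounted, which should surface mistakes while the system is still up rather than at the next reboot.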


thanks,
kk


