Assetstore Size limitation


Jeffrey A Trimble

Oct 6, 2016, 8:58:16 AM
to dspac...@googlegroups.com

Is there a mandatory size limitation of an Assetstore?  My disk space is more or less limitless—I just have to declare space (I’m using AIX with DS4700 Storage array).

 

Is there a point at which I should begin a new assetstore (such as assetstore2)?  Should it be based on backup constraints such as tape size?  (Yes, we are still on old technology, with LTO-4 tapes.)

 

TIA,

 

Jeff

 

Jeffrey Trimble, MLS

William F. Maag Library

Youngstown State University

330.941.2483 (Office)

http://digital.maag.ysu.edu

helix84

Oct 6, 2016, 9:21:24 AM
to Jeffrey A Trimble, dspac...@googlegroups.com
There is no practical limitation. The would-be limitation of number of
files per directory is already mitigated by having a deep structure
(AA/BB/CC/AABBCCDDEEFFGGHH...) to keep the files sparse.
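
For readers unfamiliar with that layout, here is a minimal sketch of how such a path could be derived from a bitstream's internal ID. It assumes three directory levels of two characters each, as the pattern above suggests. The function name and the sample ID are made up for illustration, and this is not DSpace's actual code.

```python
from pathlib import PurePosixPath


def assetstore_path(base, internal_id, levels=3, chars_per_level=2):
    """Build base/AA/BB/CC/<internal_id> from the leading characters of the ID."""
    if len(internal_id) < levels * chars_per_level:
        raise ValueError("internal ID too short for the directory depth")
    # Slice the first `levels` groups of `chars_per_level` characters.
    parts = [
        internal_id[i * chars_per_level:(i + 1) * chars_per_level]
        for i in range(levels)
    ]
    return PurePosixPath(base, *parts, internal_id)


# Hypothetical ID, for illustration only.
print(assetstore_path("/dspace/assetstore", "12345678901234567890"))
```

If the IDs consist of decimal digits, three two-digit levels allow up to 100 x 100 x 100 = 1,000,000 leaf directories, which is why each one stays sparsely populated.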

You may still want to check what your largest number of files per
directory is, and create a new assetstore if it's in the thousands, but
that should not happen unless you have a humongous number of
bitstreams (and assuming no significant deficiencies in the hashing
function used):

du /dspace/assetstore/ --inodes -S | sort -n | tail
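
Note that `--inodes` is a GNU coreutils option and may not exist in the stock `du` on other systems such as AIX. As a rough portable alternative, a short Python sketch can report the directories holding the most entries. The function name is hypothetical, the path is the one from the command above, and the counts cover non-directory entries only, so they will not match the `du` figures exactly.

```python
import heapq
import os


def busiest_directories(root, top=10):
    """Return the `top` (entry_count, directory) pairs with the most non-directory entries."""
    counts = (
        (len(filenames), dirpath)
        for dirpath, _dirnames, filenames in os.walk(root)
    )
    return heapq.nlargest(top, counts)


if __name__ == "__main__":
    # Path taken from the command above; adjust to your installation.
    for count, directory in busiest_directories("/dspace/assetstore"):
        print(f"{count:>8}  {directory}")
```

On a large assetstore the walk touches every directory, so it is best run at a quiet time.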

Multiple assetstores are available not because DSpace limits their
size, but rather to accommodate the size limitations of physical
devices and to make adding another device/assetstore easy.



Regards,
~~helix84

Compulsory reading: DSpace Mailing List Etiquette
https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette

Mark Wood

Oct 6, 2016, 10:46:56 AM
to DSpace Technical Support
The only limitation I can think of in DSpace itself is the size of the unique identifier by which a bitstream is retrieved.  In DSpace 6, that's a UUID, which is a 128-bit number.  In v5 and earlier it was a Java 'int', which is *only* 32 bits signed (so it can identify about 2 billion bitstreams).
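
To put numbers on those identifier sizes, here is a small arithmetic sketch. It treats the UUID as a plain 128-bit number, as described above, and ignores any bits a particular UUID version reserves; the variable names are just for illustration.

```python
INT_BITS = 32    # Java 'int', signed
UUID_BITS = 128  # UUID treated as a plain 128-bit number

# One bit of a signed int is the sign, so the largest positive value is 2^31 - 1.
max_int_ids = 2 ** (INT_BITS - 1) - 1
uuid_values = 2 ** UUID_BITS

print(f"{max_int_ids:,}")    # 2,147,483,647
print(f"{uuid_values:.3e}")  # 3.403e+38
```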

The OS filesystem that holds the bitstreams will have its own limits.  Contemporary filesystem designs have limits on the order of exabytes, and billions to quintillions of files.  You may need to do some tuning of your assetstore filesystem(s) to even approach those limits, though.
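
One way to see how close a filesystem is to its file-count limit is to ask it for its inode figures. The sketch below uses Python's `os.statvfs`, which is available on Unix-like systems only; the function name is hypothetical, and some filesystems report zero total inodes because they allocate them dynamically, which the code treats as "unknown".

```python
import os


def inode_headroom(path):
    """Return (total, free, fraction_used) inode figures for the filesystem holding `path`."""
    st = os.statvfs(path)
    total, free = st.f_files, st.f_ffree
    # Some filesystems report 0 total inodes; the used fraction is then unknown.
    used_fraction = (total - free) / total if total else None
    return total, free, used_fraction


if __name__ == "__main__":
    # "." is a placeholder; point this at the assetstore's mount point instead.
    total, free, used = inode_headroom(".")
    if used is None:
        print("filesystem does not report a fixed inode count")
    else:
        print(f"inodes: {total} total, {free} free, {used:.1%} used")
```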

I think it most likely that you'll first hit limits of filesystem *performance*, and particularly of backup performance.  And the answer to many performance questions is "keep measuring regularly and react to signs of significant deterioration."

I find backup of large services particularly worrisome.  We need ways to do backup smarter, not just faster.  That's true of everything, not just DSpace.

Jeffrey A Trimble

Oct 6, 2016, 4:22:46 PM
to Mark Wood, DSpace Technical Support

Thanks to Helix and Mark for illuminating the issue I need to look at.  I’m nowhere near 2 billion bitstreams, and my storage array “can be” defined as one physical volume on the server, and yet be further defined as logical volumes.  While this isn’t my area of expertise, the explanations will help me make wise choices ahead.

 

Thanks again.

 

 

Jeff

 
