Save Disk Images to preserve bits


Michael Peterson

Mar 6, 2014, 11:11:43 AM
to SavingThe DigitalWorld



Regards,
Michael Peterson



(805)201-3178
        mpet...@ltdprm.com
www.ltdprm.org

Begin forwarded message:

Subject: [digital-curation] Digest for digital-...@googlegroups.com - 7 Messages in 1 Topic
Date: March 6, 2014 at 2:23:09 AM PST
To: Digest Recipients <digital-...@googlegroups.com>

Group: http://groups.google.com/group/digital-curation/topics

    Nathan Tallman <ntal...@gmail.com> Mar 05 11:03AM -0500  

    [Apologies for cross-posting]
     
    Recently, when a colleague asked me why we save disk images of legacy
    media, as opposed to just copying the file structure, I was hard pressed to
    provide a definitive answer. After some gesticulations about a bit-level
    copy of the original media and file headers, I couldn't come up with much.
     
    Can someone please elucidate more articulately on this topic? I'd like to
    provide a more cogent argument than, "because we're supposed to!"
     
    Many Thanks,
    Nathan
     
    Seth Shaw <seth....@gmail.com> Mar 05 11:31AM -0500  

    There are a few practical things at play:
    1) Due to variances in the metadata & naming constraints supported by
    various file systems (and operating systems) direct copying usually leads
    to alteration/loss of metadata. This concern may be countered by harvesting
    the filesystem metadata (e.g. run fiwalk on the disk itself) and separately
    storing that metadata outside the filesystem, resulting in individual files
    + file metadata, but a disk image provides a single original
    representation. (Not to mention "hidden" content.)
    2) Disk images are more efficient at preserving associated representation
    information (if the software is also included on the disk) resulting in
    greater ease of emulation (generally speaking).
    3) Disk images (as containers) are less likely to experience accidental loss
    of association and, in my experience, are usually easier and more efficient to
    manage as a unit. E.g. accidentally deleting or moving a file from a folder
    after a direct copy v. modifying a disk image to remove a file (subject, of
    course, to the various tools and controls put in place). Or validating
    checksums for a single image v. each file. (Okay, so maybe not a huge
    difference if managed correctly.)
    4) Fragile media may only have a single read left in them and a disk image
    is the most effective way to ensure you get everything you may need for
    further analysis if necessary.
    n...) probably more that I can't think of off the top of my head.
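
    To make points 1 and 3 concrete, here is a minimal Python sketch of the two
    fixity workflows being compared: a single checksum for a disk image versus a
    per-file manifest for a direct copy. The function names (sha256_of,
    file_manifest) and the example paths are hypothetical, not part of any tool
    mentioned in this thread. Note that the size and timestamp recorded here are
    whatever the copy reports, which may already differ from the original media.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_manifest(root):
    """Walk a copied directory tree and record basic metadata and a checksum per file.

    The metadata comes from the copy's file system, not the original media.
    """
    root = Path(root)
    entries = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        info = path.stat()
        entries.append({
            "path": path.relative_to(root).as_posix(),
            "size": info.st_size,
            "mtime": info.st_mtime,
            "sha256": sha256_of(path),
        })
    return entries
```

    With a disk image, fixity checking is one call such as
    sha256_of("floppy01.img") (a hypothetical file name). With a direct copy,
    every entry returned by file_manifest("floppy01_copy/") has to be stored and
    re-validated, and any file missing from the tree has to be detected separately.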
     
    Be sure to read Kirschenbaum, Matthew, Richard Ovenden, and Gabriela
    Redwine. *Digital Forensics and Born-Digital Content in Cultural Heritage
    Collections*. Washington, D.C.: Council on Library and Information
    Resources, 2010 for more information on this topic.
     
     
     
    Tom Creighton <nt.cre...@gmail.com> Mar 05 10:02AM -0700  

    As a counterpoint to some of these points, maintaining only the disk image means
    that file format deprecation might be harder to deal with. Obviously,
    maintaining a disk image over time requires maintaining the software that
    is capable of reading the image. That is exactly the same issue faced by
    all emulation approaches. One can't assume that the software that
    originally rendered the file content contained in the image is necessarily
    part of the image. Not only that, but even if it is, one can't assume that
    maintaining the disk image in and of itself ensures viability of that
    software. In other words, software that enables reading of a particular
    disk image does not necessarily support emulation of programs that are
    stored on said image.
     
    My point here is not to derail the discussion. All the points made are
    valid. But there are more issues to consider. One way to think of it is
    that capturing a disk image is necessary but not sufficient with respect to
    long term preservation of the artifacts contained within that image.
     
     
     
     
    Chris Prom <chris...@gmail.com> Mar 05 12:12PM -0600  

    I've enjoyed reading this thread, but my perspective here is a little bit different if the word "Save" from Nathan's original question is taken to mean "save permanently."
     
    Let me preface this by saying that creating a disk image is an essential part of the workflow for capturing, appraising, and processing records, for the reasons indicated. At Illinois, we capture a disk image for all of these reasons, whenever possible, and those can be articulated to donors. So, I appreciate the value of disk imaging (in spite of some skepticism that I had a few years ago).
     
    However, I believe there are legitimate cases where a repository may decide to discard the disk image after completing the archival 'business process' of capture, appraise, arrange, describe, store, leading to the generation of the 'archival information package'. For example:
     
    Case one: You have a disk image for a 3 TB hard drive that includes 20 GB of files, with the remainder marked as deleted space. Why store 3 TB of nothing if your storage infrastructure cannot handle it?
    Case two: A disk image contains many files that are marked deleted, but recoverable using forensics tools, and where the donor did not agree to preserve the deleted files. Yes, it's nice to think about saving deleted files, but does your donor agree with this? If not, you may be setting yourself up for a huge breach of trust.
    Case three: Related to Tom's point: a disk image where the files have been processed and migrated to a preservation format and you don't care about emulating or going back to the originals.
    Case four: A disk image which contains numerous files that, based on consultation with the donor and standard archival appraisal, have no continuing value.
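
    The arithmetic behind case one can be sketched in a few lines of Python.
    This assumes decimal units (1 TB = 1000 GB) and uses only the figures from
    the example above; the function name allocated_fraction is hypothetical.

```python
GB = 1000 ** 3  # decimal gigabyte, an assumption for this sketch
TB = 1000 ** 4  # decimal terabyte


def allocated_fraction(allocated_bytes, image_bytes):
    """Return the share of a raw disk image occupied by live files."""
    if image_bytes <= 0:
        raise ValueError("image_bytes must be positive")
    return allocated_bytes / image_bytes


fraction = allocated_fraction(20 * GB, 3 * TB)
print(f"{fraction:.2%} of the image holds files")  # prints "0.67% of the image holds files"
```

    In other words, an uncompressed raw image of that drive would spend more
    than 99% of its storage on space that holds no current files.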
     
    My point here is not to bash disk imaging, just to say that repositories should be very intentional in considering the role disk imaging will play in the overall digital curation program, by balancing resources, technology, and institutional capacity in a way that makes the most sense for the records, donor, and user community.
     
    Thanks,
     
    Chris Prom
    University of Illinois at Urbana-Champaign
    chris...@gmail.com
     
     
     
     
    L Snider <lsn...@gmail.com> Mar 05 02:23PM -0600  

    One word: authenticity. It helps prove what we did and how we did it (if
    we also properly document all steps, IMO).
     
    Cheers
     
    Lisa
     
    --
    Lisa Snider
    Electronic Records Archivist
    Harry Ransom Center
    The University of Texas at Austin
    P.O. Box 7219
    Austin, Texas 78713-7219
    P: 512-232-4616
    www.hrc.utexas.edu
     
     
     
     

