Islandora and METS

Priscilla Caplan

May 3, 2012, 11:59:35 AM
to isla...@googlegroups.com
I am having trouble understanding METS support in Islandora. I know
that Fedora can ingest and export METS objects, and I have seen several
references to the fact that Islandora can use METS:
-- http://islandora.ca/metadata_schemas
-- http://csul.net/sites/csul.fcla.edu/uploads/disc-findings-09-01-11.pdf
-- http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=13&ved=0CDEQFjACOAo&url=http%3A%2F%2Fwww.covenantuniversity.edu.ng%2Fcontent%2Fdownload%2F6582%2F46715%2Ffile%2Fomoregbe1.doc&ei=aqqiT9ciip7xBOqcud4I&usg=AFQjCNHrfNk4l2pxES6MyuIKj7USuChqow&sig2=zOprQFizwUZn5f4KD_fR_A

However, none of the Solution Packs use METS and I've been unable to
find an example of any publicly available site using METS. Also, I
can't find any site that displays tables of contents for complex objects
such as books, which is how we use METS currently.

Can anybody point me to information about METS use in Islandora?

Priscilla Caplan
FCLA

David Wilcox

May 3, 2012, 6:11:28 PM
to isla...@googlegroups.com
Hi Priscilla,

Islandora supports a variety of metadata standards using the XML Forms
modules. You can create a METS ingest form by referencing the schema
and building the form using the XML Form Builder. However, some
knowledge of XML and XPath is required to build complex forms. You can
find some information and tutorials here:
https://wiki.duraspace.org/display/ISLANDORA6121/Chapter+6+-++Islandora+and+Ingest+Forms.

That being said, if anyone has already created a METS form you could
import it and edit it (which is much easier than creating a form from
scratch).

Once you have a METS form you can create objects with a METS metadata
Datastream and export that Datastream as XML.

Hope this helps!

David

--
David Wilcox, BA, MLIS
Islandora Training/Support Coordinator
Robertson Library
University of Prince Edward Island
dwi...@upei.ca
Skype Name: david.wilcox82
902.620.5167

Priscilla Caplan

May 3, 2012, 6:39:46 PM
to isla...@googlegroups.com
Thanks David,

METS files are wicked hard to create but we have a good METS-maker. My
concern is whether Islandora can:
-- ingest METS files that have been created externally,
-- use METS structMaps to generate Tables of Contents and correct page
numbering,
-- update the METS file when datastreams are added to or removed from
the content object, or when labels are edited

This ties into the Table of Contents question I asked below, because
most systems I know use METS to create tables of contents and I have not
seen an Islandora implementation that displays Tables of Contents, so
I'm wondering whether this is even possible.

Priscilla

David Wilcox

May 4, 2012, 8:32:11 AM
to isla...@googlegroups.com
Hi Priscilla,

If you already have XML forms (created using Oxygen, for example) you
can import these forms into the XML Form Builder and edit them in the
interface.

Support for METS structMaps doesn't currently exist, but if you're
interested in pursuing this kind of customization you can contact
DiscoveryGarden (http://discoverygarden.ca) for more information.

With regard to your third question: you would normally associate your
METS form with one or more content models in your Islandora
installation. Whenever you create a new object using that content
model you would then fill out the METS metadata form - this would save
this form as a Datastream and crosswalk the metadata to the default DC
Datastream. Whenever you edit the metadata for that object you would
do so by making changes to the METS form itself - these changes would
then be crosswalked back to the DC form. So your METS form will always
be up-to-date.

Adding Datastreams to the object works a bit differently - normally
the Datastreams associated with the object aren't recorded on the
metadata form. Instead, the Datastreams can be viewed in the object's
Detailed List of Content. I'm not very familiar with METS though -
would you normally keep track of individual Datastreams within the
form itself?

I'm also not aware of any current Islandora installations that use
METS to create tables of contents - again, this is likely possible to
implement, but it would require some customization at the code level.
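
As a rough illustration of what the METS side of such a customization might involve, here is a minimal Python sketch that walks a structMap and collects a nested table of contents. It is not an Islandora feature or API: the sample document, the function name toc_entries, and the choice to skip divs with TYPE="page" are all assumptions made for the example, and it uses only the standard library.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# Hypothetical sample; a real METS file carries much more (metsHdr, fileSec, ...).
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:structMap TYPE="physical">
    <mets:div TYPE="book" LABEL="Sample Book">
      <mets:div TYPE="chapter" LABEL="Chapter 1">
        <mets:div TYPE="page" LABEL="1"/>
        <mets:div TYPE="page" LABEL="2"/>
      </mets:div>
      <mets:div TYPE="chapter" LABEL="Chapter 2">
        <mets:div TYPE="page" LABEL="3"/>
      </mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>"""


def toc_entries(root, skip_types=("page",)):
    """Return (depth, label) tuples for labelled structMap divs, in document order."""
    div_tag = "{%s}div" % METS_NS
    entries = []

    def walk(div, depth):
        label = div.get("LABEL")
        # Keep labelled divs, except the types we chose to leave out (assumption: pages).
        if label and div.get("TYPE") not in skip_types:
            entries.append((depth, label))
        for child in div.findall(div_tag):
            walk(child, depth + 1)

    # Only the first structMap is read; a file may contain several.
    struct_map = root.find("{%s}structMap" % METS_NS)
    if struct_map is not None:
        for top in struct_map.findall(div_tag):
            walk(top, 0)
    return entries


toc = toc_entries(ET.fromstring(SAMPLE_METS))
for depth, label in toc:
    print("  " * depth + label)
```

Displaying the result in a viewer, and keeping it in sync when objects change, would still need work in the Islandora layer.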

Hope this helps!

David

Aaron Brenner

May 4, 2012, 9:55:45 AM
to isla...@googlegroups.com
Hi Priscilla,

We're in a similar situation -- we have METS for all paged objects, and we've used the METS structMap in the past to create tables of contents and to drive page numbering when presenting these objects online.

We're just beginning to work with Islandora, and after working with DiscoveryGarden are running a pilot site with a new version of the Book Solution Pack that uses the Internet Archive BookReader for content viewing.  We talked about wanting to display our METS-derived TOC data there, and brainstormed some possible strategies, but were not able to directly address the issue in the scope of the project so far. (I can share examples of how other IA BookReader users display text structure on the front-end if you're interested.)  As it stands now, we are ingesting METS with our text objects into Fedora, but are not doing anything with it in the Islandora layer.  So I remain very interested in getting this working.

On your other question about using METS data to generate page numbering, we are doing this by parsing the METS document at the time we're ingesting page objects into Fedora. This happens in a Python script that automates the ingest process. In our case, we know the file name for each page but need to retrieve the corresponding label from the METS, so this function parses the METS once for each book and returns a dictionary of METS page labels keyed by file name. Maybe it's of interest? It requires the lxml library for the parsing, and assumes that pages are in mets:div[@TYPE="page"] elements.

from lxml import etree

METS_NS_MAP = {'mets': 'http://www.loc.gov/METS/'}
XLINK_HREF = '{http://www.w3.org/1999/xlink}href'

def get_page_label_dict_from_mets(mets_path):
    """
    Parse the METS structMap to get the proper page labels.

    Returns a dict of page labels keyed by file name.
    """
    mets = etree.parse(mets_path)
    labels = {}
    for file_el in mets.iter('{http://www.loc.gov/METS/}file'):
        file_id = file_el.get('ID')
        # Assumes the first child of mets:file is its mets:FLocat.
        file_name = file_el[0].get(XLINK_HREF)
        xpath_string = '//mets:div[@TYPE="page"]/mets:fptr[@FILEID="%s"]' % (file_id,)
        fptrs = mets.xpath(xpath_string, namespaces=METS_NS_MAP)
        if not fptrs:
            continue  # this file is not referenced by any page div
        labels[file_name] = fptrs[0].getparent().get('LABEL')
    return labels
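
For anyone who wants to try the idea without lxml, here is a small standard-library variant of the same lookup. It is a sketch rather than the script described above: the sample METS, the function name page_labels_by_file_name, and the two-pass approach (index the files first, then walk the page divs once instead of running one XPath query per file) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical two-page sample; file names and IDs are made up.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                       xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp>
      <mets:file ID="F1"><mets:FLocat LOCTYPE="URL" xlink:href="0001.tif"/></mets:file>
      <mets:file ID="F2"><mets:FLocat LOCTYPE="URL" xlink:href="0002.tif"/></mets:file>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap>
    <mets:div TYPE="book" LABEL="Sample">
      <mets:div TYPE="page" LABEL="i"><mets:fptr FILEID="F1"/></mets:div>
      <mets:div TYPE="page" LABEL="1"><mets:fptr FILEID="F2"/></mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>"""


def page_labels_by_file_name(root):
    """Return a dict of page LABELs keyed by the file name in each FLocat."""
    # Pass 1: map each mets:file ID to the href of its mets:FLocat, if it has one.
    names = {}
    for file_el in root.iter(METS + "file"):
        flocat = file_el.find(METS + "FLocat")
        if flocat is not None:
            names[file_el.get("ID")] = flocat.get(XLINK_HREF)

    # Pass 2: for each page div, record its LABEL under every file it points to.
    labels = {}
    for div in root.iter(METS + "div"):
        if div.get("TYPE") != "page":
            continue
        for fptr in div.findall(METS + "fptr"):
            name = names.get(fptr.get("FILEID"))
            if name is not None:
                labels[name] = div.get("LABEL")
    return labels


labels = page_labels_by_file_name(ET.fromstring(SAMPLE))
print(labels)
```

For a file on disk, ET.parse(path).getroot() would supply the root element in place of ET.fromstring(SAMPLE).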

-AB
--
Coordinator, Digital Research Library
University Library System
University of Pittsburgh
7500 Thomas Blvd., Room 306
Pittsburgh, PA 15260