Islandora and METS

Islandora and METS Priscilla Caplan 5/3/12 8:59 AM
I am having trouble understanding METS support in Islandora.  I know
that Fedora can ingest and export METS objects, and I have seen several
references to the fact that Islandora can use METS:
-- http://islandora.ca/metadata_schemas
-- http://csul.net/sites/csul.fcla.edu/uploads/disc-findings-09-01-11.pdf
-- http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=13&ved=0CDEQFjACOAo&url=http%3A%2F%2Fwww.covenantuniversity.edu.ng%2Fcontent%2Fdownload%2F6582%2F46715%2Ffile%2Fomoregbe1.doc&ei=aqqiT9ciip7xBOqcud4I&usg=AFQjCNHrfNk4l2pxES6MyuIKj7USuChqow&sig2=zOprQFizwUZn5f4KD_fR_A

However, none of the Solution Packs use METS and I've been unable to
find an example of any publicly available site using METS.  Also, I
can't find any site that displays tables of contents for complex objects
such as books, which is how we use METS currently.

Can anybody point me to information about METS use in Islandora?

Priscilla Caplan
FCLA
This message has been hidden because it was flagged for abuse.
Re: [islandora] Islandora and METS Priscilla Caplan 5/3/12 3:39 PM
Thanks David,

METS files are wicked hard to create but we have a good METS-maker.  My
concern is whether Islandora can:
-- ingest METS files that have been created externally,
-- use METS structMaps to generate Tables of Contents and correct page
numbering,
-- update the METS file when datastreams are added to or removed from
the content object, or when labels are edited

This ties into the Table of Contents question I asked below, because
most systems I know use METS to create tables of contents and I have not
seen an Islandora implementation that displays Tables of Contents, so
I'm wondering whether this is even possible.

Priscilla
This message has been hidden because it was flagged for abuse.
Re: Islandora and METS Aaron Brenner 5/4/12 6:55 AM
Hi Priscilla,

We're in a similar situation -- we have METS for all paged objects, and we've used the METS structMap in the past to create tables of contents and to drive page numbering when presenting these objects online.

We're just beginning to work with Islandora, and after working with DiscoveryGarden are running a pilot site with a new version of the Book Solution Pack that uses the Internet Archive BookReader for content viewing.  We talked about wanting to display our METS-derived TOC data there, and brainstormed some possible strategies, but were not able to directly address the issue in the scope of the project so far. (I can share examples of how other IA BookReader users display text structure on the front-end if you're interested.)  As it stands now, we are ingesting METS with our text objects into Fedora, but are not doing anything with it in the Islandora layer.  So I remain very interested in getting this working.

On your other question about using METS data to generate page numbering, we are doing this by parsing the METS document at the time we're ingesting page objects into Fedora.  This is happening in a Python script that automates the ingest process.  In our case, we know the file name for each page, but need to retrieve the corresponding label from the METS, so this function parses the METS once for each book and returns a dictionary object that contains the METS page labels, keyed by file names.  Maybe it's of interest?  It requires the lxml library for the parsing, and assumes that pages are in mets:div[@TYPE="page"] elements.

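from lxml import etree

def get_page_label_dict_from_mets(mets_path):
    """
    Parse the METS structMap to get proper page label
    """
    METS_NS_MAP = {'mets': 'http://www.loc.gov/METS/'}
    # lxml can parse directly from a path, so no file handle is left open
    mets = etree.parse(mets_path)
    labels = {}
    # one mets:file element per page image
    for file_el in mets.iter('{http://www.loc.gov/METS/}file'):
        file_id = file_el.get('ID')
        # the first child (mets:FLocat) carries the file name in xlink:href
        file_name = file_el[0].get('{http://www.w3.org/1999/xlink}href')
        # find the page div whose fptr points at this file
        xpath_string = '//mets:div[@TYPE="page"]/mets:fptr[@FILEID="%s"]' % (file_id,)
        # raises IndexError if a file has no matching page div
        fptr = mets.xpath(xpath_string, namespaces=METS_NS_MAP)[0]
        label = fptr.getparent().get('LABEL')
        labels[file_name] = label
    return labels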
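
On the table of contents side, the same structMap could in principle be walked to pull out the labelled divisions above the page level. What follows is only a minimal sketch, not something we have running in Islandora: it uses the standard library's xml.etree.ElementTree instead of lxml, the sample METS is made up for illustration, and the name build_toc and the choice to skip TYPE="page" divs are assumptions.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# Hypothetical sample document, made up for illustration only.
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:structMap TYPE="physical">
    <mets:div TYPE="book" LABEL="Sample Book">
      <mets:div TYPE="chapter" LABEL="Chapter 1">
        <mets:div TYPE="page" LABEL="1"/>
        <mets:div TYPE="page" LABEL="2"/>
      </mets:div>
      <mets:div TYPE="chapter" LABEL="Chapter 2">
        <mets:div TYPE="page" LABEL="3"/>
      </mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>"""


def build_toc(mets_root, skip_types=("page",)):
    """Return a flat list of (depth, type, label) from the first structMap."""
    div_tag = "{%s}div" % METS_NS
    struct_map = mets_root.find("{%s}structMap" % METS_NS)
    toc = []

    def walk(div, depth):
        div_type = div.get("TYPE")
        label = div.get("LABEL")
        # keep labelled divisions, but leave individual pages out of the TOC
        if label and div_type not in skip_types:
            toc.append((depth, div_type, label))
        for child in div.findall(div_tag):
            walk(child, depth + 1)

    if struct_map is not None:
        for top_div in struct_map.findall(div_tag):
            walk(top_div, 0)
    return toc


toc = build_toc(ET.fromstring(SAMPLE_METS))
for depth, div_type, label in toc:
    # indent each entry by its nesting depth
    print("  " * depth + label)
```

The depth value is kept so that a front end could render nested entries; how that list would actually be displayed in the Islandora layer is still the open question.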

-AB
--
Coordinator, Digital Research Library
University Library System
University of Pittsburgh
7500 Thomas Blvd., Room 306
Pittsburgh, PA 15260