Mapping ETD metadata sent from ProQuest to DSpace via SWORDv1


tuf0...@temple.edu

Nov 30, 2018, 1:38:25 PM
to DSpace Technical Support
Group-  

We've configured our SWORDv1 endpoint, and ProQuest has successfully delivered to our repository through it.  ProQuest support says that mapping is done on the DSpace side rather than on their end, but I'm unclear on how this works.  Could someone with more experience with the ProQuest->DSpace workflow start me down the right path?

Would we be creating an XML/.XSL map along the lines of the file I'm attaching?  Or would we need to create another type of document?  How would this be added to DSpace/SWORD?  

Thank you so much for your help!  

-Gabe Galson
diss-to-dc (1).xsl

tuf0...@temple.edu

Nov 30, 2018, 1:43:54 PM
to DSpace Technical Support
To clarify: by mapping, I mean mapping ProQuest ETD Admin XML field 'X' to DSpace Dublin Core field 'Y'.  How would we get "//DISS_description/DISS_categorization/DISS_category/DISS_cat_desc" to map to "dc.subject"?

Thanks!

-Gabe

tuf0...@temple.edu

Dec 13, 2018, 1:30:21 PM
to DSpace Technical Support
Checking one more time.  Does anyone have experience using SWORD to crosswalk metadata?  

-Gabe

Tim Donohue

Dec 13, 2018, 2:04:19 PM
to tuf0...@temple.edu, DSpace Technical Support
Hi Gabe,

I don't have direct experience with mapping *ProQuest* fields into DSpace fields.  However, I do have a few possible clues on how DSpace SWORD does metadata mapping that could help you out here.

By default, the SWORD (v1) interface just uses DSpace's built-in "METS" Ingester for all SWORD imports.  That's set by this configuration in sword-server.cfg: https://github.com/DSpace/DSpace/blob/dspace-6_x/dspace/config/modules/sword-server.cfg#L21

The default "METS" ingester uses a specific Crosswalk, based on the "mdtype" specified in the METS metadata. Those are configured in your dspace.cfg here:
https://github.com/DSpace/DSpace/blob/dspace-6_x/dspace/config/dspace.cfg#L563-L574 

For SWORD, the recommended format is often EPDCX (EPrints DC XML), but whatever format is specified in the METS is the format that DSpace will try to crosswalk.  If the format is specified as EPDCX, then DSpace will use this configuration:
https://github.com/DSpace/DSpace/blob/dspace-6_x/dspace/config/dspace.cfg#L438 

(If you want to see what a SWORD METS package with EPDCX specified as the "mdtype" looks like, we have an example that is distributed with DSpace at: https://github.com/DSpace/DSpace/blob/dspace-6_x/dspace-sword/example/example.zip)

That configuration tells DSpace that anything claiming to be "EPDCX" should be mapped using the XSLT stylesheet at [dspace]/config/crosswalks/sword-swap-ingest.xsl.  That specific XSLT stylesheet is what translates EPDCX to DSpace's internal Dublin Core fields: https://github.com/DSpace/DSpace/blob/dspace-6_x/dspace/config/crosswalks/sword-swap-ingest.xsl

This entire process is somewhat documented in the SWORD configuration docs at: https://wiki.duraspace.org/dsdoc6x/using-dspace/ingesting-content-and-metadata/swordv1-server (You'll see all these configurations mentioned there.)

So, if you wanted to change this to map ProQuest to DSpace, you'd first want to see what these ProQuest METS files look like (and the "mdtype" they report).  Then, you can add in a *new* set of these configurations to map that "mdtype" to an XSLT mapper.  
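
As a rough orientation for what to look for in the manifest, here is a minimal sketch of a METS descriptive metadata section. It is a hypothetical fragment, not taken from a real ProQuest deposit: the ID value, the "PROQUEST" type name, and the wrapped content are all assumptions. The attributes to check are MDTYPE and, when MDTYPE is "OTHER", OTHERMDTYPE on the mdWrap element.

```xml
<!-- Hypothetical fragment: not taken from a real ProQuest deposit -->
<mets xmlns="http://www.loc.gov/METS/">
  <dmdSec ID="dmd_1">
    <!-- The "mdtype" is read from MDTYPE, or from OTHERMDTYPE when MDTYPE is "OTHER" -->
    <mdWrap MDTYPE="OTHER" OTHERMDTYPE="PROQUEST">
      <xmlData>
        <!-- The wrapped metadata that the crosswalk would receive goes here -->
      </xmlData>
    </mdWrap>
  </dmdSec>
</mets>
```

Whatever value actually appears in a real deposit is the name to use in the crosswalk configuration.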

So, assuming ProQuest's files say something like "MDTYPE='PROQUEST'" (or "MDTYPE='OTHER' OTHERMDTYPE='PROQUEST'"), you should be able to do something like this:

mets.submission.crosswalk.PROQUEST = PROQUEST
crosswalk.submission.PROQUEST.stylesheet = crosswalks/my-proquest-mapper.xsl  

Then, you'd just need to be sure your "my-proquest-mapper.xsl" is able to map all the necessary ProQuest metadata fields into DSpace Dublin Core (similar to how "sword-swap-ingest.xsl" does).
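
A minimal sketch of what such a stylesheet might look like follows, covering only the dc.subject mapping asked about earlier in the thread. It follows the same general shape as "sword-swap-ingest.xsl", emitting fields in DSpace's DIM format. Two things are assumptions here: that the crosswalk receives the ProQuest XML with the DISS_description path quoted above, and that those elements are not in an XML namespace. Check both against a real deposit before relying on it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:dim="http://www.dspace.org/xmlns/dspace/dim">

    <!-- Suppress text nodes that no template below handles explicitly -->
    <xsl:template match="text()"/>

    <!-- Wrap all output in a single DIM container, whatever the root element is named -->
    <xsl:template match="/*">
        <dim:dim>
            <xsl:apply-templates/>
        </dim:dim>
    </xsl:template>

    <!-- Assumed source path (from the question above): one dc.subject per category description -->
    <xsl:template match="DISS_description/DISS_categorization/DISS_category/DISS_cat_desc">
        <dim:field mdschema="dc" element="subject">
            <xsl:value-of select="normalize-space(.)"/>
        </dim:field>
    </xsl:template>

</xsl:stylesheet>
```

Further fields would be added as additional templates of the same form, each emitting a dim:field with the appropriate element (and qualifier attribute, if any).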

So, maybe that'd give you some clues on how you could get started!

Tim



--
Tim Donohue
Technical Lead for DSpace & DSpaceDirect
DuraSpace.org | DSpace.org | DSpaceDirect.org

tuf0...@temple.edu

Dec 13, 2018, 2:20:17 PM
to DSpace Technical Support
Tim-  

Thank you so much!!  This is just the overview I needed to move forward.  Keep up the good work!

-Gabe

Robert Wilson

May 5, 2020, 12:48:32 PM
to DSpace Technical Support
Hi Gabe,

I stumbled across this thread while researching my own IR's lack of crosswalking from ProQuest SWORD into DSpace. I'm curious, did you make any headway with this?

Cheers,
Robert Wilson
Walker Library, MTSU

Bram Luyten

May 7, 2020, 3:08:23 AM
to Robert Wilson, DSpace Technical Support
Good morning,

We have configured and customised ProQuest mappings for a number of clients.
In the deposits that we have seen, ProQuest uses multiple dmdSec entries, while SWORDv1 is generally geared toward selecting and processing only one.

We found it helpful to customize

to read from multiple dmdsec entries

Hope this helps!

Bram

Bram Luyten
250-B Suite 3A, Lucius Gordon Drive, West Henrietta, NY 14586
Gaston Geenslaan 14, 3001 Leuven, Belgium
DSpace Express Hosting - Open Repository Hosting - Custom DSpace Services

