OAI-PMH harvest from external system to DSpace collection

Jakub Řihák

Mar 29, 2018, 5:42:57 AM
to DSpace Technical Support
Hello everyone,

I am trying to set the Content source of a DSpace collection to an OAI-PMH set from an external system. I would like to harvest the metadata of all items in the external OAI-PMH set and use a custom-made crosswalk to map the metadata from the external OAI set to the metadata format defined in our DSpace instance.

According to the documentation for DSpace 5, the oai.cfg file has the following OAI harvester configuration option:

harvester.oai.metadataformats.PluginName
 
 

This field can be repeated and serves as a link between the metadata formats supported by the local repository and those supported by the remote OAI-PMH provider. It follows the form harvester.oai.metadataformats.PluginName = NamespaceURI,Optional Display Name . The pluginName designates the metadata schemas that the harvester "knows" the local DSpace repository can support. Consequently, the PluginName must correspond to a previously declared ingestion crosswalk. The namespace value is used during negotiation with the remote OAI-PMH provider, matching it against a list returned by the ListMetadataFormats request, and resolving it to whatever metadataPrefix the remote provider has assigned to that namespace. Finally, the optional display name is the string that will be displayed to the user when setting up a collection for harvesting. If omitted, the PluginName:NamespaceURI combo will be displayed instead.
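To make the negotiation step in the quoted documentation more concrete, here is a minimal Python sketch of the idea, not DSpace's actual implementation. It splits a configuration line into plugin name, namespace, and optional display name, then looks up the namespace in a ListMetadataFormats response to find the metadataPrefix the remote provider uses. The plugin name `custom`, the namespace `http://example.org/custom-metadata/`, the prefix `custom_md`, and the sample response are all hypothetical and only serve as illustration.

```python
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"

# Hypothetical, abbreviated ListMetadataFormats response (illustration only).
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat>
      <metadataPrefix>oai_dc</metadataPrefix>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </metadataFormat>
    <metadataFormat>
      <metadataPrefix>custom_md</metadataPrefix>
      <metadataNamespace>http://example.org/custom-metadata/</metadataNamespace>
    </metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>
"""


def parse_format_line(line):
    """Split 'harvester.oai.metadataformats.Name = NamespaceURI, Display Name'."""
    key, _, value = line.partition("=")
    plugin_name = key.strip().rsplit(".", 1)[-1]
    namespace, _, display = value.strip().partition(",")
    # The display name is optional; return None when it is omitted.
    return plugin_name, namespace.strip(), display.strip() or None


def resolve_prefix(response_xml, namespace):
    """Return the remote metadataPrefix registered for a namespace, or None."""
    root = ET.fromstring(response_xml)
    for fmt in root.iter("{%s}metadataFormat" % OAI_NS):
        remote_ns = fmt.findtext("{%s}metadataNamespace" % OAI_NS, default="")
        if remote_ns.strip() == namespace:
            return fmt.findtext("{%s}metadataPrefix" % OAI_NS)
    return None


# Hypothetical configuration line for a custom ingestion crosswalk.
line = ("harvester.oai.metadataformats.custom = "
        "http://example.org/custom-metadata/, Custom Format")
plugin, ns, label = parse_format_line(line)
print(plugin, resolve_prefix(SAMPLE_RESPONSE, ns), label)
```

If the remote provider does not advertise the configured namespace, `resolve_prefix` returns `None`. In this sketch that corresponds to the case where no matching format can be negotiated.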


There is the following advice: 
 Consequently, the PluginName must correspond to a previously declared ingestion crosswalk.

I tried to find out how exactly I should declare a new crosswalk, but have found nothing comprehensive so far. Does anyone have experience with defining and declaring a new OAI ingestion crosswalk?

The idea behind this is to be able to parse as much information from the external OAI set as possible. When using the standard Simple Dublin Core metadata format (setting the collection to have an external content source in Edit collection -> Content source and then selecting Simple Dublin Core from the Metadata format list in the setup form), we are only able to parse a very limited amount of information from the external OAI-PMH set.

I tried to set up a new OAI metadata format (as described in the OAI-PMH dissemination documentation) and to reference this new format in oai.cfg, but it didn't work. As far as I understand it, a dissemination metadata format is not the same thing as an OAI-PMH ingestion metadata format.

We are running DSpace 5.6 with XMLUI. 

Thank you for any advice or suggestion,
with best regards,

Jakub Řihák

Charles University, Central Library

Kiszely András

Nov 28, 2018, 3:11:02 PM
to DSpace Technical Support
Up.
Others and I are struggling with the same problem.
Did you find an answer?