OAI-PMH harvesting not oai_dc compatible

Leif Longva

Mar 5, 2018, 5:15:44 AM
to Dataverse Users Community
Hi

We would like to have our Dataverse harvested by BASE (https://www.base-search.net/) and other harvesters. BASE tells us that our OAI-PMH output is not oai_dc compatible because it does not return the metadata correctly:
"... returns fields as dc:license, dc:dateSubmitted, dc:modified, dc:issued which looks more as qualified dc."

The problem seems to be that our OAI-PMH output contains terms from the dcterms namespace but presents them as dc elements (see http://dublincore.org/documents/dcmi-terms):
...
<dc:isReferencedBy> ... </dc:isReferencedBy>
<dc:dateSubmitted> ... </dc:dateSubmitted>
<dc:license>CC0</dc:license>
...
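
One way to check a record for this problem is sketched below. It assumes the record is available as an XML string and lists every element in the dc namespace that is not one of the 15 elements of unqualified Dublin Core. The sample record and the function name are hypothetical and only illustrate the idea; they are not taken from Dataverse.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# The 15 elements of the Dublin Core Metadata Element Set (unqualified DC).
DCMES_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def non_dcmes_elements(record_xml: str) -> list[str]:
    """Return local names used in the dc namespace that are not DCMES elements."""
    root = ET.fromstring(record_xml)
    prefix = "{" + DC_NS + "}"
    found = []
    for elem in root.iter():
        if elem.tag.startswith(prefix):
            local = elem.tag[len(prefix):]
            if local not in DCMES_ELEMENTS and local not in found:
                found.append(local)
    return found


# Hypothetical record; all values are placeholders.
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                       xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example dataset</dc:title>
  <dc:isReferencedBy>Example citation</dc:isReferencedBy>
  <dc:dateSubmitted>2018-01-01</dc:dateSubmitted>
  <dc:license>CC0</dc:license>
</oai_dc:dc>"""

print(non_dcmes_elements(SAMPLE))
```

An empty list would suggest that the record uses only unqualified Dublin Core element names.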

Could this be fixed in Dataverse, so that e.g.
<dc:isReferencedBy> is shown as <dc:relation>
<dc:dateSubmitted> is shown as <dc:date>
<dc:license> is shown as <dc:rights>,
in accordance with unqualified Dublin Core?
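
The renaming asked for above could be sketched as follows. This is only an illustration of the mapping, assuming the record is available as an XML string; it is not how Dataverse generates its export, and the function name and sample record are hypothetical. The mapping table covers just the three elements named in this message.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# Mapping taken from the three examples in the message above.
QUALIFIED_TO_SIMPLE = {
    "isReferencedBy": "relation",
    "dateSubmitted": "date",
    "license": "rights",
}


def to_simple_dc(record_xml: str) -> str:
    """Rename qualified terms in the dc namespace to their unqualified DC element."""
    # Keep the conventional prefixes when the tree is serialized again.
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("oai_dc", OAI_DC_NS)
    root = ET.fromstring(record_xml)
    prefix = "{" + DC_NS + "}"
    for elem in root.iter():
        if elem.tag.startswith(prefix):
            local = elem.tag[len(prefix):]
            if local in QUALIFIED_TO_SIMPLE:
                elem.tag = prefix + QUALIFIED_TO_SIMPLE[local]
    return ET.tostring(root, encoding="unicode")


# Hypothetical record; all values are placeholders.
RECORD = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                       xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:isReferencedBy>Example citation</dc:isReferencedBy>
  <dc:dateSubmitted>2018-01-01</dc:dateSubmitted>
  <dc:license>CC0</dc:license>
</oai_dc:dc>"""

print(to_simple_dc(RECORD))
```

Renaming loses the refinement (a submission date becomes a plain date), which is the trade-off unqualified Dublin Core implies.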

I see that OAI-PMH issues have been discussed previously in this User Community, but I do not see where this issue stands. Is this perhaps something that the OpenAIRE compliancy will address?

Yours
Leif Longva
UiT, Norway and 
DataverseNO

Philip Durbin

Mar 5, 2018, 9:08:41 AM
to dataverse...@googlegroups.com
Hi Leif,

Thank you for this bug report. Can you please create a GitHub issue about it at https://github.com/IQSS/dataverse/issues ? That way we can start tracking it and eventually estimate the effort required and work on it.

As Danny pointed out at https://groups.google.com/d/msg/dataverse-community/FG7Iuh-kj90/Z9Hb-XdpAgAJ, we did work on validating our OAI-PMH implementation for Dataverse 4.6 (https://github.com/IQSS/dataverse/issues/3307), but it sounds like there is still more work to do. We appreciate the feedback!

I don't know the exact scope of the OpenAIRE effort but I'm glad you mentioned it.

Thanks,

Phil

p.s. If you're curious where a particular GitHub issue stands, I would suggest doing a search in our kanban board (Waffle). For example, for OAI-PMH/harvesting, you could do searches like this (screenshots attached):

- https://waffle.io/IQSS/dataverse?search=oai-pmh
- https://waffle.io/IQSS/dataverse?search=harvesting

(You can also filter issues in Waffle/GitHub by issue "labels".)

This will give you a sense of where a particular issue is in our development process (Inbox, Backlog, Next Sprint, Development, Code Review, QA, Done). You can also visit https://dataverse.org/goals-roadmap-and-releases for a higher-level view of the direction we're taking. We make an effort to be transparent in our process, which I wrote about at https://opensource.com/open-organization/17/11/transparency-dataverse-project (see also https://groups.google.com/d/msg/dataverse-community/brxCn1E9tX0/VbsNz4u8BgAJ). I hope this helps.

--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataverse-community+unsub...@googlegroups.com.
To post to this group, send email to dataverse-community@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/eb960760-08e0-40f2-a866-bb1db334ad6c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
waffle-harvesting.png
waffle-oai-pmh.png

Bollini Andrea

Mar 9, 2018, 3:14:17 AM
to dataverse...@googlegroups.com

Dear all,

below you will find what we proposed and got funded by OpenAIRE:

"Dataverse supports the OAI-PMH v2 protocol, but, unfortunately, it is not yet compliant with OpenAIRE; therefore, we propose to implement the openAIRE guidelines for data repository, taking into account the latest DataCite metadata schema 4.1, released 23-Oct-2017: https://schema.datacite.org/ and the clarification about deviations that might be required by OpenAIRE in the preparation of the new version of the guidelines, e.g. how to expose funder and funding information, relation types, resource types, provided this clarification is delivered at latest by mid-March 2018. The development will be based on the 4.8.5 version of Dataverse, up to now the latest released version, paying attention to easy update and forward compatibility"

The deadline for this enhancement is 19th April, so we essentially plan to work on the export of the Dataverse metadata in the DataCite 4.1 format, the creation of the corresponding "wrapper" in OAI-PMH (the oai_datacite format) and the definition of the OAI-PMH set required by the OpenAIRE guidelines.
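
As a rough illustration of what a harvester would request once this work is in place, the sketch below builds an OAI-PMH ListRecords URL that combines the oai_datacite metadata prefix with a set. The base URL and the helper function are hypothetical, and the set name openaire_data is an assumption that should be checked against the OpenAIRE guidelines and the installation's own ListSets response.

```python
from urllib.parse import urlencode


def oai_request_url(base_url: str, verb: str, **params: str) -> str:
    """Build an OAI-PMH request URL from a verb and its arguments."""
    query = {"verb": verb}
    query.update(params)
    return base_url + "?" + urlencode(query)


# Hypothetical endpoint; replace with the installation's real OAI-PMH base URL.
BASE_URL = "https://dataverse.example.org/oai"

# "openaire_data" is an assumed set name, used here for illustration only.
url = oai_request_url(
    BASE_URL,
    "ListRecords",
    metadataPrefix="oai_datacite",
    set="openaire_data",
)
print(url)
```

The same helper can build an oai_dc request by passing metadataPrefix="oai_dc", which makes it easy to compare the two formats for one installation.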

Hope this clarifies,

Andrea



-- 
Andrea Bollini
Chief Technology and Innovation Officer

4Science,  www.4science.it
office: Via Edoardo D'Onofrio 304, 00155 Roma, Italy
mobile: +39 333 934 1808
skype: a.bollini
linkedin: andreabollini
orcid: 0000-0002-9029-1854

an Itway Group Company
Italy, France, Spain, Portugal, Greece, Turkey, Lebanon, Qatar, U.A.Emirates

-- 
This message has been checked by Libra ESVA and is believed to be clean.

Leif Longva

Mar 15, 2018, 9:31:39 AM
to Dataverse Users Community
Thank you, Philip and Andrea, for your responses.

From what you write, Andrea, I understand that the OpenAIRE compliancy project will make Dataverse oai_dc compatible. Is that correct?

Yours,
Leif

julian...@g.harvard.edu

Mar 15, 2018, 10:24:51 AM
to Dataverse Users Community
Hi Leif,

I don't see the connection between OpenAIRE compliancy and correcting Dataverse's oai_dc, but maybe I'm missing a broader recommendation that OpenAIRE has made about oai_dc?

Could the issue you described about dcterms showing up in your installation's oai_dc records be related to this issue: https://github.com/IQSS/dataverse/issues/3563? In it, Leonid shares how to delete oai_dc caches so that Dataverse can reexport records using an oai-pmh fix introduced in Dataverse 4.6.

Best,
Julian

Leif Longva

Mar 16, 2018, 4:28:22 AM
to dataverse...@googlegroups.com
Thanks, Julian

I guess I am revealing my lack of knowledge here. I was thinking that OpenAIRE uses OAI harvesting, and thus that OpenAIRE compliancy would do the trick.

We will look into this more closely next week.

Leif



Philip Durbin

Mar 26, 2018, 8:46:49 AM
to dataverse...@googlegroups.com
Leif, thanks for opening https://github.com/IQSS/dataverse/issues/4537 so that we can have some tracking around this issue.

Phil
