Access information in metadata complying with OpenAIRE Guidelines


Philipp at UiT

Jul 12, 2017, 2:27:11 AM
to Dataverse Users Community
The EU-sponsored research infrastructure project openAIRE aims to promote open scholarship and substantially improve the discoverability and reusability of research publications and data. Its guidelines have by now gained the status of de facto standards for providers of open access (OA) research publications and data. The Guidelines for Data Archives specify, among other things, what metadata research data archives should provide. The section on rights names two pieces of information that should be in place:

1. Information about access: closedAccess, embargoedAccess, restrictedAccess, openAccess
2. Information about license(s)

We find information about license(s) in the metadata provided by Dataverse, in the Dublin Core and JSON format, but not in the DDI format.
However, we could not find any information about access in the Dataverse metadata.

For us, and I guess for other Dataverse installations/users in Europe, compliance with the openAIRE guidelines is important. So I wonder whether the missing information about access and license(s) could be added in a future version?

Best,
Philipp

Philip Durbin

Jul 12, 2017, 6:50:32 AM
to dataverse...@googlegroups.com
Hi Philipp,

The openAIRE project sounds interesting. Thanks.

The JSON export* format in Dataverse is something we invented ourselves, so we can add whatever information we want without worrying about complying with any standards. Perhaps the DDI and Dublin Core formats can also include extra information, but someone would need to make sure we're putting it in the right fields. I'd say you should go ahead and create a GitHub issue. Since we're trying to work in small chunks, maybe that GitHub issue could be about adding the information to the JSON export format.

I hope this helps,

Phil

* http://guides.dataverse.org/en/4.7/admin/metadataexport.html

--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataverse-community+unsub...@googlegroups.com.
To post to this group, send email to dataverse-community@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/f58b85b3-6127-4212-8be2-025c406cb430%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--

Pete Meyer

Jul 12, 2017, 11:02:51 AM
to Dataverse Users Community
Hi Philipp,

Could you provide any more information, or pointers to information, about what "mandatory when applicable" (https://guidelines.openaire.eu/en/latest/data/field_rights.html) means in this context? 

Best,
Pete

Sebastian Karcher

Jul 12, 2017, 1:06:31 PM
to dataverse...@googlegroups.com
I haven't tested this, but based on the Metadata table that I got from Julian, access information should be included in the DDI:

DV Label                       DDI 2.5
Waiver                         2.4.2 useStmt
Terms of Use                   2.4.2 useStmt
Confidentiality Declaration    2.4.2.1 confDec
Special Permissions            2.4.2.2 specPerm
Restrictions                   2.4.2.3 restrctn
Citation Requirements          2.4.2.5 citeReq
Depositor Requirements         2.4.2.6 deposReq
Conditions                     2.4.2.7 conditions
Disclaimer                     2.4.2.8 disclaimer
Data Availability
Terms of Access                2.4 dataAccs
Data Access Place              2.4.4.1 accsPlac
Original Archive               2.4.4.2 origArch
Availability Status            2.4.4.3 avlStatus
Contact for Access             2.4.2.4 contact
Size of Collection             2.4.4.4 collSize
Study Completion               2.4.4.5 complete
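
One way to test this against a real export is to scan the DDI XML for the element names in the table. The following is only a minimal sketch in Python: the sample document, its text values, and the helper name find_access_metadata are made up for illustration and are not taken from an actual Dataverse export. It matches elements by name only, ignoring namespace and position, so an element such as contact would also match if it appears elsewhere in the codebook.

```python
import xml.etree.ElementTree as ET

# Element names taken from the DV Label / DDI 2.5 table above.
ACCESS_ELEMENTS = (
    "useStmt", "confDec", "specPerm", "restrctn", "citeReq", "deposReq",
    "conditions", "disclaimer", "accsPlac", "origArch", "avlStatus",
    "contact", "collSize", "complete",
)

# Hypothetical sample, NOT a real Dataverse export; values are invented.
SAMPLE_DDI = """<codeBook xmlns="ddi:codebook:2_5">
  <stdyDscr>
    <dataAccs>
      <setAvail><avlStatus>Files restricted until 2019</avlStatus></setAvail>
      <useStmt><restrctn>Request access from the depositor</restrctn></useStmt>
    </dataAccs>
  </stdyDscr>
</codeBook>"""


def local_name(tag: str) -> str:
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def find_access_metadata(ddi_xml: str) -> dict:
    """Return {element name: [text, ...]} for every access-related element found."""
    root = ET.fromstring(ddi_xml)
    found = {}
    for element in root.iter():
        name = local_name(element.tag)
        if name in ACCESS_ELEMENTS:
            # itertext() also collects text from child elements.
            text = "".join(element.itertext()).strip()
            found.setdefault(name, []).append(text)
    return found


if __name__ == "__main__":
    for name, values in find_access_metadata(SAMPLE_DDI).items():
        print(name, values)
```

Any element from the table that is missing from the returned dictionary is simply absent from the export being checked.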

--
Sebastian Karcher, PhD
www.sebastiankarcher.com

Gautier, Julian

Jul 13, 2017, 4:49:20 PM
to dataverse...@googlegroups.com
Hi Philipp,

Just confirming that all of the metadata Sebastian pointed out appears in the DDI that a user can export for each dataset that Dataverse publishes. None of it is mandatory. (You also made me realize that some of this metadata is in the wrong place in the XML according to the DDI Codebook 2.5 schema.)

I'm also curious about the definition of mandatory when applicable. 


--
Julian Gautier
Product Research Specialist, IQSS

Gautier, Julian

Jul 13, 2017, 5:03:32 PM
to dataverse...@googlegroups.com
"OpenAIRE expects metadata to be encoded in the DataCite metadata format."

It's not clear to me whether the mapping in Dataverse's crosswalk between (1) Dataverse's terms of use and access metadata and (2) the related metadata in DataCite is actually being used. This GitHub issue suggests adding more DataCite fields, although it doesn't mention access metadata.

--
Julian Gautier
Product Research Specialist, IQSS

julian...@g.harvard.edu

Jul 14, 2017, 6:01:16 PM
to Dataverse Users Community
Just correcting something I wrote earlier. The contact metadata is mandatory. (Although a bug, reported here, is preventing contact emails from being included in the DDI export if the optional contact name is empty when the dataset is published.)

Philipp at UiT

Jul 17, 2017, 7:39:56 AM
to Dataverse Users Community
Sorry for my late reply.

Information about license(s):
I noticed that information about the licence is provided in DDI. It shows up in the field useStmt.

Information about access:
According to openAIRE, "mandatory when applicable" means "when the value of the field can be obtained it must be present in the metadata record". But I'm not quite sure how Dataverse can provide the correct access information, cf. the values closedAccess, embargoedAccess, restrictedAccess, openAccess. For instance, this dataset is licensed under CC0, but apart from the README file, all files are restricted. Does this mean that the value restrictedAccess has to be used? Actually, as stated under Terms of Access in the Terms section, the files are embargoed until the 1st of January 2019 at the latest. So here, embargoedAccess would be most appropriate. However, as of today, there is no machine-readable embargo function in Dataverse (cf. this issue). For embargoed access, the embargo date is also required.

OpenAIRE expects metadata to be encoded in the DataCite metadata format, so I guess information about access should be placed in the sections Rights (DataCite Metadata ID 16) and rightsURI (ID 16.1).
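
To make the question concrete, here is a rough sketch in Python of how an access value could be derived and written into a DataCite rights element. Everything in it is an assumption for illustration: the function names, the rule that any restricted file makes the dataset restricted, and the embargo date passed in by hand (Dataverse has no such field today). The rightsURI values use the info:eu-repo/semantics vocabulary, as I understand the guidelines. The sketch never returns closedAccess, because choosing between closed and restricted seems to need a human decision.

```python
import xml.etree.ElementTree as ET
from datetime import date
from typing import Optional, Sequence

# Access values from the guidelines mapped to info:eu-repo URIs (assumed encoding).
ACCESS_URIS = {
    "closedAccess": "info:eu-repo/semantics/closedAccess",
    "embargoedAccess": "info:eu-repo/semantics/embargoedAccess",
    "restrictedAccess": "info:eu-repo/semantics/restrictedAccess",
    "openAccess": "info:eu-repo/semantics/openAccess",
}


def access_level(files_restricted: Sequence[bool],
                 embargo_end: Optional[date],
                 today: date) -> str:
    """Derive an access value from per-file restriction flags (assumed rule)."""
    if not any(files_restricted):
        return "openAccess"
    # At least one file is restricted from here on.
    if embargo_end is not None and today < embargo_end:
        return "embargoedAccess"
    return "restrictedAccess"


def rights_element(level: str) -> str:
    """Serialize a DataCite-style rights element carrying the access URI."""
    element = ET.Element("rights", {"rightsURI": ACCESS_URIS[level]})
    element.text = level
    return ET.tostring(element, encoding="unicode")


if __name__ == "__main__":
    # The dataset described above: open README, other files restricted,
    # embargo ending 1 January 2019, evaluated on the date of this message.
    level = access_level([False, True, True], date(2019, 1, 1), date(2017, 7, 17))
    print(rights_element(level))
```

For the example dataset this yields embargoedAccess until the embargo date passes and restrictedAccess afterwards, unless the files are released in the meantime.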

In addition, in order to be made visible in openAIRE search interfaces, metadata also must include funding information.

Maybe some other Dataverse users in Europe have some more thoughts about this?

Best,
Philipp

Pete Meyer

Jul 17, 2017, 10:27:44 AM
to Dataverse Users Community
Hi Philipp,


Thanks for the clarification of "mandatory when applicable". I'd asked about that because it seems as though having software evaluate when something is applicable could be tricky. It sounds like interpreting that is still something that a person would be doing.

Best,
Pete
 