publisher: repository or provider?


Penny Labropoulou

Feb 3, 2022, 3:39:42 AM
to DataCite Metadata

Dear Metadata Working Group,

We would like some clarifications as to the notion of "publisher" in the DataCite schema, and how it should be applied in the context of data/software repositories.

The European Language Grid platform (ELG, https://live.european-language-grid.eu/) hosts and makes available language resources and technologies (datasets and software), for which we intend to issue DOIs through DataCite (discussions have been initiated).

We are currently looking into the metadata and we are not sure about the value of "publisher". The DataCite definition is "The name of the entity that holds, archives, publishes prints, distributes, releases, issues, or produces the resource..."

In our case, ELG hosts and makes available the resource, but does not produce it. Based on other guidelines, we will add ELG as a 'contributor' with the 'contributorType' values 'HostingInstitution' and 'Distributor', since we also distribute these resources.

Moreover, we include in our metadata the (optional) element "resource provider", used for the entity that has made the resource available to ELG and which is usually the producer of the resource.

So, should the publisher be ELG or the resource provider? Please note that the resource provider is an optional element, while 'publisher' in DataCite is mandatory.
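To make the two options concrete, here is a minimal Python sketch of how a record could be filled in either way. It is only an illustration under stated assumptions: the function name, the `publisher_is_repository` switch, and the plain-dict layout are hypothetical, and the key names are merely modeled on the DataCite properties named above (publisher, contributor, contributorType). It does not say which option is correct.

```python
from typing import Optional

# Hypothetical constant: the repository name as it would appear in the metadata.
REPOSITORY_NAME = "European Language Grid"


def build_publisher_fields(resource_provider: Optional[str],
                           publisher_is_repository: bool) -> dict:
    """Return illustrative publisher/contributor entries for one record.

    The dict layout is an assumption for illustration, not an official
    DataCite serialization.
    """
    # The repository is recorded in both of its roles regardless of
    # which entity ends up in the publisher field.
    contributors = [
        {"name": REPOSITORY_NAME, "contributorType": "HostingInstitution"},
        {"name": REPOSITORY_NAME, "contributorType": "Distributor"},
    ]

    # publisher is mandatory while the resource provider is optional,
    # so fall back to the repository whenever no provider is recorded.
    if publisher_is_repository or not resource_provider:
        publisher = REPOSITORY_NAME
    else:
        publisher = resource_provider

    return {"publisher": publisher, "contributors": contributors}


# Option A: the repository is always the publisher.
option_a = build_publisher_fields("Example University", publisher_is_repository=True)
# Option B: the provider is the publisher when known.
option_b = build_publisher_fields("Example University", publisher_is_repository=False)
# Option B with no provider recorded: the mandatory field still gets a value.
option_b_missing = build_publisher_fields(None, publisher_is_repository=False)
```

Note that under option B some records would still carry the repository as publisher, so the field would not have a uniform meaning across the collection.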

On the other hand, we have found that other guidelines (e.g. Dataverse) use the 'publisher' value for the 'repository that published the dataset' (https://dataverse.org/best-practices/data-citation). This is also a practice that we see adopted by other repositories (e.g. Zenodo, Dryad), which act as 'publisher' of all uploaded data, even though they do not "produce" them.

Thank you in advance for any clarifications and advice.

Best regards,

Penny Labropoulou

Karen Gutzman

Jun 22, 2022, 3:06:06 PM
to DataCite Metadata
Hi Penny and all, 

Did you receive a response to this question? I have a similar question about publisher in DataCite. We have a general-purpose repository (it takes in historical materials, current publications, datasets, etc.). My question is: what happens when there is both a print publisher and a digital publisher? For example, our library digitized a historic book published by the medical school and disseminates the book via our repository. Would our repository be listed as the publisher, since we disseminate the digital version, or would the medical school be listed, because it was the print publisher? From Penny's information above and previous emails on this Google Group, it seems that we would list our repository as the publisher. But I've noticed that other institutional repositories have done the opposite: they list the print publisher.

Maybe this is an easy question, but I welcome your thoughts. 

Karen Gutzman
Galter Health Sciences Library
Feinberg School of Medicine
Northwestern University

Roy, Sophie

Jun 23, 2022, 7:48:13 AM
to Karen Gutzman, DataCite Metadata

Hello everyone,

 

We at the National Research Council of Canada struggled with this too when we released a collection of old print reports we had digitized. We looked at various definitions of publisher and decided that our role as a repository was closer to capture and distribution than to publishing. So we used the original copy’s publisher’s name as the value for publisher.

Publishing Definition & Meaning - Merriam-Webster

 

Here’s an example of the metadata displayed to the end user, for one of the records in that collection.

Arrow I: flight tests - NRC Digital Repository - Canada.ca

 

We considered describing only the digital version at the main level and keeping descriptions of the print version in a related item “otherFormat” (MODS) element [isVariantFormOf / isOtherFormOf (DataCite)], but landed on a final record that includes metadata about the original as well as the digitized version at the main level. The physical description at the main level definitely describes the digital version. In that record, the related item “otherFormat” metadata includes form=print and the location of the print copy.

 

For dates (in MODS format) we decided this would be best:

<dateIssued encoding="w3cdtf">1958</dateIssued>

<dateCaptured encoding="w3cdtf">2020-02-19</dateCaptured>

<dateOther encoding="w3cdtf" type="online distribution">2021-06-07</dateOther>
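As a rough illustration of how these three dates can be told apart programmatically, here is a minimal Python sketch using only the standard library. The `originInfo` wrapper and the `summarize_dates` helper are assumptions added for the example, and the MODS namespace is omitted for brevity, so a real record would need namespace-aware tag handling.

```python
import xml.etree.ElementTree as ET

# The three date elements from the record, wrapped in an <originInfo>
# element (the wrapper is an assumption so the fragment parses on its own).
MODS_DATES = """
<originInfo>
  <dateIssued encoding="w3cdtf">1958</dateIssued>
  <dateCaptured encoding="w3cdtf">2020-02-19</dateCaptured>
  <dateOther encoding="w3cdtf" type="online distribution">2021-06-07</dateOther>
</originInfo>
"""


def summarize_dates(xml_text: str) -> dict:
    """Map each date's role to its value.

    The role is the element's 'type' attribute when present (as on
    dateOther), otherwise the element name itself.
    """
    root = ET.fromstring(xml_text.strip())
    summary = {}
    for element in root:
        role = element.get("type") or element.tag
        summary[role] = element.text
    return summary


dates = summarize_dates(MODS_DATES)
for role, value in dates.items():
    print(f"{role}: {value}")
```

Keeping the original issue date, the capture date, and the online distribution date as separate values is what lets the record describe both the print original and the digitized copy at the main level.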

 

I can see how describing only the digitized version at the main level and keeping all metadata about the original in a related item “otherFormat” would also work well.

 

Best regards,

Sophie

 

Sophie Roy |  sophi...@nrc-cnrc.gc.ca

 

On behalf of the NRC Digital Repository / NRC.DR-...@nrc-cnrc.gc.ca

Online: https://nrc-digital-repository.canada.ca/eng/home/

Knowledge, Information and Technology Services Branch / National Research Council Canada / Government of Canada

 


Kelly Stathis

Jun 27, 2022, 6:18:52 AM
to Roy, Sophie, Karen Gutzman, DataCite Metadata
Hi Karen, Sophie,

Thank you for following up on this question and thanks, Sophie, for sharing how the National Research Council of Canada has approached this!

Because the current definition of publisher in the DataCite Metadata Schema is very open-ended, I believe there is no single correct approach.

I agree that the role of the library that digitized a resource—and the repository hosting the digitized resource—is more distributor than publisher. That said, it is also acceptable to list the repository as the publisher. In some cases, this may be due to platform constraints—for example, some repository platforms do not allow the value of "publisher" to be changed at the item level.

I will note that for PublicationYear, the DataCite Metadata Schema has specific guidance for digitized objects (p. 18):

In the case of a digitised version of a physical object

If the DOI is being used to identify a digitised version of an original item, the recommended approach is to supply the PublicationYear for the digital version and not the original object.

The Title field may be used to convey the approximate or known date of the original object. Other metadata properties available for additional date information about the object include: Subject and Description. However, only Title will be part of the citation.

Here are two examples of citations using dates or date information in the titles.

Schmidt, S., Andersen, V., Belviso, S., & Marty, J.-C. (2002). Dissolved and particulate thorium 234 concentration at time series station DYFAMED from date 1995-05-07 (Data set). PANGAEA - Data Publisher for Earth & Environmental Science. https://doi.org/10.1594/pangaea.183607

Tape, K. D. (2015). Aerial Images of Alaska’s Arctic Coastal Plain; 1948-1949. U.S. Geological Survey.
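The effect of this guidance on the citation can be sketched in a few lines of Python. This is an illustration only: the helper names are hypothetical, and the citation pattern (creators, year, title, publisher, optional DOI link) is simply inferred from the two examples above rather than taken from an official formatter.

```python
from typing import Optional


def digitized_title(title: str, original_date: str) -> str:
    """Append the date of the original object to the title,
    following the pattern of the second example citation."""
    return f"{title}; {original_date}"


def format_citation(creators: str, publication_year: int, title: str,
                    publisher: str, doi: Optional[str] = None) -> str:
    """Assemble a citation string in the shape of the examples above.

    Only creators, PublicationYear, Title, and publisher (plus the DOI,
    if any) appear, so a date stored in Subject or Description would
    not be visible here.
    """
    citation = f"{creators} ({publication_year}). {title}. {publisher}."
    if doi:
        citation += f" https://doi.org/{doi}"
    return citation


# PublicationYear describes the digital version (2015), while the
# date range of the original images is carried in the title.
citation = format_citation(
    creators="Tape, K. D.",
    publication_year=2015,
    title=digitized_title("Aerial Images of Alaska’s Arctic Coastal Plain", "1948-1949"),
    publisher="U.S. Geological Survey",
)
print(citation)
```

Whichever entity is entered as publisher is the one that shows up in this string, which is why the choice between repository and original publisher is visible to anyone citing the record.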

I would welcome any feedback on this current guidance, as we are looking to clarify how digitized objects can be described in future schema versions.

All best,
Kelly

Kelly Stathis
Technical Community Manager | DataCite 
they/them | Pacific Time
A: DataCite -- Welfengarten 1B, 30167 Hannover, Germany

