Best Practices in DSpace DOIs and Dublin Core?


Jere Odell

Oct 9, 2018, 3:13:03 PM
to DSpace Community
1. If your DSpace instance is generating DOIs and storing them in the DSpace Dublin Core metadata, how are you doing this?

To be specific, are you (A) storing the DSpace DOI in the same metadata field as the DSpace handle? dc.identifier.uri? (See for example: https://conservancy.umn.edu/handle/11299/188066 ... which stores both the handle and the doi in dc.identifier.)

Or, are you (B) using a different metadata field to store the DSpace DOI? (One might use dc.identifier.doi, for example? But I haven't seen this used for this purpose yet.)

2. With the above in mind, where do you store non-DSpace DOIs? For example, if the work has a doi that resolves to a different site (a CrossRef journal doi, probably), do you store that external doi in dc.identifier.doi? Or do you store it in a different metadata field, such as ... dc.relation.isversionof?

3. Finally, are you aware of any downstream issues that might result from choosing to store the DSpace DOI in the same DC field with the handle? Will sites that harvest or crawl our metadata fail to see the DOI?

Jere Odell
Scholarly Communication Librarian
IUPUI ScholarWorks





Dykas, Felicity A.

Oct 9, 2018, 6:25:46 PM
to Jere Odell, DSpace Community

1.     I will be storing DOIs in dc.identifier.doi, as soon as I get it set up.  (I am new to DOI registrations.)  I am temporarily storing them in the same field as handles. 

2.     Other DOIs – my plan is to put these in a dc.relation field.  I see that UMN did that, too.  When doing retrospective assignment of DOIs, I found that a dataset I have is also in Dryad.  I need to come up with a better plan, but in the meantime I added:  dc.relation   Also available in Dryad: https://doi.org/10.5061/dryad.9bg43

3.     I see advantages to having different persistent identifiers in different fields.  a) It is easier to manipulate the data, if needed, in the DSpace instance or when the data is exported.  b) The display of the fields in a local repository can be customized with different labels, a different display order, etc.

 

There was a discussion on the DCAT list last fall about DOIs.  This ticket was referenced:  https://jira.duraspace.org/browse/DS-3708:  …  it is proposed that we move all DOIs generated by DSpace into a new field 'dc.identifier.doi'.  This will resolve the discrepancy between the DataCite and EZID DOI generation code (DS-2199), disentangle generated DOIs from Handles, and consistently separate generated DOIs from user-entered DOIs.  Thereafter, dc.identifier will hold all user-supplied identifiers; dc.identifier.doi will hold all DSpace-generated DOIs; and dc.identifier.uri will hold all Handles.  This will make storage of identifiers more consistent, prevent confusion of generated and submitted identifiers, and make it simpler to work with generated identifiers since each field will hold only one type.
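As a rough illustration of the separation the ticket proposes, here is a minimal Python sketch that moves a repository's own DOIs out of dc.identifier.uri and into dc.identifier.doi. It is not DSpace code: the record layout (a dict of field name to list of values), the function names, and the prefix 10.99999 are all assumptions made for the example, and it can only tell "generated" DOIs apart by checking them against a known local DOI prefix.

```python
import re
from typing import Dict, List, Optional

# Accepts a bare DOI, a "doi:" form, or a doi.org / dx.doi.org URL.
DOI_PATTERN = re.compile(
    r"^(?:https?://(?:dx\.)?doi\.org/|doi:)?(10\.\d{4,9}/\S+)$",
    re.IGNORECASE,
)


def bare_doi(value: str) -> Optional[str]:
    """Return the DOI without resolver or scheme, or None if not a DOI."""
    match = DOI_PATTERN.match(value.strip())
    return match.group(1) if match else None


def split_generated_dois(
    record: Dict[str, List[str]], own_prefix: str
) -> Dict[str, List[str]]:
    """Copy the record, moving DOIs under own_prefix (hypothetical
    parameter) from dc.identifier.uri to dc.identifier.doi."""
    result = {field: list(values) for field, values in record.items()}
    kept, moved = [], []
    for value in result.get("dc.identifier.uri", []):
        doi = bare_doi(value)
        if doi is not None and doi.startswith(own_prefix + "/"):
            moved.append(value)
        else:
            kept.append(value)
    if moved:
        result["dc.identifier.uri"] = kept
        result.setdefault("dc.identifier.doi", []).extend(moved)
    return result


# Example record; 10.99999 is a made-up prefix standing in for the local one.
record = {
    "dc.identifier.uri": [
        "http://hdl.handle.net/10317/4931",
        "https://doi.org/10.99999/abc123",
        "https://doi.org/10.5061/dryad.9bg43",
    ]
}
migrated = split_generated_dois(record, "10.99999")
```

External DOIs such as the Dryad one are deliberately left where they were, since deciding where they belong (dc.identifier, dc.relation, or elsewhere) is the open question in this thread.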

 

Felicity

 

Felicity Dykas

Head, Digital Services Department

MU Libraries

University of Missouri--Columbia

(573) 882-4656

dyk...@missouri.edu

--
All messages to this mailing list should adhere to the DuraSpace Code of Conduct: https://duraspace.org/about/policies/code-of-conduct/
---
You received this message because you are subscribed to the Google Groups "DSpace Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dspace-communi...@googlegroups.com.
To post to this group, send email to dspace-c...@googlegroups.com.
Visit this group at https://groups.google.com/group/dspace-community.
For more options, visit https://groups.google.com/d/optout.

Reed, Marianne A.

Oct 10, 2018, 11:49:55 AM
to Jere Odell, DSpace Community

At KU, we use dc.identifier.doi to store DOIs for anything in the repository that has one.  This includes those materials for which we assign DOIs, such as articles from journals that we publish, datasets, or Open Educational Resources created with the support of the Libraries.

 

We use dc.relation.isversionof to store links to published material for which there is no DOI.  

 

Best,

Marianne Reed

Digital Initiatives Coordinator

450 Watson Library

University of Kansas Libraries

mr...@ku.edu

785-864-8913

 


Mark Wood

Oct 11, 2018, 9:33:00 AM
to DSpace Community
On Tuesday, October 9, 2018 at 3:13:03 PM UTC-4, Jere Odell wrote:
1. If your DSpace instance is generating DOIs and storing them in the DSpace Dublin Core metadata, how are you doing this?

To be specific, are you (A) storing the DSpace DOI in the same metadata field as the DSpace handle? dc.identifier.uri? (See for example: https://conservancy.umn.edu/handle/11299/188066 ... which stores both the handle and the doi in dc.identifier.)



If your DSpace instance is generating DOIs then it is storing them in dc.identifier.uri.  This is not configurable.

Claudia Jürgen

Oct 11, 2018, 9:37:35 AM
to dspace-c...@googlegroups.com
Hi,

As Mark pointed out, the self-generated DOIs are in dc.identifier.uri.
If you have secondary publications and use the DOI service, I would put
the original DOI of the publication in another field, like
dc.identifier.doi, in order to be able to separate them in crosswalks etc.

Hope this helps

Claudia Jürgen
--
Claudia Juergen
Eldorado

Technische Universität Dortmund
Universitätsbibliothek
Vogelpothsweg 76
44227 Dortmund

Tel.: +49 231-755 40 43
Fax: +49 231-755 40 32
claudia...@tu-dortmund.de
www.ub.tu-dortmund.de


Jere Odell

Oct 11, 2018, 11:59:56 AM
to claudia...@tu-dortmund.de, dspace-c...@googlegroups.com
Claudia, Mark, and friends,

I think there's a mismatch between how librarians think metadata should be applied and how DSpace can auto-register (DataCite) DOIs. If Mark and Claudia are correct, DSpace stores the DOIs it generates in dc.identifier.uri and cannot currently register DOIs from other Dublin Core fields ... such as dc.identifier.doi.

If I understand correctly, DSpace was designed to issue one persistent identifier ... the handle. DOIs were a more recent request and, for now, if we want to auto-generate DOIs we have to store them in dc.identifier.uri. Is that correct?

If so, that puts those of us who want to assign DOIs to our DSpace records in a difficult spot ... we must choose between a) registering the DOIs manually or b) relying on a less-than-optimal metadata practice.

Am I missing something?

Jere


Mark Wood

Oct 11, 2018, 12:36:08 PM
to DSpace Community

Perhaps it is I who is missing something.  How, specifically, is this less-than-optimal practice?  Some points to consider:

o  There actually is no such field as identifier.uri in Qualified Dublin Core.  So what would an aggregator do with it?  It has no meaning outside of DSpace.  It should be mapped to something standardized, when exposed to harvesters.  Screen-scraping harvesters should know they are on shaky ground and carefully examine the values that they find.

o  Resolvable URLs for DOIs and for general Handles use distinct authorities (hdl.handle.net vs. dx.doi.org).  They are easily distinguished by humans and by machines.

o  If a raw Handle has the prefix "10." then it is a DOI, otherwise it is not.

o  How a repository stores a metadata value, and how it presents it, are separate questions.  What is the appropriate, standardized or generally accepted mapping of "DOI for this version of a resource" for interchange among heterogeneous systems?

o  A system which creates identifiers for its own purposes must know which identifiers it controls.  Others must know which identifiers they do not control.  I presume that this is why the DOI identifier providers use one field and the stock submission form uses another.

I would have preferred that different types of identifiers were stored separately, so we don't have to parse them to know what they are.  But that isn't difficult, and non-brittle external systems will do that anyway to protect themselves from unknown practices at sites that they harvest.  Do we know of any systems which do not?
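The parsing described above could look something like the following minimal Python sketch, which sorts a dc.identifier.uri value into "doi", "handle", or "other" using the resolver host and the "10." prefix rule. The function name and the treatment of any other prefix/suffix string as a Handle are assumptions for illustration, not behaviour of DSpace or of any particular harvester.

```python
from urllib.parse import urlparse

DOI_HOSTS = {"doi.org", "dx.doi.org"}
HANDLE_HOSTS = {"hdl.handle.net"}


def classify_identifier(value: str) -> str:
    """Return 'doi', 'handle', or 'other' for one identifier value."""
    value = value.strip()
    parsed = urlparse(value)
    if parsed.scheme in ("http", "https"):
        host = parsed.netloc.lower()
        path = parsed.path.lstrip("/")
        if host in DOI_HOSTS:
            return "doi"
        if host in HANDLE_HOSTS:
            # A Handle whose prefix starts with "10." is a DOI.
            return "doi" if path.startswith("10.") else "handle"
        return "other"
    # Raw (non-URL) identifiers.
    if value.lower().startswith("doi:"):
        return "doi"
    if "/" in value:
        # Assumption: any other prefix/suffix string is a raw Handle.
        return "doi" if value.startswith("10.") else "handle"
    return "other"


# Values taken from this thread.
kinds = [
    classify_identifier(v)
    for v in (
        "http://hdl.handle.net/10317/4931",
        "https://doi.org/10.5061/dryad.9bg43",
        "10.31428/10317/4931",
    )
]
```

A URL on the repository's own host, such as https://conservancy.umn.edu/handle/11299/188066, falls into "other" here, which is one reason a harvester may still prefer identifiers in separate fields.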

emilio lorenzo

Oct 11, 2018, 2:17:26 PM
to dspace-c...@googlegroups.com

I think assigning DOIs to whatever field you like is possible; a few extra lines of code would solve it.

In (DSpace 5)  http://85.152.11.156:7172/handle/10317/4931?show=full  you can see that the system automatically assigns DOIs to new records (well, you have to trust that this is the process we are following), and via cron jobs the system can retrospectively assign DOIs to archived objects.

dc.identifier.uri is used for the handle:   dc.identifier.uri: http://hdl.handle.net/10317/4931  and

dc.identifier.doi is used for DOIs:   dc.identifier.doi    --->   10.31428/10317/4931        (note that, following Crossref recommendations, the DOI suffix reuses the main identifier, the handle).
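The suffix convention in that example (DOI prefix followed by the item's handle) can be sketched in a few lines of Python. This only illustrates the string construction; the function name is hypothetical, and it does not show how any DSpace instance actually mints or registers DOIs.

```python
HANDLE_RESOLVERS = ("http://hdl.handle.net/", "https://hdl.handle.net/")


def doi_from_handle(doi_prefix: str, handle: str) -> str:
    """Build a DOI whose suffix is the item's raw handle."""
    if not doi_prefix.startswith("10."):
        raise ValueError("a DOI prefix must start with '10.'")
    handle = handle.strip()
    # Accept either a raw handle or its resolvable URL form.
    for resolver in HANDLE_RESOLVERS:
        if handle.startswith(resolver):
            handle = handle[len(resolver):]
            break
    return f"{doi_prefix}/{handle}"


# Reproduces the value shown above from its handle URL.
doi = doi_from_handle("10.31428", "http://hdl.handle.net/10317/4931")
```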

regards


Emilio


Mark Wood

Oct 11, 2018, 4:18:23 PM
to DSpace Community
Indeed, that must be local code modifications.  Stock DSpace code does not define a dc.identifier.doi field and would not accept it.  dc.identifier.uri is hard-coded into the org.dspace.identifier.DOIIdentifierProvider. (Also, this provider is letting the persistence layer generate the suffix, probably via a DBMS sequence generator -- I didn't follow it all the way down.  On a stock 5.4 instance I can see in the database that the Handle and DOI suffixes are unrelated.)

Jere Odell

Oct 31, 2018, 5:42:08 PM
to DSpace Community
I would like to ping Mark Wood's questions on this thread one more time ... Mark and I use very different language for describing what we want to do ... as a repository manager I want to:

1. automatically register DOIs (the manual process is tedious)
2. store those DOIs in metadata fields that are meaningful to both machines and to people (but, yes, machines are probably more important, in this case)
3. do the above without modifying dspace such that future upgrades are a pain in the neck.

I agree with the few repository managers who responded that DOIs are best stored in dc.identifier.doi ... and that external DOIs (those that we do not register) should be stored in another field (dc.relation.isversionof, for example) ... but this is the human-readable solution. I assume that those who responded with this arrangement are registering DOIs manually or are at the very least not using DSpace to make the registration.

It makes sense to me (sort of) as Mark says: "A system which creates identifiers for its own purposes must know which identifiers it controls." ... which means for now, dspace should store these in dc.identifier.uri. But ...

Can anyone confirm that we are not creating downstream headaches for systems that seek to make sense of the multiple values stored in dc.identifier.uri? Or ... as Mark says:

"What is the appropriate, standardized or generally accepted mapping of "DOI for this version of a resource" for interchange among heterogeneous systems?"

AND

"[N]on-brittle external systems will [parse the different types of identifiers] anyway to protect themselves from unknown practices at sites that they harvest.  Do we know of any systems which do not?"

Any thoughts on these questions?

Jere Odell
IUPUI

helix84

Oct 31, 2018, 7:19:03 PM
to jered...@gmail.com, dspace-c...@googlegroups.com
See also the previous discussion of this issue in Jira:

https://jira.duraspace.org/browse/DS-3472


Regards,
~~helix84

Compulsory reading: DSpace Mailing List Etiquette
https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette

Jere Odell

Oct 31, 2018, 8:59:30 PM
to hel...@centrum.sk, dspace-c...@googlegroups.com
Hi helix84,

Thanks for linking to Jira DS-3472. Very helpful. Mark Diggory recommended providing configurable options. As far as I know this hasn’t been a focus for development yet. Correct?

I don’t think DS-3472 addresses our worries about how other systems use or fail to use multiple values stored as identifier.uri ... any thoughts anyone?

Jere


Tim Donohue

Nov 1, 2018, 10:19:49 AM
to Jere Odell, hel...@centrum.sk, dspace-c...@googlegroups.com
Hi Jere and all,

Correct, I don't believe any development has started on DS-3472, as that ticket is still in a "Needs More Details" status.  


Tickets in that state are often waiting for feedback or best practices to be proposed. As developers, we don't always know what the best solution may be for repository managers/harvesters (or for metadata storage in general). So, we occasionally have to set aside tickets for more feedback. Plus, to be honest, all our development work is volunteer based -- so even if we got this feedback *today*, we'd have to look around for someone interested/available in helping build out the proposed solution.

My recommendation here would be to try to gather together others who are interested in helping propose a solution, perhaps either on this list, or possibly the DCAT group can be used for finding interested folks: https://wiki.duraspace.org/display/cmtygp/DSpace+Community+Advisory+Team

The more clarity we (as developers) can get on what is the "best practice" here, the more likely we can move this forward in the near future.

- Tim
--
Tim Donohue
Technical Lead for DSpace & DSpaceDirect
DuraSpace.org | DSpace.org | DSpaceDirect.org