Author/Contributor handling in publications


Christian Gutknecht

Aug 15, 2018, 5:04:07 AM
to ORCID API Users
Hi all

I've got a question regarding the handling of authors/contributors within publications. We would like to import them into our system without going to the external source (e.g. Crossref).

If publication metadata is embedded as BibTeX, we parse the authors from the BibTeX, which works fine. But there are also publications that do not have embedded BibTeX.
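For readers who want to try the BibTeX route described above, here is a minimal sketch in Python. It is not the parser used by anyone in this thread: the function name `bibtex_authors` is made up for illustration, and the approach (find the `author` field, match braces, split on the BibTeX keyword `and`) is deliberately naive. It will wrongly split a braced corporate name that contains the word "and", and it leaves LaTeX escapes untouched.

```python
import re


def bibtex_authors(entry: str) -> list[str]:
    """Return the raw author names from one BibTeX entry (naive sketch)."""
    # Locate 'author = {' or 'author = "' (field names are case-insensitive).
    match = re.search(r'\bauthor\s*=\s*([{"])', entry, flags=re.IGNORECASE)
    if match is None:
        return []
    closer = "}" if match.group(1) == "{" else '"'
    depth = 0  # nesting depth of braces inside the field value
    chars = []
    for ch in entry[match.end():]:
        if ch == closer and depth == 0:
            break  # reached the delimiter that ends the field
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        chars.append(ch)
    value = " ".join("".join(chars).split())  # collapse line breaks and spaces
    # BibTeX separates individual names with the keyword "and".
    return [name.strip() for name in re.split(r"\s+and\s+", value) if name.strip()]


# Made-up entry for illustration; only the author names come from the thread.
entry = '@article{vogt2016, author = {Vogt, L. and W{\\"u}rbel, H.}, title = {X}}'
print(bibtex_authors(entry))
```

For anything beyond a quick import script, a maintained BibTeX library is probably the safer choice.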

Example: Citation with list of Contributors: https://pub.orcid.org/v2.0/0000-0003-0336-1955/work/28594120

There is a citation:

<work:citation>
  <work:citation-type>formatted-unspecified</work:citation-type>
  <work:citation-value>Vogt L, Reichlin TS, Nathues C, Würbel H, PLoS biology, 2016, vol. 14, no. 12, pp. e2000598, 2016</work:citation-value>
</work:citation>

And a list of contributors:

<work:contributors>
  <work:contributor>
    <work:credit-name>Vogt L</work:credit-name>
    <work:contributor-attributes>
      <work:contributor-sequence>first</work:contributor-sequence>
      <work:contributor-role>author</work:contributor-role>
    </work:contributor-attributes>
  </work:contributor>
  <work:contributor>
    <work:credit-name>Reichlin TS</work:credit-name>
    <work:contributor-attributes>
      <work:contributor-sequence>first</work:contributor-sequence>
      <work:contributor-role>author</work:contributor-role>
    </work:contributor-attributes>
  </work:contributor>
  ....
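As a rough illustration of reading such a contributor list, the following Python sketch walks the XML with the standard library and matches elements by local name, so it does not depend on the exact namespace URI. The function names and the namespace URI in the sample are assumptions made for this example; the sample itself is cut down from the record quoted above.

```python
import xml.etree.ElementTree as ET


def local(tag: str) -> str:
    """Strip the '{namespace}' part that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def contributors(work_xml: str) -> list[dict]:
    """Collect credit name, sequence and role for every contributor element."""
    root = ET.fromstring(work_xml)
    result = []
    for elem in root.iter():
        if local(elem.tag) != "contributor":
            continue
        # Flatten all descendants of this contributor into {local-name: text}.
        fields = {local(child.tag): (child.text or "").strip() for child in elem.iter()}
        result.append({
            "credit_name": fields.get("credit-name", ""),
            "sequence": fields.get("contributor-sequence", ""),
            "role": fields.get("contributor-role", ""),
        })
    return result


# The namespace URI below is an assumption; the code never looks at it.
SAMPLE = """
<work:contributors xmlns:work="http://www.orcid.org/ns/work">
  <work:contributor>
    <work:credit-name>Vogt L</work:credit-name>
    <work:contributor-attributes>
      <work:contributor-sequence>first</work:contributor-sequence>
      <work:contributor-role>author</work:contributor-role>
    </work:contributor-attributes>
  </work:contributor>
  <work:contributor>
    <work:credit-name>Reichlin TS</work:credit-name>
    <work:contributor-attributes>
      <work:contributor-sequence>first</work:contributor-sequence>
      <work:contributor-role>author</work:contributor-role>
    </work:contributor-attributes>
  </work:contributor>
</work:contributors>
"""

for person in contributors(SAMPLE):
    print(person)
```

Note that both contributors in the quoted record carry the sequence value "first", so the sequence field alone cannot be relied on for author order.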

Example: Authors only in Description: https://pub.orcid.org/v2.0/0000-0003-0336-1955/work/28073434

<work:short-description>Planz C, Nathues H, Brinkmann U, große Beilage E. . 2010; 38 (G): 205-264</work:short-description>


Questions:
1. Is there any rule for how work:credit-name should be interpreted if you want to extract the first and last name separately? Sometimes the parts are separated by a space, sometimes by a ",", and sometimes first and last name are switched.
2. What exactly is the mechanism by which work:contributors are added to the ORCID record? Adding a publication via orcid.org (Add work manually) does not provide fields for entering the authors. So I assume there's automatic parsing of the entered citation or DOI?
3. Is it worth parsing the description for contributors? Or is it unusual for the citation information to be stored there?


Thanks for any answers.

Best regards

Christian

Peters, Robert

Aug 15, 2018, 9:22:21 AM
to Christian Gutknecht, ORCID API Users
Hi Christian,
Short answer: 
You should resolve the external identifier(s).

Longer answer:
We've seen every kind of naming convention used in records. The citation is optional and can be any string populated by members or users. Contributor lists are populated by our member organizations. We currently do not compare what they populate with the external identifier(s). Where an ORCID iD is included in the contributor list, the credit name is replaced with the researcher's preferred credit name. Members may or may not keep contributors and/or citations up to date. We always suggest resolving identifiers instead of relying on metadata that may be out of date or incorrect.

We are looking at providing an API endpoint that makes resolution of external identifiers easier, although we are not likely to implement it in a time frame that works for you. Until then you may find this JavaScript library (or the pattern it follows) useful: https://github.com/ORCID/orcid-js.
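To make "resolve the external identifier" concrete, here is a hedged Python sketch for the DOI case using the public Crossref REST API (`https://api.crossref.org/works/{doi}`), whose response carries an `author` list with `given`, `family`, `sequence` and, where known, `ORCID`. This is not what orcid-js does and not an ORCID-endorsed recipe; the function names are invented for this example, and error handling, rate limiting and non-DOI identifiers are left out.

```python
import json
import urllib.parse
import urllib.request

CROSSREF_WORKS = "https://api.crossref.org/works/"


def crossref_url(doi: str) -> str:
    """Build the lookup URL, percent-encoding anything unusual in the DOI."""
    return CROSSREF_WORKS + urllib.parse.quote(doi, safe="/")


def authors_from_message(message: dict) -> list[dict]:
    """Pick the name parts out of the 'message' object of a Crossref response."""
    authors = []
    for person in message.get("author", []):
        authors.append({
            "given": person.get("given", ""),
            "family": person.get("family", ""),
            "sequence": person.get("sequence", ""),
            "orcid": person.get("ORCID", ""),
        })
    return authors


def fetch_authors(doi: str) -> list[dict]:
    """Network call: fetch the Crossref record for a DOI and return its authors."""
    with urllib.request.urlopen(crossref_url(doi)) as response:
        payload = json.load(response)
    return authors_from_message(payload["message"])
```

The advantage over parsing credit names is that given and family names arrive as separate fields, so no splitting heuristic is needed.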

Cheers,
Rob



 

Robert Peters
Technology Director at ORCID.org

Cellphone: +1.805.440.9056
Skype: rcpeters
Timezone: PST
Key for OpenPGP email communication:  
https://keys.mailvelope.com/pks/lookup?op=get&search=0x1519F37D99E18378

--
You received this message because you are subscribed to the Google Groups "ORCID API Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to orcid-api-users+unsubscribe@googlegroups.com.
To post to this group, send email to orcid-api-users@googlegroups.com.
Visit this group at https://groups.google.com/group/orcid-api-users.
For more options, visit https://groups.google.com/d/optout.

Antonin Delpeuch (lists)

Aug 15, 2018, 10:03:29 AM
to orcid-a...@googlegroups.com
Hi Rob,

Just to follow up on this, I have also struggled with this exact issue
when working on ORCID integration for Dissemin. I submitted
a ticket about this 3 years ago:
https://support.orcid.org/forums/175591-orcid-ideas-forum/suggestions/8820178-author-lists-and-the-associated-orcids-are-not-pro

I have also resorted to using a bibtex parser to extract author
metadata. In my humble opinion, I think it is really sad that Bibtex
plays such a central role in ORCID. It feels wrong to see it used as a
primary storage format (using it for import and export is useful of course).

- Bibtex is very old and basically not standardized at all. It imports
all the oddities of LaTeX in its data format (for instance, diacritics
are handled in a very obscure way).

- As a result, the tooling around Bibtex is really poor in most
programming languages I am aware of;

- The information stored in the Bibtex record is redundant with other
fields in your data model, creating headaches for data consumers having
to resort to unreliable heuristics to reconcile Bibtex metadata to the
rest of the data model.

- There are loads of alternatives to Bibtex that are far more
interoperable. Even the antediluvian Dublin Core, as poor as it may be,
would do a much better job.

So I don't see why ORCID, as a champion of open data and
interoperability, should be relying on such an antique, unreliable and
ill-behaved format.
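To illustrate the diacritics point, here is a small Python sketch of what a data consumer ends up writing just to read a name like `W{\"u}rbel`. It covers only five common accent commands applied to a single ASCII letter and is an illustration of the problem, not a complete LaTeX decoder; the function name and the set of supported commands are choices made for this example.

```python
import re
import unicodedata

# LaTeX accent command character -> Unicode combining mark.
COMBINING = {
    '"': "\u0308",  # diaeresis / umlaut
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "^": "\u0302",  # circumflex
    "~": "\u0303",  # tilde
}

MARK = r"""(["'`^~])"""
LETTER = r"([A-Za-z])"

# Most heavily braced form first, so outer braces are only consumed when
# they really wrap the accent command.
PATTERNS = [
    re.compile(r"\{\\" + MARK + r"\{" + LETTER + r"\}\}"),  # {\"{u}}
    re.compile(r"\{\\" + MARK + LETTER + r"\}"),            # {\"u}
    re.compile(r"\\" + MARK + r"\{" + LETTER + r"\}"),      # \"{u}
    re.compile(r"\\" + MARK + LETTER),                      # \"u
]


def decode_accents(text: str) -> str:
    """Replace simple LaTeX accent commands with precomposed Unicode letters."""
    def compose(match: re.Match) -> str:
        letter_with_mark = match.group(2) + COMBINING[match.group(1)]
        return unicodedata.normalize("NFC", letter_with_mark)

    for pattern in PATTERNS:
        text = pattern.sub(compose, text)
    return text


print(decode_accents('W{\\"u}rbel'))
```

Commands such as `\ss`, `\o` or `\c{c}` and math-mode content are not handled, which is part of why the tooling gets complicated quickly.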

One of the main reasons why data consumers can't afford to ignore Bibtex
in ORCID is really to have access to the authors list. In fact, as an
ORCID user, using Bibtex is the only way to store that authors list in
any publication I add myself. Even if I wanted to input manually the
<work:contributor> field for my own papers, I would not be able to do so
because that is simply not exposed by the UI! Why is the list of authors
so inaccessible in a system that is supposed to solve the author
disambiguation problem? This is a crucial piece of metadata - in many
fields it matters a great deal whether you are the first author, the
last one, or somewhere in the middle, for instance.

I don't expect any change on this (changing the data model is hard) but
it would be great to have some sort of acknowledgment that this really is
a design mistake.

Cheers,
Antonin


Peters, Robert

Aug 15, 2018, 10:42:30 AM
to Antonin Delpeuch (lists), ORCID API Users
Hi Antonin,
No disagreements from me on Bibtex.

My perspective is that advocating for easier/better PID infrastructure is the solution, not author lists hand-typed (or copied and pasted) by users into ORCID in any format. Along those lines, I'd advocate getting involved in community efforts including Metadata 2020 (http://www.metadata2020.org/), JROST (http://jrost.org/) and PIDapalooza (https://pidapalooza.org/).

Cheers,
Rob


Antonin Delpeuch (lists)

Aug 16, 2018, 6:56:03 AM
to orcid-a...@googlegroups.com
Hi Rob,

Thanks. I agree identifiers are very useful. But there will always be
manually imported publications, often without any identifiers, where the
only metadata consumers can rely on is what is stored in ORCID.

Is there any reason why this particular field (the authors list) is not
exposed in the UI, while many others are?

Cheers,
Antonin


Christian Gutknecht

Aug 16, 2018, 10:32:01 AM
to li...@antonin.delpeuch.eu, orcid-a...@googlegroups.com
Hi Rob

Thanks for your answer. I see your point around the idea of getting to the source.

I would also support the possibility of adding authors manually via the UI, especially as they are foreseen in the data model.

Best regards
Christian


Christian Gutknecht

Aug 20, 2018, 4:24:35 AM
to orcid-a...@googlegroups.com
Hi all

I did a small analysis of how applications currently write contributors to ORCID:
https://drive.switch.ch/index.php/s/a96L9CY0lxdvKtu (the list is based on the stats reporting Tom once provided to the ORBIT group)

Apart from self-asserted works, it is fortunately quite rare for works to come without BibTeX and without any contributors. Yet it can be the case with the following sources:
It can also be noted that the majority of applications write names in one of these forms:

Firstname Lastname (without comma)
Lastname, Firstname (with comma)

Exceptions are:
I have already written to Europe PMC asking them to add a comma to their output to ORCID, so that parsing first and last names would become easier.
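A heuristic along these lines could look like the Python sketch below. It assumes only the two majority forms listed above plus the "Lastname Initials" form seen in the example record earlier in the thread ("Vogt L", "Reichlin TS"). The function name and the rule that a trailing token of up to three capital letters is treated as initials are assumptions for illustration, and the heuristic will misfire on some real names.

```python
def split_credit_name(name: str) -> tuple[str, str]:
    """Guess (given, family) from a credit name string. Heuristic only."""
    name = " ".join(name.split())  # normalise whitespace
    if "," in name:
        # "Lastname, Firstname"
        family, _, given = name.partition(",")
        return given.strip(), family.strip()
    parts = name.split(" ")
    if len(parts) == 1:
        return "", parts[0]
    last = parts[-1]
    if last.isalpha() and last.isupper() and len(last) <= 3:
        # "Lastname Initials", e.g. "Reichlin TS" (assumed rule)
        return last, " ".join(parts[:-1])
    # "Firstname Lastname": treat the final token as the family name
    return " ".join(parts[:-1]), last


for example in ["Gutknecht, Christian", "Christian Gutknecht", "Reichlin TS"]:
    print(example, "->", split_credit_name(example))
```

Multi-word family names without a comma remain ambiguous in the "Firstname Lastname" form, which is one more argument for writing the comma form.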

Best regards

Christian

Jason Ronallo

Dec 21, 2018, 11:54:32 AM
to Christian Gutknecht, orcid-a...@googlegroups.com
Christian,

Thank you for this analysis. Any way to analyze the "self asserted"
set more in depth? How many of those use each allowed citation type?
If that self asserted set could be extracted, I'd be interested in
doing some more analysis of it. What exactly is in the BibTeX? What
types are used in the BibTeX and how does that relate to selected
ORCID work types?

For our own citation management we're moving away from BibTeX to CSL
JSON. We're finding we get much better citation rendering using CSL
JSON. When we push citations with DOIs to ORCID though we'll need to
downgrade to BibTeX. And we have already found that some of our users
who are self asserting works are using BibTeX as well which we're
needing to deal with.
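As a sketch of what the "downgrade" involves for the author list alone: CSL JSON stores each name as an object with `family` and `given` (or `literal` for institutional names), while BibTeX wants a single string joined with "and". The Python function below is a hypothetical helper, not our production code; it ignores particles and suffixes and does not escape non-ASCII characters into LaTeX commands.

```python
def csl_authors_to_bibtex(authors: list[dict]) -> str:
    """Turn a CSL JSON 'author' array into the value of a BibTeX author field."""
    names = []
    for person in authors:
        if "literal" in person:
            # Institutional name: brace it so BibTeX keeps it as one unit.
            names.append("{" + person["literal"] + "}")
        elif person.get("given"):
            names.append(f'{person.get("family", "")}, {person["given"]}')
        else:
            names.append(person.get("family", ""))
    return " and ".join(names)


# Made-up sample data in CSL JSON shape.
sample = [
    {"family": "Vogt", "given": "L."},
    {"family": "Reichlin", "given": "T. S."},
    {"literal": "Example Consortium"},
]
print(csl_authors_to_bibtex(sample))
```

Going the other way (BibTeX to CSL JSON) is the lossy, heuristic direction, since the structured name parts have to be guessed back from a string.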

Jason




--
Jason Ronallo
Department Head, Digital Library Initiatives
North Carolina State University Libraries
https://orcid.org/0000-0002-6080-7549
