Dataverse and CRIS system


Péter Király

Jun 17, 2019, 11:04:51 AM
to dataverse...@googlegroups.com
Dear all,

at Göttingen we are working on Dataverse and, in parallel, on a
publication repository, which contains bibliographic information for
the campus. Dataverse uses Shibboleth, while the DSpace-based CRIS
system uses LDAP; the database behind both is the same. We would like
to fetch information from Dataverse into the CRIS system's profile
page, but I haven't found a proper solution for that. In the Solr index
there is the authorIdentifier field, which corresponds to an external
identifier of the author, such as ORCID. However, it is an optional
data element, and because of the key-value nature of the index, we
cannot be sure whether it is an ORCID or something else. In the native
Datasets API (https://data.gro.uni-goettingen.de/api/datasets/:persistentId/?persistentId=[DOI])
we can extract the datasetContactEmail, which would be a common
identifier between the systems, but I haven't found a way to search
for this information (it is not indexed in Solr), and it won't cover
the non-primary authors of the datasets.
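
To make the Datasets API part concrete, here is a minimal sketch of pulling datasetContactEmail out of a response that has already been fetched and parsed. The function name contact_emails and the sample record are made up for illustration, and the nesting (data, latestVersion, metadataBlocks, citation, fields) is an assumption about the JSON layout that should be checked against a real response. An installation may also withhold the email from unauthenticated requests.

```python
def contact_emails(dataset_json: dict) -> list[str]:
    """Collect datasetContactEmail values from a parsed Datasets API response.

    The nesting below is an assumed layout, not taken from this thread.
    """
    fields = (
        dataset_json.get("data", {})
        .get("latestVersion", {})
        .get("metadataBlocks", {})
        .get("citation", {})
        .get("fields", [])
    )
    emails = []
    for field in fields:
        # Only the compound "datasetContact" field carries contact emails.
        if field.get("typeName") != "datasetContact":
            continue
        for contact in field.get("value", []):
            email = contact.get("datasetContactEmail", {}).get("value")
            if email:
                emails.append(email)
    return emails


# Hypothetical sample record, trimmed to the parts the function reads.
sample = {
    "data": {
        "latestVersion": {
            "metadataBlocks": {
                "citation": {
                    "fields": [
                        {"typeName": "title", "value": "Example dataset"},
                        {
                            "typeName": "datasetContact",
                            "value": [
                                {
                                    "datasetContactName": {"value": "Doe, Jane"},
                                    "datasetContactEmail": {"value": "jane@example.org"},
                                }
                            ],
                        },
                    ]
                }
            }
        }
    }
}

print(contact_emails(sample))
```

This only reads one dataset at a time, so it does not solve the search problem; it would have to run in a batch over all datasets.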

Is there anybody with similar requirements? How did you solve it?
Write your own API? Batch process?

Any hint would be useful.

Thanks!
Péter

--
Péter Király
software developer
GWDG, Göttingen - Europeana - eXtensible Catalog - The Code4Lib Journal
http://linkedin.com/in/peterkiraly

Philip Durbin

Jun 25, 2019, 12:10:06 PM
to dataverse...@googlegroups.com
Hi Péter,

Sorry for the slow reply. Busy week. Wolfram represented Göttingen very well both in his talks and in the Dataverse Cup! It was tough playing opposite him. :)

I've been thinking about your concern over knowing if "authorIdentifier" is an ORCID or not and I think I have a solution for you.

If you append 'AND authorIdentifierScheme:"ORCID"' to your author query, you can be sure that you're only searching ORCID IDs rather than ISNI, LCNA, or other identifiers.

To make this more concrete, here's a search for my ORCID ID:

authorIdentifier:"0000-0002-9528-9470" AND authorIdentifierScheme:"ORCID"

Here's the URL for the search above after it has been typed into the search box of an installation of Dataverse: https://dataverse.harvard.edu/dataverse/harvard/?q=authorIdentifier%3A%220000-0002-9528-9470%22+AND+authorIdentifierScheme%3A%22ORCID%22

I'll also attach a screenshot. Please note that the values for the following fields are both highlighted in bold, indicating a match:

- Author Identifier Scheme
- Author Identifier

Does this help? You should be able to do the same from the Search API but I haven't tried it.
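
As an untested sketch of the Search API variant, the same query string can be URL-encoded and sent to the /api/search endpoint instead of the search box. The helper name build_orcid_search_url is made up for illustration, and the type and per_page parameters are assumptions to verify against the Search API documentation for your Dataverse version. The snippet only builds the URL; it does not send a request.

```python
from urllib.parse import urlencode


def build_orcid_search_url(base_url: str, orcid: str, per_page: int = 10) -> str:
    """Build a Search API URL restricted to datasets matching an ORCID."""
    # Same query as typed into the search box: identifier plus scheme filter.
    query = f'authorIdentifier:"{orcid}" AND authorIdentifierScheme:"ORCID"'
    params = {"q": query, "type": "dataset", "per_page": per_page}
    return f"{base_url.rstrip('/')}/api/search?{urlencode(params)}"


url = build_orcid_search_url("https://dataverse.harvard.edu", "0000-0002-9528-9470")
print(url)
```

The resulting URL can be fetched with any HTTP client. If it behaves like the search box, the response should list the matching datasets as JSON.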

You're right that datasetContactEmail is not indexed into Solr. This is by design, for privacy reasons. You can find the logic at https://github.com/IQSS/dataverse/blob/v4.15/src/main/java/edu/harvard/iq/dataverse/search/IndexServiceBean.java#L774 which was added to fix https://github.com/IQSS/dataverse/issues/759

datasetContactEmail is in the database, of course, and you could write a SQL query for it. You might be able to draw inspiration from the "useful queries" doc linked from https://github.com/IQSS/dataverse/issues/4169

At a high level, a useful feature for Dataverse might be "Given an ORCID ID or other author identifier, show me a list of datasets."

Also, there has been some previous related discussion in the following places:


I hope this helps,

Phil




--
Screen Shot 2019-06-25 at 11.41.57 AM.png

Péter Király

Jul 3, 2019, 6:22:37 AM
to dataverse...@googlegroups.com
Hi Phil,

thanks for your answer and the pointers!

It turned out that only 20% of our potential users have an ORCID,
so we will further investigate the possibilities based on custom
SQL queries.

Best,
Péter



