[Registry Use Case][Idea] Adding single remote datasets via new harvester type


Oliver Bertuch

Jun 7, 2022, 9:12:34 AM
to dataverse...@googlegroups.com

Hi folks,

As you might know, our installation Jülich DATA is used primarily as a registry and institutional repository. This means that for data publications that happen not in our repository but at more sophisticated domain repos / ..., we still require researchers to register the "remote" publication with us and enrich it with some institutional metadata (bibliometric data).

In our experience it's quite hard to add a single remote dataset: currently, adding a dataset allocates a new DOI, which is highly discouraged for data that already has one. The harvesting options, on the other hand, are limited to larger sets and available to superusers only.

Adding a non-integrated, separate service for this breaks the user experience, which made us think it might be worth a shot to reach out to the community: is there interest in adding a new harvester type that picks up individual PIDs?

The concept (first iteration): create a Dataverse collection and add a list of DataCite DOIs. Let the harvester pick up all of those, retrieve their metadata via the DataCite OAI-PMH interface, and create one dataset per DOI. A dataset template would provide any required institutional metadata, since harvested datasets cannot be edited and institutional metadata can be static per collection. (The UI could even allow a different template per DOI.)
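To make the lookup step a bit more concrete, here is a minimal sketch in Python of how a list of DOIs could be turned into OAI-PMH GetRecord requests. Nothing of this exists in Dataverse yet. The base URL, the "doi:" identifier form and the "oai_datacite" metadata prefix are assumptions about the DataCite OAI-PMH service that would need to be verified, and the function names are purely illustrative.

```python
from urllib.parse import urlencode

# Assumption: public base URL of the DataCite OAI-PMH service; verify before use.
OAI_BASE_URL = "https://oai.datacite.org/oai"


def normalize_doi(raw: str) -> str:
    """Strip a resolver URL or 'doi:' prefix and lowercase the DOI."""
    doi = raw.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi.lower()


def get_record_url(doi: str, metadata_prefix: str = "oai_datacite") -> str:
    """Build an OAI-PMH GetRecord request URL for a single DOI.

    Assumption: records are addressed as 'doi:<DOI>' and the
    'oai_datacite' metadata format is offered by the endpoint.
    """
    params = {
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        "identifier": f"doi:{normalize_doi(doi)}",
    }
    return f"{OAI_BASE_URL}?{urlencode(params)}"


def harvest_urls(dois: list[str]) -> dict[str, str]:
    """Map each normalized DOI of a collection's list to its request URL."""
    return {normalize_doi(d): get_record_url(d) for d in dois}
```

The harvester would then fetch each URL, parse the returned record, and create one harvested dataset per DOI with the collection's template applied. Those steps are left out here on purpose, as they depend on Dataverse internals.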

For better interoperability with other services, creating these remote dataset harvesting lists would be exposed via the API, too. This has been a long-standing feature request: https://github.com/IQSS/dataverse/issues/7330

Permission to create these lists could be tied to the permission to create datasets within a collection; the "Edit" menu entry would then be a single item.

Please let me know whether this is something worth upstreaming to the larger community or whether we should stick with our minimal fork. (This is not implemented yet, just an idea...)

Best
Oliver

P.S.: I know that https://github.com/IQSS/doi2pmh-server came to life for related, but not the same, reasons. Still, creating a separate service and sending users to it causes confusion and reluctance to stick with good practices.

-- 
-------------------------------------------------------------------------------------
Oliver Bertuch
Forschungszentrum Jülich GmbH
Zentralbibliothek / Central Library
Forschungsdatenmanagement / Research Data Management
Entwicklung von Forschungssoftware / Research Software Engineering

52425 Jülich
+49 2461 61-85370
https://www.fz-juelich.de/zb

Registered office: Jülich
Registered in the commercial register of the local court (Amtsgericht) Düren, no. HR B 3498
Chairman of the Supervisory Board: MinDir Volker Rieke
Board of Directors: Prof. Dr.-Ing. Wolfgang Marquardt (Chairman),
Karsten Beneke (Vice-Chairman), Prof. Dr. Astrid Lambrecht,
Prof. Dr. Frauke Melchior
-------------------------------------------------------------------------------------