Identifiers for private repository

Víctor Mireles

Jun 7, 2020, 2:37:23 PM
to Dataverse Users Community
Hi all,

I am new to the Dataverse community, and have managed to set up an instance running on Docker with many nice customizations. Thanks for the great work.

My use-case, however, is slightly different:  
1. Most of the datasets we will host are private, and will remain so for a long time. 
2. Also, we don't want anyone to find out that we are adding stuff to our instance.  
3. We would like for datasets to have a dereferenceable URI, complying with the above points (e.g. subject to access controls)

The access control in Dataverse is pretty good (and integration with OAuth2 went more or less smoothly).

The issue with the last point, however, is that the only URIs that are automatically minted (DOIs or Handles) have a certain prefix, which is in a domain outside our control (e.g. doi.org). Furthermore, the only way to link to a dataset seems to be to pass this identifier as a URL parameter (https://mydvinstance.somewhere/dataset.xhtml?persistentId=doi:foo/bar).

Is it possible to have DV automatically mint an identifier with a different HTTP prefix?
If the persistentId could be https://onedomain.somewhere/foo/bar (where onedomain.somewhere is under our control), then we could set up forwarding from https://onedomain.somewhere/foo/bar --> https://mydvinstance.somewhere/dataset.xhtml?persistentId=onedomain:/foo/bar (which is basically what the DataCite resolver does).
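
The forwarding described above could be sketched roughly as follows. This is only an illustration: the host name is the placeholder from this post, `redirect_target` is a hypothetical helper, and it assumes the dataset page accepts the identifier in a `persistentId` query parameter. A real forwarder would answer each request with an HTTP redirect to the URL this function returns, and access control would still be enforced by the Dataverse instance itself.

```python
from urllib.parse import urlencode

# Placeholder host from the example in this post (not a real server).
DATAVERSE_BASE = "https://mydvinstance.somewhere"


def redirect_target(path: str, protocol: str = "doi") -> str:
    """Map a request path such as '/foo/bar' to the dataset page URL.

    Hypothetical helper: 'protocol' is whatever prefix the installation
    uses for its persistent identifiers.
    """
    identifier = path.strip("/")
    if not identifier:
        raise ValueError("request path does not contain an identifier")
    # Keep ':' and '/' unescaped so the identifier stays readable.
    query = urlencode({"persistentId": f"{protocol}:{identifier}"}, safe=":/")
    return f"{DATAVERSE_BASE}/dataset.xhtml?{query}"


target = redirect_target("/foo/bar")
print(target)
```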

It would be awesome not having to fork DV for this.

Alternatively, are there examples of people configuring DV to use a third-party identifier forwarding service (e.g., adding a third option, apart from Handles and DOIs, to the :Protocol setting) with an API defined by us? I mean, we could create an internal alternative to DataCite with its own API, and then use the :Protocol, :Authority, etc. settings to configure it.
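
For context, the existing :Protocol and :Authority settings are changed through the admin settings API. The sketch below only builds the PUT requests and does not send them. It assumes the default local admin endpoint and uses a made-up Handle prefix, and it shows the existing `hdl` protocol rather than a custom third option.

```python
import urllib.request

# Assumption: the admin API is reachable on localhost, as in a default install.
ADMIN_SETTINGS = "http://localhost:8080/api/admin/settings"


def setting_request(name: str, value: str) -> urllib.request.Request:
    """Build (but do not send) a PUT request that sets one database setting."""
    return urllib.request.Request(
        f"{ADMIN_SETTINGS}/{name}",
        data=value.encode("utf-8"),
        method="PUT",
    )


# "20.500.12345" is a placeholder Handle prefix, not a real authority.
pending = [
    setting_request(":Protocol", "hdl"),
    setting_request(":Authority", "20.500.12345"),
]

for req in pending:
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Sending each request with `urllib.request.urlopen(req)` would apply the setting on a running installation.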


Thanks in advance,
Víctor

Philip Durbin

Jun 8, 2020, 2:59:38 PM
to dataverse...@googlegroups.com
Hi Victor,

First, thanks for the kind words and for kicking the tires on Dataverse!

Please don't fork if you can help it. :)

You aren't alone in wanting to run Dataverse "behind a firewall" or however you'd like to think about it. You can read through some related discussion in the following issues:

- Make dataset DOI publication optional: https://github.com/IQSS/dataverse/issues/3652
- Plugin mechanism to support further PID providers: https://github.com/IQSS/dataverse/issues/4106
- As a scientist/user, I want to mint a PID from different providers: https://github.com/IQSS/dataverse/issues/5082
- delete DOI link from the citation block: https://github.com/IQSS/dataverse/issues/6423

As you've observed, Dataverse is oriented toward publishing metadata as broadly as possible for search and discovery purposes. Send it to DataCite. Make it easy for Google Dataset Search to index it.

You have good suggestions, such as running your own DataCite service (might be challenging but I believe the code is open source). Handle is also something you run yourself as far as I understand.

I hope this helps. Perhaps other people who have similar use cases will reply here.

Thanks,

Phil

--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataverse-commu...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/75d18969-90f6-4542-8a8a-1fd1b918f148o%40googlegroups.com.


Benjamin Peuch

Jun 11, 2020, 3:47:59 AM
to Dataverse Users Community
Hello Victor,

The Swedish National Data Service (SND) have a subdataverse in Harvard's Dataverse with datasets without DOIs or Handles: https://dataverse.harvard.edu/dataverse/SND

I think they managed this by importing their data, so perhaps it is not exactly what you are looking to achieve, but it was possible to do it one way or another. There might be more to it.

Good luck,
Ben

Philip Durbin

Jun 11, 2020, 10:27:53 AM
to dataverse...@googlegroups.com
Ah, those Swedish datasets are harvested via OAI-PMH. You can tell by the icon. I'll attach a screenshot. Harvesting is a way to get other PIDs into your installation.
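
As a rough illustration of what a harvesting client asks for, the sketch below builds a standard OAI-PMH `ListIdentifiers` request URL. The base URL is a placeholder, `oai_request_url` is a hypothetical helper, and nothing is sent over the network; `oai_dc` is the metadata format every OAI-PMH repository is required to support.

```python
from typing import Optional
from urllib.parse import urlencode


def oai_request_url(
    base_url: str,
    verb: str = "ListIdentifiers",
    metadata_prefix: str = "oai_dc",
    set_spec: Optional[str] = None,
) -> str:
    """Build an OAI-PMH request URL for the given repository endpoint."""
    params = {"verb": verb, "metadataPrefix": metadata_prefix}
    if set_spec:
        # Optional: restrict the harvest to one OAI set.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# "https://example.org/oai" is a placeholder endpoint, not a real server.
url = oai_request_url("https://example.org/oai")
print(url)
```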

Attachment: Screen Shot 2020-06-11 at 10.01.31 AM.png

Benjamin Peuch

Jun 11, 2020, 12:02:33 PM
to Dataverse Users Community
Oh, that's a very interesting feature. We are just in the process of setting up our own OAI-PMH. Thanks for the tip, Philip!


