SSO (OIDC) and APIs and API Access


Richard Dennis

Aug 16, 2021, 7:07:27 AM
to Dataverse Users Community

Dear Philip and Dataverse Community,

Our current Test Dataverse Environment for Researchers Only

  • VMware Test Environment running Dataverse version 5.5 (Public port 443 only)
  • All other ports (e.g., 4848) are accessible only through the university VPN
  • Standard out-of-the-box configuration and settings with Make Data Count and File Previewer
  • Local Dataverse authentication (we will soon test SSO via OIDC)

Questions relating to SSO: SAML version 2.0 / OIDC

Please note: We have determined that we will use OIDC for our SSO. We also anticipate our production environment to be very similar to our test environment.

Given the above details for our Dataverse test environment, we are trying to determine whether our researchers will still be able to use the APIs with API tokens if we implement SSO (OIDC).

Thank you in advance,

Regards,


Richard Dennis

Special Advisor - Data Steward

Copenhagen University Library

danny...@g.harvard.edu

Aug 16, 2021, 11:23:27 AM
to Dataverse Users Community
Hi Richard, I'll be interested to hear from others, but I can't think of any reasons why users would not be able to generate API tokens if they have one type of account vs. another. We don't use OIDC at the Harvard Dataverse Repository so I don't have any direct experience. 

Thanks,

Danny

Philip Durbin

Aug 16, 2021, 11:37:15 AM
to dataverse...@googlegroups.com
Hi Richard,

I also replied to you in chat* but like Danny says, your OIDC users should have no problems using API tokens.

Thanks,

Phil



Richard Dennis

Aug 17, 2021, 3:28:22 AM
to dataverse...@googlegroups.com
Thank you for your responses, Danny and Philip. I am also exploring whether researchers can run curl commands and keep full API functionality with OIDC.

The functionality of APIs is rather important to us because we need to determine whether or not we will move forward with implementing SSO.

My assumption is that the APIs will keep working. However, I can't be sure.

Generating an API token is not an issue here; my main concern is whether the APIs continue to work through port 443 after authentication via OIDC.

Thank you again for your support.

Mvh / Best regards,

Richard Dennis, MLS

Special Advisor - Data Steward
ORCID: 0002-0002-4472-7194

Københavns Universitetsbibliotek | Copenhagen University Library

Afdelingen for Forskerservice | Department for Research Support

 +45 91324 822 I +45 2236 6855 (private)

ri...@kb.dk
ww.linkedin.com/in/pacian




Philip Durbin

Aug 17, 2021, 11:30:52 AM
to dataverse...@googlegroups.com
If anything, you might have the opposite problem... someone with an OIDC account who leaves your institution may still have full use of Dataverse APIs until their API token expires or until you explicitly deactivate the account: https://guides.dataverse.org/en/5.6/api/native-api.html#deactivate-a-user
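For illustration, here is a minimal Python sketch of what the deactivation call could look like. It assumes the endpoint has the form `/api/admin/authenticatedUsers/$USERNAME/deactivate`, which is how I understand the linked guide section; please check the guide for your version. The username `jsmith` and the helper name `build_deactivate_request` are hypothetical, and the request is only built here, not sent.

```python
from urllib.parse import quote
from urllib.request import Request


def build_deactivate_request(username: str, admin_base: str = "http://localhost:8080") -> Request:
    """Build (without sending) a POST that would deactivate one user.

    The path is an assumption based on the Native API guide; admin endpoints
    are normally reachable only from the server itself, hence localhost.
    """
    path = f"/api/admin/authenticatedUsers/{quote(username, safe='')}/deactivate"
    # An empty body is enough; the username in the path identifies the account.
    return Request(admin_base + path, data=b"", method="POST")


# Hypothetical username, for illustration only.
request = build_deactivate_request("jsmith")
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` on the Dataverse server itself.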

I'd encourage you to try it and see and please let us know if anything's not working. If it helps, you can play around with https://demo.dataverse.org which has Shibboleth (not OIDC) configured. Assuming you're able to log in via Shib (it looks like University of Copenhagen is an option), your API token should work fine.

I hope this helps,

Phil

Richard Dennis

Aug 19, 2021, 8:55:22 AM
to Dataverse Users Community
Hello Phil,

Thank you for your response.

I have just one more comment to make about SSO. 

With SSO (institutional sign-on), when a researcher leaves the University, the institution will discontinue their access to the repository because their sign-on is linked to the University AD. This is precisely the purpose of implementing institutional sign-on: to eliminate administrative overhead and have the University IT Department control access.

What are your thoughts about the above?

Regards,

Richard

danny...@g.harvard.edu

Aug 19, 2021, 10:06:05 AM
to Dataverse Users Community
Hi Richard,

It's certainly an advantage to automatically remove access in the way that you describe, and at Harvard we encourage users to sign in through their institutional account.

Though our situation is slightly different, in that we run a service where anyone can deposit and access, at the Harvard Dataverse Repository we also handle a lot of support requests related to people leaving institutions and moving to other institutions. When a user leaves Harvard and transfers to another institution, we ask them to create an account at the new institution and then we merge their information into the new account. We added an API for this: 

https://guides.dataverse.org/en/latest/api/native-api.html#merge-user-accounts
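As a rough sketch in Python, and assuming the merge endpoint has the form `/api/users/$toMergeIdentifier/mergeIntoUser/$continuingIdentifier` and is called with a superuser's API token (please confirm both against the guide linked above), the request could be built like this. The server URL, token, account identifiers, and the helper name `build_merge_request` are all hypothetical placeholders; nothing is sent.

```python
from urllib.parse import quote
from urllib.request import Request


def build_merge_request(server_url: str, superuser_token: str,
                        to_merge: str, continuing: str) -> Request:
    """Build (without sending) a POST that would merge one account into another.

    The path layout is an assumption taken from the Native API guide:
    `to_merge` is the old account, `continuing` is the account that is kept.
    """
    path = (f"/api/users/{quote(to_merge, safe='')}"
            f"/mergeIntoUser/{quote(continuing, safe='')}")
    return Request(
        server_url.rstrip("/") + path,
        data=b"",
        headers={"X-Dataverse-key": superuser_token},
        method="POST",
    )


# Placeholder values, for illustration only.
request = build_merge_request(
    "https://dataverse.example.edu",
    "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "old.account",
    "new.account",
)
print(request.get_method(), request.full_url)
```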

So, I think it's a good idea to use the institutional account, but note that there may be some additional support queries. 

- Danny

James Myers

Aug 19, 2021, 10:33:14 AM
to dataverse...@googlegroups.com

I think an underlying point that is key in this discussion is that Dataverse’s API does not require/allow login via password and doesn’t support the notion of a session. Instead, users can generate a relatively long-lived API key that can be used to make API calls. That means that when an SSO account is deactivated and no action is taken in Dataverse to deactivate the user, their API key (if it was created at all) would still be valid until it times out.
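To make the "no session, just a key" point concrete, here is a minimal Python sketch of an API call that carries the key in the `X-Dataverse-key` header used in Dataverse's curl examples. The server URL, token, and DOI are placeholders, and the helper name `build_download_request` is made up for this example; the request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_download_request(server_url: str, api_token: str, persistent_id: str) -> Request:
    """Build (without sending) a request to download a dataset's files."""
    query = urlencode({"persistentId": persistent_id})
    url = f"{server_url.rstrip('/')}/api/access/dataset/:persistentId/?{query}"
    # The API key goes in a header on every call; no password or session is involved.
    return Request(url, headers={"X-Dataverse-key": api_token}, method="GET")


# Placeholder values, for illustration only.
request = build_download_request(
    "https://dataverse.example.edu",
    "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "doi:10.5072/FK2/EXAMPLE",
)
print(request.get_method(), request.full_url)
```

Nothing in the request depends on how the user logged in to the web interface, which is why the same call should behave identically for local, Shibboleth, and OIDC accounts.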

 

In addition to allowing users to use the API directly (e.g., through pyDataverse or in notebooks, where the user would have seen their API key), Dataverse also uses the API key to let external tools (explore, config, and preview tools) interact with Dataverse to get files, metadata, etc. and, with config tools, to send updates back (e.g., to change the variable-level metadata for a tabular file). This latter case probably motivated the API key route: the external tools do not get the user’s password and can automatically connect with Dataverse (and users/admins can change/deactivate the key to stop access by the tool without changing their password). It also means that, since users don’t deal directly with an API key to use external tools, it is really only those who directly use the API who are aware they have an API key.

 

For the external tools use case, there is work underway to replace direct use of the API key with shorter-lived signed URLs. Although external tools are ‘trusted’ in some sense, this both limits how long they could have access after being triggered (e.g. the last time a user logged in and clicked to view a preview/use an explore tool) and reduces the need to have an exposed API key. That could make it easier to think of ways to change how direct API use is handled.

 

-- Jim

Philip Durbin

Aug 19, 2021, 11:31:14 AM
to dataverse...@googlegroups.com
I agree with Jim that probably the long term answer is to switch from long-lived API tokens (they last a year) to something short-lived (signed URLs, which don't exist yet) with limited access.

Like I said, for now you could deactivate users who have left. Or have them create a builtin/local account and merge the old account in, like Danny explained.

You are welcome, of course, to open an issue with any suggestions on how you'd like Dataverse to work: https://github.com/IQSS/dataverse/issues

Thanks,

Phil


Richard Dennis

Aug 24, 2021, 5:33:58 PM
to Dataverse Users Community
Dear Philip, Jim, and Danny,
 
Thank you for the information regarding API tokens and the planned API developments and enhancements.

If I may, I would like to move the conversation away from API tokens and towards API commands, their functionality, and SSO (OIDC).

We are in the final phase of testing Dataverse in our test environment.

We need to determine whether our researchers and Dataverse administrators will be able to use most of the API commands if we implement SSO (OIDC), for example:
  • curl http://localhost:8080/api/admin/externalTools
  • curl -H X-Dataverse-key:$API_TOKEN -X POST $SERVER_URL/api/dataverses/$DATAVERSE_ID/datasets/:startmigration --upload-file dataset-migrate.jsonld
  • curl -L -O -J -H "X-Dataverse-key:$API_TOKEN" $SERVER_URL/api/access/dataset/:persistentId/?persistentId=$PERSISTENT_ID

These curl commands are just examples of what we can expect Dataverse administrators and researchers to run after they connect to Dataverse via OIDC.

I should preface my comments by saying my experience with Dataverse is limited to only a test environment, not a production environment. Also, my experience with Dataverse is less than two years.

Thank you in advance.

Regards,

Richard Dennis

Philip Durbin

Aug 25, 2021, 9:24:44 AM
to dataverse...@googlegroups.com
Yep, those curl commands should work fine with OIDC accounts.

Please keep in mind that "/api/admin" commands are typically executed on the server itself (localhost) for security reasons. For more on this, please see https://guides.dataverse.org/en/5.6/installation/config.html#blocking-api-endpoints
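As a hedged illustration of that blocking configuration, the Python sketch below builds the two settings calls. It assumes the database settings are named `:BlockedApiEndpoints` and `:BlockedApiPolicy` and are set with a PUT to `/api/admin/settings/...` on localhost, as I understand the linked guide section. The values `admin,builtin-users` and `localhost-only` are example values to verify against the guide, and the helper name `build_setting_request` is hypothetical. The requests are only built, not sent.

```python
from urllib.request import Request


def build_setting_request(name: str, value: str,
                          admin_base: str = "http://localhost:8080") -> Request:
    """Build (without sending) a PUT that would set one database setting."""
    url = f"{admin_base}/api/admin/settings/{name}"
    # The new value is sent as the raw request body.
    return Request(url, data=value.encode("utf-8"), method="PUT")


# Assumed setting names and example values; check the guide for your version.
settings_requests = [
    build_setting_request(":BlockedApiEndpoints", "admin,builtin-users"),
    build_setting_request(":BlockedApiPolicy", "localhost-only"),
]
for req in settings_requests:
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```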

I hope this helps. Please keep the questions coming!

Phil
