Shibboleth integration + deposit workflows

Amber Leahey

Aug 30, 2016, 11:31:43 AM
to Dataverse Users Community
Hi Dataverse folks, 

Rather than report an issue I thought I'd start here to flesh out what we are thinking with moving ahead using Shibboleth. 

Firstly, we want to use Dataverse and Shibboleth and we are glad Dataverse is now a service provider. This will certainly make things easier for people to manage their accounts by having a single sign-on. 

We are currently in the process of upgrading our DV 3.6 to version 4.5 and we aren't going to use Shibboleth just yet, but we hope to work with you to develop some potentially new features using Shibboleth.

We were thinking we might want to test Shibboleth using the new CAF "Research and Scholarship" entity profile later this fall. At the moment not all our schools are set up with Shibboleth; some use EZProxy instead, and others aren't yet exposing all the necessary elements as an identity provider (i.e. not fully implemented yet).

Aside from all this, we would like to see some additional uses of Shibboleth in the Dataverse system, beyond single sign-on:

- integration with the IP Groups setup / permissions for data access;
- integration with the user account information and "affiliation" field (e.g. automatically populate based on Shibboleth identity);
- user affiliation tied to specific Dataverse(s) for deposit, if desired (e.g. sending people to their institutional dataverse in a hosted Dataverse network).

For now we are going to implement this workflow using a custom drop-down menu under the user account information (see attached screenshot).

Once affiliated, upon login, users are directed to the identified institutional dataverse, where they will see their own dataverse and datasets, and others from their institution. 

Not sure if others are interested in this or not, so I thought I'd put it out there and help to explain some of our experiences using Shibboleth as a service provider. 

Amber Leahey, Scholars Portal


Philip Durbin

Sep 2, 2016, 3:41:13 PM
Hi Amber,

Thanks, this helps me understand how you'd like to see the Shibboleth feature evolve in the future. At Harvard, now that we have "Federated Login Mode" enabled as of Dataverse 4.5[1], we've completed all the "phase 1" work that was spec'ed out originally in the "Remote Authentication" requirements doc[2]. That doc contains a "phase 2" section that we'd like to eventually turn into GitHub issues and I'd like to treat your (fantastic!) ideas the same way. That is to say, let's at least make sure there are GitHub issues tracking the various ideas and improvements the community would like to see around the Shibboleth feature and authentication in general. Then we'll just need to find time to work on them. :)

I'm getting the impression that Scholars Portal might be willing to help write some code to develop some of the features you want, which is great! The "phase 1" work was good enough for Harvard for now, but maaaybe we'd like some of your suggested features as well. :)

It's a bummer that not all of your institutions support Shibboleth yet, but it sounds like they're working on it. Good, good.

Someday we'd like to make Dataverse more "forgiving" about not receiving names and email addresses from Identity Providers (no GitHub issue for this yet) but for now, you're on the right track with encouraging institutions to release the Research & Scholarship attribute bundle. More on this at

Can you describe more how you see IP Groups and Shibboleth integrated? We're actually working on a "groups within groups" issue at but I'm sitting here wondering myself if this would help or not. Shibboleth Groups and IP Groups are stored in different database tables but maybe we could get some sort of "groups within groups" thing working, a Shib group and an IP Group within an explicit group or something.
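
To make the "groups within groups" idea concrete, here is a minimal sketch in Python (not Dataverse's actual Java code or schema). It assumes an explicit group that simply holds other groups and treats a request as a member if any nested group matches. The class names, the `Request` fields, the example entity ID, and the address range are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass(frozen=True)
class Request:
    """Hypothetical view of a visitor: client IP plus the IdP used, if any."""
    ip: str
    idp_entity_id: Optional[str] = None  # only set for Shibboleth logins


@dataclass
class IpGroup:
    """Matches requests whose IP falls inside any of the CIDR ranges."""
    cidr_ranges: List[str]

    def contains(self, request: Request) -> bool:
        addr = ip_address(request.ip)
        return any(addr in ip_network(cidr) for cidr in self.cidr_ranges)


@dataclass
class ShibGroup:
    """Matches requests authenticated through one identity provider."""
    idp_entity_id: str

    def contains(self, request: Request) -> bool:
        return request.idp_entity_id == self.idp_entity_id


@dataclass
class ExplicitGroup:
    """A group of groups: membership in any nested group is enough."""
    members: List[object] = field(default_factory=list)

    def contains(self, request: Request) -> bool:
        return any(group.contains(request) for group in self.members)


# Example values are made up (documentation IP ranges, example.edu IdP).
campus = ExplicitGroup([
    IpGroup(["192.0.2.0/24"]),
    ShibGroup("https://idp.example.edu/idp/shibboleth"),
])

on_campus = Request(ip="192.0.2.17")
off_campus_sso = Request(
    ip="203.0.113.5",
    idp_entity_id="https://idp.example.edu/idp/shibboleth",
)
stranger = Request(ip="203.0.113.5")

print(campus.contains(on_campus), campus.contains(off_campus_sso), campus.contains(stranger))
```

With a structure like this, a permission could be granted once to the explicit group, and a user would qualify either by being on the campus network or by logging in through the institution's identity provider. Whether that matches what Scholars Portal has in mind is exactly the open question.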

We already populate "Affiliation" for users with the value (if any) that comes from the Identity Provider. The docs don't seem to explain how this works (whoops) but you can find some details in . It sounds like you want to go way beyond simply persisting the string "Harvard" or whatever for each user, which is all that happens now. Mostly we use this affiliation field to pre-populate the affiliation of an author when creating a dataset. Maybe it's used in other places I'm not aware of.

Your affiliation ideas actually remind me of which may become a pain point for us if lots of the 200+ institutions that can log in to the Harvard Dataverse each want an institution-wide group, which currently must be created manually. Working on this issue would force us to create a new database table to record all these institutions (I think) which might help in general with your affiliation ideas. I guess there'd be an (optional?) relationship between each of the institutions and a dataverse... or maybe this would be stored at the user level and even non-Shibboleth users could specify which dataverse they want as their homepage when they log in? I dunno.
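
One way to picture the two options (a dataverse tied to the institution, or a homepage chosen per user) is the rough Python sketch below. It is not a proposal for the real schema: the `Institution` and `User` records, the `landing_alias` function, the `"root"` fallback alias, and the sample data are all assumptions made up for illustration. It checks a user-level preference first, then the institution's optional dataverse, then falls back to the root.

```python
from dataclasses import dataclass
from typing import Dict, Optional

ROOT_ALIAS = "root"  # hypothetical alias for the top-level dataverse


@dataclass(frozen=True)
class Institution:
    """One row of the imagined institutions table."""
    name: str
    dataverse_alias: Optional[str] = None  # optional link to a dataverse


@dataclass
class User:
    username: str
    affiliation: Optional[str] = None     # e.g. the string from the IdP
    homepage_alias: Optional[str] = None  # user-level override, any login type


def landing_alias(user: User, institutions: Dict[str, Institution]) -> str:
    """Pick the dataverse to show after login."""
    # 1. A preference stored on the user wins, Shibboleth or not.
    if user.homepage_alias:
        return user.homepage_alias
    # 2. Otherwise use the dataverse linked to the user's institution, if any.
    institution = institutions.get(user.affiliation) if user.affiliation else None
    if institution is not None and institution.dataverse_alias:
        return institution.dataverse_alias
    # 3. Fall back to the root dataverse.
    return ROOT_ALIAS


# Made-up sample data.
institutions = {
    "Example University": Institution("Example University", "exampleu"),
    "Other College": Institution("Other College"),  # no dataverse linked
}

print(landing_alias(User("amy", "Example University"), institutions))
print(landing_alias(User("bob", "Other College"), institutions))
print(landing_alias(User("cat", "Example University", "cats-lab"), institutions))
```

Keeping the institution-to-dataverse link optional means institutions without their own dataverse still work, and the user-level field covers the non-Shibboleth case mentioned above.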

Anyway, thanks for kicking off this discussion. Interesting stuff! Let's decide what we want to build and when. :)



