Hi Amber,
Thanks, this helps me understand how you'd like to see the Shibboleth feature evolve in the future. At Harvard, now that we have "Federated Login Mode" enabled at
https://dataverse.harvard.edu as of Dataverse 4.5[1], we've completed all the "phase 1" work that was spec'ed out originally in the "Remote Authentication" requirements doc[2]. That doc contains a "phase 2" section that we'd like to eventually turn into GitHub issues and I'd like to treat your (fantastic!) ideas the same way. That is to say, let's at least make sure there are GitHub issues tracking the various ideas and improvements the community would like to see around the Shibboleth feature and authentication in general. Then we'll just need to find time to work on them. :)
I'm getting the impression that Scholars Portal might be willing to help write some code to develop some of the features you want, which is great! The "phase 1" work was good enough for Harvard for now, but maaaybe we'd like some of your suggested features as well. :)
It's a bummer that not all of your institutions support Shibboleth yet, but it sounds like they're working on it. Good, good.
Someday we'd like to make Dataverse more "forgiving" about not receiving names and email addresses from Identity Providers (no GitHub issue for this yet) but for now, you're on the right track with encouraging institutions to release the Research & Scholarship attribute bundle. More on this at
http://guides.dataverse.org/en/4.5/installation/shibboleth.html#identity-federation
Can you describe more how you see IP Groups and Shibboleth integrated? We're actually working on a "groups within groups" issue at
https://github.com/IQSS/dataverse/issues/3273 but I'm sitting here wondering myself if this would help or not. Shibboleth Groups and IP Groups are stored in different database tables but maybe we could get some sort of "groups within groups" thing working, such as a Shib group and an IP Group within an explicit group or something.
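To make the "groups within groups" idea a bit more concrete, here is a rough Python sketch. It is not Dataverse code and does not reflect how the groups are actually stored or evaluated; every class name, field, entity ID, and IP range below is made up for illustration. It only shows the shape of the idea: an explicit group that counts someone as a member if they are listed directly or if they match any nested Shib group or IP Group.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass
class Request:
    """Hypothetical view of a login: who, from where, via which IdP."""
    user: str
    ip: str
    idp_entity_id: Optional[str] = None  # None for non-Shibboleth users


@dataclass
class IpGroup:
    """Matches requests coming from any of the listed CIDR ranges."""
    cidr_ranges: List[str]

    def contains(self, req: Request) -> bool:
        addr = ip_address(req.ip)
        return any(addr in ip_network(r) for r in self.cidr_ranges)


@dataclass
class ShibGroup:
    """Matches users who logged in through one Identity Provider."""
    idp_entity_id: str

    def contains(self, req: Request) -> bool:
        return req.idp_entity_id == self.idp_entity_id


@dataclass
class ExplicitGroup:
    """Directly listed users plus any nested groups (the assumed new part)."""
    users: List[str] = field(default_factory=list)
    subgroups: list = field(default_factory=list)

    def contains(self, req: Request) -> bool:
        if req.user in self.users:
            return True
        return any(g.contains(req) for g in self.subgroups)


# Made-up institution: on-campus IP range OR login via the campus IdP.
campus = ExplicitGroup(
    users=["guest-curator"],
    subgroups=[
        IpGroup(["192.0.2.0/24"]),
        ShibGroup("https://idp.example.edu/idp/shibboleth"),
    ],
)

on_campus = Request(user="alice", ip="192.0.2.17")
off_campus_shib = Request(
    user="bob", ip="203.0.113.5",
    idp_entity_id="https://idp.example.edu/idp/shibboleth",
)
outsider = Request(user="carol", ip="203.0.113.9")
```

With something like this, permissions would be assigned once to the explicit group, and the Shib group and IP Group would just be two ways of ending up in it. Whether that maps cleanly onto the separate database tables is exactly the open question.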
We already populate "Affiliation" for users with the value (if any) that comes from the Identity Provider. The docs don't seem to explain how this works (whoops) but you can find some details in
https://github.com/IQSS/dataverse/issues/1497 . It sounds like you want to go way beyond simply persisting the string "Harvard" or whatever for each user, which is all that happens now. Mostly we use this affiliation field to pre-populate the affiliation of an author when creating a dataset. Maybe it's used in other places I'm not aware of.
Your affiliation ideas actually remind me of
https://github.com/IQSS/dataverse/issues/1403 which may become a pain point if lots of the 200+ institutions that can log in to the Harvard Dataverse each want an institution-wide group, which currently must be created manually. Working on this issue would force us to create a new database table to record all these institutions (I think) which might help in general with your affiliation ideas. I guess there'd be an (optional?) relationship between each of the institutions and a dataverse... or maybe this would be stored at the user level and even non-Shibboleth users could specify which dataverse they want as their homepage when they log in? I dunno.
Anyway, thanks for kicking off this discussion. Interesting stuff! Let's decide what we want to build and when. :)
Thanks,
Phil
1. https://groups.google.com/d/msg/dataverse-community/dfMrGAKLCHA/a9x-Ey3FBAAJ
2. https://docs.google.com/document/d/1vcAmo2nkFYavAr7OwwXzxM0IFQbkRZYZrrX43q-wqGE/edit?usp=sharing