pyDataverse and the 2019-05-07 Dataverse Community Call


Philip Durbin

May 7, 2019, 10:48:54 AM
to dataverse...@googlegroups.com
In a little over an hour, during the 2019-05-07 Dataverse Community Call, Stefan Kasberger from the Austrian Social Science Data Archive (AUSSDA) will be presenting a new Python client for Dataverse he has developed called pyDataverse.

Stefan is eager for feedback and would like to discuss how others can contribute.

To call in, please see "connection details" at https://dataverse.org/community-calls

Sorry for the short notice. Hope to see you there!

Thanks,

Phil

Philip Durbin

May 8, 2019, 10:01:01 AM
to dataverse...@googlegroups.com
Great call and a special thanks to Stefan for a fabulous presentation on pyDataverse! Here are the notes from https://docs.google.com/document/d/1nxvurGyVzxYKna-h0Qzl7emk6RoHELFFlwdPw7MESnI/edit?usp=sharing

2019-05-07 Dataverse Community Call

Agenda

* #dataverse2019
* pyDataverse https://github.com/AUSSDA/pyDataverse
* Community Questions

Attendees

* Danny Brooke (IQSS)
* Tania Schlatter (IQSS)
* Julian Gautier (IQSS)
* Phil Durbin (IQSS)
* Jim Myers (QDR)
* Sherry Lake (UVA)
* Paul Boon (DANS)
* Slava (DANS)
* Stefan Kasberger (AUSSDA)
* Jon Crabtree (Odum)
* Jamie Jamison (UCLA)

Notes

* #dataverse2019
   * (Danny) Registration Open https://projects.iq.harvard.edu/dcm2019
   * (Danny) Proposal review and final schedule this week or early next
   * (Jon) You can take advantage of a discount if you join the GDCC: http://dataversecommunity.global
* pyDataverse ( https://github.com/AUSSDA/pyDataverse )
   * (Stefan) I'm a devop, pretty new with Dataverse, background in data science. The idea of pyDataverse started because I had two projects: 1) NESSTAR migration. 2) Duplicating data from one Dataverse installation to another via API. So I created a Python module. It's all open source, MIT licensed. I see two main use cases: data pipelines and microservices. The central functionality should work already. I'm writing tests, using Travis. I may be releasing as soon as tomorrow on PyPI. I'm using curl for uploading files but will work on moving this into Python. Plan: Migrating data coming from GESIS DSpace export, maybe importing DDI XML. A fixed activity is to develop a NESSTAR migration microservice as part of the Horizon 2020 project called DataverseSSHOC. For this, we want to set up a microservice with this Python module at its core to make migration from NESSTAR easy. A central functionality for this will be to create a data model for the data and metadata used by Dataverse, which can then be used to import from and export to other sources and to upload directly to the Dataverse API. One output format is what I call a "DVTree" (Dataverse Tree), a predefined structure of directories and files with defined naming conventions and content, so you can export and import between this local file hierarchy and the Python objects.
   * (Danny) You mentioned you wanted to discuss this on June 18th in Cambridge. Is that still the case?
   * (Stefan) Yes, I'll arrive on the 17th in the evening. I'd love to hack around a little bit.
      * (Phil) It's super exciting that we may someday have a Dataverse package on PyPI! With regard to hacking around on the days before the community meeting, everyone should add their ideas to the "#Dataverse2019 - June 17th and 18th" doc at https://docs.google.com/document/d/19y2H_3fvHmni56JDucOHIYtYFbZDuZ-b2p37XvKdvjg/edit?usp=sharing
   * (Slava) We're also interested in migrating datasets from NESSTAR to Dataverse. Can we add RESTful API support? I'm thinking about the extension of pyDataverse with functionality to add external controlled vocabularies. The pyDataverse module could be deployed on the Cloud as an application for clients to use. Swagger doc for this application would be helpful.
* Community Questions
   * (Jamie) How to proceed moving the Social Science Data Archive hosted at Harvard Dataverse to UCLA Dataverse.
      * (Danny) Will investigate and share more info with this group soon.
   * (Jamie) Questions about configuring the login for people not affiliated with our institution (UCLA).
      * (Sherry) I think TDL has done this. https://dataverse.tdl.org - Here’s the login page for TDL: https://dataverse.tdl.org/loginpage.xhtml;jsessionid=1c6c03bd862c88deee9acfcf7057?redirectPage=%2Fdataverse.xhtml
      * (Danny) Will connect you with the group at NTU that (I believe) has new signups turned off https://researchdata.ntu.edu.sg/
   * (Jamie) Can we have people log in and land in a specific dataverse?
      * (Phil) This has definitely been asked for. Let me dig a bit. (Time passes.) Amber from Scholars Portal asked for something similar in her "Shibboleth integration + deposit workflows" post at https://groups.google.com/d/msg/dataverse-community/HtbqpVIa-SU/xo3zIVSPAQAJ . Also Courtney from TDL asked about a similar feature at https://github.com/IQSS/dataverse/issues/3923 .
   * (Jamie) Any connections to ORCID apart from using it as a login?
      * (Danny) There have been discussions but nothing specific has been planned.
      * (Phil) See also ORCID issues https://github.com/IQSS/dataverse/issues/3490 and https://github.com/IQSS/dataverse/issues/4236 and https://github.com/IQSS/dataverse/issues/3622 and https://github.com/IQSS/dataverse/issues/3048
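
For anyone curious what Stefan's second use case (duplicating a dataset from one Dataverse installation to another via API) might look like at the HTTP level, here is a minimal sketch. It is not pyDataverse's code or API. It only builds the two requests against the Dataverse native API using the Python standard library, and it assumes that the JSON returned for a dataset carries its metadata under data.latestVersion.metadataBlocks. The host names, the "root" alias, the tokens, and all function names are hypothetical. Files are not copied.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_KEY_HEADER = "X-Dataverse-key"  # header the native API reads the API token from


def build_export_request(base_url, persistent_id, api_token):
    """Build (but do not send) a GET request for a dataset's JSON on the source."""
    query = urlencode({"persistentId": persistent_id})
    url = f"{base_url.rstrip('/')}/api/datasets/:persistentId/?{query}"
    return urllib.request.Request(url, headers={API_KEY_HEADER: api_token}, method="GET")


def to_create_payload(export_response):
    """Reshape the exported JSON into a body for dataset creation.

    Assumption: the metadata lives under data.latestVersion.metadataBlocks.
    """
    blocks = export_response["data"]["latestVersion"]["metadataBlocks"]
    return {"datasetVersion": {"metadataBlocks": blocks}}


def build_create_request(base_url, dataverse_alias, payload, api_token):
    """Build (but do not send) a POST request that creates the dataset on the target."""
    url = f"{base_url.rstrip('/')}/api/dataverses/{dataverse_alias}/datasets"
    headers = {API_KEY_HEADER: api_token, "Content-Type": "application/json"}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    # Hypothetical hosts, DOI, and tokens; nothing is sent over the network here.
    export_req = build_export_request(
        "https://source.example.org", "doi:10.5072/FK2/EXAMPLE", "source-token"
    )
    print(export_req.get_method(), export_req.full_url)

    # Stand-in for the JSON the source installation would return.
    fake_response = {
        "status": "OK",
        "data": {"latestVersion": {"metadataBlocks": {"citation": {"fields": []}}}},
    }
    create_req = build_create_request(
        "https://target.example.org", "root", to_create_payload(fake_response), "target-token"
    )
    print(create_req.get_method(), create_req.full_url)
```

To actually run the copy you would pass each request to urllib.request.urlopen and feed the parsed JSON of the first response into to_create_payload. Keeping the request building separate from the sending makes the sketch easy to test without a live installation.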