Community Call on Tuesday 10/25!


danny...@g.harvard.edu

Oct 21, 2016, 5:09:42 PM
to Dataverse Users Community
Hi everyone - please join us for the community call on Tuesday! Call in information and the agenda can be found here:


If you have anything you'd like to get onto the agenda, please let me know, or just bring it up on Tuesday!

Philip Durbin

Oct 25, 2016, 12:46:29 PM
to dataverse...@googlegroups.com
Great call today! I'm going to try something new. Since I'm taking notes anyway, I'll just paste them below with a link to the Google doc so people can reply here or leave comments on the doc itself. Any errors in the notes are my own and I'm happy to correct them!

https://docs.google.com/document/d/1_lPMh68BV6y_IbN0U1ZiCM3CSCNM33PQT3uoPEeLrAI/edit?usp=sharing

2016-10-25 Dataverse Community Call

Agenda

* Goals, Roadmap, and Releases
* Usability Testing Volunteers
* Community Questions

Attendees

* Danny Brooke (IQSS)
* Len (IQSS)
* Gustavo (IQSS)
* Phil (IQSS)
* Julian (IQSS)
* Derek (IQSS)
* Sherry Lake (UVA)
* Slava Tikhonov (DANS)
* Eugene Barsky (UBC)
* Ryan Steans (TDL)
* Calvin Winkowski (VT)

Notes

* New page: http://dataverse.org/goals-roadmap-and-releases
   * Linked from "Software" in the header.
   * Three levels
      * Strategic goals
      * Big problems we're trying to solve.
      * Details on a particular release (Waffle board).
* Usability testing
   * (Danny) UX process involves testing before and after development. Derek and Julian will be in touch to gather feedback for planned features.
   * (Sherry) Sherry oversaw usability testing of UVa’s installation and will share those notes.
* Three questions/comments from UBC.
   * What about Handle support? 30,000 data files.
      * Good news. Issue 2437 is in QA at https://waffle.io/IQSS/dataverse
      * "Adding Handle System support" is listed under 4.6 at http://dataverse.org/goals-roadmap-and-releases
   * What about OAI?
      * What we rolled out in 4.5 works, but in 4.6 we are making sure our OAI-PMH implementation passes protocol tests. See issue 3307 at https://waffle.io/IQSS/dataverse
   * We'd like to integrate Open Science Framework (OSF) with Dataverse.
      * (Eugene) We know that OSF no longer works with DVN 3.x and that by upgrading to Dataverse 4, it will work.
      * (Sherry) If you put a dataset in the main (root) dataverse you can now see it but there's a performance problem when there are thousands and thousands of datasets in the root dataverse.
* Questions from Ryan Steans at TDL
   * At the community meeting, people were talking about their data being stored in data centers.
      * (Eugene) We have the opposite problem. Using Globus. We are taking files from our Dataverse and building a discovery platform to expose those datasets on the Globus publication (?). Using an OAI-PMH feed to expose datasets from Dataverse to Globus.
      * (Ryan) 22 institutions. Not aware of what everyone is using. Some are using iRODS. Can we get the metadata into Harvard?
      * (Gustavo) We want to figure out how to support things like Starfish and other big data platforms where you could tell Dataverse where this "trusted" storage is.
      * (Phil) As part of a three-year Helmsley grant, we are implementing rsync support, which allows computation on files stored on a cluster.
         * Data Capture Module (rsync support): https://github.com/IQSS/dataverse/issues/3145
         * File Import Batch job in support of rsync (file system crawler): https://github.com/IQSS/dataverse/issues/3353
* Questions from Sherry at UVA
   * Can we ask Harvard to run a certain harvesting script?
      * (Danny) We should have the ability to set up a harvesting set.
* Calvin from VT on AWS
   * Automated deployment of Dataverse to AWS: https://github.com/IQSS/dataverse-aws
   * (Phil) I'd love to use this for automated testing: https://github.com/IQSS/dataverse/issues/2746
* Eugene question on 2017 Dataverse Community meeting
   * (Danny) June 14-16 2017 in Cambridge, MA. A page will be linked from dataverse.org
* Slava question on Widgets, FAIR principles
   * Any plans to develop widgets further?
   * (Danny) Nothing on the roadmap.
   * (Gustavo) Once we have a file landing page, we plan to someday have widgets at the file level.
   * (Slava) Findable, Accessible, Interoperable, and Re-usable (FAIR) data principles: https://www.force11.org/fairprinciples
      * Could influence the design of widgets.
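
For readers unfamiliar with the OAI-PMH feeds discussed above, here is a minimal sketch of what a harvesting request looks like. The `verb`, `metadataPrefix`, and `set` parameters come from the OAI-PMH protocol itself; the base URL and set name are placeholders made up for illustration, not a real installation or harvesting set.

```python
from urllib.parse import urlencode


def oai_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    base_url is the OAI endpoint of the repository being harvested.
    set_spec optionally restricts the harvest to one named set.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and set name, for illustration only.
print(oai_list_records_url("https://dataverse.example.edu/oai", set_spec="example_set"))
```

A harvester would fetch this URL and follow any resumption token in the XML response to page through the remaining records.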


--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataverse-community+unsubscribe...@googlegroups.com.
To post to this group, send email to dataverse-community@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/281f0bcf-3638-41b6-ab8a-447940986ddd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--

Mercè Crosas

Oct 25, 2016, 7:01:29 PM
to dataverse...@googlegroups.com
Thanks, Phil, for the useful notes. Here are a few more references to projects of interest, relevant to the discussion on today's call:

Cloud Dataverse project with the Massachusetts Open Cloud (I'll share a few slides later this week): 

Structural Biology project, replicating data to remote sites to support local access to data and compute (last part of this talk):

FAIR principles paper published in Scientific Data:


Mercè Crosas, Ph.D.
Chief Data Science and Technology Officer, IQSS
Harvard University


Philip Durbin

Oct 26, 2016, 9:32:35 AM
to dataverse...@googlegroups.com
Thanks, Merce! The slides on the SBGrid collaboration are great and make me think that we should update the "rsync" feature at http://dataverse.org/goals-roadmap-and-releases to include much more information about what the feature is about:

In my "tradeoffs" slide mentioned at https://github.com/IQSS/dataverse/issues/3351 I talk about exciting new features:

- Directory structure preserved
- Preserved filenames on disk allow for computation
- Pause and resume uploads
- Round-trip checksum validation
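
To make the last item concrete, here is a rough sketch of round-trip checksum validation: hash the file before it is sent, hash the copy that arrives, and compare. This is an illustration only, not the code in the Data Capture Module; the choice of SHA-256, the function names, and the file names are all assumptions.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def round_trip_ok(source_path, received_path, algorithm="sha256"):
    """True when the received copy hashes to the same value as the source."""
    return file_checksum(source_path, algorithm) == file_checksum(received_path, algorithm)


# Demo: a local copy stands in for the upload (hypothetical file names).
with tempfile.TemporaryDirectory() as workdir:
    source = Path(workdir) / "source.dat"
    received = Path(workdir) / "received.dat"
    source.write_bytes(b"example payload")
    shutil.copyfile(source, received)
    print(round_trip_ok(source, received))
```

In a real transfer the first checksum would be computed on the depositor's machine and the second on the server, with only the hex digests exchanged.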

I also mention some (current) incompatibilities with existing features but I'd rather not dwell on that at the moment. We hope to address as many of these as possible before we ship that "rsync" release. :)

The other thing I'd want to put on the roadmap is an invitation to anyone who is interested in features of this nature to try out the code we're working on. The big issue that's in development is https://github.com/IQSS/dataverse/issues/3353 which is the "crawl over the filesystem and insert rows into the Dataverse database (postgres)" feature I was talking about during the community call. Once that issue becomes a pull request, I'd love to invite people to try out a pre-release war file so we can get early feedback. If you're interested in this sort of testing, please comment on that issue or otherwise get in touch!
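
For anyone wondering what "crawl over the filesystem and insert rows" means in practice, here is a toy sketch of the idea. It is not the code being developed in issue 3353: the `imported_file` table and its columns are invented for illustration, and an in-memory SQLite database stands in for PostgreSQL so the example runs on its own.

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def crawl(root):
    """Yield (relative_path, size_in_bytes) for every file under root."""
    root = Path(root)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            full = Path(dirpath) / name
            # Keeping the relative path preserves the directory structure.
            yield full.relative_to(root).as_posix(), full.stat().st_size


def import_files(conn, root):
    """Insert one row per crawled file into a hypothetical table; return the count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS imported_file (path TEXT PRIMARY KEY, size INTEGER)"
    )
    rows = list(crawl(root))
    conn.executemany(
        "INSERT OR REPLACE INTO imported_file (path, size) VALUES (?, ?)", rows
    )
    conn.commit()
    return len(rows)


# Demo with a throwaway directory tree (hypothetical file names).
with tempfile.TemporaryDirectory() as workdir:
    (Path(workdir) / "run1").mkdir()
    (Path(workdir) / "run1" / "frame.dat").write_bytes(b"12345")
    (Path(workdir) / "README.txt").write_bytes(b"abc")
    connection = sqlite3.connect(":memory:")
    print(import_files(connection, workdir))
    connection.close()
```

Using `INSERT OR REPLACE` keyed on the path means the crawl can be re-run over the same directory without creating duplicate rows.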

Finally, I didn't even realize that Anuj has been blogging away about the Mass Open Cloud (MOC) and Swift stuff! I just read his posts and left some handy links as a comment on the issue that seems most relevant: https://github.com/IQSS/dataverse/issues/2909#issuecomment-256345413

Phil
