2020-01-31 Notes from State Archives of Belgium Dataverse Meetup

Philip Durbin
Feb 4, 2020, 4:54:34 PM
to dataverse...@googlegroups.com
On Friday there was a fantastic meetup of six installations of Dataverse at the State Archives of Belgium and we took some notes (and a pic), which are below and linked from https://twitter.com/philipdurbin/status/1223334005006794752

Attendees:

* Phil Durbin (IQSS) @philipdurbin, philip...@harvard.edu
* Oliver Bertuch (FZJ) Twitter: @poi_ki_lo_therm, Mail: o.be...@fz-juelich.de
* Dorothea Iglezakis (Uni Stuttgart) Mail: dorothea....@ub.uni-stuttgart.de
* Patrick Vranckx (UCLouvain) (patrick...@uclouvain.be)
* Baptiste Rouxel (baptist...@sciencespo.fr), Sciences Po, CDSP
* Benjamin Peuch (State Archives of Belgium), Benjami...@arch.be
* Youssef Ouahalou (State Archives of Belgium), Youssef....@arch.be
* Freya De Schamphelaere (State Archives of Belgium), Freya.DeSc...@arch.be

Notes:

* EAD Encoded Archival Description (XML format) custom metadata block
* Re-evaluate metadata blocks, metadata display, help, and metadata sources: https://github.com/IQSS/dataverse/issues/6030
   * Related to https://github.com/IQSS/dataverse/issues/3404
* Validately usability study:
   * Dataset and File Redesign Testing ( https://validately.com/unmoderated/6716a601-2d91-11ea-a2f1-42010af00531 )
      * Jump into the prototype AJPS dataset page: https://sketch.cloud/s/VYGYr/a/ZonZ7Z/play
      * Jump into the prototype SBGrid dataset page: https://sketch.cloud/s/VYGYr/a/AMROE5/play
      * All pages:  https://sketch.cloud/s/VYGYr
   * Copied from Community Call 2020-01-14 https://docs.google.com/document/d/1rpLVU_booGE1gnrlFeli-iu-hyFrt08Hb9PKASs7X2k
* Title of dataset vs title of study (actually it’s just saying “Title”)
* Different variations in names
* Duplicate author to contact
* Date of description
* Subject and Language have a different UI picker in 4.19
* Terms of Use applicable. Concern that researchers will use Terms of Access too often.
* AUSSDA, Terms of Use: https://data.aussda.at/dataset.xhtml?persistentId=doi:10.11587/8VAV6W
* Why is CC0 the default?
   * (Phil) Here's a design doc about CC0 I mentioned: https://docs.google.com/document/d/10lQeIbfqgFd8JOBXYznxv2AMHFXxhSlYxyEaOoe00G0/edit?usp=sharing
      * In addition, here's what's documented in the guides: http://guides.dataverse.org/en/4.19/user/dataset-management.html#cc0-public-domain-dedication
   * Everyone in the room (5 installations in Belgium, France and Germany) pushes for CC-BY as the default. DaRUS has also hacked the properties to replace CC0 with CC-BY.
   * (Phil) Recently during sprint planning I suggested that we estimate https://github.com/IQSS/dataverse/issues/1753 about CC-BY, etc because the issue had 3 rows in my "aggregation" of installation boards at https://docs.google.com/spreadsheets/d/1akFp4NEJ_SgQJ9aaHAH020G0lYpM2GShs3vEnqEYy-U/edit?usp=sharing and there seemed to be some interest in delivering CC-BY and other CC license options as part of a "small chunk" before the entire dataset page redesign happens.
   * Maybe this can be a community contribution. Phil, should we give you a goodie bag to take home for Tania and Danny? (Kinda you mentoring the community)
* self curation model vs curated
   * Dataverse installation personas: https://github.com/IQSS/dataverse-installations/issues/16
   * new curation services offered by Harvard Dataverse: https://support.dataverse.harvard.edu/curation-services
* CoreTrustSeal
   * Do you bring up terms of non-compliance?
   * You can find a CoreTrustSeal badge on the new map: https://dataverse.org/installations
* SSI funding needs to benefit the UK
* DaRUS at Stuttgart (https://docs.google.com/presentation/d/1Dh7ONBYHYZHIJP8X13ij4HNDN-FI4quNijstCh7B02k/edit?usp=sharing)
   * (Phil) 4.17? Let's add to metrics.
   * command line tool to release datasets
   * 100s of datasets in a week (aeronautic simulations), need automation of metadata capture
   * 100 GB via API
   * Data is too big to move
   * ReplayClient on GitHub
   * ExtractIng (extracting metadata from files in engineering discipline) coming soon
   * Metadata Mapper on private GitHub
   * Engineering metadata
   * Process metadata
      * Methods, software, and tools
      * (Phil) Renku seems related for capturing what software is used, etc: https://github.com/IQSS/dataverse/issues/6592
   * Software metadata (Codemeta)
   * Hierarchical Metadata Schemes - Metadata4Ing
      * (Oliver) Jim mentioned a project where metadata files go into MongoDB, Clowder: https://github.com/IQSS/dataverse/issues/6497#issuecomment-572741689
   * Controlled vocabularies
      * https://github.com/IQSS/dataverse/issues/6154
   * Custom metadata blocks should have their fields mapped to export formats
   * Lists of fields might be nice. See also https://github.com/IQSS/dataverse/issues/6589#issuecomment-580230932
   * range queries for numerical data
      * (Phil) If you index the field as number, a Solr range query should work (this example is for dates): http://guides.dataverse.org/en/4.19/api/search.html#date-range-search-example
   * (Oliver) external tool for poor man's samples database
   * license for software (get a list from https://tldrlegal.com ?)
* https://dataverse.uclouvain.be
   * new installation
* (Patrick) NextCloud
* (Oliver) b2drop
* Baptiste
   * Import NESSTAR
      * (Oliver) You should talk to Slava about this. (Phil agrees.)
   * Using CESSDA controlled vocabularies
* https://chat.dataverse.org
* Phil's talking points: http://blog.greptilian.com/2020/01/31/dataverse-at-state-archives-of-belgium/
   * News from Dataverse:
      * DataverseTV (https://dataverse.org/dataversetv)
      * Dataverse Meetups: https://dataverse.org/events
   * Input wanted for https://dataverse.org/software-features (see the spreadsheet at the bottom)
* Oliver on Dataverse European Workshop 2020
   * 75 people
   * next one in Portugal probably
   * use cases were presented
   * Merce on Harvard Data Commons, FAIR
      * (Benjamin) There's a PowerPoint from DANS (Elly Dijk & Peter Doorn): https://slideshare.net/OSFair/monitoring-the-fairness-of-data-sets-introducing-the-dans-approach-to-fair-metrics
   * how to do sustainable open source projects
   * proceedings are coming
   * technical sessions hosted by Slava
   * Slava on a Docker approach
   * Oliver on considering Kubernetes
   * Stefan on pyDataverse, includes a Jupyter Notebook: https://github.com/AUSSDA/pyDataverse_demo_tromso/blob/master/pydataverse.ipynb
   * Jim on GDCC, S3 uploader, file previewers, BagIt support
* List of Dataverse forks (see “fork URL” column): https://docs.google.com/spreadsheets/d/1l2R9D1FQy88qVzg2bI6L1LgplmM2l7pnMI80jdiz4fk/edit?usp=sharing
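
On the "range queries for numerical data" item under DaRUS: here is a minimal sketch of how such a query might be put together for the Search API, following the same Solr range syntax as the date example in the guides. The base URL (the public demo site) and the field name `simulationTemperature` are assumptions for illustration only, and the helper function is hypothetical. As noted above, this should only work if the field is indexed as a number.

```python
from urllib.parse import urlencode


def build_range_search_url(base_url, field, low, high, query="*"):
    """Build a Search API URL restricted to an inclusive numeric range.

    `field` is a hypothetical metadata field name; it must be indexed
    as a number in Solr for the range filter to behave numerically.
    """
    # Solr range syntax: field:[low TO high]; square brackets are inclusive.
    filter_query = f"{field}:[{low} TO {high}]"
    # urlencode percent-escapes the brackets and colon and turns spaces into "+".
    params = urlencode({"q": query, "fq": filter_query})
    return f"{base_url.rstrip('/')}/api/search?{params}"


# Assumed installation URL and field name, for illustration only.
url = build_range_search_url(
    "https://demo.dataverse.org", "simulationTemperature", 100, 500
)
print(url)
```

The sketch only builds the URL and does not call the server, so it can be pasted into curl or a browser against whichever installation and field you actually use.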

--
brussels-meetup.jpg