Dataverse 5.0 Available

danny...@g.harvard.edu

Aug 20, 2020, 2:22:43 PM
to Dataverse Users Community
Hi everyone,

We're happy to announce the arrival of Dataverse 5! This was a true community effort, with many of you contributing code, bug reports, and other help as we put together this major release. We discussed a lot of the Dataverse 5 features at the Plenary Session of the Dataverse Community Meeting, but I'll post some quick highlights below. Remember you can also check out the detailed release notes for information about what's included in this release. 

Thank you again for all of your help with this release! On to 5.1! :)

------

Continued Dataset and File Redesign: Dataset and File Button Redesign, Responsive Layout

The buttons available on the Dataset and File pages have been redesigned. This change leaves room for future data access and exploration options and provides a consistent experience between the two pages. The Dataset and File pages have also been redesigned to be more responsive and to function better across multiple devices.

This is an important step in the incremental process of the Dataset and File Redesign project, following the release of on-page previews, filtering and sorting options, tree view, and other enhancements. Additional features in support of these redesign efforts will follow in later 5.x releases.

Payara 5

A major upgrade of the application server provides security updates and access to new features like the MicroProfile Config API, and will enable upgrades to other core technologies.

Note that moving from Glassfish to Payara will be required as part of the move to Dataverse 5.

Download Dataset

Users can now more easily download all files in a dataset through both the UI and API. If this causes server instability, it's suggested that Dataverse Installation Administrators take advantage of the new Standalone Zipper Service described below.

Download All Option on the Dataset Page

In previous versions of Dataverse, downloading all files from a dataset meant several clicks to select files and initiate the download. The Dataset Page now includes a Download All option for both the original and archival formats of the files in a dataset under the "Access Dataset" button.

Download All Files in a Dataset by API

In previous versions of Dataverse, downloading all files from a dataset via API was a two-step process:

  • Find all the database ids of the files.
  • Download all the files, using those ids (comma-separated).

Now you can download all files from a dataset (assuming you have access to them) via API by passing the dataset's persistent ID (a PID such as a DOI or Handle) or the dataset's database id. Versions are also supported: you can pass :draft, :latest, :latest-published, or version numbers (1.1, 2.0), similar to the "download metadata" API.
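As a rough illustration of the request described above, here is a minimal Python sketch that only builds the download URL. The `/api/access/dataset/...` path layout and the `persistentId` query parameter follow my reading of the Data Access API guide and should be checked against the guide for your version; the helper name `dataset_download_url` and the server address are hypothetical.

```python
from urllib.parse import urlencode


def dataset_download_url(server_url, dataset, version=None):
    """Build a URL for downloading all files in a dataset as one zip.

    `dataset` is either a database id (e.g. 42) or a PID string
    (e.g. "doi:10.5072/FK2/ABC123"). `version` is optional and may be
    ":draft", ":latest", ":latest-published", or a number such as "1.1".
    The path layout is an assumption; verify it against the API guide.
    """
    root = server_url.rstrip("/") + "/api/access/dataset"
    is_pid = not str(dataset).isdigit()
    # PIDs use a fixed path placeholder and travel in the query string.
    ident = ":persistentId" if is_pid else str(dataset)
    url = f"{root}/{ident}"
    if version is not None:
        url += f"/versions/{version}"
    if is_pid:
        url += "?" + urlencode({"persistentId": dataset}, safe=":/")
    return url


if __name__ == "__main__":
    # Hypothetical server and PID, for illustration only.
    print(dataset_download_url("https://dataverse.example.edu",
                               "doi:10.5072/FK2/ABC123", ":latest-published"))
```

The resulting URL can be fetched with any HTTP client. For files that are not public, an API token is normally sent in the `X-Dataverse-key` request header.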

A Multi-File, Zipped Download Optimization

In this release we are offering an experimental optimization for the multi-file, download-as-zip functionality. If this option is enabled, instead of enforcing size limits, we attempt to serve all the files that the user requested (and is authorized to download), but the request is redirected to a standalone zipper service running as a CGI executable. This moves these potentially long-running jobs completely outside the Application Server (Payara) and prevents service threads from becoming locked while serving them. Since zipping is also a CPU-intensive task, it is possible to run this service on a different host system, freeing up cycles on the main Application Server. The system running the service needs access to the database as well as to the storage filesystem and/or S3 bucket.

Please consult the scripts/zipdownload/README.md in the Dataverse 5 source tree.

The components of the standalone "zipper tool" can also be downloaded here:

https://github.com/IQSS/dataverse/releases/download/v5.0/zipper.zip

Updated File Handling

Files without extensions can now be uploaded through the UI. This release also changes the way Dataverse handles duplicate (filename or checksum) files in a dataset. Specifically:

  • Files with the same checksum can be included in a dataset, even if the files are in the same directory.
  • Files with the same filename can be included in a dataset as long as the files are in different directories.
  • If a user uploads a file to a directory where a file already exists with that directory/filename combination, Dataverse will adjust the file path and names by adding "-1" or "-2" as applicable. This change will be visible in the list of files being uploaded.
  • If the directory or name of an existing or newly uploaded file is edited in such a way that would create a directory/filename combination that already exists, Dataverse will display an error.
  • If a user attempts to replace a file with another file that has the same checksum, an error message will be displayed and the file will not be replaced.
  • If a user attempts to replace a file with a file that has the same checksum as a different file in the dataset, a warning will be displayed.
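The renaming rule for colliding directory/filename combinations can be sketched as follows. This is only an illustration of the behavior described in the list above, not Dataverse's actual implementation: the function name `resolve_upload_path` is hypothetical, and placing the "-1"/"-2" suffix before the file extension is an assumption.

```python
import posixpath


def resolve_upload_path(existing, directory, filename):
    """Return a (directory, filename) pair that is not already in `existing`.

    `existing` is a set of (directory, filename) pairs already in the
    dataset. On a collision, "-1", "-2", ... is appended to the name.
    Assumption: the suffix goes before the extension, if there is one.
    """
    stem, ext = posixpath.splitext(filename)
    candidate = filename
    n = 0
    while (directory, candidate) in existing:
        n += 1
        candidate = f"{stem}-{n}{ext}"
    return directory, candidate


if __name__ == "__main__":
    files = {("data", "results.csv"), ("data", "results-1.csv")}
    print(resolve_upload_path(files, "data", "results.csv"))
    print(resolve_upload_path(files, "docs", "results.csv"))
```

Note that only the directory/filename pair is compared here. Checksums play no part, which matches the rule that files with identical content may coexist in a dataset.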
Pre-Publish DOI Reservation with DataCite

Dataverse installations using DataCite will be able to reserve the persistent identifiers for datasets with DataCite ahead of publishing time. This allows the DOI to be reserved earlier in the data sharing process and makes the step of publishing datasets simpler and less error-prone.

Primefaces 8

Primefaces, the open source UI framework upon which the Dataverse front end is built, has been updated to the most recent version. This provides security updates and bug fixes and will also allow Dataverse developers to take advantage of new features and enhancements.



Thomas Jouneau

Aug 24, 2020, 8:02:22 AM
to dataverse...@googlegroups.com, danny...@g.harvard.edu

Dear Danny, all,

Thanks for this great news. Do you know if and when a Docker version will be available?

All the best,

Thomas

Philip Durbin

Aug 25, 2020, 12:09:51 PM
to dataverse...@googlegroups.com
Hi Thomas,

Docker versions of Dataverse are a community-led effort. The two primary repos to watch are:


You can read a little more about Docker from the Dataverse perspective at http://guides.dataverse.org/en/5.0/developers/containers.html

I hope this helps,

Phil

Vyacheslav Tikhonov

Sep 2, 2020, 3:52:09 PM
to Dataverse Users Community
Hi Thomas,

The Docker version has already been upgraded to 5.0, so you can try it.

Best,
Slava