Control over dataset file hierarchy + directory structure (new feature in Dataverse 4.12)


Philip Durbin

Apr 5, 2019, 7:55:15 AM
to dataverse...@googlegroups.com
As was previously announced[1], Dataverse 4.12 has a new feature called "File Path" that allows you to organize your files into folders.

https://demo.dataverse.org was upgraded to Dataverse 4.12 yesterday and I'd like to invite you to try it out and provide feedback either here or in https://github.com/IQSS/dataverse/issues/2249 or on Twitter or wherever. :)

I'm going to attach some screenshots to help get you started. The feature is documented here: http://guides.dataverse.org/en/4.12/user/dataset-management.html#file-path

I'm sure we'll be updating our comparative review of data repositories[2] to indicate that Harvard Dataverse (upgraded to 4.12 yesterday as well) now supports "Users are able to control dataset file hierarchy + directory structure".

To be clear, there are two ways to put your files into folders:

- For existing files, edit "File Path".
- For new files, upload a zip with the files already in folders. The folder names have been written to "File Path" since 4.6[3] and used in zip downloads since 4.11[4].
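
For the second route, here is a minimal sketch of packing files into a zip so that each entry carries its folder in the archive path. The file names, the contents and the helper `folder_of` are hypothetical and only illustrate the idea; the folder part of each entry is what one would expect to show up under "File Path" after upload, based on the description above.

```python
import io
import zipfile

# Hypothetical dataset layout: archive path -> file content.
files = {
    "README.txt": "Describes the dataset.\n",
    "data/raw/measurements.csv": "id,value\n1,3.2\n",
    "data/processed/summary.csv": "mean\n3.2\n",
    "code/analysis.py": "print('hello')\n",
}


def build_zip(files):
    """Return the bytes of a zip whose entries keep their folder paths."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, content in files.items():
            # Forward slashes in the entry name encode the folder hierarchy.
            archive.writestr(path, content)
    return buffer.getvalue()


def folder_of(path):
    """Folder part of an archive path; empty string for top-level files."""
    return path.rpartition("/")[0]


zip_bytes = build_zip(files)
with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
    entry_names = archive.namelist()

for name in entry_names:
    print(f"{name!r} -> folder {folder_of(name)!r}")
```

The sketch builds the zip in memory only to stay self-contained; in practice you would write it to disk and upload it through the usual file upload.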

I hope you're as excited about this as I am. :)

Thanks in advance for any feedback!

Phil
step1.png
step2.png
docs.png

Julian Gautier

Apr 5, 2019, 11:41:16 AM
to Dataverse Users Community
I'm sure we'll be updating our comparative review of data repositories[2] to indicate that Harvard Dataverse (upgraded to 4.12 yesterday as well) now supports "Users are able to control dataset file hierarchy + directory structure".

Philipp at UiT

Aug 21, 2019, 8:04:55 AM
to Dataverse Users Community
After our Dataverse installation was updated from 4.11 (or so) to 4.15.1, I have finally been able to test the File Path feature. Being able to control dataset file hierarchy + directory structure is really great! Thanks for implementing this! With this new feature in place, we can, e.g., be stricter in our recommendation against using zip files. So far, we have accepted twice-packed zip files in cases where the file hierarchy was important. With the new feature in place, zips can be replaced by File Path.

However, I wonder if it is possible to easily select one or multiple folders without selecting all files contained in a dataset. In the tree view there is no way to select folders/files as far as I can see.

Best, Philipp

Philip Durbin

Aug 21, 2019, 9:21:51 AM
to dataverse...@googlegroups.com
I like the idea of being able to select which folders to download and you should feel free to open an issue about this.

I'm curious if you (or others) have any ideas about how it would look. Attached is a mockup I quickly put together (using Chrome dev tools) where I'm playing around with adding checkboxes and a "download" button to the tree view.

Phil

--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataverse-commu...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/0ef21dcd-d670-4891-827b-7a6b6231a0a1%40googlegroups.com.
Screen Shot 2019-08-21 at 9.17.35 AM.png

Philipp at UiT

Aug 22, 2019, 1:00:44 AM
to Dataverse Users Community
The mockup looks nice. I was also wondering whether we could merge the table and the tree view. If all download options (i.e. on file, folder and dataset level, as well as combinations of them) are available in one single view, I think this would make it easier for users.

Philipp

Sherry Lake

Aug 22, 2019, 12:19:54 PM
to Dataverse Users Community
I agree with Philipp: there need to be download buttons or check boxes to select files to download from the "Tree view". Phil, your mockup just has check boxes for directories (which is also useful), but I was thinking about checks all-the-way-down.

Danny Brooke

Aug 22, 2019, 12:37:16 PM
to Dataverse Users Community
Sherry, Philipp, thanks for the feedback. 

We discussed providing a single, unified view instead of the table/tree toggle but it would be a bit challenging to provide a good experience and we had concerns about scalability. This isn't as big of an issue with a download use case, but as we add additional external tools and functionality at the file level, and support different ways of accessing files, it would get crowded very quickly in a tree view. The table view provides the flexibility to provide more of these file-level options. 

Philipp at UiT

Aug 24, 2019, 4:23:51 AM
to Dataverse Users Community
Thanks for clarifying this, Danny.

I just realized that it has been some time since I last tried to download all files in a dataset with more files than are displayed in one table view, i.e. more than the maximum of 50 files. When I tried this now, the first thing I wondered was "where is the download all button?". I then decided to select the box to the left of "1 to X of Y files". A new pop-up then appeared with the text "Select all Z files in this dataset". But I wonder, wouldn't it be more convenient to have a download all button to start with? I guess this has been discussed before, e.g. on GitHub (see #4051), but I think other users also might look for a download all or select all button in datasets with a lot of files.

Philipp at UiT

Aug 24, 2019, 4:37:44 AM
to Dataverse Users Community
Another issue I came across is the sorting of the files. I'm currently curating a dataset with more than 100 files. The researcher wanted to use double-zipped files, but we suggested zipping a file/folder structure and uploading a simple zip. However, in the table view the folders and files are sorted so that files contained in folders are listed first. So all the files in data folders are listed first. But we wanted the README file, which was not contained in any folder, to be listed at the top and not in position 113, because we do not want users to have to scroll to the very end of the file list in order to download only the README file. We named the README file "_README.txt", but due to the sorting behavior this doesn't help. I guess configuring the default sorting to list files before folders would help. But do we want this default sorting? Is there another way to make the README file appear on top of all files? I could of course create two main folders, "_DOCUMENTATION" and "DATA", but I still would have to add a _ in the documentation folder name in order for the README file to appear on top.

Philip Durbin

Aug 26, 2019, 7:16:20 AM
to dataverse...@googlegroups.com
It looks ugly, but if you rename your README.txt file to 0README.txt it will be displayed above a folder called "data" or whatever. There's some related discussion in these issues:

- Allow installations to determine order of files on Dataset page: https://github.com/IQSS/dataverse/issues/4959
- Add ability to reorder dataset files as draggable elements: https://github.com/IQSS/dataverse/issues/5280

Please feel free to comment on these issues or open a new one.

You are also welcome to open an issue about your idea for a "Download All" button. I'm not sure whether you mean it would sit right next to the "Download" button or replace it somehow. You could propose something in an issue. :)

I hope this helps,

Phil


Philipp at UiT

Aug 26, 2019, 12:19:56 PM
to Dataverse Users Community
Thanks, Phil. Actually, I had already tried to get the README file on top of the list by renaming it. I thought an initial "_" would do the trick, but it didn't. Now I know that "0" works fine :)
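
One possible explanation for the difference between "_" and "0", sketched here under a loud assumption: that the file list is ordered by a plain character-code comparison of folder plus file name. The thread does not say how Dataverse actually sorts, and the folder names below are hypothetical. Under that assumption, digits come before uppercase letters, which come before "_", which comes before lowercase letters.

```python
# Assumption: ordering by raw character code, which is what Python's
# default string comparison does. This is NOT confirmed to be the
# ordering Dataverse uses; it only illustrates why "0" can win
# where "_" does not.
paths = [
    "_README.txt",   # leading underscore, code 95
    "DATA/a.csv",    # hypothetical folder in uppercase, "D" is code 68
    "0README.txt",   # leading digit zero, code 48
    "data/b.csv",    # hypothetical folder in lowercase, "d" is code 100
]

ordered = sorted(paths)

for position, path in enumerate(ordered, start=1):
    print(position, path, "first character code:", ord(path[0]))
```

With these sample names, "0README.txt" comes first, while "_README.txt" lands after the uppercase "DATA" folder and before the lowercase "data" folder.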

I just submitted an issue about the Download all files feature in GitHub; cf. #6118.

Best, Philipp

Philipp at UiT

Aug 27, 2019, 8:34:25 AM
to Dataverse Users Community
I just discovered that in the tree view the first folder is always opened by default. I think it would help users to get an overview if all folders were closed by default.

Philip Durbin

Aug 27, 2019, 9:10:30 AM
to dataverse...@googlegroups.com
Here's where this is controlled:

// all the folders, except for the top-level root node
// are collapsed by default:
currentNode.setExpanded(expandFolders);



You are, of course, welcome to open an issue for this decision to be revisited. Perhaps it could be a configuration option.

Thanks for the feedback,

Phil

p.s. Thanks for creating the issue about "download all".



Philipp at UiT

Aug 28, 2019, 6:31:10 AM
to Dataverse Users Community
Thanks, Phil! I have just created an issue (#6122) about this.

Best, Philipp