Downloading metadata


Jonathan Bohan

Feb 7, 2017, 9:27:45 AM
to Dataverse Users Community
Hello,

I'm fairly new to Dataverse, and I was wondering if there is a method for downloading all the metadata entered into a dataverse, published or unpublished, in order to check for consistency among the data entered. I'm having difficulty finding information on this in the user guide, and I'm not sure if I'm just using the wrong terminology.

Thanks in advance for your replies.

- Jonathan Bohan

Philip Durbin

Feb 7, 2017, 10:00:31 AM
to dataverse...@googlegroups.com
Hi Jonathan,

Welcome to the Dataverse community! Here are some pointers to get you started:

- "Note that once a dataset has been published its metadata may be exported. A button on the dataset page’s metadata tab will allow a user to export the metadata of the most recently published version of the dataset. Currently supported export formats are DDI, Dublin Core and JSON." http://guides.dataverse.org/en/4.6/user/dataset-management.html#supported-metadata
- http://guides.dataverse.org/en/4.6/admin/metadataexport.html

Please let us know if you have any questions!

Phil

--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataverse-community+unsub...@googlegroups.com.
To post to this group, send email to dataverse-community@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/255809a1-bd5e-4f2a-a351-0356f5d09cbc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.




Jonathan Bohan

Feb 7, 2017, 10:08:26 AM
to Dataverse Users Community, philip...@harvard.edu
Philip,

Thank you for your thoughtful reply! Yes, I see that I can download published dataverse metadata one study at a time. What I'm actually looking for is a way to download ALL the metadata, published and unpublished, for all the datasets that I've entered. An Excel file or CSV file would do nicely. I'd like to be able to ensure that I'm entering metadata consistently across all datasets, and that's not easy to do looking at individual datasets one at a time.

Best regards,

Jonathan

Philip Durbin

Feb 7, 2017, 11:31:35 AM
to dataverse...@googlegroups.com
I see. This API call will at least write the metadata to the filesystem, but I think it only works for published datasets: http://guides.dataverse.org/en/4.6/admin/metadataexport.html#batch-exports-through-the-api

I didn't work on this code myself so I'm not sure. The export button in the GUI grabs the exported file from the filesystem. You could script this, since it's an API call. Here's an example of the three formats:

- https://dataverse.harvard.edu/api/datasets/export?exporter=dcterms&persistentId=doi:10.7910/DVN/WEQKOZ
- https://dataverse.harvard.edu/api/datasets/export?exporter=ddi&persistentId=doi:10.7910/DVN/WEQKOZ
- https://dataverse.harvard.edu/api/datasets/export?exporter=dataverse_json&persistentId=doi:10.7910/DVN/WEQKOZ
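
If you wanted to script this, a minimal sketch might look like the following. It only rebuilds the three URLs above from their parts; the function names (export_url, fetch_export) are made up for illustration and are not part of Dataverse, and the fetch helper assumes the endpoint answers a plain GET without authentication, which I have only seen for published datasets.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Exporter names taken from the three example URLs in this thread.
EXPORTERS = ("dcterms", "ddi", "dataverse_json")


def export_url(base_url, exporter, persistent_id):
    """Build an export URL shaped like the examples above (hypothetical helper)."""
    query = urlencode(
        {"exporter": exporter, "persistentId": persistent_id},
        safe=":/",  # keep the DOI readable, as in the example URLs
    )
    return f"{base_url}/api/datasets/export?{query}"


def fetch_export(base_url, exporter, persistent_id, timeout=30):
    """Download one exported metadata document as bytes (assumes a plain GET works)."""
    with urlopen(export_url(base_url, exporter, persistent_id), timeout=timeout) as response:
        return response.read()


# Print the URLs only; nothing is downloaded here.
for name in EXPORTERS:
    print(export_url("https://dataverse.harvard.edu", name, "doi:10.7910/DVN/WEQKOZ"))
```

To cover a whole dataverse you would loop fetch_export over your own list of persistent IDs; where that list comes from is up to you and is not shown here.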

The metadata is a tree so it's not easily represented in an Excel or CSV file. That said, I'm seeing "as_csv" and "as_excel" here, for example: https://services.dataverse.harvard.edu/static/swagger-ui/index.html?url=/miniverse/metrics/v1/swagger.yaml#!/metrics_-_datasets/get_datasets_count_by_subject . I don't know much about the miniverse code either, but I think the "as_csv" method is here: https://github.com/IQSS/miniverse/blob/0.3/dv_apps/metrics/stats_view_base.py#L198 . That "miniverse" app is designed to sit on top of a Dataverse 4 database to provide read-only metrics.
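
To illustrate what flattening a tree into rows could look like, here is a rough sketch that turns any nested JSON-like structure into one CSV row per leaf value. This is not how miniverse or Dataverse does it; the flatten and to_csv names and the sample record are invented for the example, and the real dataverse_json layout is more deeply nested than this.

```python
import csv
import io


def flatten(node, prefix=""):
    """Yield (path, value) pairs for every leaf of a nested dict/list."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{prefix}.{key}" if prefix else str(key)
            yield from flatten(value, child)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from flatten(value, f"{prefix}[{index}]")
    else:
        yield prefix, node


def to_csv(records):
    """Turn {dataset id: nested metadata} into CSV text, one row per leaf."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["dataset", "path", "value"])
    for dataset_id, metadata in records.items():
        for path, value in flatten(metadata):
            writer.writerow([dataset_id, path, value])
    return buffer.getvalue()


# Hypothetical record, only to show the shape of the output.
sample = {"doi:10.7910/DVN/WEQKOZ": {"title": "Example", "authors": [{"name": "A. Author"}]}}
print(to_csv(sample))
```

A long "path, value" layout like this opens in Excel and makes it easier to sort by field and spot inconsistent entries across datasets, at the cost of losing the tree structure.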

I hope this helps!

Phil



James Turitto

Sep 28, 2017, 12:31:34 PM
to Dataverse Users Community
Jonathan,

Did this end up working for you? I am trying to compare the download rates of our published datasets in our dataverse, and it doesn't seem there is an easy way besides looking at all the individual datasets. An easy export of the metadata, particularly the download counts, would be great.

Thanks,
James

Philip Durbin

Sep 29, 2017, 1:06:00 AM
to dataverse...@googlegroups.com
Hi James,

For download counts, you could try navigating to the dataverse that holds your datasets and then clicking "Edit", then "Dataset Guestbooks"*, and finally "Download All Responses", as shown in the attached screenshot.

I just tried this at https://dataverse.harvard.edu/dataverse/open-source-at-harvard and got a CSV with a row per time a file was downloaded. It looks like this:

Guestbook, Dataset, Date, Type, File Name,  File id, User Name, Email, Institution, Position, Custom Questions
Default,Open Source at Harvard,07/7/2017,Download,2017-06-30.tab,3035125,Guest,,,
Default,Open Source at Harvard,07/6/2017,Download,2017-06-30.tab,3035125,Guest,,,
Default,Open Source at Harvard,07/7/2017,Download,2017-06-30.tab,3035125,Guest,,,
Default,Open Source at Harvard,07/7/2017,Download,2017-06-30.tab,3035125,Philip Durbin,philip...@harvard.edu,Harvard University,
Default,Open Source at Harvard,07/31/2017,Download,2017-07-31.tab,3040230,Guest,,,
Default,Open Source at Harvard,08/1/2017,Download,berkmancenter-amber_nginx.json,3040301,Guest,,,
Default,Open Source at Harvard,08/1/2017,Download,2017-07-31.tab,3040230,Guest,,,
Default,Open Source at Harvard,08/1/2017,Download,berkmancenter-brkmn.json,3040297,Guest,,,
Default,Open Source at Harvard,09/24/2017,Download,IQSS-TwoRavens.json,3040238,Guest,,,
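
Since there is one row per download, counts per file can be tallied with a few lines of Python. This is just a sketch against the sample rows above; the count_downloads name is made up, and it assumes the column headers stay exactly as shown (it tolerates the extra spaces after the commas in the header line).

```python
import csv
import io
from collections import Counter

# The rows shown above, pasted as-is.
SAMPLE = """Guestbook, Dataset, Date, Type, File Name,  File id, User Name, Email, Institution, Position, Custom Questions
Default,Open Source at Harvard,07/7/2017,Download,2017-06-30.tab,3035125,Guest,,,
Default,Open Source at Harvard,07/6/2017,Download,2017-06-30.tab,3035125,Guest,,,
Default,Open Source at Harvard,07/7/2017,Download,2017-06-30.tab,3035125,Guest,,,
Default,Open Source at Harvard,07/7/2017,Download,2017-06-30.tab,3035125,Philip Durbin,philip...@harvard.edu,Harvard University,
Default,Open Source at Harvard,07/31/2017,Download,2017-07-31.tab,3040230,Guest,,,
Default,Open Source at Harvard,08/1/2017,Download,berkmancenter-amber_nginx.json,3040301,Guest,,,
Default,Open Source at Harvard,08/1/2017,Download,2017-07-31.tab,3040230,Guest,,,
Default,Open Source at Harvard,08/1/2017,Download,berkmancenter-brkmn.json,3040297,Guest,,,
Default,Open Source at Harvard,09/24/2017,Download,IQSS-TwoRavens.json,3040238,Guest,,,
"""


def count_downloads(csv_text):
    """Count rows of type "Download" per (dataset, file name) pair."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    header = [name.strip() for name in next(reader)]
    dataset_col = header.index("Dataset")
    type_col = header.index("Type")
    file_col = header.index("File Name")
    counts = Counter()
    for row in reader:
        # Skip blank or short rows and anything that is not a download.
        if len(row) <= file_col or row[type_col] != "Download":
            continue
        counts[(row[dataset_col], row[file_col])] += 1
    return counts


for (dataset, file_name), total in count_downloads(SAMPLE).most_common():
    print(dataset, file_name, total)
```

With a real export you would read the downloaded file's text and pass it to count_downloads instead of SAMPLE; summing the counts by dataset gives a per-dataset total.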

Does this help? If not, can you please open an issue at https://github.com/IQSS/dataverse/issues ? I feel like other people have asked for something similar.

Thanks!

Phil


Screen Shot 2017-09-29 at 12.53.59 AM.png

Jonathan Bohan

Oct 5, 2017, 9:41:10 AM
to Dataverse Users Community
Hi James, 

Sorry to only respond to this now; I was out sick for a bit. I have not been able to try this yet. We had Dataverse on the back burner for a while as we tried to work out a different issue. I'm getting back to it now and will need to get our IT team to work on it. Sorry I don't have a better response for you.

Best regards, 

Jonathan Bohan