How is the provenance feature useful, and can I use one provenance file for multiple data files in the same dataset?


Eunice Soh

Mar 14, 2022, 5:02:04 AM
to Dataverse Users Community
Hi,

I understand that provenance files are supposed to track how data "flows", and I am testing the provenance feature in Dataverse.


I've generated a provenance JSON file using an R package called rdtLite and uploaded it for a data file in Dataverse. In the R script whose provenance is captured by rdtLite:
1) a txt file is read,
2) it is subset/filtered, and
3) the filtered dataset is output as a csv file.


After that, in the provenance pop-up, I selected the entity that corresponds to the data file.



I'm wondering now:

1. Should I upload the output csv file as well in the same dataset?

2. If so, do I upload the provenance JSON file again? There appears to be no option to reuse the first provenance file in the same dataset.

3. How is the provenance file going to be useful, especially for those downloading the dataset? I downloaded the dataset with the provenance JSON added, but the dataverse_file.zip does not contain any provenance info.


I'd appreciate advice on any of these questions!


Kind regards,
Eunice

Eunice Soh

Mar 14, 2022, 5:19:10 AM
to dataverse...@googlegroups.com
TL;DR:
How is the provenance feature useful?
How is the provenance feature used?


--
You received this message because you are subscribed to a topic in the Google Groups "Dataverse Users Community" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/dataverse-community/q-0HlB-C0pk/unsubscribe.
To unsubscribe from this group and all its topics, send an email to dataverse-commu...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/629aa142-d679-4ba3-9cc6-5620e1bcb9d1n%40googlegroups.com.

Philip Durbin

Mar 14, 2022, 10:53:19 AM
to dataverse...@googlegroups.com
Hi Eunice,

Provenance is a feature that was only partially implemented compared to all the hopes and dreams we had initially. The feature educates users about the term data provenance, mentions tools and file formats in this space, and allows them to upload provenance files. It looks like the prov files can be downloaded via API: https://guides.dataverse.org/en/5.9/api/native-api.html?highlight=provenance#get-provenance-json-for-an-uploaded-file
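
As a rough illustration of that API route, here is a minimal Python sketch for fetching the provenance JSON of an uploaded file. It assumes the endpoint is `GET /api/files/{id}/prov-json` and that the API token goes in an `X-Dataverse-key` header, which is my reading of the linked guide section; please check the guide for your Dataverse version. The server URL, file ID, and token in the usage comment are hypothetical placeholders, and the helper names are made up for this example.

```python
import json
import urllib.request


def prov_json_request(server_url, file_id, api_token):
    """Build (but do not send) a GET request for a file's provenance JSON.

    Assumption: the endpoint path is /api/files/{id}/prov-json and the
    token is passed in the X-Dataverse-key header, per the linked guide.
    """
    url = f"{server_url.rstrip('/')}/api/files/{file_id}/prov-json"
    return urllib.request.Request(
        url,
        headers={"X-Dataverse-key": api_token},
        method="GET",
    )


def fetch_prov_json(server_url, file_id, api_token):
    """Send the request and parse the response body as JSON."""
    request = prov_json_request(server_url, file_id, api_token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Hypothetical usage (placeholder server, file ID, and token):
# prov = fetch_prov_json("https://dataverse.example.edu", 42, "xxxxxxxx-xxxx")
# print(json.dumps(prov, indent=2))
```

Since the file ID is part of the URL, this fetches the provenance for one data file at a time, matching how the provenance file was attached to a single data file at upload.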

I hope this helps,

Phil


Eunice Soh

Mar 15, 2022, 9:08:48 PM
to Dataverse Users Community
Thanks Phil! It's quite an interesting feature. 
Multiple data files should be able to use one provenance file, I think. That would make more sense.

Kind regards,
Eunice
