Making processed subset of data in Dataverse available inside an R package

Giuseppe Arena

Oct 27, 2020, 5:53:09 AM
to Dataverse Users Community
Good day everybody,

I worked on a (preprocessed) subset of a dataset in Dataverse and I would like to make it available inside an R package which I've been coding for my project. 
Apart from the data citation that I will include among the references in my paper, I wonder how I should handle the data made available inside the package.

Will writing the data citation in the vignette/help function be enough?
The data had to be preprocessed: some records were removed and some values were re-labelled where needed. Therefore, I think I also have to make this clear when citing Dataverse, as the data I provide inside the package are not the source data but are derived from them.
Does anybody know of a best-practice guide for citing/referencing in this specific case?

Thank you in advance for paying attention to my issue.

Best wishes,

Giuseppe Arena

Philip Durbin

Oct 29, 2020, 11:17:02 AM
to dataverse...@googlegroups.com
Hi Giuseppe,

As you say, citing the original data seems like a good place to start. Since you are preprocessing the data, perhaps you can include the script you used to transform it. There are tools that help capture the provenance of the data, including any transformations that have been made. Renku comes to mind.
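As a rough illustration of shipping the transformation script with the data, here is a minimal sketch in Python (the same idea carries over to an R script kept alongside the package). Everything in it is an assumption made for illustration: the `preprocess` function, the `id` and `label` field names, and the `SOURCE_CITATION` placeholder are hypothetical and not part of Dataverse or any package. It removes records, re-labels values, and returns a small provenance record that can be stored next to the derived data.

```python
import hashlib
import json

# Placeholder only: replace with the real Dataverse data citation,
# which is not given in this thread.
SOURCE_CITATION = "<original Dataverse data citation goes here>"


def preprocess(records, drop_ids, relabel):
    """Drop unwanted records, relabel values, and describe what was done.

    records  -- list of dicts with hypothetical "id" and "label" fields
    drop_ids -- set of ids to remove
    relabel  -- dict mapping old label values to new ones
    """
    # Copy each kept record so the raw input is left untouched.
    kept = [dict(r) for r in records if r["id"] not in drop_ids]
    for r in kept:
        r["label"] = relabel.get(r["label"], r["label"])

    provenance = {
        "source": SOURCE_CITATION,
        "steps": [
            f"removed {len(records) - len(kept)} of {len(records)} records",
            f"relabelled values using {json.dumps(relabel, sort_keys=True)}",
        ],
        # Checksum of the derived data, so users can check what they received.
        "sha256": hashlib.sha256(
            json.dumps(kept, sort_keys=True).encode("utf-8")
        ).hexdigest(),
    }
    return kept, provenance


if __name__ == "__main__":
    raw = [
        {"id": 1, "label": "A"},
        {"id": 2, "label": "B"},
        {"id": 3, "label": "C"},
    ]
    data, prov = preprocess(raw, drop_ids={2}, relabel={"A": "alpha"})
    print(json.dumps(prov, indent=2))
```

The provenance record could be saved with the packaged data and summarised in the vignette/help text, so that readers see both the original citation and how the subset differs from the source.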

I hope this helps. Again, I think the main thing is to at least cite the original data. Others on this list probably have opinions too. :)

Thanks,

Phil

Giuseppe Arena

Nov 4, 2020, 11:59:25 AM
to Dataverse Users Community
Hi Phil,

Thanks for your answer!

I am going to take a look at Renku, which seems to suit my case and to be a valid solution. :D


Best wishes,
Giuseppe

Rok Roskar

Nov 5, 2020, 10:45:28 AM
to Dataverse Users Community
Hi Giuseppe,

Rok here from the Renku project :) Thanks, Phil, for pointing this thread out to me. The use case sounds very much aligned with what Renku is trying to provide. You can import Dataverse datasets into a Renku project and capture the provenance of your pre-processing using the renku CLI. We've been thinking about closer interoperability between Renku and Dataverse, as well as archiving projects for the purposes of publication and citation, so this comes at a good time.

If we can assist with getting your project going on Renku, please feel free to sign up at https://renkulab.io and reach out to us on renku.discourse.group.

Best,

Rok