Large-scale transfer/migration of both data and metadata between Dataverses

Benjamin Peuch

Mar 6, 2020, 5:10:05 AM
to Dataverse Users Community
Hello everybody,

We are currently finishing setting up a Dataverse here at the State Archives of Belgium for an upcoming data archive for social sciences.

I'm reaching out to the community today because I'm having a hard time finding a solution for one particular use case. I have identified the following use cases, the last of which is the most complex:

1. Someone wants to deposit a small number of datasets (say, one or two) into our archive. To do that, they need only use the GUI: click on Add Data, fill in the metadata fields and add the files.

2. Someone wants a small number of metadata records copied into our Dataverse for increased visibility. To this end, the "Import a dataset" feature of the native API comes in handy.

3. Same thing but on a much larger scale, say tens, hundreds or even thousands of metadata records. In that case, OAI-PMH is the way to go.

4. But what if we want to migrate a large quantity of both metadata and data from one Dataverse to another? Are there documented ways to do this as automatically as possible, with as little case-by-case human intervention as possible? Perhaps by combining OAI-PMH with another protocol?
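
As an illustration of the OAI-PMH half of use cases 3 and 4, here is a minimal Python sketch that lists the record identifiers a source installation exposes. It is a sketch under stated assumptions, not a documented migration procedure: the host name is a placeholder, the function names are made up for this example, the OAI endpoint is assumed to live at `/oai` on the source installation, and the identifiers it returns are assumed to be the datasets' persistent IDs. OAI-PMH only carries metadata, so the files would still have to be moved by other means.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# XML namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_identifiers_url(base_url, metadata_prefix="oai_dc", token=None):
    """Build a ListIdentifiers request URL (assumes the endpoint is <base>/oai)."""
    if token:
        # A resumed request carries only the verb and the resumption token.
        params = {"verb": "ListIdentifiers", "resumptionToken": token}
    else:
        params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    return base_url.rstrip("/") + "/oai?" + urllib.parse.urlencode(params)


def parse_identifiers(xml_data):
    """Return (identifiers, resumption_token_or_None) from one response page."""
    root = ET.fromstring(xml_data)
    ids = [
        header.findtext("oai:identifier", namespaces=OAI_NS)
        for header in root.iterfind(".//oai:header", OAI_NS)
        if header.get("status") != "deleted"  # skip records flagged as deleted
    ]
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=OAI_NS)
    return ids, (token.strip() or None)


def harvest_identifiers(base_url):
    """Yield every identifier, following resumption tokens (needs network access)."""
    token = None
    while True:
        url = list_identifiers_url(base_url, token=token)
        with urllib.request.urlopen(url) as resp:
            ids, token = parse_identifiers(resp.read())
        yield from ids
        if token is None:
            break


if __name__ == "__main__":
    # Placeholder host: replace with the source installation's base URL.
    for identifier in harvest_identifiers("https://source.example.org"):
        print(identifier)
```

The resulting list could then drive a per-dataset loop that fetches each dataset's metadata and files through the native API.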

Vyacheslav Tikhonov

Mar 6, 2020, 5:33:01 AM
to Dataverse Users Community
Hi Benjamin,

If the community is interested in this topic, I can write a blog post sharing our experience with the experimental migration we did at DANS to move almost 50k datasets from a trusted digital repository to Dataverse.
We reused our DDI converter tool with XSLT mappings (http://github.com/IQSS/dataverse-ddi-converter-tool) to convert the metadata to the proper format, used pyDataverse to migrate the datasets, and wrote custom Python scripts to copy the files from storage and link them to the records with file metadata in the Dataverse database.

Best,
Slava 
Senior Data Scientist
(DANS-KNAW)
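
For readers who want to see what the dataset-creation step looks like at the native API level, the following Python sketch prepares the requests for copying one dataset between two installations. It is not the DANS scripts and does not use pyDataverse; all function names and host names are hypothetical. It assumes the source dataset is fetched as native JSON from `/api/datasets/:persistentId/`, that the target accepts a `datasetVersion` body with the same `metadataBlocks`, and that files are downloaded through `/api/access/datafile/{id}`. A real migration would also need API tokens (sent in the `X-Dataverse-key` header) and handling of licenses, restricted files and persistent identifiers.

```python
import urllib.parse


def export_url(base_url, pid):
    """URL that returns a dataset's native JSON on the source installation."""
    query = urllib.parse.urlencode({"persistentId": pid})
    return f"{base_url.rstrip('/')}/api/datasets/:persistentId/?{query}"


def create_url(base_url, collection_alias):
    """URL to POST a new dataset into a collection on the target installation."""
    return f"{base_url.rstrip('/')}/api/dataverses/{collection_alias}/datasets"


def to_create_payload(export_json):
    """Keep only the metadata blocks of the latest version for the create request.

    Assumption: the export has the shape {"data": {"latestVersion": {...}}}.
    """
    version = export_json["data"]["latestVersion"]
    return {"datasetVersion": {"metadataBlocks": version["metadataBlocks"]}}


def file_plan(base_url, export_json):
    """List the files to download from the source, with the metadata to re-attach."""
    files = export_json["data"]["latestVersion"].get("files", [])
    return [
        {
            "download_url": f"{base_url.rstrip('/')}/api/access/datafile/{f['dataFile']['id']}",
            "filename": f["dataFile"]["filename"],
            "description": f.get("description", ""),
            "directoryLabel": f.get("directoryLabel", ""),
        }
        for f in files
    ]
```

Keeping these steps as pure functions makes it possible to check the payloads for every dataset before anything is written to the target installation.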

Danny Brooke

Mar 6, 2020, 12:19:21 PM
to Dataverse Users Community
Hey Slava, we're very interested in a writeup. We'll be happy to share on the dataverse.org blog and other places where it would be appropriate. This is a great need in the community and it would be good to learn from your experience. Thank you!

Durand, Gustavo

Mar 6, 2020, 1:55:14 PM
to dataverse...@googlegroups.com
Slava,

Rather than a blog post (or in addition to one), what do you think of writing this up as a "Migration Guide" that we could add to our docs, and therefore keep updating as we improve migration-related features and the ability to migrate from more sources?

Gustavo

Stefan Kasberger

Mar 9, 2020, 10:04:39 AM
to Dataverse Users Community
Hi,

I am planning to write up my experiences from the last year of data migrations and to release some new pyDataverse functionality afterwards. So if you want, we could do something together (write blog posts, add a migration section to the docs, or whatever).

Regards, Stefan

Danny Brooke

Mar 9, 2020, 11:12:06 AM
to Dataverse Users Community
Thanks Stefan, it will be great to get this info out to the community. And +1 to Gustavo's comment about setting this up as Guides content as well as a blog post.

Vyacheslav Tikhonov

Mar 10, 2020, 2:00:15 PM
to Dataverse Users Community
Hi Stefan,

I think it's a great idea to combine efforts in writing a data migration manual. We reused your pyDataverse to run EASY on Dataverse, anyway. ;)

Cheers,
Slava

Kaitlin Newson

Mar 18, 2021, 4:51:01 PM
to Dataverse Users Community
Hi all, I realize this is a somewhat older thread, but I was wondering if anyone has started writing a data migration manual. I'm in the process of migrating some datasets from one installation of Dataverse to another and would love to hear how others have handled these kinds of migrations!

Stefan Kasberger

Mar 19, 2021, 8:20:35 AM
to Dataverse Users Community
Most of my data migration workflow is documented in the User Guide sections of my latest pyDataverse release (0.3.0).

But I will also add some example Python scripts in the next releases.