FWIW: The OAI_ORE metadata export and/or the BagIt archiving capabilities might play into this as well. As far as I know, the OAI_ORE file is ‘complete’, with one exception. ‘Complete’ means it has all of the metadata entered via the GUI or API, but not the metadata generated during ingest, which (at least prior to the Curation tool from Scholars Portal) would be recreated during a round-trip upload into another Dataverse. The one exception is provenance: I’ve recently added the free-text provenance to the OAI_ORE (PR tbd), but getting access to any auxiliary provenance file was hard enough given the current design that I skipped it for v1. The BagIt zip contains all files and, after the recent DV updates to show the directory hierarchy, I’ve updated it so that the /data directory in the Bag uses that directory hierarchy (another PR tbd).
W.r.t. the round-trip part: the code in the DVUploader already uploads a directory tree of files, i.e. it could re-upload the data files from an unzipped Bag. The original SEAD code for that also read an OAI-ORE map in order to upload metadata. I haven’t updated that code to upload metadata to Dataverse, or to read directly from a zipped Bag, but that could be added (I just haven’t had time/$ so far…).
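As a rough illustration of the "re-upload from an unzipped Bag" idea, here is a minimal sketch that walks the payload directory and recovers each file's directory path. It is not the DVUploader's code. It assumes the Bag has already been unzipped and that the payload sits under `data/`, as BagIt prescribes. The function name `list_bag_payload` is hypothetical, and treating the relative directory as the value a client would send as the file's directory label is an assumption.

```python
from pathlib import Path


def list_bag_payload(bag_root):
    """Return (directory_label, file_name) pairs for every file under <bag_root>/data.

    Hypothetical helper: directory_label is the file's directory relative to
    data/, or "" for files that sit directly in data/.
    """
    data_dir = Path(bag_root) / "data"
    entries = []
    for path in sorted(data_dir.rglob("*")):
        if not path.is_file():
            continue  # skip the directories themselves
        rel = path.relative_to(data_dir)
        label = "" if rel.parent == Path(".") else rel.parent.as_posix()
        entries.append((label, rel.name))
    return entries
```

An uploader could then iterate over these pairs and send each file together with its directory label, so the hierarchy survives the round trip.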
The biggest difference I see from the DV tree concept is that Bags are per dataset and don’t cover the Dataverse hierarchy or its metadata. Combining the two, it might be possible either to use the OAI_ORE metadata file (available via the export API) as the metadata file at the dataset level, or just to drop Bags into the tree at the dataset level. Other than avoiding format proliferation, I think the only advantage of the OAI_ORE approach is that it is already JSON-LD, mapping internal DV metadata to external vocabularies, which is part of why the RDA Research Data Repository Interoperability WG picked it and BagIt as a way to get closer to round-tripping between repositories. At the same time, I don’t think the OAI_ORE file is any harder to parse, e.g. for implementing a metadata upload in Python, since it is ‘just JSON’.
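To make the ‘just JSON’ point concrete, here is a minimal sketch of fetching and splitting an OAI_ORE export in Python. It is a sketch under stated assumptions, not a description of Dataverse internals. The export URL pattern (`/api/datasets/export` with `exporter` and `persistentId` parameters) and the `ore:describes` / `ore:aggregates` keys are assumptions about the export layout that should be checked against a real export. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode


def ore_export_url(server, persistent_id):
    """Build the (assumed) export API URL for a dataset's OAI_ORE map."""
    query = urlencode({"exporter": "OAI_ORE", "persistentId": persistent_id})
    return f"{server.rstrip('/')}/api/datasets/export?{query}"


def split_ore_map(ore_text):
    """Split an OAI_ORE document into dataset-level fields and file entries.

    Assumes the dataset is described under "ore:describes" and its files are
    listed under "ore:aggregates"; both key names are assumptions.
    """
    doc = json.loads(ore_text)
    described = doc.get("ore:describes", {})
    files = described.get("ore:aggregates", [])
    dataset_fields = {
        key: value for key, value in described.items() if key != "ore:aggregates"
    }
    return dataset_fields, files
```

The dataset-level dictionary is what a metadata upload would map onto the target installation's fields, and the file entries could be matched against the files found in the Bag's payload.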
Other thoughts on round-trip:
- Should it be latest version only or all versions?
- How should differences in installed metadata blocks be handled, e.g. incoming data with metadata that isn’t represented in the new Dataverse, or a new Dataverse with required fields that are not in the uploaded datasets?
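The metadata-block question can at least be detected mechanically before an upload is attempted. The following is a minimal sketch, assuming the incoming dataset's field names and the target installation's known and required field names have already been collected as plain lists. The function name `compare_fields` is hypothetical, and how to resolve the mismatches it reports is left open.

```python
def compare_fields(incoming_fields, target_fields, target_required):
    """Report the two kinds of mismatch between a dataset and a target Dataverse.

    Returns (unrepresented, missing_required):
      unrepresented    - incoming fields the target has no definition for
      missing_required - fields the target requires that the dataset lacks
    """
    incoming = set(incoming_fields)
    unrepresented = sorted(incoming - set(target_fields))
    missing_required = sorted(set(target_required) - incoming)
    return unrepresented, missing_required
```

A round-trip tool could run this check first and then refuse, warn, or fall back to some default handling, depending on policy.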
-- Jim
To view this discussion on the web visit
https://groups.google.com/d/msgid/dataverse-community/CABbxx8HZ-gVHsw2d0BO4zuwVXtdwmL19YwFTau1cKiNRt%3DT%3DTQ%40mail.gmail.com.