Sherry Lake | Scholarly Repository Librarian | University of Virginia Library | shL...@virginia.edu | 434.924.6730 | @shLakeUVA | Alderman Library, 160 N. McCormick Road, Charlottesville, VA 22903 | Alderman 563 | LinkedIn Profile | orcid.org/0000-0002-5660-2970 | “Keeper of the Dataverse”
I’m testing an online survey and hope to use it soon so we can learn more from curators, and maybe as a way to test changes that we hope will clarify the fields. Looking forward to hearing everyone’s thoughts.
Julian
Hi everyone,
I've been looking at what depositors put in the date metadata fields in Harvard Dataverse and a few other Dataverse installations, and wanted to share what I've learned so far and some thoughts:
Deposit date:
- This field is pre-populated with the date that the dataset was first created in the Dataverse repository (when someone clicks "New dataset"). If someone creates a dataset and doesn't change the pre-populated date, then visits the repository a week later to publish the dataset, the deposit date and publication date are a week apart.
- I’m not sure if we or DDI intended the field to be equivalent to the date that the first draft was saved, or the date that the dataset is publicly available. In Harvard Dataverse, most published datasets have a deposit date that differs from the publication date.
- The DDI definition of deposit date is the date of deposit into the original repository, so if Dataverse isn’t the original repository, depositors are able to change the date. A lot of datasets in ADA Dataverse use the deposit date this way, as well as some dataverses in Harvard Dataverse (e.g. Murray, UCLA Social Science Archive).
- Do depositors who use the deposit date field this way mean the date when the data was first added/stored someplace or the date that it was available for others to access? Is this a distinction they're thinking about or care about?
- Some dataverses in Harvard Dataverse change this date to the date of the latest published version (e.g. Antislavery Petition dataverse).
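To make the week-apart case above concrete, here is a minimal sketch of how the gap between the two dates could be measured. The function name and the example dates are hypothetical, chosen only for illustration; nothing here reflects how Dataverse itself stores or compares these fields.

```python
from datetime import date


def deposit_publication_gap(deposit_date: date, publication_date: date) -> int:
    """Return the number of days from deposit to publication.

    A negative result means the deposit date is later than the
    publication date, which happens when the deposit date has been edited.
    """
    return (publication_date - deposit_date).days


# Hypothetical example: draft created on 1 March, published a week later.
gap = deposit_publication_gap(date(2019, 3, 1), date(2019, 3, 8))
print(gap)  # 7
```

A gap of zero would suggest the depositor created and published on the same day, or edited the deposit date to match.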
Distribution date:
- Some depositors are using this field for the date that embargoed data is released (when all or some files will be unrestricted and available to everyone). I think this includes DataverseNO and at least one dataverse in Harvard Dataverse. (Nice coincidence that embargo is on the roadmap and being designed. https://github.com/IQSS/dataverse/issues/4052)
- I think distribution date should not be used for an embargo release date, unless the field is used only for that purpose.
- Regardless, does the "date that the work was made available for distribution/presentation" mean:
  - the date when the data was first distributed anywhere? (which I think could be the same as the deposit date)
  - or the date when the data was first distributed in the current (Dataverse) repository? (which I think would be the same as the publication date)
Publication date:
- DataCite’s description of the property "publicationyear" includes: “If an embargo period has been in effect, use the date when the embargo period ends.” This sounds like once Dataverse has an embargo feature, the publicationyear that Dataverse sends to DataCite should be the year of the embargo release date (and not the year of the publication date that Dataverse sends to DataCite now, which is the date that the dataset's first version was released).
- But if the embargo release date is the day when the files become unrestricted, then why doesn’t Dataverse do this now? That is, why doesn’t Dataverse use the year in which the files become unrestricted as the publicationyear? It’s because when depositors hit publish, Dataverse has to send DataCite a publicationyear, and depositors have no way to indicate when the files will become unrestricted (until there’s an embargo feature).
- For datasets where an embargo is set, if Dataverse sends the embargo release date to DataCite as the publicationyear, then in some cases the publicationyear that DataCite has will be different than the publication year in Dataverse’s dataset citation… unless the publication date in the citation changes to the embargo release date.
- Trying to find what others have written about what “publish” really means. Could also reach out to DataCite’s metadata group.
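One way to picture the publicationyear question above is the following sketch. It assumes a hypothetical embargo end date is available alongside the first release date; the function and parameter names are made up for illustration and are not part of Dataverse or of any DataCite client.

```python
from datetime import date
from typing import Optional


def publication_year(first_release: date, embargo_end: Optional[date] = None) -> int:
    """Pick the year to report as publicationyear.

    Follows the reading of the DataCite description quoted above: if an
    embargo was in effect, use the year the embargo ends; otherwise use
    the year the first version was released.
    """
    if embargo_end is not None:
        return embargo_end.year
    return first_release.year


# Hypothetical dataset: first version released in late 2019,
# files unrestricted in mid 2020.
first_release = date(2019, 11, 15)
embargo_end = date(2020, 6, 1)

print(publication_year(first_release))               # 2019
print(publication_year(first_release, embargo_end))  # 2020
```

In the second case the reported year (2020) differs from the year in a citation built from the first release date (2019), which is the mismatch described above.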
Production date:
- I'd interpret this as the date when the data was "finalized" and ready to be analyzed or distributed, as the DVNWG wrote. I don't see any problem with research archives interpreting this as "the date that the data was given to the archive because that’s the closest approximation we can make if no other input on this timestamp was offered to us."
- So I'd assume that the production date should always come BEFORE the distribution date, but there are hundreds of datasets in Harvard Dataverse whose production dates come after their distribution dates. Trying to find out why.
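The ordering check described above could be sketched roughly as follows. The record layout (dicts with "id", "production_date" and "distribution_date" keys) is an assumption made for this example, not the shape of any Dataverse export, and the sample records are invented.

```python
from datetime import date


def production_after_distribution(records):
    """Return the ids of records whose production date is later than
    their distribution date. Records missing either date are skipped."""
    flagged = []
    for record in records:
        produced = record.get("production_date")
        distributed = record.get("distribution_date")
        if produced is not None and distributed is not None and produced > distributed:
            flagged.append(record["id"])
    return flagged


# Hypothetical sample records.
records = [
    {"id": "A", "production_date": date(2018, 5, 1), "distribution_date": date(2018, 6, 1)},
    {"id": "B", "production_date": date(2019, 2, 10), "distribution_date": date(2018, 12, 1)},
    {"id": "C", "production_date": None, "distribution_date": date(2017, 1, 1)},
]

print(production_after_distribution(records))  # ['B']
```

Records where the two dates are equal are not flagged, since "finalized" and "distributed" on the same day is consistent with the interpretation above.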
Here's my running list of descriptions of dates I’ve seen used so far, trying to be as distinct and jargon-free as possible:
- Dates when data was collected (could be a single date if the data was collected in one day; the DDI element collDate has attributes for single dates, date ranges, and “cycles”, though I'm not sure how “cycles” would work)
- Date when data is "finalized" and ready for distribution (could be the same as “date of collection” in cases such as Pete Meyer’s)
- Date when data was first deposited anywhere, but not available for distribution or presentation
- Date when dataset was first published/distributed anywhere.
- Date when data was first deposited in current repository (Dataverse), but not available for distribution or presentation
- Date when dataset was first published/distributed in current repository (Dataverse)
- Dates when different versions have been published (this is system generated of course)
- Date when files are no longer restricted (embargo release date)
--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/e5603fb7-8b71-4258-877b-0364cf4edbe4%40googlegroups.com.