Production Date: Date range?

Leonhard Maylein

Dec 5, 2018, 9:44:14 AM
to Dataverse Users Community

Is it possible to allow date ranges for the production date in future releases?

Or is there a better way to record that the production of the data lasted several months or years?


Julian Gautier

Dec 5, 2018, 10:36:03 AM
to Dataverse Users Community
Hi Leonhard,

Could you share some more about what kind of data it is, how it's being collected and what factors go into deciding when the data is deposited into a Dataverse repository and published? After the data is created/collected and published, will more data in the study be collected and published?

In August, a working group representing several instances of Dataverse repositories across Canada started a discussion in another Google Groups thread about clarifying the date metadata (https://groups.google.com/forum/#!topic/dataverse-community/n4I-bn1ukyQ), and they called out production date as one of the confusing fields. So I'm hoping we can learn about your use case and what would make sense.

Thanks!
Julian

Leonhard Maylein

Dec 6, 2018, 4:21:26 AM
to Dataverse Users Community
Hi Julian,

We provide datasets from a lot of disciplines. I think in the humanities it is not unusual that data is collected over several years.

Here you will find an example (unfortunately in German): https://doi.org/10.11588/data/H2ILIH

This is another example:
The correct production date: 2012-2018

Leonhard

Sherry Lake

Dec 6, 2018, 9:35:12 AM
to Dataverse Users Community
Hi Leonhard,

If you read the link that Julian sent, you will see there are many interpretations of the many date metadata fields in Dataverse.

The "Production Date" (according to Dataverse) is when all of the data and documentation were put together (packaged). It is not a required field, so you don't have to use it.

In your most recent email (below), you say "data is collected over several years". Date of Collection (which can be a range) is what you are talking about, not Production Date. Dataverse has fields for "Date of Collection". It is not part of the initial set of fields when you are creating a dataset, but is part of the additional fields you can add once your dataset is created. (Edit metadata once the dataset has been saved).

Hope this helps -
Sherry Lake

Crosas, Mercè

Dec 6, 2018, 9:39:10 AM
to dataverse...@googlegroups.com
I agree with Sherry. I was going to make a similar comment: you should use Date of Collection for what you are trying to describe.

Best,
Merce

--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataverse-commu...@googlegroups.com.
To post to this group, send email to dataverse...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/f045feea-037d-4ed1-86b5-b6c3763fa5fb%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


--
Mercè Crosas, Ph.D., Chief Data Science and Technology Officer, IQSS, Harvard University

Leonhard Maylein

Dec 6, 2018, 9:51:48 AM
to Dataverse Users Community
Okay, thanks for the clarification.
"Production Date" only refers to the published dataset, not to the data themselves.

Maybe the descriptions in the linked document don't fit such things like "transcriptions" (https://doi.org/10.11588/data/PEFKJM) because they are not "collected" but also "produced".

It seems to me that "Dates of collection", "Publication Date", "Date of deposit" and "Date of distribution" are much more important than the "Production Date".
In case of text publications (articles etc.) the "production date" does not matter. Or am I wrong?

Sherry Lake

Dec 6, 2018, 11:27:40 AM
to dataverse...@googlegroups.com
If you are going to use "Date of Collection" and want it to appear in the Summary Metadata fields section, where you currently have your "Note" (as I see in this example: https://doi.org/10.11588/data/H2ILIH), you can change the :CustomDatasetSummaryFields setting.

Running the command below replaces the current default summary, so if you want to add the field dateOfCollection (and keep the others), do it this way:

curl http://localhost:8080/api/admin/settings/:CustomDatasetSummaryFields -X PUT -d 'dsDescription,subject,keyword,publication,notesText,dateOfCollection'
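
As a hedged sketch of the same idea in script form: the snippet below appends a field to a comma-separated summary list only if it is not already there, then prints the resulting curl command for review rather than running it. The starting list and server URL are simply copied from the command above, and the helper name add_summary_field is made up for illustration; check your installation's current setting before applying anything.

```shell
#!/bin/sh
# Assumptions: SERVER and the starting field list are copied from the
# command above; adjust both to match your own installation.
SERVER="http://localhost:8080"
current_fields="dsDescription,subject,keyword,publication,notesText"

# add_summary_field LIST FIELD (hypothetical helper):
# prints LIST with FIELD appended, unless FIELD is already in LIST.
add_summary_field() {
  case ",$1," in
    *",$2,"*) printf '%s\n' "$1" ;;
    *)        printf '%s\n' "$1,$2" ;;
  esac
}

new_fields=$(add_summary_field "$current_fields" "dateOfCollection")

# Print the command for review instead of executing it.
echo "curl $SERVER/api/admin/settings/:CustomDatasetSummaryFields -X PUT -d '$new_fields'"
```

Printing the command first makes it easier to confirm that none of the existing summary fields get dropped, since the PUT overwrites the whole list.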





Leonhard Maylein

Dec 7, 2018, 2:44:49 AM
to Dataverse Users Community
Thanks, this information is valuable.

Julian Gautier

Dec 7, 2018, 1:34:27 PM
to Dataverse Users Community
I agree that the descriptions of these date fields, and maybe even the field names, can be improved. I see that as one outcome of the discussion in the other Google Groups thread. And the thinking you've shared so far about the purpose of the "Production Date" field versus "Date of Collection" is really helpful. I agree when you write that "Production Date" only refers "to the published dataset, not to the data themselves". So the goal, I think, is to describe that metadata field in a way that makes this clearer.

I'm hoping we can continue clarifying these two fields with examples. Without examples the discussion can get frustratingly semantic (at least for me, such as "what does it mean to collect versus produce").  

You wrote that

Maybe the descriptions in the linked document don't fit such things like "transcriptions" (https://doi.org/10.11588/data/PEFKJM) because they are not "collected" but also "produced".

For that dataset, my thinking is that we could use "Date of Collection" to record:
  • the time duration during which the manuscript was being transcribed
  • or the time duration during which the text was being digitized
I think it depends on the purpose of the dataset and what you think is more relevant to record. Do you agree?

Then, as Sherry pointed out, the "Production Date" would be when all of the data and documentation were put together (packaged).

I also wanted to ask about your comment:
In case of text publications (articles etc.) the "production date" does not matter. Or am I wrong?

By "articles," do you mean published journal articles? If so, I agree, I think it would be very uncommon to record a "production date" in the way I think Dataverse defines it. Of course, Dataverse isn't designed to describe journal articles, and you've referred to transcriptions, which I would also call a text publication. So I just wanted to ask what you meant by articles and text publications.

Philipp at UiT

Dec 9, 2018, 12:17:50 AM
to Dataverse Users Community
We usually encourage researchers to fill in the field "Date of Collection". I have interpreted this field to cover the period when the data were collected for the project which the deposited dataset is about. In some cases, these data may be a subset of a larger collection of data which had been collected in an earlier period. Let me illustrate this with a dataset that I have published myself: The dataset consists of annotated sentences of written Norwegian. I collected these sentences from a corpus of written Norwegian in 2016, so I specified the field "Date of Collection" as Start: 2016-07-07; End: 2016-10-23. However, the data that my collected data is a subset of, i.e. the data in the corpus of written Norwegian, were collected some time before 2011. After having read the Google Groups threads on the different date fields in Dataverse, I'm not quite sure anymore whether this interpretation is correct.

Best,
Philipp

Julian Gautier

Dec 10, 2018, 12:24:57 PM
to Dataverse Users Community
Thanks Philipp. My opinion is that your first interpretation is best: the date of collection for the dataset of written Norwegian would be in 2016, when this subset derived from the corpus was collected, and not the dates when the corpus was collected. The tooltip text for "Date of Collection" right now is "Contains the date(s) when the data were collected." Would it be helpful if the description was "Contains the date(s) when this dataset was collected."? Do you think that would clarify which collection activity dates should be used for "Date of Collection"?

Philipp at UiT

Dec 10, 2018, 10:41:53 PM
to Dataverse Users Community
Thanks Julian. Yes, maybe putting "this dataset" in bold would help.

Youn Noh

Feb 24, 2025, 11:24:25 AM
to Dataverse Users Community
Why is date of collection defined (as distinct from production date) in Citation Metadata but not place of collection? Thanks. Youn

Youn Noh

Feb 24, 2025, 11:39:39 AM
to Dataverse Users Community
Is the assumption that the geospatial block could be used for place of collection if needed? Thanks. Youn

Julian Gautier

Feb 26, 2025, 11:21:13 AM
to Dataverse Users Community
Hi Youn Noh. I'm not sure that any of the metadata blocks that ship with Dataverse include a field or fields specifically for place of collection, or where the data was collected.

In the geospatial block you mentioned, the tooltip for the Geographic Coverage field reads "Information on the geographic coverage of the data. Includes the total geographic scope of the data." I think of "geographic coverage" and "geographic scope" as meaning where the data was collected or where the data is about or both. And for some types of data, those two things can be the same, right?

The description of DDI Codebook's "Geographic Coverage" element also starts with "Information on the geographic coverage of the data. Includes the total geographic scope of the data.".

But looking at the child fields of Geographic Coverage, three of those fields' tooltips include "that the Dataset is about". And I think of that as being more specifically what the data is about and not necessarily where the data was collected.

I wonder if it would be helpful if we could include datasets that highlight the importance of this distinction between where the data was collected or where the data. Do you have any of these cases in mind?

Julian Gautier

Feb 26, 2025, 11:24:02 AM
to Dataverse Users Community
Whoops, that last sentence should read:

I wonder if it would be helpful if we could include datasets that highlight the importance of this distinction between where the data was collected or where the data is about. Do you have any of these cases in mind?

Julian Gautier

Feb 26, 2025, 11:42:02 AM
to Dataverse Users Community
Oh, I just remembered the Production Location field in the Citation metadata block. Its tooltip reads "The location where the data and any related materials were produced or collected".