meta studies (and future dv features)

August Muench

Feb 6, 2012, 9:50:11 AM

to dataverse...@googlegroups.com

hi,

we have been discussing two examples where 'meta' studies for dataverses in theastrodata.org would be useful.

The first example came about when we started using the collection feature. We found that what we really wanted was to be able to cite collections of studies. This first experience mainly had to do with a DV where there were dozens of studies and we wanted to be able to create a single citation that encompassed a specific subset of them. Currently the collections are mainly an internal organizational feature though we noticed that there are collection IDs associated with them.

The second example came about when we wanted to aggregate studies of data from different DVs. These meta-studies are useful we think because we want to create a "place" or a "page" where a user can find an aggregation of potentially useful datasets that we have "created" for them. I suppose that these should also be citable, which is why this kind of meta-study might be appropriate from within the dataverse structure instead of built from pieces over an API. (of course the internal APIs in dataverse are probably the limiting factor in either of these requests, no?)

A third example comes to my mind -- namely, the aggregation of individual products across studies. But we have not talked about this very much as a group.

a better question might be, is there a wiki of planned DV dev?

- Gus

Gustavo Durand

Feb 6, 2012, 12:02:48 PM

to dataverse...@googlegroups.com

On Feb 6, 2012, at 09:50, August Muench wrote:

> hi,
>
> we have been discussing two examples where 'meta' studies for dataverses in theastrodata.org would be useful.
>
> The first example came about when we started using the collection feature. We found that what we really wanted was to be able to cite collections of studies. This first experience mainly had to do with a DV where there were dozens of studies and we wanted to be able to create a single citation that encompassed a specific subset of them. Currently the collections are mainly an internal organizational feature though we noticed that there are collection IDs associated with them.

I think the idea of having a citation for a group of studies sounds very interesting. I'd be curious to hear what the rest of the community thinks as well.

As you pointed out, we do have collections, but no citations for them. The challenge with adding citations here is that collections can change, and we don't currently have any mechanism for tracking these changes (while a study can also change, a citation can point to a specific version of the study).

A study can include references to other studies via the "Related studies" metadata field. This field supports HTML, so you could provide a link to the handle of each of the studies you want. The study holding these links would then have its own citation. Would this be suitable?
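As a rough sketch of that workaround, the HTML for a "Related studies" field could be generated from a list of study handles. The function name, the placeholder handle, and the choice of the hdl.handle.net resolver as the link target are assumptions for illustration; which HTML tags the field actually accepts is not specified here, so the list markup may need adjusting.

```python
from html import escape

# Assumption: handles are resolved through the public Handle System proxy.
HANDLE_RESOLVER = "https://hdl.handle.net/"


def related_studies_html(studies):
    """Build an HTML list with one link per study.

    `studies` is an iterable of (title, handle) pairs. The handles used
    in examples are placeholders, not real study identifiers.
    """
    items = []
    for title, handle in studies:
        url = HANDLE_RESOLVER + handle
        # Escape both the URL and the title so the markup stays valid.
        items.append(
            '<li><a href="{}">{}</a></li>'.format(
                escape(url, quote=True), escape(title)
            )
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


if __name__ == "__main__":
    # Hypothetical titles and handles, for illustration only.
    print(related_studies_html([
        ("Example study A", "0000.0/EXAMPLE-A"),
        ("Example study B", "0000.0/EXAMPLE-B"),
    ]))
```

The resulting string would be pasted into the "Related studies" field of the study that acts as the citable container.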

> The second example came about when we wanted to aggregate studies of data from different DVs. These meta-studies are useful we think because we want to create a "place" or a "page" where a user can find an aggregation of potentially useful datasets that we have "created" for them. I suppose that these should also be citable, which is why this kind of meta-study might be appropriate from within the dataverse structure instead of built from pieces over an API. (of course the internal APIs in dataverse are probably the limiting factor in either of these requests, no?)

For the 2nd example, wouldn't a collection work for this? You can create a collection from any studies across a DVN, not just from your DV. (The DVs have to be released, however, before you can add their studies.) Again, there wouldn't be a citation here, but I wasn't sure whether in this example that was a critical requirement or just a nice-to-have.

> A third example comes to my mind -- namely, the aggregation of individual products across studies. But we have not talked about this very much as a group.
>
> a better question might be, is there a wiki of planned DV dev?

We don't have a wiki, but you can track the issues we are working on for the current release and beyond through Redmine:

https://redmine.hmdc.harvard.edu/projects/dvn/roadmap

- Gus

Dr. Micah Altman

Feb 6, 2012, 1:41:56 PM

to dataverse...@googlegroups.com

There may be some models to build on or adapt.

Libraries have had the concept of "record series" with corresponding metadata. There is no bright line between a series and a collection definitionally, but series tended to be more coherent and explicitly/formally cataloged, whereas collections were broader and did not have formal catalog records.

Both ICPSR and NARA use formal series-level records to describe "closely related" studies that are designed (ex-post/ex-ante) to be comparable, most commonly linked by time (time-series), e.g.:

http://www.icpsr.umich.edu/icpsrweb/ICPSR/series/28
http://www.icpsr.umich.edu/icpsrweb/ICPSR/series/159
http://www.archives.gov/research/arc/education/#diagram

Gus's #3 example seems possibly more related to "aggregated" objects. There is less work in this area, but notably OAI-ORE, MPEG-21 DIDL, METS, and DDI v. 3.x all have the ability to describe an object that is an aggregate of parts of other existing objects.

best,

Micah

August Muench

Feb 6, 2012, 9:08:20 PM

to dataverse...@googlegroups.com

On Monday, February 6, 2012 12:02:48 PM UTC-5, Gustavo Durand wrote:

> On Feb 6, 2012, at 09:50, August Muench wrote:
>
>> hi,
>>
>> we have been discussing two examples where 'meta' studies for dataverses in theastrodata.org would be useful.
>>
>> The first example came about when we started using the collection feature. We found that what we really wanted was to be able to cite collections of studies. This first experience mainly had to do with a DV where there were dozens of studies and we wanted to be able to create a single citation that encompassed a specific subset of them. Currently the collections are mainly an internal organizational feature though we noticed that there are collection IDs associated with them.
>
> I think the idea of having a citation for a group of studies sounds very interesting. I'd be curious to hear what the rest of the community thinks as well.

Yes, I am also interested in knowing what others think about citable aggregations of studies.

> As you pointed out, we do have collections, but no citations for them. The challenge with adding citations here is that collections can change, and we don't currently have any mechanism for tracking these changes (while a study can also change, a citation can point to a specific version of the study).
>
> A study can include references to other studies via the "Related studies" metadata field. This field supports HTML, so you could provide a link to the handle of each of the studies you want. The study holding these links would then have its own citation. Would this be suitable?

Thanks, I have not convinced myself that I understand the definition of 'related' in this regard. If it were backed up with an ontological relationship I might be happier. ;-)

>> The second example came about when we wanted to aggregate studies of data from different DVs. These meta-studies are useful we think because we want to create a "place" or a "page" where a user can find an aggregation of potentially useful datasets that we have "created" for them. I suppose that these should also be citable, which is why this kind of meta-study might be appropriate from within the dataverse structure instead of built from pieces over an API. (of course the internal APIs in dataverse are probably the limiting factor in either of these requests, no?)
>
> For the 2nd example, wouldn't a collection work for this? You can create a collection from any studies across a DVN, not just from your DV. (The DVs have to be released, however, before you can add their studies.) Again, there wouldn't be a citation here, but I wasn't sure whether in this example that was a critical requirement or just a nice-to-have.

Yes, Merce pointed out to me today how collections can aggregate studies across DVs and how linked collections can aggregate collections across DVs. Manipulation of the datasets (or the ability to create such manipulations) is a critical feature, but it is also, I think, something that the APIs can enable.

>> A third example comes to my mind -- namely, the aggregation of individual products across studies. But we have not talked about this very much as a group.
>>
>> a better question might be, is there a wiki of planned DV dev?
>
> We don't have a wiki, but you can track the issues we are working on for the current release and beyond through Redmine:
> https://redmine.hmdc.harvard.edu/projects/dvn/roadmap

I've had enough Trac/Jira experience that this is perfectly useful to understand the issues and dev plans. Thanks for sharing it.

- Gus

August Muench

Feb 6, 2012, 9:12:57 PM

to dataverse...@googlegroups.com

On Monday, February 6, 2012 1:41:56 PM UTC-5, Dr. Micah Altman wrote:

> There may be some models to build on or adapt.

> <snip>

> Gus's #3 example seems possibly more related to "aggregated" objects.
> Less work in this area, but notably OAI-ORE, MPEG-21 DIDL, METS, and
> DDI v. 3.x all have the ability to describe an object that is an
> aggregate of parts of other existing objects.

Thank you, Micah,

our theastrodata team (specifically myself and the Wolbach Library folks -- Erdmann et al.) is beginning (now, today, this week) to dig with focus into the metadata models and the domain-specific mappings we will want to make to the extensible data collection/method section in DVN 3. I'll make sure to dig into the aggregated-objects sections of DDI v3.x as well.