best practices for dataset description

Tim Clark

Jan 8, 2015, 1:10:17 PM
to Merce Crosas, <idmeta@force11.org>
Hi Merce,

You added this text to the best practices for dataset description, but I find it confusing because it describes representations of the data itself, not dataset descriptions.

"Data in the described datasets might also be described using various formats depending on the application area, in which case, content negotiation could be desirable for the dataset URI(s) as well as the landing page URI. For example, tabular data can be described using the Data Documentation Initiative schema (DDI), which includes metadata elements to fully describe each column (or variable) in the table. In particular, DDI is widely used by repositories that support social science data, such as Dataverse. When other formats (such as DDI) are being used, these formats should support the minimal set of recommended metadata elements above, in addition to other elements, and be easily ported to DCAT through a cross-walk."

I wonder if we could remove all but the first sentence. Would that work for you?

Tim

---------------------------------------------
Tim Clark, Ph.D.
Assistant Professor of Neurology, Harvard Medical School
Director, Biomedical Informatics Core, Massachusetts General Hospital
co-Director, Data and Statistics Core, Massachusetts Alzheimer Disease Research Center
website: http://mindinformatics.org
mobile: +1 617-947-7098 fax: +1 617-213-5418

Mercè Crosas

Jan 8, 2015, 1:59:06 PM
to Tim Clark, <idmeta@force11.org>
Tim,

I think it would be useful to mention other types of metadata that are commonly used in repositories to describe datasets, for example Dublin Core, DataCite, and DDI. In addition, DDI supports a description of each column of a tabular dataset, but this is also dataset metadata, which is useful (or even necessary) for understanding the dataset.

So, would it be ok to at least list a few examples of other descriptive metadata standards used in community repositories?

Merce


Joan Starr

Jan 8, 2015, 2:03:59 PM
to Mercè Crosas, Tim Clark, <idmeta@force11.org>

I like this idea. It opens the doc to a broader audience.

--Joan

Ruth Ellen Duerr

Jan 8, 2015, 7:11:33 PM
to Joan Starr, Mercè Crosas, Tim Clark, <idmeta@force11.org>
Well, in that case, may I ask you to add ISO 19115, which is the standard in use for petabytes of Earth science data?

Ruth

Tim Clark

Jan 8, 2015, 8:23:42 PM
to Ruth Ellen Duerr, Merce Crosas, Joan Starr, <idmeta@force11.org>
Hi,

Here is the new text (in raw LaTeX). Let me know if this is okay. In particular, does anyone object to the suggestion to crosswalk to DCAT? Or is that overreach?

"Data in the described datasets might also be described using  other formats depending on the application area. Other possible approaches for dataset description include DataCite metadata (\cite{DataCite}), Dublin Core (\cite{DCMI}), the Data Documentation Initiative (\cite{DDI}) for social sciences, or ISO19115 (\cite{ISO19115}) for Geographic information. Where these formats are used they should support the minimal set of recommended metadata elements above, in addition to other elements.  We also suggest they be ported to DCAT if possible for commonality."

Tim

Joan Starr

Jan 8, 2015, 8:56:48 PM
to Tim Clark, joan....@ucop.edu, Merce Crosas, Ruth Ellen Duerr, <idmeta@force11.org>

My concern about suggesting DCAT is that it is designed for data catalogs, so perhaps "if possible and where appropriate"?

Tim Clark

Jan 8, 2015, 9:00:04 PM
to joanb...@gmail.com, joan....@ucop.edu, Merce Crosas, Ruth Ellen Duerr, <idmeta@force11.org>
Or we can just drop the last sentence, saying, essentially: choose your local format, but make sure to include the above data elements. I prefer that; it is not intrusive.

Joan Starr

Jan 8, 2015, 9:11:12 PM
to Tim Clark, Merce Crosas, Ruth Ellen Duerr, <idmeta@force11.org>, joan....@ucop.edu

OK sure.
--Joan
