Custom metadata best practice


Heinz Werner Kramski-Grote

Apr 29, 2021, 9:49:33 AM
to Dataverse Users Community

Hi, new Dataverse (5.3) user here.

In an academic project, we are trying to add several home-grown custom metadata blocks to DV.

Currently we have one block for PREMIS Relationship Sub Types and nine for the VGMS (Video Game Metadata Schema), one per entity. (More should follow for MODS and PREMIS in general.)

While working on our .tsv files, some questions arose which seem not to be covered by the documentation:
  1. Is there a comment directive or a comment character to include remarks inside the .tsv files or comment out lines?
  2. What is the best way to deal with iterations of the custom formats? Is there a script or something else to clear the internal data structure from obsolete datasetFields from older versions of your .tsv files? (I'm on a vagrant demo VM right now, but on a serious instance this question will get, well, more serious.)
  3. How do you specify the sequence of your custom metadata blocks as presented to the user (both for display and for data entry)? At first the order looked chronological (which would not be enough either), but it became seemingly random as development went on.
  4. Is it possible to mark an entry in a ControlledVocabulary as the default one, like the "selected" attribute in an HTML option list does?
  5. How do you deal with semantically overlapping data between your custom metadata blocks and the default Citation format? E.g. MODS has an agent, PREMIS has one, VGMS has one, and all are similar and could (partly) be expressed just as well as authors or contributors in the Citation section. Is there some way to re-use existing default datasetFields, so that they show up twice on data entry (one occurrence read-only, of course)? Or is there an onchange handler which can populate data from one datasetField to another once a document is saved? Or should we simply avoid such redundancy and only add custom fields where there really is no other way to (mis-)use existing ones?
Any help is very welcome, including references to existing PREMIS or MODS implementations.

(We will be happy to share our .tsv files as soon as we make substantial progress.)

TIA
   Heinz

Philipp at UiT

Apr 30, 2021, 1:27:46 AM
to Dataverse Users Community

Hi Heinz,

I'm currently working on how to implement a domain-specific metadata schema for linguistic data in Dataverse, and have had questions similar to the ones you are raising. Hopefully, someone from the community can answer your specific questions.

You probably know that there is a Dataverse Metadata Interest Group (IG) and several related Working Groups (WG), among them one dealing with Controlled Vocabularies. Maybe some of your issues could be discussed there? I'm planning to suggest a WG to work on best practice for implementing metadata schemas in Dataverse. Maybe Jim could schedule a Metadata IG meeting on this topic? I think I could prepare some input by May 20.

Best, Philipp

Volodymyr Kushnarenko

Jun 1, 2021, 5:42:57 AM
to Dataverse Users Community
Dear all,
Many thanks to Philipp for the reply. I am working together with Heinz. The five questions above are quite specific, which is why a simple search unfortunately does not help us; we need the experience of the community. If anybody has a solution or some ideas, please reply, even if only briefly.
Many thanks in advance, Vladimir

Philip Durbin

Jun 2, 2021, 4:37:29 PM
to dataverse...@googlegroups.com
Hi Heinz and Volodymyr,

Thanks for the reminder to circle back to your questions!

No, you can't put comments in the TSV. This is a definite limitation and other formats such as JSON (which doesn't allow comments either, come to think of it), YAML, or XML have been suggested: https://github.com/IQSS/dataverse/issues/4451
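One possible workaround, sketched here under assumptions: keep an annotated copy of each block definition and strip the remarks before loading it. The `##` comment prefix and the function name `strip_comments` are hypothetical conventions chosen for this example; Dataverse itself knows nothing about them. The prefix is picked so that it cannot collide with the real section headers (`#metadataBlock`, `#datasetField`, `#controlledVocabulary`), which start with a single `#`.

```python
# Hypothetical convention: lines starting with "##" are remarks for humans only.
# Dataverse does not recognise this prefix; it is removed before the file is loaded.
COMMENT_PREFIX = "##"


def strip_comments(annotated_text):
    """Return the TSV text with every line that starts with COMMENT_PREFIX removed.

    Section headers such as "#datasetField" start with a single "#"
    and are therefore kept untouched.
    """
    kept = [
        line
        for line in annotated_text.splitlines()
        if not line.startswith(COMMENT_PREFIX)
    ]
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Made-up block definition, for illustration only.
    annotated = (
        "## Draft of the VGMS game block, not yet reviewed\n"
        "#metadataBlock\tname\tdataverseAlias\tdisplayName\n"
        "\tvgms_game\t\tVGMS Game\n"
    )
    print(strip_comments(annotated), end="")
```

The annotated file stays under version control with its remarks, and only the stripped output is handed to the instance.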

For now, the best way to iterate on your TSV is to use a test environment where you can easily drop the database. The guide says: "For either task, you should have a Dataverse Software development environment set up for testing where you can drop the database frequently while you make edits to TSV files." https://guides.dataverse.org/en/5.5/admin/metadatacustomization.html#metadata-block-setup
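To at least see which datasetFields a new iteration leaves behind, two versions of a TSV file can be compared offline. The following is a minimal sketch, not a cleanup tool: it never touches the database and only reports field names that appear in the old file but not in the new one. The function names and the sample field names (`vgmsTitle`, `vgmsOldAgent`, `vgmsPlatform`) are made up for illustration; the only thing assumed about the format is that the `#datasetField` header row has a `name` column and that data rows start with an empty first cell.

```python
def field_names(tsv_text):
    """Collect the 'name' column of every row in the #datasetField section."""
    names = set()
    in_fields = False
    name_col = None
    for line in tsv_text.splitlines():
        cells = line.split("\t")
        if cells[0].startswith("#"):
            # A section header row; only the #datasetField section matters here.
            in_fields = cells[0] == "#datasetField"
            name_col = cells.index("name") if in_fields and "name" in cells else None
            continue
        if in_fields and name_col is not None and len(cells) > name_col:
            name = cells[name_col].strip()
            if name:
                names.add(name)
    return names


def removed_fields(old_tsv, new_tsv):
    """Return, sorted, the field names present in old_tsv but missing from new_tsv."""
    return sorted(field_names(old_tsv) - field_names(new_tsv))


if __name__ == "__main__":
    # Hypothetical old and new versions of the same block definition.
    old = "#datasetField\tname\ttitle\n\tvgmsTitle\tTitle\n\tvgmsOldAgent\tAgent\n"
    new = "#datasetField\tname\ttitle\n\tvgmsTitle\tTitle\n\tvgmsPlatform\tPlatform\n"
    print(removed_fields(old, new))
```

Any name this reports is a candidate for a stale field on an instance that already loaded the older file; what to do about it on a production instance remains an open question.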

I took a quick look at the code and database schema and I don't see an obvious way to order metadata blocks. Please feel free to create an issue about this at https://github.com/IQSS/dataverse/issues

Yes, you can have one (or more) of the values in a controlled vocabulary be the default through a dataset template. For example, you could have a dataverse collection called "Chemistry Department" and for their template you could have "Chemistry" pre-selected for the subject. For more on templates, please see https://guides.dataverse.org/en/5.5/user/dataverse-management.html#dataset-templates

I don't think we have a good solution for fields that overlap across blocks (or other parts of the system). For example, it was noticed that Darwin Core has fields for "License" and "Rights Holder" ( https://github.com/IQSS/dataverse/issues/6243#issuecomment-576682233 ) but this is handled under "Terms" in Dataverse. We're definitely open to suggestions.

I hope this helps. Please keep the questions coming!

Phil

