Hi all,
I'm emailing the list because an issue came up on the GitHub tracker that might fly under the radar of those interested, and I think there is a larger meta-spec conversation here.
For context, BIDS has a lot of metadata (JSON sidecar) fields that are marked as RECOMMENDED in the specification. Some of these have raised validator warnings when absent from very early on, but many others have not. The schema-based validator took the position that, if RECOMMENDED is to mean anything, then absence should trigger a warning. The alternative was a large set of custom checks in the schema that replicated the pick-and-choose selections of the original validator.
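To make that position concrete, here is a minimal sketch of the blanket rule in Python, with invented names (the actual validator is schema-driven and not written like this):

```python
# Sketch of the blanket position: any field marked RECOMMENDED in the spec
# emits a warning when absent from the sidecar. The field names are real BIDS
# terms, but the set and function here are hypothetical.
RECOMMENDED_FIELDS = {"Manufacturer", "FlipAngle", "PartialFourier"}  # tiny subset

def warn_on_missing_recommended(sidecar: dict) -> list[str]:
    """Return one warning per RECOMMENDED field absent from the sidecar."""
    return [
        f"Recommended field '{field}' is not present."
        for field in sorted(RECOMMENDED_FIELDS - sidecar.keys())
    ]
```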
An issue was recently opened by a researcher who converted their dataset with HeuDiConv/dcm2niix, which populated as many fields as could reasonably be populated automatically, leading to warnings for 28 different fields on 10k+ files:
https://github.com/bids-standard/bids-specification/issues/2040. I looked through all of the fields on that issue, and I believe that at least 23 of them should be demoted to OPTIONAL, though some could remain RECOMMENDED under certain conditions. For example, "PartialFourierDirection" only makes sense if "PartialFourier" is defined.
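For illustration, a conditional recommendation like that could be implemented as a check along these lines (hypothetical Python, not the schema's actual expression language):

```python
def check_partial_fourier(sidecar: dict) -> list[str]:
    """Warn about a missing PartialFourierDirection only when PartialFourier
    is defined, since the direction is meaningless without it."""
    if "PartialFourier" in sidecar and "PartialFourierDirection" not in sidecar:
        return ["'PartialFourierDirection' is recommended when 'PartialFourier' is defined."]
    return []
```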
I would particularly appreciate a review of my response to that issue from qMRI and MRS researchers, as some of these fields may determine analytical choices in those domains, whereas for many of us they are basically just provenance.
This feels like a good opportunity to think about what justifies marking a field RECOMMENDED, so that validation warnings correspond to problems that can reasonably be fixed and don't drown the user in nice-to-haves. Just to kick off the discussion, I have a few suggestions:
1) Fields should be required/recommended if they require human intervention to populate and significantly contribute to the understandability of the dataset. For example, describing tasks either in plain language or by reference to protocols.
2) Fields can be made conditionally required/recommended if a narrowly targeted condition can be established from the file name or from the values of other, related metadata. For example, requiring EchoTime for files with `echo-` in the file name (see the sketch after this list).
3) Fields that can be populated automatically by a conversion tool should be required or optional, not recommended. If their absence is not severe enough to be an error, we should presume the tool was unable to detect the metadata, in which case a warning is not actionable.
4) Fields that exist to record deviations from typical usage must be optional, with an explicit default value specified in the term description. For example, SliceEncodingDirection is generally assumed to be "k" (also shown in the sketch below).
5) Fields that are commonly scrubbed for anonymization purposes must be optional.
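To make points 2 and 4 concrete, here is a rough sketch of both, again in hypothetical Python rather than the schema's actual selector syntax:

```python
import re

def check_echo_time(filename: str, sidecar: dict) -> list[str]:
    # Point 2: EchoTime becomes required when the file name carries an
    # echo- entity, since multi-echo data are unusable without it.
    if re.search(r"_echo-\d+_", filename) and "EchoTime" not in sidecar:
        return ["ERROR: 'EchoTime' is required for files with 'echo-' in the name."]
    return []

def effective_slice_encoding_direction(sidecar: dict) -> str:
    # Point 4: optional deviation-recording fields get an explicit default
    # that consumers fall back to, so absence never needs a warning.
    return sidecar.get("SliceEncodingDirection", "k")
```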
Another possibility would be to move beyond the OPTIONAL/RECOMMENDED/REQUIRED terminology and think about different kinds of recommendation. I don't have a specific proposal here, but I would imagine keeping OPTIONAL/REQUIRED and adding categories such as "provenance" or "analysis", distinguishing metadata that makes your dataset more understandable to someone attempting to replicate your experiment from metadata needed by someone attempting to analyze it. A validator could then sort out different kinds of warnings.
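As a strawman of what that could look like on the tooling side (all names invented here):

```python
from dataclasses import dataclass
from enum import Enum

class Category(Enum):
    PROVENANCE = "provenance"  # helps someone replicate the experiment
    ANALYSIS = "analysis"      # affects how someone analyzes the data

@dataclass
class ValidatorWarning:
    field: str
    category: Category
    message: str

def filter_warnings(warnings: list[ValidatorWarning],
                    show: set[Category]) -> list[ValidatorWarning]:
    """A CLI flag like a hypothetical --warn=analysis could map onto this."""
    return [w for w in warnings if w.category in show]
```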
This is all coming from an MRI perspective, where DICOM inputs are the norm. Input from ephys/PET/NIRS researchers would be appreciated as well.
Best,