Hi BIDS-ers,
I have previously modified the text in the specification explaining how the Inheritance Principle works, as it was somewhat loosely defined and not particularly systematized. Those changes were in part motivated by my intent to now change the rules of the principle itself: the current operation needed to be clearer before a change to it could be expressed. My proposed change here facilitates the storage of more complex data representations, which become relevant in the context of BIDS Derivatives, but it comes at the cost of increased complexity. So the maintainers and I would like to know others' opinions on how to proceed.
I've written a lot on the topic in relevant links (likely too much for most), so I'm going to try to explain the issue here as succinctly as I can.
The What
Here are two example BIDS datasets, pulled directly from the current specification.
Example 1:
└─ sub-01/
   └─ ses-test/
      ├─ anat/
      │  └─ sub-01_ses-test_T1w.nii.gz
      └─ func/
         ├─ sub-01_ses-test_task-overtverbgeneration_run-1_bold.nii.gz
         ├─ sub-01_ses-test_task-overtverbgeneration_run-2_bold.nii.gz
         ├─ sub-01_ses-test_task-overtverbgeneration_bold.json
         └─ sub-01_ses-test_task-overtverbgeneration_run-2_bold.json
Example 2:
└─ sub-01/
   └─ ses-test/
      ├─ sub-01_ses-test_task-overtverbgeneration_bold.json
      ├─ anat/
      │  └─ sub-01_ses-test_T1w.nii.gz
      └─ func/
         ├─ sub-01_ses-test_task-overtverbgeneration_run-1_bold.nii.gz
         ├─ sub-01_ses-test_task-overtverbgeneration_run-2_bold.nii.gz
         └─ sub-01_ses-test_task-overtverbgeneration_run-2_bold.json
Note the move of file "sub-01_ses-test_task-overtverbgeneration_bold.json" from "sub-01/ses-test/func/" to "sub-01/ses-test/".
In the current specification:
- Example 1 is illegal: the NIfTI image for run 2 has multiple applicable JSON files in one directory, and that's not allowed.
- Example 2 is one possible workaround to make the dataset legal.
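As a rough illustration of the rule that trips up Example 1, here is a minimal Python sketch. It is not how the validator or any BIDS API is implemented: the function names (`parse_name`, `applicable_sidecars`) are made up for this example, and the matching logic (same suffix, and every entity in the JSON file name also present with the same value in the data file name) is a simplification of the principle.

```python
def parse_name(filename):
    """Split a BIDS-style file name into (entities, suffix, extension)."""
    stem, _, ext = filename.partition(".")
    *pairs, suffix = stem.split("_")
    entities = dict(pair.split("-", 1) for pair in pairs)
    return entities, suffix, ext


def applicable_sidecars(data_file, directory_listing):
    """Return the JSON files in ONE directory that apply to data_file.

    Simplifying assumption: a JSON file applies if it has the same suffix
    and every entity it carries matches the data file.
    """
    data_entities, data_suffix, _ = parse_name(data_file)
    hits = []
    for name in directory_listing:
        entities, suffix, ext = parse_name(name)
        if ext != "json" or suffix != data_suffix:
            continue
        if all(data_entities.get(key) == value for key, value in entities.items()):
            hits.append(name)
    return hits


# Contents of sub-01/ses-test/func/ in Example 1.
EXAMPLE_1_FUNC = [
    "sub-01_ses-test_task-overtverbgeneration_run-1_bold.nii.gz",
    "sub-01_ses-test_task-overtverbgeneration_run-2_bold.nii.gz",
    "sub-01_ses-test_task-overtverbgeneration_bold.json",
    "sub-01_ses-test_task-overtverbgeneration_run-2_bold.json",
]

run2 = "sub-01_ses-test_task-overtverbgeneration_run-2_bold.nii.gz"
hits = applicable_sidecars(run2, EXAMPLE_1_FUNC)
# More than one applicable JSON file in a single directory is what the
# current specification forbids.
print(len(hits) > 1)  # True for the run-2 image in Example 1
```

In Example 2 the same check passes for every directory, because "sub-01/ses-test/" and "sub-01/ses-test/func/" each hold at most one JSON file applicable to any given image.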
The why
- The proposed change removes what I think is an ill-directed, ultimately unnecessary restriction on dataset construction.
- It facilitates construction of more complex data, particularly when it comes to derivatives.
I have myself run into this limitation in the context of diffusion MRI voxel-level models, but I foresee that the issue is far more general, so I'll try to present it here absent context-specifics.
Imagine we fit a voxel-wise model to the data. We'd like to store metadata relating to the nature of the model, how it was fitted, and any input parameters that may have been set; it makes sense to store that information only once. The output of that model fit includes multiple output parameters, and each of those parameters has different units from the others; so we put those in separate images and encode their units in file-specific metadata.
One (simplified) way we could propose that such data be stored is:
└─ sub-01/
   └─ dwi/
      ├─ sub-01_param-X_model.nii.gz
      ├─ sub-01_param-X_model.json
      ├─ sub-01_param-Y_model.nii.gz
      ├─ sub-01_param-Y_model.json
      └─ sub-01_model.json
We put any metadata that applies to the whole model fit in "sub-01_model.json", and the units of parameters X and Y are specified in files "sub-01_param-X_model.json" and "sub-01_param-Y_model.json" respectively.
Problem: the current specification renders this structure illegal.
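To make the intent of that layout concrete, here is a small Python sketch of how a reader might assemble the metadata for each parameter image if multiple applicable JSON files in one directory were permitted. Everything here is illustrative: the field names ("ModelName", "Units") and their values are hypothetical, the JSON contents are held in an in-memory dictionary rather than read from disk, and the precedence rule (files carrying more entities override files carrying fewer) is an assumption made for this sketch, not wording taken from the proposal.

```python
def parse_name(filename):
    """Split a BIDS-style file name into (entities, suffix, extension)."""
    stem, _, ext = filename.partition(".")
    *pairs, suffix = stem.split("_")
    entities = dict(pair.split("-", 1) for pair in pairs)
    return entities, suffix, ext


def merged_metadata(data_file, sidecars):
    """Merge all applicable JSON contents, least specific first.

    sidecars maps a JSON file name to its already-parsed content.
    Assumption: a file with more entities is "more specific" and wins
    when the same key appears in more than one file.
    """
    data_entities, data_suffix, _ = parse_name(data_file)
    applicable = []
    for name, content in sidecars.items():
        entities, suffix, ext = parse_name(name)
        if ext != "json" or suffix != data_suffix:
            continue
        if all(data_entities.get(key) == value for key, value in entities.items()):
            applicable.append((len(entities), content))
    merged = {}
    for _, content in sorted(applicable, key=lambda item: item[0]):
        merged.update(content)
    return merged


# Hypothetical contents of the three JSON files in sub-01/dwi/.
SIDECARS = {
    "sub-01_model.json": {"ModelName": "hypothetical-model"},
    "sub-01_param-X_model.json": {"Units": "unit-of-X"},
    "sub-01_param-Y_model.json": {"Units": "unit-of-Y"},
}

print(merged_metadata("sub-01_param-X_model.nii.gz", SIDECARS))
print(merged_metadata("sub-01_param-Y_model.nii.gz", SIDECARS))
```

Each image ends up with the shared model description plus its own units, without the model-level metadata being duplicated in every parameter-specific file.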
While there are alternative proposal structures for complex derivatives (e.g. see ramblings here) that might bypass this specific issue, I wouldn't want the inheritance principle limitation to be a primary motivating factor for resorting to one of those alternatives, especially if said limitation can be resolved.
- All existing datasets that satisfy the current principle will satisfy the revised principle (i.e. it's backwards-compatible).
The why not
- Increased complexity, and therefore potentially reduced accessibility.
(though I'd argue that users can simply neglect to make use of this advanced feature; most exploitations of this capability will come from BIDS Apps outputs)
- BIDS APIs across multiple software languages would need to be altered to satisfy the revised principle.
- The validator would need to be altered to satisfy the revised principle.
- While it would not introduce backward-incompatibility, it would introduce forward-incompatibility: datasets that exploit the new version of the specification would be illegal under a prior version of the specification.
-----------
I hope that's enough content to understand the situation and formulate an opinion.
There are a lot of possible tangential discussions here, some of which I may have riffed on in the various GitHub links. Happy to either provide additional links or feed forward what thought I've put into related issues or alternatives thus far.
Cheers
Rob