Variability of image quality metrics

oest...@stanford.edu

Nov 9, 2017, 11:51:23 AM

to mriqc-users

One of the hardest problems we've faced during our MRIQC experience is the reproducibility of measures.

We have basically identified two sources of variability:

- The obvious one: logical changes in the workflow, along with changes in the definition of metrics. The latest changes on this front were made before releasing 0.9.6. Therefore, unless you are affected by the second item of this list, you should not notice changes between versions 0.9.6 and 0.9.10.

- The tricky one: run-to-run reproducibility is not ensured when running C++ code compiled with commonly used flags. That is the case for ANTs within MRIQC. So, if you want to get exactly the same values from two runs of the same MRIQC version, make sure you use the flag --ants-nthreads 1 to avoid parallelism. Expect a much longer runtime, though (https://github.com/poldracklab/mriqc/pull/596).
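As a rough illustration of why parallelism can change results, here is a minimal Python sketch of one common mechanism: floating-point addition is not associative, so splitting a sum into per-thread partial sums can give a different answer than a single serial pass. This is an assumption about the general cause, not a confirmed analysis of what happens inside ANTs; the function names `naive_sum` and `chunked_sum` are made up for the example.

```python
def naive_sum(values):
    """Accumulate left to right, as a single-threaded loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, n_chunks):
    """Sum each chunk separately, then combine the partial sums.

    This mimics (hypothetically) a reduction split across n_chunks threads.
    """
    size = -(-len(values) // n_chunks)  # ceiling division
    partials = [naive_sum(values[i:i + size])
                for i in range(0, len(values), size)]
    return naive_sum(partials)


# Values of very different magnitude make rounding effects easy to see.
values = [1e16, 1.0, -1e16, 1.0]

serial = chunked_sum(values, 1)    # one "thread"
threaded = chunked_sum(values, 2)  # two "threads"

print("serial:  ", serial)
print("threaded:", threaded)
print("identical:", serial == threaded)
```

Neither result equals the exact sum here; the point is only that the two orderings disagree, which is the kind of discrepancy that forcing a single thread avoids.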

The extent to which this numerical variability affects the results in our paper is largely unknown, and any effort to investigate it would be very welcome.

I'll leave this thread open for anyone to comment, knowing the importance of the topic.

Romain Valabregue

Jul 11, 2018, 3:19:05 AM

to mriqc-users

Hi
This is indeed an important topic.

The first one is, as you said, obvious: one just needs to stay with a given version.
The second one is a little bit scary (but not that surprising; I remember reports about the variability of FreeSurfer metrics when running on different computers!). Do you have any idea of how large this variation is? Does it impact all metrics similarly, or are some metrics more stable?
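One way to start answering the "how much" question would be to run the same version twice on the same image and rank the metrics by relative difference. The sketch below assumes the metrics from each run are already loaded into plain dictionaries; the numbers are invented for illustration, and `relative_differences` is a hypothetical helper, not part of MRIQC.

```python
def relative_differences(run_a, run_b):
    """Return (metric, relative difference) pairs, most variable first.

    Only metrics present in both runs are compared. The difference is
    scaled by the larger absolute value so that it stays within [0, 1]
    for same-sign values.
    """
    diffs = {}
    for name in sorted(set(run_a) & set(run_b)):
        a, b = float(run_a[name]), float(run_b[name])
        scale = max(abs(a), abs(b))
        diffs[name] = 0.0 if scale == 0.0 else abs(a - b) / scale
    return sorted(diffs.items(), key=lambda item: item[1], reverse=True)


# Made-up values for two runs of the same image (not real MRIQC output).
run_1 = {"cjv": 0.5, "snr_total": 4.0, "efc": 0.25}
run_2 = {"cjv": 0.5, "snr_total": 5.0, "efc": 0.25}

ranking = relative_differences(run_1, run_2)
for name, diff in ranking:
    print(f"{name:>10s}  {diff:.2%}")
```

Repeating this over many images and run pairs would show whether some metrics are consistently more stable than others.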

A third source of variability is a change in the minimal preprocessing (realignment, segmentation, and the brain mask). Even without changing the metric definitions, changing the "volumes" they are computed from is very critical.
This is a point I would like to investigate further. For instance, the brain mask may not be optimal with the AFNI framework, and changing the way you compute it will change the QC metrics. How to then validate which one is best is another difficult question...
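To make the mask dependence concrete, here is a toy sketch: the same "image" is summarised inside two different masks, and the overlap between the masks is reported as a Dice coefficient. The mean intensity used here is a stand-in, not an actual MRIQC metric definition, and the data and the names `dice` and `masked_mean` are invented for the example.

```python
def dice(mask_a, mask_b):
    """Dice overlap between two binary masks given as flat 0/1 lists."""
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(mask_a) + sum(mask_b)
    return 1.0 if total == 0 else 2.0 * overlap / total


def masked_mean(image, mask):
    """Mean intensity of the voxels selected by the mask (toy metric)."""
    inside = [v for v, m in zip(image, mask) if m]
    return sum(inside) / len(inside)


# A flattened toy "image": bright voxels in the middle, darker ones around.
image = [10.0, 100.0, 110.0, 90.0, 20.0, 5.0]
mask_tight = [0, 1, 1, 1, 0, 0]  # hypothetical mask from one tool
mask_loose = [1, 1, 1, 1, 1, 0]  # hypothetical, more generous mask

print("Dice overlap:   ", dice(mask_tight, mask_loose))
print("mean (tight):   ", masked_mean(image, mask_tight))
print("mean (loose):   ", masked_mean(image, mask_loose))
```

Even with a fairly high overlap between the masks, the summary value changes substantially, because the extra voxels in the looser mask are exactly the ones that differ most from the rest.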

Cheers
Romain
