Datasets with artifacts/acquisition errors

Mark Mikkelsen

Sep 16, 2022, 7:25:30 PM
to bids-discussion
Hi all,

How should one handle raw data in BIDS that has obvious artifacts, is unusable, or was acquired with errors (due to human error)? In the interest of open science, one should note the excluded datasets in scientific reports.

My feeling is that these datasets should still be included amongst the raw data of a BIDSified project. But how would one make sure third-party users are aware of them? Should there be a new folder in the root project directory (or rawdata/), such as my_project/excludeddata/sub-12/...?


Remi Gau

Sep 17, 2022, 3:15:05 AM
to Mark Mikkelsen

Off the top of my head, I don't think there is a "formal" way to specify this in a BIDS dataset.

But there are still ways to mention it.

If you do not put the excluded data in the dataset, the most free form way to mention it is in the README of the dataset (see template here).

If you include the data in the dataset and want to stay BIDS compliant, you could keep those files in your dataset but flag them as unusable in the scans.tsv file, maybe with something like the status or status_description columns used for channels.tsv in EEG, iEEG, and MEG (see here). You could also "flag" those files by using an "acq-ERR" entity in the filename (though this may be seen as abusing the real meaning of the acq entity).
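As a sketch, a flagged entry in a scans.tsv could look like the following (the status and status_description column names are borrowed from the channels.tsv convention and are an assumption here -- the BIDS spec does not define them for scans.tsv, so they would be custom columns):

```
filename	acq_time	status	status_description
func/sub-12_task-rest_bold.nii.gz	2022-09-16T19:25:30	bad	severe subject motion
anat/sub-12_T1w.nii.gz	2022-09-16T19:40:12	good	n/a
```

Any such custom column should then be described in an accompanying scans.json sidecar.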

If you prefer to have this excluded data more cleanly separated from the rest of your data, then your suggestion seems good, but make sure to include "excludeddata" in a .bidsignore file.
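If you do create such a folder, the corresponding .bidsignore entry is a single line (folder name taken from Mark's hypothetical layout):

```
excludeddata/
```

The bids-validator will then skip everything under that folder when checking the dataset.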

Curious to hear other suggestions.

Rémi Gau

Marcel Zwiers

Sep 17, 2022, 4:59:08 AM
to bids-discussion
I believe there is no clear distinction between data with and without artifacts -- all data have artifacts; only the degree differs. And what is an artifact for one analysis may not be a problem for another. For this reason, I think all data should be included and the artifacts in them should be quantified. These QC measures then fit perfectly well in the derivatives folder.
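For example, a layout along these lines (hypothetical; mriqc is named only as one common QC tool, not something prescribed here):

```
my_project/
├── sub-12/
│   └── ...
└── derivatives/
    └── mriqc/
        └── sub-12/
            └── ...   # image quality metrics quantifying the artifacts
```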

Oostenveld, R. (Robert)

Sep 17, 2022, 6:48:00 AM
Dear Mark

I would not split it off into a separate directory but would add a "quality" column to the participants.tsv or to the scans.tsv (and document that in the participants.json or scans.json) and use that to flag the good/bad scans or subjects. In the README you can then document that the quality varies and point to that column.
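A minimal sketch of this at the subject level, assuming a column named quality with levels good/bad (both the column name and the levels are illustrative choices, not defined by BIDS) -- participants.tsv:

```
participant_id	quality
sub-11	good
sub-12	bad
```

and the documenting entry in participants.json:

```json
{
  "quality": {
    "Description": "Overall data quality rating from visual inspection",
    "Levels": {
      "good": "No major artifacts",
      "bad": "Severe artifacts or acquisition error"
    }
  }
}
```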

best regards

PS and thanks for also sharing bad data, that is important for developing and testing ways to detect and deal with noise.


Open Minds Lab

Sep 19, 2022, 12:20:31 PM
to bids-discussion
Hi everyone, speaking of acquisition errors, I just wanted to mention a tool we built to detect them by checking for "protocol compliance" across a given dataset, and we would love to hear your comments or feedback:

Please bear with somewhat rough docs as we're still refining them.


Mark Mikkelsen

Sep 27, 2022, 2:50:47 PM
to bids-discussion
Thanks for the great suggestions, everyone! I will go with including all the data and adding a QA column to scans.tsv (documented in scans.json).