Hi everyone,
I am running a classification (decoding) analysis on a multi-class dataset. Each observation can carry different labels depending on the categorization task (e.g., visual dimensions or semantic dimensions). The problem is that the chance level varies across tasks because the number of unique labels differs: with 2 labels in the visual task and 3 labels in the semantic task, chance is 50% and about 33%, respectively (assuming balanced classes). This makes comparing decoding accuracy across tasks problematic, as the baseline accuracy is not the same.
To address this, I was thinking of running a separate binary classification for each pair of labels within a task and then averaging the classifier accuracy across all pairwise comparisons. This would yield a single, averaged classification accuracy with a consistent chance level (50%) across all tasks. Does this sound like a reasonable approach?
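To make the idea concrete, here is a minimal sketch of the pairwise averaging in plain Python. Everything in it is an assumption for illustration: the function names are hypothetical, the nearest-centroid classifier and leave-one-out cross-validation are only stand-ins for whatever classifier and validation scheme is actually used, and each pair is subsampled to equal class sizes so that the nominal chance level is 50%. It expects at least two distinct labels.

```python
from itertools import combinations
import random


def nearest_centroid_predict(train_X, train_y, x):
    """Return the label whose class centroid is closest to x (squared Euclidean)."""
    centroids = {}
    for label in set(train_y):
        rows = [row for row, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(x, centroids[label])),
    )


def loo_accuracy(X, y):
    """Leave-one-out accuracy of the stand-in nearest-centroid classifier."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += nearest_centroid_predict(train_X, train_y, X[i]) == y[i]
    return hits / len(X)


def mean_pairwise_accuracy(X, y, seed=0):
    """Average binary accuracy over all label pairs.

    Each pair is subsampled to equal class sizes (assumption: balanced
    classes are what make 50% the nominal chance level for every pair).
    Returns the mean accuracy and a dict of per-pair accuracies.
    """
    rng = random.Random(seed)
    scores = {}
    for a, b in combinations(sorted(set(y)), 2):
        idx_a = [i for i, label in enumerate(y) if label == a]
        idx_b = [i for i, label in enumerate(y) if label == b]
        n = min(len(idx_a), len(idx_b))
        keep = rng.sample(idx_a, n) + rng.sample(idx_b, n)
        scores[(a, b)] = loo_accuracy([X[i] for i in keep], [y[i] for i in keep])
    return sum(scores.values()) / len(scores), scores


# Toy data (made up for illustration): three labels, two features each.
X = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9], [10.0, 0.1], [9.8, 0.0]]
y = ["a", "a", "b", "b", "c", "c"]
mean_acc, per_pair = mean_pairwise_accuracy(X, y)
print(mean_acc, per_pair)
```

For a task with 2 labels this reduces to a single binary accuracy, and for a task with 3 labels it averages over 3 pairs, so both tasks would be reported on the same 50% baseline.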
Thanks in advance for any help you can provide!
Best,
Andrea