Feature selection gives suspiciously high accuracy

Andrea Ivan Costantino

Mar 29, 2023, 5:59:40 AM
to CoSMoMVPA
Hi all,

I am running a classification analysis on some brain data I collected, using an LDA classifier and then computing the cross-validated distance along the linear discriminant. The task is a visual task with some higher cognitive components.

The results for one subject are shown below. The x axis shows several ROIs; for each ROI, the blue bar is the average distance for classification in the original mask (no feature selection) and the orange bar is the distance for the mask with feature selection.

The good news is that feature selection improves classifier performance across the board. The bad news is that we really would not expect any above-chance classification in control regions (i.e., non-visual regions) such as auditory or motor cortex.

Does anyone have an idea of what is going on? Any help would be very much appreciated.

Best regards,
Andrea

[Attachment: Figure 2023-03-29 114623.png]

Nick Oosterhof

Mar 29, 2023, 2:05:42 PM
to Andrea Ivan Costantino, CoSMoMVPA

Andrea Ivan Costantino

Oct 30, 2023, 7:46:19 AM
to Nick Oosterhof, CoSMoMVPA
Thanks. I am aware of the problem of double dipping (selecting features on the whole dataset lets test-set information leak into the training stage), and it seems that this is indeed what is happening here.

However, it is not clear to me how the train/test partitioning should be implemented when we want to do feature selection. I can implement it manually for each fold (sketched below, after my current code), but I feel there must be an easier way using CoSMoMVPA's native functions. This is how I am running the MVPA:


% Define labels for the data samples and other arguments needed for classification
ds.sa.targets = results.targets_table.CheckmateTarget; % use CheckmateTarget as the target labels
cosmoArgs = struct(); % initialize an empty structure to hold classification arguments
% Define the classifier function to be used (linear discriminant analysis)
cosmoArgs.classifier = @cosmo_classify_lda;
% Define how to partition the data for cross-validation
cosmoArgs.partitions = cosmo_nfold_partitioner(ds);
% Specify the type of output ('fold_accuracy' gives the accuracy for each fold)
cosmoArgs.output = 'fold_accuracy';
% Set the maximum number of features allowed in the dataset
cosmoArgs.max_feature_count = 10000;
% Run the MVPA classification
checkRes = cosmo_crossvalidation_measure(ds, cosmoArgs);
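
For reference, this is roughly what I mean by implementing it manually for each fold (just a sketch; the 50% ratio and the use of cosmo_anova_feature_selector to score features are arbitrary choices of mine):

partitions = cosmo_nfold_partitioner(ds);
nfolds = numel(partitions.train_indices);
fold_acc = zeros(nfolds, 1);
for f = 1:nfolds
    % slice out the train and test samples for this fold
    ds_train = cosmo_slice(ds, partitions.train_indices{f});
    ds_test = cosmo_slice(ds, partitions.test_indices{f});
    % select features using the training data only, so no test
    % information leaks into the selection (avoids double dipping)
    keep_count = round(.5 * size(ds_train.samples, 2)); % keep top 50% (example value)
    keep = cosmo_anova_feature_selector(ds_train.samples, ...
                                        ds_train.sa.targets, keep_count);
    % train and test the LDA classifier on the selected features only
    pred = cosmo_classify_lda(ds_train.samples(:, keep), ...
                              ds_train.sa.targets, ...
                              ds_test.samples(:, keep));
    fold_acc(f) = mean(pred(:) == ds_test.sa.targets(:));
end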

How would I use the cosmo_meta_feature_selection_classifier function here or, more generally, how would I do feature selection in this LDA analysis?

Andrea

On 29 Mar 2023, at 8:05 pm, Nick Oosterhof <n.n.oo...@googlemail.com> wrote:



Nick Oosterhof

Oct 30, 2023, 7:58:29 AM
to Andrea Ivan Costantino, CoSMoMVPA
Greetings,

> On Oct 30, 2023, at 12:46, Andrea Ivan Costantino <andreaivan...@gmail.com> wrote:
>
> […]
>
> How would I use the cosmo_meta_feature_selection_classifier function here or, more generally, how would I do feature selection in this LDA analysis?

Actually, cosmo_meta_feature_selection_classifier is deprecated; the updated function is called cosmo_classify_meta_feature_selection. Its documentation contains an example using a searchlight.

There the final line of code is:

res=cosmo_searchlight(ds_tl,nbrhood,measure,measure_args,...
'progress',false);

which, if you want to run the analysis only once on the entire dataset in ds_tl (without a searchlight), can be changed into:

res=measure(ds_tl, measure_args);

The idea is that for the measure arguments used with cosmo_classify_meta_feature_selection:
- child_classifier is used as the classifier; you would use @cosmo_classify_lda there
- feature_selector and feature_selection_ratio_to_keep define how to select the 'best' features
- other arguments, such as partitions, are passed on to the child_classifier.
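
Putting those pieces together for your case, a minimal sketch might look like this, assuming ds_tl already has .sa.targets and .sa.chunks set; the ANOVA-based selector and the ratio of .5 are just example choices:

measure = @cosmo_crossvalidation_measure;
measure_args = struct();
% meta classifier that repeats feature selection inside each training fold
measure_args.classifier = @cosmo_classify_meta_feature_selection;
measure_args.child_classifier = @cosmo_classify_lda;           % classifier run after selection
measure_args.feature_selector = @cosmo_anova_feature_selector; % example scoring function
measure_args.feature_selection_ratio_to_keep = .5;             % example value: keep the top 50%
measure_args.partitions = cosmo_nfold_partitioner(ds_tl);
res = measure(ds_tl, measure_args);

Because the selector only ever sees the training samples of each fold, classification in your control regions should drop back towards chance.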

Does that help?

