cosmo_montecarlo_cluster_stat and Stelzer et al. (2013) correction for multiple comparisons


Dillon Plunkett

unread,
Jun 14, 2019, 3:31:01 PM6/14/19
to CoSMoMVPA
Hi, 

The FAQ notes that cosmo_montecarlo_cluster_stat supports the method described by Stelzer et al. (2013) for correcting for multiple comparisons. As I understand it, the Stelzer et al. proposal is:
  1. Make 100 chance accuracy maps per participant. 
  2. Make a chance group accuracy map by randomly selecting 1 chance accuracy map per participant and averaging those together. Repeat this procedure 10,000 times.
  3. Using those group maps, determine what accuracy at each voxel corresponds to p < .001 (using the distribution specifically at that voxel in the chance group maps, not the same threshold for every voxel). 
  4. Use those values to threshold each of the null images and the real map.
  5. Assign a p-value to each cluster by comparing to how often clusters of that size appear in the chance group maps.
  6. Correct for the multiple comparisons involved in Step 5 by using step-down FDR on the cluster p-values.
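In code, my understanding of steps 2-3 is roughly this (a sketch with made-up variable names; null_maps would be an n_subjects x n_maps x n_voxels array holding the chance accuracy maps from step 1, and prctile requires the Statistics toolbox):

```matlab
% Sketch of Stelzer et al. steps 2-3 (not CoSMoMVPA code; names made up).
% null_maps: n_subj x n_null x n_vox array of chance accuracy maps.
n_boot=10000;
[n_subj,n_null,n_vox]=size(null_maps);
group_null=zeros(n_boot,n_vox);
for b=1:n_boot
    % step 2: pick one chance map per participant, then average
    idx=randi(n_null,n_subj,1);
    for s=1:n_subj
        group_null(b,:)=group_null(b,:)+ ...
                            reshape(null_maps(s,idx(s),:),1,n_vox);
    end
    group_null(b,:)=group_null(b,:)/n_subj;
end
% step 3: voxel-wise accuracy threshold corresponding to p < .001
thr=prctile(group_null,99.9,1);
```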
I'm confused, however, about how to actually achieve this with cosmo_montecarlo_cluster_stat. 

I perform Step 1 separately, but don't understand how to then proceed with cosmo_montecarlo_cluster_stat. In particular, none of the possible combinations of permitted options described for either `cluster_stat` or `feature_stat` sound like they would perform the Stelzer et al. approach. 

What's the correct use of cosmo_montecarlo_cluster_stat if the goal is just to implement the Stelzer et al. approach (after having generated the 100 null datasets per participant)?

Thanks very much!


Dillon Plunkett

Nick Oosterhof

unread,
Jun 15, 2019, 6:25:36 AM6/15/19
to Dillon Plunkett, CoSMoMVPA
Greetings,

On Fri, 14 Jun 2019 at 21:31, Dillon Plunkett <plun...@g.harvard.edu> wrote:
The FAQ notes that cosmo_montecarlo_cluster_stat supports the method described by Stelzer et al. (2013) for correcting for multiple comparisons. As I understand it, the Stelzer et al. proposal is:
  1. Make 100 chance accuracy maps per participant. 
  2. Make a chance group accuracy map by randomly selecting 1 chance accuracy map per participant and averaging those together. Repeat this procedure 10,000 times.
  3. Using those group maps, determine what accuracy at each voxel corresponds to p < .001 (using the distribution specifically at that voxel in the chance group maps, not the same threshold for every voxel). 
  4. Use those values to threshold each of the null images and the real map.
  5. Assign a p-value to each cluster by comparing to how often clusters of that size appear in the chance group maps.
  6. Correct for the multiple comparisons involved in Step 5 by using step-down FDR on the cluster p-values.
I'm confused, however, about how to actually achieve this with cosmo_montecarlo_cluster_stat. 

I perform Step 1 separately, but don't understand how to then proceed with cosmo_montecarlo_cluster_stat. In particular, none of the possible combinations of permitted options described for either `cluster_stat` or `feature_stat` sound like they would perform the Stelzer et al. approach. 

cosmo_montecarlo_cluster_stat does not work exactly like the steps above. In CoSMoMVPA:

step 1-cosmo: make (for example) 100 null datasets for each individual participant. This must be done outside cosmo_montecarlo_cluster_stat, for example using cosmo_randomize_targets.
step 2a-cosmo: compute the group statistic map using the original data (not null data). First a feature-wise statistical test is performed (usually a t-test). The group statistic is then based on this; it is either a cluster size (for features that survive a fixed uncorrected threshold) or a TFCE score (which does not require specifying an uncorrected threshold).
step 2b-cosmo: do the same as step 2a-cosmo, but now using the null maps. Make (for example) 10,000 null maps in this way, sampling from the maps in step 1-cosmo.
step 3-cosmo: compute the distribution of the maximum statistic over the null maps from step 2b-cosmo.
step 4-cosmo: compute the p-value of the group map from step 2a-cosmo by computing, for each feature, where it lies in the distribution obtained in step 3-cosmo.
step 5-cosmo: convert the p-values from step 4-cosmo to a z-score, for easier visualisation.

FDR is not used.
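In outline, the maximum-statistic logic of steps 3-5 amounts to something like this (an illustrative sketch, not the actual implementation; variable names are made up, and norminv requires the Statistics toolbox):

```matlab
% Sketch of steps 3-5 (illustrative only; names are made up).
% tfce_orig: 1 x n_vox TFCE map from the original data (step 2a-cosmo)
% tfce_null: n_null x n_vox TFCE maps from the null data (step 2b-cosmo)
max_null=max(tfce_null,[],2);      % step 3: maximum statistic per null map
n_null=numel(max_null);
p=zeros(size(tfce_orig));
for v=1:numel(tfce_orig)
    % step 4: fraction of null maxima at least as large as the observed value
    p(v)=(1+sum(max_null>=tfce_orig(v)))/(1+n_null);
end
z=-norminv(p);                     % step 5: convert p-values to z-scores
```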

What's the correct use of cosmo_montecarlo_cluster_stat if the goal is just to implement the Stelzer et al. approach (after having generated the 100 null datasets per participant)?

 Quite similar to its normal use (as in the example in the documentation), except that you need to provide data for the 'null' argument. This is used in step 2b-cosmo above. Please let me know if you need more information.
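For example, something like this (a sketch; names such as ds and null_cell are placeholders, and here h0_mean is set for a 2-class classification analysis):

```matlab
% Sketch: ds is the stacked group dataset (one row per participant);
% null_cell is a cell array of null datasets built from the per-participant
% null maps of step 1-cosmo, with .sa.targets and .sa.chunks matching ds.
opt=struct();
opt.cluster_stat='tfce';
opt.niter=10000;
opt.h0_mean=0.5;          % chance accuracy for 2 classes
opt.null=null_cell;       % used in step 2b-cosmo
nh=cosmo_cluster_neighborhood(ds);
z_ds=cosmo_montecarlo_cluster_stat(ds,nh,opt);
```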

best,
Nick

Adi Yaniv

unread,
Jun 17, 2019, 7:42:34 AM6/17/19
to CoSMoMVPA
Hi, 
I'm posting on this thread as my question is also related to the Monte Carlo cluster-size statistic.
I performed a searchlight using cosmo_dissimilarity_matrix_measure as the measure [i.e., the searchlight is simply SL=cosmo_searchlight(ds,nbrhood,measure)].
From it I get targets1 and targets2. After stacking all subjects, I extract only the rows of correlations that interest me (for instance, the dissimilarity between targets 1 and 2 for each subject) and average them into a new dataset, "ds_12" (cosmo_disp output below).
Now I am trying to calculate the Monte Carlo cluster statistic on ds_12. I tried several options (after reading the doc), like:
opt=struct();
opt.cluster_stat='tfce';
opt.niter=150;
opt.h0_mean=0;
z_ds=cosmo_montecarlo_cluster_stat(ds_12,nbrhoodMC,opt);
I get an error:
Reference to non-existent field 'targets'.
When I manually add a targets field (set to 1) and run the same line, I get a "no chunks" error.

What would be the right way to run the Monte Carlo statistic for my analysis? All I need from it is a cluster-size output to threshold the searchlight result maps.

thanks a lot! 
-Adi


if needed - 
cosmo_disp(ds_12)
.samples                                                                
  [ 0.529     0.504     0.515  ...  0.49     0.488     0.492 ]@1x50552  
.sa                                                                     
  .targets1                                                             
    [ 1 ]                                                               
  .targets2                                                             
    [ 2 ]                                                               
.fa                                                                     
  .center_ids                                                           
    [ 1         2         3  ...  5.06e+04  5.06e+04  5.06e+04 ]@1x50552
  .i                                                                    
    [ 33        32        33  ...  32        29        30 ]@1x50552     
  .j                                                                    
    [ 25        26        26  ...  31        32        33 ]@1x50552     
  .k                                                                    
    [ 1         1         1  ...  46        46        46 ]@1x50552      
  .nvoxels                                                              
    [ 89       109        89  ...  102       101       100 ]@1x50552    
  .radius                                                               
    [ 3.74      3.74      3.74  ...  3.74      4.12      4.47 ]@1x50552 
.a                                                                      
  .fdim                                                                 
    .labels                                                             
      { 'i'                                                             
        'j'                                                             
        'k' }                                                           
    .values                                                             
      { [ 1         2         3  ...  56        57        58 ]@1x58     
        [ 1         2         3  ...  38        39        40 ]@1x40     
        [ 1         2         3  ...  44        45        46 ]@1x46 }   
  .sdim                                                                 
    .labels                                                             
      { 'targets1'  'targets2' }                                        
    .values                                                             
      { [ 1    [ 1                                                      
          2      2                                                      
          3 ]    3 ] }                                                  
  .vol                                                                  
    .mat                                                                
      [  0         0        -3        72                                
        -3         0         0        74                                
         0        -3         0        79                                
         0         0         0         1 ]                              
    .dim                                                                
      [ 58        40        46 ]                                        
    .xform                                                              
      'talairach' 

Nick Oosterhof

unread,
Jun 17, 2019, 3:23:13 PM6/17/19
to Adi Yaniv, CoSMoMVPA
Greetings,

On Mon, 17 Jun 2019 at 13:42, Adi Yaniv <adi....@gmail.com> wrote:

I'm posting on this thread as my question is also related to the Monte Carlo cluster-size statistic.
I performed a searchlight using cosmo_dissimilarity_matrix_measure as the measure [i.e., the searchlight is simply SL=cosmo_searchlight(ds,nbrhood,measure)].
From it I get targets1 and targets2. After stacking all subjects, I extract only the rows of correlations that interest me (for instance, the dissimilarity between targets 1 and 2 for each subject) and average them into a new dataset, "ds_12" (cosmo_disp output below).
Now I am trying to calculate the Monte Carlo cluster statistic on ds_12. I tried several options (after reading the doc), like:
opt=struct();
opt.cluster_stat='tfce';
opt.niter=150;
opt.h0_mean=0;
z_ds=cosmo_montecarlo_cluster_stat(ds_12,nbrhoodMC,opt);
I get an error:
Reference to non-existent field 'targets'.
When I manually add a targets field (set to 1) and run the same line, I get a "no chunks" error.

What would be the right way to run the Monte Carlo statistic for my analysis?

You should have a dataset with maps from each participant, so if you have 12 participants, the .samples field should be 12xN (where N is the number of voxels). Set .sa.targets all to the same value (e.g. 1), and .sa.chunks to 1:12. This indicates that you want to do a one-sample t-test. See the documentation of cosmo_montecarlo_cluster_stat on how to set .sa.targets and .sa.chunks for one- and two-sample t-tests, and for one-way and repeated-measures ANOVA.
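For example, for a one-sample t-test (a sketch; ds_group is a placeholder for a stacked dataset with one row per participant, i.e. not averaged across subjects):

```matlab
% Sketch: one-sample t-test setup against h0_mean (ds_group is a placeholder).
n_subj=size(ds_group.samples,1);
ds_group.sa.targets=ones(n_subj,1);   % all identical: one-sample design
ds_group.sa.chunks=(1:n_subj)';       % one independent chunk per participant
```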
 
All I need from it is a cluster-size output to threshold the searchlight result maps.

cosmo_montecarlo_cluster_stat will compute a map corrected for multiple comparisons. If you want to use a fixed uncorrected threshold (not TFCE), then set p_uncorrected to the desired threshold. Currently there is no functionality to directly output a critical cluster size, although such sizes are computed internally.
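With a fixed uncorrected threshold, the call could look like this (a sketch, reusing the dataset name from your message):

```matlab
% Sketch: maximum-cluster-size statistic with a fixed uncorrected
% feature-wise threshold (p < .001) instead of TFCE.
opt=struct();
opt.cluster_stat='maxsize';
opt.p_uncorrected=0.001;
opt.niter=10000;
opt.h0_mean=0;
nbrhood=cosmo_cluster_neighborhood(ds_HS);
z_ds=cosmo_montecarlo_cluster_stat(ds_HS,nbrhood,opt);
```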



Adi Yaniv

unread,
Jun 20, 2019, 6:07:42 AM6/20/19
to CoSMoMVPA
Hi Nick,

Thanks for your reply. I tried it, but I still get an error. Here's what I have tried:

 - calculate a searchlight for each subject with cosmo_dissimilarity_matrix_measure
 - stack all subjects' searchlight results using cosmo_stack
 - use cosmo_slice to get a unique combination of targets1 and targets2, for instance:

ds_H=cosmo_slice(ds_all, ds_all.sa.targets2==1);
ds_HS=cosmo_slice(ds_H,ds_H.sa.targets1==2);  

[H and S index my conditions (target names); I am interested in the dissimilarity between condition 1 and condition 2 across the whole brain.]

Now I would like to run the Monte Carlo statistic on ds_HS, which contains:
.samples (n_subjects x n_voxels)
.sa.targets1 and .sa.targets2, and no chunks

As you suggested, I manually add chunks as 1:n_subjects and set targets to 1:
ds_HS.sa.targets=ones(1,n_subjects)';
ds_HS.sa.chunks=(1:n_subjects)';

Then I calculate the neighborhood:
  nbrhood=cosmo_spherical_neighborhood(ds_HS,'count',n_voxels);

And when I try to use cosmo_montecarlo_cluster_stat (I want TFCE as output):

ds_MC=cosmo_montecarlo_cluster_stat(ds_HS,nbrhood,'niter',10000,'h0_mean',0,...
    'feature_stat','auto','cluster_stat','tfce','dh',0.1);
I get:
Error using cosmo_montecarlo_cluster_stat>get_clusterizer_func (line 790)
option 'feature_sizes' is not provided, and neighborhood struct does not contain a field .fa.sizes. This
probably means that the neighborhood was not created using cosmo_cluster_neighborhood (which sets this
feature attribute appropriately)

Background: cluster measure usually requires the relative size of each feature. For example:
- in a volumetric fMRI dataset it can be a  row vector of ones, because all voxels have the same size
- in a surface-based dataset it should be the area of each node (such as computed by surfing_surfacearea in
the surfing toolbox).

Your options are:
- generate the neighborhood using cosmo_cluster_neighborhood. This is the easiest option and should be used
unless you are using a custom cluster neighborhood.
- set the .fa.sizes manually.


Error in cosmo_montecarlo_cluster_stat (line 295)
    cluster_func=get_clusterizer_func(nbrhood,opt);


Then I tried:
ds_HS.fa.sizes=ones(1,n_voxels);

But I still get this error.
I also tried without h0_mean; then I get another error stating that h0_mean is needed.

I guess something with the neighborhood is not right. Maybe I should compute it in another way, or at a different stage of my code? Or maybe my use of cosmo_montecarlo_cluster_stat is wrong?

thanks a lot,
Adi

Nick Oosterhof

unread,
Jun 20, 2019, 6:38:27 AM6/20/19
to Adi Yaniv, CoSMoMVPA
The error message says "This probably means that the neighborhood was not created using cosmo_cluster_neighborhood ", and that seems indeed the case. Can you try using cosmo_cluster_neighborhood?

Adi Yaniv

unread,
Jun 20, 2019, 7:26:33 AM6/20/19
to CoSMoMVPA



The error message says "This probably means that the neighborhood was not created using cosmo_cluster_neighborhood ", and that seems indeed the case. Can you try using cosmo_cluster_neighborhood?
thanks, I tried:
cosmo_cluster_neighborhood(ds_HS)


Got:
Error using cosmo_check_dataset (line 138)
: Missing field .sa

Error in cosmo_check_neighborhood (line 90)
    is_ok=cosmo_check_dataset(nbrhood,raise);

Error in cosmo_cluster_neighborhood (line 190)
    cosmo_check_neighborhood(nbrhood,ds); 

...although ds_HS does contain .sa:

K>> cosmo_disp(ds_HS)
.samples                                                                
  [ 0.339     0.275     0.378  ...  0.603     0.548     0.546           
     1.02      1.15      1.13  ...  0.519     0.486     0.459           
    0.328     0.431     0.374  ...  0.407     0.236     0.295           
      :         :         :           :         :         :             
    0.806     0.802     0.921  ...  0.687     0.691     0.952           
     0.25      0.25     0.185  ...   0.17     0.284      0.25           
    0.469      0.44     0.453  ...  0.508     0.351      0.34 ]@79x50552
.sa                                                                     
  .targets1                                                             
    [ 2                                                                 
      2                                                                 
      2                                                                 
      :                                                                 
      2                                                                 
      2                                                                 
      2 ]@79x1                                                          
  .targets2                                                             
    [ 1                                                                 
      1                                                                 
      1                                                                 
      :                                                                 
      1                                                                 
      1                                                                 
      1 ]@79x1                                                          
  .targets                                                              
    [ 1                                                                 
      1                                                                 
      1                                                                 
      :                                                                 
      1                                                                 
      1                                                                 
      1 ]@79x1                                                          
  .chunks                                                               
    [  1                                                                
       2                                                                
       3                                                                
       :                                                                
      77                                                                
      78                                                                
      79 ]@79x1                                                         

Nick Oosterhof

unread,
Jun 20, 2019, 7:42:00 AM6/20/19
to Adi Yaniv, CoSMoMVPA
Hmm, that's weird, I did not expect that. 
Can you maybe send your dataset (ds_HS) as a .mat file to me by email? 
I can then try to understand why you get this error. 

(In case you have a super exciting brain map results that you're not ready to share yet, you could replace the content of .samples with random data).

Nick Oosterhof

unread,
Jun 20, 2019, 9:02:54 AM6/20/19
to Adi Yaniv, CoSMoMVPA
On Thu, 20 Jun 2019 at 13:41, Nick Oosterhof <n.n.oo...@googlemail.com> wrote:
[...]

This seems to be a bug triggered by making a cluster neighbourhood from a dataset that has sample dimensions, in your case {'targets1', 'targets2'}. I will work on a fix, but in the meantime could you try the following workaround?

    % possible workaround
    sdim_labels = ds_HS.a.sdim.labels;
    ds_HS=cosmo_dim_remove(ds_HS,sdim_labels);

    % now this should work
    nh=cosmo_cluster_neighborhood(ds_HS);



Nick Oosterhof

unread,
Jun 21, 2019, 7:33:52 AM6/21/19
to CoSMoMVPA


---------- Forwarded message ---------
From: Nick Oosterhof <n.n.oo...@googlemail.com>
Date: Fri, 21 Jun 2019 at 13:33
Subject: Re: cosmo_montecarlo_cluster_stat and Stelzer et al. (2013) correction for multiple comparisons
To: Adi Yaniv <adi....@gmail.com>




On Thu, 20 Jun 2019 at 15:57, Adi Yaniv <adi....@gmail.com> wrote:
Thanks so much! It works!
I just want to make sure my implementation of the Monte Carlo statistic is correct. I'm doing this:

ds_z=cosmo_montecarlo_cluster_stat(ds_HS,nh,'niter',10000,'h0_mean',0,...
    'feature_stat','auto','cluster_stat','tfce','dh',0.1);

The output I'd like to get is the map of the average of ds_HS, corrected for multiple comparisons.
What I'm getting is a row vector (ds_z.samples), with values ranging from 3.4316 to 3.7190, which are the thresholds for the map (I'm writing it back to a .vmp file).


I'm not sure what you mean with "values ranging 3.4316-3.7190 - which are the thresholds for the map". The values are z-scores corresponding to p-values corrected for multiple comparisons. For example, with alpha=0.05, z-scores less than -1.96 or greater than 1.96 would be significant (below or above h0_mean, respectively) after correction for multiple comparisons.

Of course this assumes that h0_mean=0. For classification accuracies it must be set to h0_mean=1/n_classes, for example 0.25 if you have 4 classes (assuming a balanced design, see cosmo_balance_partitions). 
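To threshold the corrected z-map at alpha=0.05 (two-tailed), one could then do something like this (a sketch; norminv requires the Statistics toolbox, and 1.96 could be hard-coded instead):

```matlab
% Sketch: select features surviving correction at alpha=.05, two-tailed.
alpha=0.05;
z_crit=norminv(1-alpha/2);              % approximately 1.96
significant=abs(ds_z.samples)>z_crit;   % logical mask over features
```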

 