How to shuffle condition labels to conduct a permutation test


徐梦思

Feb 1, 2018, 9:03:51 AM
to AnalyzingNeuralTimeSeriesData

Hi, Mike.

I have recently been analyzing data from an experiment with a within-subject (paired-samples) design, and I have a question about how to do permutation testing for paired-sample data.

Suppose I have two conditions (A and B), and every subject received both treatments. I think there are two kinds of permutation for testing the difference between conditions, and both seem plausibly applicable.

1) Shuffle the condition labels, as in the method you describe in your book for comparing two independent samples.

2) Shuffle the subject mappings between conditions A and B, as you describe for correlation coefficients. Because the samples are paired, I suspect this method may also be OK.


What is your opinion? 

Many thanks in advance for your kindness ^_^.



This is a question you have answered before, but I still cannot get it to work.
I can run the permutation test for a between-subjects design (independent samples), but I really don't know how to shuffle condition labels for a within-subject design.
I hope you could post some code if convenient.

Thanks very much!

Mengsi

Mike X Cohen

Feb 1, 2018, 1:03:26 PM
to analyzingneura...@googlegroups.com
Hi Mengsi. If all subjects receive treatment A and treatment B, then the null hypothesis is that A=B. You could rewrite that to be A-B=0. And if you consider C=A-B, then you are testing the null hypothesis that C=0, which is the typical case of a one-sample t-test against 0. If A and B really are equal, then A-B=B-A=0. But if A and B are different, then A-B≠B-A.

I hope that part makes sense. Now consider that A-B = -(B-A). So you could say that the null hypothesis is that (A-B)=-(B-A). This is the key insight to the permutation testing here: Compute C=A-B for each subject, and then at each iteration, you multiply C by -1 for some random subset of subjects. After 1000 iterations, you'll have a distribution of the condition difference under the null hypothesis (C=-C=0), and you can compare that against your observed effect.

I hope that reasoning makes sense. Basically, you want to swap the condition labels for a random subset of subjects during permutation testing.

Mike
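
The sign-flipping procedure described above can be sketched as follows for the simplest case of one value per subject and condition. This is only a minimal illustration, not code from the book: the data are simulated, and the names A, B, C, nperm, nulldiff, and pval are assumptions chosen for this example.

```matlab
% Hypothetical example data: one value per subject and condition (simulated)
nsubj = 23;
A = randn(nsubj,1) + 0.8;        % condition A, one value per subject
B = randn(nsubj,1);              % condition B, same subjects
C = A - B;                       % paired differences

obsdiff = mean(C);               % observed condition difference

nperm    = 1000;
nulldiff = zeros(nperm,1);
for permi = 1:nperm
    % draw +1 or -1 independently for each subject;
    % subjects that draw -1 have their condition labels swapped
    signs = 2*(rand(nsubj,1) > 0.5) - 1;
    nulldiff(permi) = mean(C .* signs);
end

% two-sided p-value: proportion of null values at least as extreme as observed
pval = mean(abs(nulldiff) >= abs(obsdiff));
```

Drawing the sign independently for each subject means that both the size and the membership of the flipped subset vary from iteration to iteration.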







徐梦思

Feb 1, 2018, 10:43:56 PM
to AnalyzingNeuralTimeSeriesData
Thanks Mike,
Please check whether this code is correct for a single iteration.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % generate pixel-specific null hypothesis parameter distributions
    sn = randi(size(diff,1),1);   % diff = A - B; size(diff,1) = 23 subjects; randi draws a random integer from 1 to 23
    idx = (1:sn);
    diff1 = diff(idx,:,:)*(-1); % at each iteration, multiply diff by -1 for some random subset of subjects
    diff2 = diff;
    diff2(idx,:,:) = []; % remaining subjects
    fakediff = cat(1,diff1,diff2); % subjects x frequencies x time
    
    % compute t-map of null hypothesis
    tnum   = squeeze(mean(fakediff,1));
    tdenom = sqrt( squeeze((std(fakediff,0,1).^2)./(size(fakediff,1)-1)));
    tmap   = tnum./tdenom;

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%



Mike X Cohen

Feb 2, 2018, 12:41:55 AM
to analyzingneura...@googlegroups.com
I think you could skip several lines in the middle: 

    idx = (1:sn);
    fakediff = diff (idx,:,:)*(-1);
    
    % compute t-map of null hypothesis
    tnum   = squeeze(mean(fakediff,1));
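
Note that indexing with diff(idx,:,:) on its own returns only the selected subjects. A variant that flips the sign in place, so that all subjects stay in the matrix, might look like the sketch below. It is an illustration under stated assumptions rather than the code from this thread: the data are simulated, the variable is called condiff (a hypothetical name, chosen to avoid shadowing MATLAB's built-in diff function), the subset is drawn with a random logical vector instead of 1:sn, and the denominator is the usual standard error, std/sqrt(n).

```matlab
% Hypothetical data: subjects x frequencies x time points (stand-in for A - B)
nsubj = 23; nfrex = 10; ntime = 50;
condiff = randn(nsubj, nfrex, ntime);

% flip the sign of a random subset of subjects, keeping every subject in the matrix
flipidx  = rand(nsubj,1) > 0.5;          % true = swap condition labels for this subject
fakediff = condiff;
fakediff(flipidx,:,:) = -fakediff(flipidx,:,:);

% one-sample t-map against zero under the null hypothesis
tnum   = squeeze(mean(fakediff,1));
tdenom = squeeze(std(fakediff,0,1)) ./ sqrt(nsubj);   % standard error of the mean
tmap   = tnum ./ tdenom;
```

Because the number of subjects never changes, the standard deviation is always computed over all 23 subjects.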

徐梦思

Feb 2, 2018, 9:15:43 AM
to AnalyzingNeuralTimeSeriesData
Thanks again.
I understand your method, but I have another problem: when the random number of subjects is 1, the standard deviation cannot be calculated, so the t-map of the null hypothesis cannot be computed either (I do get some "NaN" results).
I also read "Nonparametric statistical testing of EEG- and MEG-data". The authors say that "we should randomly permute the subject-specific averages (condition1, condition2) within every subject. Moreover, this random permutation is performed independently for every subject". I therefore wonder whether this method is OK.
I have also used the method proposed in that paper and obtained some results. Could you help me check whether this code is correct for a single iteration?

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

nsubj = size(aver_eegpower1,1);   % 23 subjects; aver_eegpower1 is subjects x conditions x frequencies x time
aver_eegpower2 = zeros(size(aver_eegpower1));   % preallocate the permuted data

for i = 1:nsubj
    % randomly order the 2 conditions, independently for each subject
    perm_subj = randperm(size(aver_eegpower1,2));
    aver_eegpower2(i,:,:,:) = aver_eegpower1(i,perm_subj,:,:);
end

% compute t-map of null hypothesis (paired t-test on condition 2 minus condition 1)
tnum   = squeeze(mean(aver_eegpower2(:,2,:,:),1) - mean(aver_eegpower2(:,1,:,:),1));
tdenom = sqrt(squeeze(std(aver_eegpower2(:,2,:,:) - aver_eegpower2(:,1,:,:),0,1).^2 ./ size(aver_eegpower2,1)));
tmap   = tnum./tdenom;

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Thanks.
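
Swapping the two condition labels within a subject is the same operation as multiplying that subject's condition difference by -1, so the two approaches discussed in this thread should give the same null distribution. The sketch below checks this numerically on simulated data. The names power1, power2, swapidx, and maxdev are hypothetical and chosen only for this illustration.

```matlab
% Hypothetical data: subjects x conditions x frequencies x time points
nsubj = 23; nfrex = 8; ntime = 30;
power1 = randn(nsubj, 2, nfrex, ntime);

% decide independently for each subject whether to swap the two condition labels
swapidx = rand(nsubj,1) > 0.5;
power2  = power1;
power2(swapidx,:,:,:) = power1(swapidx,[2 1],:,:);

% condition difference after swapping labels
fakediff = squeeze(power2(:,2,:,:) - power2(:,1,:,:));

% the same quantity obtained by sign-flipping the original difference
origdiff = squeeze(power1(:,2,:,:) - power1(:,1,:,:));
signs    = 1 - 2*swapidx;                        % -1 for swapped subjects, +1 otherwise
flipdiff = origdiff .* repmat(signs, [1 nfrex ntime]);

% largest discrepancy between the two approaches
maxdev = max(abs(fakediff(:) - flipdiff(:)));
```

If maxdev comes out as zero (up to floating-point precision), the label-swapping loop and the sign-flipping shortcut are interchangeable for a paired design.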


Mike X Cohen

Feb 4, 2018, 2:36:02 AM
to analyzingneura...@googlegroups.com
Indeed, that discussion is about permutation testing for group-level statistics, i.e., over many subjects. If you have one subject, then you would want to save the data from all trials and then shuffle the condition labels of each trial. 




徐梦思

Feb 4, 2018, 8:50:29 AM
to AnalyzingNeuralTimeSeriesData
Thanks. I agree, and since my analyses mainly focus on the group level, I prefer to use their method.
However, when I try the subject-level method you suggest, a new problem arises: the number of trials differs between conditions (a consequence of data preprocessing). It is therefore difficult to set up a matrix (frequency * timepoint * trial) even for one subject. How should I resolve this problem?

Thanks again for the time you have spent on my 'stupid' questions. ^^


Mike X Cohen

Feb 5, 2018, 11:44:24 AM
to analyzingneura...@googlegroups.com
First of all, that's not a stupid question! Permutation-based statistics over multidimensional matrices is no simple business.

Anyway, the answer is that it doesn't matter if the trial count differs. Keep all of your trials in the matrix. Then you'll want a separate vector with all trial labels encoded as numbers:
condvector = [1 2 2 1 1 1 1 2 1 1 2 2 ... ];

Then the real condition difference is:
conddiff = squeeze( mean(data(:,:,condvector==1),3) - mean(data(:,:,condvector==2),3) );

To generate the permuted values, you repeat the line above except with a shuffled trial vector:
fakecondvector = condvector(randperm(length(condvector)));

You can see that the mapping of trial identity to condition is swapped, but the number of trials per condition remains.

Mike
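
Putting the lines above together, a complete single-subject permutation loop might look like the sketch below. The data are simulated with deliberately unequal trial counts, and the sizes, nperm, permdiffs, and zmap are assumptions made for this example. Normalizing the observed difference against the permutation distribution (the last line) is one common choice and is not something specified in the post.

```matlab
% Hypothetical single-subject data: frequencies x time points x trials
nfrex = 6; ntime = 40;
ntrials1 = 37; ntrials2 = 52;                    % unequal trial counts are fine
data = randn(nfrex, ntime, ntrials1 + ntrials2);
condvector = [ones(1,ntrials1) 2*ones(1,ntrials2)];

% observed condition difference
conddiff = mean(data(:,:,condvector==1),3) - mean(data(:,:,condvector==2),3);

nperm = 500;
permdiffs = zeros(nperm, nfrex, ntime);
for permi = 1:nperm
    % shuffle the labels; the number of trials per condition is preserved
    fakecondvector = condvector(randperm(length(condvector)));
    permdiffs(permi,:,:) = mean(data(:,:,fakecondvector==1),3) - ...
                           mean(data(:,:,fakecondvector==2),3);
end

% z-score the observed difference against the permutation distribution
zmap = (conddiff - squeeze(mean(permdiffs,1))) ./ squeeze(std(permdiffs,0,1));
```

Because only the label vector is shuffled, the data matrix never has to be split into two equally sized condition matrices.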


徐梦思

Feb 6, 2018, 1:04:50 AM
to AnalyzingNeuralTimeSeriesData
Thanks, Mike! I have learned a lot from your book, and I will try your method.

