permutation tests

Alexandre Gramfort

Feb 12, 2011, 5:18:11 PM
to pystatsmodels
Hi,

I need Python code to do permutation tests (paired t-test, pseudo
t-test, Wilcoxon, sign test, and more).
The application is brain imaging, but such tools should exist in a
stats package.

Do you know where I can find this?

thanks
Alex

josef...@gmail.com

Feb 12, 2011, 6:14:42 PM
to pystat...@googlegroups.com

It *should* exist, but it is not yet in statsmodels. Until now we
have relied mostly on asymptotic distributions of tests and estimators.

I thought nipy or pymvpa might have something already, but I don't
remember seeing anything in Python directly.
Sturla Molden has written permutation p-values as part of a function
(kendalltau, wilcoxon? I can't find the link), but in my Monte Carlo
runs it didn't look any better than the asymptotic distribution.

I'm not really familiar with the details of permutation tests, and I'm
only partway through reading up on them. I saw this last week which I
thought might be a good way to get (me) started:

http://www.mathworks.com/matlabcentral/fileexchange/29782-one-samplepaired-samples-permutation-t-test-with-correction-for-multiple-comparisons

I guess it's easy to implement with an iterator: over all permutations
for the exact distribution in small samples, and over random
permutations in large samples. (Although Sturla had a warning in his
code that numpy.random.shuffle needs a warm-up, and he dropped the
first observations.)
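
For concreteness, a minimal sketch of that iterator idea for a
one-sample sign-flip test; all names here are made up and this is not
from any existing package:

import itertools
import numpy as np

def perm_pvalue_exact(x):
    # exact null: enumerate all 2**n sign assignments (small n only)
    n = len(x)
    t_obs = abs(x.mean())
    hits = 0
    for signs in itertools.product((1, -1), repeat=n):
        if abs(np.dot(signs, x)) / n >= t_obs:
            hits += 1
    return hits / float(2 ** n)

def perm_pvalue_random(x, n_perm=10000, seed=0):
    # Monte Carlo null: random sign flips for larger samples
    rng = np.random.RandomState(seed)
    t_obs = abs(x.mean())
    signs = rng.randint(0, 2, size=(n_perm, len(x))) * 2 - 1
    t_null = np.abs(np.dot(signs, x)) / float(len(x))
    # +1 in numerator and denominator keeps the p-value away from exactly 0
    return (1 + (t_null >= t_obs).sum()) / float(n_perm + 1)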

I ran most of the tests in scipy.stats through Monte Carlo simulations
to see how good they are, but I never used a permutation as the
resampler. I was planning to extend the MonteCarlo class to use
bootstrap and permutation as alternative resamplers.

(One thing I'm not sure about yet: when do we permute and resample
from the joint sample, as in the t-test, and when do we permute only
one series? I have just read about the latter in tests for
independence.)

With the right reference it should be fast to implement, since we have
all the infrastructure for it and it sounds relatively easy, but I
haven't looked for or found the right reference yet.
(In contrast to BCa bootstrap intervals, where I spent hours trying to
understand the algorithm without sufficient success; the only
implementation in Python is in dipy and didn't work the last time I
looked. But since it's the most popular confidence interval in Stata,
we need to get it.)

Josef
(sorry about being long-winded, as usual)



josef...@gmail.com

Feb 13, 2011, 7:58:48 AM
to pystat...@googlegroups.com
On Sat, Feb 12, 2011 at 6:14 PM, <josef...@gmail.com> wrote:
> On Sat, Feb 12, 2011 at 5:18 PM, Alexandre Gramfort
> <alexandre...@gmail.com> wrote:
>>  Hi,
>>
>> I need Python code to do permutation tests (paired t-test, pseudo
>> t-test, Wilcoxon, sign test, and more).
>> The application is brain imaging, but such tools should exist in a
>> stats package.
>>
>> Do you know where I can find this?
>
> It *should* exist, but it is not yet in statsmodels. Until now we
> have relied mostly on asymptotic distributions of tests and estimators.
>
> I thought nipy or pymvpa might have something already, but I don't
> remember seeing anything in Python directly.

In case anyone else is interested, the replies pointing to
nipy.neurospin and pymvpa are on the nipy mailing list:

http://mail.scipy.org/pipermail/nipy-devel/2011-February/005541.html

Josef

Alexandre Gramfort

Feb 13, 2011, 3:30:15 PM
to pystat...@googlegroups.com
Hi,

I've come up with a quite efficient implementation of the t-test:

https://github.com/mne-tools/mne-python/commit/54f9e47be76ccc2cbbf34f9ba9c319802c44291d

using pure numpy and a few variance-splitting tricks.

If anyone here wants to take a look, feedback is welcome.

Alex

PS: I've put this code in the project I'm working on now, but it can be
copied and put anywhere you want. Just please let me know if you do so.
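
For readers who don't want to open the commit, here is a rough sketch
of the idea: flip signs instead of shuffling, and vectorize everything
as one matrix product. Variable names are guesses; this is not the
committed mne-python code:

import numpy as np

def permutation_t_test_sketch(X, n_permutations=10000, seed=0):
    # X: (n_samples, n_tests); H0 is that each column has zero mean
    rng = np.random.RandomState(seed)
    n_samples = X.shape[0]
    # one random sign flip per observation and permutation
    perms = np.sign(0.5 - rng.rand(n_permutations, n_samples))
    mus = np.dot(perms, X) / float(n_samples)   # permuted means
    X2 = np.mean(X ** 2, axis=0)                # invariant under sign flips
    # variance of the flipped sample around its permuted mean
    stds = np.sqrt((X2 - mus ** 2) * n_samples / (n_samples - 1.0))
    T_null = np.abs(mus) / (stds / np.sqrt(n_samples))
    H0 = T_null.max(axis=1)                     # tmax over tests, per permutation
    T0 = X.mean(0) / (X.std(0, ddof=1) / np.sqrt(n_samples))
    # p-value: how often the null max beats each observed |t|
    p_values = (H0[:, None] >= np.abs(T0)).mean(axis=0)
    return p_values, T0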


josef...@gmail.com

Feb 13, 2011, 5:45:26 PM
to pystat...@googlegroups.com
On Sun, Feb 13, 2011 at 3:30 PM, Alexandre Gramfort
<alexandre...@gmail.com> wrote:
> Hi,
>
> I've come up with a quite efficient implementation of the t-test:
>
> https://github.com/mne-tools/mne-python/commit/54f9e47be76ccc2cbbf34f9ba9c319802c44291d
>
> using pure numpy and a few variance-splitting tricks.
>
> If anyone here wants to take a look, feedback is welcome.

Looks like a nice trick, using the partition instead of a shuffle. From
reading the code I'm not quite sure how this works: in stds you are
using X2 from the original sample, not from the permutation; is this
OK?

a few comments:

np.sqrt(X2 - mu0**2) usually has lower precision for the variance than
a two-pass computation; I don't know if it matters in this case.
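
A quick, made-up illustration of the cancellation problem:

import numpy as np

x = 1e8 + np.random.randn(1000)                # huge mean, unit variance
one_pass = np.mean(x ** 2) - np.mean(x) ** 2   # catastrophic cancellation
two_pass = np.mean((x - x.mean()) ** 2)        # numerically stable
# one_pass can be off by orders of magnitude or even negative;
# two_pass stays close to 1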

random.rand should be faster than random.randn (I assume)

As a test case, with (n_samples, n_tests) = (1000, 2) you should be
able to get p-values close to the t-distribution, and you could check
against the scipy.stats t-tests.

I've been playing with this for a while now and still haven't figured
out what is really going on. Comparing with other t-tests, I wasn't
getting as close as I thought I should.

I worried that the dot product with a matrix of shape (n_permutations,
n_samples) might create problems with larger samples, but so far it's
very fast, and I suppose that large n_samples is not the main use case
for this.

Josef

Alexandre Gramfort

Feb 13, 2011, 6:24:54 PM
to pystat...@googlegroups.com
> Looks like a nice trick, using the partition instead of a shuffle. From
> reading the code I'm not quite sure how this works: in stds you are
> using X2 from the original sample, not from the permutation; is this
> OK?

Yes, since I just flip the signs randomly, which does not affect the
second-order moments.
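
For example, for any sign vector s with entries in {-1, 1},
(s * x)**2 equals x**2 elementwise, so X2 never changes:

import numpy as np

x = np.random.randn(5)
s = np.array([1, -1, -1, 1, -1])
assert np.allclose((s * x) ** 2, x ** 2)  # second moments are sign-flip invariant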

> a few comments:
>
> np.sqrt(X2 - mu0**2) usually has lower precision for the variance than
> a two-pass computation; I don't know if it matters in this case.

good to know.

> random.rand should be faster than random.randn (I assume)

Good catch, even if it is quite negligible:

[stats]177>%timeit np.sign(np.random.randn(1e7))
1 loops, best of 3: 595 ms per loop

[stats]178>%timeit np.sign(0.5 - np.random.rand(1e7))
1 loops, best of 3: 323 ms per loop

> As a test case, with (n_samples, n_tests) = (1000, 2) you should be
> able to get p-values close to the t-distribution, and you could check
> against the scipy.stats t-tests.

I'll give it a try.

> I've been playing with this for a while now and still haven't figured
> out what is really going on. Comparing with other t-tests, I wasn't
> getting as close as I thought I should.

Hmm... to be investigated.

> I worried that the dot product with a matrix of shape (n_permutations,
> n_samples) might create problems with larger samples, but so far it's
> very fast, and I suppose that large n_samples is not the main use case
> for this.

Correct. The way to do it is to build H0 iteratively using mini-batches
of, say, 1000 permutations.
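
A minimal sketch of that batching idea, with illustrative names; the
sign matrix never grows past (batch_size, n_samples):

import numpy as np

def max_stat_null_batched(X, n_permutations=100000, batch_size=1000, seed=0):
    # build the null distribution of the max statistic in mini-batches
    rng = np.random.RandomState(seed)
    n_samples = X.shape[0]
    X2 = np.mean(X ** 2, axis=0)
    H0 = []
    for start in range(0, n_permutations, batch_size):
        size = min(batch_size, n_permutations - start)
        perms = np.sign(0.5 - rng.rand(size, n_samples))
        mus = np.dot(perms, X) / float(n_samples)
        stds = np.sqrt((X2 - mus ** 2) * n_samples / (n_samples - 1.0))
        H0.append((np.abs(mus) / (stds / np.sqrt(n_samples))).max(axis=1))
    return np.concatenate(H0)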

Thanks for the feedback.

Alex

josef...@gmail.com

Feb 13, 2011, 9:45:12 PM
to pystat...@googlegroups.com

My "impression" from reading the program is that it tests whether the
deviations from a common mean are zero, but then X here should be
replaced by X - X.mean() (?)

mus = np.dot(perms, X) / float(n_samples)

or alternatively that we have already subtracted the hypothesized mean
from X. Also, as in your second example, it looks like it can directly
test several groups against a control. Or maybe X should have all
pairwise comparisons (?), see the last example below.

but I haven't figured out what the result is supposed to be with two variables

>>> Xa = np.random.randn(1000, 2) + np.array([0.0,0.05])
>>> p_values, T0, H0, oth = permutation_t_test(Xa, n_permutations=999)
(I added oth to get at some intermediate results)

>>> p_values, T0
(array([ 0.901, 0.19 ]), array([ 0.43786396, 1.66748845]))

the p-values are close to the distribution of the max of two
independent t-distributed variables

>>> from scikits.statsmodels.sandbox.distributions.multivariate import mvstdtprob
>>> [1-mvstdtprob(-t*np.ones(2), t*np.ones(2), np.eye(2), 100) for t in T0]
[0.88554628153883863, 0.18682485602085153]

>>> p_values, T0, H0, oth = permutation_t_test(Xa-Xa.mean(), n_permutations=999)
>>> p_values, T0
(array([ 0.824, 0.802]), array([-0.56507154, 0.60092222]))

(independence is not the correct assumption in this case, but the
values are close)
>>> [1-mvstdtprob(-t*np.ones(2), t*np.ones(2), np.eye(2), 100) for t in T0]
[0.81718696901478416, 0.79603174615802974]

The t-tests give pretty different answers, most likely because they
are based on a different null hypothesis:

>>> stats.ttest_1samp(Xa[:,1], Xa.mean())
(0.60092221598629558, 0.5480281085546379)
>>> stats.ttest_1samp(Xa[:,0], Xa.mean())
(-0.56507153734634741, 0.57215197617815516)

>>> stats.ttest_ind(Xa[:,0], Xa[:,1])
(-0.82331357178668108, 0.41042795454761671)

I'm still puzzled,

but this looks good:

>>> p_values, T0, H0, oth = permutation_t_test(Xa[:,1:]-Xa[:,:1], n_permutations=999)
>>> p_values, T0
(array([ 0.412]), array([ 0.82192971]))

>>> stats.ttest_1samp(Xa[:,1]-Xa[:,0], 0)
(0.82192971090505584, 0.41131302140178205)
>>> stats.ttest_ind(Xa[:,0], Xa[:,1])
(-0.82331357178668108, 0.41042795454761671)

Maybe it's more a question of how to use the test than of what the
implementation means.

I will definitely borrow the code for statsmodels.

Josef

Alexandre Gramfort

Feb 13, 2011, 10:27:15 PM
to pystat...@googlegroups.com
Hi,

> My "impression" from reading the program is that it tests whether the
> deviations from a common mean are zero,

It tests if the sample mean deviates from 0.

> but then X here should be
> replaced by X - X.mean() (?)

I don't think so.

Regarding the two-sample test, I still do not know how to handle
groups of different sizes efficiently.

Feel free to take this code for statsmodels.

Still to do is the Wilcoxon signed-rank test.
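
A rough sketch of how the same sign-flip machinery could drive it; this
is an assumption about the approach, not code from any package:

import numpy as np
from scipy.stats import rankdata

def wilcoxon_signrank_perm(d, n_permutations=10000, seed=0):
    # d: paired differences; the statistic is the signed-rank sum,
    # and under H0 the signs of d are exchangeable
    rng = np.random.RandomState(seed)
    d = np.asarray(d, dtype=float)
    d = d[d != 0]                    # zeros carry no sign information
    ranks = rankdata(np.abs(d))      # midranks handle ties
    w_obs = np.dot(np.sign(d), ranks)
    signs = rng.randint(0, 2, size=(n_permutations, len(d))) * 2 - 1
    w_null = np.dot(signs, ranks)
    return (1 + (np.abs(w_null) >= abs(w_obs)).sum()) / float(n_permutations + 1)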

Alex

josef...@gmail.com

Feb 13, 2011, 10:29:48 PM
to pystat...@googlegroups.com

I'm not puzzled anymore; I just saw the update on the nipy mailing list.

Using variables to explain the columns was misleading for me. Each
column represents a hypothesis that the expected value of that column
is zero.

For all pairwise comparisons, each column represents a pair; as the
sign is flipped, the observations change sides in the pairwise
comparison.
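
A minimal sketch of building such pairwise-difference columns (the
helper name is hypothetical):

import numpy as np
from itertools import combinations

def pairwise_diff_columns(X):
    # one column per pair (i, j) with i < j; flipping a row's sign
    # swaps which side each paired observation falls on
    pairs = list(combinations(range(X.shape[1]), 2))
    D = np.column_stack([X[:, i] - X[:, j] for i, j in pairs])
    return D, pairs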

Nice, and thanks,

Josef

josef...@gmail.com

Feb 13, 2011, 10:38:38 PM
to pystat...@googlegroups.com

Just to correct myself again (and so I can find it later): if a
variable shows up in more than one column, then the permutations
across columns should not be independent, I think. So this might only
be approximate; I'm not sure it fully captures the correlation across
the different t-tests.

Josef

josef...@gmail.com

Feb 17, 2011, 7:10:42 AM
to pystat...@googlegroups.com
On Sun, Feb 13, 2011 at 10:27 PM, Alexandre Gramfort
<alexandre...@gmail.com> wrote:
> Hi,
>

>> My "impression" from reading the program is that it tests whether the
>> deviations from a common mean are zero,
>
> It tests if the sample mean deviates from 0.
>
>> but then X here should be
>> replaced by X - X.mean() (?)
>
> I don't think so.

Cleared up.

>
> Regarding the two-sample test, I still do not know how to handle
> groups of different sizes efficiently.

Currently you are not imposing that the sample sizes are the same,
i.e. an equal number of +1 and -1:
- in small samples, what happens if a permutation comes up with the
same sign for every observation? (the mean is still zero in
expectation)

Stack both datasets with a second array as a group indicator in
{-1, 1}, permute the group indicator, and take the average or sum,
with an nobs correction.

This also preserves nobs_i during the permutation.

(some quick notes while running out the door)

Josef

Alexandre Gramfort

Feb 17, 2011, 11:19:58 AM
to pystat...@googlegroups.com, josef...@gmail.com
Hi,

Let me clarify: when I have two samples X1 and X2, I run the paired
t-test by applying the one-sample permutation test to X1 - X2. If the
dimensions do not match (different sample sizes), my implementation
breaks.

Maybe you can come up with a fast way to handle this.

Alex

josef...@gmail.com

Feb 17, 2011, 2:45:39 PM
to pystat...@googlegroups.com, alexandre...@m4x.org
On Thu, Feb 17, 2011 at 11:19 AM, Alexandre Gramfort
<alexandre...@gmail.com> wrote:
> Hi,
>
> Let me clarify: when I have two samples X1 and X2, I run the paired
> t-test by applying the one-sample permutation test to X1 - X2. If the
> dimensions do not match (different sample sizes), my implementation
> breaks.

I forgot about the paired part.
I think the paired t-test in scipy.stats requires equal sample sizes;
otherwise there are no pairs.

So my idea (while in the shower) was something like the following; it
works only for non-paired tests, where unbalanced cases would have to
be filled up with zeros or nans to work in a vectorized way. This also
keeps the number of +1 and -1 constant.

>>> x = np.array([0, 0, 0.1, 0.2]) + np.random.randn(20, 4)
>>> pdiff = np.empty((40, 3))
>>> pdiff[:20, :] = x[:, 1:]
>>> pdiff[20:, :] = x[:, :1]
>>> groupind = np.ones(40.) / 20
>>> groupind[20:] *= -1
>>> x.mean(0) - x[:, 0].mean(0)
array([ 0.        ,  0.03925505,  0.58201175,  0.48228527])
>>> np.dot(pdiff.T, groupind)
array([ 0.03925505,  0.58201175,  0.48228527])
>>> for _ in range(3):
...     gi = np.random.permutation(groupind)
...     print np.dot(pdiff.T, gi)
...
[ 0.39654089  0.05458038  0.01902859]
[-0.09091045 -0.30249603 -0.4403841 ]
[ 0.15725859  0.23483027 -0.12234697]


Permutation on the diff, not on the t-statistic, and no tmax for
multiple tests; looks OK:

>>> res = []
>>> for _ in range(1000):
...     gi = np.random.permutation(groupind)
...     res.append(np.dot(pdiff.T, gi))
...
>>> resarr = np.array(res)
>>> resarrs = np.sort(resarr, axis=0)
>>> resarrs[np.floor(1000*0.95)]
array([ 0.49338791,  0.46704862,  0.46482952])
>>> from scipy import stats
>>> stats.t.sf(0.05, 18)
0.48033652247237346
>>> stats.norm.sf(0.05)
0.48006119416162751
>>> resarrs[np.floor(1000*0.95)] < x[:, 1:].mean(0) - x[:, 0].mean(0)
array([False,  True,  True], dtype=bool)


I still haven't thought too much about it; I just checked whether my
idea could work, following a similar pattern to your current code.

Josef

gcalmettes

Apr 13, 2012, 3:57:34 AM
to pystat...@googlegroups.com
Hi,

In case you didn't know about it, I just found http://sjeng.org/ftp/bootstrap.py
Apparently, all the tests you are looking for have been implemented
there using the permutation method.

Best,

Guillaume

 