
Two-sample chi-square test

Arthur Zheng

Sep 1, 2010, 2:58:04 PM
Background information:
I have two samples, X1 and X2. Both are categorical (2 or 3 categories). I want to use a chi-square test to check whether X1 and X2 are drawn from the same underlying distribution.

Question:
Does MATLAB have a function for a two-sample chi-square test, like kstest2?
There is a function called "CHI2GOF", but I haven't figured out how to apply it to the two-sample case.

Arthur Zheng

Sep 1, 2010, 9:19:07 PM
Any suggestions?

"Arthur Zheng" <hzh...@gatech.edu> wrote in message <i5m7ns$cuo$1...@fred.mathworks.com>...

Joel

Sep 1, 2010, 10:29:22 PM
I am not an expert but what about these?
ansaribradley Ansari-Bradley test
barttest Bartlett's test
canoncorr Canonical correlation
chi2gof Chi-square goodness-of-fit test
dwtest Durbin-Watson test
friedman Friedman's test
jbtest Jarque-Bera test
kruskalwallis Kruskal-Wallis test
kstest One-sample Kolmogorov-Smirnov test
kstest2 Two-sample Kolmogorov-Smirnov test
lillietest Lilliefors test
linhyptest Linear hypothesis test
ranksum Wilcoxon rank sum test
runstest Runs test for randomness
sampsizepwr Sample size and power of test
signrank Wilcoxon signed rank test
signtest Sign test
ttest One-sample t-test
ttest2 Two-sample t-test
vartest Chi-square variance test
vartest2 Two-sample F-test for equal variances
vartestn Bartlett multiple-sample test for equal variances
zscore Standardized z-scores
ztest z-test

Arthur Zheng

Sep 1, 2010, 10:52:04 PM
Most of them are for continuous distributions. For two samples from discrete distributions, the chi-square test is probably the most common choice.
That's why I'm wondering whether MATLAB has a two-sample chi-square function.


"Joel" <espo...@usna.edu> wrote in message <i5n262$70u$1...@fred.mathworks.com>...

Peter Perkins

Sep 3, 2010, 5:53:21 PM

You can use CHI2GOF, but it's really intended more for testing goodness
of fit of a single sample against a distribution family. You're doing
something more like contingency table analysis, which is what CROSSTAB
is for. Not sure what form your data are in, but any of these should work:

% cook up some sample data
k = 5;
p = rand(1,k); p = p./sum(p);
M = 200; N = 250;
x = randsample(1:k,M,true,p); m = histc(x,1:k);
y = randsample(1:k,N,true,p); n = histc(y,1:k);

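% Do the test by hand
phat = (m+n) ./ (M+N);        % pooled estimate of the category probabilities
em = phat*M; en = phat*N;     % expected counts for each sample under H0
chi2 = sum(([m n] - [em en]).^2 ./ [em en]);
df = k-1;                     % (k-1)*(2-1) for a k-by-2 table
pval = 1 - chi2cdf(chi2,df);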

% Trick CHI2GOF into doing a two-sample test. The 2*k bins hold the
% counts for both samples. Note the nparams value must be such that
% 2*k - nparams - 1 = k-1, i.e. nparams = k
[~,pval,stats] = chi2gof(1:2*k,'ctrs',1:2*k,'freq',[m n], ...
    'expected',[em en],'nparams',k,'emin',0)

% Use CROSSTAB: cross-tabulate the pooled values against a sample label (1 or 2)
[tbl,chi2,pval] = crosstab([x y],[ones(size(x)) 2*ones(size(y))])

Hope this helps.
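
For readers who start from raw category labels rather than simulated data, the by-hand version above can be sketched as follows. This is only a minimal illustration under stated assumptions: x1 and x2 are made-up toy samples (not data from this thread), the categories are taken to be whatever values appear in either sample, and chi2cdf requires the Statistics Toolbox.

```matlab
% Two-sample chi-square test from raw category labels (sketch).
% x1 and x2 are hypothetical toy samples, far too small for the
% chi-square approximation to be trusted in practice.
x1 = [1 1 2 3 3 3 2 1 1 2];
x2 = [3 3 2 3 1 3 3 2 3 3];

cats = unique([x1 x2]);        % sorted categories seen in either sample
m = histc(x1, cats);           % observed counts per category, sample 1
n = histc(x2, cats);           % observed counts per category, sample 2
M = sum(m); N = sum(n);        % sample sizes

phat = (m + n) ./ (M + N);     % pooled category proportions under H0
expected = [phat*M, phat*N];   % expected counts for both samples
observed = [m, n];

chi2 = sum((observed - expected).^2 ./ expected);
df   = numel(cats) - 1;        % (k-1)*(2-1) for a k-by-2 table
pval = 1 - chi2cdf(chi2, df);  % upper-tail probability
```

With these toy samples the counts are [4 3 3] and [1 2 7], which gives chi2 = 3.6 on 2 degrees of freedom and a p-value of about 0.165. Calling crosstab on the pooled values and a sample label, as in the post above, should report the same statistic.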
