I understand SPSS does this automatically, but how does it do this? I'm
curious.
thanks,
james
This is a basic "statistics" question -- you should read some
basic statistics to get an orientation. But, in brief....
The t-test on a difference is computed as the Difference
divided by the standard error of the difference.
Or, generally speaking,
t_df = (A-B) / sqrt( var(A-B) )
where the variance of A-B is equal to the sum of the
two independent variances, Var(A) and Var(B).
If "B" is the mean of a group with a small N, then
the variance of B (a group mean) is larger than the
variance of A (a mean with a much larger N and using
the same pooled estimate of variance).
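For what it's worth, here is a minimal sketch of that computation in Python, working from summary statistics. The function name pooled_t and the numbers fed to it are made up for illustration; this is the textbook pooled-variance formula, not a claim about what SPSS does internally.

```python
import math

def pooled_t(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Pooled-variance two-sample t statistic from summary statistics (illustrative helper)."""
    df = n_a + n_b - 2
    # Pooled variance: the two sample variances weighted by their d.f.
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df
    # Variance of a group mean is pooled_var / n, so the small group contributes more.
    var_mean_a = pooled_var / n_a
    var_mean_b = pooled_var / n_b
    # var(A-B) is the sum of the two independent variances.
    se_diff = math.sqrt(var_mean_a + var_mean_b)
    return (mean_a - mean_b) / se_diff, df, var_mean_a, var_mean_b

# Hypothetical data: group A has N=18, group B has N=2, both with s.d. 2.
t, df, var_a, var_b = pooled_t(10.0, 8.0, 2.0, 2.0, 18, 2)
print(t, df, var_a, var_b)  # t is about 1.34 on 18 d.f.
```

With these numbers the variance of the small group's mean (2.0) is nine times that of the large group's mean (about 0.22), so the small group dominates the standard error.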
Hope this helps.
--
Rich Ulrich, wpi...@pitt.edu
http://www.pitt.edu/~wpilib/index.html
So are you saying that if the two sample sizes are very different, then an
independent t-test automatically takes this into account? Or must I
'correct' for sample size differences? I've done loads of t-tests
before and assumed sample size differences (between the two groups)
were irrelevant...but I did wonder whether this is indeed right.
thanks.
Please, as I suggested before, read up on the basics. You can't
get a deep understanding from a short answer here. Perhaps -
study the equation for the test, in a textbook that explains
what the terms stand for.
Sample sizes, or differences in them, are surely "relevant."
But there are different ways of being relevant.
If you double both N's (roughly doubling the d.f.) while keeping means
and s.d.'s the same, the t value is multiplied by about the square root of 2.
N1=18 and N2=2 gives a different t-test from N1=N2=10,
for the same means and variance.
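A quick numerical check of those two statements, as a sketch only. The mean difference (2.0) and common s.d. (2.0) are invented for illustration, and t_stat is a hypothetical helper that assumes both groups share the same s.d., so the pooled s.d. does not depend on the N's.

```python
import math

def t_stat(mean_diff, sd, n1, n2):
    """Pooled t statistic when both groups have the same s.d. (illustrative helper)."""
    # Standard error of the difference between two independent group means.
    se_diff = sd * math.sqrt(1.0 / n1 + 1.0 / n2)
    return mean_diff / se_diff

t_equal = t_stat(2.0, 2.0, 10, 10)    # N1 = N2 = 10, 18 d.f.
t_unequal = t_stat(2.0, 2.0, 18, 2)   # N1 = 18, N2 = 2, also 18 d.f.
t_doubled = t_stat(2.0, 2.0, 20, 20)  # both N's doubled, 38 d.f.

print(t_equal, t_unequal)   # about 2.24 versus about 1.34
print(t_doubled / t_equal)  # about 1.414, the square root of 2
```

Same means, same s.d., same d.f., yet the 18-and-2 split gives a much smaller t than the 10-and-10 split, because the mean of the group with N=2 is estimated so poorly.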
The relevance of sample size differences is that if the sample sizes
are equal then the t-test is insensitive to heteroscedasticity, but
the more unequal the sample sizes are (i.e., the more the ratio of the
larger n over the smaller n departs from 1) the more sensitive the
test is to heteroscedasticity. In short, if your n's are equal then
you don't worry about heteroscedasticity; the more unequal they are,
the more you worry about heteroscedasticity.
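One way to see this, sketched in Python: compare the pooled standard error with the separate-variances (Welch-type) standard error. The helper names and the s.d.'s of 1 and 3 are assumptions chosen for illustration. Note that this compares only the standard errors; the two tests also use different d.f., so identical t values do not guarantee identical p-values.

```python
import math

def pooled_se(sd1, sd2, n1, n2):
    """Standard error of the mean difference using the pooled variance."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))

def separate_se(sd1, sd2, n1, n2):
    """Standard error of the mean difference using each group's own variance."""
    return math.sqrt(sd1**2 / n1 + sd2**2 / n2)

# Hypothetical s.d.'s of 1 and 3, i.e. a variance ratio of 9.
equal_n = (pooled_se(1.0, 3.0, 10, 10), separate_se(1.0, 3.0, 10, 10))
unequal_n = (pooled_se(1.0, 3.0, 18, 2), separate_se(1.0, 3.0, 18, 2))

print(equal_n)    # the two standard errors agree when the n's are equal
print(unequal_n)  # pooled is much smaller when the small group has the big variance
```

With equal n's the two standard errors are algebraically identical, so unequal variances cannot distort the t statistic itself. With n's of 18 and 2 and the larger variance in the smaller group, the pooled standard error is far too small, which inflates t.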
Ray, when you say "don't worry about it" for equal n's, I assume you
mean provided the ratio of the larger to the smaller variance is not too
large. E.g., Dave Howell suggests (in his textbook) that as long as the
ratio is 4 or less, the t-test is robust to heterogeneity of variance
when the sample sizes are equal.
--
Bruce Weaver
bwe...@lakeheadu.ca
www.angelfire.com/wv/bwhomedir
Right. Even when the n's are equal, there is a point at which we start
wondering about heteroscedasticity, and another, more extreme, point
at which we start worrying. Just where those points might be is somewhat
uncertain. I think Howell may be a touch conservative, but not
enough to get into an argument about. And wherever those points may
be, they come sooner as the n's become more unequal.