test for difference in center of mass

zim...@hotmail.com

May 4, 2013, 5:17:56 PM
Hi,
I'm looking to test for a difference in center of mass between two groups of points. Each point has an x,y,z coordinate and I need to know if the means of the two clouds are significantly different, under the assumption that they are approximately Normally distributed. Can anyone point me to a test I can use?
thanks!

~z

Rich Ulrich

May 4, 2013, 7:12:10 PM
That seems to be precisely the test that a two-group
discriminant function (DF) analysis provides.

This is a case of two-group discrimination where
logistic regression (LR), with its different penalty function,
does not fit the hypothesis as well as DF does, because
LR explicitly does not test "the center of mass."

The assumption needed for decent tests is not so much
"normality" as it is "similarly distributed, without big
outliers." "Similar" would include having variance-covariance
matrices that are similar.

If variances differ by too much, the best test might use
some derived difference(s) and some test with the Welch-
Satterthwaite correction. I don't recall seeing this discussed.


--
Rich Ulrich

Ray Koopman

May 5, 2013, 1:48:22 AM
When there are only two groups, there are three algebraically
equivalent ways to test the hypothesis that the group centroids are
the same. The easiest is probably multiple linear regression, with
group as the dependent variable and the coordinate values as
predictors. The F-test of the multiple R^2 is exactly equal to the F
that you would get from a discriminant analysis or a one-way MANOVA.
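
The equivalence described above can be checked numerically. What follows is a minimal sketch, assuming NumPy and made-up Gaussian data; the function names regression_f and hotelling_f are illustrative and do not come from the thread. It regresses a 0/1 group code on the coordinates, forms the F-test of R^2, and compares it with the F obtained from the two-sample Hotelling T^2, which is the two-group case of a one-way MANOVA.

```python
import numpy as np

def regression_f(X1, X2):
    """F-test of the multiple R^2 from regressing a 0/1 group code on the coordinates."""
    X = np.vstack([X1, X2])
    n, p = X.shape
    g = np.concatenate([np.zeros(len(X1)), np.ones(len(X2))])  # group is the dependent variable
    A = np.column_stack([np.ones(n), X])                        # intercept plus coordinates
    beta, *_ = np.linalg.lstsq(A, g, rcond=None)
    resid = g - A @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((g - g.mean()) ** 2)
    f = (r2 / p) / ((1.0 - r2) / (n - p - 1))
    return f, p, n - p - 1

def hotelling_f(X1, X2):
    """F form of the two-sample Hotelling T^2 (equal-covariance assumption)."""
    n1, p = X1.shape
    n2 = X2.shape[0]
    m = X1.mean(axis=0) - X2.mean(axis=0)
    pooled = ((n1 - 1) * np.cov(X1, rowvar=False)
              + (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)
    # Covariance of the mean difference, not of the raw data
    t2 = m @ np.linalg.solve(pooled * (1.0 / n1 + 1.0 / n2), m)
    f = t2 * (n1 + n2 - p - 1) / (p * (n1 + n2 - 2))
    return f, p, n1 + n2 - p - 1

# Simulated x,y,z clouds (an assumption for illustration only)
rng = np.random.default_rng(0)
X1 = rng.normal(size=(12, 3))
X2 = rng.normal(size=(15, 3)) + 0.5

f_reg, df1, df2 = regression_f(X1, X2)
f_hot, _, _ = hotelling_f(X1, X2)
print(f_reg, f_hot, df1, df2)
```

The two F values should agree to rounding error, each with (p, n1+n2-p-1) degrees of freedom. If SciPy is available, a p-value can be read off with scipy.stats.f.sf(f_reg, df1, df2).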

Ray Koopman

May 6, 2013, 5:26:53 AM
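On May 4, 4:12 pm, Rich Ulrich <rich.ulr...@comcast.net> wrote:
> If variances differ by too much, the best test might use
> some derived difference(s) and some test with the Welch-
> Satterthwaite correction. I don't recall seeing this discussed.
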
I don't recall seeing it discussed, either, but here's how it might
be done. (This is for any number of dimensions, not just three.)

Let n1 & n2 be the sample sizes,
let m1 & m2 be the sample mean vectors, and
let S1 & S2 be the sample covariance matrices.

Let m = m1 - m2, let S = S1/n1 + S2/n2,
and let w be a vector of weights on the coordinates.

Then t = w'.m/sqrt[w'.S.w],
where ' denotes vector transposition (column -> row),
and . denotes matrix multiplication.

To maximize |t|, take w = (S^-1).m; the Satterthwaite df =
(w'.S.w)^2 /( (w'.S1.w/n1)^2 / (n1-1) + (w'.S2.w/n2)^2 / (n2-1) )

(Note that ordinary discriminant analysis, which assumes the true
covariance matrices are equal, uses exactly the same approach but
with S = ((n1-1)S1 + (n2-1)S2)/(n1-1 + n2-1) and df = n1+n2-2.)

Ray Koopman

May 7, 2013, 12:50:11 AM
On May 6, 2:26 am, Ray Koopman <koop...@sfu.ca> wrote:
> ... ordinary discriminant analysis, which assumes the true
> covariance matrices are equal, uses exactly the same approach
> but with S = ((n1-1)S1 + (n2-1)S2)/(n1-1 + n2-1) ...

Wrong. That expression estimates the covariance matrix of the raw data.
S is supposed to be an estimate of the covariance matrix of m,
so the above expression should be multiplied by (1/n1 + 1/n2).
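
The effect of the missing factor can be seen numerically. This is a small sketch, assuming NumPy and simulated data; pooled_t_squared and its scale_to_mean_difference flag are hypothetical names introduced here. With the factor included, the squared statistic is the usual two-sample Hotelling T^2; without it, the same computation yields a Mahalanobis-type distance between the centroids in raw-data units.

```python
import numpy as np

def pooled_t_squared(X1, X2, scale_to_mean_difference=True):
    """Squared maximized t using the pooled covariance matrix."""
    n1, n2 = len(X1), len(X2)
    m = X1.mean(axis=0) - X2.mean(axis=0)
    S = ((n1 - 1) * np.cov(X1, rowvar=False)
         + (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 - 1 + n2 - 1)
    if scale_to_mean_difference:
        S = S * (1.0 / n1 + 1.0 / n2)   # covariance of m rather than of the raw data
    w = np.linalg.solve(S, m)           # w = (S^-1).m
    t = (w @ m) / np.sqrt(w @ S @ w)
    return t ** 2

# Simulated clouds (an assumption for illustration only)
rng = np.random.default_rng(2)
X1 = rng.normal(size=(10, 3))
X2 = rng.normal(size=(14, 3)) + 1.0

t2 = pooled_t_squared(X1, X2)                                   # corrected S
d2 = pooled_t_squared(X1, X2, scale_to_mean_difference=False)   # uncorrected S
ratio = d2 / t2
print(t2, d2, ratio)
```

The ratio of the two results should equal 1/n1 + 1/n2, which is exactly the factor the correction supplies.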