Matched Case Control with SPSS

fiumare

Sep 3, 2007, 6:56:56 AM
Hi,

I have a database with 5000 subjects.

200 of those have disease x (the cases). I want to create a matched
(age & gender) control group without the disease. But how? Is this
possible with SPSS?

Thanks,
Marc

Bruce Weaver

Sep 3, 2007, 7:32:20 AM

Richard Ulrich

Sep 4, 2007, 4:05:43 PM

I hope that Bruce's reference gave the poster what he needed.

I hope, also, that the poster is intending to find at least 4 or 5
matched controls for each case, and to analyze with age and gender
as covariates.
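
For anyone who wants to see what 1-to-k matching on age and gender can look like outside SPSS, here is a minimal Python sketch. It is only an illustration under stated assumptions: the record keys ('id', 'age', 'gender'), the 5-year age band, and the function name match_controls are all hypothetical, and controls are drawn at random within each stratum without reuse.

```python
import random
from collections import defaultdict

def match_controls(cases, controls, k=4, age_band=5, seed=0):
    """Pick up to k controls per case from the same gender and age band.

    cases, controls: lists of dicts with hypothetical keys
    'id', 'age', 'gender'. Each control is used at most once.
    Returns {case_id: [control_id, ...]}.
    """
    rng = random.Random(seed)

    # Group the available controls by (gender, age band).
    pool = defaultdict(list)
    for c in controls:
        pool[(c["gender"], c["age"] // age_band)].append(c["id"])
    for ids in pool.values():
        rng.shuffle(ids)  # random order, so selection within a stratum is random

    matches = {}
    for case in cases:
        stratum = pool[(case["gender"], case["age"] // age_band)]
        take = min(k, len(stratum))  # a stratum may run short of controls
        matches[case["id"]] = [stratum.pop() for _ in range(take)]
    return matches

if __name__ == "__main__":
    cases = [{"id": "case1", "age": 52, "gender": "F"}]
    controls = [{"id": f"ctl{i}", "age": 50 + i % 5, "gender": "F"}
                for i in range(10)]
    print(match_controls(cases, controls, k=4))
```

Cases in sparse strata may end up with fewer than k matches, which is worth checking before analysis.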


A LITTLE LECTURE.

I feel that an extra comment is needed, because I have
seen at least as many wrong-headed requests for matching
as sound ones.

This question is one that contains a good assumption
about prospective studies that is often generalized
in an inappropriate way to observational data -- namely,
that "matching" will give a more powerful analysis, one
that makes best use of the given N.

Prospectively -
1) Having equal Ns gives the best power for hypotheses
with continuous outcomes (assuming equal variances).

2) Doing some variety of paired analysis has best power
when the dependency (correlation) is high, such as --
paired body parts; pre-post for individuals; sibs (often).

3) Unless the correlation is high, the best *analysis* of the
paired data, in several respects, is going to be an analysis
that controls for the matching variables (age and gender
in this case) statistically. The analysis with statistical
control will potentially do a *better* job of controlling
for the covariates, at a far smaller cost in degrees of
freedom than pairing, while incidentally providing direct
estimates of the influence of the covariates.
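
A small numeric illustration of point 2, using the textbook result for n pairs with common variance sigma^2 and within-pair correlation rho: the variance of the mean paired difference is 2*sigma^2*(1 - rho)/n, against 2*sigma^2/n for two independent groups of size n. The Python sketch below is a simplified comparison under those assumptions (equal variances, continuous outcome); the function name is hypothetical.

```python
def var_mean_difference(n, sigma2=1.0, rho=0.0):
    """Variance of the mean difference for n pairs.

    rho is the within-pair correlation; rho=0 gives the value for
    two independent groups of size n with common variance sigma2.
    """
    return 2.0 * sigma2 * (1.0 - rho) / n

if __name__ == "__main__":
    n = 200
    unpaired = var_mean_difference(n)
    for rho in (0.0, 0.2, 0.5, 0.8):
        paired = var_mean_difference(n, rho=rho)
        # A paired t-test has n - 1 df; the two-sample test has 2n - 2.
        print(f"rho={rho:.1f}  variance ratio paired/unpaired = "
              f"{paired / unpaired:.2f}  "
              f"df paired={n - 1}, unpaired={2 * n - 2}")
```

The variance ratio is simply 1 - rho, so pairing on weakly correlated variables buys little precision while roughly halving the error degrees of freedom.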


When you already have some 4800 potential controls on hand
to match to 200 cases, the best use of them does not throw
away 4600.
(I hope that the OP forgives me for using his post to make
this point; the post never stated that matching was 1-to-1
or otherwise.)

Still, one main reason for matching is to eliminate the
statistical "confounding" of the matched variables, and that
is a legitimate concern. With nearly 5000 controls available,
there is a good chance that there will be some noticeable
difference in the age-sex distribution for *this* one disease,
and matching is a pretty sure way to remove the confounding.

In addition, as a corollary to comment (1), there is a gain, but
it is a rapidly diminishing gain, in adding more and more
subjects to a control group once it is several times as large
as the case group, so, in this case, a pool of nearly 5000 is
far more than what would be adequate.
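
The diminishing return can be put in numbers. For a difference in means with a common SD, the standard error is proportional to sqrt(1/n_cases + 1/n_controls), so with k controls per case it sits at sqrt(1 + 1/k) times the floor reached with unlimited controls. The Python sketch below assumes a continuous outcome and equal variances; se_factor is a hypothetical helper name.

```python
import math

def se_factor(n_cases, n_controls):
    """SE of a difference in means, in units of the common SD."""
    return math.sqrt(1.0 / n_cases + 1.0 / n_controls)

if __name__ == "__main__":
    n_cases = 200
    floor = math.sqrt(1.0 / n_cases)  # limit as controls grow without bound
    for k in (1, 2, 4, 5, 10, 24):    # k = 24 uses all 4800 non-cases
        se = se_factor(n_cases, k * n_cases)
        print(f"{k:>2} controls per case: SE factor = {se:.4f} "
              f"({se / floor:.3f} x the floor)")
```

Going from 4 controls per case to all 24 shrinks the standard error by less than 10 percent.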

Finally, I would add this: When controls are easily available
that can come from several different 'populations', it can be
desirable to define more than one 'Control-group' based
on different characteristics. For instance, a study of one
chronic disease might use both a non-disease control
and a different-chronic-disease control.


Hope this is useful to someone....

--
Rich Ulrich, wpi...@pitt.edu
http://www.pitt.edu/~wpilib/index.html
