Missing values and logistic regression

Freepete

Oct 21, 1999, 3:00:00 AM

What can I do using logistic regression when I have multiple predictor
variables (e.g., 10 predictor variables, a mix of categorical and continuous)
each with numerous missing values? There is no pairwise deletion in SPSS v.9
for logistic regression as there is with linear regression. So I cannot run the
analyses because the listwise deletion removes virtually all cases. Any
suggestions? Thanks!

Peter Wingate

John Hendrickx

Oct 21, 1999, 3:00:00 AM

In article <19991021020035...@ng-fx1.aol.com>,
free...@aol.com says...

Pairwise deletion only works in linear regression. One option is to
recode missing values to the mean value and to include a dummy for
missing values on each variable. If the dummy isn't significant, that
suggests the missings are unrelated to the outcome. If it is significant,
well, at least you've taken the non-randomness of the missing cases into
account. For
categorical variables, recode the missings to a separate category and use
the deviation contrast.
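
The recoding described above can be sketched outside SPSS as follows. This is only an illustration in plain Python, not SPSS syntax: the function name `impute_with_indicator`, the variables `tenure` and `claim_type`, and the toy data are all hypothetical. The deviation contrast itself is left to whatever package fits the model.

```python
from statistics import mean

def impute_with_indicator(rows, continuous, categorical):
    """Return new rows with missing values (None) recoded.

    Continuous: None -> observed mean, plus a 0/1 '<name>_missing' dummy.
    Categorical: None -> the separate category 'missing'.
    Assumes every continuous variable has at least one observed value.
    """
    means = {
        var: mean(r[var] for r in rows if r[var] is not None)
        for var in continuous
    }
    recoded = []
    for r in rows:
        new = dict(r)  # copy, so the original rows are left untouched
        for var in continuous:
            new[var + "_missing"] = 1 if r[var] is None else 0
            if r[var] is None:
                new[var] = means[var]
        for var in categorical:
            if r[var] is None:
                new[var] = "missing"
        recoded.append(new)
    return recoded

# Hypothetical toy data: 'tenure' is continuous, 'claim_type' is categorical.
cases = [
    {"tenure": 2.0, "claim_type": "age"},
    {"tenure": None, "claim_type": "sex"},
    {"tenure": 6.0, "claim_type": None},
]
recoded = impute_with_indicator(cases, ["tenure"], ["claim_type"])
for row in recoded:
    print(row)
```

Both the recoded variable and its `_missing` dummy would then go into the logistic regression as predictors, so no case is lost to listwise deletion.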

Hope this helps,
John Hendrickx

Rich Ulrich

Oct 21, 1999, 3:00:00 AM

- Good answer, John.

If the values are missing in sets, then they are not "missing at
random". The other usual approach is to drop the variables that have
a lot of missing values -- if you don't have the information, it can't
help you predict. So, can you do prediction by different sets?
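
As a rough way to see what dropping sparse variables buys, one could compare the number of complete cases before and after. The sketch below is plain Python with hypothetical variable names (`a`, `b`, `c`) and toy data; the 0.5 cutoff in `drop_sparse_variables` is an arbitrary assumption, not a recommended rule.

```python
def missing_fraction(rows, variables):
    """Fraction of rows in which each variable is missing (None)."""
    n = len(rows)
    return {v: sum(r[v] is None for r in rows) / n for v in variables}

def complete_cases(rows, variables):
    """Rows that survive listwise deletion on the given variables."""
    return [r for r in rows if all(r[v] is not None for v in variables)]

def drop_sparse_variables(rows, variables, max_missing=0.5):
    """Keep only variables whose missing fraction is at most max_missing."""
    frac = missing_fraction(rows, variables)
    return [v for v in variables if frac[v] <= max_missing]

# Hypothetical toy data: 'c' is missing in three of four cases.
rows = [
    {"a": 1, "b": 0, "c": None},
    {"a": None, "b": 1, "c": None},
    {"a": 3, "b": 1, "c": 2},
    {"a": 4, "b": 0, "c": None},
]
variables = ["a", "b", "c"]
kept = drop_sparse_variables(rows, variables)

print(missing_fraction(rows, variables))
print("complete cases, all variables:", len(complete_cases(rows, variables)))
print("variables kept:", kept)
print("complete cases, kept variables:", len(complete_cases(rows, kept)))
```

Running the same comparison on different subsets of predictors is one way to approach "prediction by different sets".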

It puzzles me how it arises that you should have a lot of missing values
on all variables. If there is not any structure to it, that certainly
ought to make your prospects pretty shaky for doing any useful
prediction.

(If you were concerned with testing, more than predicting, that raises
concerns of statistical power, in addition to the other concerns.)

--
Rich Ulrich, wpi...@pitt.edu
http://www.pitt.edu/~wpilib/index.html

Freepete

Oct 21, 1999, 3:00:00 AM

I am using logistic regression to analyze variables that may influence judicial
decision making in employment law cases (i.e., policy capturing). The missing
values are due to information that is not written or mentioned in the case
opinion; the reasons for omitting the desired information in these opinions
stem from many factors. The power issue is certainly a concern, yet I do expect
to gain useful information on various hypothesized relationships for the
variables with sufficient sample sizes. Thank you for your replies,
Pete

Rich Ulrich

Oct 22, 1999, 3:00:00 AM

On symptom checklists, it is possible to drop the "Missing" category
by describing the eventual total as the "number endorsed". If there
were patients whose illness explains why they did not fill out the
checklist, it is important to remove their scores from the data-set,
so that we look at a set of 'putatively reliable' scores -- if we
want to know how effectively we can predict something from putatively
reliable scores.

I suggest that you probably want to sort out your case histories. How
do you feel about treating "Missing" as merely "not considered to be
worth mentioning" if you only consider the histories that you would
deem to be "adequate"?

Simo V. Virtanen

Oct 22, 1999, 3:00:00 AM

Peter,

Sorry if I'm taking this thread in the wrong direction, but it sounds like
the priority should be figuring out why listwise deletion wipes out
almost all of your data. Are cases missing by design or is the data
set just very small? In any case, it sounds like more than the average
missing data problem.
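
One way to start on that question is to tabulate which variables go missing together. The following is a minimal sketch in plain Python; the function name `missing_patterns`, the variable names, and the toy data are hypothetical.

```python
from collections import Counter

def missing_patterns(rows, variables):
    """Count how often each combination of variables is missing together.

    Each pattern is a tuple of the variable names that are None in a row;
    the empty tuple stands for a complete case.
    """
    counts = Counter(
        tuple(v for v in variables if r[v] is None) for r in rows
    )
    return counts.most_common()

# Hypothetical toy data with three predictors.
rows = [
    {"a": 1, "b": 0, "c": None},
    {"a": None, "b": 1, "c": None},
    {"a": 3, "b": 1, "c": 2},
    {"a": 4, "b": 0, "c": None},
]
variables = ["a", "b", "c"]
patterns = missing_patterns(rows, variables)

for pattern, count in patterns:
    label = ", ".join(pattern) if pattern else "(complete)"
    print(f"{count:3d}  missing: {label}")
```

If a handful of patterns account for most of the cases, the missingness has structure; if nearly every case has its own pattern, it looks more like scattered gaps across all variables.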

Simo Virtanen

*********************************************************
Simo V. Virtanen Tel: +358-9-4747 429
Finnish Institute of
Occupational Health
E-mail: Simo.V...@occuphealth.fi
Home page: http://www.occuphealth.fi/users/simo.virtanen/
*********************************************************

Freepete wrote:

> What can I do using logistic regression when I have multiple predictor
> variables (e.g., 10 predictor variables, a mix of categorical and continuous)
> each with numerous missing values? There is no pairwise deletion in SPSS v.9
> for logistic regression as there is with linear regression. So I cannot run the
> analyses because the listwise deletion removes virtually all cases. Any
> suggestions? Thanks!
>

> Peter Wingate