The MVA module has mean substitution. But before using that method, you
might want to take a look at these references posted by Frank Harrell
earlier this year in another group.
@Article{don06rev,
author = {Donders, A. Rogier T. and {van der Heijden}, Geert J. M. G.
and Stijnen, Theo and Moons, Karel G. M.},
title = {Review: {A} gentle introduction to imputation of
missing values},
journal = {Journal of Clinical Epidemiology},
year = 2006,
volume = 59,
pages = {1087--1091},
annote = {missing data; imputation; simple demonstration of
failure of indicator (new category) method}
}
@Article{hei06imp,
author = {{van der Heijden}, Geert J. M. G. and Donders,
A. Rogier T. and Stijnen, Theo and Moons, Karel G. M.},
title = {Imputation of missing values is superior to complete
case analysis and the missing-indicator method in
multivariable diagnostic research: {A} clinical example},
journal = {Journal of Clinical Epidemiology},
year = 2006,
volume = 59,
pages = {1102--1109},
annote = {missing data; imputation; invalidity of adding extra
categories or missing value indicators; bias; precision; complete case
analysis; single imputation}
}
--
Bruce Weaver
bwe...@lakeheadu.ca
www.angelfire.com/wv/bwhomedir
"When all else fails, RTFM."
2) compute v_mean = mean(first_variable_name to last_variable_name).
do repeat x = first_variable_name to last_variable_name.
if mis(x) x = v_mean.
end repeat.
del var v_mean x.
exe.
This is syntax you can call from Sax Basic. With a data set open, you
can "take" the names of the first and last variables needed here.
If they are named in an intuitive way (e.g. var1, var2, ...), you can
just put var1 to var100.
1) Try clicking this through for one variable from the menu, paste the
syntax, then use "do repeat" as above to loop it over the variables, and
you have syntax you can call from your Sax Basic code.
Hope that helps, and don't hesitate to ask if anything is unclear.
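To make clear what the syntax in 2) computes, here is a rough sketch in plain Python (not SPSS syntax). In SPSS, mean(first to last) is evaluated per case, so each missing value is replaced by the mean of that case's other values, which is not the same as substituting the variable's mean. The function names and the small data set below are made up for illustration, and missing values are represented as None.

```python
from statistics import mean


def fill_with_case_mean(rows):
    """Replace None in each row with the mean of that row's observed values."""
    filled = []
    for row in rows:
        observed = [v for v in row if v is not None]
        # A case with no observed values has no mean, so it stays missing.
        row_mean = mean(observed) if observed else None
        filled.append([row_mean if v is None else v for v in row])
    return filled


def fill_with_variable_mean(rows):
    """Replace None in each column with the mean of that column's observed values."""
    col_means = []
    for col in zip(*rows):
        observed = [v for v in col if v is not None]
        col_means.append(mean(observed) if observed else None)
    return [
        [col_means[j] if v is None else v for j, v in enumerate(row)]
        for row in rows
    ]


# Hypothetical data: two cases, three variables.
data = [
    [1.0, None, 3.0],
    [4.0, 6.0, None],
]

case_filled = fill_with_case_mean(data)
variable_filled = fill_with_variable_mean(data)
print(case_filled)
print(variable_filled)
```

The two results differ for the same data, so it is worth deciding which kind of mean you actually want before looping the syntax over hundreds of variables.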
Monika
On Oct 8, 2:36 pm, Bruce Weaver <bwea...@lakeheadu.ca> wrote:
Felix is right: mean imputation is in many cases not a very good
solution.
However, I assume you have a great deal of data to process and can't
think through every single variable.
An improvement on this solution, which I've seen in some articles, is
to eliminate (delete or filter out) variables and cases that have a
relatively large percentage of missing values (>10%, >20%, >30%?). If
you are going to do some analysis on your data, I would recommend doing so.
Regards
Monika
Thank you very much for your help! I greatly appreciate it.
But how do I delete variables with more than 30% missing?
With just one or two variables it is easy to check, but suppose we have
hundreds of variables.
Frank
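One possible way to think about the "hundreds of variables" case, sketched in plain Python rather than SPSS syntax: compute the fraction of missing values per variable, then list the names that exceed the cutoff. The function names, variable names, and data here are hypothetical, missing values are represented as None, and the 30% threshold is only the figure mentioned above, not a recommendation.

```python
def missing_fraction(rows, names):
    """Return {variable name: fraction of cases where that variable is None}."""
    n_cases = len(rows)
    if n_cases == 0:
        # No cases at all: report zero missingness rather than dividing by zero.
        return {name: 0.0 for name in names}
    return {
        name: sum(1 for row in rows if row[j] is None) / n_cases
        for j, name in enumerate(names)
    }


def variables_to_drop(rows, names, threshold=0.30):
    """List variables whose missing fraction is strictly above the threshold."""
    fractions = missing_fraction(rows, names)
    return [name for name in names if fractions[name] > threshold]


# Hypothetical data: four cases, three variables.
names = ["var1", "var2", "var3"]
rows = [
    [1, None, 3],
    [2, None, None],
    [3, 5, 6],
    [4, None, 7],
]

fractions = missing_fraction(rows, names)
to_drop = variables_to_drop(rows, names)
print(fractions)
print(to_drop)
```

The resulting list of names could then be pasted onto a DELETE VARIABLES command in SPSS, so nothing has to be checked by hand.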
By the way, "del var v_mean x." should appear after exe., and with
another exe. after it.
Frank