IVEware, developed at the Institute for Social Research, University of Michigan, performs:

1. Single or multiple imputations of missing values using the Sequential Regression Imputation Method described

in the article "A multivariate technique for multiply imputing missing values using a sequence of regression models"

by Raghunathan, Lepkowski, Van Hoewyk and Solenberger.

2. A variety of descriptive and model-based analyses accounting for complex design features such as clustering,

stratification and weighting.

3. Multiple imputation analyses for both descriptive and model-based survey statistics.

4. Creation of partial or full synthetic data sets using the sequential regression approach to protect confidentiality and limit

statistical disclosure.

5. Combination of information from multiple sources by vertically concatenating data sets and multiply imputing the missing

portions to create a larger rectangular data set.

IVEware includes six modules: IMPUTE, DESCRIBE, REGRESS, SASMOD, SYNTHESIZE and COMBINE.

IMPUTE uses a multivariate sequential regression approach for multiply imputing item missing values in a data set.
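
The idea of imputing each variable in turn from a regression on the others can be sketched in a few lines of Python. This is a deliberately simplified illustration and not IVEware code: it handles only two continuous variables, adds residual noise to the prediction but does not draw the regression parameters from their posterior distribution, and the names fit_line and sequential_impute, the toy data and the number of cycles are all assumptions made for the example.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    rss = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
    return a, b, (rss / (n - 2)) ** 0.5


def sequential_impute(rows, names=("x", "y"), cycles=5, seed=0):
    """Return a completed copy of rows (list of dicts with None for missing)."""
    rng = random.Random(seed)
    missing = {name: [i for i, row in enumerate(rows) if row[name] is None]
               for name in names}
    data = [dict(row) for row in rows]  # work on a copy

    # Starting values: fill each gap with the observed mean of that variable.
    for name in names:
        observed = [row[name] for row in rows if row[name] is not None]
        start = sum(observed) / len(observed)
        for i in missing[name]:
            data[i][name] = start

    # Cycle through the variables, re-imputing each from the other one.
    for _ in range(cycles):
        for target in names:
            predictor = names[1] if target == names[0] else names[0]
            # Fit only on rows where the target was actually observed.
            fit_idx = [i for i in range(len(data)) if i not in missing[target]]
            a, b, sd = fit_line([data[i][predictor] for i in fit_idx],
                                [data[i][target] for i in fit_idx])
            for i in missing[target]:
                data[i][target] = a + b * data[i][predictor] + rng.gauss(0, sd)
    return data


# Hypothetical toy data with one missing value in each variable.
toy = [
    {"x": 1.0, "y": 2.1}, {"x": 2.0, "y": 3.9}, {"x": 3.0, "y": None},
    {"x": 4.0, "y": 8.2}, {"x": None, "y": 9.8}, {"x": 6.0, "y": 12.1},
]
completed = sequential_impute(toy)
```

Calling sequential_impute several times with different seeds would give several completed data sets, which is the sense in which the imputation is "multiple".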

DESCRIBE estimates the population means, proportions, subgroup differences, contrasts and linear combinations of

means and proportions. For complex surveys, the Taylor Series approach is used to obtain variance estimates.

The item missing values can be multiply imputed for the variables while performing the analysis.

REGRESS fits linear, logistic, polytomous, Poisson, Tobit and proportional hazards regression models.

The Jackknife Repeated Replication (JRR) approach is used to estimate the sampling variances for complex

survey data. The item missing values may be multiply imputed while performing the regression analysis.
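
As a rough illustration of the replication idea, the Python sketch below applies a generic delete-one-PSU stratified jackknife to a weighted mean. It is not taken from IVEware and may differ in detail from the JRR variant that REGRESS implements; the row layout (stratum, PSU, weight, value), the function names and the toy data are assumptions.

```python
def weighted_mean(rows):
    """Weighted mean of y over rows of (stratum, psu, weight, y)."""
    total_weight = sum(w for _, _, w, _ in rows)
    return sum(w * y for _, _, w, y in rows) / total_weight


def jackknife_variance(rows):
    """Delete-one-PSU jackknife variance of the weighted mean.

    For each stratum h with n_h PSUs, each PSU is dropped in turn and the
    weights of the remaining PSUs in that stratum are scaled by n_h/(n_h-1).
    """
    theta = weighted_mean(rows)
    psus_by_stratum = {}
    for stratum, psu, _, _ in rows:
        psus_by_stratum.setdefault(stratum, set()).add(psu)

    variance = 0.0
    for stratum, psus in psus_by_stratum.items():
        n_h = len(psus)
        if n_h < 2:
            raise ValueError("each stratum needs at least two PSUs")
        factor = n_h / (n_h - 1)
        for dropped in psus:
            replicate = []
            for s, p, w, y in rows:
                if s == stratum and p == dropped:
                    continue  # leave this PSU out of the replicate
                if s == stratum:
                    w = w * factor  # reweight the rest of the stratum
                replicate.append((s, p, w, y))
            diff = weighted_mean(replicate) - theta
            variance += (n_h - 1) / n_h * diff ** 2
    return theta, variance


# Hypothetical design: two strata, two PSUs each, one observation per PSU.
sample = [(1, 1, 1.0, 10.0), (1, 2, 1.0, 14.0),
          (2, 1, 1.0, 20.0), (2, 2, 1.0, 24.0)]
estimate, variance = jackknife_variance(sample)
# For this toy sample the mean is 17.0 and the jackknife variance is 2.0.
```

The same loop works for any statistic that can be recomputed on a reweighted replicate, which is why replication methods are convenient for regression coefficients.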

SASMOD allows users to analyze data with several SAS procedures. Currently the following SAS PROCS

can be called: CALIS, CATMOD, GENMOD, LIFEREG, MIXED, NLIN, PHREG, and PROBIT. The JRR approach

is used for complex survey data and the missing values can be multiply imputed while performing these analyses.

SYNTHESIZE uses a multivariate sequential regression approach to create full or partial synthetic data sets to

limit statistical disclosure (see Raghunathan, Reiter and Rubin (2003), Reiter (2002) and Little, Liu and Raghunathan

(2004) for more details). All item missing values will also be imputed when creating synthetic data sets.

However, the DESCRIBE, REGRESS and SASMOD modules cannot be used to analyze synthetic data sets because they

do not implement the appropriate combining rules.
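
To see why the combining rules matter, the Python sketch below contrasts the usual multiple-imputation variance formula with the formulas commonly given in the synthetic-data literature for fully and partially synthetic data. It is an illustration only, not IVEware code; the function name combine, the rule labels and the toy numbers are assumptions.

```python
from statistics import mean, variance


def combine(estimates, variances, rule="mi"):
    """Combine m point estimates and their within-data-set variances.

    With W the mean within variance and B the between variance:
      rule="mi"                 T = W + (1 + 1/m) * B   (missing-data MI)
      rule="full_synthetic"     T = (1 + 1/m) * B - W
      rule="partial_synthetic"  T = W + B / m
    Returns (pooled estimate, total variance T).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates with matching variances")
    q_bar = mean(estimates)
    w = mean(variances)
    b = variance(estimates)  # sample variance, divisor m - 1
    if rule == "mi":
        total = w + (1 + 1 / m) * b
    elif rule == "full_synthetic":
        total = (1 + 1 / m) * b - w  # can be negative in small samples
    elif rule == "partial_synthetic":
        total = w + b / m
    else:
        raise ValueError("unknown rule: " + rule)
    return q_bar, total


# Hypothetical results from m = 5 data sets.
estimates = [10.0, 12.0, 11.0, 13.0, 9.0]
variances = [1.0, 1.0, 1.0, 1.0, 1.0]
for rule in ("mi", "full_synthetic", "partial_synthetic"):
    print(rule, combine(estimates, variances, rule))
# Same pooled estimate (11.0) in all three cases, but the total variance
# is about 4.0, 2.0 and 1.5 respectively.
```

Applying the missing-data rule to synthetic data sets would therefore report a different standard error from the one the synthetic-data rules give, even though the point estimate is unchanged.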

COMBINE is useful for combining information from multiple sources through multiple imputation. Suppose that

Data 1 provides variables X and Y, Data 2 provides variables X and Z and Data 3 provides variables Y and Z.

COMBINE can be used to concatenate the three data sets and multiply impute the missing values of X, Y and Z

to create large data sets with complete data on all three variables. All item missing values in the individual data

sets will also be imputed. The multiply imputed combined data sets can be analyzed using the DESCRIBE, REGRESS

and SASMOD modules.
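
The concatenation step in the X, Y, Z example can be pictured with a short Python sketch. It only shows how stacking the three sources produces a rectangular data set with block-wise missing values; the imputation itself is not shown. The function name stack, the added source column and the toy values are assumptions and do not reflect IVEware syntax.

```python
def stack(*datasets):
    """Vertically concatenate lists of row dicts into one rectangular data set.

    Variables that a source does not provide are set to None (missing).
    Returns (column names in first-seen order, list of completed row dicts).
    """
    columns = []
    for data in datasets:
        for row in data:
            for name in row:
                if name not in columns:
                    columns.append(name)

    stacked = []
    for source, data in enumerate(datasets, start=1):
        for row in data:
            full = {"source": source}  # remember where each row came from
            full.update({name: row.get(name) for name in columns})
            stacked.append(full)
    return columns, stacked


# Hypothetical sources: Data 1 has X and Y, Data 2 has X and Z, Data 3 has Y and Z.
data1 = [{"X": 1.0, "Y": 2.0}, {"X": 1.5, "Y": 2.5}]
data2 = [{"X": 0.5, "Z": 7.0}]
data3 = [{"Y": 3.0, "Z": 8.0}]

columns, combined = stack(data1, data2, data3)
missing = {name: sum(row[name] is None for row in combined) for name in columns}
print(columns)   # ['X', 'Y', 'Z']
print(missing)   # {'X': 1, 'Y': 1, 'Z': 2}
```

Every row of the stacked data set is missing exactly the variable its source did not collect, and these are the values that multiple imputation would then fill in.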
