Dear Dr. Harrell,
However, anova(fit.mult.impute(...)) does report sums of squares; see the made-up example below.
Maybe I am making a terrible coding error.
> set.seed(2)
> x1<-runif(20)*3
> x2<-runif(20)*5
> y<-x1^2+2*x2+rnorm(20)
> x1m<-x1
> x1m[runif(20)<.2]<-NA
> x2m<-x2
> x2m[runif(20)<.2]<-NA
> DATA<-data.frame(y,x1m,x2m)
> DATA
           y       x1m       x2m
1   9.017440 0.5546468 3.3094938
2   7.115533 2.1071221 1.9377477
3  12.916858        NA 4.1844459
4   3.713839 0.5041558 0.7525072
5  11.495155        NA 1.7363612
6  10.447331 2.8304249 2.4438662
7   2.119844 0.3874769 0.7462343
8   9.225800 2.5003464 1.7853130
9  12.390016 1.4040555        NA
10  4.335696 1.6499512        NA
11  3.592121 1.6580222        NA
12  2.479019 0.7166843 0.8232112
13 14.383510 2.2815399 4.0509607
14  8.698716 0.5424603 4.3443052
15  5.844425 1.2158465 2.5714088
16 12.233207        NA 3.1359814
17 15.298496 2.9291955 4.2221450
18  2.405096        NA        NA
19  7.893892        NA 3.3361282
20  1.308782 0.2249383 0.7523488
> DATA.mi<-aregImpute(~y+x1m+x2m,DATA, n.impute=5,nk=0)
Iteration 8
> DATA.mi
Multiple Imputation using Bootstrap and PMM
aregImpute(formula = ~y + x1m + x2m, data = DATA, n.impute = 5,
nk = 0)
n: 20 p: 3 Imputations: 5 nk: 0

Number of NAs:
  y x1m x2m
  0   5   4

    type d.f.
y      l    1
x1m    l    1
x2m    l    1

Transformation of Target Variables Forced to be Linear

R-squares for Predicting Non-Missing Values for Each Variable
Using Last Imputations of Predictors
  x1m   x2m
0.876 0.482
> DATA.mi$imputed
$y
NULL

$x1m
        [,1]      [,2]      [,3]      [,4]      [,5]
3  0.5546468 0.5546468 0.5546468 0.5546468 2.8304249
5  2.9291955 0.5546468 0.5546468 0.5546468 0.5546468
16 2.8304249 2.2815399 2.2815399 2.9291955 2.9291955
18 0.7166843 0.3874769 0.2249383 0.7166843 0.3874769
19 0.7166843 0.7166843 0.2249383 0.7166843 0.7166843

$x2m
        [,1]      [,2]      [,3]      [,4]      [,5]
9  4.2221450 3.3094938 4.1844459 4.1844459 1.7363612
10 0.7523488 0.7523488 0.8232112 0.8232112 0.7525072
11 0.7523488 3.3094938 0.7523488 0.7523488 0.7462343
18 0.7462343 0.7462343 0.7525072 0.7523488 0.8232112
> f<-fit.mult.impute(y~rcs(x1,3)+rcs(x2,3),ols, DATA.mi,data=DATA)

Variance Inflation Factors Due to Imputation:

Intercept        x1       x1'        x2       x2'
        1         1         1         1         1

Rate of Missing Information:

Intercept        x1       x1'        x2       x2'
        0         0         0         0         0

d.f. for t-distribution for Tests of Single Coefficients:

Intercept        x1       x1'        x2       x2'
      Inf       Inf       Inf       Inf       Inf

The following fit components were averaged over the 5 model fits:

  fitted.values stats linear.predictors
> f
Linear Regression Model

fit.mult.impute(formula = y ~ rcs(x1, 3) + rcs(x2, 3), fitter = ols,
    xtrans = DATA.mi, data = DATA)

                Model Likelihood     Discrimination
                   Ratio Test           Indexes
Obs       20    LR chi2     63.21    R2       0.958
sigma 1.0282    d.f.            4    R2 adj   0.946
d.f.      15    Pr(> chi2) 0.0000    g        5.085

Residuals

    Min      1Q  Median      3Q     Max
-1.1229 -0.6170 -0.1657  0.4354  1.9382

          Coef   S.E.   t    Pr(>|t|)
Intercept 0.5935 0.8554 0.69 0.4984
x1        1.4888 0.7384 2.02 0.0620
x1'       1.5526 0.8784 1.77 0.0975
x2        1.2824 0.5104 2.51 0.0239
x2'       1.0822 0.6734 1.61 0.1289
> anova(f)
                Analysis of Variance          Response: y

 Factor          d.f. Partial SS MS        F     P
 x1               2    96.789733 48.394867 45.78 <.0001
  Nonlinear       1     3.302695  3.302695  3.12 0.0975
 x2               2   165.089955 82.544978 78.08 <.0001
  Nonlinear       1     2.729978  2.729978  2.58 0.1289
 TOTAL NONLINEAR  2     4.992265  2.496133  2.36 0.1284
 REGRESSION       4   358.084106 89.521026 84.68 <.0001
 ERROR           15    15.857043  1.057136
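
One possible coding error I could check for, sketched with base R only: the model formula above names x1 and x2, while DATA and the imputation object contain x1m and x2m. If the formula variables are not columns of DATA, R would presumably pick up the complete x1 and x2 from the workspace, which would be consistent with the variance inflation factors of 1 and the zero rate of missing information reported above. The empty data frame below is only a stand-in with the same column names, and not_in_data is a name made up for this check.

```r
# Stand-in for DATA: only the column names matter for this check.
DATA <- data.frame(y = numeric(0), x1m = numeric(0), x2m = numeric(0))

# The formula exactly as passed to fit.mult.impute() above.
# all.vars() only parses the formula, so rcs() does not need to be loaded.
model_formula <- y ~ rcs(x1, 3) + rcs(x2, 3)

# Variables named in the formula that are not columns of DATA.
not_in_data <- setdiff(all.vars(model_formula), names(DATA))
print(not_in_data)
# [1] "x1" "x2"
```

If this is indeed the problem, the anova table above would simply come from five identical fits to the complete data rather than from the imputed data.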