Skip to first unread message

Feb 20, 2019, 2:50:54 PM2/20/19

to lavaan

Dear lavaan group,

My question is about the semTools calculation of pooled standard errors when multiple imputation has been used. Unless I'm misunderstanding something, the calculation is not consistent with Rubin's rule.

The following example is based on the example in the documentation for runMI. I'm using R 3.5.2, lavaan 0.6-3, semTools 0.5-1.911.

library(semTools)

HSMiss <- HolzingerSwineford1939[ , c(paste("x", 1:9, sep = ""),

"ageyr","agemo","school")]

set.seed(12345)

HSMiss$x5 <- ifelse(HSMiss$x5 <= quantile(HSMiss$x5, .3), NA, HSMiss$x5)

age <- HSMiss$ageyr + HSMiss$agemo/12

HSMiss$x9 <- ifelse(age <= quantile(age, .3), NA, HSMiss$x9)

HS.model <- '

visual =~ x1 + x2 + x3

textual =~ x4 + x5 + x6

speed =~ x7 + x8 + x9

'

n.imp <- 5 # number of imputed data sets

library(Amelia)

set.seed(12345)

HS.amelia <- amelia(HSMiss, m = n.imp, noms = "school", p2s = FALSE)

imps <- HS.amelia$imputations

out2 <- cfa.mi(HS.model, data = imps)

summary(out2)

# Fit the model to each imputed data set manually

cfa.imp <- vector("list", n.imp)

for (i in 1:n.imp) {

cfa.imp[[i]] <- cfa(HS.model, data = imps[[i]])

}

# Standard errors according to Rubin's rule

sqrt(diag(Reduce("+", lapply(cfa.imp, vcov))/n.imp +

(1 + 1/n.imp)*cov(t(sapply(cfa.imp, coef)))))

summary(out2)$se # similar but not the same (ignore the zeros)

The code for runMI-methods.R contains the following lines:

W <- rowMeans(sapply(object@ParTableList[useImps], "[[", i = "se")^2)

B <- apply(sapply(object@ParTableList[useImps], "[[", i = "est"), 1, var)

Bm <- B + B/m

Tot <- W + Bm

I'm wondering why the standard errors aren't calculated as sqrt(Tot).

Thanks,

Mark

Feb 21, 2019, 3:51:35 PM2/21/19

to lavaan

I'm wondering why the standard errors aren't calculated as sqrt(Tot).

They are, in the code immediately above that where the standard errors are assigned using the `lav_model_vcov_se()`

function, providing `VCOV`

from the `vcov()`

output (which is identical to `Tot`

as calculated there when `scale.W=FALSE`

). Not sure why I kept `Tot`

in there, it’s probably just a legacy from before I wrote the `vcov()`

method. But `W`

and `Bm`

are there for calculating the df. The reason for the differences is that texts often repeat only the univariate version of Rubin’s rules, rather than the multivariate extension that pools the entire sampling covariance matrix.

Terrence D. Jorgensen

Assistant Professor, Methods and Statistics

Research Institute for Child Development and Education, the University of Amsterdam

http://www.uva.nl/profile/t.d.jorgensen

Feb 21, 2019, 5:06:55 PM2/21/19

to lavaan

On Friday, February 22, 2019 at 7:51:35 AM UTC+11, Terrence Jorgensen wrote:

The reason for the differences is that texts often repeat only the univariate version of Rubin’s rules, rather than the multivariate extension that pools the entire sampling covariance matrix.

Thanks for your reply Terrence. But I don't understand what you mean by univariate vs multivariate versions of Rubin's rules.

In the example I posted, isn't

sqrt(diag(Reduce("+", lapply(cfa.imp, vcov))/n.imp +

(1 + 1/n.imp)*cov(t(sapply(cfa.imp, coef)))))

multivariate because it involves the covariance matrix of all the coefficients?

Feb 28, 2019, 10:00:03 PM2/28/19

to lavaan

Thanks for your reply Terrence. But I don't understand what you mean by univariate vs multivariate versions of Rubin's rules.

It should really be referred to as uni- vs. multiparameter tests. I think it is self-explanatory, but read Enders (2010) book for a more thorough description and example applications.

Reply all

Reply to author

Forward

0 new messages

Search

Clear search

Close search

Google apps

Main menu