Hello,
I am new to R and am having trouble using multiply imputed data with runMI. My data is currently in long format and has been imputed 10 times. So that the syntax would run within a reasonable time frame while I check my work, I made a second dataset with only 2 imputations.
It seems as though the first 3 sections of the syntax below (from the semTools documentation) create a simulated dataset. I thought perhaps that was just to create data to use in the example. However, since I’m being told my data has no missing values (which it shouldn't, because it has already been imputed), do I need to simulate a model like mine as well? I did try this to see what would happen, and the program ran for a very, very long time. I’m not sure how exactly I’m supposed to use multiply imputed data in my analysis, or whether my data is in the wrong format. I’ve included the syntax from the semTools documentation and, under that, my model and syntax.
My last question is, how does one request pooled parameters from the multiply imputed datasets? When I do get a line to run (although it hasn't been exactly what I want yet), the parameters are provided for each dataset, not pooled.
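For reference, "pooled" here usually means combining the per-imputation results with Rubin's rules. Below is a minimal base R sketch of that rule for a single parameter. The numbers and the helper name pool_rubin are made up for illustration and are not semTools output.

```r
# Hypothetical estimates and standard errors of ONE parameter,
# one value per imputed dataset (made-up numbers for illustration).
est <- c(0.50, 0.54, 0.46)
se  <- c(0.10, 0.11, 0.09)

# pool_rubin is a hypothetical helper, not a semTools function.
pool_rubin <- function(est, se) {
  m     <- length(est)          # number of imputations
  q_bar <- mean(est)            # pooled point estimate
  w     <- mean(se^2)           # within-imputation variance
  b     <- var(est)             # between-imputation variance
  t_var <- w + (1 + 1 / m) * b  # total variance (Rubin's rules)
  c(estimate = q_bar, se = sqrt(t_var))
}

pooled <- pool_rubin(est, se)
print(pooled)
```

The pooled estimate is the mean of the per-imputation estimates. The pooled standard error is larger than the average standard error because it also reflects how much the estimates differ between imputations.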
I’d appreciate any advice. Thank you very much in advance,
Diana
modsim <- '
f1 =~ 0.7*y1+0.7*y2+0.7*y3
f2 =~ 0.7*y4+0.7*y5+0.7*y6
f3 =~ 0.7*y7+0.7*y8+0.7*y9'
mod <- '
f1 =~ y1+y2+y3
f2 =~ y4+y5+y6
f3 =~ y7+y8+y9'
datsim <- simulateData(modsim,model.type="cfa", meanstructure=TRUE,
std.lv=TRUE, sample.nobs=c(200,200))
randomMiss2 <- rbinom(prod(dim(datsim)), 1, 0.1)
randomMiss2 <- matrix(as.logical(randomMiss2), nrow=nrow(datsim))
datsim[randomMiss2] <- NA
datsimMI <- amelia(datsim,m=3, noms="group")
out3 <- runMI(mod, data=datsimMI$imputations, chi="LMRR", group="group", fun="cfa")
summary(out3)
inspect(out3, "fit")
inspect(out3, "impute")
mod2 <- '
grade =~ NA*grade1
gender=~ NA*gender1
cvict =~ cvict1 + cvict2 + cvict3
cbul =~ v2*cbul1 + v2*cbul2
clim =~ clim1 + clim2 + clim3 + clim4
teachdo=~NA*teachdo1
peerdo=~NA*peerdo1
agg =~ v3*agg1 + v3*agg2
assert =~ v6*assert1 + v6*assert2
talk=~NA*talk1
telladult =~ v4*telladult1 + v4*telladult2
donothing =~ v5*donothing1 + v5*donothing2
grade~~1*grade
gender~~1*gender
cvict~~1*cvict
clim~~1*clim
teachdo~~1*teachdo
peerdo~~1*peerdo
agg~~1*agg
assert~~1*assert
talk~~1*talk
telladult~~1*telladult
cbul~~1*cbul
donothing~~1*donothing
agg ~ grade + gender + cvict + cbul + clim + teachdo + peerdo
assert~ grade + gender + cvict + cbul + clim + teachdo + peerdo
talk ~ grade + gender + cvict + cbul + clim + teachdo + peerdo
telladult ~ grade + gender + cvict + cbul + clim + teachdo + peerdo
donothing ~ grade + gender + cvict + cbul + clim + teachdo + peerdo
'
imputed2<-read.csv("imputed2.csv",header=FALSE)
names(imputed2) <-c("imp", "grade1","gender1","cvict1","cvict2","cvict3","cbul1","cbul2","clim1","clim2","clim3","clim4","teachdo1","peerdo1","agg1", "agg2", "assert1", "assert2","talk1","telladult1", "telladult2","donothing1", "donothing2")
out3 <- runMI(mod2, data=imputed2, chi="all", group="imp", fun="sem", m = 2, std.lv=TRUE)
summary(out3)
I get this error:
Amelia Error Code: 39
Your data has no missing values. Make sure the code for missing data is set to the code for R, which is NA.
Error in apply(seAll, 2, function(x) all(x >= 0)) : dim(X) must have a positive length
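The Amelia message suggests that runMI treated the single stacked data frame as incomplete data and tried to impute it. As far as I can tell from the semTools documentation, the data argument can instead be a list of already imputed data frames. Below is a minimal sketch of building such a list in base R. The toy data frame stands in for imputed2, and the commented runMI call is an untested assumption about how the list would be passed.

```r
# Toy stand-in for a stacked file such as imputed2.csv: an "imp" column
# marks the imputation, followed by the (already imputed) variables.
stacked <- data.frame(
  imp    = rep(1:2, each = 4),
  cvict1 = c(1, 2, 3, 4, 1, 2, 3, 5),
  cvict2 = c(2, 2, 4, 4, 2, 3, 4, 4)
)

# One data frame per imputation, with the bookkeeping column removed.
imp_list <- lapply(split(stacked, stacked$imp), function(d) {
  d$imp <- NULL
  rownames(d) <- NULL
  d
})

length(imp_list)        # number of imputed datasets
sapply(imp_list, nrow)  # rows per imputed dataset

# Assumption, not run here: the list (not the stacked data frame) goes to
# `data`, and "imp" is no longer used as a grouping variable.
# out3 <- runMI(mod2, data = imp_list, fun = "sem", std.lv = TRUE)
```

With a list, the imputation index no longer needs to be a grouping variable. That leaves the group argument free for a real grouping variable.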
I also tried mice3, a dataset I set up the way I believe it would look had I imputed using mice: my unimputed dataset first, followed by imputation 1 and imputation 2, all stacked in long format.
mice3<-read.csv("mice3.csv")
test <- as.mids(mice3, .id = NULL)
is.mids(test)
test.dat <- complete(test, action = "long", include = TRUE)
names(mice3) <-c(".imp", "grade1","gender1","cvict1","cvict2","cvict3","cbul1","cbul2","clim1","clim2","clim3","clim4","teachdo1","peerdo1","agg1", "agg2", "assert1", "assert2","talk1","telladult1", "telladult2","donothing1", "donothing2")
out3 <- runMI(mod2, data=test.dat, chi="all", miPackage="mice",group=".imp", fun="sem", m = 2)
This gives me output for 3 datasets, including the one I have coded as 0 because it is unimputed, when I only want the program to pay attention to the already imputed datasets.
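One way to leave out the unimputed copy is to drop the rows coded 0 before handing the data over. The base R sketch below uses a toy data frame shaped like mice's long format with include = TRUE; the column values are invented. Calling complete() with include = FALSE should have the same effect, since that option controls whether the original data is returned.

```r
# Toy stand-in for mice's long format with include = TRUE:
# .imp == 0 is the original data (with NA), .imp >= 1 are the imputations.
long <- data.frame(
  .imp = rep(0:2, each = 3),
  .id  = rep(1:3, times = 3),
  agg1 = c(NA, 2, 3,  1, 2, 3,  2, 2, 3)
)

# Keep only the imputed copies.
imputed_only <- long[long$.imp > 0, ]

# Split into one data frame per imputation, without the bookkeeping columns.
keep     <- setdiff(names(imputed_only), c(".imp", ".id"))
imp_list <- split(imputed_only[, keep, drop = FALSE], imputed_only$.imp)

names(imp_list)            # only the imputations remain
sapply(imp_list, anyNA)    # no missing values left in any copy
```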
I also have been able to put the data into a single dataframe:
imputed10 <- complete(imputed, action = "long")
nImputations <- 10
impList <- list()
for (i in 1:nImputations) {
  impList[[i]] <- complete(imputed, action = i)
}
out <- runMI(outcomes, data = impList, ...)
Will this syntax also get lavaan to recognize 100 imputed datasets as separate datasets, so that it does not count 1000 girls, 3500 boys, etc. in my multi-group analyses?
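A quick way to check for inflated group sizes is to compare the group counts in the stacked data with the counts inside each list element. The base R sketch below uses invented data (2 girls, 3 boys, 3 imputations); the gender column and the counts are hypothetical.

```r
# Toy stacked data: 5 cases (2 girls, 3 boys) imputed 3 times.
stacked <- data.frame(
  .imp   = rep(1:3, each = 5),
  gender = rep(c("girl", "girl", "boy", "boy", "boy"), times = 3)
)

# Counting on the stacked data multiplies every group by the
# number of imputations.
stacked_counts <- table(stacked$gender)
print(stacked_counts)

# Counting within each imputed dataset recovers the real group sizes.
imp_list <- split(stacked, stacked$.imp)
per_imp  <- sapply(imp_list, function(d) table(d$gender))
print(per_imp)
```

If every column of per_imp matches the group sizes in the original sample, each list element is a single complete dataset. A multi-group analysis on that list should then see the real sample size rather than a multiple of it.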