error covariance of auxiliary variables

Patrick Paschke

Mar 27, 2019, 7:14:52 AM
to lavaan
Hi,

I'm trying to run a structural equation model using FIML, with auxiliary variables for missing data estimation. This is my analysis:

library(semTools)  # provides sem.auxiliary()

aux.vars <- c('ZMLAE', 'ZALAE')  # auxiliary variables, used only for missing data estimation
model <- '
  ZMA_t2 ~ 1 + c1*MFSK1 + c2*MK1 + c3*ZMA_t1 + c4*Zage + c5*Zsex
  MFSK1 =~ kom_m1 + kom_m2 + kom_m3 + kom_m4
  MK1 =~ alg_ges + dars_ges + zahl_ges + geo_ges + mess_ges + prop_ges
'
regression.aux <- sem.auxiliary(model, data = df, aux = aux.vars,
                                missing = "FIML", estimator = "MLR", fixed.x = FALSE)

Now I get the following warning message:

In lav_object_post_check(object) :
  lavaan WARNING: the covariance matrix of the residuals of the observed
                variables (theta) is not positive definite;
                use lavInspect(fit, "theta") to investigate.

I checked the error covariances and found that all covariances substantially deviating from 0 are those between one of the two auxiliary variables and one model variable. My question is, whether these covariances are actually problematic or not, because the auxiliary variables are not part of my model but just used for missing data estimation. And if they are problematic, I would be much obliged if you could explain why this is the case and how I could best approach that problem.

Thanks in advance! Below I will report the error covariances.
Patrick

         kom_m1 kom_m2 kom_m3 kom_m4 alg_gs drs_gs zhl_gs geo_gs mss_gs prp_gs ZMA_t2 ZMA_t1 Zalter Zsex   ZMLAE  ZALAE
kom_m1    0.189                                                                                                        
kom_m2    0.000  0.338                                                                                                 
kom_m3    0.000  0.000  0.173                                                                                          
kom_m4    0.000  0.000  0.000  0.272                                                                                   
alg_ges   0.000  0.000  0.000  0.000  1.318                                                                            
dars_ges  0.000  0.000  0.000  0.000  0.000  0.799                                                                     
zahl_ges  0.000  0.000  0.000  0.000  0.000  0.000  1.260                                                              
geo_ges   0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.974                                                       
mess_ges  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  1.347                                                
prop_ges  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  1.571                                         
ZMA_t2    0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.499                                  
ZMA_t1    0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.997                           
Zalter    0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.002  0.997                    
Zsex      0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000 -0.004  0.138  0.996             
ZMLAE    -0.571 -0.569 -0.560 -0.523 -0.219 -0.165 -0.282 -0.285 -0.307 -0.379 -0.279  0.063  0.020 -0.036  1.027      
ZALAE    -0.187 -0.182 -0.185 -0.184 -0.123 -0.103 -0.181 -0.137 -0.131 -0.233 -0.109  0.061  0.024 -0.036  0.795  1.006

Terrence Jorgensen

Mar 29, 2019, 6:45:59 AM
to lavaan
> I checked the error covariances and found that all covariances substantially deviating from 0 are those between one of the two auxiliary variables and one model variable.

"Not positive definite" does not mean "different from zero". It could indicate a Heywood case (a negative variance, or a residual correlation greater than 1 in absolute value), or more generally a linear dependency among the variables, which is harder to find.

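To see which of these cases applies, one could check the eigenvalues of theta and rescale it to residual correlations. The sketch below is only an illustration in base R: it copies a 5 x 5 block (the four kom_m* indicators and ZMLAE) from the theta output posted above instead of using the fitted object, and the names `theta`, `min_ev`, `r`, and `bad` are made up for this example. Every principal submatrix of a positive definite matrix must itself be positive definite, so a problem in this block is enough to make the full matrix fail.

```r
# Residual (co)variances copied from the posted theta output:
# the four kom_m* indicators plus the auxiliary variable ZMLAE.
vars  <- c("kom_m1", "kom_m2", "kom_m3", "kom_m4", "ZMLAE")
theta <- diag(c(0.189, 0.338, 0.173, 0.272, 1.027))
dimnames(theta) <- list(vars, vars)
theta["ZMLAE", 1:4] <- c(-0.571, -0.569, -0.560, -0.523)
theta[1:4, "ZMLAE"] <- theta["ZMLAE", 1:4]

# Positive definite means that every eigenvalue is strictly positive.
ev     <- eigen(theta, symmetric = TRUE, only.values = TRUE)$values
min_ev <- min(ev)

# Rescale to residual correlations; |r| > 1 points to a Heywood-type problem.
r   <- cov2cor(theta)
bad <- which(abs(r) > 1 & lower.tri(r), arr.ind = TRUE)

print(min_ev)
print(round(r["ZMLAE", ], 2))
print(bad)
```

With these rounded values the smallest eigenvalue is negative, and the residual correlations of ZMLAE with kom_m1 and kom_m3 exceed 1 in absolute value (about -1.30 for kom_m1). On the real fit, the same checks could presumably be run on the matrix returned by lavInspect(fit, "theta"), as the warning suggests.
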
> My question is, whether these covariances are actually problematic or not, because the auxiliary variables are not part of my model but just used for missing data estimation

They are part of your model, just not of your hypothesized model. They need to be there so that FIML can use the information they carry. This is the "saturated correlates" approach described in the Enders article listed under References on the ?auxiliary help page.
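
For illustration, the extra parameters that a saturated-correlates specification adds could be written out as lavaan syntax by hand. This is only a sketch of the idea, not a copy of what sem.auxiliary() does internally. The helper vectors `observed` and `aux_syntax` are made up here, and the variable names are taken from the model in the first post. Each auxiliary variable is allowed to covary with the other auxiliary variable and with every observed variable in the model (with the residual, for endogenous variables and indicators).

```r
# Auxiliary variables and the observed variables of the hypothesized model
aux      <- c("ZMLAE", "ZALAE")
observed <- c("kom_m1", "kom_m2", "kom_m3", "kom_m4",
              "alg_ges", "dars_ges", "zahl_ges", "geo_ges", "mess_ges", "prop_ges",
              "ZMA_t2", "ZMA_t1", "Zage", "Zsex")

aux_syntax <- c(
  # covariance between the two auxiliary variables
  paste(aux[1], "~~", aux[2]),
  # (residual) covariances of each auxiliary variable with each observed variable
  paste(rep(aux, each = length(observed)), "~~", rep(observed, times = length(aux)))
)

cat(aux_syntax, sep = "\n")
```

These 1 + 2 * 14 = 29 covariance lines are the part of the model that is not hypothesized, which is why the auxiliary variables show up in theta next to the model variables.
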

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam
