Covariance between exogenous binary variables in SEM


Dai Duong

Sep 26, 2018, 1:21:04 AM
to lavaan
Hello experts and fans of lavaan,
I want to include covariances between exogenous binary variables in my SEM model. I have two questions:
1. To allow such covariances in lavaan, I set two options: "fixed.x = FALSE, conditional.x = FALSE",
so the command is: sem(fit, data=data,std.lv = TRUE,fixed.x = FALSE,conditional.x = FALSE)
Is that correct?

2. The above command only works with exogenous variables that have at least three values. When I add an exogenous binary variable, the following error appears:
"Error in eigen(VarCov, symmetric = TRUE, only.values = TRUE) : 
  infinite or missing values in 'x'
In addition: Warning message:
In sqrt(A1[[g]]) : NaNs produced"
I don't understand why this happens or how to overcome it.

Here is my model:
ctrl.l <- "
ctrl_l =~ income+wh+skill+mwh21+njob    # njob and skill are categorical indicators

ctrl_l~tedu+m1ac2+ahtedu+hleadera+ titw+urbanrate+exp #all exogenous variables are continuous or ordered
wh ~~  mwh21
mwh21 ~~   njob
income ~~  skill
income ~~ wh
income ~~ mwh21
wh ~~  njob
"
fit.l <- sem(ctrl.l, data=data,std.lv = TRUE,fixed.x = FALSE,conditional.x = FALSE)

==> The model fits successfully.

But if I add binary variables such as gender, urban/rural, or other dummy variables, the error occurs:

ctrl.l <- "
ctrl_l =~ income+wh+skill+mwh21+njob    # njob and skill are categorical indicators.

ctrl_l~tedu+m1ac2+ahtedu+hleadera+ titw+urbanrate+exp+gender+urban    # all exogenous variables are continuous or ordered, except gender and urban, which are binary
wh ~~  mwh21
mwh21 ~~   njob
income ~~  skill
income ~~ wh
income ~~ mwh21
wh ~~  njob
"
fit.l <- sem(ctrl.l, data=data,std.lv = TRUE,fixed.x = FALSE,conditional.x = FALSE)
Error in eigen(VarCov, symmetric = TRUE, only.values = TRUE) : 
  infinite or missing values in 'x'
In addition: Warning message:
  In sqrt(A1[[g]]) : NaNs produced

Please help me, 
Thank you very much,

Dai Duong

Dai Duong

Sep 27, 2018, 11:20:38 AM
to lavaan
Hi all,
I have found a way to make lavaan run with covariances between exogenous binary variables: declare the binary variables as ordered.
However, another issue occurs:

When I add an exogenous binary variable, the residual variances of two indicators (njob, skill) become negative:

$cov
          inc001 wh0001 skill  mwh21  njob   ahtedu hleadr migrtd
income001  0.000                                                 
wh0001    -0.015  0.000                                          
skill      0.059 -0.133 -0.016                                   
mwh21      0.001 -0.016  0.012  0.000                            
njob       0.031  0.164  0.000  0.149 -0.021                     
ahtedu     0.022 -0.025  0.097  0.005  0.016  0.000              
hleadera   0.002 -0.006  0.003  0.003  0.008  0.000  0.000       
migrated   0.022 -0.033 -0.025  0.008 -0.003  0.000  0.002  0.000

In the residual covariance matrix above, the residual variances of skill and njob become negative when migrated (binary) is added.
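
The `$cov` tables in this post look like output from `resid(fit)` (an assumption, since the call itself is not shown). The diagonal of that matrix holds observed minus model-implied variances, which is a different quantity from the estimated residual variance parameters of the indicators. A rough sketch of pulling both, using lavaan's built-in `HolzingerSwineford1939` data and a two-factor model purely as a stand-in for the model in this thread:

```r
library(lavaan)

# Stand-in model on a built-in dataset (not the poster's data or model)
model <- "
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
"
fit <- cfa(model, data = HolzingerSwineford1939)

# (1) Residual matrix: observed minus model-implied (co)variances
res_cov <- resid(fit, type = "raw")$cov

# (2) Estimated residual variances of the observed indicators
pe <- parameterEstimates(fit)
ov <- lavNames(fit, "ov")
resid_var <- pe[pe$op == "~~" & pe$lhs == pe$rhs & pe$lhs %in% ov,
                c("lhs", "est")]

# Any negative estimate here would be a Heywood case
heywood <- resid_var[resid_var$est < 0, ]

print(round(res_cov, 3))
print(resid_var)
print(heywood)
```

Checking `resid_var` rather than the diagonal of `res_cov` is one way to see whether an estimated residual variance is actually negative.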

 

If I add a non-binary variable instead, for example tedu01 (years of schooling), while keeping njob and skill in the model, the residual variances of skill and njob are 0:

$cov
          inc001 wh0001 skill  mwh21  njob   ahtedu hleadr tedu01
income001  0.000                                                 
wh0001    -0.019  0.000                                          
skill      0.036 -0.143  0.000                                   
mwh21      0.001 -0.021  0.012  0.000                            
njob       0.029  0.148  0.000  0.170  0.000                     
ahtedu     0.018 -0.027  0.087  0.005  0.014  0.000              
hleadera   0.001 -0.006  0.003  0.003  0.007  0.000  0.000       
tedu01     0.015 -0.053  0.115  0.005  0.044  0.109  0.008  0.000

 

I tried removing both njob and skill from the measurement part; then no issue occurs, even when a binary variable is added:

$cov
          inc001 wh0001 mwh21  ahtedu hleadr prisct
income001  0.000                                   
wh0001    -0.009  0.000                            
mwh21      0.000 -0.004  0.000                     
ahtedu     0.032 -0.020  0.003  0.000              
hleadera   0.002 -0.007  0.004  0.000  0.000       
prisect   -0.015  0.157 -0.011  0.000 -0.016  0.000

 

It seems that including exogenous binary variables makes the model invalid when covariances between exogenous variables are allowed.

 

How should I solve this issue?

Thank you very much,
Dai Duong

Terrence Jorgensen

Sep 28, 2018, 6:06:56 AM
to lavaan
> 1. To allow such covariances in lavaan, I set two options: "fixed.x = FALSE, conditional.x = FALSE",
> so the command is: sem(fit, data=data,std.lv = TRUE,fixed.x = FALSE,conditional.x = FALSE)
> Is that correct?

Yes.
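
A minimal sketch of how one might verify this, assuming simulated toy data (the variables `y`, `x1`, `x2` and the data frame `toy` are made up for illustration and are not from the poster's data). With `fixed.x = FALSE`, the covariance between exogenous predictors becomes a free parameter, which shows up as a non-zero entry in the `free` column of the parameter table:

```r
library(lavaan)

# Hypothetical toy data: two correlated predictors and one outcome
set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- 0.5 * x1 + rnorm(n)
y  <- 0.4 * x1 + 0.3 * x2 + rnorm(n)
toy <- data.frame(y = y, x1 = x1, x2 = x2)

model <- "y ~ x1 + x2"

# Same model, with exogenous (co)variances fixed vs. freely estimated
fit_fixed <- sem(model, data = toy, fixed.x = TRUE)
fit_free  <- sem(model, data = toy, fixed.x = FALSE)

pt_fixed <- parTable(fit_fixed)
pt_free  <- parTable(fit_free)

# Helper (illustrative): locate the x1 ~~ x2 row in a parameter table
is_cov <- function(pt) pt$lhs == "x1" & pt$op == "~~" & pt$rhs == "x2"

# free == 0 means the parameter is fixed; free > 0 means it is estimated
print(pt_fixed[is_cov(pt_fixed), c("lhs", "op", "rhs", "free")])
print(pt_free[is_cov(pt_free), c("lhs", "op", "rhs", "free")])
```

The same check can be run on any fitted lavaan object to confirm which exogenous covariances are actually being estimated.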

> 2. The above command only works with exogenous variables that have at least three values. When I add an exogenous binary variable, the following error appears:
> "Error in eigen(VarCov, symmetric = TRUE, only.values = TRUE) : 
>   infinite or missing values in 'x'
> In addition: Warning message:
> In sqrt(A1[[g]]) : NaNs produced"
> I don't understand why this happens or how to overcome it.

Can you post enough data to reproduce this error?

Terrence D. Jorgensen
Postdoctoral Researcher, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Dai Duong

Sep 28, 2018, 12:06:00 PM
to lavaan
Dear Dr. Terrence,
Regarding the problem:
>> 2. The above command only works with exogenous variables that have at least three values. When I add an exogenous binary variable, the following error appears:
>> "Error in eigen(VarCov, symmetric = TRUE, only.values = TRUE) : 
>>   infinite or missing values in 'x'
>> In addition: Warning message:
>> In sqrt(A1[[g]]) : NaNs produced"
> Can you post enough data to reproduce this error?

--> My sample contains 5856 observations.

I fixed the issue by declaring the exogenous binary variables as ordered; the model then fits successfully. Is that OK?
If I don't declare them as ordered, I get other warnings like these:
> ctrl.labor <- "
+ control_labor =~ income001+wh0001+skill+mwh21+njob
+ control_labor~ahtedu+hleadera+tedu01+exp01+ttnt
+ wh0001 ~~  mwh21
+ mwh21 ~~   njob
+ income001 ~~  skill
+ income001 ~~ wh0001
+ income001 ~~ mwh21
+ wh0001 ~~  njob
+ "
> fit.l <- sem(ctrl.labor, data=x180108_lkh,std.lv = TRUE,fixed.x = FALSE,conditional.x = FALSE,ordered=c("njob","skill")) # only the two indicators are declared "ordered"; none of the predictors is
Warning messages:
1: In muthen1984(Data = X[[g]], ov.names = ov.names[[g]], ov.types = ov.types,  :
  lavaan WARNING: trouble constructing W matrix; used generalized inverse for A11 submatrix
2: In lav_model_vcov(lavmodel = lavmodel, lavsamplestats = lavsamplestats,  :
  lavaan WARNING:
    The variance-covariance matrix of the estimated parameters (vcov)
    does not appear to be positive definite! The smallest eigenvalue
    (= -4.591183e-20) is smaller than zero. This may be a symptom that
    the model is not identified.
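
The second warning refers to the smallest eigenvalue of the covariance matrix of the parameter estimates. A minimal sketch of inspecting those eigenvalues directly, again with the built-in `HolzingerSwineford1939` data as a stand-in (the assumption is that the same check would be applied to `fit.l`; the tolerance `tol` is an arbitrary illustrative choice). An eigenvalue at or very near zero means that matrix is (nearly) singular, which is what the warning flags:

```r
library(lavaan)

# Stand-in model on a built-in dataset (not the poster's data or model)
model <- "
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
"
fit <- cfa(model, data = HolzingerSwineford1939)

# Covariance matrix of the estimated parameters, as a plain matrix
V <- unclass(vcov(fit))

# Eigenvalues only; a symmetric matrix has real eigenvalues
ev <- eigen(V, symmetric = TRUE, only.values = TRUE)$values

# Compare the smallest eigenvalue to the largest, using a relative tolerance
tol <- sqrt(.Machine$double.eps)
near_singular <- min(ev) < tol * max(ev)

print(range(ev))
print(near_singular)
```

If `near_singular` comes out `TRUE` for a model, that points to looking at which parameters are involved before trusting the standard errors.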

Thank you very much
Dai

