Apr 6, 2020, 2:58:28 PM

to lavaan

Hello,

I'm trying to run a model (a replication of a previous paper) in lavaan using the 'sem()' function. I have four endogenous variables (three continuous and one binary), five exogenous variables, and one covariate. My sample size is N=1172. I believe my issue has to do with how I specify the binary variable in my model: when I run the model without it (with 3 endogenous variables), it works just fine. However, when I include the binary variable (0/1 coded responses), I get these warnings:

lavaan WARNING:

Could not compute standard errors! The information matrix could

not be inverted. This may be a symptom that the model is not

identified.

lavaan WARNING: could not invert information matrix needed for robust test statistic

My model is as follows:

___________________________________________________________________________________________________

mymodel <- '

#endogenous variables

endo1 =~ sds_q1 + sds_q4 + sds_q6 + sds_q8

endo2 =~ sds_q2 + sds_q3 + sds_q5 + sde_q7

endo3 =~ dis_q1 + dis_q2 + dis_q3 + dis_q4

endo4 =~ sh_q2 #categorical variable (0/1)

#exogenous variables

exo1 =~ sresil_q1 + sresil_q2 + sresil_q5 + sresil_q9 + sresil_q11 + sresil_q13 + sresil_q19 + sresil_q20 + sresil_q25

exo2 =~ sresil_q4 +sresil_q8 + sresil_q10 + sresil_q12 + sresil_q14 + sresil_q18 + sresil_q23 + sresil_q24

exo3 =~ sresil_q3 + sresil_q6 + sresil_q7 + sresil_q11 + sresil_q13 + sresil_q15 + sresil_q16 + sresil_q17 + sresil_q21 + sresil_q22

exo4 =~ in_q1 + in_q2 + in_q3 + in_q4 + in_q5

exo5 =~ inq_6 + in_q7 + in_q8 + in_q9 + in_q10

#covariate

covariate =~ csds_q1 + csds_q2 + csds_q3 + csds_q4 + csds_q5 + csds_q6 + csds_q7 + csds_q8 + csds_q9 + csds_q10

#regression

endo1 ~ covariate

endo2 ~ endo1 + covariate

endo3~ endo2 + exo1 + exo2 + exo3 + exo4 + exo5 + covariate

endo4 ~ endo3 + covariate'

# model identification and estimation

modelfit <- sem(mymodel, data = data, ordered = "sh_q2") #to specify categorical variable

#print results

summary(modelfit, fit.measures = TRUE, standardized = TRUE)

_________________________________________________________________________________________________

I have two questions:

1. In the output for the model above, I get the error but I also get the output. All standard errors are NA, and lavaan doesn't compute SEM fit indices. I'd really appreciate some guidance as to where I might be going wrong with this.

2. When I run the model without the binary variable (exactly the same as above but without "endo4") and print the R-square values, only two values show up; the third is NA. I get no errors, but I'm curious why this happens and how I might get all three R-square values, as in the study I am replicating. The output looks like this:

endo1 NA

endo2 0.848

endo3 0.695

Overall, could the issue be that I am just specifying too many endogenous variables for lavaan? Or that I'm specifying it incorrectly?

Thank you for any help!

Apr 9, 2020, 7:30:50 PM

to lavaan

Have you tried just regressing the binary variable on its predictors, instead of needlessly defining a single-indicator factor? With a single binary indicator, unexpected identification issues might be at play.

Terrence D. Jorgensen

Assistant Professor, Methods and Statistics

Research Institute for Child Development and Education, the University of Amsterdam
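
A minimal sketch of the suggestion above, for readers following along. It assumes the poster's data frame is called `data` (as in the original `sem()` call) and copies the indicator names exactly as posted; the object names `mymodel2` and `modelfit2` are made up for illustration. The only change from the original model is that the single-indicator factor is dropped and `sh_q2` is regressed on its predictors directly. Whether this resolves the identification warning for this particular data set is not confirmed in the thread.

```r
# Sketch only: same measurement model as in the original post, but the binary
# outcome sh_q2 enters the structural part directly as an observed variable.
mymodel2 <- '
  # measurement model (indicator names copied as posted)
  endo1 =~ sds_q1 + sds_q4 + sds_q6 + sds_q8
  endo2 =~ sds_q2 + sds_q3 + sds_q5 + sde_q7
  endo3 =~ dis_q1 + dis_q2 + dis_q3 + dis_q4

  exo1 =~ sresil_q1 + sresil_q2 + sresil_q5 + sresil_q9 + sresil_q11 + sresil_q13 + sresil_q19 + sresil_q20 + sresil_q25
  exo2 =~ sresil_q4 + sresil_q8 + sresil_q10 + sresil_q12 + sresil_q14 + sresil_q18 + sresil_q23 + sresil_q24
  exo3 =~ sresil_q3 + sresil_q6 + sresil_q7 + sresil_q11 + sresil_q13 + sresil_q15 + sresil_q16 + sresil_q17 + sresil_q21 + sresil_q22
  exo4 =~ in_q1 + in_q2 + in_q3 + in_q4 + in_q5
  exo5 =~ inq_6 + in_q7 + in_q8 + in_q9 + in_q10

  covariate =~ csds_q1 + csds_q2 + csds_q3 + csds_q4 + csds_q5 + csds_q6 + csds_q7 + csds_q8 + csds_q9 + csds_q10

  # structural model
  endo1 ~ covariate
  endo2 ~ endo1 + covariate
  endo3 ~ endo2 + exo1 + exo2 + exo3 + exo4 + exo5 + covariate

  # binary outcome regressed directly on its predictors
  sh_q2 ~ endo3 + covariate
'

# Fit only if lavaan is installed and a data frame named `data` is loaded
# (assumption: `data` is the poster's data set).
if (requireNamespace("lavaan", quietly = TRUE) && is.data.frame(get0("data"))) {
  library(lavaan)
  modelfit2 <- sem(mymodel2, data = data, ordered = "sh_q2")
  summary(modelfit2, fit.measures = TRUE, standardized = TRUE)
}
```

Declaring `sh_q2` in `ordered =` is kept from the original call, so lavaan still treats the outcome as categorical.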

May 25, 2020, 3:02:30 AM

to lavaan

Hi Terrence,

Thank you very much for your response, and apologies for my delayed reply. As this is a replication study, ideally we would still use SEM, so I've run the model removing the categorical variable as an indicator in the structural model, and it seems to have worked just fine.

I did have another query if possible: my categorical endogenous variable is binary (either 0, 1 -- 'no', 'yes'), which Rosseel (2020) suggests lavaan can deal with. However, this variable is hugely zero-inflated (i.e. there are 1172 total cases, and 852 of them are '0' values). Do you have any thoughts on how this could be an issue in an SEM, or know of any papers that can help? There are papers on this for logistic regression generally, but less so on this applied in a SEM.

Thanks again,

Rosie

May 26, 2020, 12:26:59 PM

to lav...@googlegroups.com

On 5/25/20 9:02 AM, Rosie Pendrous wrote:

> I did have another query if possible: my categorical endogenous variable

> is binary (either 0, 1 -- 'no', 'yes'), which Rosseel (2020) suggests

> lavaan can deal with. However, this variable is hugely zero-inflated

> (i.e. there are 1172 total cases, and 852 of them are '0' values).

That is not a problem. The term 'zero-inflated' is not used for binary

variables. It is (usually) used for count variables.

Yves.
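
For reference, the split described in the question can be worked out from the two counts given in the thread. The sketch below is illustrative arithmetic only; the threshold line assumes a probit-type latent response with no predictors, which is an assumption and not something stated in the replies.

```r
# Counts reported in the thread
n_total <- 1172
n_zero  <- 852
n_one   <- n_total - n_zero      # number of "yes" responses

p_zero <- n_zero / n_total       # proportion of "no" responses
p_one  <- n_one / n_total        # proportion of "yes" responses

# Assumption: standard-normal latent response, no predictors.
# The threshold that reproduces this split is the normal quantile of p_zero.
tau <- qnorm(p_zero)

print(round(c(p_zero = p_zero, p_one = p_one, threshold = tau), 3))
```

Roughly 73% of responses are 0 and 27% are 1, which is simply an unbalanced binary variable.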
