binary data lavaan

Maria Altendorf

Apr 29, 2019, 5:38:27 PM
to lavaan
Dear all,

I have a model with two exogenous binary (0/1) variables, which are my conditions; some latent continuous variables, which are the moderators; and a binary outcome variable (smoking behaviour: 0/1).

Can I use lavaan to run a CFA and SEM with it? 
If that is possible, how do I specify my CFA model? That is, how do I tell lavaan/R that these observed variables are manifest and not latent?
Also, lavaan needs to know that it should run a logistic regression, which is usually done with the "ordered" argument for categorical endogenous variables. However, my variables are binary and not categorical. I could not find a solution online yet.

I appreciate your help a lot!!! In case you want to see my syntax I am happy to provide it.
Thanks in advance. 
Cheers, Maria

PhD Candidate
University of Amsterdam

Leslie Zhen

May 1, 2019, 10:59:59 PM
Hi Maria,

I am by no means experienced with lavaan, so I will try to address what I know.  The link below has been helpful for how to declare exogenous categorical variables.  Whether a variable is treated as latent or observed depends on how you write your model.  Latent variables are measured by (=~) other variables.  I do not follow your differentiation of binary vs. categorical.  Binary is a type of categorical.
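As a minimal sketch of that distinction (assuming lavaan's built-in HolzingerSwineford1939 example data rather than Maria's data; the model itself is only illustrative): a variable on the left of =~ becomes latent, while a variable that only appears in a regression (~) is treated as observed.

```r
library(lavaan)

# Illustrative model on lavaan's built-in example data (not the data from this thread).
model <- '
  visual =~ x1 + x2 + x3   # "visual" is latent: it is measured by x1-x3 via =~
  visual ~ ageyr           # "ageyr" is observed: it only appears as a predictor
'

fit <- sem(model, data = HolzingerSwineford1939)

lavNames(fit, "lv")  # names lavaan treats as latent variables
lavNames(fit, "ov")  # names lavaan treats as observed variables
```

So there should be no need to declare anything as "manifest"; it follows from where the variable appears in the model syntax.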

Leslie Zhen, PhD Student
Department of Communication Sciences and Disorders
University of Pittsburgh

Christopher David Desjardins

May 2, 2019, 10:08:54 AM
to lavaan

Hi Maria,

It might be helpful to share your script to see what you’re actually trying to do, but what you’ve described is certainly possible in lavaan. The link that Leslie provided for you is a good start. Currently lavaan will do probit regression not logistic regression for categorical endogenous variables (by default). But probit regression is fine to use and easy (maybe easier?) to interpret.
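Here is a rough sketch of what I mean, on simulated data (all variable names, effect sizes, and the sample size are made up for illustration, so this is not Maria's model): a binary 0/1 condition, a latent variable with continuous indicators, and a binary outcome declared through the ordered argument.

```r
library(lavaan)

set.seed(1)
n <- 500

# Simulated data; every name and coefficient below is hypothetical.
cond  <- rbinom(n, 1, 0.5)            # binary exogenous condition (0/1)
eta   <- 0.5 * cond + rnorm(n)        # latent score, never given to lavaan
x1    <- eta + rnorm(n)               # continuous indicators of the latent variable
x2    <- 0.8 * eta + rnorm(n)
x3    <- 0.7 * eta + rnorm(n)
smoke <- as.integer(0.6 * eta + rnorm(n) > 0)  # binary outcome (0/1)

dat <- data.frame(cond, x1, x2, x3, smoke)

model <- '
  Motiv =~ x1 + x2 + x3   # latent variable with continuous indicators
  Motiv ~ cond            # binary exogenous predictor, entered as a 0/1 number
  smoke ~ Motiv           # binary outcome, modelled with a probit link
'

# Declaring the binary outcome as ordered makes lavaan estimate a threshold
# for it and switch to a categorical estimator.
fit <- sem(model, data = dat, ordered = "smoke")

summary(fit, standardized = TRUE)
```

In the output, the smoke ~ Motiv slope is a probit coefficient, and the binary outcome gets a threshold (smoke|t1) instead of an intercept. The binary condition needs no special declaration because it is exogenous.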

I’m not sure to what extent logistic regression has been implemented in lavaan; maybe there’s something more current than this!topic/lavaan/zDcHbOnxdRg. From what I can see it’s still not implemented:

library(lavaan)
mydata <- read.csv("")
mod <- "
admit ~ gre
"
fit <- sem(mod, data = mydata, link = "logit", ordered = "admit", estimator = "MML")

results in:

<0 x 0 matrix>
Error in lav_model_gradient_mml(lavmodel = lavmodel, GLIST = GLIST, THETA = THETA[[g]],  : 
  logit link not implemented yet; use probit
In addition: Warning messages:
1: In lav_options_set(opt) :
  lavaan WARNING: link will be set to “probit” for estimator = “MML”
2: In lav_model_lik_mml(lavmodel = lavmodel, THETA = THETA, TH = TH,  :
  lavaan WARNING: --- VETAx not positive definite

Maria Altendorf

May 6, 2019, 4:56:50 AM
to lavaan
Thank you Leslie for your time and consideration.

I have read that manual already, but unfortunately it did not help me specify my CFA model with a binary outcome. For now I have conducted the CFA without my (binary) outcome, just to test the latent constructs; however, this is not really the "proper" way to do it, I guess.

Maria Altendorf

May 6, 2019, 4:10:50 PM
to lavaan
Dear Christopher and others,

Thanks a lot for your response, too.

I also guess it is not possible to conduct a CFA with latent continuous variables and also a dichotomous variable (the outcome). :( The manual and all the examples, including those in the groups, lead me to this conclusion.

However, I thought that (as my model is based on a strong theory) I could just fit an SEM and add the a priori expected covariances between variables that I expect to correlate.
Then new issues come up...
First things first: below is a sketch of my model:

# Reached acceptable fit with latent variables, so I can use them in SEM

structuralmodel2 <- '
  # measurement model
  Relevance =~ RELEVA1 + RELEVA2 + RELEVA3
  Motivation =~ TSRQ1_1 + TSRQ3_1 + TSRQ6_1 + TSRQ8_1 + TSRQ12_1 + TSRQ14_1
  Attitude_pro =~ Ap_CON_1 + Ap_UIT_1 + Ap_GEZ_1 + Ap_VB_1 + Ap_TEV_1 + Ap_SCH_1
  Attitude_dis =~ Ac_ONT_1 + Ac_GEW_1 + Ac_VER_1 + Ac_SOM_1 + Ac_EEN_1 + Ac_ONZ_1
  Social_norm =~ SNPART_1 + SNKIND_1 + SNVRIE_1
  Social_support =~ SSPART_1 + SSKIND_1 + SSVRIE_1
  Selfefficacy =~ SEKWAA_1 + SESOM_1 + SESTRE_1 + SEETEN_1 + SEKOFF_1 + SEPAUZ_1 + SEFEES_1 + SEAANB_1 + SEGEN_1
  Intention =~ INTEN_2 + INTEN_1 + INTEN_3

  # residual covariances
  Ap_CON_1 ~~ Ap_GEZ_1
  SEAANB_1 ~~ SEGEN_1
  SEETEN_1 ~~ SEKOFF_1
  SEKWAA_1 ~~ SESOM_1
  SEKWAA_1 ~~ SESTRE_1
  TSRQ1_1 ~~ TSRQ3_1
  SESOM_1 ~~ SESTRE_1
  SEETEN_1 ~~ SEPAUZ_1

  # regressions
  Relevance ~ Content + Frame + CTFT
  Motivation ~ Relevance + Content + Frame + CTFT
  Attitude_pro ~ Motivation
  Attitude_dis ~ Motivation
  Social_norm ~ Motivation
  Social_support ~ Motivation
  Selfefficacy ~ Motivation
  Intention ~ Attitude_pro + Attitude_dis + Social_norm + Social_support
              + Selfefficacy + Motivation
  PP_FU1 ~ Intention + Attitude_pro + Attitude_dis + Social_norm + Social_support
           + Selfefficacy + Motivation
'

fit3 <- sem(structuralmodel2, data = data, estimator = "WLS", ordered = "PP_FU1")

summary(fit3, fit.measures = TRUE, standardized = TRUE)

Here I get an error message now: "Error in chol.default(S) : the leading minor of order 265 is not positive definite"



I appreciate any help possible!

Thank you in advance,
