How to obtain Covariance Matrix for categorical data for SEM/CFA


Priya Chetri

Oct 6, 2020, 2:05:02 PM
to lavaan
Dear All

I have a dataset in which all the relevant variables for my model (SEM) are categorical: all observed indicators in the measurement model are binary (yes/no, coded 1/0), and the main dependent variable in the regression model is ordinal. I have defined the class of these variables as factor.
I am trying to fit the SEM using the covariance matrix. I am able to get the polychoric/tetrachoric correlations (using the polycor package in lavaan), but because the variables are categorical, I cannot figure out how to obtain covariances from the polychoric correlations, which I compute one pair of variables at a time and then assemble into a lower-triangular correlation matrix.

Any hint/advice on this is much appreciated.
Looking forward to hearing from someone.

Kind regards 
Priya

Terrence Jorgensen

Oct 8, 2020, 8:03:24 AM
to lavaan
> I have defined the class of these variables as factor.

Their class should be "ordered", which inherits from "factor" but respects the order of the levels. Use the ordered() function instead of factor().
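As a minimal sketch of that conversion, assuming a small made-up data frame (the names dat, x1, x2, and y are hypothetical, not taken from the original data):

```r
# Hypothetical data: two binary items coded 0/1 and one ordinal outcome
dat <- data.frame(
  x1 = c(0, 1, 1, 0, 1),
  x2 = c(1, 1, 0, 0, 1),
  y  = c("low", "high", "medium", "low", "high")
)

# ordered() creates a factor whose levels have a defined order
dat$x1 <- ordered(dat$x1, levels = c(0, 1))
dat$x2 <- ordered(dat$x2, levels = c(0, 1))
dat$y  <- ordered(dat$y, levels = c("low", "medium", "high"))

class(dat$y)       # "ordered" "factor"
is.ordered(dat$x1) # TRUE
```

Alternatively, the variables can be left as they are and declared through the ordered= argument of lavaan's fitting functions, for example sem(model, data = dat, ordered = c("x1", "x2", "y")).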
 
> I am trying to fit the SEM using the covariance matrix.

It sounds like you have the raw data.  Why do you want a matrix of summary statistics?  If it is to report them, you can request the polychoric correlation matrix after fitting your hypothesized model:

lavInspect(fit, "sampstat")
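For context, here is a rough sketch of how such a fit object might be created from raw data. It uses simulated data as a stand-in, and the variable names, loadings, and model are assumptions made for illustration only:

```r
library(lavaan)

# Simulated stand-in for the real data (all names here are hypothetical)
set.seed(2020)
n   <- 500
eta <- rnorm(n)  # common latent factor
bin <- function(load) as.integer(load * eta + rnorm(n) > 0)
dat <- data.frame(
  x1 = bin(0.8), x2 = bin(0.7), x3 = bin(0.9), x4 = bin(0.6),
  y  = cut(0.5 * eta + rnorm(n), breaks = c(-Inf, -0.5, 0.5, Inf),
           labels = FALSE)
)

model <- '
  f =~ x1 + x2 + x3 + x4   # measurement model with binary indicators
  y ~ f                    # ordinal outcome regressed on the factor
'

# Declaring the variables as ordered makes lavaan treat them as categorical
fit <- sem(model, data = dat, ordered = names(dat))

ss <- lavInspect(fit, "sampstat")
ss$cov  # polychoric/tetrachoric correlations
ss$th   # estimated thresholds
```

With categorical indicators, the cov element of the returned list should hold the polychoric correlations, so its diagonal is 1.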

 
> I am able to get the polychoric/tetrachoric correlation (using the polycor package in lavaan)

That is a different package from lavaan. The lavaan package has its own procedure. See the ?lavCor help page.
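A small sketch of lavCor() on raw data, again with simulated binary items whose names are hypothetical. It returns the full matrix in one call, so there is no need to assemble pairwise estimates by hand:

```r
library(lavaan)

# Simulated binary items (hypothetical names) sharing one latent factor
set.seed(42)
n   <- 400
eta <- rnorm(n)
dat <- data.frame(
  x1 = as.integer(0.8 * eta + rnorm(n) > 0),
  x2 = as.integer(0.7 * eta + rnorm(n) > 0),
  x3 = as.integer(0.6 * eta + rnorm(n) > 0)
)

# Treat every column as ordered; for binary items the result is tetrachoric
pc <- lavCor(dat, ordered = names(dat))
pc
```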

 
> because the variables are categorical in nature, I am not able to figure out how can I extract covariance from the polychoric correlations obtained one-by-one for two categorical variables that makes it possible to generate one half triangular correlation matrix.

The lavInspect() syntax above already returns a correlation matrix. You cannot rescale it to a covariance matrix because categorical data have no inherent scale. The polychoric correlations are estimated on the assumption that each observed variable has a corresponding latent item response that is normally distributed. Because latent variables have no inherent scale, they are identified by fixing each SD to 1, so the polychoric correlation matrix is already the covariance matrix of those latent responses.
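To illustrate why the scale is arbitrary, here is a toy sketch (the threshold and scaling factor are made-up values): rescaling a latent response and its threshold by the same factor yields exactly the same observed categories, so the data cannot distinguish SD = 1 from any other SD.

```r
set.seed(1)
ystar <- rnorm(1000)  # latent response with SD = 1
tau   <- 0.5          # hypothetical threshold

# Observed binary item under SD = 1
y_a <- as.integer(ystar > tau)

# Same latent response rescaled to SD = 2, with the threshold rescaled too
y_b <- as.integer(2 * ystar > 2 * tau)

identical(y_a, y_b)   # TRUE: the observed data do not change
```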


Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam
