Model issues

Dae Meow

Dec 14, 2018, 8:02:05 AM
to lavaan
Hello all,

I'm sorry if this issue has been raised before and I am 'late to the party' as such, but I am new to SEM and rather uncertain how to proceed - I would really appreciate some advice!
I am making a model for my master's thesis; this is the code:
library(lavaan)

modely <- ' # regressions
            y ~ x1
            y ~ x2
            y ~ x3
            y ~ x4
            y ~ x5
            y ~ x6
            # residual variance of the outcome
            y ~~ y
            # variances of the predictors
            x1 ~~ x1
            x2 ~~ x2
            x3 ~~ x3
            x4 ~~ x4
            x5 ~~ x5
            x6 ~~ x6
            # covariances among the predictors
            x1 ~~ x2
            x1 ~~ x3
            x1 ~~ x4
            x1 ~~ x5
            x1 ~~ x6
            x2 ~~ x3
            x2 ~~ x4
            x2 ~~ x5
            x2 ~~ x6
            x3 ~~ x4
            x3 ~~ x5
            x3 ~~ x6
            x4 ~~ x5
            x4 ~~ x6
            x5 ~~ x6 '
modelyfit <- sem(modely, data = modeldata)

So basically we have six different predictors and one 'outcome', and we are trying to explore which predictors have the biggest effect on the 'outcome'. We have 179 samples.
The model seems to work fairly well, in that it gives outputs which are logical given our understanding of the system - but there are consistently 0 df. We think this is due to including all the covariances among the different predictors - but if we remove them, aren't we invalidating the model by ignoring things which matter? Or is there a test or general rule of thumb that could indicate which covariances may be unnecessary to include? Or perhaps another way entirely of increasing the df without resorting to this?
Thank you to anyone who has taken the time to read this, and extra thanks to anyone who may be able to help :)

Dan Laxman

Dec 14, 2018, 10:53:53 AM
to lav...@googlegroups.com

Dear Dae Meow,

As I understand it, allowing an error covariance between your indicator variables (e.g., x1 ~~ x2) is discouraged unless there is a reason to expect a pair of indicators to be related (i.e., they share a common cause or common source of error) beyond what is already accounted for by the latent variable. For example, if x1 and x2 measure a common aspect of the latent construct that is distinct from the other items (e.g., the items measure a specific aspect of depression), then you might consider allowing the error covariance between x1 and x2 to be freely estimated rather than constrained to 0. Likewise if they share a common stem or use wording not found in the other items.

Your model is currently specified such that every indicator has some common cause or source of error with every other indicator, beyond the common cause that is your latent variable. This is not a realistic model, and you've used up all your degrees of freedom by estimating every possible error covariance. I suspect several of them are not significant.

In terms of what error covariances to include, theory should be your primary guide as well as past empirical findings and common sense. Examining your modification indices and model residuals will help identify potential error covariances that you missed. In general, you want to keep your model as simple as possible (i.e., include as few error covariances as possible).

A good introductory text for SEM is Principles and Practice of Structural Equation Modeling (4th ed.) by Rex Kline.

https://www.guilford.com/books/Principles-and-Practice-of-Structural-Equation-Modeling/Rex-Kline/9781462523344

Best,

Dan

Daniel J. Laxman, PhD
Postdoctoral Fellow
Department of Human Development and Family Studies
Utah State University

Preferred email address:
Dan.J....@gmail.com

Office:
FCHD West (FCHDW) 001

Mailing address:
2705 Old Main Hill
Logan, UT 84322-2705

Terrence Jorgensen

Dec 15, 2018, 6:06:40 AM
to lavaan
> there are consistently 0 df. We think that this is down to including all the covariances of the different parameters - but if we remove these, then aren't we invalidating the model by ignoring things which matter? Or is there a test or general rule of thumb which could indicate which covariances may be unnecessary to include?

This is just a multiple regression model, with x1 through x6 predicting y.  In covariance-structure analysis (CSA), this model is saturated, and I don't see a reason why it shouldn't be.  In CSA, the number of observations is the number of unique elements in the covariance matrix your model tries to reproduce.  In OLS regression, this model would have df because the number of observations is the number of rows of data, and all the predictors are allowed to covary (just as in the path model you are estimating).
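To make that df arithmetic concrete, here is a short sketch (assuming the seven observed variables y, x1-x6 from the model in the original post) of why the model is saturated:

```r
# In CSA, the "observations" are the unique elements of the covariance matrix.
p <- 7                      # observed variables: y, x1-x6
moments <- p * (p + 1) / 2  # 7 * 8 / 2 = 28 unique (co)variances

# Free parameters in the model from the original post:
slopes  <- 6               # y ~ x1 ... y ~ x6
res_var <- 1               # y ~~ y
x_vars  <- 6               # x1 ~~ x1 ... x6 ~~ x6
x_covs  <- choose(6, 2)    # 15 covariances among the predictors
params  <- slopes + res_var + x_vars + x_covs  # 28 in total

moments - params           # 0 degrees of freedom: the model is saturated
```

Every unique moment is matched by a free parameter, so the model reproduces the observed covariance matrix exactly and there is nothing left over to test.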

The general rule of thumb for exogenous predictors in an SEM is that they should be allowed to freely covary, unless you know they are zero by design (e.g., orthogonal contrast codes to represent grouping variables are uncorrelated in balanced designs, so those covariances can be fixed to zero because your design ensures it).

> Or perhaps another way entirely of increasing the df without resorting to this?

In OLS regression, the df represent ways that your model can fail to reproduce the observed values of the outcome variable y.  But in CSA, the df represent the number of ways that your model can fail to reproduce the observed covariance matrix.  They are restrictions on your model, so they should only be applied and tested if you have a theoretical reason to do so or if you have null hypotheses you want to test.  For example, if you are interested in whether a set of predictors has no effect on y, you can keep those variables in the model (still allowing all Xs to covary with each other) but fix their slopes on y to zero, then compare that to the saturated model using lavTestLRT().  If you only constrain one slope to zero, that test is equivalent to its Wald z test in the summary() output.
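A minimal sketch of that comparison in lavaan (assuming the modeldata object and variable names from the original post; here the restriction fixes the slope of x6 to zero via the 0* modifier):

```r
library(lavaan)

# Saturated model: all six predictors with freely estimated slopes
fit_full <- sem('y ~ x1 + x2 + x3 + x4 + x5 + x6', data = modeldata)

# Restricted model: fix the slope of x6 to zero (H0: x6 has no effect on y),
# while keeping x6 in the model so the predictors still covary
fit_restricted <- sem('y ~ x1 + x2 + x3 + x4 + x5 + 0*x6', data = modeldata)

# Likelihood-ratio test of the restriction; with a single constrained slope
# this is equivalent to the Wald z test for x6 in summary(fit_full)
lavTestLRT(fit_full, fit_restricted)
```

The restricted model gains 1 df, and the chi-square difference tests whether that restriction is tenable.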

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Dan Laxman

Dec 15, 2018, 9:25:44 AM
to lav...@googlegroups.com

Thank you, Dr. Jorgensen, for replying to this. Dae Meow, please ignore my response. I misread your code to indicate x1 - x6 were indicator variables for latent variable y rather than predictors of an observed y (=~ vs. ~). My apologies.

Best,

Dan
