fixed.x option and missing values in exogenous variables

CSM

Feb 8, 2019, 12:12:43 PM
to lavaan

Dear lavaan users,

I have a small question about the ‘fixed.x’ option. I want to specify a path model in which one of the observed variables, X, has about 30% missing values, because I was not collecting this variable at the beginning of my study. I would still really like to include it in the model. Given the high proportion of missing values in X and some deviations from normality in other variables, I think the following options are recommended:

 

mod<-sem(form, missing="fiml", estimator="MLR", data=my.data)

 

The problem is that I would like to include X as an exogenous variable, and lavaan then assumes fixed.x=TRUE by default. As a result, about 30% of the cases are "deleted" from the analysis. So I have two options: force "fixed.x=FALSE" or make X endogenous. Which is the better approach? And why are missing values in exogenous variables problematic?
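For illustration, the first option can be sketched as follows. This is a minimal sketch on simulated data: the model `form`, the variables M and Y, and the data frame are hypothetical stand-ins, not the model from this post. It compares the number of cases lavaan actually uses with the default fixed.x=TRUE and with fixed.x=FALSE.

```r
library(lavaan)

# Hypothetical data: X is exogenous, M and Y are endogenous (assumed names)
set.seed(1)
n <- 300
X <- rnorm(n)
M <- 0.5 * X + rnorm(n)
Y <- 0.4 * M + 0.3 * X + rnorm(n)
my.data <- data.frame(X = X, M = M, Y = Y)
my.data$X[sample(n, 90)] <- NA  # about 30% missing values on X

# Hypothetical path model
form <- '
  M ~ X
  Y ~ M + X
'

# Default (fixed.x = TRUE): cases with missing X are expected to be dropped
fit.fixed <- sem(form, data = my.data, missing = "fiml", estimator = "MLR")

# fixed.x = FALSE: X is treated as random, so its mean and variance are
# estimated and incomplete cases can contribute to the likelihood
fit.free <- sem(form, data = my.data, missing = "fiml", estimator = "MLR",
                fixed.x = FALSE)

# Number of observations used in each analysis
lavInspect(fit.fixed, "nobs")
lavInspect(fit.free, "nobs")
```

With fixed.x=FALSE the model makes a distributional assumption about X (normality under ML), which is the price for keeping the incomplete cases.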

 

Best,

csm

Edward Rigdon

Feb 8, 2019, 2:02:16 PM
to lav...@googlegroups.com
"Fixed.x" here implies set and not observed or sampled. That is why a fixed.x variable has no sampling variance. So if it was set by the researcher, how is it the researcher does not know its value? If it was only observed or sampled, then fixed.x is incorrect.


CSM

Feb 8, 2019, 7:20:39 PM
to lavaan
Thank you very much, Prof. Edward Rigdon, for the quick answer. The variable was observed (measured), but only for about 70% of the whole sample, due to some technical problems at the beginning of the study. In this case, you write that 'fixed.x is incorrect'. Do you mean 'fixed.x=FALSE' is incorrect? That is, if I include that variable as exogenous, should the model be restricted to that 70%?

Benedikt Heuckmann

Feb 11, 2019, 5:20:37 AM
to lavaan

Hi CSM,
I posted about a similar issue some time ago (https://groups.google.com/forum/#!topic/lavaan/XpwzUh4h3gs). We had 33% missing data in the exogenous variables due to a 3-form planned missingness design. What helped us was to impute the data first (to account for the missing data in the exogenous variables) and then run the SEM analysis on the multiply imputed data sets. We used the 'mice' package for the imputation and then passed the imputed data sets to runMI() from the semTools package.
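The two steps could look roughly like the sketch below. It uses simulated data and a hypothetical model (`form`, M, and Y are illustrative names, not our actual analysis), and it assumes a semTools version that still exports runMI(); more recent releases move this functionality to the separate lavaan.mi package, so the call may need adapting.

```r
library(mice)
library(semTools)  # assumes runMI() is available in the installed version

# Hypothetical data with about 30% missing values on the exogenous X
set.seed(1)
n <- 300
X <- rnorm(n)
M <- 0.5 * X + rnorm(n)
Y <- 0.4 * M + 0.3 * X + rnorm(n)
my.data <- data.frame(X = X, M = M, Y = Y)
my.data$X[sample(n, 90)] <- NA

# Hypothetical path model
form <- '
  M ~ X
  Y ~ M + X
'

# Step 1: multiple imputation with mice (20 imputed data sets)
imp <- mice(my.data, m = 20, printFlag = FALSE, seed = 123)

# Collect the completed data sets in a plain list of data frames
imp.list <- lapply(seq_len(imp$m), function(i) mice::complete(imp, i))

# Step 2: fit the model to each imputed data set and pool the results
fit.mi <- runMI(form, data = imp.list, fun = "sem", estimator = "MLR")
summary(fit.mi)
```

Because the imputed data sets are complete, fixed.x can stay at its default here; the uncertainty about the missing X values is carried by the pooling across imputations.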

HTH, Benedikt

CSM

Feb 11, 2019, 7:52:40 AM
to lavaan
Thank you, Benedikt!