challenges running SEM with missing data

Jun 13, 2018, 3:18:22 AM
to lavaan
I am quite new to R and the lavaan package. I am trying to run the SEM model below but keep getting errors about missing data. How can I change the arguments so that lavaan handles missing values with a method such as listwise deletion?

> model.sem <- '
+ ###Measurement models
+ Malnutrition =~ WAZ + HAZ +WAH
+ Immediate_causes =~ morb + month_bf + num_semifood
+ Underlying_causes =~ bmi + birth_weight + bord
+ Basic_causes =~ HH_members + w_index +m_educa
+ ### Regression
+ Malnutrition ~ Immediate_causes + Underlying_causes + Basic_causes
+ Immediate_causes ~ Underlying_causes + Basic_causes
+ Underlying_causes ~ Basic_causes
+ ###Residual correlation
+ WAZ ~~ HAZ + WAH
+ morb ~~ month_bf + num_semifood
+ bmi ~~ birth_weight + bord
+ HH_members ~~ w_index + m_educa
+ '
> fitsem <- sem(model.sem, data=data4R, missing = "listwise")
Error in lav_data_full(data = data, group = group, cluster = cluster,  : 
  lavaan ERROR: missing observed variables in dataset: num_semifood

Edward Rigdon

Jun 13, 2018, 7:37:10 AM
This is not a missing data problem. The error message says that the variable num_semifood is not present in the data frame data4R. Go back and check the data frame. Remember that R is case-sensitive AlwAys.
You are going to have more problems as well, but solve this problem first.
Ed Rigdon
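
A quick way to run the check described above, as a minimal base R sketch. The data frame `data4R_toy` and its values are invented stand-ins for the poster's `data4R`, and the capitalized column name is only an assumed example of a case mismatch; the thread does not say what the actual misnaming was.

```r
# Hypothetical stand-in for data4R: one column differs from the model only by case.
data4R_toy <- data.frame(
  morb         = c(0, 1, 1),
  month_bf     = c(6, 12, 9),
  Num_semifood = c(2, 3, 1)   # assumed typo: capital "N"
)

# Observed variables that the model syntax refers to (a subset, for brevity).
needed <- c("morb", "month_bf", "num_semifood")

# Names used in the model but absent from the data frame (comparison is case-sensitive).
missing_vars <- setdiff(needed, names(data4R_toy))
print(missing_vars)

# Case-insensitive lookup to spot near-matches that differ only by capitalization.
near_matches <- names(data4R_toy)[tolower(names(data4R_toy)) %in% tolower(missing_vars)]
print(near_matches)
```

With the real data, the same `setdiff()` call on `names(data4R)` should list every variable that lavaan would report as missing.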

Jun 16, 2018, 3:03:33 AM
to lavaan
Thanks. I later figured out that I had misnamed the variable, and now the model runs, but it is not identified. What can I do to make it identified?

 fitsem <- sem (model.sem, data=data4R)
Warning messages:
1: In lav_data_full(data = data, group = group, cluster = cluster,  :
  lavaan WARNING: some observed variances are (at least) a factor 1000 times larger than others; use varTable(fit) to investigate
2: In lav_model_vcov(lavmodel = lavmodel, lavsamplestats = lavsamplestats,  :
  lavaan WARNING: could not compute standard errors!
  lavaan NOTE: this may be a symptom that the model is not identified.

3: In lav_object_post_check(object) :
  lavaan WARNING: some estimated ov variances are negative
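
The first warning above concerns the scale of the observed variables. The kind of check it points to can be sketched in base R as follows. The data frame `toy`, its values, and the assumption that `birth_weight` is recorded in grams are all invented for illustration; the thread does not state the units in `data4R`.

```r
# Invented stand-in for a few columns of data4R.
toy <- data.frame(
  birth_weight = c(2900, 3100, 3500, 2700, 3300),  # assumed to be in grams
  bmi          = c(21.5, 24.0, 19.8, 26.1, 22.7),
  WAZ          = c(-1.2, 0.3, -0.5, -2.1, 0.8)
)

# Ratio of the largest to the smallest sample variance.
vars_before  <- sapply(toy, var)
ratio_before <- max(vars_before) / min(vars_before)

# Rescale the large-variance variable (grams to kilograms) and recompute.
toy$birth_weight <- toy$birth_weight / 1000
vars_after  <- sapply(toy, var)
ratio_after <- max(vars_after) / min(vars_after)

print(ratio_before > 1000)  # TRUE for these toy values
print(ratio_after > 1000)   # FALSE after rescaling
```

Rescaling a variable by a constant changes only its units, so the rescaled column can be used in the model in place of the original.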

Edward Rigdon

Jun 16, 2018, 5:31:02 AM
You cannot free all residual covariances for all of the indicators of each factor. The common factor is supposed to account for covariances among indicators.  If you comment out that section of your code, I think this will run.
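
One way to see the point above is a simple parameter count for a single three-indicator factor taken in isolation. This is only a heuristic sketch: the counting rule is a necessary condition, not a sufficient one, and in the full model each factor also draws on its covariances with the other factors' indicators. The count assumes lavaan's default scaling (first loading fixed to 1) and the two residual covariances per factor that appear in the posted model.

```r
# Counting rule for ONE factor with three indicators, considered on its own.
n_indicators <- 3

# Unique variances and covariances among the indicators.
n_moments <- n_indicators * (n_indicators + 1) / 2

# Free parameters without residual covariances (first loading fixed to 1).
n_free_loadings   <- n_indicators - 1
n_factor_variance <- 1
n_resid_variances <- n_indicators
n_base <- n_free_loadings + n_factor_variance + n_resid_variances

# The posted model adds two residual covariances per factor,
# e.g. WAZ ~~ HAZ + WAH.
n_resid_covariances <- 2
n_with_covs <- n_base + n_resid_covariances

df_base      <- n_moments - n_base
df_with_covs <- n_moments - n_with_covs

cat("df without residual covariances:", df_base, "\n")
cat("df with residual covariances:   ", df_with_covs, "\n")
```

On its own, the factor is just-identified before the residual covariances are added and has more free parameters than moments afterwards, which is consistent with the advice to drop that section of the syntax.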

Jun 16, 2018, 6:27:00 AM
to lavaan
Thanks for the guidance. It ran perfectly.


Jun 16, 2018, 6:46:05 AM
to lavaan
Not that I am an expert, but you should have theory backing up your model. Right now it seems like you're just fitting a model that you hope will 'run' or converge. To my knowledge, that makes no sense. First you have to think about why, for example, the measurement errors of indicators should or shouldn't be correlated, and based on that you should run the model with or without those correlations. If your model fits badly and you simply modify it until it seems to give a good fit, without understanding why certain relations should exist, the result is meaningless.

Jun 18, 2018, 7:36:53 AM
to lavaan
Yes, I tried different combinations of the residual covariances to improve my model fit. Thanks, it improved my model.


Jun 18, 2018, 1:49:12 PM
to lavaan
You missed my point. I meant that you shouldn't do that: you shouldn't try different combinations of the residual covariances to improve model fit unless you have theoretical justification to do so. 

The idea of CFA is that you test a theory that you think makes sense. 