"lavaan WARNING: some models are based on a different set of observed variables"


Kara Weisman

Jan 16, 2016, 10:42:46 PM1/16/16
to lavaan
Hi all,

I'm doing a confirmatory factor analysis on a dataset with 405 observations of 40 variables.

I'd like to be able to compare models that make use of more or fewer of these variables, e.g.:

model1 <- 'F1 =~ A + B + C + D + E + F + G + H + I + J + K'
model2 <- 'F1 =~ A + B + C + D + E + F
F2 =~ G + H + I + J + K'
model3 <- 'F1 =~ A + B + C + D
F2 =~ E + F + G
F3 =~ H + I + J + K'
model4 <- 'F1 =~ A + B + D
F2 =~ E + G
F3 =~ I + J + K'

In reality, model1 has one factor that uses all 40 items; model2 has 2 factors that use all 40 items between them; model3 has 3 factors that use all 40 items between them; and model4 has 3 factors that use only 16 items between them.
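In case it helps, here is roughly how I'm fitting and comparing them (a sketch: `d` stands in for my actual 405 x 40 data frame, and I'm using `cfa()` with its defaults):

```r
library(lavaan)

# Fit each CFA with lavaan defaults (ML estimation; first loading fixed to 1).
# 'd' is a stand-in for my actual data frame of 405 observations x 40 variables.
fit1 <- cfa(model1, data = d)
fit2 <- cfa(model2, data = d)
fit3 <- cfa(model3, data = d)
fit4 <- cfa(model4, data = d)

# Chi-square difference (likelihood-ratio) comparison of the fitted models:
anova(fit1, fit2, fit3, fit4)
```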

When I compare only model1, model2, and model3 (using anova(model1, model2, model3)), I find that model1 fits significantly worse than model3, which is not significantly worse than model2.  So far, it looks like model3 is the way to go.

When I add model4, it looks like the best-fitting previous model fits significantly worse than model4.  This would make sense to me, but I'm worried about a warning that this command throws: 

> anova(d1_all_fit1, d1_all_fit2, d1_all_fit3, d1_all_fit4)
Chi Square Difference Test

             Df   AIC   BIC   Chisq Chisq diff Df diff Pr(>Chisq)    
d1_all_fit4 101 23230 23370  619.03                                  
d1_all_fit2 734 56251 56595 5966.77     5347.7     633     <2e-16 ***
d1_all_fit3 737 55341 55673 5062.83     -903.9       3          1    
d1_all_fit1 740 56681 57002 6409.26     1346.4       3     <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Warning message:
In lavTestLRT(object = <S4 object of class "lavaan">, SB.classic = TRUE,  :
  lavaan WARNING: some models are based on a different set of observed variables

I assume that this warning comes from the fact that model4 makes use of only 16 of the 40 observed variables, when the other models make use of all 40.  

How big of a problem is this?  Are these models impossible to compare?  E.g., is there something about a simpler model (using fewer observed variables) that will mathematically make it a better fit?  (My intuition is that the reverse would be true, if anything!)

Thanks so much for your help!  Happy to post more of my actual input/output if that would be helpful.

Best wishes,
Kara

Edward Rigdon

Jan 17, 2016, 4:20:16 PM1/17/16
to lav...@googlegroups.com
Kara--
     There are multiple problems within these results.  First, yes, using different observed variables in different models makes the chi-square difference test results uninterpretable.  This test requires that the models be nested, one a special case of the next.  Models with different observed variables are not nested.  Even in terms of information criterion indices like AIC, it could be hard to derive value from a comparison of one model with 40 observed variables vs. another with 16.  The power to reject the 40-variable model will generally be much, much higher.
    There is another problem.  Your anova output shows that your Model2 with 2 factors has fewer DF than your Model3 with 3 factors.  Model2 should be more constrained and have higher DF: I would very much expect DF(Model1) > DF(Model2) > DF(Model3).  If there are other special features of Model2 not shared with Model3, then those two models also are not nested.  Notice that the Model3 chi-square difference is *negative*.  This can happen with chi-square differences for nested models from more exotic estimators, but not with ML chi-squares.  I think you may have a syntax error in Model2 and/or Model3.
     There is a bit of literature which argues that models with different numbers of factors may not be suited to chi-square difference tests, because they tend to involve parameter estimates constrained at their boundary.  Such discontinuities make the usual chi-square difference p-value untrustworthy.
--Ed Rigdon
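A concrete illustration of the nesting requirement Ed describes (a hypothetical sketch; `d` stands in for the data): with the *same* set of indicators, a one-factor model can be written as a two-factor model whose factor correlation is fixed to 1, so those two models are nested and can legitimately be compared with a chi-square difference test.

```r
library(lavaan)

# Two correlated factors over the same 11 indicators:
m2 <- 'F1 =~ A + B + C + D + E + F
       F2 =~ G + H + I + J + K'

# One-factor model expressed as a constrained version of m2
# (factor correlation fixed to 1), so the models are nested:
m1 <- 'F1 =~ A + B + C + D + E + F
       F2 =~ G + H + I + J + K
       F1 ~~ 1*F2'

# std.lv = TRUE identifies the models by fixing factor variances to 1,
# which makes the F1 ~~ F2 parameter a correlation.
fit2 <- cfa(m2, data = d, std.lv = TRUE)
fit1 <- cfa(m1, data = d, std.lv = TRUE)

# Valid chi-square difference test: same observed variables, nested models.
# (Per Ed's caveat: a correlation fixed at 1 sits on the boundary of the
# parameter space, so the usual p-value should be read with caution.)
anova(fit1, fit2)
```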

--
You received this message because you are subscribed to the Google Groups "lavaan" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lavaan+un...@googlegroups.com.
To post to this group, send email to lav...@googlegroups.com.
Visit this group at https://groups.google.com/group/lavaan.
For more options, visit https://groups.google.com/d/optout.

Kara Weisman

Jan 17, 2016, 4:29:05 PM1/17/16
to lavaan
Ed, thanks very much.  I don't know what I was thinking - I should have known the models need to be nested!  I think this actually solves most of my problems, including the unexpected differences in the DFs in Model2 vs Model3.  (Thanks for noticing that!)  These are just 4 totally different models with no real relations to each other, and I now see that comparing chi-squares doesn't work for that situation.

I really appreciate it.  Will go try to build some nested models worth testing...

- Kara

Edward Rigdon

Jan 17, 2016, 4:41:12 PM1/17/16
to lav...@googlegroups.com
For model comparisons of nonnested models, consider the encompassing principle.  In this approach to comparing Model1 and Model2, you create a supermodel within which both models are nested, and then proceed by comparing the component models to the supermodel.  See http://www.tandfonline.com/doi/abs/10.1080/00273170701329112#.VpwKR_krLrc
--Ed Rigdon


Kara Weisman

Jan 17, 2016, 9:24:46 PM1/17/16
to lavaan
Thanks again!  Looking forward to playing around with this.

Yves Rosseel

Jan 18, 2016, 3:15:44 AM1/18/16
to lav...@googlegroups.com
On 01/17/2016 10:41 PM, Edward Rigdon wrote:
> For model comparisons of nonnested models, consider the encompassing
> principle. In this approach to comparing Model1 and Model2, you create
> a supermodel within which both models are nested , and then proceed by
> comparing the component models to the supermodel. See
> http://www.tandfonline.com/doi/abs/10.1080/00273170701329112#.VpwKR_krLrc
> --Ed Rigdon

Another approach is to use the Vuong test. See:

http://quantpsy.org/pubs/merkle_you_preacher_%28in.press%29.pdf

and the corresponding R package 'nonnest2' (which also supports lavaan):

https://cran.r-project.org/web/packages/nonnest2/

Yves.
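The nonnest2 usage Yves points to is brief (a sketch: `d` is a stand-in data frame, and `model3`/`model4` are the non-nested model syntaxes from the original post):

```r
library(lavaan)
library(nonnest2)

# Fit the two non-nested models ('d' is a stand-in for the real data):
fit3 <- cfa(model3, data = d)
fit4 <- cfa(model4, data = d)

# Vuong (1989) tests: first, whether the models are distinguishable;
# second, whether one fits significantly better than the other.
vuongtest(fit3, fit4)

# Confidence intervals on the AIC/BIC differences between the models:
icci(fit3, fit4)
```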

Kara Weisman

Jan 18, 2016, 11:54:40 AM1/18/16
to lav...@googlegroups.com
Oh, very interesting, and very easy to use.  I'll have to read more about the supermodel option vs. the Vuong test.  I really appreciate the suggestions from both of you.

And thanks so much, Yves, for lavaan!  

