Use lavTables(fit) output to check model fit

Spring

Sep 5, 2018, 12:16:24 AM
to lavaan
Hi there, I'm fitting a CFA model with categorical data. Apart from the overall fit indices, I was told to look at the residuals, and then I found the following advice "if you have categorical indicator variables, you’ll want to look at expected vs. observed counts in each level instead of residual correlations. You can get that with lavTables(fit)".

So, I looked at the output of lavTables(fit), which gives a two-way table showing the frequencies. But how am I supposed to use this to evaluate model fit? How large should the statistics (like X2 and G2) be? What rule should I use?

Also, when I include p.values in the function, like  lavTables(fit, 2L, stat="G2", p.value=TRUE), it doesn't provide a p-value. Any idea why?

Many thanks!!

Yves Rosseel

Sep 16, 2018, 2:22:34 PM
to lav...@googlegroups.com
On 9/5/18 6:16 AM, Spring wrote:
> Hi there, I'm fitting a CFA model with categorical data. Apart from the
> overall fit indices, I was told to look at the residuals, and then I
> found the following advice "if you have categorical indicator variables,
> you’ll want to look at expected vs. observed counts in each level
> instead of residual correlations. You can get that with lavTables(fit)".
>
> So, I looked at the output of lavTables(fit), which gives a two-way
> table showing the frequencies. But how am I supposed to use this to
> evaluate model fit? How large should the statistics (like X2 and G2) be?
> What rule should I use?

Strictly speaking, X2 or G2 values larger than 3.0 are already 'large'.
But I would first look at the 'largest' values.

A discussion of this can be found in, e.g.,

Jöreskog, K. G., & Moustaki, I. (2001). Factor analysis of ordinal
variables: A comparison of three approaches. Multivariate
Behavioral Research, 36, 347-387.
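
To make the 3.0 rule of thumb concrete, here is a minimal base-R sketch of the usual per-cell Pearson (X2) and likelihood-ratio (G2) contributions. The proportions are made up for illustration, and it is an assumption here that lavTables computes its cell statistics in exactly this form:

```r
# Hypothetical 2 x 2 bivariate table (made-up numbers, not from a real fit)
n        <- 1250                         # sample size
obs.prop <- c(0.40, 0.10, 0.15, 0.35)    # observed cell proportions
est.prop <- c(0.38, 0.12, 0.17, 0.33)    # model-implied cell proportions

# Per-cell contributions (standard definitions; assumed to match lavTables)
X2 <- n * (obs.prop - est.prop)^2 / est.prop
G2 <- 2 * n * obs.prop * log(obs.prop / est.prop)

cells <- data.frame(obs.prop, est.prop, X2 = round(X2, 2), G2 = round(G2, 2))
cells$flag <- X2 > 3.0                   # cells above the 3.0 rule of thumb
print(cells)
```

A single cell's G2 contribution can be negative; only the sum over the whole table is guaranteed to be non-negative, which is one reason to start with the largest values.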


> Also, when I include p.values in the function, like lavTables(fit, 2L,
> stat="G2", p.value=TRUE), it doesn't provide a p-value. Any idea why?

Yes: this is a bug (well, more an oversight). Will fix this soon.

Yves.

Yves Rosseel

Sep 18, 2018, 12:18:14 PM
to lav...@googlegroups.com
> Also, when I include p.values in the function, like lavTables(fit, 2L,
> stat="G2", p.value=TRUE), it doesn't provide a p-value. Any idea why?

Looking at this again: there should be no p.value here, as we are
looking at a single cell (each row is a cell).

lavTables(fit, 2L, stat="G2", type = "table", p.value=TRUE)

gives the G2 statistic for each table, including a p-value.
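
A self-contained sketch of both calls, assuming the lavaan package is installed. The dichotomized Holzinger-Swineford items and the two-factor model are placeholders chosen only to obtain categorical indicators; they are not the model discussed in this thread:

```r
library(lavaan)

# Placeholder data: median-split six items of the built-in example data set
items <- paste0("x", 1:6)
dat <- HolzingerSwineford1939[, items]
dat[] <- lapply(dat, function(x) as.integer(x > median(x)))

# Placeholder two-factor model
model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
'
fit <- cfa(model, data = dat, ordered = items)

# One row per cell of each bivariate table: no p-value at this level
cells <- lavTables(fit, dimension = 2L, type = "cells", statistic = "G2")
head(cells[order(cells$G2, decreasing = TRUE), ])   # largest cells first

# One row per bivariate table: G2 per table, with a p-value
tabs <- lavTables(fit, dimension = 2L, type = "table",
                  statistic = "G2", p.value = TRUE)
print(tabs)
```

Sorting the cell-level output puts the worst-fitting cells at the top, which is usually a more practical starting point than scanning every table.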

Yves.

--
Yves Rosseel -- http://www.da.ugent.be
Department of Data Analysis, Ghent University
http://lavaan.org

Spring

Sep 24, 2018, 5:54:37 AM
to lavaan
Thanks Yves! Your reply and the reference are very helpful. I got the result of the LR stats and LR-fit, but have a few more questions now.
I hope you can provide some guidance. Thank you in advance!

(1) Does using the LR statistics and LR-fit to judge fit place any requirements on the estimator?
Currently I use DWLS, and I get the warning "estimator DWLS is not using full information while est.prop is using full information" when I request lavTables(fit, type="pattern").

(2) If the chi-square tests are significant for both the LR and GF statistics, should the model be rejected as misspecified? Can sparseness be used to justify accepting the model?
My model has 16 dichotomous items on 5 factors, and the sample size is 1250. So there are a large number of empty cells, and sparseness strongly distorts the statistics.
I understand from your paper that we can sum the contributions over only those response patterns whose expected frequency exceeds a certain value v. Yet the test remains significant when I increase v to 1, 3, and 5 (at which point the df becomes negative).

(3) For LR-fit, do we need to report the full matrix of bivariate fits, or is it sufficient to report the overall LR-fit (the average over all cells)?
For my model, all LR-fit values are well below the threshold of 4 (most are below 2). So I wonder whether I can keep the model even though the LR statistics fail the test (see question 2 above).