Residual covariance matrix

rlan...@u.uchile.cl

Mar 13, 2019, 10:44:07 AM

to lavaan

Hi

I'm doing SEM in R. I am running a CFA with categorical (specifically ordinal) indicators: 8 indicators for one factor, so I have to use lavTables(). Of the 28 results from lavTables(), 27 have a g2.pvalue < .05 (which means a bad fit), but my CFI and TLI > .95, RMSEA < .08, and SRMR < .08 are good. (The chi-square and chi-square/df = 4.98 are not good, but that may be due to the sample size.)

If these G2 values can be thought of as components that (together) give the total model chi-square test statistic, that could explain the bad G2 results, and the cause could be my sample size (530). How should I interpret that? Is the residual covariance matrix not interpretable at my sample size, or is there another option?

Regards!

Terrence Jorgensen

Mar 21, 2019, 6:35:13 AM

to lavaan

> so I have to use lavTables()

To do what?

> Of the 28 results from lavTables(), 27 have a g2.pvalue < .05 (which means a bad fit), but my CFI and TLI > .95, RMSEA < .08, and SRMR < .08 are good. (The chi-square and chi-square/df = 4.98 are not good, but that may be due to the sample size.)

The fit statistics in summary() output refer to how well your model reproduces the estimated polychoric correlation matrix (by comparing it to the model-implied polychoric correlation matrix); they say nothing about reproducing the actual observed data.

When you pass your model to lavTables(), the stats will test how closely your hypothesized model can reproduce the observed proportions in 1-way or 2-way tables, by comparing observed proportions to model-implied proportions. It is common that models can reproduce polychorics quite closely, yet the same model implies tables that differ from the observed data.

> If these G2 values can be thought of as components that (together) give the total model chi-square test statistic

They are not cumulative (at least, not in that way).

> that could explain the bad G2 results, and the cause could be my sample size (530). How should I interpret that?

The same as in any other scenario. A large sample size makes it easy to detect differences from H0 even when the effect size is small. You can accompany the significance tests with measures of effect size appropriate for contingency-table analysis (see Agresti's book on categorical data analysis for ideas, or search online, or ask on CrossValidated). It should not be difficult to find something you can express as a function of the obs.prop and est.prop provided by lavTables(), or of the statistic itself (e.g., Phi or Cramér's V).
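A minimal sketch of that idea, written in Python rather than R and using made-up proportions (not real lavTables() output). It assumes only that obs.prop and est.prop are the observed and model-implied cell proportions of one 2-way table; the function names are hypothetical. Cohen's w is one effect size that can be computed directly from the two sets of proportions, and Cramér's V can be computed from the Pearson statistic.

```python
import math

def pearson_x2(obs_prop, est_prop, n):
    """Pearson X2 = N * sum((o - e)^2 / e) over the cells of one table."""
    return n * sum((o - e) ** 2 / e for o, e in zip(obs_prop, est_prop))

def g2(obs_prop, est_prop, n):
    """Likelihood-ratio G2 = 2 * N * sum(o * ln(o / e)); empty cells add 0."""
    return 2.0 * n * sum(
        o * math.log(o / e) for o, e in zip(obs_prop, est_prop) if o > 0
    )

def cohens_w(obs_prop, est_prop):
    """Cohen's w = sqrt(X2 / N): the discrepancy with sample size removed."""
    return math.sqrt(sum((o - e) ** 2 / e for o, e in zip(obs_prop, est_prop)))

def cramers_v(x2, n, n_rows, n_cols):
    """Cramér's V = sqrt(X2 / (N * (min(rows, cols) - 1)))."""
    return math.sqrt(x2 / (n * (min(n_rows, n_cols) - 1)))

# Hypothetical 2 x 2 table, cells flattened row by row.
obs_prop = [0.30, 0.20, 0.20, 0.30]   # stands in for obs.prop
est_prop = [0.25, 0.25, 0.25, 0.25]   # stands in for est.prop

w = cohens_w(obs_prop, est_prop)
x2_n530 = pearson_x2(obs_prop, est_prop, 530)
x2_n100 = pearson_x2(obs_prop, est_prop, 100)
g2_n530 = g2(obs_prop, est_prop, 530)
v = cramers_v(x2_n530, 530, 2, 2)

print(f"w = {w:.3f}, V = {v:.3f}")
print(f"X2 (N=530) = {x2_n530:.2f}, X2 (N=100) = {x2_n100:.2f}")
print(f"G2 (N=530) = {g2_n530:.2f}")
```

The effect size depends only on the proportions, while the test statistics grow in proportion to N, so the same discrepancy becomes "more significant" in a larger sample.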

> Is the residual covariance matrix not interpretable at my sample size, or is there another option?

Do you mean resid(fit, type = "cor")? That has nothing to do with N, but keep in mind it tells you how your model fails to reproduce the estimated polychorics, not how it fails to reproduce the observed data.

Terrence D. Jorgensen

Assistant Professor, Methods and Statistics

Research Institute for Child Development and Education, the University of Amsterdam

http://www.uva.nl/profile/t.d.jorgensen

Rodrigo Landabur

Mar 23, 2019, 5:01:44 PM

to lav...@googlegroups.com

Thanks for the response!!!

On Thu, 21 Mar 2019 at 07:35, Terrence Jorgensen <tjorge...@gmail.com> wrote:

> > so I have to use lavTables()

> To do what?

To see whether there is a difference between the observed and the model-implied data, because that gives me an idea of the fit to my data. (If I had continuous data, I would use the resid() function and look for any value greater than .1.)

> > Of the 28 results from lavTables(), 27 have a g2.pvalue < .05 (which means a bad fit), but my CFI and TLI > .95, RMSEA < .08, and SRMR < .08 are good. (The chi-square and chi-square/df = 4.98 are not good, but that may be due to the sample size.)

> The fit statistics in summary() output refer to how well your model reproduces the estimated polychoric correlation matrix (by comparing it to the model-implied polychoric correlation matrix); they say nothing about reproducing the actual observed data.

OK, but the SRMR index is a measure of residual covariance, so if my SRMR is good, the residual covariances have to be good too. Am I right?

Also, I do not understand the difference between the estimated and the model-implied polychoric correlation matrix. Can you help me?

> When you pass your model to lavTables(), the stats will test how closely your hypothesized model can reproduce the observed proportions in 1-way or 2-way tables, by comparing observed proportions to model-implied proportions. It is common that models can reproduce polychorics quite closely, yet the same model implies tables that differ from the observed data.

Yes, but my lavTables() results are bad while my SRMR is good. I think that is a contradiction.

> > If these G2 values can be thought of as components that (together) give the total model chi-square test statistic

> They are not cumulative (at least, not in that way).

Ok!

> > that could explain the bad G2 results, and the cause could be my sample size (530). How should I interpret that?

> The same as in any other scenario. A large sample size makes it easy to detect differences from H0 even when the effect size is small. You can accompany the significance tests with measures of effect size appropriate for contingency-table analysis (see Agresti's book on categorical data analysis for ideas, or search online, or ask on CrossValidated). It should not be difficult to find something you can express as a function of the obs.prop and est.prop provided by lavTables(), or of the statistic itself (e.g., Phi or Cramér's V).

Thanks. I will look into it.

> > Is the residual covariance matrix not interpretable at my sample size, or is there another option?

> Do you mean resid(fit, type = "cor")? That has nothing to do with N, but keep in mind it tells you how your model fails to reproduce the estimated polychorics, not how it fails to reproduce the observed data.

Thanks!!!

Terrence Jorgensen

Mar 25, 2019, 10:57:24 AM

to lavaan

> OK, but the SRMR index is a measure of residual covariance, so if my SRMR is good, the residual covariances have to be good too. Am I right?

No, SRMR is an average, so by definition it can easily mask important problems that individual residuals would make you aware of. It is better to look at the individual standardized/correlation residuals from resid(fit, type = "cor").
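To make the masking concrete, here is a small sketch in Python (not lavaan code) with invented residuals. It uses a plain root mean square of the 28 off-diagonal residual correlations as a simplified stand-in for SRMR; the exact SRMR formula in any given software may differ (for example, in whether diagonal elements are included), and the .08 and .10 cutoffs are just the ones mentioned in this thread.

```python
import math

def rms_residual(residuals):
    """Root mean square of a list of residual correlations."""
    return math.sqrt(sum(r ** 2 for r in residuals) / len(residuals))

# 8 indicators give 8 * 7 / 2 = 28 unique off-diagonal residuals.
# Hypothetical values: 27 tiny residuals and one large one.
residuals = [0.02] * 27 + [0.25]

summary = rms_residual(residuals)               # the "average" index
largest = max(abs(r) for r in residuals)        # the worst single residual
flagged = [r for r in residuals if abs(r) > 0.10]

print(f"RMS residual = {summary:.3f} (below .08: {summary < 0.08})")
print(f"largest |residual| = {largest:.2f}, residuals above .10: {len(flagged)}")
```

In this toy case the summary index looks acceptable even though one pair of indicators is reproduced badly, which is why the individual residuals are worth inspecting.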

> Also, I do not understand the difference between the estimated and the model-implied polychoric correlation matrix. Can you help me?

https://www.tandfonline.com/doi/abs/10.1207/S15327906347-387

https://www.statmodel.com/download/webnotes/CatMGLong.pdf

https://www.tandfonline.com/doi/abs/10.1080/10705510701758406
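As a rough sketch of the distinction (Python rather than R, with made-up numbers that are not taken from the references above): the estimated polychorics come from the data without imposing the factor model, whereas the model-implied polychorics are what the fitted model says those correlations should be. For a one-factor model with standardized loadings, the implied correlation between two indicators is the product of their loadings. The loadings, the "estimated" matrix, and the function names below are all hypothetical.

```python
def implied_correlations(loadings):
    """Model-implied correlations for a one-factor model with standardized
    loadings: lambda_i * lambda_j off the diagonal, 1 on the diagonal."""
    k = len(loadings)
    return [
        [1.0 if i == j else loadings[i] * loadings[j] for j in range(k)]
        for i in range(k)
    ]

def residual_matrix(estimated, implied):
    """Residual = estimated minus model-implied, element by element."""
    return [
        [e - m for e, m in zip(est_row, imp_row)]
        for est_row, imp_row in zip(estimated, implied)
    ]

# Hypothetical standardized loadings for 3 indicators.
loadings = [0.8, 0.7, 0.6]

# Hypothetical polychoric correlations estimated without the factor model.
estimated = [
    [1.00, 0.60, 0.45],
    [0.60, 1.00, 0.42],
    [0.45, 0.42, 1.00],
]

implied = implied_correlations(loadings)
resid = residual_matrix(estimated, implied)

for row in resid:
    print(["{:+.2f}".format(r) for r in row])
```

The residual matrix here plays the same role as the correlation residuals discussed in this thread: it shows where the model fails to reproduce the estimated polychorics, not where it fails to reproduce the observed response proportions.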

> Yes, but my lavTables() results are bad while my SRMR is good. I think that is a contradiction.

No, these criteria refer to different things. As I said before, it is often easier for fitted models to reproduce the estimated polychorics than to reproduce the observed proportions. The other way around would be contradictory (if your model could not adequately reproduce polychorics, yet still adequately predict the observed proportions).
