GenomicSEM: model fit and Residual Covariance Matrix


B

Jun 30, 2025, 5:25:40 AM
to Genomic SEM Users
Dear all,

I have a question about the confirmatory factor analysis (CFA) step in Genomic SEM.
The Genomic SEM GitHub wiki says the following about the EFA step: "Below we provide an example of a workflow in which we do not have an a priori model, and therefore first conduct an exploratory factor analysis before specifying and fitting a usermodel." I take this to mean that running an EFA is only recommended when the researcher cannot specify a factor model from theory or previous work. Did I understand this correctly? Since I am working with internalizing and externalizing behaviour, we have an a priori two-factor model with an externalizing factor and an internalizing factor.
When I run a CFA on this two-factor model, this is the model and the resulting fit:
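Externalizing_factor =~ NA*trait1 + trait2 + trait3 + trait4 + trait5 + trait6 + trait7 + trait8
Internalizing_factor =~ NA*trait8 + trait9 + trait10 + trait11 + trait12
Externalizing_factor ~~ Internalizing_factor
Externalizing_factor ~~ 1*Externalizing_factor
Internalizing_factor ~~ 1*Internalizing_factor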

$modelfit
     chisq df p_chisq     AIC       CFI      SRMR
df 2967.64 52       0 3019.64 0.7839851 0.1661678

Based on previous work and posts in the Genomic SEM Users group, I understand that this is a poor model fit and that I should therefore not continue with this model (CFI is below 0.9 and SRMR is above 0.1). However, the residual covariance matrix seems to indicate good fit, as the largest value I observe is 1.578590e-02 (based on the $resid_cov$`Residual Covariance Matrix: Calculated as Observed Cov - Model Implied Cov` output).

In the following paper I read that there can be cases where the residuals and the model fit indices disagree. However, I am not sure that a model with poor fit indices in Genomic SEM is a reliable model for running a multivariate GWAS, even when the residuals are small.

If the model fit is not optimal but the residuals suggest good fit, should the choice of model for the multivariate GWAS be based on the residual covariance matrix, on the biological interest or research question behind including certain traits (despite poor model fit), or only on the model fit?

Thank you very much in advance for your time and advice!

Elliot Tucker-Drob

Jun 30, 2025, 10:33:02 AM
to B, Genomic SEM Users
The unstandardized residual matrix can be misleading with respect to model fit for phenotypes with low h2, because the genetic covariances themselves are small on that scale. You can compute the model residuals for the standardized S matrix to put the residuals on a more interpretable scale that is more comparable to the scale of SRMR.
Your model has a lot of traits and imposes simple structure; I suspect that cross-loadings may be useful to improve fit.
You can see how we did something similar (starting with a theory-derived model imposing simple structure, then using model residuals to identify cross-loadings to free) in the ReGPC supplement (see "Genomic SEM Analyses Stratified by Measurement Instrument"): https://www.biorxiv.org/content/10.1101/2025.05.16.648988v1
Yavor Dragostinov conducted those analyses, so after reading through that section, you might reach out to him if you have questions.
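
To make the point about scale concrete, here is a minimal sketch in plain Python (not GenomicSEM code). It rescales covariance residuals by the observed standard deviations, which is the scaling a Bentler-style SRMR uses, and then ranks the trait pairs by absolute standardized residual as candidates for cross-loadings or residual covariances. The function names, trait names, and all numbers are invented for illustration. Refitting the model to a standardized S matrix may give somewhat different values than rescaling the unstandardized residuals as done here.

```python
import math


def standardized_residuals(observed, implied):
    """Residuals (observed - implied) divided by the observed SDs of each trait pair."""
    p = len(observed)
    sd = [math.sqrt(observed[i][i]) for i in range(p)]
    return [
        [(observed[i][j] - implied[i][j]) / (sd[i] * sd[j]) for j in range(p)]
        for i in range(p)
    ]


def srmr(std_resid):
    """Root mean square of the standardized residuals over the lower triangle, diagonal included."""
    p = len(std_resid)
    total = sum(std_resid[i][j] ** 2 for i in range(p) for j in range(i + 1))
    return math.sqrt(total / (p * (p + 1) / 2))


def largest_residuals(std_resid, names, top=3):
    """Off-diagonal trait pairs sorted by absolute standardized residual, largest first."""
    pairs = [
        (abs(std_resid[i][j]), names[i], names[j])
        for i in range(len(names))
        for j in range(i)
    ]
    return sorted(pairs, reverse=True)[:top]


# Hypothetical genetic covariance matrix: diagonal entries play the role of h2.
names = ["trait1", "trait2", "trait3"]
observed = [
    [0.050, 0.030, 0.020],
    [0.030, 0.040, 0.010],
    [0.020, 0.010, 0.100],
]
# Hypothetical model-implied matrix that misses only the trait1-trait2 covariance.
implied = [
    [0.050, 0.015, 0.020],
    [0.015, 0.040, 0.010],
    [0.020, 0.010, 0.100],
]

std = standardized_residuals(observed, implied)
# Unstandardized residual is 0.015; standardized it is 0.015 / sqrt(0.05 * 0.04), about 0.335.
print(round(std[1][0], 3))
print(round(srmr(std), 3))
print(largest_residuals(std, names, top=1))
```

In this toy case an unstandardized residual of 0.015 looks negligible, yet on the correlation scale it is about 0.335, and that single misfit is enough to push the SRMR above 0.1.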

B

Jul 11, 2025, 6:20:43 PM
to Genomic SEM Users
Dear Elliot Tucker-Drob,

Thank you very much for your suggestion and input. I will examine the standardized model residuals in my current model. 

Kind regards,
Barbara
On Monday, June 30, 2025 at 16:33:02 UTC+2, tucke...@gmail.com wrote: