Multiple models with good fit


Rashmi Sukumaran

Oct 11, 2023, 3:43:04 AM
to genomic-...@googlegroups.com
Dear all,

I am new to SEM and learning a lot. Thanks to this wonderful package, I am able to actually test my ideas of genetic correlation.

I have a set of 8 traits. The EFA shows feasibility for up to 4 factors. However, all the models show good fit indices, so how do I choose the most suitable model for my data?
Also, could anyone please verify whether my EFA-to-CFA conversions seem reasonable? (I have also used prior field knowledge to specify the models.)

The ldsc covariance matrix (ldsc$S_Stand) is as follows
[image.png: attached image of the matrix, not reproduced here]

EFA loadings for 1 factor
       Factor1
T1     0.899
T2     0.676
T3    -0.139
T4     0.222
T5     0.482
T6     0.241
T7     0.129
T8     0.392
               Factor1
SS loadings      1.795
Proportion Var   0.224

EFA Loadings for 2 factors:
       Factor1 Factor2
T1     0.792   0.145
T2     0.713        
T3            -0.215
T4     0.163   0.127
T5     0.103   0.963
T6     0.387  -0.276
T7     0.199        
T8     0.395       
               Factor1 Factor2
SS loadings       1.52   1.100
Proportion Var    0.19   0.138
Cumulative Var    0.19   0.327

EFA Loadings for 3 factors
       Factor1 Factor2 Factor3
T1     0.399   0.133   0.451
T2     0.932                
T3     0.258  -0.103  -0.402
T4                     0.268
T5             -0.159  0.779
T6             1.029  -0.122
T7     0.239                
T8     0.193   0.148   0.224
               Factor1 Factor2 Factor3
SS loadings      1.195   1.143   1.119
Proportion Var   0.149   0.143   0.140
Cumulative Var   0.149   0.292   0.432

EFA Loadings for 4 factors
       Factor1 Factor2 Factor3 Factor4
T1     0.389   0.151   0.373        
T2     1.077                        
T3     0.210          -0.352        
T4    -0.136                   0.634
T5            -0.143   0.835        
T6             1.042  -0.127        
T7     0.160          -0.194   0.212
T8     0.131   0.148   0.108   0.214

               Factor1 Factor2 Factor3 Factor4
SS loadings      1.420   1.176   1.026   0.507
Proportion Var   0.177   0.147   0.128   0.063
Cumulative Var   0.177   0.324   0.453   0.516

one_factor_m <- 'F1 =~ NA*T1 + T2 + T3 + T4 + T5 + T6 + T7 + T8
               F1 ~~ 1*F1'
one_factor$modelfit
      chisq df   p_chisq      AIC CFI       SRMR
df 16.30894 20 0.6972824 48.30894   1 0.06869384

two_factor_m <- 'F1 =~ NA*T1 + T2 + T4 + T6 + T7 + T8
                 F2 =~ NA*T3 + T5
                 F1 ~~ 1*F1
                 F2 ~~ 1*F2'
two_factor$modelfit
      chisq df   p_chisq      AIC CFI     SRMR
df 15.25199 19 0.7064523 49.25199   1 0.061889

three_factor_m <- 'F1 =~ NA*T1 + T2 + T7 + T8 + T3
                   F2 =~ NA*T3 + T4 + T5
                   F3 =~ NA*T6 + T5
                   F1 ~~ 1*F1
                   F2 ~~ 1*F2
                   F3 ~~ 1*F3'
three_factor$modelfit
      chisq df   p_chisq      AIC CFI       SRMR
df 13.30398 15 0.5788315 55.30398   1 0.04284452

four_f_model <- 'F1 =~ NA*T1 + T2 + T8
                 F2 =~ NA*T4
                 F3 =~ NA*T5 + T3
                 F4 =~ NA*T6 + T7
                 F1 ~~ F2
                 F1 ~~ F3
                 F2 ~~ F3
                 F1 ~~ F4
                 F2 ~~ F4
                 F3 ~~ F4
                 F4 ~~ 1*F4
                 F1 ~~ 1*F1
                 F2 ~~ 1*F2
                 F3 ~~ 1*F3'
four_factor$modelfit
      chisq df p_chisq      AIC CFI       SRMR
df 12.01082 15 0.67821 54.01082   1 0.04455509

Thank you!
Regards,
Rashmi

Jeff Kim

Oct 17, 2023, 2:44:37 PM
to Genomic SEM Users
Hi Rashmi,

Cumulative variance explained goes up consistently as you add factors, which is good to see, but the 2-, 3-, and 4-factor models have factors with fewer than the ideal number of strong loadings (>= 0.3). Ideally you would want at least 2 strong loadings per factor. Factor2 in particular is consistently dominated by T6, with 3 or 4 weakly loading traits. The covariance matrix also seems to show that T6 has low correlation with the other traits. How does the EFA look when you remove T6?
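As a rough illustration of the "at least 2 strong loadings per factor" check, the sketch below counts loadings of 0.3 or more in the three-factor EFA posted earlier in the thread. The loadings are typed in by hand from that printout. Entering the blank (suppressed) cells as 0 and comparing on absolute value are assumptions made for this example, and the variable names are only illustrative.

```r
# Loadings copied from the three-factor EFA printout earlier in the thread.
# Blank cells in that printout are entered as 0 here (an assumption).
L <- matrix(c(
  0.399,  0.133,  0.451,
  0.932,  0,      0,
  0.258, -0.103, -0.402,
  0,      0,      0.268,
  0,     -0.159,  0.779,
  0,      1.029, -0.122,
  0.239,  0,      0,
  0.193,  0.148,  0.224),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("T", 1:8), paste0("Factor", 1:3)))

# A loading counts as "strong" if its absolute value is at least 0.3.
strong <- abs(L) >= 0.3

# Number of strong loadings per factor, and which traits they are.
n_strong <- colSums(strong)
print(n_strong)
print(lapply(as.data.frame(strong), function(x) rownames(L)[x]))
```

With these numbers Factor2 has a single strong loading (T6), while Factor1 has two and Factor3 has three.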

Also is there a particular reason why the factors are correlated with each other at exactly 1? I'm assuming this is from prior field knowledge?

Regards,
Jeff

Rashmi Sukumaran

Oct 20, 2023, 6:09:06 AM
to Genomic SEM Users
Hi Jeff,

Thank you for taking the time to look into my data and reply.
Yes, the fact that cumulative variance goes up while the factors seem weak, without enough strong loadings, had me confused. I am new to SEM.

1. Just to clarify, it is not recommended to specify just one loading on a factor, right? Like F1 =~ T1? But when I look at fit indices, some of the CFA models with single-loading factors give the best values. Is it not necessary to go with the model with the best fit?

2. Basically, this data is a trait (T1) and its comorbid factors (T2-T8). So I was expecting one factor with T1 and a few other traits, and a second factor with a couple of traits which influence the first factor as well. However, such models do not seem to converge. I am not sure what I am doing wrong.

I tried a model with just one factor, F1 =~ T1 + T2 + T4 + T8. It has great fit indices. 
       chisq df   p_chisq      AIC CFI       SRMR
df 0.4745738  2 0.7887649 16.47457   1 0.02985077

However, after estimating the SNP effects, I do not get any significant SNPs. Why does that happen?
Does it mean the model is not a good fit for our data?

3. In the four-factor model, having the factors correlated with each other at 1 was a rookie mistake.
The fit indices without the correlations are as follows:
      chisq df p_chisq      AIC CFI       SRMR
df 14.07736 15 0.51967 56.07736   1 0.04517484

4. What is the recommended cutoff for covariances? EFA with T6 removed is as follows:
EFA Loadings for 1 factor
Uniquenesses:
   T1    T2    T3    T4    T5    T7    T8
0.325 0.471 0.986 0.942 0.798 0.981 0.825

Loadings:
     Factor1
T1    0.821
T2    0.727
T3   -0.118
T4    0.241
T5    0.449
T7    0.138
T8    0.418
               Factor1
SS loadings      1.671
Proportion Var   0.239

EFA Loadings for 2 factors
Uniquenesses:
   T1    T2    T3    T4    T5    T7    T8
0.465 0.113 0.846 0.907 0.621 0.958 0.828

Loadings:
     Factor1 Factor2
T1    0.413   0.445
T2    0.899
T3    0.252  -0.435
T4            0.316
T5            0.623
T7    0.228
T8    0.188   0.295
               Factor1 Factor2
SS loadings      1.131   0.976
Proportion Var   0.162   0.139
Cumulative Var   0.162   0.301

Factor Correlations:
        Factor1 Factor2
Factor1   1.000  -0.452
Factor2  -0.452   1.000

EFA Loadings for 3 factors
Uniquenesses:
   T1    T2    T3    T4    T5    T7    T8
0.496 0.005 0.853 0.815 0.589 0.899 0.792

Loadings:
     Factor1 Factor2 Factor3
T1    0.404   0.302   0.168
T2    1.014
T3    0.175  -0.403
T4   -0.190           0.484
T5            0.574
T7           -0.242   0.299
T8            0.101   0.348

               Factor1 Factor2 Factor3
SS loadings      1.273   0.657   0.482
Proportion Var   0.182   0.094   0.069
Cumulative Var   0.182   0.276   0.345

Factor Correlations:
        Factor1 Factor2 Factor3
Factor1   1.000  -0.362   0.619
Factor2  -0.362   1.000  -0.488
Factor3   0.619  -0.488   1.000

Thanks again!
Regards,
Rashmi

Michel Nivard

Oct 20, 2023, 6:55:47 AM
to Rashmi Sukumaran, Genomic SEM Users
Hi Rashmi,

So, looking at the fit of these models, they're really all fairly close (the 4-factor model is somewhat better, but a factor loading on just one trait isn't really a factor, right? It's more a trait that doesn't really fit any factor).
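To make "fairly close" concrete, here is a small sketch that tabulates the fit statistics posted earlier in the thread. It assumes 8 traits, so 8 * 9 / 2 = 36 unique elements in the covariance matrix, and takes the number of free parameters to be 36 minus the reported df. Under that assumption the posted AIC values are reproduced by chisq + 2 * (number of free parameters). The data frame and column names are illustrative only.

```r
# Fit statistics copied from the modelfit output posted earlier in the thread.
fit <- data.frame(
  model = c("one_factor", "two_factor", "three_factor", "four_factor"),
  chisq = c(16.30894, 15.25199, 13.30398, 12.01082),
  df    = c(20, 19, 15, 15),
  AIC   = c(48.30894, 49.25199, 55.30398, 54.01082)
)

# Unique variances and covariances among k = 8 traits.
k <- 8
n_moments <- k * (k + 1) / 2

# Free parameters implied by the reported df (an assumption of this sketch).
fit$n_par <- n_moments - fit$df

# Recompute AIC as chisq + 2 * free parameters and compare with the posted value.
fit$AIC_check <- fit$chisq + 2 * fit$n_par

# Difference from the lowest AIC; smaller AIC is preferred.
fit$delta_AIC <- fit$AIC - min(fit$AIC)

print(fit[order(fit$AIC), ])
```

In this table the chi-square drops by only about 4 points from the one-factor to the four-factor model, while AIC, which penalizes the extra parameters, is lowest for the one-factor model and within about 1 point of the two-factor model.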

Models can have close fit for 2 general reasons: they are very similar, or you lack the statistical power to distinguish between them. I'd like to ask how big your GWASes are, and what the SNP h2 and its standard error are for each trait.

Then, with respect to model choice, models are supposed to serve some further descriptive or scientific purpose. So in the absence of clear empirical evidence for one or the other, you can let theory guide your choices, while obviously acknowledging to the reader that you didn't have the power to make the choice based on data.

Overall it seems clear from the correlation matrix that these 8 traits don't really fit a one-factor model, though that could just reflect large uncertainty in the correlations. I'd have a look at the SNP h2 and its significance (and share those if you can).

Best,
Michel


Rashmi Sukumaran

Oct 21, 2023, 4:15:50 AM
to Michel Nivard, Genomic SEM Users
Hi Michel,

Thank you for clarifying my doubts! This helps me a lot!
The GWAS effective sample sizes and the heritability estimates are as follows.
Mean chi2 values are all above 1.02 and intercepts are close to 1 (as per the ldsc wiki).


Trait  Neff       Mean chi2  Lambda GC  Intercept (SE)   Ratio (SE)       Total obs. scale h2 (SE)  h2 Z  Total liability scale h2 (SE)
T1     195961.54  1.2282     1.1893     1.0697 (0.0074)  0.3057 (0.0323)  0.0415 (0.0032)           13    0.0214 (0.0017)
T2     56866.32   1.1328     1.0967     1.014 (0.0075)   0.1056 (0.0568)  0.1071 (0.012)            8.91  0.0745 (0.0084)
T3     269545     1.5174     1.2391     1.0633 (0.0361)  0.1224 (0.0697)  0.0811 (0.009)            9.01  0.0811 (0.009)
T4     8142.6     1.0647     1.066      1.0407 (0.0064)  0.6283 (0.0989)  0.1066 (0.0381)           2.8   0.1451 (0.0519)
T5     120416.84  1.0351     1.0475     1.0188 (0.007)   0.5364 (0.1979)  0.0064 (0.0037)           1.7   0.0064 (0.0037)
T6     117262.3   1.0624     1.0475     1.0175 (0.0089)  0.2807 (0.1423)  0.0177 (0.0058)           3.06  0.0169 (0.0055)
T7     46453.16   1.3321     1.1999     1.0682 (0.0121)  0.2055 (0.0365)  0.2858 (0.0257)           11.1  0.1369 (0.0123)
T8     21924      1.0965     1.0792     1.0065 (0.0066)  0.0673 (0.0687)  0.2096 (0.0242)           8.67  0.1778 (0.0205)
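For reference, the h2 Z column is the observed-scale h2 divided by its standard error. The sketch below recomputes it from the values in the table and flags traits with a low Z. The cutoff of 4 used here (z_cut) is an illustrative assumption, not a threshold stated in this thread, and the column names are made up for the example.

```r
# Observed-scale h2 and its SE, copied from the table above.
h2 <- data.frame(
  trait  = paste0("T", 1:8),
  h2_obs = c(0.0415, 0.1071, 0.0811, 0.1066, 0.0064, 0.0177, 0.2858, 0.2096),
  se     = c(0.0032, 0.0120, 0.0090, 0.0381, 0.0037, 0.0058, 0.0257, 0.0242)
)

# Z statistic for the heritability estimate.
h2$z <- h2$h2_obs / h2$se

# Hypothetical cutoff chosen for illustration only.
z_cut <- 4
h2$low_signal <- h2$z < z_cut

print(transform(h2, z = round(z, 2)))
```

With a cutoff of 4, the traits flagged are T4, T5, and T6. A different cutoff would flag a different set, so the choice should be justified rather than taken from this example.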


Thanks & Regards,
Rashmi Sukumaran

Rashmi Sukumaran

Oct 26, 2023, 6:34:07 AM
to Genomic SEM Users
Hi Michel,

Is there a cutoff for h2 that is required for Genomic SEM?

Also, after munging, one of my GWAS does not have any significant SNPs (p < 5e-08) left. Should I exclude those summary statistics?

Thank you!
Regards,
Rashmi