Hi,
I’m encountering a substantial discrepancy in estimated genetic correlations (rg) when comparing HDL and LDSC, despite using the same GWAS summary statistics and downstream GenomicSEM model. I would appreciate guidance on whether this behavior is expected or indicates a setup issue.
Using three binary, non-overlapping disease traits (European ancestry GWAS), I observe markedly different rg estimates depending on whether the genetic covariance matrix is estimated via HDL or LDSC, even though:
• The same summary statistics are used
• Effective sample size conventions are applied when munging the summary statistics (see the sketch after this list)
• sample.prev = 0.5 is specified for all traits
• Liability-scale conversion is performed using plausible population prevalences
• The same saturated GenomicSEM correlation model is applied
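For concreteness, here is a minimal sketch of the pipeline described above. File names, population prevalences, and LD reference paths are placeholders, and the hdl() argument names may differ slightly across GenomicSEM versions:

library(GenomicSEM)

# Step 1: munge the raw GWAS files; each file carries an effective-N column,
# which is why sample.prev is later set to 0.5 for all traits
munge(files       = c("T1_gwas.txt", "T2_gwas.txt", "T3_gwas.txt"),   # placeholder file names
      hm3         = "w_hm3.snplist",
      trait.names = c("T1", "T2", "T3"))

# Step 2a: LDSC-based genetic covariance structure (liability scale)
ldsc_out <- ldsc(traits          = c("T1.sumstats.gz", "T2.sumstats.gz", "T3.sumstats.gz"),
                 sample.prev     = c(0.5, 0.5, 0.5),
                 population.prev = c(0.01, 0.02, 0.005),              # placeholder prevalences
                 ld              = "eur_w_ld_chr/",
                 wld             = "eur_w_ld_chr/",
                 trait.names     = c("T1", "T2", "T3"))

# Step 2b: HDL-based genetic covariance structure with the same inputs
hdl_out <- hdl(traits          = c("T1.sumstats.gz", "T2.sumstats.gz", "T3.sumstats.gz"),
               sample.prev     = c(0.5, 0.5, 0.5),
               population.prev = c(0.01, 0.02, 0.005),
               trait.names     = c("T1", "T2", "T3"),
               LD.path         = "UKB_imputed_SVD_eigen99_extraction/",  # HDL LD reference panel
               method          = "piecewise")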
The SEM layer appears to behave as expected (i.e., it reproduces cov2cor(S)), so the discrepancy seems to originate from differences in the estimated genetic covariance matrix S produced by HDL vs LDSC.
model <- '
  lT1 =~ NA*T1
  lT2 =~ NA*T2
  lT3 =~ NA*T3
  T1 ~~ 0*T1
  T2 ~~ 0*T2
  T3 ~~ 0*T3
  T1 ~~ 0*T2
  T1 ~~ 0*T3
  T2 ~~ 0*T3
  lT1 ~~ 1*lT1
  lT2 ~~ 1*lT2
  lT3 ~~ 1*lT3
  lT1 ~~ lT2
  lT1 ~~ lT3
  lT2 ~~ lT3
'
This model is used only to extract rg (latent–latent correlations), not to test factor structure.
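This is roughly how I confirm that the SEM layer reproduces cov2cor(S) for each method (a sketch assuming the ldsc_out and hdl_out objects from above; the lhs/op/rhs column selection refers to the lavaan-style rows in the usermodel() output of my GenomicSEM version):

# Fit the saturated correlation model to each covariance structure
fit_ldsc <- usermodel(covstruc = ldsc_out, estimation = "DWLS", model = model)
fit_hdl  <- usermodel(covstruc = hdl_out,  estimation = "DWLS", model = model)

# Genetic correlation matrices implied directly by each S matrix
cov2cor(ldsc_out$S)
cov2cor(hdl_out$S)

# rg estimates from the SEM output: the lT1 ~~ lT2, lT1 ~~ lT3, lT2 ~~ lT3 rows
subset(fit_ldsc$results, op == "~~" & lhs != rhs & grepl("^l", lhs))
subset(fit_hdl$results,  op == "~~" & lhs != rhs & grepl("^l", lhs))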
What I observed was:
Using the LDSC-based covariance structure:
• rg estimates are moderate and internally consistent across trait pairs
Using the HDL-based covariance structure:
• rg estimates are substantially larger for two of the three trait pairs
• standard errors are also noticeably larger for those pairs
This pattern is reproducible and persists after confirming that the SEM output matches cov2cor(S) for each method.
The questions I have in mind are:
• Is a discrepancy of this size between HDL- and LDSC-based covariance structures expected, given the methodological differences between the two estimators?
• Or is this pattern more likely to indicate a problem in my setup (e.g., munging, prevalence specification, or the LD reference used for HDL)?
Thanks very much!