Unreliable genetic correlations (rg > 1) when running LDSC on GWAS from the same cohort with correlated binary phenotypes

Kelvin Supriami

May 8, 2026, 6:04:20 PM
to Genomic SEM Users

Hi GenomicSEM community,

I am running GenomicSEM to estimate genetic correlations between four related binary phenotypes derived from the same cohort (N~7,500). GWAS were run separately per phenotype using SAIGE, followed by meta-analysis across two genotyping platforms using METAL. Summary statistics were preprocessed and munged following the standard GenomicSEM workflow.

Problem:

The LDSC output produces genetic correlations that appear unreliable: several estimates fall outside the [-1, 1] range, and some show biologically implausible negative correlations between phenotypes that should be positively correlated:

rg(trait1, trait2) = 1.957
rg(trait3, trait4) = 2.884
rg(trait1, external_trait) = -1.008

The cross-trait LDSC intercepts between phenotypes are very high (ranging 0.22-0.46), which I believe reflects complete sample overlap since the same individuals contributed to all GWAS. SNP heritability is also low for some phenotypes (h2 Z < 1).

Questions:

  1. Is GenomicSEM/LDSC appropriate for estimating genetic correlations between phenotypes from the same cohort with complete sample overlap? How should the elevated cross-trait intercepts be interpreted in this context?
  2. Is there a recommended way to constrain the cross-trait intercept matrix (I) to expected values given known sample overlap?
  3. Is there a minimum h2 Z-score below which LDSC-based genetic correlations are unreliable?
  4. Would an alternative approach be more appropriate for same-cohort correlated phenotypes?

Thank you!

Elliot Tucker-Drob

May 11, 2026, 12:01:15 PM
to Kelvin Supriami, Genomic SEM Users
It's difficult to interpret rG estimates when h2 is unstable. As you mention, h2 Z < 1 means that the h2 estimates have very large SEs relative to the estimates themselves. This may be a power issue: a GWAS of N ~ 7,500 is on the low side for LDSC, although how much that matters depends on the true h2 and polygenicity of the traits.
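
To make the point about unstable h2 concrete: rG is the genetic covariance divided by the square root of the product of the two h2 estimates, so a noisy, near-zero h2 in the denominator can push rG well outside [-1, 1]. Below is a minimal Python sketch of that arithmetic. The function names and all input numbers are hypothetical illustrations, not values from this thread and not part of the GenomicSEM or LDSC API.

```python
import math


def h2_z(h2, se):
    """Z-score of a SNP-heritability estimate (estimate divided by its SE)."""
    return h2 / se


def rg_from_components(cov_g, h2_a, h2_b):
    """rg = genetic covariance / sqrt(h2_a * h2_b); undefined if either h2 <= 0."""
    if h2_a <= 0 or h2_b <= 0:
        return float("nan")
    return cov_g / math.sqrt(h2_a * h2_b)


# Hypothetical numbers, chosen only to illustrate the arithmetic.
cov_g = 0.05
stable = rg_from_components(cov_g, 0.10, 0.10)  # well-estimated h2 values
noisy = rg_from_components(cov_g, 0.03, 0.02)   # same covariance, h2 estimates near zero

print(f"h2 Z for h2 = 0.03, SE = 0.04: {h2_z(0.03, 0.04):.2f}")
print(f"rg with h2 = 0.10 and 0.10: {stable:.2f}")
print(f"rg with h2 = 0.03 and 0.02: {noisy:.2f}")
```

With the same genetic covariance, shrinking the h2 estimates inflates rG past 1, which is the pattern to expect when h2 Z is low.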

rG estimates > 1 can usually be interpreted as indicating a very high genetic correlation, but I'd suggest you check the SEs and compute CIs.
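
As a rough sketch of the CI check, the snippet below computes a normal-approximation (Wald) 95% interval from an rG estimate and its SE, and reports whether the interval overlaps the admissible range [-1, 1]. The rG value 1.957 is the one reported above; the SE of 0.9 is a made-up placeholder, since no SEs were posted. The helper names are hypothetical.

```python
def wald_ci(estimate, se, z=1.96):
    """Normal-approximation confidence interval: estimate +/- z * SE."""
    return estimate - z * se, estimate + z * se


def overlaps_admissible_range(estimate, se, z=1.96):
    """True if the CI includes at least one value in [-1, 1]."""
    lower, upper = wald_ci(estimate, se, z)
    return lower <= 1.0 and upper >= -1.0


# rg is taken from the post; the SE is an assumed placeholder value.
rg, se = 1.957, 0.9
lower, upper = wald_ci(rg, se)
print(f"rg = {rg}, 95% CI = ({lower:.3f}, {upper:.3f})")
print("CI overlaps [-1, 1]:", overlaps_admissible_range(rg, se))
```

If the interval overlaps [-1, 1], the out-of-bounds point estimate is compatible with a high but admissible correlation. If the interval is very wide, the estimate carries little information either way.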

There isn't an issue with using phenotypes from the same cohort. The cross-trait intercepts (CTIs) can simply be interpreted as the phenotypic correlation (rPheno) multiplied by the proportion of sample overlap, which you should be able to verify in the data itself. I would not suggest constraining the intercepts unless you are willing to make strong assumptions about stratification.
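
One way to run that verification, sketched in Python: in the absence of stratification, the expected cross-trait intercept from LD score regression is approximately rPheno x N_overlap / sqrt(N1 x N2), which reduces to rPheno when both GWAS use the same individuals. The function name and the rPheno value of 0.30 are hypothetical; the sample size of 7,500 is the approximate N given in the question.

```python
import math


def expected_cross_trait_intercept(r_pheno, n_overlap, n_a, n_b):
    """Approximate expected intercept: r_pheno * N_overlap / sqrt(N_a * N_b).

    Assumes no population stratification or other shared confounding.
    """
    return r_pheno * n_overlap / math.sqrt(n_a * n_b)


# Hypothetical phenotypic correlation; N ~ 7,500 as in the question.
full_overlap = expected_cross_trait_intercept(0.30, 7500, 7500, 7500)
half_overlap = expected_cross_trait_intercept(0.30, 3750, 7500, 7500)
print(f"complete overlap: {full_overlap:.2f}")
print(f"50% overlap:      {half_overlap:.2f}")
```

Comparing this expected value with the estimated intercept for each trait pair gives a quick sanity check. For binary traits, how rPheno is computed (for example, on the observed case/control scale) is itself an assumption worth stating when making the comparison.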
