estimating SNP effects on residual

Holly Poore

Feb 20, 2026, 9:40:49 AM

to Genomic SEM Users

Hello,

I would like to estimate SNP effects on the residual of mdd from the model below, after accounting for its loading on the INT factor. I ran the model below, munged the output, and then used LDSC to estimate the genetic correlation between INT and residual mdd. I expected the residual mdd to be essentially uncorrelated with the factor, but the correlation is 0.67. Does this syntax look correct? Thank you!

model <- "INT =~ 1*mdd + anx + neuro + ptsd
INT ~~ NA*INT

INT ~ SNP
mdd ~ SNP"

results <- userGWAS(LDSCoutput,
                    sumstats,
                    estimation = "DWLS",
                    model = model,
                    printwarn = TRUE,
                    sub = c("mdd~SNP"),
                    cores = 1,
                    toler = 1e-50)

Michel Nivard

Feb 20, 2026, 12:31:51 PM

to Holly Poore, Genomic SEM Users

Hi Holly,

It's great that you checked and did not simply assume the rg was 0. The genetic correlation between the factor and MDD is proportional to the standardized loading. The residuals are uncorrelated with the factor by definition. If you could estimate the SNP effects on all the residuals/indicators (you can't, that model is not identified), you might have gotten some kind of orthogonal effects. Currently the average SNP effects on ANX, NEURO, and PTSD are captured by INT ~ SNP, with specific deviations from those captured by mdd ~ SNP (such that the two always combine to perfectly recapitulate MDD ~ SNP). It is absolutely possible for mdd ~ SNP to be higher when INT ~ SNP is high and lower when INT ~ SNP is low.
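The decomposition described in the paragraph above can be sketched per SNP as follows. The notation is an assumption for illustration and does not come from the thread: $\beta_j$ is the univariate GWAS effect of the SNP on indicator $j$, $\lambda_j$ is the loading of indicator $j$ on INT, $\gamma$ is the INT ~ SNP path, and $\delta_{mdd}$ is the mdd ~ SNP path. Sampling error and LD are ignored.

```latex
\begin{align}
\beta_{mdd} &= \lambda_{mdd}\,\gamma + \delta_{mdd} \\
\beta_{j} &\approx \lambda_{j}\,\gamma, \qquad j \in \{anx,\ neuro,\ ptsd\} \\
\delta_{mdd} &= \beta_{mdd} - \lambda_{mdd}\,\gamma \\
\operatorname{Cov}(\delta_{mdd}, \gamma) &= \operatorname{Cov}(\beta_{mdd}, \gamma) - \lambda_{mdd}\,\operatorname{Var}(\gamma)
\end{align}
```

Read this way, the covariance across SNPs in the last line is zero only when the slope of $\beta_{mdd}$ on $\gamma$ across SNPs happens to equal $\lambda_{mdd}$, and nothing in the per-SNP model enforces that. In the first model posted, $\lambda_{mdd}$ is fixed to 1.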

This is a long-winded way of saying that I don't know for sure that it should be 0 in a model with some, but not all, SNP effects allowed. That being said, you chose to fix the loading from INT to MDD to 1, which basically means the SNP effects on INT are scaled using MDD. This might be working to your disadvantage in some way here, as you're looking for the residual of MDD. Not sure, but I'd certainly try:

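# NA* frees the mdd loading (lavaan otherwise fixes the first loading to 1);
# the factor is scaled by fixing its (residual) variance to 1 instead
model <- "INT =~ NA*mdd + anx + neuro + ptsd
INT ~~ 1*INT

INT ~ SNP
mdd ~ SNP"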

results <- userGWAS(LDSCoutput,
                    sumstats,
                    estimation = "DWLS",
                    model = model,
                    printwarn = TRUE,
                    sub = c("mdd~SNP"),
                    cores = 1,
                    toler = 1e-50)

Not sure that'll matter, but it might. I assume your eventual goal is to estimate rg's with other traits, or to do some other downstream analysis? You could see which of those you can run within GenomicSEM, because the act of doing a GWAS in GenomicSEM could introduce error/bias that does not exist in the SEM model itself. You can intuitively compare this to PRS analysis: a PRS usually does not recapitulate the whole of h2_SNP because, while the total effect of all SNPs equals h2_SNP, the sum of their individually estimated effects encodes errors and biases. We shared this worry in the GWAS-by-subtraction paper, where we ran all the rg's both outside the SEM model (with LDSC) and inside it, and compared the two. I believe we also checked whether the rg with a holdout Cog GWAS was low/absent. It should be somewhere in the supplements.

--
You received this message because you are subscribed to the Google Groups "Genomic SEM Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to genomic-sem-us...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/genomic-sem-users/6da70e12-bd26-4344-9f8f-b2e1822f647dn%40googlegroups.com.

Holly Poore

Feb 20, 2026, 3:26:38 PM

to Genomic SEM Users

Thank you so much for your helpful response! I will try rerunning the model freely estimating the mdd loading and report back.

I would like to do some PGS analyses with these results, so I do need to go outside the model. Do you think there is a better way to model these residuals (like a Cholesky or something like that?) or is the same problem likely to persist in other models?

Thanks again,

Holly

Michel Nivard

Feb 20, 2026, 3:30:25 PM

to Holly Poore, Genomic SEM Users

So we also did PRS in the GWAS-by-subtraction paper. But the first PRS analysis we ran was on the traits in the model: cognitive scores and EA. We confirmed that the PRS for Non-cog explained far less variance (though not zero) in cognitive tests than the Cog or EA scores did, but still predicted EA. I imagine you'll have to do something similar: find positive controls (things it should still predict) and negative controls (things it should predict less well, ideally near 0).

Paris Huynh

Jun 15, 2026, 10:32:33 AM

to Genomic SEM Users

Hello,

My aim is to estimate SNP effects across 5 psychiatric disorders that load on a common factor. I want to estimate the SNP effect sizes on the common factor and the disorder-specific SNP effect sizes, in order to calculate PRS in my sample.

I would like to ask whether I should estimate these directly from the gSEM model (simultaneously) or use the two-step GWAS-by-subtraction method. If the model is the same, is there any difference between the two methods, or an advantage of using one over the other? Thank you very much for your clarification.

Elliot Tucker-Drob

Jun 15, 2026, 10:52:56 AM

to Paris Huynh, Genomic SEM Users

Typically, you can't estimate a model with paths from an external predictor to a factor and all of its indicators, as it is underidentified. For k indicators of a factor you have k observed SNP-indicator associations, but you would be trying to estimate k+1 associations in the model you describe. With the measurement model fixed to the estimates from the unconditional model, as is typical, you can identify the model, but the interpretation of the direct effects is somewhat nuanced. For an explication of this idea with an external GWAS phenotype, rather than a SNP, see:

de la Fuente, J., Londoño-Correa, D., & Tucker-Drob, E. M. (2025). Distinguishing specific from broad genetic associations between external correlates and common factors. Bioinformatics, 41, btaf568.

For an example of how we've taken such an approach with SNPs (e.g. those in the APOE region for cognition), see the supplement to:

de la Fuente, J., Davies, J., Grotzinger, A. D., Tucker-Drob, E. M., & Deary, I. J. (2020). A general dimension of genetic sharing across diverse cognitive traits inferred from molecular data. Nature Human Behaviour. [de la Fuente & Davies contributed equally to this work; Tucker-Drob & Deary jointly directed this work]

We have not attempted to use these SNP-indicator residuals to estimate polygenic indices, and I'd be cautious with respect to both the interpretation and the power of doing so.
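The counting argument at the start of this reply can be sketched for a single SNP as follows. The symbols are assumptions for illustration and are not taken from the thread or the cited papers: $\beta_j$ is the observed SNP association with indicator $j$, $\lambda_j$ is its loading (taken as given here purely to keep the counting simple), $\gamma$ is the SNP effect on the factor, and $\delta_j$ is the direct SNP effect on indicator $j$.

```latex
\begin{align}
\beta_j &= \lambda_j\,\gamma + \delta_j, \qquad j = 1, \dots, k \\
\text{observed: } & \beta_1, \dots, \beta_k \quad (k \text{ quantities}) \\
\text{unknown: } & \gamma,\ \delta_1, \dots, \delta_k \quad (k + 1 \text{ quantities}) \\
\lambda_j(\gamma + c) + (\delta_j - \lambda_j c) &= \lambda_j\,\gamma + \delta_j \quad \text{for any constant } c
\end{align}
```

The last line shows that shifting $\gamma$ by any $c$ and each $\delta_j$ by $-\lambda_j c$ reproduces the same observed associations, so some additional constraint is needed. Dropping at least one direct path, as in the model earlier in this thread that includes only mdd ~ SNP, is one such constraint; which constraint is appropriate, and how the resulting direct effects should be read, is what the cited papers discuss.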

Paris Huynh

Jun 15, 2026, 10:44:59 PM

to Genomic SEM Users

Hi Tucker, thank you very much for your response! If I understand correctly, once I have estimated the loadings from the unconditional model and fixed them at those values, I can then estimate the SNP effects on the common latent factor and the residual SNP effects for each disorder (the "simultaneous method"). Would the SNP effects estimated with this simultaneous method be the same as those estimated with the GWAS-by-subtraction method? Is there an advantage of using one method over the other?

Regarding building polygenic indices from SNP-indicator residuals: I was thinking of doing something similar to the PRS for Non-cog in the GWAS-by-subtraction paper, but for psychiatric disorders. I would greatly appreciate a better understanding of the interpretation and power issues that I should be cautious about. One thing that comes to mind (please correct me if I'm wrong) is that the majority of SNP effects may be attributable to the latent factor, so disorder-specific polygenic indices built from SNP-indicator residuals would not explain much.

Thank you again for your help.
