Discrepancy between multivariate GWAS rg and SEM rg estimate

Constantino Rafael de la Vega

Mar 4, 2026, 11:44:22 AM

to Genomic SEM Users

Hello,

tl;dr: LDSC rg estimate is 0.14 between 2 factors (using multivariate GWAS output) but STD_all between them is 0.74 in the GenomicSEM model. Wondering how to interpret this or if I did something wrong.

I have created a 3 factor SEM:

"

# Cortical factor
Cortical =~ FI + IC + SA

# Skeletal - limbs + height
Skeletal_LimbHeight =~ Average_Femur + Average_Forearm + Average_Humerus + Average_Tibia + height

# Skeletal - trunk/widths
Skeletal_TrunkWidths =~ Hip_width + Shoulder_width + Torso_length + height

# Allow factors to correlate
Cortical ~~ Skeletal_LimbHeight
Cortical ~~ Skeletal_TrunkWidths
Skeletal_LimbHeight ~~ Skeletal_TrunkWidths

"

The usermodel() function specified was the following:

usermodel(covstruc = LDSCoutput, model = model, estimation = "DWLS", std.lv = T)

where LDSCoutput contains only even chromosomes. The usermodel-estimated Skeletal_LimbHeight ~~ Skeletal_TrunkWidths STD_all value is 0.74. I then ran a multivariate GWAS on these factors, using the same model with 3 extra lines added:

"
Cortical ~ SNP
Skeletal_LimbHeight ~ SNP
Skeletal_TrunkWidths ~ SNP
"

I ran the multivariate GWAS with the following function:

userGWAS(covstruc = LDSCoutput,
         SNPs = it_SNPs,
         estimation = "DWLS",
         model = model,
         std.lv = TRUE,
         printwarn = TRUE,
         sub = c("Cortical~SNP", "Skeletal_LimbHeight~SNP", "Skeletal_TrunkWidths~SNP"),
         toler = FALSE,
         SNPSE = FALSE,
         parallel = FALSE,
         GC = "standard",
         MPI = FALSE,
         smooth_check = TRUE,
         fix_measurement = TRUE,
         Q_SNP = TRUE)

In this function, LDSCoutput contains every chromosome, and it_SNPs is the output from the sumstats() function using the recommended SNP panel in the GenomicSEM GitHub Wiki.

I ran LDSC using the multivariate GWAS output and the rg estimate between Skeletal_LimbHeight and Skeletal_TrunkWidths is now 0.14. Should it not be 0.74, as estimated by the usermodel() function? What can I do?

Kind regards

Tino

Michel Nivard

Mar 5, 2026, 7:52:41 AM

to Constantino Rafael de la Vega, Genomic SEM Users

Hi,

In the GenomicSEM model the latent rg models the cross-loadings on height, but when you do a GWAS, height will dominate the signal for both factors. A GWAS meta-analysis of any kind is a way of making a weighted aggregate of the effect sizes, accounting for their precision (SE). This means that the individual SNP effects in both GWASes will be heavily influenced by height (which presumably has a huge N).

We explored an alternative ML estimator in the original GenomicSEM paper, which is actually a balanced average of effects. It places less emphasis on power, so it would lose you power but also be less weighted towards a specific well-powered trait. I am unsure how well supported the ML estimator still is, as it's not something we maintain as well as DWLS, simply because in GWAS people focus on power, and you can get any and all rg's inside the GenomicSEM model as well.
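The weighting argument above can be illustrated with a toy calculation in base R. This is only a sketch: it is not GenomicSEM's DWLS or ML estimator, and the trait names, effect sizes and standard errors are made-up values chosen for illustration. It contrasts an inverse-variance-weighted aggregate with an equal-weight ("balanced") average of the same two effects.

```r
# Toy illustration only: this is NOT GenomicSEM's DWLS or ML estimator.
# Two indicators of one factor carry different effects for the same SNP;
# "height" is estimated far more precisely than "hip_width".
# All numbers below are hypothetical.
beta <- c(height = 0.020, hip_width = 0.005)   # made-up SNP effects
se   <- c(height = 0.001, hip_width = 0.010)   # made-up standard errors

w <- 1 / se^2                                  # inverse-variance weights
ivw_estimate      <- sum(w * beta) / sum(w)    # precision-weighted aggregate
balanced_estimate <- mean(beta)                # equal-weight aggregate

share_height <- w[["height"]] / sum(w)         # fraction of total weight on height

print(c(ivw = ivw_estimate,
        balanced = balanced_estimate,
        share_height = share_height))
```

With these made-up inputs, height takes roughly 99% of the weight, so the precision-weighted aggregate sits almost on top of the height effect, while the balanced average lands midway between the two.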

These are some of the considerations that can potentially cause differences inside the model, compared to "outside the model". We had the same worries in the GWAS by subtraction paper which is why we did all kinds of validation of the rg's with external traits (and there they were much closer, see the supplements).

One solution you have is to estimate any rg's you want inside the model: just bring the extra traits into the model. There should be scripts for that in the GWAS by subtraction GitHub repo.


Elliot Tucker-Drob

Mar 5, 2026, 8:52:33 AM

to Michel Nivard, Constantino Rafael de la Vega, Genomic SEM Users

What Michel is describing would be most likely to occur when the fit of your no-SNP model is poor or the overpowered GWAS (height) has a relatively low loading on the factor. However, if the loading of height is high and the model fits well, then the correlation between the two factors and the correlation between height and the factor on which it doesn't load should be very similar. You can check this by inspecting your loading estimates, checking the model fit, and comparing the relevant rG parameters from the S_Stand portion of the LDSC object to the interfactor correlation from your CFA. There could be other issues, such as model misspecification, inappropriate interpretation of the output, or instability of a two-indicator solution. Without going through the whole process directly, it would be very hard for me to troubleshoot this for you.

Generally, when you have a well-specified, stable model, the rGs involving the factors from the no-SNP model and those obtained by conducting a userGWAS and feeding the sumstats back into ldsc will be more similar than you are describing. They are nevertheless not expected to be equivalent, for several reasons, such as differential precision of the GWAS estimates due to differences in N and h2, and misfit at the level of the SNP associations (Qsnp) that the two-stage approach carries forward.
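As a rough sketch of the comparison suggested above, the base R snippet below pulls a trait-level rG out of a genetic covariance matrix and sets it next to an interfactor correlation. The helper rg_from_S() is hypothetical (it is not part of GenomicSEM), the matrix holds made-up values standing in for the S element of an LDSC object, and the sketch assumes that matrix carries trait names as column names. Which trait pairs count as "relevant" for a given model is left to the reader.

```r
# Hypothetical helper (not a GenomicSEM function): genetic correlation between
# two traits from a genetic covariance matrix S with named columns.
rg_from_S <- function(S, trait_a, trait_b) {
  stopifnot(is.matrix(S), !is.null(colnames(S)))
  R <- cov2cor(S)                                  # covariances -> correlations
  dimnames(R) <- list(colnames(S), colnames(S))    # make sure rows are named too
  R[trait_a, trait_b]
}

# Toy matrix with made-up values, standing in for LDSCoutput$S
traits <- c("height", "Hip_width", "Average_Femur")
S_toy <- matrix(c(0.40, 0.12, 0.20,
                  0.12, 0.25, 0.08,
                  0.20, 0.08, 0.30),
                nrow = 3, byrow = TRUE,
                dimnames = list(traits, traits))

# One indicator from each skeletal factor (an illustrative choice of pair)
rg_femur_hip   <- rg_from_S(S_toy, "Average_Femur", "Hip_width")
interfactor_rg <- 0.74   # the STD_all value reported earlier in this thread

print(c(rg_femur_hip = rg_femur_hip,
        difference   = interfactor_rg - rg_femur_hip))
```

Because cov2cor() leaves a matrix with a unit diagonal unchanged, the same helper can be pointed at an already standardised matrix without altering its values.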

I would additionally recommend estimating the rG more directly in the model by using unit variance identification and looking at the unstandardized output. Standardized output can behave strangely at times, such as when model constraints are used (a constraint that is sensible for the unstandardized model may not be sensible for the standardized model).
