meta-analysis using genomicSEM

375 views
Skip to first unread message

leah

unread,
Jul 11, 2023, 12:42:28 PM7/11/23
to Genomic SEM Users
Hi
I’m trying meta-analyze two anxiety cohorts (Otowa (2016) and Purves (2019)), replicating Grotzinger paper ( https://www.nature.com/articles/s41588-022-01057-4 )

I’m using the effective sample sizes reported in the same paper; however, I’m getting different heritability estimates than those reported in the paper.

And here are the code we used in genomicSEM:
############################################################

##======MUNGE THE FILES======##
#create vector of the summary statistics files
files<-c("otowa_withNeff.txt", "Purves_anxiety_withN.txt")
hm3<-"eur_w_ld_chr/w_hm3.snplist"

#name the traits
trait.names<-c("Anxiety_Otowa", "Anxiety_Purves")

#list the sample sizes.
N=c(8242,22054)

#definte the imputation quality filter
info.filter=0.9

#define the MAF filter
maf.filter=0.01

#run munge
munge(files=files,hm3=hm3,trait.names=trait.names,N=N,info.filter=info.filter,maf.filter=maf.filter)

####run LDSC

#vector of munged summary statistics
traits<-c("Anxiety_Otowa.sumstats.gz","Anxiety_Purves.sumstats.gz")

#enter sample prevalence of .5 to reflect that all traits were munged using the sum of effective sample size
sample.prev<-c(.5,0.5)

#vector of population prevalences
population.prev<-c(0.311,0.311)

#the folder of LD scores
ld<-"eur_w_ld_chr"

#the folder of LD weights [typically the same as folder of LD scores]
wld<-"eur_w_ld_chr"

#name the traits
trait.names<-c("Anxiety_Otowa", "Anxiety_Purves")

#run LDSC
LDSCoutput<-ldsc(traits=traits,sample.prev=sample.prev,population.prev=population.prev,ld=ld,wld=wld,trait.names=trait. Names)
#######################################################
And this is the output I got: 

Liability scale results for: Anxiety_Otowa Total Liability Scale h2: 0.237 (0.0888) Total Liability Scale Genetic Covariance between Anxiety_Otowa and Anxiety_Purves: 0.2765 (0.04) Liability scale results for: Anxiety_Purves Total Liability Scale h2: 0.5846 (0.0383) Genetic Correlation Results Genetic Correlation between Anxiety_Otowa and Anxiety_Purves: 0.7429 (0.1074)

Thank you for your attention! 

Best, 
Leah

agro...@gmail.com

unread,
Jul 11, 2023, 9:30:45 PM7/11/23
to Genomic SEM Users
Hi Leah, 

Thanks for bringing this up! If you are following the most recent GenomicSEM Github wiki recommendations and using sum of effective sample size to for the liability scale conversion (which I'm guessing you are since you put in the sample prevalence as 0.5) then you are going to get higher heritability estimates than those reported in our 2021 paper. The reason being that when the 2021 paper was published we had yet to discover the downward bias in heritability estimates that occurs due to the standard liability scale conversion failing to take into account varying levels of participant sample ascertainment across cohorts contributing to the GWAS meta-analysis. We described this bias in a more recent 2023 paper: 
Grotzinger, A. D., de la Fuente, J., Privé, F., Nivard, M. G., & Tucker-Drob, E. M. (2023). Pervasive downward bias in estimates of liability-scale heritability in genome-wide association study meta-analysis: a simple solution. Biological psychiatry93(1), 29-36.

To summarize, my hunch is that you are doing things correctly and this difference is to be expected based on our updated recommendations for best practices when analyzing binary traits using the liability scale conversion. 

Best, 
  Andrew

leah

unread,
Jul 12, 2023, 11:25:03 AM7/12/23
to Genomic SEM Users
Hi Andrew, Thank you for your response. I’m trying to follow what’s recommended in genomic SEM github; however, I am not quite sure which formula to use. 

I run a meta-analysis in genomic sem for Purves et al (2019) and Otowa (2016) to try to replicate your findings. For sample size for Purves, I used the total sample size and sample prevalence (rather than effective sample size and 0.5 for sample prevalence); and for Otowa et al (2016), we calculated the Neff using the code recommended on the GitHub Page. 

otowa <- fread('raw_data/ANX/anxiety.meta.full.cc.tbl.gz', data.table = F)

#convert allele frequency column to minor allele frequency for effective sample size calculation below
otowa$MAF<-ifelse(otowa$Freq1 > .5, 1-otowa$Freq1, otowa$Freq1)

#calculate SNP-specific sum of effective sample size
otowa$Neff<-4/((2*otowa$MAF*(1-otowa$MAF))*otowa$StdErr^2)

#calculate total effective N to cap backed out Neff
Ncases<-5539
Ncontrols<- 11771
v<-Ncases/(Ncases+Ncontrols)
TotalNeff<-4*v*(1-v)*(Ncases+Ncontrols)

#cap at 1.1 of total effective N
otowa$Neff<-ifelse(otowa$Neff > 1.1*TotalNeff, 1.1*TotalNeff, otowa$Neff)

#lower limit of 0.5 of total effective N
otowa$Neff<-ifelse(otowa$Neff < 0.5*TotalNeff, 0.5*TotalNeff, otowa$Neff)

#remove Freq column now that we have created MAF column
otowa$Freq1 <- NULL
otowa$TotalN <- NULL

#output the updated anxiety file
write.table(otowa, file = "meta_analysis/genomicSEM_anxiety/otowa_withNeff.txt", sep = "\t", quote=FALSE,row.names=FALSE,col.names=TRUE)


However, for the meta analyzed anxiety trait, I am getting a much lower heritability estimate (18%) compared to what you reported in your paper (29%). 

Thank you so much for your help with this! I really appreciated it!

Best Wishes, 
Leah

agro...@gmail.com

unread,
Jul 13, 2023, 12:52:31 PM7/13/23
to Genomic SEM Users
Hi Leah, 

I would expect you to get a different heritability estimate for meta-analyzed anxiety than what is reported paper because we didn't know to use sum of effective sample size for the liability scale correction at the time we were performing those analyses. With that said, I'm a bit surprised you are getting a lower heritability estimate for the meta-analyzed trait so I'm going to throw out a couple of suggestions and things to think about: 

1. Maybe this is the most important, but when we started these analyses only the Purves UKB summary statistics were available, but they have since made available the meta-analyzed UKB + iPSYCH + ANGST (where ANGST reflects the Otowa summary stats) which I would personally use in this case since it also includes iPSYCH and uses GWAS data that is already publicly available: 

2. How are you getting the heritability estimate? I could imagine estimating it within a model where you specify the anxiety factor defined by the two ANX GWAS you are meta-analyzing or alternatively running the GWAS meta-analysis in Genomic SEM and then putting that meta-analyzed trait through LDSC. For our paper we ran the GWAS meta-analysis in Genomic SEM and and then input the meta-analyzed summary statistics through LDSC to get the heritability estimate. 

3. One tricky part when using GWAS summary statistics from Genomic SEM is what sample size to use. We typically back out the expected sample size using the formula described at the bottom of this wiki page and then treat the trait as continuous when putting it through LDSC since it reflects the GWAS summary statistics of a latent factor. The sample size you put in can easily shift heritability estimates quite a bit so I think this is one of the most likely explanations for why you specifically have a deflated estimate relative to what we report in the paper. 

4. How are you specifying the meta-analysis? This is what we write in the supplement of the paper: "Models were specified to be equivalent to a fixed-effects meta-analysis, with both variables loading on the latent variable with an unstandardized loading fixed to 1.0, and both residual variances fixed to 0." The model I ran using the userGWAS function then looked like this: 
"F1=~1*ANX1+1*ANX2
ANX1~~0*ANX1
ANX2~~0*ANX2
F1~SNP"

Hope this helps!

-Andrew
Reply all
Reply to author
Forward
0 new messages