FUMA code for obtaining LeadSNPs


Carolina Makowski

May 6, 2020, 12:20:44 PM
to FUMA GWAS users
Hi Kyoko,
Firstly, thank you for creating and maintaining such an amazing web-based tool! 

I had a question about the code for obtaining LeadSNPs in FUMA.
I have several replication analyses I would like to run on many phenotypes. For these, I only need the number of LeadSNPs for each phenotype, without the in-depth functional follow-up that FUMA provides for my main analyses.

I have been using plink to try to replicate the number of LeadSNPs that FUMA provides, but my plink results always give me a slightly larger number of hits than FUMA does.
Specifically, I have been using the default FUMA parameters, defining independent SNPs at r2=0.6 and lead SNPs at r2=0.1, with 250kb LD blocks.
To replicate this in plink, I tried a two-stage process: an initial round of clumping with r2=0.6, then a second round with r2=0.1 on the SNPs obtained from stage 1. As mentioned, this gives me a different number of hits compared to FUMA.
I suspect this may be because of the difference in the reference genome but would welcome your thoughts on this as well.

Is there a way I can obtain the code for FUMA's process of obtaining LeadSNPs, based on the parameters I used for my FUMA run? 

Thanks in advance for your help.

Carolina Makowski

Kyoko Watanabe

May 10, 2020, 2:52:05 PM
to FUMA GWAS users
Hi Carolina,

I pre-computed pair-wise LD for 1KG Phase3 to speed up the FUMA process.
I used PLINK with the following command:
plink --bfile $pop/$pop.chr$i --maf 1e-4 \
	--r2 --ld-window-r2 0.05 \
	--ld-window 100000 \
	--out $pop/$pop.chr$
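
For anyone who wants to work with output like this outside FUMA, here is a minimal Python sketch that loads pairwise r2 values into a lookup table. It assumes the default PLINK 1.9 `.ld` table layout (whitespace-delimited, with header columns including SNP_A, SNP_B and R2); the function name `read_plink_ld` and the rs IDs are hypothetical, and the files FUMA actually stores may be laid out differently.

```python
import io
from typing import Dict, FrozenSet, TextIO


def read_plink_ld(handle: TextIO) -> Dict[FrozenSet[str], float]:
    """Read a PLINK-style .ld table into {frozenset({snp_a, snp_b}): r2}.

    Assumes a whitespace-delimited table whose header names include
    SNP_A, SNP_B and R2 (the default PLINK 1.9 --r2 table layout).
    """
    header = handle.readline().split()
    col = {name: i for i, name in enumerate(header)}
    ld: Dict[FrozenSet[str], float] = {}
    for line in handle:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        pair = frozenset((fields[col["SNP_A"]], fields[col["SNP_B"]]))
        ld[pair] = float(fields[col["R2"]])
    return ld


# Toy input with made-up SNP IDs and values, for illustration only.
example = io.StringIO(
    " CHR_A  BP_A  SNP_A  CHR_B  BP_B  SNP_B    R2\n"
    "     1  1000    rs1      1  2000    rs2   0.8\n"
    "     1  1000    rs1      1  5000    rs3   0.3\n"
)
ld_table = read_plink_ld(example)
print(ld_table[frozenset(("rs2", "rs1"))])  # order of the pair does not matter
```

Pairs that are absent from the table (below the --ld-window-r2 cutoff or outside the window) simply have no entry, so downstream code has to decide how to treat them; treating them as r2 = 0 is one option.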

So the main difference is that the --r2 flag by default computes the square of the raw inter-variant allele count correlation.
When you use --clump, on the other hand, it uses the --r2 dprime calculation, which is based on maximum likelihood haplotype frequency estimates.
I never really compared how different they are but I recently discovered that it does make a difference when you clump with plink compared to FUMA.
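
To get a feel for how the two estimates can differ, here is a rough Python sketch that computes both for a pair of SNPs coded as 0/1/2 allele counts. The function names are hypothetical, and the second function uses a plain EM iteration as a stand-in for a maximum likelihood haplotype frequency estimate; it is not PLINK's own code, assumes random mating and no missing genotypes, and can stall in degenerate cases (for example, when every sample is a double heterozygote).

```python
def r2_allele_count(g1, g2):
    """Squared Pearson correlation of 0/1/2 allele counts."""
    n = len(g1)
    m1, m2 = sum(g1) / n, sum(g2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    v1 = sum((a - m1) ** 2 for a in g1)
    v2 = sum((b - m2) ** 2 for b in g2)
    return cov * cov / (v1 * v2)


def r2_haplotype_em(g1, g2, n_iter=200):
    """r2 from haplotype frequencies estimated with a simple EM."""
    # n[a][b] = number of samples with a copies of allele A and b copies of B
    n = [[0] * 3 for _ in range(3)]
    for a, b in zip(g1, g2):
        n[a][b] += 1
    total = 2 * sum(sum(row) for row in n)  # number of haplotypes
    # Haplotype counts that are unambiguous given the genotypes.
    c_AB = 2 * n[2][2] + n[2][1] + n[1][2]
    c_Ab = 2 * n[2][0] + n[2][1] + n[1][0]
    c_aB = 2 * n[0][2] + n[1][2] + n[0][1]
    c_ab = 2 * n[0][0] + n[1][0] + n[0][1]
    dh = n[1][1]  # double heterozygotes: phase is AB/ab or Ab/aB
    p_A = (c_AB + c_Ab + dh) / total
    p_B = (c_AB + c_aB + dh) / total
    # Start from linkage equilibrium.
    f_AB, f_Ab = p_A * p_B, p_A * (1 - p_B)
    f_aB, f_ab = (1 - p_A) * p_B, (1 - p_A) * (1 - p_B)
    for _ in range(n_iter):
        cis, trans = f_AB * f_ab, f_Ab * f_aB
        # Expected share of double heterozygotes carrying AB/ab.
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        f_AB = (c_AB + dh * w) / total
        f_ab = (c_ab + dh * w) / total
        f_Ab = (c_Ab + dh * (1 - w)) / total
        f_aB = (c_aB + dh * (1 - w)) / total
    d = f_AB - p_A * p_B
    return d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))


# Made-up genotypes for two SNPs in ten samples.
snp1 = [0, 1, 2, 1, 0, 2, 1, 1, 0, 2]
snp2 = [0, 1, 2, 0, 0, 2, 1, 2, 1, 1]
print(r2_allele_count(snp1, snp2), r2_haplotype_em(snp1, snp2))
```

The two numbers agree when the SNPs are in perfect LD or fully independent, and can drift apart in between, which is enough to move a pair across a fixed r2 threshold.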

One thing I noticed is that the 250kb window is not used to define LD blocks; it is just a threshold for merging LD blocks. In both steps of clumping in FUMA I use the same precomputed LD, so, as shown above, I did not limit by distance but by SNP count (maximum 100000).
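
The two-step logic described in this thread can be sketched in Python roughly as follows. This is a simplified illustration, not the code in getLD.py: the function names `clump` and `merge_blocks`, the rs IDs, the P-values and the positions are all hypothetical, pairs missing from the LD table are assumed to have r2 = 0, and the 5e-8 significance cutoff is an assumption for the toy example.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

LD = Dict[FrozenSet[str], float]


def clump(snps: Iterable[str], pvals: Dict[str, float],
          ld: LD, r2_max: float) -> List[str]:
    """Greedy clumping: visit SNPs by ascending P-value and keep a SNP only
    if its r2 with every SNP kept so far is below r2_max."""
    kept: List[str] = []
    for snp in sorted(snps, key=lambda s: pvals[s]):
        if all(ld.get(frozenset((snp, k)), 0.0) < r2_max for k in kept):
            kept.append(snp)
    return kept


def merge_blocks(blocks: List[Tuple[int, int]],
                 max_gap: int = 250_000) -> List[Tuple[int, int]]:
    """Merge (start, end) blocks on one chromosome when the gap between
    neighbouring blocks is smaller than max_gap base pairs."""
    merged: List[Tuple[int, int]] = []
    for start, end in sorted(blocks):
        if merged and start - merged[-1][1] < max_gap:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Toy data (made up for illustration).
pvals = {"rs1": 1e-12, "rs2": 3e-10, "rs3": 2e-9, "rs4": 4e-9}
ld = {
    frozenset(("rs1", "rs2")): 0.8,
    frozenset(("rs1", "rs3")): 0.3,
    frozenset(("rs2", "rs3")): 0.25,
}

significant = [s for s, p in pvals.items() if p < 5e-8]
independent = clump(significant, pvals, ld, r2_max=0.6)  # step 1
lead = clump(independent, pvals, ld, r2_max=0.1)         # step 2, same LD table
print(independent)  # ['rs1', 'rs3', 'rs4']
print(lead)         # ['rs1', 'rs4']

# The 250kb value only comes into play when merging neighbouring blocks.
print(merge_blocks([(1_000_000, 1_100_000), (1_300_000, 1_350_000),
                    (2_000_000, 2_050_000)]))
```

Both clumping steps read from the same LD table, and the distance threshold is applied only afterwards, when nearby blocks are merged.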

The full script is available on GitHub.
The script doing the clumping is storage/scripts/getLD.py
(I'm sorry, but this script is a bit messy and might be hard to read...)


Let me know if you have any further questions.

Best,
Kyoko

Carolina Makowski

May 18, 2020, 6:35:52 PM
to FUMA GWAS users
Hi Kyoko,
Thanks for your response. That's very helpful.

I do have a question, though: these pre-computed LD values are for the 1KG Phase 3 data only, but I used the UKB Phase 2b reference panel in my FUMA analysis. Are the pre-computed LD values for UKB available?

Thanks a lot 
Carolina

Kyoko Watanabe

May 30, 2020, 5:13:53 PM
to FUMA GWAS users
Hi Carolina,

I am sorry for the slow response.
I cannot share the UKB reference panel at the moment: access to the UKB is restricted per application, and under our application I can share GWAS sumstats but not the actual genotype data.
I am sorry about that and hope you understand.

Best,
Kyoko

Andreas Schmidt

May 18, 2021, 7:16:07 PM
to FUMA GWAS users
Hello,
I found this old post because I am about to clump some SNPs myself (as they do not reach the significance threshold needed for FUMA clumping).
If I understand it correctly, PLINK and FUMA clump in different ways (FUMA by r-squared and PLINK by D-prime)?
If this is true, I would get lead SNPs from PLINK with one method and independent significant SNPs from FUMA with another method... This does not seem to be a good strategy. Do you have any suggestions on how to do it?

Best,
Andreas

Tanmay Ahmed

Jun 13, 2025, 2:26:38 PM
to FUMA GWAS users
Hi Kyoko,
The link for downloading the pre-computed LD files is not working. Could you please share the updated link?
Best,
Saiful
