Hi Pronoy,
Why isn't the most significant SNP within an LD block considered as the lead SNP?
This is because the variants shown in gray in the annotPlot are those with non-significant p-values and/or those that are absent from the reference panel, even if their p-values are significant. A variant that is not in the reference panel cannot be selected as a lead SNP, however small its p-value.
I don't have access to your input summary statistics, so I can't check for certain. However, from some intermediate files that I can access on the FUMA server, I see that for chr12 (the chromosome in the plot you shared), these are the top 5 SNPs (chromosome, position, p-value):
1: 12 79762579 1.110223e-15
2: 12 8548485 1.332268e-14
3: 12 8548486 2.597922e-14
4: 12 11166650 5.107026e-14
5: 12 50745894 5.773160e-14
You selected the 1KG/Phase3 ALL population. You can confirm that these variants do not exist in the FUMA database by checking the file "1000 genomes ALL variants" from https://fuma.ctglab.nl/downloadPage.
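If it helps, here is a minimal Python sketch of that check. It assumes the downloaded file is tab-delimited with the chromosome in the first column and the position in the second; I have not verified the actual layout, so adjust chr_col, pos_col and sep to match the file. The function name missing_from_reference is just for illustration.

```python
def missing_from_reference(lines, variants, chr_col=0, pos_col=1, sep="\t"):
    """Return the (chr, pos) pairs from `variants` that never appear in `lines`.

    `lines` is any iterable of text lines, e.g. an open file handle.
    The column positions and separator are assumptions about the file layout.
    """
    wanted = {(str(c), int(p)) for c, p in variants}
    found = set()
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        try:
            key = (fields[chr_col], int(fields[pos_col]))
        except (IndexError, ValueError):
            continue  # skip header lines and malformed rows
        if key in wanted:
            found.add(key)
    return sorted(wanted - found)


# The five chr12 positions listed above.
top5 = [
    ("12", 79762579),
    ("12", 8548485),
    ("12", 8548486),
    ("12", 11166650),
    ("12", 50745894),
]
```

To run it on the downloaded file, open it in text mode (for a gzipped file, gzip.open(path, "rt")) and pass the handle as the first argument together with top5. Any variant it returns is missing from the reference panel under the layout assumed above.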
If possible, can you also let me know the factors that are taken into account while choosing the lead SNP within each LD block?
Some additional information:
snps.txt: this lists the independent significant variants together with all variants that are in LD with them, whether or not those LD variants are present in the input file.
IndSigSNPs.txt: this is essentially a summary of snps.txt. For each independent significant variant, it tabulates how many variants are in LD with it (nSNPs) and, of those nSNPs, how many are also found in the input file (nGWASSNPs).
leadSNPs: for each independent significant variant (from IndSigSNPs.txt), FUMA checks whether any of the other variants in IndSigSNPs.txt are in moderate LD with it (the default r2 threshold is 0.1). If there are, those variants in moderate LD with it are not considered lead SNPs.
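To make the leadSNPs step concrete, here is a rough Python sketch of the idea. It is not FUMA's actual code. It assumes that variants are processed from the smallest p-value upward, that "in moderate LD" means r2 at or above the threshold, and that pairs with no r2 value are treated as not in LD. The function name and the rsA/rsB/rsC IDs and values are made up for illustration.

```python
def select_lead_snps(ind_sig, r2, threshold=0.1):
    """Pick lead SNPs from independent significant SNPs.

    ind_sig: list of (snp_id, p_value) tuples.
    r2: dict mapping frozenset({snp_a, snp_b}) to their pairwise r2.
    A SNP becomes a lead SNP only if its r2 with every lead SNP chosen
    so far is below `threshold` (missing pairs count as r2 = 0).
    """
    leads = []
    # Assumption: the most significant variant is considered first.
    for snp, _p in sorted(ind_sig, key=lambda item: item[1]):
        if all(r2.get(frozenset((snp, lead)), 0.0) < threshold for lead in leads):
            leads.append(snp)
    return leads


# Hypothetical example data.
ind_sig = [("rsA", 1e-15), ("rsB", 1e-12), ("rsC", 1e-10)]
r2 = {
    frozenset(("rsA", "rsB")): 0.4,   # rsB is in moderate LD with rsA
    frozenset(("rsA", "rsC")): 0.02,  # rsC is independent of rsA
    frozenset(("rsB", "rsC")): 0.3,
}
leads = select_lead_snps(ind_sig, r2)
```

In this toy example rsA and rsC end up as lead SNPs, while rsB is dropped because its r2 with rsA is above 0.1.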
Best,
Tanya