Which SNPs are mapped to genes - candidate SNPs or independent significant SNPs?

Line Hellwig

Jul 29, 2025, 4:34:08 AM
to FUMA GWAS users

Dear FUMA team,

Thank you for providing the tool and the resources, which are very helpful!

While analyzing a GWAS with the SNP2GENE tool, I noticed that not all genes listed as “NearestGene” in the candidate SNPs list (snps.txt) appear in the “Mapped Genes” list (genes.txt), even when the distance is less than 10kb (the threshold I used for positional mapping, with no additional filters for the gene mapping).

The tutorial page says that the mapped genes are the “genes which are mapped by SNPs in the SNPs table”, and that the SNPs table includes “All candidate SNPs (SNPs which are in LD of any independent lead SNPs) with annotations”. However, at least in my analysis, it seems that only the independent significant SNPs, rather than all candidate SNPs, are mapped to genes.

Maybe I misunderstood something, but it is not clear to me which set of SNPs is actually mapped to genes in the tool, and I would really appreciate some help with this question.

Thank you in advance for your help!

Best, 
Line

Tanya Phung

Jul 31, 2025, 1:21:13 AM
to FUMA GWAS users
Hi Line, 

Can you provide a specific example with specific jobID? 

From the code, positional gene mapping is done for all of the SNPs in the file snps.txt. This file lists the variants that are independent and significant, plus the variants that are in LD with them (whether or not those variants are in the input file).

Using the example input file, you can check this. If you run FUMA SNP2GENE with the example input file, one of the files that is included in the downloaded file is the file called snps.txt. This file has the information for each snp and the gene it is mapped to: 

uniqID  rsID    chr     pos     non_effect_allele       effect_allele   MAF     gwasP   r2      IndSigSNP       GenomicLocus    nearestGene     dist    func    CADD    RDB     minChrState     commonChrState  posMapFilt      eqtlMapFilt     ciMapFilt
1:7822723:C:G   rs41402249      1       7822723 G       C       0.1789  NA      0.794451        rs2797685       1       CAMTA1  0       intronic        0.342   7       5       15      1       0       0
1:7824840:C:T   rs141653094     1       7824840 C       T       0.1769  NA      0.805893        rs2797685       1       CAMTA1  0       intronic        0.592   6       5       15      1       0       0
1:7827963:T:TA  rs557622671     1       7827963 T       TA      0.1998  NA      0.853438        rs2797685       1       CAMTA1  0       UTR3    17.35   NA      5       15      1       0       0
1:7834026:A:G   rs2071987       1       7834026 A       G       0.1978  2.8e-10 0.9214  rs2797685       1       VAMP3   0       intronic        0.373   5       4       4       1       0       0
1:7837676:C:G   rs2301489       1       7837676 G       C       0.2197  NA      0.788086        rs2797685       1       VAMP3   0       intronic        0.618   7       4       4       1       0       0

cat snps.txt | wc -l
8718
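
A minimal sketch for checking how many of these SNPs were actually used for positional mapping. It assumes snps.txt is tab-delimited with a header row, and that posMapFilt is 1 for SNPs used in positional mapping and 0 otherwise (my reading of the column, so treat it as an assumption). The function name count_by_posmapfilt is made up for this example.

```shell
# Count SNPs in snps.txt by their posMapFilt value.
# Columns are located by header name rather than by position.
count_by_posmapfilt() {
    # $1 = path to snps.txt (assumed tab-delimited with a header row)
    awk -F'\t' '
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
        { n[$col["posMapFilt"]]++ }
        END { for (k in n) print k "\t" n[k] }
    ' "$1" | sort
}
```

Calling `count_by_posmapfilt snps.txt` prints one line per posMapFilt value with the number of SNPs that carry it.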

Best,
Tanya


Line Hellwig

Jul 31, 2025, 4:56:30 AM
to FUMA GWAS users

Hi Tanya,

thank you for the quick response!

One jobID for which I noticed the issue is 640696. For this GWAS I found two genomic loci, and for each locus the snps.txt file has one “NearestGene” with zero distance; however, only one of them is listed in the mapped genes.
For another jobID, 640699, I also checked whether all “NearestGenes” in the snps.txt file with distance < 10kb are in the genes.txt file, which is not the case.

I have now also run SNP2GENE with the example input file as you proposed and downloaded the results, and not all “NearestGenes” can be found in the mapped genes.

As I said, maybe I’m missing something here, so it would be great if you could take a look at this. Thank you!

Best,
Line

Tanya Phung

Aug 1, 2025, 10:32:59 AM
to FUMA GWAS users
Hi Line, 

In job 640696, I saw that there are 3 unique genes in the nearestGene column: 

awk '{print$12}' snps.txt | sort | uniq
AC018359.1
FECHP1
TENM2
nearestGene

The gene AC018359.1 is a lincRNA gene. In your submission, you selected only protein-coding genes under gene type, so this gene is not considered.

The gene FECHP1 is not in the mapped genes because its distance is greater than the 10kb maximum specified at submission:
grep FECHP1 snps.txt
3:34779669:A:C  rs144251471     3       34779669        A       C       0.05865 2.79591027007e-05       0.982208        rs76918245      1       FECHP1  134463  intergenic      2.655   6       9       15      0       0       0
3:34786139:C:T  rs79016602      3       34786139        C       T       0.05765 9.50796834837e-05       1       rs76918245      1       FECHP1  127993  intergenic      0.383   7       9       15      0       0       0
3:34794500:G:T  rs76918245      3       34794500        T       G       0.05765 9.98317847845e-06       1       rs76918245      1       FECHP1  119632  intergenic      0.065   7       9       15      0       0       0
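
The same check can be run for every gene at once. The sketch below lists nearestGene values that lie within the distance threshold but are missing from genes.txt; whatever it prints was most likely dropped by another filter, such as the gene type. It assumes both files are tab-delimited with header rows, that genes.txt has a column named symbol holding the gene symbol, and that nearestGene and dist each hold a single value per row. The function name unmapped_nearby_genes is made up for this example.

```shell
# List nearestGene values within max_dist that are absent from genes.txt.
unmapped_nearby_genes() {
    # $1 = snps.txt, $2 = genes.txt, $3 = maximum distance in bp
    awk -F'\t' -v max="$3" '
        # Header row of each file: rebuild the name-to-index lookup.
        FNR == 1 { split("", col); for (i = 1; i <= NF; i++) col[$i] = i; next }
        # First file (genes.txt): remember every mapped gene symbol.
        NR == FNR { mapped[$col["symbol"]] = 1; next }
        # Second file (snps.txt): keep nearby genes that were not mapped.
        $col["dist"] != "NA" && $col["dist"] + 0 <= max && !(($col["nearestGene"]) in mapped) {
            print $col["nearestGene"]
        }
    ' "$2" "$1" | sort -u
}
```

For a 10kb threshold, the call would be `unmapped_nearby_genes snps.txt genes.txt 10000`.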

Best,
Tanya

Line Hellwig

Aug 1, 2025, 10:40:17 AM
to FUMA GWAS users

Hi Tanya,

Thank you for looking into this.
I wasn’t aware that I had filtered to protein-coding genes, as I assumed everything related to gene mapping was specified under point 3. I should have looked into this more carefully.

Thank you for the clarification! 

Best,
Line
