Failed to successfully convert the symbol name

Jie Sun

Oct 2, 2025, 10:59:27 AM
to FUMA GWAS users
Dear FUMA Technical Team,

I'm encountering a significant data retention issue when processing my GWAS results through FUMA (Job ID: 664200). My input file (TXT format; https://drive.google.com/file/d/11ZATrFMBLf1bq91rlD-Il9PIjUBs_QpK/view?usp=drive_link) contains standard GWAS columns: SNP rsIDs, chromosome, position, alleles, p-values, effect sizes, and standard errors.

However, only a small fraction of the SNPs in my input file appear in the rsID column of the output snps.txt file. Most of my input SNPs seem to be filtered out or lost during processing.

I've already configured the analysis with the most permissive settings to maximize SNP retention. Could you please help me understand:

1. What processing steps might be causing such extensive SNP exclusion?
2. Are there specific quality control filters that could be removing these SNPs?
3. Could this be related to reference genome compatibility (my data is based on GRCh37/hg19)?
4. Are there common formatting issues that might cause SNPs to be dropped silently?

My goal is to obtain gene symbol annotations for all SNPs in my original dataset. I would greatly appreciate any insight into why my input SNPs do not carry through to the final snps.txt output, and how I can recover gene symbol mappings for my complete SNP set.

Thank you for your support.

Tanya Phung

Oct 3, 2025, 4:03:40 AM
to FUMA GWAS users
Hi, 

I do not have permission to access the file that you shared. Please send the input to our email address. 


Best,
Tanya

Tanya Phung

Oct 6, 2025, 3:37:45 AM
to FUMA GWAS users
Hi, 

There are 162 variants in your input file Final_MTAG_CPASSOC_leadSNP_Merged.txt. 
17 of these variants are in the file snps.txt. 

The rest are not present in the file snps.txt because they are in the MHC region. 

Some R code to confirm: 

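library(data.table)  # provides fread()
library(dplyr)       # provides filter() and %>%

# Keep variants whose position lies outside the MHC interval (position only; chromosome is not checked)
original = fread("Final_MTAG_CPASSOC_leadSNP_Merged.txt")
original_non_mhc = original %>% filter(BP<29614758 | BP>33170276)
nrow(original_non_mhc)
[1] 17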

Indeed there are 17 variants outside of the MHC region. 
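
For readers who prefer Python, here is a minimal sketch of the same check that also takes the chromosome into account. The interval boundaries are the ones used in the R code above. The column names `CHR` and `BP`, the tab delimiter, and the helper names `in_mhc` and `split_by_mhc` are assumptions for illustration and may not match the actual input file. The demo rows are made up.

```python
import csv
import io

# Boundaries copied from the R check in this thread (treated as inclusive).
MHC_CHROM = "6"
MHC_START = 29614758
MHC_END = 33170276


def in_mhc(chrom, bp):
    """Return True if a variant lies inside the MHC interval on chromosome 6."""
    chrom = str(chrom).strip().lower()
    if chrom.startswith("chr"):  # accept both "6" and "chr6"
        chrom = chrom[3:]
    return chrom == MHC_CHROM and MHC_START <= int(bp) <= MHC_END


def split_by_mhc(handle, chrom_col="CHR", bp_col="BP", delimiter="\t"):
    """Split rows of a delimited file into (inside_mhc, outside_mhc) lists.

    The column names are hypothetical defaults; adjust them to the real header.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    inside, outside = [], []
    for row in reader:
        if in_mhc(row[chrom_col], row[bp_col]):
            inside.append(row)
        else:
            outside.append(row)
    return inside, outside


if __name__ == "__main__":
    # Made-up demo rows, not taken from the real input file.
    demo = io.StringIO(
        "SNP\tCHR\tBP\n"
        "rs_demo1\t6\t30000000\n"
        "rs_demo2\t6\t29614757\n"
        "rs_demo3\t1\t30000000\n"
    )
    inside, outside = split_by_mhc(demo)
    print(len(inside), "inside MHC;", len(outside), "outside MHC")
```

To run this on a real file, pass an open file handle instead of the demo string, for example `split_by_mhc(open("Final_MTAG_CPASSOC_leadSNP_Merged.txt", newline=""))`, after confirming the header names. Unlike a position-only filter, this version counts a variant on another chromosome as outside the MHC even if its position falls within the same numeric range.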

I answered a related question previously; see: https://groups.google.com/g/fuma-gwas-users/c/n8fUJcbOSUk/m/v2rjuQqnAAAJ

Best,
Tanya