Conditional filtering in gene mapping

61 views
Skip to first unread message

Khoa Ho

unread,
Jun 24, 2025, 12:30:43 PMJun 24
to FUMA GWAS users
Hi FUMA team,

I'm currently running gene mapping of my GWAS sumstat on FUMA and notice that for eQTL, whenever I apply both CADD filter and RDB filter, it will map using SNPs satisfy either condition (CADD or RDB). I wonder if this is also true for chromatin interaction and positional mapping?

Thank you!!

Best,
Khoa

Tanya Phung

unread,
Jun 25, 2025, 6:04:54 AMJun 25
to FUMA GWAS users
Hi Khoa, 

Could you please give some examples to show: "it will map using SNPs satisfy either condition (CADD or RDB)"? 

If you are curious, you can check the codes itself: 
For positional mapping filtering: https://github.com/vufuma/FUMA-webapp/blob/master/scripts/genemap/geneMap.R#L118-L123

Best,
Tanya

Khoa Ho

unread,
Jul 12, 2025, 11:48:31 AMJul 12
to FUMA GWAS users
Hi tanya,

is it possible for a SNP in snps.txt file to still get eqtlMapFilt = 1 even though it did not pass additional filtering?
I set RDB threshold to be 3b and I saw that SNPs did not pass the threshold still get eqtlMapFilt = 1. Is there any reason to this? Because the code snippet in github should work without a problem.

Best,
Khoa

Tanya Phung

unread,
Jul 16, 2025, 8:57:18 AMJul 16
to FUMA GWAS users
Hi Khoa, 

Could you please provide a specific job ID with specific information on the SNP that you are seeing the problem? 

Best,
Tanya

Khoa Ho

unread,
Jul 16, 2025, 12:46:17 PMJul 16
to FUMA GWAS users
Hi Tanya,
It's nice to hear from you. You could take these two job ID:
640191: with filtering in eQTL
640216: no filtering in eQTL

There are two problems: 
  • First eQTL mapping using SNPs not passing threshold.
    • The thing is when I apply filtering in eQTL, there are few genes mapped solely by eQTL but all the SNPs it uses for mapping didn't pass the threshold, say RDB score.
  • Second, you would want to take a look at rs147643737 on chromosome 1 for a case where a SNP satistifes both RDB threshold (3b) and eQTL FDR but still ended up filtered out.
I checked the code you sent and it seems to me that this should work normally as you only use SNPs passing specified threshold for assigning eqtlMapFilt = 1. I can't wrap my head around this.
What do you think Tanya? Let me know!


Best,
Khoa
Kho

Khoa Ho

unread,
Jul 16, 2025, 12:49:56 PMJul 16
to FUMA GWAS users
For the first case you might want to see rs375256 on chromsome 6, where the gene LEMD2 was mapped only due to this SNP. But the SNP didn't pass eQTL filtering at all.

Best,
Khoa

Tanya Phung

unread,
Jul 17, 2025, 6:58:50 AMJul 17
to FUMA GWAS users
Hi Khoa, 

Thank you for the information. 

It turns out that this is a bug in the code base. 

I can confirm that there is an eQTL mapping to this gene
Screenshot 2025-07-17 125248.png
that SHOULD not be because its RDB score is 5 which is greater than your specified threshold of 3a. 

where eqtl$uniqID[annot$RDB<=eqtlMapRDBth] works properly ONLY when the number of rows are the same between the 2 dataframes. When they are not the same, they result in erroneous results.

As a test, I reproduced this: 
eqtl <- eqtl[eqtl$uniqID %in% eqtl$uniqID[annot$RDB<=eqtlMapRDBth],] write.table(eqtl, "original_code.txt", quote=FALSE, row.names=FALSE, sep = "\t")

grep 32975869 original_code.txt | grep LEMD2 6:32975869:A:G eQTLGen eQTLGen_cis_eQTLs ENSG00000161904 A 1.3509e-07 -5.2718 0.00045107129432402 G + 6 32975869 LEMD2 1 

Fixed code: 

merged_df <- merge(eqtl, annot, by = "uniqID")
filtered_ids <- merged_df$uniqID[merged_df$RDB <= eqtlMapRDBth]
eqtl <- eqtl[eqtl$uniqID %in% filtered_ids, ]
write.table(eqtl, "corrected_code.txt", quote=FALSE, row.names=FALSE, sep = "\t")

grep 32975869 corrected_code.txt | grep LEMD2 #return nothing 

So to answer your question: 
the issue you observed is due to bugs in the code. 
To properly fix this, I will need to identify all the places where this is coded incorrectly and fix them, and push them to production. Be aware that it is the summer holiday at the moment so there is going to be a delay. I will let you know once it is fixed. 

Thank you for reporting the issue.

Best,
Tanya

Khoa Ho

unread,
Aug 3, 2025, 3:16:16 PMAug 3
to FUMA GWAS users
Hi Tanya,

I recently openned a pull request to fix the problem mentioned here. I hope that you could take a look at it. In summary, it fixes the RDB filtering problem in eQTL. No other places in the geneMap.R file has the same problem. While I notice an inconsistency between ciMapFilt in ci.txt and mapped genes in genes.txt, the final gene list in genes.txt is still valid.

Furthermore, I used code chunks provided in allSNPs and gwas_file dir to generate all.txt.gz for redrawing circos plots for the new list of eQTL. And I saw that the inconsistent gene (HLA-DPB1) was not included. In chromatin interaction. So I think the inconsitency is not really a bug I'd say?

Best,
Khoa

Khoa Ho

unread,
Aug 3, 2025, 3:20:02 PMAug 3
to FUMA GWAS users
I want to provide plots just in case you want to take a look at it.

Before:
circos_chr6.before.png

After fixing:
circos_chr6.png
Best,
Khoa

Tanya Phung

unread,
Sep 16, 2025, 3:35:04 AM (12 days ago) Sep 16
to FUMA GWAS users
Hi Khoa, 

First, thank you for your patience. 

Second, thank you for providing the fix. I approved it and merged it to production. Can you test it and let me know if the fix is applied properly? 

As for your other reported issue with the ci gene list, I will need to create a different ticket to work on it at a different time, but feel free to fix it if you have time. 

Best,
Tanya

Reply all
Reply to author
Forward
0 new messages