Multiple hypothesis correction for branch-site test for genome-wide screens

Urjaswita Yadav

Apr 11, 2018, 3:22:41 PM

to PAML discussion group

Hi PAML users,

I am struggling with multiple hypothesis correction in my analysis. I used the branch-site test to identify genes under positive selection in my lineage of interest, genome-wide (~3000 genes). I corrected the LRT p-values for multiple comparisons with the Benjamini-Hochberg method, and every gene has a high FDR (FDR > 0.2). This means that no gene is called as being under positive selection once I correct for multiple comparisons!

I think this is because my p-value histogram is highly skewed towards 1, with most genes having an LRT p-value close to 1. Please see my p-value histogram below:

Now, my questions are:

1. How should I handle FDR correction when I run the branch-site test on thousands of genes?

2. What is the best method for this (Benjamini-Hochberg or something else)?

3. The p-value histogram is atypical for multiple hypothesis correction because there is no peak at low p-values. What does this mean? It does not make sense that there are no genes under positive selection in my genome.
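For readers following along, the correction described in this post can be sketched roughly as below. This is only an illustration under stated assumptions, not the poster's actual pipeline: the function names `lrt_pvalue` and `bh_adjust` are hypothetical, and the LRT statistic is compared against a chi-square distribution with 1 degree of freedom. Note that a gene whose alternative model fits no better than the null gets a statistic of 0 and a p-value of 1, which is one way a histogram can pile up near 1.

```python
from math import erfc, sqrt


def lrt_pvalue(lnl_null, lnl_alt):
    """P-value of the LRT statistic 2*(lnL_alt - lnL_null).

    Assumption: the statistic is compared with a chi-square
    distribution with 1 degree of freedom, whose survival
    function is erfc(sqrt(x / 2)).
    """
    # Clamp tiny negative values caused by optimisation noise to 0.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return erfc(sqrt(stat / 2.0))


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

With roughly 3000 tests, the smallest p-value has to be below about 0.2 / 3000 for even the top-ranked gene to reach an adjusted value under 0.2, so a histogram with few small p-values will naturally yield no calls.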

Any advice or insight is highly appreciated. Thank you for your help!

- Urja

Ziheng

Jul 31, 2018, 1:31:16 PM

to PAML discussion group

I also think that "It does not make sense that there are no genes under positive selection in my genome", so I think that means there is no need to do the multiple testing correction. The correction is for testing the null hypothesis that no gene is under positive selection.

Instead, you can rank your genes and look at what the genes with the strongest evidence for positive selection do.
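The ranking suggested above might look something like the following sketch. It assumes that the strength of evidence is measured by the LRT statistic 2*(lnL_alt - lnL_null); the thread does not specify a ranking criterion, so that choice is an assumption. The function name `rank_by_lrt`, the gene names, and the log-likelihood values are all made up for illustration.

```python
def rank_by_lrt(results):
    """Rank genes by LRT statistic, strongest evidence first.

    results maps a gene name to (lnL_null, lnL_alt). The statistic
    2 * (lnL_alt - lnL_null) is used as the ranking score, which is
    an assumption of this sketch.
    """
    scored = [
        (gene, max(0.0, 2.0 * (lnl_alt - lnl_null)))
        for gene, (lnl_null, lnl_alt) in results.items()
    ]
    # Largest statistic first; sorted() keeps input order for ties.
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical log-likelihoods, not real data.
example = {
    "geneA": (-2310.4, -2304.1),
    "geneB": (-1875.0, -1875.0),
    "geneC": (-3120.7, -3118.9),
}
ranked = rank_by_lrt(example)
for gene, stat in ranked:
    print(gene, round(stat, 2))
```

The genes at the top of such a list can then be inspected by function, without committing to a significance threshold.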