Choosing the threshold of significance


Dongmeng Wang

Sep 13, 2023, 6:42:00 PM
to MEME Suite Q&A
Hello,

I'm trying to identify the motifs enriched in a set of primary sequences compared to shuffled sequences. I'm quite new to this area. Could you help with the following questions?

1. I don't think these sequences should be ranked in my analysis. Should I choose SEA rather than AME?

2. The output reports a p-value, E-value and q-value. From my understanding, the E-value looks like the Bonferroni correction, which is the p-value multiplied by the number of tested motifs. If that's correct, then the E-value threshold should normally be < 0.05, so why is the default threshold E-value < 10? I don't understand the reason for setting it to 10.

3. As for the q-value, is it generated directly from the p-value of each motif? Say I have 84 motifs to test; will the q-values then be calculated from these 84 p-values? If so, when the number of motifs is small, is it still suitable to use q-value < 0.05 as the threshold?

Thank you so much,
All the best,
Meng

cegrant

Sep 13, 2023, 7:34:22 PM
to MEME Suite Q&A
Hi Meng,

1. Yes, SEA would be the more appropriate tool.

2. E-value, q-value, and Bonferroni correction are all forms of multiple testing correction. In this case the E-value is the Bonferroni correction for the number of motifs tested (other applications may use slightly different definitions for the 'E-value'). An E-value/q-value of 0.05 is the conventional threshold for statistical significance, but that is only a convention. We chose a higher default threshold because the tools in the MEME Suite are often used in an exploratory fashion, so it's often helpful to include results that are only weakly significant. The user can then make adjustments to the data provided and to parameters like the background model to see if more significant results can be obtained. If you find the inclusion of weakly significant results distracting, you can set the E-value threshold in the "Advanced Options" section.
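To make the arithmetic concrete, here is a minimal Python sketch of the relationship described above, assuming the E-value is simply the p-value multiplied by the number of motifs tested. The function names and the example p-values are hypothetical and are not taken from the MEME Suite code.

```python
def e_values(p_values):
    """Bonferroni-style E-values: each p-value times the number of tests.

    Assumption: E = p * (number of motifs tested), as described in the reply.
    """
    n_tests = len(p_values)
    return [p * n_tests for p in p_values]


def p_cutoff_for_e_threshold(e_threshold, n_tests):
    """Per-motif p-value cutoff implied by an E-value threshold."""
    return e_threshold / n_tests


if __name__ == "__main__":
    # Hypothetical p-values for three motifs (illustration only).
    example_p = [0.0005, 0.01, 0.2]
    print(e_values(example_p))

    # With 84 motifs, compare the cutoffs implied by E < 10 and E < 0.05.
    print(p_cutoff_for_e_threshold(10, 84))
    print(p_cutoff_for_e_threshold(0.05, 84))
```

With 84 motifs, an E-value threshold of 10 admits any motif with a p-value below roughly 0.12, which is why the default output includes weakly significant results.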

3. Yes, the q-values are estimated from the distribution of observed p-values. In particular, the fraction of p-values corresponding to a true null result (pi_0) is estimated. If there aren't enough p-values available to make that estimate, pi_0 is assumed to be 1.0. You can continue to use q-value < 0.05 as a threshold.
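As a rough illustration of the idea, the Python sketch below computes q-values from a list of p-values using a fixed pi_0. It assumes a standard step-up formulation (with pi_0 = 1.0 this matches Benjamini-Hochberg adjusted p-values); the exact procedure the MEME Suite uses, including how it estimates pi_0, may differ. The function name and example values are hypothetical.

```python
def q_values(p_values, pi0=1.0):
    """Estimate q-values from p-values for a fixed null fraction pi0.

    Assumption: q for the i-th smallest p-value is the minimum over all
    ranks j >= i of pi0 * m * p_(j) / j, where m is the number of tests.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the running minimum
    # enforces that q-values never decrease as p-values increase.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        estimate = pi0 * m * p_values[idx] / rank
        running_min = min(running_min, estimate)
        q[idx] = running_min
    return q


if __name__ == "__main__":
    # Hypothetical p-values for four motifs (illustration only).
    print(q_values([0.01, 0.04, 0.03, 0.5]))
```

Setting pi_0 to 1.0 is the conservative choice, since it can only make the q-values larger than they would be with a smaller estimate of pi_0.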

Dongmeng Wang

Sep 14, 2023, 7:37:56 AM
to MEME Suite Q&A
Hi,

Thank you so much for your quick response. It's really clear now. 

All the best,
Meng
