Question on FDR testing with logFC threshold

BigOmics Analytics Team

Mar 21, 2023, 9:22:33 AM

to Omics Playground

[Morgane M, 21.3.2023]

Hi,

I have a question about how the FDR is calculated for differential gene expression. I would like significance to be tested against the alternative hypothesis |logFC| >= 0.5, rather than the conventional alternative logFC != 0. This follows the methodology described here:

"For well-powered experiments, however, a statistical test against the conventional null hypothesis of zero LFC may report genes with statistically significant changes that are so weak in effect strength that they could be considered irrelevant or distracting. A common procedure is to disregard genes whose estimated LFC β_ir is below some threshold, |β_ir| ≤ θ. However, this approach loses the benefit of an easily interpretable FDR, as the reported P value and adjusted P value still correspond to the test of zero LFC. It is therefore desirable to include the threshold in the statistical testing procedure directly, i.e., not to filter post hoc on a reported fold-change estimate, but rather to evaluate statistically directly whether there is sufficient evidence that the LFC is above the chosen threshold."
From Love et al. (2014), "Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2", Genome Biology.

On the platform, I can see that changing the logFC threshold does not trigger a recalculation of the FDR. Could you provide a way to recompute the FDR for a chosen threshold?

Thank you in advance.

Best regards,

Morgane

Ivo Kwee

May 22, 2023, 7:49:53 AM

to Omics Playground

Hi Morgane,

I think you are referring to the paper "Testing significance relative to a fold-change threshold is a TREAT" (McCarthy and Smyth, 2009). While the methodology may be theoretically correct, in practice I have rarely seen people use this kind of statistic. I think the method is somewhat impractical, partly because the choice of threshold is rather arbitrary and partly because, as you said, the p-values have to be recomputed for each threshold.

The conventional zero-LFC null is also easier to test using permutation. While the limma t-test may have a closed-form formula for deriving the p-values under a threshold, this may not be trivial for other statistical methods, so it would not be easy to compare methods that use different null hypotheses.

I don't expect that we will include this test in our platform until I am more convinced; we tend to stick with the most widely accepted methods. If you want to try it, the TREAT method is available in the limma R package.
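To make the idea concrete, here is a minimal Python sketch of a thresholded test followed by FDR correction. It is only an illustration under stated assumptions, not the limma or DESeq2 implementation: it uses a normal (Wald-type) approximation instead of limma's moderated t-statistic, it evaluates the p-value at the boundary of the null |logFC| <= threshold, the function names `threshold_pvalue` and `benjamini_hochberg` are hypothetical, and the logFC and standard-error values are made up.

```python
import math


def upper_tail_normal(z):
    """Return P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def threshold_pvalue(logfc, se, lfc_threshold=0.0):
    """p-value for the null |true logFC| <= lfc_threshold.

    Assumption: the estimate is approximately normal with standard
    error `se`. With lfc_threshold = 0 this reduces to the usual
    two-sided test of zero logFC.
    """
    a = abs(logfc)
    return (upper_tail_normal((a - lfc_threshold) / se)
            + upper_tail_normal((a + lfc_threshold) / se))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up estimates for five genes (illustrative values only).
logfc = [2.0, 0.6, -0.55, 0.1, -1.5]
se = [0.2, 0.05, 0.3, 0.05, 0.4]

p_zero = [threshold_pvalue(b, s, 0.0) for b, s in zip(logfc, se)]
p_thr = [threshold_pvalue(b, s, 0.5) for b, s in zip(logfc, se)]

# The FDR is recomputed from the thresholded p-values,
# not obtained by filtering the zero-null results post hoc.
fdr_zero = benjamini_hochberg(p_zero)
fdr_thr = benjamini_hochberg(p_thr)

for b, q0, q1 in zip(logfc, fdr_zero, fdr_thr):
    print(f"logFC={b:+.2f}  FDR(zero null)={q0:.3g}  FDR(threshold 0.5)={q1:.3g}")
```

In this sketch, a gene with a small but precisely estimated logFC can be significant under the zero null yet not under the thresholded null, which is why the FDR has to be recomputed for every threshold.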

Best

Ivo Kwee
