FDR=0.1 is significant in DESeq2?


Farbod Emami

Jan 24, 2016, 4:27:28 AM
to trinityrnaseq-users
Dear Mark, Hi.
Is it true that FDR = 0.1 (instead of 0.01, 0.05 or 0.001) is the threshold for significantly up-regulated transcripts in the new DESeq2 package?
Thanks 

Mark Chapman

Jan 25, 2016, 2:27:04 AM
to Farbod Emami, trinityrnaseq-users
Hi Farbod,
P values and associated FDR-controlled P values are ways of saying how certain you are that a result is real. Hence at an FDR of 0.1 you're 90% sure it's 'real', given the multiple testing associated with something like RNA-seq. So an FDR of 0.1 in DESeq2 is the same as 0.1 in any other statistical test.
Best, Mark

--
You received this message because you are subscribed to the Google Groups "trinityrnaseq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to trinityrnaseq-u...@googlegroups.com.
To post to this group, send email to trinityrn...@googlegroups.com.
Visit this group at https://groups.google.com/group/trinityrnaseq-users.
For more options, visit https://groups.google.com/d/optout.



--
Dr. Mark A. Chapman
+44 (0)2380 594396
------------------------------------
Centre for Biological Sciences
University of Southampton
Life Sciences Building 85
Highfield Campus
Southampton
SO17 1BJ

biocoding

Jan 25, 2016, 9:09:21 AM
to trinityrnaseq-users, farbo...@gmail.com
The question is probably about what cutoff DESeq2 uses as default for FDR. From the manual (p. 10), this is indeed 0.1.

This can be set by the user to another cutoff (p. 10), and there is always a discussion to be had about what reasonable values for the FDR are. To some degree, it depends on what you will do with the results. If it is just an exploratory analysis that will be dissected with more filters later on, e.g. pathways or PCR corroboration, it is possible to be more lenient in this step. If those are missing and you are just trawling data for anything at all that appears to stand out, there is often a need to apply great intellectual restraint.

https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.pdf
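
To see how much the choice of cutoff matters, here is a minimal Python sketch (not DESeq2 itself) that applies the Benjamini-Hochberg adjustment, a standard FDR-controlling procedure, to a handful of p-values and counts how many genes pass at different cutoffs. The p-values are made up and the function name `bh_adjust` is an assumption for illustration only.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for eight genes (invented for illustration).
raw_p = [0.0001, 0.004, 0.012, 0.03, 0.04, 0.2, 0.5, 0.9]
padj = bh_adjust(raw_p)

# Count the genes whose adjusted p-value falls below each FDR cutoff.
for alpha in (0.1, 0.05, 0.01):
    n_called = sum(p < alpha for p in padj)
    print(f"FDR cutoff {alpha}: {n_called} genes called significant")
```

With these invented numbers the list shrinks as the cutoff gets stricter, which is the trade-off discussed above: a lenient cutoff gives a longer list that is expected to contain a larger share of false positives.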

There are also a ton of misconceptions about p values, so we should try to avoid these kinds of simplifications. A p value is the probability of observing results at least as extreme as those seen, given that the null hypothesis is true. It is a conditional probability; it isn't the probability of the null hypothesis, and 1 - p certainly isn't the probability that the results are real, or even "real" in an informal sense, since a study, no matter how good its p values are, can be affected by biases, errors, excess data-analysis flexibility by the user, etc.

/Biocoding






Farbod Emami

Jan 25, 2016, 10:10:13 AM
to trinityrnaseq-users, farbo...@gmail.com
Dear Biocoding, Hi

Ultimately, there must be some universally accepted threshold for calling biologically significant up-regulation in the context of FDR and the DESeq2 algorithm!

For example, p-value = 0.05 has been accepted as "significant" in ordinary t-test comparisons.

My question is this: I have read about FDR = 0.001, 0.01 or 0.05 being used to call significant up-regulation when comparing the expression of a transcript or gene between case and control. What about FDR = 0.1?

You asked about the purpose of using FDR and the goal of the research. Imagine that you want to check whether the expression of the "mustache" gene is significantly up-regulated in human males compared to females. Which FDR and fold change would you use?


Thanks

biocoding

Jan 27, 2016, 8:19:31 AM
to trinityrnaseq-users, farbo...@gmail.com
Unfortunately, there is no single answer that will be the best for all possible analyses.

Typically, people choose FDR cutoffs between 0.0001 and 0.1 and fold-change cutoffs between 150% and 400% (i.e. log2 FC between about 0.58 and 2), but you can probably find some who have picked values outside this range.

For instance, it is popular to require FDR < 0.05 and (log2 FC > 1 or log2 FC < -1), i.e. at least a doubling or halving of gene expression.
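
As a rough illustration of that kind of combined filter, the Python sketch below converts fold-change ratios to the log2 scale and keeps genes that pass both an FDR cutoff and an absolute log2 fold-change cutoff. The gene names, values, and the helper `passes` are hypothetical; only the cutoffs (FDR < 0.05, |log2 FC| > 1) come from the text above.

```python
import math

# Fold-change ratios expressed on the log2 scale (150% = 1.5, 400% = 4.0).
for ratio in (1.5, 2.0, 4.0, 0.5):
    print(f"ratio {ratio} -> log2 FC {math.log2(ratio):.2f}")

# Hypothetical results table: (gene, log2 fold change, adjusted p-value).
results = [
    ("geneA", 2.3, 0.001),
    ("geneB", 0.4, 0.0005),  # significant, but the change is small
    ("geneC", -1.6, 0.03),
    ("geneD", 3.1, 0.20),    # large change, but not significant
]


def passes(log2_fc, padj, fdr_cutoff=0.05, lfc_cutoff=1.0):
    """True if a gene passes both the FDR and the absolute log2 FC cutoff."""
    return padj < fdr_cutoff and abs(log2_fc) > lfc_cutoff


selected = [gene for gene, lfc, padj in results if passes(lfc, padj)]
print(selected)
```

Taking the absolute value of the log2 fold change treats up- and down-regulation symmetrically, so a halving (log2 FC = -1) counts the same as a doubling (log2 FC = 1).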

As always, it is easy to put too much focus on e.g. p values, but it is also important to consider the size of the difference, how precisely you have estimated the expression levels, and what it all means in the scientific context.

An RNA-seq differential expression analysis can be thought of as a "screen" over all genes. It will pick out genes where the difference is large enough and the spread between replicates is small enough for them to pass the cutoff criteria. Some of the most interesting findings, or genes where the result is not clear, can be independently corroborated by PCR.

Whatever you do, do not choose your FDR and FC cutoffs so that particular genes pass the filter; that would be circular reasoning and invalid.

Brian Haas

Jan 27, 2016, 8:54:05 AM
to biocoding, trinityrnaseq-users, Farbod Emami
really great points here!!



--
Brian J. Haas
The Broad Institute
http://broadinstitute.org/~bhaas

 

Tiago Hori

Jan 27, 2016, 9:26:29 AM
to Brian Haas, biocoding, trinityrnaseq-users, Farbod Emami

Just to further the discussion, please be aware that FDRs and p-values are quite different beasts. While some people use corrected p-values, which are still p-values, an FDR is an estimate of the proportion of false discoveries. In crude terms, a 1% FDR says that about 1% of the genes in your list are expected to be false positives. In a list of 10 genes that’s next to nothing; in a list of 1000 it could be more substantial. Just remember that p-values do not measure the chance of a hypothesis being wrong, but rather the chance of observing a result as extreme or more extreme, assuming the null hypothesis is true. Then the difference between a p-value and an FDR becomes clear.
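
The arithmetic behind that point can be sketched in a few lines of Python. This is only a back-of-the-envelope calculation, assuming the nominal FDR is roughly the expected fraction of false positives in the list; the function name `expected_false_discoveries` is made up for illustration.

```python
def expected_false_discoveries(list_size, fdr):
    """Rough expected number of false positives in a list called at this FDR."""
    return list_size * fdr


# Compare short and long gene lists at a strict and a lenient FDR.
for size in (10, 1000):
    for fdr in (0.01, 0.1):
        n = expected_false_discoveries(size, fdr)
        print(f"{size} genes at FDR {fdr}: about {n:g} expected false positives")
```

The same FDR therefore translates into very different absolute numbers of likely false calls depending on how long the gene list is.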

 

Also, always keep in mind what analysis will follow. If you are picking genes for qPCR, you could be more lax, because you will be doing confirmatory work. However, I would not report data at an FDR of 10%, for example, unless I made it super clear that this was the case and had good reasoning. And if you are doing pathway analysis, you have to be stringent, because you don’t want overly inflated gene lists to confound your results.

 

As we have stated many times here, there is no one answer or magic number. As scientists we have to understand the underlying principles of these methods and make informed choices.

 

T.

Tiago Hori

Jan 27, 2016, 9:31:20 AM
to Brian Haas, biocoding, trinityrnaseq-users, Farbod Emami