FDR threshold for preranked data


Kira Cozzolino

Sep 1, 2023, 12:22:27 PM
to gsea-help
Hello, I'm using GSEA to analyze pairwise comparisons of RNA-seq expression data and have a question about best practices for setting an FDR cutoff. I used DESeq2 to analyze relative gene expression and generate a list of genes ranked by fold-change and p-value, which I then fed into GSEA. The GSEA user guide says that an FDR of 0.25 is generally appropriate, unless "you have a small number of samples and use gene_set permutation (rather than phenotype permutation) for your analysis." I wasn't sure whether this applies to preranked gene lists. Would you advise an FDR cutoff of 0.05 for preranked, pairwise analysis?
Thanks!

Castanza, Anthony

Sep 1, 2023, 12:52:17 PM
to gsea...@googlegroups.com

Hi Kira,

GSEA Preranked always uses the gene set permutation method to generate the null distribution, so the recommendation for a more stringent FDR threshold applies.
That said, when you say you generated a “ranked list of genes by fold-change and p-value”, what exactly do you mean? My concern is that, since GSEA Preranked only supports ranking by a single metric, you might have applied cutoffs to the data that you shouldn’t have. Or do you mean you used a ranking method that combines both values into a single statistic?
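
On the threshold point, a minimal sketch of applying a stricter cutoff to a GSEA results table might look like the following. It assumes the report is tab-separated and has columns named NAME, NES, and FDR q-val. The 0.05 default is the value raised in the question, not a fixed rule, and the function name and the gene set names in the usage example are hypothetical.

```python
import csv
import io


def significant_sets(report_tsv, fdr_cutoff=0.05):
    """Return (name, NES, FDR q-value) for gene sets below the FDR cutoff.

    report_tsv is the text of a tab-separated results table. The column
    names "NAME", "NES" and "FDR q-val" are assumed, not guaranteed.
    """
    reader = csv.DictReader(io.StringIO(report_tsv), delimiter="\t")
    hits = []
    for row in reader:
        q_value = float(row["FDR q-val"])
        if q_value < fdr_cutoff:
            hits.append((row["NAME"], float(row["NES"]), q_value))
    # Most significant gene sets first.
    return sorted(hits, key=lambda hit: hit[2])


# Toy table with made-up gene set names, for illustration only.
example_report = (
    "NAME\tNES\tFDR q-val\n"
    "HYPOTHETICAL_SET_A\t2.10\t0.010\n"
    "HYPOTHETICAL_SET_B\t1.60\t0.200\n"
    "HYPOTHETICAL_SET_C\t-1.90\t0.040\n"
)

for name, nes, q_value in significant_sets(example_report):
    print(name, nes, q_value)
```

With the default cutoff, the toy set at q = 0.20 is dropped even though it would pass the 0.25 threshold used for phenotype permutation.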

 

-Anthony

 

Anthony S. Castanza, PhD

Curator, Molecular Signatures Database

Mesirov Lab, Department of Medicine

University of California, San Diego


Kira Cozzolino

Sep 1, 2023, 1:26:54 PM
to gsea-help
Hi Anthony, sorry for not being clear. The ranked list is generated in R by multiplying -log10(adjusted p-value) by the sign of the log2(fold change).
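
The metric described above can be sketched as follows. This is a Python illustration rather than the R code actually used. The function name `signed_log_p_rank`, the tuple layout of the input, and the `p_floor` guard against adjusted p-values of exactly zero are assumptions made for the example. Genes with a missing fold change or adjusted p-value are skipped here, which is one possible way of handling them.

```python
import math


def signed_log_p_rank(results, p_floor=1e-300):
    """Rank genes by -log10(adjusted p-value) * sign(log2 fold change).

    results: iterable of (gene, log2_fold_change, adjusted_p) tuples.
    p_floor: hypothetical lower bound so that p = 0 does not give infinity.
    Returns a list of (gene, score) sorted from most positive to most negative.
    """
    ranked = []
    for gene, log2_fc, padj in results:
        # Skip genes with a missing value (None or NaN) in either column.
        if log2_fc is None or padj is None:
            continue
        if math.isnan(log2_fc) or math.isnan(padj):
            continue
        padj = max(padj, p_floor)
        sign = (log2_fc > 0) - (log2_fc < 0)  # +1, 0 or -1
        ranked.append((gene, -math.log10(padj) * sign))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


def to_rnk_lines(ranked):
    """Format (gene, score) pairs as tab-separated lines, one gene per line."""
    return ["{}\t{:.6g}".format(gene, score) for gene, score in ranked]


# Toy input with made-up gene labels, for illustration only.
toy_results = [
    ("GENE_A", 2.0, 0.001),
    ("GENE_B", -1.5, 0.01),
    ("GENE_C", 0.3, 1.0),
    ("GENE_D", 1.0, float("nan")),
]
for line in to_rnk_lines(signed_log_p_rank(toy_results)):
    print(line)
```

Because only the sign of the fold change is used, the size of the effect does not influence the ranking, and genes with an adjusted p-value of 1 all score zero and tie in the middle of the list.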