Running FGSEA or Preranked GSEA on RStudio vs Desktop

Patrick Tamukong

Sep 19, 2021, 1:57:20 AM
to gsea-help
Hi,

I have a preranked gene list (with log2FC as ranking metric) from a differential gene expression analysis of microarray data. I have found Hallmark pathways using fgsea on the RStudio platform [using the command: gsea_res <- fgsea(hallmark, ranks, minSize=15, maxSize = 500)]. I then submitted the same preranked gene list to GSEA 4.1.0 (installed on my laptop) and ran GSEAPreranked with 1000 permutations. I found the same pathways but with different NES (likely due to different numbers of permutations). My questions:
1) What is the default number of permutations in fgsea?
2) Is fgsea able to compute an FDR q-value as GSEAPreranked does? (I thought these two were the same, but their outputs don't show the same statistics.)
3) Is fgsea able to show leading edge percentages (e.g., tags=52%, list=8%, signal=56%) as does GSEAPreranked? 
4) Can GSEAPreranked compute Padj values as does FGSEA?
5) I understand that enriched pathways are ordered by NES. Which metric should be used as the cutoff for significance? Is it the nominal P-value (in the case of GSEAPreranked) or P-value or Padj value (in the case of FGSEA)? 

Thanks for your help.

Patrick
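
On question 3, the three leading-edge percentages can be derived from a ranked list once the position of the running-score peak is known. The sketch below is a minimal illustration in Python, not code from GSEA or fgsea. It assumes the definitions given in the GSEA User Guide (tags = fraction of gene set hits up to the peak, list = fraction of the ranked list up to the peak, signal = tags x (1 - list) x N / (N - Nh)). The function name, the toy gene names, and the choice to count the peak position itself as part of the leading edge are all assumptions made for illustration.

```python
def leading_edge_stats(ranked_genes, gene_set, peak_index, positive=True):
    """Return (tags, list, signal) as fractions.

    peak_index is the 0-based position in ranked_genes where the running
    enrichment score peaks. For a positive ES the leading edge is taken as
    everything up to and including the peak; for a negative ES, everything
    from the peak to the end of the list (an assumption of this sketch).
    """
    n = len(ranked_genes)
    n_h = sum(g in gene_set for g in ranked_genes)  # gene set members present in the list
    edge = ranked_genes[:peak_index + 1] if positive else ranked_genes[peak_index:]
    tags = sum(g in gene_set for g in edge) / n_h
    list_frac = len(edge) / n
    signal = tags * (1 - list_frac) * n / (n - n_h)
    return tags, list_frac, signal


# Hypothetical toy data: 100 ranked genes, a 10-gene set, peak at index 5.
ranked = [f"g{i}" for i in range(100)]
toy_set = {f"g{i}" for i in (0, 1, 2, 3, 4, 5, 40, 50, 60, 70)}
tags, list_frac, signal = leading_edge_stats(ranked, toy_set, peak_index=5)
print(f"tags={tags:.0%}, list={list_frac:.0%}, signal={signal:.0%}")
```

With fgsea output, the size of the leadingEdge column relative to the gene set size would play the role of tags, under the same assumptions.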

Anthony Castanza

Sep 20, 2021, 1:11:00 PM
to gsea...@googlegroups.com

Hi Patrick,

 

fgsea is not our software and we don't provide support for it. My understanding is that it takes several algorithmic shortcuts that let it produce significance statistics without actually generating the full null distribution, as we do in the original GSEA implementation.

 

The FDR produced by GSEA is a global false discovery statistic that uses the distributions of all analyzed gene sets to estimate the false discovery rate for the input collection of gene sets. I don't believe fgsea produces an equivalent statistic; its padj is based on a mathematical correction of the pval, not an analysis of the true global distributions. The NOM p-value, in contrast, represents the significance of a gene set compared only to its own null distribution. Since fgsea isn't generating all the null distributions as we do, I can't speak to the accuracy of their statistics.
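
To make "mathematical correction of the pval" concrete, here is a minimal sketch of a Benjamini-Hochberg adjustment, which is the kind of correction padj is assumed to be here (equivalent in spirit to R's p.adjust with method "BH"). It is written in Python purely for illustration, and the function name and the four p-values are made up. Note that it only needs the list of p-values, not any null distribution.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical nominal p-values for four gene sets.
pvals = [0.01, 0.04, 0.03, 0.20]
padj = benjamini_hochberg(pvals)
print(padj)
```

A per-set correction like this depends only on the rank of each p-value among the tested sets, which is why it is not the same quantity as GSEA's FDR q-value.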

 

As for significance cutoffs, since GSEA Preranked runs in gene set permutation mode and not phenotype permutation mode, the standard statistical cutoffs of p<0.05, FDR<0.05 are generally reasonable.
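
As a rough illustration of what gene set permutation means, the sketch below scores one gene set against a ranked list and builds its null distribution from random gene sets of the same size. It is a simplified Python sketch, not the GSEA implementation: the function names, the weight of 1, the toy ranking, and the rule of comparing only against null scores with the same sign as the observed score are assumptions based on the published method.

```python
import random


def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Weighted running-sum statistic; returns the largest deviation from zero."""
    hits = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_norm = sum(abs(s) ** weight for s, h in zip(scores, hits) if h)
    running, best = 0.0, 0.0
    for s, h in zip(scores, hits):
        running += abs(s) ** weight / hit_norm if h else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def nominal_p(ranked_genes, scores, gene_set, n_perm=1000, seed=0):
    """Empirical p-value from random gene sets of the same size."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, scores, gene_set)
    k = sum(g in gene_set for g in ranked_genes)
    null = [enrichment_score(ranked_genes, scores, set(rng.sample(ranked_genes, k)))
            for _ in range(n_perm)]
    # Compare only against null scores on the same side of zero as the observed score.
    same_sign = [es for es in null if (es >= 0) == (observed >= 0)]
    if not same_sign:
        return observed, float("nan")
    return observed, sum(abs(es) >= abs(observed) for es in same_sign) / len(same_sign)


# Hypothetical toy ranking: 100 genes with scores from 5.0 down to -4.9.
genes = [f"g{i}" for i in range(100)]
scores = [5.0 - 0.1 * i for i in range(100)]
top_set = {f"g{i}" for i in range(10)}  # a set concentrated at the top of the list
es, p = nominal_p(genes, scores, top_set, n_perm=200)
print(es, p)
```

The null here is built per gene set, which corresponds to the nominal p-value; the FDR q-value additionally pools normalized scores across all gene sets, which this sketch does not attempt.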

 

Sorry I couldn't be of more help here. If you want to dig into the mathematical assumptions of fgsea, the method is available as a preprint here: https://www.biorxiv.org/content/10.1101/060012v3.full

And we rehost the publication for the original GSEA method here: https://www.gsea-msigdb.org/gsea/doc/subramanian_tamayo_gsea_pnas.pdf (along with the GSEA User guide: https://software.broadinstitute.org/gsea/doc/GSEAUserGuideFrame.html)

 

-Anthony

 

Anthony S. Castanza, PhD

Curator, Molecular Signatures Database

Mesirov Lab, Department of Medicine

University of California, San Diego

http://gsea-msigdb.org/

--
You received this message because you are subscribed to the Google Groups "gsea-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gsea-help+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gsea-help/d164c402-bedf-4c90-8859-ffc055481fa3n%40googlegroups.com.

Patrick Tamukong

Sep 21, 2021, 9:03:11 AM
to gsea...@googlegroups.com
Thanks very much Anthony. Very helpful information. 

Patrick Tamukong

