prerank comparison to gene/protein set if prerank is lopsided?


matt stratton

Jan 6, 2022, 4:17:22 PM
to gsea-help
I'm running GSEA to compare a pre-ranked RNA-seq dataset against protein lists (upregulated vs. downregulated proteins) from the same model. The RNA-seq dataset is lopsided: there are far more inhibited genes than activated genes. Can I trust the p-values and enrichment scores from this analysis? By eye, I'm worried about false enrichment for activated genes and missed enrichment for inhibited genes because of the lopsided dataset. The first image is for the more abundant proteins/genes; the second is for the less abundant proteins/genes. I'd appreciate any help or advice.


[Attachments: PG rna to protein.png, pg rna to protein down.png]

Anthony Castanza

Jan 6, 2022, 5:03:15 PM
to gsea...@googlegroups.com

Hi Matt,

 

Highly skewed distributions can definitely cause some issues with calculating significance statistics for GSEA, and I would exercise caution when interpreting these results.

 

At the very bottom of the enrichment report there is a plot showing the "Random ES Distribution" (i.e., the enrichment scores generated by the permutation tests, which form the "null distribution"). It could be worthwhile to look at this plot, particularly for the sets that received positive enrichment scores, to make sure that a reasonable fraction of the permutations contributed to the positive null used to calculate the statistics for those sets. If not, your data may benefit from increasing the number of permutations to get a better estimate of the positive null. However, as long as GSEA was able to generate a reasonably robust null distribution for the set, it should be ok.
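As a rough illustration of that check (a minimal sketch, not GSEA's own code): assuming you have the permutation enrichment scores for one gene set available as an array, the share of the null on the same side as the observed ES, and the nominal p-value computed from that side only, could be worked out as below. The function name `summarize_null` and the example numbers are hypothetical.

```python
import numpy as np

def summarize_null(observed_es, null_es):
    """Return (same-side fraction of the null, nominal p-value).

    Only the part of the null with the same sign as the observed ES
    is used for the p-value, so a small same-side fraction means the
    estimate rests on few permutations.
    """
    null_es = np.asarray(null_es, dtype=float)
    if observed_es >= 0:
        same_side = null_es[null_es >= 0]
        n_extreme = int((same_side >= observed_es).sum())
    else:
        same_side = null_es[null_es < 0]
        n_extreme = int((same_side <= observed_es).sum())
    frac = same_side.size / null_es.size
    # Undefined if no permutation landed on the observed side
    p = n_extreme / same_side.size if same_side.size else float("nan")
    return frac, p

# Made-up null scores, for illustration only
frac, p = summarize_null(0.4, [0.5, 0.3, -0.2, -0.4, 0.1])
print(f"same-side fraction = {frac:.2f}, nominal p = {p:.3f}")
```

If only a small share of the permutations lands on the same side as the observed score, that is the situation where increasing the permutation number helps.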

 

Generally, I would expect GSEA to be more prone to calling negative enrichments in this case, since there is simply a higher probability of any given gene falling on the negative side of the distribution. But that is just the nature of this particular dataset: you would expect to find many more negative enrichments because that is the effect of whatever treatment was used.
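To get a feel for how a skewed ranking shapes the null for a given set size, one option is to simulate it. The following is a minimal sketch under stated assumptions, not GSEA's implementation: it uses a weighted running-sum enrichment score and random gene sets of a fixed size, in the spirit of the gene-set permutation used for pre-ranked input. The function names, the simulated scores, and the set size are all hypothetical, and it assumes at least one gene in each set has a nonzero score.

```python
import numpy as np

def enrichment_score(ranked_scores, in_set, weight=1.0):
    """Running-sum ES for scores sorted from most positive to most negative."""
    ranked_scores = np.asarray(ranked_scores, dtype=float)
    in_set = np.asarray(in_set, dtype=bool)
    # Hits step up in proportion to |score|**weight; misses step down equally
    hit_weights = np.where(in_set, np.abs(ranked_scores) ** weight, 0.0)
    n_miss = (~in_set).sum()
    steps = hit_weights / hit_weights.sum() - (~in_set) / n_miss
    running = np.cumsum(steps)
    # ES is the running-sum value with the largest deviation from zero
    return float(running[np.argmax(np.abs(running))])

def null_sign_fraction(ranked_scores, set_size, n_perm=1000, weight=1.0, seed=0):
    """Score random gene sets; return (null ES values, fraction with ES > 0)."""
    rng = np.random.default_rng(seed)
    n = len(ranked_scores)
    null = np.empty(n_perm)
    for i in range(n_perm):
        in_set = np.zeros(n, dtype=bool)
        in_set[rng.choice(n, size=set_size, replace=False)] = True
        null[i] = enrichment_score(ranked_scores, in_set, weight)
    return null, float((null > 0).mean())

# Simulated lopsided ranking: most scores are negative (illustrative only)
rng = np.random.default_rng(1)
scores = np.sort(rng.normal(loc=-1.0, scale=1.0, size=2000))[::-1]
null_es, frac_positive = null_sign_fraction(scores, set_size=50, n_perm=200)
print(f"fraction of random sets with positive ES: {frac_positive:.2f}")
```

Swapping in your own ranked scores and the size of the protein list shows how many random sets land on each side, which is the quantity the Random ES Distribution plot summarizes.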

 

So, in summary, as long as the null distributions look reasonable for the "significant" positive hits, I would – cautiously – consider this to be ok.

 

-Anthony

 

Anthony S. Castanza, PhD

Curator, Molecular Signatures Database

Mesirov Lab, Department of Medicine

University of California, San Diego
