I'm reading the GSEA paper and I have a question about it. Because the running sum statistic increases the most when genes from the set appear close to each other in the ranked list (and hence yields a large ES), it seems to me that GSEA assumes that pathways (as defined by prior gene lists) involve genes that are positively correlated with each other. For example, if a pathway had a moderate number of genes at both the top and the bottom of the list (i.e., genes that are negatively correlated with each other), then we would NOT detect it. Am I understanding this correctly?
Hello,
As you suspect, in general, the best performing gene sets are ones where the members of the set are expected to be coordinately regulated. “Pathway” sets can indeed be suboptimal for precisely the reason you describe: a pathway whose members are upregulated and downregulated in roughly equal numbers would likely not be significantly enriched, at least with default GSEA settings. GSEA does offer an “absolute value” mode that can help address this kind of scenario, but it isn’t commonly used.
-Anthony
Anthony S. Castanza, PhD
Department of Medicine
University of California, San Diego
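
To make the point in this thread concrete, here is a minimal Python sketch of the weighted running-sum statistic as described in the GSEA paper (hits weighted by |metric|^p, misses by 1/(N - N_H)). It is not the GSEA implementation: the function name `enrichment_score`, the evenly spaced toy ranking metric, and the two toy gene sets are all assumptions made for illustration, and ranking on the absolute value of the metric is only an approximation of what the “absolute value” mode is meant to achieve. No permutation test is run, so this shows raw ES values only, not significance.

```python
import numpy as np

def enrichment_score(metric, in_set, p=1.0):
    """Signed maximum deviation from zero of the weighted running sum."""
    metric = np.asarray(metric, dtype=float)
    in_set = np.asarray(in_set, dtype=bool)
    # Rank genes from highest to lowest metric value.
    order = np.argsort(-metric, kind="stable")
    hits = in_set[order]
    weights = np.abs(metric[order]) ** p
    # Each hit steps up in proportion to its weight; steps sum to 1 over the set.
    hit_step = np.where(hits, weights, 0.0)
    hit_step /= hit_step.sum()
    # Each miss steps down by 1 / (number of genes outside the set).
    miss_step = np.where(hits, 0.0, 1.0 / (~hits).sum())
    running = np.cumsum(hit_step - miss_step)
    return running[np.argmax(np.abs(running))]

# Toy ranking metric (hypothetical): 1000 genes, evenly spaced from +2 to -2.
n_genes = 1000
metric = np.linspace(2.0, -2.0, n_genes)

# Coordinated set: 20 genes, all at the top of the list.
coordinated = np.zeros(n_genes, dtype=bool)
coordinated[:20] = True

# Split set: 10 genes at the very top and 10 at the very bottom.
split = np.zeros(n_genes, dtype=bool)
split[:10] = True
split[-10:] = True

es_coordinated = enrichment_score(metric, coordinated)
es_split = enrichment_score(metric, split)
# Approximation of an absolute-value ranking: sort on |metric| instead.
es_split_abs = enrichment_score(np.abs(metric), split)

print(f"coordinated set, signed ranking: {es_coordinated:.2f}")
print(f"split set, signed ranking:       {es_split:.2f}")
print(f"split set, |metric| ranking:     {es_split_abs:.2f}")
```

In this toy setup the coordinated set reaches an ES of about 1.0, while the split set only reaches about 0.5 in magnitude under the signed ranking, because the running sum climbs halfway at the top of the list and then drifts back down before the bottom hits arrive. Ranking on |metric| moves both halves of the split set to the top of the list and brings its ES back to about 1.0.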
--
You received this message because you are subscribed to the Google Groups "gsea-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gsea-help+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gsea-help/6508b6cf-5ad9-4716-8c08-53b8423f445cn%40googlegroups.com.