Hi Adel,
The gene sets themselves are unordered. When they are scored against your data, the ranking of each gene is derived from your dataset. Typically this is a metric of the gene’s differential expression in your dataset, either one that you’ve provided in a ranked list for GSEA Preranked or one that GSEA calculates itself (by default, the signal-to-noise ratio).
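As a rough illustration of the ranking step, here is a minimal Python sketch of a signal-to-noise style metric: the difference of the two phenotype means divided by the sum of their standard deviations. The gene names and expression values are made up, and the function name is hypothetical. GSEA's own implementation also applies adjustments to the standard deviations that are left out here, so treat this as a sketch of the idea rather than GSEA's exact calculation.

```python
from statistics import mean, stdev


def signal_to_noise(group_a, group_b):
    """Difference of class means divided by the sum of class standard deviations."""
    return (mean(group_a) - mean(group_b)) / (stdev(group_a) + stdev(group_b))


# Hypothetical expression values: gene -> (phenotype A samples, phenotype B samples)
expression = {
    "GENE1": ([10.0, 12.0, 11.0], [4.0, 5.0, 6.0]),
    "GENE2": ([5.0, 6.0, 7.0], [5.0, 7.0, 6.0]),
    "GENE3": ([2.0, 3.0, 4.0], [9.0, 8.0, 10.0]),
}

# Score every gene, then sort from most up in A to most up in B.
scores = {gene: signal_to_noise(a, b) for gene, (a, b) in expression.items()}
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
ranked_genes = [gene for gene, _ in ranked]
```

The resulting `ranked` list of (gene, score) pairs is the kind of ordering that every gene set is then scored against.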
Once GSEA produces this ranking, each gene set is scored by walking down the ranked list and adding to the running sum a value derived from the gene’s ranking value if the gene is in the set, or subtracting from the running sum if the gene is not in the set. The Enrichment Score is the maximum deviation from zero that the running sum reaches. The genes in the set that appear in the ranked list up to the point where that maximum deviation is reached are the “core enrichment” genes, also called the leading edge (for a negatively enriched set, these are instead the set members from that point to the bottom of the list). These are the genes that are most strongly driving the enrichment score (i.e. the contribution of these gene ‘hits’ is outweighing the gene ‘misses’ up until this point in the list).
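The walk described above can be sketched in a few lines of Python. This is a simplified illustration and not GSEA's code. It assumes hits are weighted by the absolute ranking value raised to a power `p` and normalized to sum to 1, and that each miss subtracts an equal share of 1. The function name, gene names, and scores are hypothetical.

```python
def enrichment_score(ranked, gene_set, p=1.0):
    """Walk a ranked list of (gene, score) pairs, best first.

    Returns (es, peak_index, leading_edge).
    """
    in_set = [gene in gene_set for gene, _ in ranked]
    n_miss = len(ranked) - sum(in_set)
    hit_total = sum(abs(score) ** p for (_, score), hit in zip(ranked, in_set) if hit)
    if hit_total == 0 or n_miss == 0:
        raise ValueError("need at least one weighted hit and one miss")

    running, es, peak = 0.0, 0.0, -1
    for i, ((_, score), hit) in enumerate(zip(ranked, in_set)):
        if hit:
            running += abs(score) ** p / hit_total  # step up on a hit
        else:
            running -= 1.0 / n_miss                 # step down on a miss
        if abs(running) > abs(es):                  # track the maximum deviation from zero
            es, peak = running, i

    if es >= 0:
        # Positive enrichment: set members at or before the peak.
        leading_edge = [g for (g, _), hit in zip(ranked[:peak + 1], in_set[:peak + 1]) if hit]
    else:
        # Negative enrichment: set members after the peak.
        leading_edge = [g for (g, _), hit in zip(ranked[peak:], in_set[peak:]) if hit]
    return es, peak, leading_edge


# Hypothetical ranked list and gene set
ranked = [("A", 3.0), ("B", 2.0), ("C", 1.0), ("D", -1.0), ("E", -2.0), ("F", -3.0)]
es, peak, leading_edge = enrichment_score(ranked, {"A", "B", "E"})
```

In this toy example the running sum peaks after the second gene, so A and B form the leading edge, while E is in the set but falls after the peak and is not counted as core enrichment.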
Hope this helps! Let me know if you have any additional questions!
-Anthony
Anthony S. Castanza, PhD
Department of Medicine
University of California, San Diego
--
You received this message because you are subscribed to the Google Groups "gsea-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gsea-help+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gsea-help/35d08de4-142d-4f9f-85bd-315332651157n%40googlegroups.com.
Hi Adel,
The core enrichment genes are exactly what is used by features like GSEA’s “Leading Edge Analysis”, so there is definitely some additional value to be extracted there. That said, we would probably not recommend using just those genes in lieu of the results from a standard differential expression analysis, since the GSEA leading edge will only contain genes that are both highly ranked and members of a gene set.