Hi Manisha,
If you have a ranked list of features, you should be able to use it with GSEA Preranked, provided it's formatted as a .rnk file (https://software.broadinstitute.org/cancer/software/gsea/wiki/index.php/Data_formats#RNK:_Ranked_list_file_format_.28.2A.rnk.29). We generally recommend that RNK files use a metric like Log2FC for all expressed genes (for RNA-seq data), but the data you've described should work okay.
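As a rough illustration of the formatting step, here is a minimal Python sketch that writes a two-column, tab-delimited .rnk file (feature name, then score). The function name `write_rnk`, the file name, and the example scores are all made up for illustration; the list is sorted high to low only to make the file easier to inspect.

```python
import csv

def write_rnk(scores, path):
    """Write {feature: score} as a two-column, tab-delimited .rnk file."""
    # Sort from highest to lowest score (done here for readability).
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(ranked)
    return ranked

# Hypothetical example values, not real data.
example_scores = {"TP53": 2.31, "MYC": -1.84, "EGFR": 0.42}
ranked_example = write_rnk(example_scores, "example.rnk")
```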
Assuming this isn't a differential proteomics study where you have both positive and negative scores (e.g. a Log2FC), you might also try Z-score transformed data. That might give GSEA a better shot at calculating enrichment at both the top and bottom of the ranked list (i.e. above and below the mean expression).
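For reference, a Z-score transform of a single list of values could look something like the sketch below. It assumes one abundance value per feature and standardizes across features; whether to log-transform first, or to standardize some other way, depends on your data and isn't something this sketch decides. The names and numbers are hypothetical.

```python
from statistics import mean, stdev

def zscore(values):
    """Center {feature: value} on the mean and scale by the sample standard deviation."""
    mu = mean(values.values())
    sigma = stdev(values.values())
    if sigma == 0:
        raise ValueError("all values are identical; Z-scores are undefined")
    return {name: (value - mu) / sigma for name, value in values.items()}

# Hypothetical abundances: after the transform, values above the mean are
# positive and values below the mean are negative.
example_abundance = {"PROT_A": 10.0, "PROT_B": 20.0, "PROT_C": 30.0}
example_z = zscore(example_abundance)
```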
-Anthony
Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego
--
You received this message because you are subscribed to the Google Groups "gsea-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gsea-help+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gsea-help/3e16cf86-de11-49b7-b16d-e649412b5583n%40googlegroups.com.
Log2FC is generally the standard metric; however, some users have had good results with the significance weighting offered by the combined metric. We haven't collected detailed comparative performance metrics, though, so we can't offer an official recommendation there, sorry.
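To make "significance weighting" a bit more concrete: the exact definition of the combined metric isn't spelled out in this thread, so the sketch below assumes one common form, Log2FC multiplied by -log10(p-value). Treat both the formula and the `min_p` floor (used to avoid taking the log of zero) as assumptions.

```python
import math

def combined_metric(log2fc, pvalue, min_p=1e-300):
    """Assumed form: log2FC weighted by -log10(p). Not an official definition."""
    # Clamp very small p-values so log10 never receives zero.
    p = max(pvalue, min_p)
    return log2fc * -math.log10(p)

# Hypothetical values: the same fold change scores higher when it is more significant.
strong = combined_metric(2.0, 0.01)
weak = combined_metric(2.0, 0.5)
```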
-Anthony
Hi,
I'm not entirely sure I understand the procedure you're describing here.
With GSEA you have a ranked list of genes, for example ranked by the log2 fold change calculated from differential expression of samples of phenotype A vs phenotype B.
You use this ranked list to calculate enrichment scores for sets of genes that represent biological functions/pathways.
Those scores are reported as either positive (functions enriched in phenotype A) or negative (functions enriched in phenotype B).
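To illustrate where the sign comes from, here is a simplified sketch of the weighted running-sum statistic behind the enrichment score. It walks down the ranked list, stepping up at genes in the set (in proportion to |score|^weight) and down at genes outside it, and reports the largest deviation from zero. It is only an illustration, not the GSEA implementation, and it leaves out normalization and permutation testing. The gene names and scores are hypothetical.

```python
def enrichment_score(ranked, gene_set, weight=1.0):
    """ranked: list of (gene, score) sorted high to low. Returns the signed ES."""
    hits = [gene in gene_set for gene, _ in ranked]
    n_miss = len(ranked) - sum(hits)
    hit_total = sum(abs(score) ** weight
                    for (_, score), is_hit in zip(ranked, hits) if is_hit)
    if hit_total == 0 or n_miss == 0:
        raise ValueError("need weighted hits and at least one gene outside the set")
    running, best = 0.0, 0.0
    for (_, score), is_hit in zip(ranked, hits):
        if is_hit:
            running += abs(score) ** weight / hit_total  # step up at a set member
        else:
            running -= 1.0 / n_miss                      # step down otherwise
        if abs(running) > abs(best):
            best = running                               # keep the largest deviation
    return best

# Hypothetical ranked list: positive scores are higher in phenotype A.
example_ranked = [("G1", 3.0), ("G2", 2.0), ("G3", -1.0), ("G4", -2.0)]
es_top = enrichment_score(example_ranked, {"G1", "G2"})     # set at the top of the list
es_bottom = enrichment_score(example_ranked, {"G3", "G4"})  # set at the bottom of the list
```

A set concentrated at the top of the list gives a positive score and one concentrated at the bottom gives a negative score, which is the phenotype A vs phenotype B distinction described above.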
It sounds like you're describing the following:
You have two independent groups (such as two groups of technical replicates), each containing paired sets of "phenotype A" and "phenotype B" samples. You've ranked genes for A vs B within each set of technical replicates, resulting in two different ranked lists.
Then for each ranked list (one for each paired set of technical replicates) you've performed preranked GSEA.
Now, you're seeing pathways which are scored as enriched in one phenotype in one set of technical replicates, but enriched in the opposite phenotype in the other set of technical replicates.
Is that correct?
This kind of switching would be quite unusual. If that is the case, I might suggest trying a hierarchical clustering method on a pooled dataset to assess the fidelity of your technical replicates (i.e. are the expected replicates all clustering together). There might be some kind of batch effect that hasn't been accounted for. Also, have you looked at the enrichment plot? The typical expected shape is described in the GSEA user guide (a smooth-ish mountain shape skewed strongly to one side of the distribution). An atypical shape in one of the comparisons might be informative. Also, how does the leading edge gene membership of the set change between the replicates? This might give you an indication of what genes are driving this discrepancy.
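As a starting point for the leading edge comparison, something like the following sketch could flag gene sets whose score changes sign between the two runs and show how much their leading edge genes overlap. The input layout (a dict per run holding an "nes" value and a "leading_edge" set for each gene set) is a made-up structure for illustration, so you would need to build it from your own result tables.

```python
def compare_runs(run_a, run_b):
    """Compare two runs laid out as {gene_set: {"nes": float, "leading_edge": set}}."""
    report = {}
    for name in sorted(run_a.keys() & run_b.keys()):
        edge_a = run_a[name]["leading_edge"]
        edge_b = run_b[name]["leading_edge"]
        union = edge_a | edge_b
        report[name] = {
            # True when the set is enriched in opposite phenotypes in the two runs.
            "flipped": run_a[name]["nes"] * run_b[name]["nes"] < 0,
            # Jaccard overlap of the leading edge genes (1.0 = identical).
            "jaccard": len(edge_a & edge_b) / len(union) if union else 0.0,
            "only_in_a": sorted(edge_a - edge_b),
            "only_in_b": sorted(edge_b - edge_a),
        }
    return report

# Hypothetical results for two sets of replicates.
rep1 = {"PATHWAY_X": {"nes": 1.8, "leading_edge": {"G1", "G2", "G3"}}}
rep2 = {"PATHWAY_X": {"nes": -1.6, "leading_edge": {"G3", "G7"}}}
example_report = compare_runs(rep1, rep2)
```

Genes that appear in the leading edge of only one run are the first candidates to look at when trying to work out what is driving the discrepancy.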
It's difficult to speculate on answers to these kinds of questions without more information about the experiment and what the results actually look like. If you're able to get into more details about the data that you wouldn't want to share on an open forum, you can reach out to us privately at gsea...@broadinstitute.org