ssGSEA input data


Katia S

Nov 4, 2021, 10:51:28 AM
to gsea-help
I am trying to run an ssGSEA analysis, but I am a bit confused about the input. Can I input normalised counts, as in standard GSEA?
Also, are zero values allowed?

Many thanks,
Katia

Anthony Castanza

Nov 4, 2021, 11:51:16 AM
to gsea-help
Hi Katia,

For ssGSEA we'd recommend a metric that has gene-length normalization built in, such as gene-level TPM or FPKM, since ssGSEA ranks genes within a single sample to assess relative enrichment. Normalized counts are more appropriate for standard GSEA, which ranks each gene by its differential expression between samples.
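To make "gene-length normalization" concrete, here is a minimal sketch of converting one sample's raw counts to TPM. The function name `counts_to_tpm`, the gene names, and the toy counts and lengths are all made up for illustration; none of this is part of the GSEA or ssGSEA software, and real pipelines typically use effective transcript lengths from the quantification tool rather than fixed lengths.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts for ONE sample to TPM.

    counts:     dict of gene -> raw read count (hypothetical input)
    lengths_bp: dict of gene -> gene length in base pairs (hypothetical input)
    """
    if counts.keys() != lengths_bp.keys():
        raise ValueError("counts and lengths must cover the same genes")

    # Step 1: divide each count by gene length in kilobases (reads per kb).
    rates = {g: counts[g] / (lengths_bp[g] / 1000.0) for g in counts}

    # Step 2: rescale so the per-sample values sum to one million.
    total = sum(rates.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {g: rate / total * 1e6 for g, rate in rates.items()}


# Toy data: GENE_A and GENE_B have the same count, but GENE_B is twice as long.
toy_counts = {"GENE_A": 100, "GENE_B": 100, "GENE_C": 0}
toy_lengths = {"GENE_A": 1000, "GENE_B": 2000, "GENE_C": 1500}
tpm = counts_to_tpm(toy_counts, toy_lengths)
```

In this toy sample, GENE_A and GENE_B tie on raw counts, but after length correction GENE_A ranks above GENE_B. That within-sample ordering is what ssGSEA works from, which is why counts without length correction can give a misleading ranking.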

Zero values are allowed, but they are treated as ties in the ranking, so when the gene list is sorted their relative order is arbitrary. The effect on the score varies with how many zeros there are. The problem is greatest with single-cell data, where we don't necessarily recommend ssGSEA for quantitation at the level of individual cells; binning to the cluster level seems to work reasonably well, as does bulk sequencing data.
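The tie issue can be illustrated with a small sketch. This is not how ssGSEA itself is implemented; the helper `rank_genes`, the gene names, and the values are hypothetical, and the two tie-break rules simply stand in for whatever arbitrary order a sort happens to produce.

```python
def rank_genes(expr, tie_break):
    """Return gene names sorted by descending expression.

    expr:      dict of gene -> expression value for one sample
    tie_break: function gene -> sort key, used only among equal values
    """
    return sorted(expr, key=lambda g: (-expr[g], tie_break(g)))


# Toy sample: three of five genes have zero expression.
expr = {"A": 5.0, "B": 0.0, "C": 2.0, "D": 0.0, "E": 0.0}

# Two equally valid ways of breaking ties among the zero genes.
by_name = rank_genes(expr, tie_break=lambda g: g)
by_reverse_name = rank_genes(expr, tie_break=lambda g: [-ord(c) for c in g])

# Fraction of genes whose rank position is arbitrary because they tie at zero.
zero_fraction = sum(1 for v in expr.values() if v == 0) / len(expr)
```

The non-zero genes (A, then C) land in the same positions under both rules, while B, D, and E swap places. A gene set containing only some of the zero genes would therefore sit at different points in the ranked list depending on the tie order, and the more zeros a sample has, the larger the share of the list that is affected.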

-Anthony

Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego


Katia S

Nov 4, 2021, 5:31:22 PM
to gsea-help
Hi Anthony,

Many thanks for your reply. That was very helpful!

Best,
Katia

Katia S

Nov 13, 2021, 5:15:29 AM
to gsea-help
Hi Anthony,

Can you please let me know how I should choose the sample normalization method when running ssGSEA?
Just to confirm: I didn't pre-rank my genes, I just input normalised counts. Is that correct?

Many thanks,
Katia

Anthony Castanza

Nov 13, 2021, 2:30:46 PM
to gsea-help
Hi Katia,

As I said before, TPM or FPKM is more appropriate for ssGSEA than "normalized counts". At least as we normally understand the term, normalized counts are counts without length correction, normalized by something like quantile normalization or DESeq2's median of ratios. For ssGSEA it is strongly recommended to use an expression statistic that is length normalized.
The "sample normalization method" parameter should generally be left as "none". It isn't applicable to most data sets, and the options will not give the desired behavior anyway if the input data type is not appropriate for the method.

-Anthony

Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego