Multiple Entrez Gene IDs for a protein in GSEA


Liu Zhe

Jun 21, 2024, 4:08:31 AM
to gsea-help
Hi,
I am going to perform GSEA on proteome data.
Some molecules in the proteome have multiple Entrez IDs and gene symbols.

My question is: how does GSEA process data with multiple Entrez IDs or gene symbols?
Will it give entries with multiple IDs a higher rank than entries that have only one Entrez ID?

Anthony Castanza

Jun 21, 2024, 10:12:37 AM
to gsea-help
Hello,

GSEA generally expects 1:1 mappings or, with collapse enabled, many:1 mappings. We also generally recommend running the analysis in gene symbol space (that is, collapsed to gene symbols), as the problem is less severe there.

The short answer is that we don't, and can't really, resolve these kinds of issues on our side.

When building our CHIP files, we've endeavored to resolve as many conflicting Entrez ID to gene symbol mappings as possible by hand cleanup, but after that point the software just takes the data as it is and applies whatever collapse operation you've selected.
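
To illustrate what a many:1 collapse does, here is a minimal Python sketch, not GSEA's actual code. The function name `collapse_to_symbols`, the protein IDs, and the mapping dictionary are all hypothetical; the reducers are only loosely modeled on the idea of picking a max, mean, median, or sum per gene symbol.

```python
from collections import defaultdict
from statistics import mean, median


def collapse_to_symbols(rows, id_to_symbol, mode="max"):
    """Collapse (feature_id, score) rows to one score per gene symbol.

    Hypothetical helper for illustration only; it is not part of GSEA.
    Features without a mapping are dropped.
    """
    reducers = {"max": max, "mean": mean, "median": median, "sum": sum}
    reduce_fn = reducers[mode]

    grouped = defaultdict(list)
    for feature_id, score in rows:
        symbol = id_to_symbol.get(feature_id)
        if symbol is None:
            continue  # unmapped feature: nothing to collapse into
        grouped[symbol].append(score)

    return {symbol: reduce_fn(scores) for symbol, scores in grouped.items()}


# Made-up example: two protein entries map to the same symbol.
rows = [("P1", 2.0), ("P2", 4.0), ("P3", -1.0), ("P4", 0.5)]
id_to_symbol = {"P1": "TP53", "P2": "TP53", "P3": "EGFR"}

collapsed_max = collapse_to_symbols(rows, id_to_symbol, mode="max")
collapsed_mean = collapse_to_symbols(rows, id_to_symbol, mode="mean")
```

Note that this only handles many features mapping to one symbol. A single feature that maps to several IDs (the 1:many case in the question) is not addressed by a collapse of this kind.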

I would probably advise trying to do some kind of weighting in your dataset before doing the analysis.
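
As one possible reading of "weighting", and purely as an assumption on how you might preprocess the data yourself, the sketch below splits each multi-ID entry evenly across its IDs (weight 1/n) and then takes a weighted mean per ID. The function name `resolve_multi_mapped`, the `;` separator, and the example rows are hypothetical; this is not something GSEA does for you.

```python
from collections import defaultdict


def resolve_multi_mapped(rows, sep=";"):
    """Turn (ids_string, score) rows into one weighted-mean score per ID.

    An entry listing n IDs contributes to each of them with weight 1/n,
    so an ambiguous entry counts less than an unambiguous one.
    Hypothetical preprocessing step, run before handing data to GSEA.
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)

    for ids_string, score in rows:
        ids = [i.strip() for i in ids_string.split(sep) if i.strip()]
        if not ids:
            continue  # entry with no usable ID
        weight = 1.0 / len(ids)
        for gene_id in ids:
            weighted_sum[gene_id] += weight * score
            weight_total[gene_id] += weight

    return {g: weighted_sum[g] / weight_total[g] for g in weighted_sum}


# Made-up example: the second entry is annotated with two Entrez IDs.
rows = [("7157", 2.0), ("7157;1956", 4.0), ("1956", 1.0)]
per_gene = resolve_multi_mapped(rows)
```

Other choices are equally defensible, for example dropping ambiguous entries entirely or keeping only the first listed ID; which one is appropriate depends on how the protein groups were assembled in your data.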

Sorry I couldn't be of more help here.


-Anthony

Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego
