GSEA on non-model organism

Adithi GR

Feb 10, 2026, 7:35:42 AM
to gsea-help
Hello everyone, 

I'm new to GSEA. I'm currently working with CHO (Chinese Hamster Ovary) cells and was wondering which of the datasets available from the Broad Institute I should use. In the literature I reviewed, most studies used human or mouse datasets. Is that the right way to go about this?

Also, one of the papers used "c5.bp.v3.1.symbols.gmt" as their expression dataset. However, it gives no information on the CHIP platform. If you know what that could be, please let me know.
Please refer to methodology sections 2.4 and 2.5.

This is the paper link:
Transcriptomic analysis of clonal growth rate variation during CHO cell line development - ScienceDirect 
Thank you

Anthony Castanza

Feb 23, 2026, 2:19:08 PM
to gsea-help
Hello Adithi,

My apologies for the delay in getting back to you. Unfortunately, we don't offer direct support for Cricetulus griseus in MSigDB/GSEA. To run GSEA with data from this species, you would need to map your expression data to human or mouse orthologs. Mapping to human orthologs specifically is likely what was done for the linked publication; however, we have no way of knowing exactly what was done there beyond what is said in their methods.
If you have your dataset in Ensembl IDs, I could perhaps prepare an ortholog chip file for you, provided Ensembl has the underlying orthology data (it usually does). It wouldn't be quite as polished as our release chips for Human and Mouse, but it could likely work for your purposes.

Let me know if you'd like us to attempt to generate this file.
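For readers wondering what such an ortholog chip file might contain, here is a minimal sketch in Python. It assumes the tab-delimited CHIP layout described in the GSEA data formats documentation (a header row of "Probe Set ID", "Gene Symbol", "Gene Title"). The hamster gene IDs, the `EXAMPLE_ORTHOLOGS` table, and the `format_chip` helper are hypothetical placeholders for illustration, not the file the GSEA team would actually produce.

```python
# Hypothetical ortholog table: hamster gene ID -> (human symbol, description).
# The IDs are placeholders, not real Ensembl accessions.
EXAMPLE_ORTHOLOGS = {
    "HAMSTER_GENE_0001": ("TP53", "tumor protein p53"),
    "HAMSTER_GENE_0002": ("MYC", "MYC proto-oncogene"),
    "HAMSTER_GENE_0003": ("", ""),  # no ortholog found
}


def format_chip(orthologs):
    """Return CHIP-style text mapping source IDs to target gene symbols.

    Assumes the three-column, tab-delimited CHIP layout; source IDs
    without an ortholog symbol are skipped.
    """
    rows = [("Probe Set ID", "Gene Symbol", "Gene Title")]
    for source_id in sorted(orthologs):
        symbol, title = orthologs[source_id]
        if not symbol:
            continue  # nothing to map this ID to
        rows.append((source_id, symbol, title))
    return "\n".join("\t".join(row) for row in rows) + "\n"


if __name__ == "__main__":
    print(format_chip(EXAMPLE_ORTHOLOGS), end="")
```

With a file like this, the expression dataset could keep its original hamster identifiers, and the mapping to human or mouse symbols would be applied when the dataset is collapsed against the chip.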

-Anthony

Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego

Adithi GR

Mar 3, 2026, 5:34:52 AM
to gsea-help
Hi Anthony,

Thank you for replying and helping me out.
I was able to figure it out, but I have another issue right now.

For my own work, I'm trying to download specific gene set databases (the .gmt files) from the Broad Institute for mouse genes.

For more context, my genes were initially annotated for Chinese hamster, and I had to map them to mouse. I was not able to map all of the genes using BioMart because some of them had LOC-style identifiers. For those genes specifically, I used a script to fetch the orthologs from their accession IDs, using BLAST for that purpose.

I'm worried that the gene names in the expression file (.gct file) may not all match those in the .gmt gene set database files.

Can anybody suggest anything, please?

Thank you
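
One way to check the concern above before running GSEA is to measure how many genes in each gene set actually appear in the expression file. The sketch below assumes the standard tab-delimited GCT layout (version line, dimensions line, column header, then one row per gene with the name in the first column) and GMT layout (set name, description, then gene symbols). The function names and the small example data, including the `LOC_PLACEHOLDER_1` identifier, are made up for illustration, and matching here is exact and case-sensitive.

```python
def gct_gene_names(gct_text):
    """Collect gene names from the first column of GCT-formatted text."""
    lines = gct_text.splitlines()
    # Skip the version line, the dimensions line, and the column header.
    return {line.split("\t")[0].strip() for line in lines[3:] if line.strip()}


def gmt_gene_sets(gmt_text):
    """Parse GMT-formatted text into {set name: set of gene symbols}."""
    sets = {}
    for line in gmt_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # not a valid gene set row
        sets[fields[0]] = {gene.strip() for gene in fields[2:] if gene.strip()}
    return sets


def overlap_report(expr_genes, gene_sets):
    """Return {set name: (genes found in expression data, set size)}."""
    return {
        name: (len(genes & expr_genes), len(genes))
        for name, genes in gene_sets.items()
    }


def unmatched_expression_genes(expr_genes, gene_sets):
    """List expression-file genes that occur in none of the gene sets."""
    all_set_genes = set().union(*gene_sets.values()) if gene_sets else set()
    return sorted(expr_genes - all_set_genes)


# Made-up example data; real files would be read from disk instead.
GCT_TEXT = "\n".join([
    "#1.2",
    "3\t2",
    "NAME\tDescription\tsample_1\tsample_2",
    "Trp53\tna\t5.1\t4.8",
    "Myc\tna\t7.2\t7.9",
    "LOC_PLACEHOLDER_1\tna\t1.0\t1.3",
])
GMT_TEXT = "\n".join([
    "EXAMPLE_SET_A\tna\tTrp53\tMyc\tCdkn1a",
    "EXAMPLE_SET_B\tna\tActb\tGapdh",
])

if __name__ == "__main__":
    expr = gct_gene_names(GCT_TEXT)
    sets = gmt_gene_sets(GMT_TEXT)
    for name, (found, size) in overlap_report(expr, sets).items():
        print(f"{name}: {found}/{size} genes present in expression data")
    print("Not in any gene set:", unmatched_expression_genes(expr, sets))
```

Gene sets with very low overlap, and any LOC-style names left in the unmatched list, would point to identifiers that still need mapping to mouse symbols.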

