Mixed MSigDB versions detected error

Roei Cohen-Almagor

Jun 21, 2024, 9:54:26 AM
to gsea-help
Hi Team, 

I am running an analysis of RNA-seq data with Entrez Gene IDs in my expression dataset and a human gene set, Grade_Colon_And_Rectal_Cancer_Up. I uploaded this gene set by dragging and dropping the GMT file. However, I keep receiving the same error: that the selected chip does not match the version of the MSigDB collection selected, and that some gene identifiers may not be mapped.

I have tried running without a chip file using the No_Collapse option, changing my expression dataset to Ensembl gene IDs, and changing the chip platform multiple times.

Could you please advise on how to avoid this issue? Is it possible that it is an issue with the gene set itself? This does not occur when I use the hallmark gene sets.

Thank you, 
Roei

Anthony Castanza

Jun 21, 2024, 10:00:00 AM
to gsea-help
Hi Roei,

Could you please send a screenshot of your Run GSEA tab, particularly showing the gene set file's name and any chip file you used in your runs?

This is normally just a warning, but if you've picked a chip from a different version of MSigDB than the gene sets, it's probably something we should try to fix.

You might also try downloading the file you want to use from the website and loading it in manually. That should generally bypass this, but we should definitely still try to figure out what is causing it.

-Anthony

Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego

--
You received this message because you are subscribed to the Google Groups "gsea-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gsea-help+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gsea-help/14135430-2f0e-427c-ac0f-080a9de54385n%40googlegroups.com.

Roei Cohen-Almagor

Jun 24, 2024, 9:59:55 AM
to gsea...@googlegroups.com
Hi Anthony, 

Thank you for your reply. I have attached an image below. 

I have tried uploading the file manually, but unfortunately I still get the same error. Is it possible that the gene set is faulty? If I open the file in Excel, it doesn't appear to contain the data usually found in a GMT file.
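For anyone wanting to check a file outside Excel: a GMT file is plain tab-delimited text with one gene set per line (set name, description, then the gene identifiers). The following is only a minimal sketch of a structural check, not part of GSEA itself; the function name `parse_gmt` and the example set are hypothetical.

```python
import io

def parse_gmt(handle):
    """Parse GMT text into {set_name: [genes]}.

    Each non-empty line is expected to be tab-separated:
    name, description, then at least one gene identifier.
    """
    gene_sets = {}
    for line_no, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 3:
            raise ValueError(
                f"line {line_no}: expected name, description and at least "
                f"one gene, got {len(fields)} field(s)"
            )
        name, _description, *genes = fields
        gene_sets[name] = [g for g in genes if g]  # drop empty trailing cells
    return gene_sets

# Hypothetical one-line GMT used purely for illustration.
example = io.StringIO("EXAMPLE_SET_UP\thttps://example.org/set\tGENE1\tGENE2\n")
sets = parse_gmt(example)
print(sets)
```

If this raises a `ValueError` on a downloaded file, the file is probably not tab-delimited GMT (for example, an HTML page saved under a `.gmt` name).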

Thank you for your assistance. 

Best wishes, 
Roei

image.png





Anthony Castanza

Jun 24, 2024, 6:57:33 PM
to gsea-help
Hi Roei,

The issue is probably the (2) in the filename, which I don't believe is included in the filenames we publish.

That said, this should just be a warning, so if you click through it the job should just continue.

-Anthony

Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego

Roei Cohen-Almagor

Jun 25, 2024, 4:48:15 AM
to gsea...@googlegroups.com
Hi Anthony, 

Apologies for the confusion; the (2) in the filename was there because it was the second download of that file. Here is an updated image without the (2). I still receive the chip platform error, and if I continue, there is a tool execution error, even though the gene set contains the required number of genes with the correct gene identifiers.


I have also attached the normalised counts file in case it helps to uncover the issue. 

Thank you for your help.

Best Wishes, 
Roei 

image.png


normalized_counts1.gct.txt

Roei Cohen-Almagor

Jul 2, 2024, 6:15:30 AM
to gsea...@googlegroups.com
Hi Anthony, 

Please could you provide any update regarding this issue? 

Thank you very much. 

Best wishes, 
Roei 

Anthony Castanza

Jul 2, 2024, 12:05:26 PM
to gsea...@googlegroups.com
Hi Roei,

My apologies for missing your reply; I was away this past week.
The issue here is that you have selected the Human Gene Symbol chip file, but the dataset you're providing uses Ensembl Gene IDs, so you want this chip file instead: Human_Ensembl_Gene_ID_MSigDB.v2023.2.Hs.chip
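As a rough illustration of how to confirm which identifier type a dataset uses before picking a chip, here is a sketch that reads the first column of a GCT file and takes a majority vote. It is an assumption-laden helper, not a GSEA feature: the function names are hypothetical, the patterns only cover human Ensembl gene IDs (`ENSG` plus 11 digits) and purely numeric NCBI (Entrez) Gene IDs, and anything else is lumped into "other" (which may be gene symbols).

```python
import io
import re

# Human Ensembl gene ID, with an optional ".version" suffix.
ENSEMBL_GENE = re.compile(r"^ENSG\d{11}(\.\d+)?$")
# NCBI (Entrez) Gene IDs are plain integers.
NUMERIC_ID = re.compile(r"^\d+$")

def read_gct_row_ids(handle):
    """Return the first-column identifiers from GCT text.

    GCT layout: version line, dimensions line, column header line,
    then one data row per gene.
    """
    lines = [ln.rstrip("\r\n") for ln in handle]
    return [ln.split("\t")[0] for ln in lines[3:] if ln.strip()]

def guess_id_type(ids):
    """Majority vote over 'ensembl', 'ncbi' and 'other'."""
    if not ids:
        raise ValueError("no identifiers found")
    counts = {"ensembl": 0, "ncbi": 0, "other": 0}
    for gene_id in ids:
        if ENSEMBL_GENE.match(gene_id):
            counts["ensembl"] += 1
        elif NUMERIC_ID.match(gene_id):
            counts["ncbi"] += 1
        else:
            counts["other"] += 1  # possibly gene symbols
    return max(counts, key=counts.get)

# Tiny made-up GCT with two Ensembl-style rows, for illustration only.
example_gct = io.StringIO(
    "#1.2\n"
    "2\t2\n"
    "NAME\tDescription\tS1\tS2\n"
    "ENSG00000141510\tna\t1.0\t2.0\n"
    "ENSG00000012048.23\tna\t3.0\t4.0\n"
)
id_type = guess_id_type(read_gct_row_ids(example_gct))
print(id_type)
```

A result of "ensembl" would point to the Ensembl Gene ID chip named above, whereas a Gene Symbol chip would leave those identifiers unmapped.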

A couple of other notes: with only three samples per phenotype, you'll want to use gene set permutation mode rather than phenotype permutation mode, as you don't have enough samples to construct 1000 unique permutations for generating the null distribution. We also generally recommend running more than a single gene set, as otherwise the FDR calculation that GSEA performs is not meaningful (the other statistics are fine, though, if you're really only interested in this one set).
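To put a number on the permutation point, here is a back-of-the-envelope sketch. It assumes that a phenotype permutation is simply a reassignment of the two class labels across the samples with the group sizes kept fixed; GSEA's exact shuffling scheme may differ, and the function name is hypothetical.

```python
from math import comb

def distinct_label_assignments(n_a, n_b):
    """Number of ways to choose which n_a of the n_a + n_b samples
    carry the first phenotype label (group sizes held fixed)."""
    return comb(n_a + n_b, n_a)

# Three samples per phenotype, as in this dataset.
n = distinct_label_assignments(3, 3)
print(n)  # 20
```

With 3 versus 3 samples there are only 20 distinct labelings, well short of 1000, which is why gene set permutation is the better choice here.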

-Anthony

Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego