chip platform for running GSEA


Lilian Tiemi Inoue

Apr 3, 2025, 1:46:08 PM
to gsea-help
Hello,

I ran GSEA (desktop version) using an expression dataset containing thousands of genes (.txt), formatted as in the attached file. I want to convert Ensembl IDs to gene symbols and have tried both chip platforms, Human_Ensembl_Gene_ID_MSigDB.v2024.1.Hs.chip and Human_Ensembl_Transcript_ID_MSigDB.v2024.1.Hs.chip.

However, I received the following message:
"The selected CHIP does not match the version of the MSigDB collection selected. Some gene identifiers may not be mapped."
Errors like this also appear: "After pruning, none of the gene sets passed size thresholds."

Could you please help me resolve this issue? I am also unsure whether I should select remap_only or collapse.

Thank you in advance for your guidance.

Best regards,

Lilian


expression_matrix.png


Anthony Castanza

Apr 3, 2025, 1:59:53 PM
to gsea...@googlegroups.com
Hi Lilian,

The issue with your file is the version suffixes on your gene IDs. For example, ENSG00000223972.5 should be just ENSG00000223972, with the trailing .5 version stripped.
Once you process your file to strip these trailing version identifiers, the Human_Ensembl_Gene_ID_MSigDB.v2024.1.Hs.chip file should work for this data.
You would want to run with "collapse", and for RNA-seq data like this you should also expand the "Advanced Fields" section and change the "Collapsing mode for probe sets =>1 gene" parameter to "sum_of_probes".
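
The version-stripping step can be done in any scripting language. Below is a minimal Python sketch, not an official GSEA tool. It assumes a tab-delimited file with one header row and the Ensembl ID in the first column; the function names `strip_version` and `strip_versions_in_table` are made up for illustration, and the layout may need adjusting for your actual file.

```python
import re

# Matches an Ensembl stable ID followed by a ".<digits>" version,
# e.g. ENSG00000223972.5 (assumed ID shape: "ENS", optional letters, digits).
_VERSIONED_ID = re.compile(r"^(ENS[A-Z]*\d+)\.\d+$")


def strip_version(gene_id: str) -> str:
    """Return the ID without its trailing version; other IDs pass through."""
    gene_id = gene_id.strip()
    match = _VERSIONED_ID.match(gene_id)
    return match.group(1) if match else gene_id


def strip_versions_in_table(lines, id_column=0, sep="\t"):
    """Strip versions from the ID column of each data row (header is kept)."""
    cleaned = []
    for index, line in enumerate(lines):
        fields = line.rstrip("\n").split(sep)
        if index > 0:  # row 0 is assumed to be the header
            fields[id_column] = strip_version(fields[id_column])
        cleaned.append(sep.join(fields))
    return cleaned


if __name__ == "__main__":
    example = ["gene_id\tsample_1", "ENSG00000223972.5\t12"]
    for row in strip_versions_in_table(example):
        print(row)
```

After stripping, it is worth checking that no two rows ended up with the same ID before loading the file into GSEA.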

"The selected CHIP does not match the version of the MSigDB collection selected. Some gene identifiers may not be mapped."  is a warning message that appears if you aren't directly retrieving our gene set files from our built in server, or if the gene set file you select is from a different version of MSigDB (as indicated by the MSigDB version suffix in the file). If you're using your own provided gene set files, you'd generally want to make sure that their gene symbols have been remapped to match those in the chip file.

Let me know if you have any additional questions

-Anthony

Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego

The information contained in this e-mail message, including any attachments, is confidential and reserved for Sociedade Beneficente de Senhoras Hospital Sírio Libanês and the person to whom it was addressed. If you are not the intended recipient, you are hereby notified that you must not retransmit, print, copy, use, or distribute this e-mail message or any attachments. If you have received this message in error, please contact the sender immediately and delete this message. Any unauthorized use or dissemination of this message or any part of it is expressly prohibited.

--
You received this message because you are subscribed to the Google Groups "gsea-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gsea-help+...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/gsea-help/ec89ba90-b582-4db4-a607-53267a034122n%40googlegroups.com.

Lilian Tiemi Inoue

Apr 3, 2025, 3:00:49 PM
to gsea...@googlegroups.com
Hi Anthony,

Thank you very much, it worked!

Could you please explain why I should change the "Collapsing mode for probe sets =>1 gene" parameter to "sum_of_probes"? I'm asking because I have previously performed several analyses without changing this parameter. Could this point change the results significantly?

Thanks,
Lilian



Anthony Castanza

Apr 3, 2025, 3:57:20 PM
to gsea-help
Hi Lilian,

There are cases where one official gene symbol can be represented by more than one Ensembl gene construct.
One cause of this can be alternate gene versions on chromosomal patch assemblies. It can also happen when the annotation authorities disagree: Ensembl has assigned two partially overlapping constructs different gene IDs, but the nomenclature authority has assigned those constructs the same symbol because of the nature of the exonic overlap.

RNA-seq data comes from counting discrete entities. When those counts have been assigned to different Ensembl constructs that share the same symbol in analysis space, it makes sense to sum them, so that the counts for the gene symbol include all the counts assigned to every construct that represents that symbol. In the default max probe mode, GSEA will only consider the counts from whichever construct was more highly expressed, discarding the counts from the lower-expressed construct.
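
To make the difference concrete, here is a small Python sketch of the two collapsing behaviors. This is only an illustration, not GSEA's actual implementation. The IDs `ENSG_A` and `ENSG_B`, the symbol `GENE1`, the counts, and the function `collapse_to_symbols` are all hypothetical, and the "max" branch assumes the maximum is taken per sample.

```python
from collections import defaultdict


def collapse_to_symbols(counts, id_to_symbol, mode="sum"):
    """Collapse per-construct counts to per-symbol counts.

    counts:       {ensembl_id: [count per sample, ...]}
    id_to_symbol: {ensembl_id: gene_symbol}
    mode:         "sum" adds the constructs together for each sample;
                  "max" keeps the larger value for each sample.
    """
    grouped = defaultdict(list)
    for ensembl_id, sample_counts in counts.items():
        symbol = id_to_symbol.get(ensembl_id)
        if symbol is not None:  # IDs without a symbol are dropped
            grouped[symbol].append(sample_counts)

    if mode == "sum":
        combine = sum
    elif mode == "max":
        combine = max
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    # zip(*rows) walks the samples column by column
    return {symbol: [combine(column) for column in zip(*rows)]
            for symbol, rows in grouped.items()}


if __name__ == "__main__":
    # Two hypothetical constructs that map to the same symbol
    counts = {"ENSG_A": [10, 0, 5], "ENSG_B": [2, 7, 5]}
    mapping = {"ENSG_A": "GENE1", "ENSG_B": "GENE1"}
    print(collapse_to_symbols(counts, mapping, mode="sum"))
    print(collapse_to_symbols(counts, mapping, mode="max"))
```

With these made-up numbers, summing keeps every count for GENE1, while the max mode drops whatever the lower construct contributed in each sample.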

Because these genes are relatively rare, and because of the typical relationships between the multiple constructs, the impact of the collapsing mode here is generally quite minor. If you're continuing prior work, I would recommend staying consistent in your approach. That said, our recommendation for handling RNA-seq data is the sum method.

Hope this helps!

-Anthony

Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego

Lilian Tiemi Inoue

Apr 3, 2025, 4:25:06 PM
to gsea...@googlegroups.com
It helped a lot, Anthony.

Thank you,
Lilian