can't collapse ENSEMBL mouse transcripts


David Roe

Dec 21, 2021, 12:16:08 PM
to gsea-help
The Name column of my dataset contains mouse Ensembl transcript IDs, in 140,000+ rows:
Name
ENSMUST00000193812.1
ENSMUST00000082908.1
ENSMUST00000162897.1
.
.
.

When I use Mouse_ENSEMBL_Gene_ID_Human_Orthologs_MSigDB.v7.4.chip to collapse it, it produces an empty data set.

Can anyone tell me how to do this correctly?

Anthony Castanza

Dec 21, 2021, 12:23:47 PM
to gsea-help
Hi David,

Our chip files don't support transcript-level data; they're designed for converting gene-level identifiers to gene symbols. Transcript-level identifiers are best handled with packages such as tximport, which are explicitly designed for this sort of data. Once your data has been converted to gene level, make sure that the IDs (likely starting with ENSMUSG at that point, not ENSMUST) have also had their decimal version suffix removed. Then it should work with our chip file.
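
For illustration only, here is a minimal Python sketch of the two steps described above: summarizing transcript rows to gene level and stripping the version suffix. It is not tximport (an R/Bioconductor package) and not part of GSEA. The helper names `strip_version` and `collapse_to_genes`, the `tx2gene` mapping, and the gene IDs in the example are all hypothetical; a real transcript-to-gene mapping would have to come from an Ensembl annotation matching your data. Plain summation is a simplification that assumes the values are additive across transcripts (for example, counts or TPM).

```python
import re

# Matches a trailing Ensembl version suffix such as ".1" or ".12".
_VERSION_SUFFIX = re.compile(r"\.\d+$")


def strip_version(ensembl_id):
    """Return the ID without its decimal version suffix, if it has one."""
    return _VERSION_SUFFIX.sub("", ensembl_id.strip())


def collapse_to_genes(transcript_rows, tx2gene):
    """Sum transcript-level values per gene (hypothetical helper).

    transcript_rows: iterable of (transcript_id, [one value per sample])
    tx2gene: dict mapping unversioned transcript IDs to gene IDs
             (assumed to be built from an Ensembl annotation)
    Returns (gene_totals, unmapped_transcript_ids).
    """
    gene_totals = {}
    unmapped = []
    for tx_id, values in transcript_rows:
        gene_id = tx2gene.get(strip_version(tx_id))
        if gene_id is None:
            # Keep track of transcripts that have no gene in the mapping.
            unmapped.append(tx_id)
            continue
        gene_id = strip_version(gene_id)
        if gene_id in gene_totals:
            gene_totals[gene_id] = [
                a + b for a, b in zip(gene_totals[gene_id], values)
            ]
        else:
            gene_totals[gene_id] = list(values)
    return gene_totals, unmapped


if __name__ == "__main__":
    # Transcript IDs are from the question; values and gene IDs are made up.
    rows = [
        ("ENSMUST00000193812.1", [10.0, 12.0]),
        ("ENSMUST00000082908.1", [3.0, 0.0]),
        ("ENSMUST00000162897.1", [5.0, 7.0]),
    ]
    tx2gene = {
        "ENSMUST00000193812": "GENE_PLACEHOLDER_A",
        "ENSMUST00000082908": "GENE_PLACEHOLDER_A",
        "ENSMUST00000162897": "GENE_PLACEHOLDER_B",
    }
    totals, unmapped = collapse_to_genes(rows, tx2gene)
    for gene_id, values in totals.items():
        print(gene_id, values)
    print("unmapped:", unmapped)
```

With real data, the resulting gene-level table would use unversioned ENSMUSG IDs in the Name column, which is the form the reply says the chip file expects. Checking the `unmapped` list is a quick way to see whether the annotation release matches the one used for quantification.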

-Anthony

Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego


David Roe

Dec 21, 2021, 12:27:47 PM
to gsea-help
Thanks Anthony.