About expression value


Carolini Schultz

Jun 12, 2024, 12:08:21 PM
to gsea-help
Dear GSEA-help,

I am reading the User Guide (https://docs.gsea-msigdb.org/#GSEA/GSEA_and_RNA-Seq/), but I am still confused about what my input expression data should be. I want to use GSEA Desktop to compare gene expression between females and males from my experiment (n=5/group). I have the raw counts and the TPM values. I was using the TPM output from Salmon, but the User Guide says this metric is not adequate for comparing samples. Could you give me a brief explanation of this?

Thank you so much for your help!

Anthony Castanza

Jun 12, 2024, 12:26:33 PM
to gsea-help
Hi Carolini,

TPM is an internally relative measure that can't really be directly compared across samples. The general advice is to take the "raw" counts, run them through something like DESeq2, export the normalized counts table, and use that for GSEA.
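
To illustrate what "internally relative" means here, the following is a minimal sketch of the standard TPM calculation on toy numbers. The gene lengths, counts, and variable names are made up for illustration, and the toy counts are treated as a stand-in for true abundance. Only the third gene changes between the two samples, yet the TPM of the first two genes drops as well, because every sample's TPM values are forced to sum to one million.

```python
def tpm(counts, lengths_kb):
    """Transcripts per million for one sample.

    counts     -- read counts per gene
    lengths_kb -- gene (or transcript) lengths in kilobases, same order
    """
    # Length-normalize first, then rescale so the sample sums to 1e6.
    rates = [c / length for c, length in zip(counts, lengths_kb)]
    total = sum(rates)
    return [r / total * 1e6 for r in rates]


# Toy data (hypothetical): three genes of equal length.
lengths_kb = [1.0, 1.0, 1.0]
sample_a = [100, 100, 100]
sample_b = [100, 100, 800]  # only the third gene differs

tpm_a = tpm(sample_a, lengths_kb)
tpm_b = tpm(sample_b, lengths_kb)

# The first gene has the same count in both samples, but its TPM differs,
# because TPM expresses each gene as a share of its own sample's total.
print(tpm_a[0], tpm_b[0])
```
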

If you have the Salmon files, though, we offer a tool on cloud.genepattern.org called tximport.DESeq2 that can ingest those files and produce a properly normalized counts file in the expected format for GSEA.
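
For readers who want to see what "normalized counts" means in this context, below is a rough sketch of the median-of-ratios size factor idea that DESeq2 uses for its normalized counts. This is a plain-Python illustration under simplifying assumptions, not the DESeq2 implementation or the GenePattern module; the function names and the toy sample names are hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts -- dict mapping sample name to a list of raw counts,
              with genes in the same order for every sample
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])

    # Per-gene log geometric mean, using only genes that are nonzero
    # in every sample (the log is undefined otherwise).
    usable = []
    for g in range(n_genes):
        values = [counts[s][g] for s in samples]
        if all(v > 0 for v in values):
            log_mean = sum(math.log(v) for v in values) / len(values)
            usable.append((g, log_mean))
    if not usable:
        raise ValueError("no gene has a nonzero count in every sample")

    # A sample's size factor is the median ratio of its counts
    # to the per-gene geometric means.
    factors = {}
    for s in samples:
        log_ratios = [math.log(counts[s][g]) - log_mean for g, log_mean in usable]
        factors[s] = math.exp(median(log_ratios))
    return factors


def normalized_counts(counts):
    """Divide each sample's raw counts by that sample's size factor."""
    factors = size_factors(counts)
    return {s: [c / factors[s] for c in counts[s]] for s in counts}


# Toy data (hypothetical): male_1 was sequenced about twice as deeply.
raw = {"female_1": [10, 20, 30, 0], "male_1": [20, 40, 60, 5]}
factors = size_factors(raw)
norm = normalized_counts(raw)
```

In this toy case the size factor of `male_1` is twice that of `female_1`, so the first three genes end up with equal normalized counts in both samples, which is the behavior one wants before comparing samples in GSEA.
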

-Anthony

Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego


Carolini Schultz

Jun 12, 2024, 12:43:46 PM
to gsea...@googlegroups.com
Dear Dr. Castanza, 

Thank you for your quick response! I am going to check out cloud.genepattern.org.

Thank you for your time!

Carolini


Carolini Schultz

Jun 30, 2024, 12:05:18 PM
to gsea...@googlegroups.com
Dear Dr. Castanza,

I am currently using DESeq2 to normalize my raw counts. However, I am uncertain about the normalization process in my experiment. I have 12 groups, each representing a unique combination of factors such as biological sex, weaning age, and treatment. Not all groups are directly comparable. Should I normalize all groups together or just the ones I want to compare?

Thank you for your help!

Carolini

Anthony Castanza

Jul 1, 2024, 11:51:31 AM
to gsea-help
Hi Carolini,

This is a difficult question, and I can't really give a firm answer, especially not without knowing what the purpose of those other samples was in the context of the experimental design. That said, if the samples are truly not comparable and you don't intend to do any analysis that includes them as a part of a direct comparison to the other samples, I would probably exclude them from the dataset when collating the data for DESeq2/GSEA.
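
If one does decide to exclude the non-comparable samples, the collation step could look something like the sketch below, which keeps only the columns belonging to the groups being compared before the table is passed on to normalization. The table layout (a header row starting with a gene ID column), the sample names, and the group labels are all assumptions made for illustration.

```python
def subset_samples(header, rows, sample_groups, keep_groups):
    """Keep the gene ID column plus samples whose group is in keep_groups.

    header        -- list of column names; the first column is the gene ID
    rows          -- list of rows, each a list aligned with header
    sample_groups -- dict mapping sample name to its group label
    keep_groups   -- set of group labels to retain
    """
    keep_idx = [0] + [
        i for i, name in enumerate(header)
        if i > 0 and sample_groups.get(name) in keep_groups
    ]
    new_header = [header[i] for i in keep_idx]
    new_rows = [[row[i] for i in keep_idx] for row in rows]
    return new_header, new_rows


# Toy table (hypothetical sample names and group labels).
header = ["gene_id", "S1", "S2", "S3"]
rows = [["g1", 5, 6, 7], ["g2", 1, 2, 3]]
sample_groups = {"S1": "female_control", "S2": "male_control", "S3": "female_treated"}

sub_header, sub_rows = subset_samples(
    header, rows, sample_groups, {"female_control", "male_control"}
)
```
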

-Anthony

Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego

Carolini Schultz

Jul 2, 2024, 12:34:07 PM
to gsea...@googlegroups.com
Dear Dr. Castanza,

Thank you for your answer! I will read more on the subject to make an informed decision! 

Can I execute a leading-edge analysis when inputting my own gene sets? This is the error message I get:
[attached screenshot of the error message: image.png]

I am using Sus scrofa Ensembl gene IDs in both the gene sets and the expression file.


Anthony Castanza

Jul 2, 2024, 4:23:36 PM
to gsea...@googlegroups.com
Hi Carolini,

You should be able to run leading edge analysis with your own gene sets, yes. If you're sure the G1/S Transition gene set is in the data you provided, then this might be an issue with special characters. In the gene sets we produce for MSigDB, we strip out all the special characters such as slashes, colons, commas, etc., as they can cause difficulties with the data parser. I would suggest going back to your gene sets, stripping out these characters, and redoing your analysis with the cleaned-up gene sets database file.
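
As a rough sketch of that cleanup, the snippet below rewrites the name field of each line in a GMT-style file (tab-separated: set name, description, then gene IDs). The exact set of characters MSigDB strips is not spelled out above, so replacing every run of non-alphanumeric characters with an underscore is an assumption, and the function names and gene IDs are placeholders.

```python
import re


def clean_name(name):
    """Replace runs of non-alphanumeric characters with a single underscore."""
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", name.strip())
    return cleaned.strip("_")


def clean_gmt_line(line):
    """Clean only the gene set name (first field) of one GMT line."""
    fields = line.rstrip("\n").split("\t")
    fields[0] = clean_name(fields[0])
    return "\t".join(fields)


# Placeholder gene IDs, for illustration only.
example = "G1/S Transition\tna\tGENE_ID_1\tGENE_ID_2"
print(clean_gmt_line(example))
```

After cleaning, it is worth checking that no two gene sets ended up with the same name, since distinct names can collapse to one once the special characters are gone.
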

Let me know if you have any additional questions.

-Anthony

Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego

Carolini Schultz

Jul 3, 2024, 8:57:09 AM
to gsea...@googlegroups.com
Dear Dr. Castanza, 

Thank you so much for your reply! Stripping out the special characters fixed the problem!

I hope you have a great day!

Carolini
