
Hi Matthew,
The text in your screenshot is illegible; perhaps it was compressed by your email client? I would suggest resending it as an attachment rather than as an inline image.
-Anthony
Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego
I would like to run GSEA on an RPPA expression dataset and I am unsure which .chip annotation file would be correct to use, or if I need to create a custom file. I have included a screenshot which has the annotation info that the RPPA core provided. Could you please advise? I am using GSEA 4.1.0 and the latest version of Java.
--
You received this message because you are subscribed to the Google Groups "gsea-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to
gsea-help+...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/gsea-help/c52b2873-73bd-4863-803c-5990b01c9d1bn%40googlegroups.com.
Hi Matt,
Thanks! Looking at the data, I don’t know that there’s a good way to handle this data as-is.
There are issues such as a single identifier representing multiple homologous genes (for example, ACACA-B covers both ACACA and ACACB, and AKT1/2/3 covers all three AKT genes, in addition to separate entries for the individual genes). Based on the limited snapshot, there doesn’t seem to be a consistent rule that could easily identify the affected genes.
Our chips don’t have a way to handle mapping these, so if you were to run with the “Human_Gene_Symbol_with_Remapping_MSigDB.v7.4” chip using the data from the “Gene Name” field, which is what I would normally suggest, the data from the AKT1/2/3 and ACACA-B antibodies (and similar cases) would be omitted completely. The data from the individual genes would be kept, but there are additional cases where a phospho antibody and a total antibody share the same gene symbol (see AKT1 in the screenshot). Depending on the collapse behavior, GSEA would either add them together or take the maximum across those antibodies for each sample (which would probably effectively throw away the phospho data).
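As a rough illustration of the two effects described above, here is a minimal Python sketch. It is not GSEA's actual implementation: the antibody labels, the values, the `known_symbols` set (a stand-in for the symbols a chip would recognize), and the `collapse` helper are all hypothetical and made up for this example.

```python
from collections import defaultdict

# Hypothetical toy rows: (antibody label, "Gene Name" field, value per sample).
# All labels and numbers are invented for illustration.
rows = [
    ("AKT1 total",   "AKT1",     [2.0, 3.0]),
    ("AKT1 phospho", "AKT1",     [0.5, 1.0]),
    ("AKT1/2/3",     "AKT1/2/3", [1.0, 1.0]),
    ("ACACA-B",      "ACACA-B",  [0.7, 0.9]),
]

# Stand-in for the gene symbols a chip annotation would recognize (assumption).
known_symbols = {"AKT1", "AKT2", "AKT3", "ACACA", "ACACB"}


def collapse(rows, known, mode):
    """Group rows by gene symbol and combine duplicates per sample.

    mode is "sum" or "max". Rows whose symbol is not in `known`
    are dropped, and their labels are returned separately.
    """
    grouped = defaultdict(list)
    dropped = []
    for label, symbol, values in rows:
        if symbol in known:
            grouped[symbol].append(values)
        else:
            dropped.append(label)
    combine = sum if mode == "sum" else max
    collapsed = {
        symbol: [combine(per_sample) for per_sample in zip(*value_lists)]
        for symbol, value_lists in grouped.items()
    }
    return collapsed, dropped


summed, dropped = collapse(rows, known_symbols, "sum")
maxed, _ = collapse(rows, known_symbols, "max")
print("dropped:", dropped)  # the compound identifiers never reach the analysis
print("sum:", summed)       # total and phospho values are mixed together
print("max:", maxed)        # here the larger total signal masks the phospho signal
```

In this toy case the compound identifiers are lost entirely, summing blends the total and phospho measurements into one number, and taking the maximum returns the total antibody's values unchanged, so the phospho measurement contributes nothing.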
Complicated proteomics assays like this, particularly ones that include phosphorylation assays, are… a little more complicated than what GSEA is designed to handle.
There is a modified version of GSEA (not by us) that is designed to work with proteomics datasets similar to this, but it has very specific data formats that you may or may not be able to coerce your data into. You can take a look at it here, though: http://prot-shiny-vm.broadinstitute.org:3838/ptmsigdb-app/
Sorry I couldn’t be of more assistance.