Leading Edge Analysis Error message

Caren Yassa

Nov 8, 2023, 7:18:02 PM
to gsea-help
Hello, 

I am trying to run leading edge analysis; however, I get this error message below. 

Thank you in advance for your help
error message.png

Anthony Castanza

Nov 9, 2023, 11:48:24 AM
to gsea...@googlegroups.com
Hi Caren,

How did you initialize Leading Edge analysis? From the application cache? By locating a GSEA results folder from the filesystem?
At what stage did this error occur? When loading data or at one of the downstream steps?

Unfortunately, because this error message is so non-specific, it might be impossible for me to sort it out without a copy of the data you were trying to use. If you're willing to share that data with us so we can debug things here, you can send it confidentially to the gsea...@broadinstitute.org email address.

-Anthony

Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego

Caren Yassa

Nov 9, 2023, 11:56:18 AM
to gsea-help
Hi Anthony, 

I initialized it from the application cache, and the results load fine. When I then try to select the gene sets I want to appear in the analysis, this error message appears. This issue is fairly recent: I hadn't gotten this message before when running the analysis, so I'm not sure the data is the problem. It worked before, and now it sometimes works and sometimes doesn't.

Anthony Castanza

Nov 9, 2023, 5:24:44 PM
to gsea...@googlegroups.com
Hi Caren,

The data in the screenshot you sent is not normalized. Raw counts from standard quantification tools (i.e. not pseudoalignment tools like Kallisto or Salmon) are typically integer values like what you see here.
If the data were normalized, you would expect to see fractional counts (e.g. 13.1, 2750.4, etc. instead of just 13, 2750, etc.).
We always recommend count normalization before running GSEA. If you're having difficulty getting your copy of DESeq2 to produce normalized counts, we have a version of DESeq2 on GenePattern (cloud.genepattern.org) that is designed to produce a normalized counts gct file that will work with GSEA as one of the outputs.
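
The integer-versus-fractional check described above can be sketched in a few lines of Python. This is only a rough heuristic under stated assumptions: the function name `looks_like_raw_counts` and the example values are hypothetical, and an all-integer matrix only suggests raw counts (rounded normalized data, or unnormalized estimates from Kallisto or Salmon, would break the rule).

```python
def looks_like_raw_counts(rows, tol=1e-9):
    """Return True if every value is a non-negative whole number.

    rows: list of gene rows, each a list of per-sample values.
    This is a heuristic, not proof that the data is unnormalized.
    """
    return all(
        v >= 0 and abs(v - round(v)) <= tol
        for row in rows
        for v in row
    )


# Hypothetical example values, not taken from the data in this thread.
integer_matrix = [[13, 2750], [0, 48]]
fractional_matrix = [[13.1, 2750.4], [0.0, 47.6]]

print(looks_like_raw_counts(integer_matrix))
print(looks_like_raw_counts(fractional_matrix))
```

If the check passes on a public dataset with no processing description, treating it as raw counts and normalizing it is the safer assumption.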

My inclination is to normalize all the samples together. That should give DESeq2 the best model possible of the count distributions that it needs for normalization.
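
To illustrate why the set of samples matters, here is a simplified Python sketch of the median-of-ratios idea that DESeq2 uses to estimate size factors. It is not DESeq2 itself, and the function names `size_factors` and `normalize` and the example counts are hypothetical. The per-gene reference (a geometric mean) is computed across every sample in the matrix, so the size factors change if samples are added or removed.

```python
import math
from statistics import median


def size_factors(counts):
    """Estimate one size factor per sample by median-of-ratios.

    counts: list of gene rows, each a list of per-sample raw counts.
    """
    n_samples = len(counts[0])
    # Keep only genes with a positive count in every sample, so logs are defined.
    usable = [row for row in counts if all(v > 0 for v in row)]
    if not usable:
        raise ValueError("no gene has positive counts in every sample")
    # Log of the geometric mean of each usable gene across ALL samples.
    log_geo_means = [sum(math.log(v) for v in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        # Ratio of this sample's count to the gene's reference, on the log scale.
        log_ratios = [math.log(row[j]) - ref for row, ref in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize(counts):
    """Divide each sample's counts by that sample's size factor."""
    factors = size_factors(counts)
    return [[v / f for v, f in zip(row, factors)] for row in counts]


# Hypothetical counts: sample 2 was sequenced twice as deeply as sample 1.
example = [[10, 20], [100, 200], [5, 10], [0, 3]]
factors = size_factors(example)
normalized = normalize(example)
print(factors)
print(normalized[0])
```

With this approach, you would normalize the full matrix once and then pull out the columns for whichever samples you want to compare, rather than re-normalizing each subset.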

As to your previous question about the Leading Edge analysis failing: if it is an intermittent issue with the exact same data, then it might be GSEA running out of system memory. If you quit and reopen GSEA, then reload the results folder from the filesystem, does it still happen?


-Anthony

Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego

On Thu, Nov 9, 2023 at 12:27 PM Caren Yassa <yas...@uci.edu> wrote:

Here's the sample of counts for all 6 samples, 2 classes (sorry I did not include that above), and the error message when I try to run the leading edge analysis.
counts.png
On Thursday, November 9, 2023 at 9:16:11 AM UTC-8 Caren Yassa wrote:
I also have another follow-up question: How would I know if I need to normalize publicly acquired data (with no metadata or description of whether it was processed)? Below is a sample of the counts. Would I be able to determine that just from looking at the data?

Also, I ran the enrichment map with both the normalized data (normalized using DESeq2 in R, saving the normalized counts matrix) and the form of the data matrix shown below. I get different pathway enrichment results: with the data as shown below, gene sets are sometimes enriched in only one phenotype and not the other, whereas with the normalized data I get enriched gene sets in both phenotypes. Would that indicate that the data needed normalization?

If normalization is recommended: in my analysis, I want to compare enriched pathways between all samples (6 samples, 2 classes) as well as between only 2 samples (1 sample per class). Do I need to normalize the 2 samples (1 sample per class) alone, or is it sufficient to normalize the matrix with all samples (3 samples) from both classes and then extract only 1 sample per class from that data?

Thank you, 