Leading Edge Analysis Error message

Caren Yassa

Nov 8, 2023, 7:18:02 PM

to gsea-help

Hello,

I am trying to run leading edge analysis; however, I get this error message below.

Thank you in advance for your help

Anthony Castanza

Nov 9, 2023, 11:48:24 AM

to gsea...@googlegroups.com

Hi Caren,

How did you initialize Leading Edge analysis? From the application cache? By locating a GSEA results folder from the filesystem?
At what stage did this error occur? When loading data or at one of the downstream steps?

Unfortunately, because this error message is so non-specific, I may not be able to sort it out without a copy of the data you were trying to use. If you're willing to share that data with us so we can debug things here, you can send it confidentially to the gsea...@broadinstitute.org email address.

-Anthony

Anthony S. Castanza, PhD

Curator, Molecular Signatures Database

Mesirov Lab, Department of Medicine

University of California, San Diego

--
You received this message because you are subscribed to the Google Groups "gsea-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gsea-help+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gsea-help/f2dc19d0-da1c-480a-b24d-61b96e1f2416n%40googlegroups.com.

Caren Yassa

Nov 9, 2023, 11:56:18 AM

to gsea-help

Hi Anthony,

I initialized it from the application cache, and the results load fine. When I then try to select the gene sets I want to appear in the analysis, this error message appears. This issue is fairly recent: I haven't gotten this message before when running the analysis, so I'm not sure the data is the problem. It worked before, and now it sometimes works and sometimes doesn't.

Anthony Castanza

Nov 9, 2023, 5:24:44 PM

to gsea...@googlegroups.com

Hi Caren,

The data in the screenshot you sent is not normalized. Raw counts from standard tools (i.e., not the pseudoalignment tools like Kallisto or Salmon) are typically integer values like what you see here.
If the data were normalized, you would expect to see fractional counts (e.g., 13.1, 2750.4, etc. instead of just 13, 2750, etc.).
We always recommend count normalization before running GSEA. If you're having difficulty getting your copy of DESeq2 to produce normalized counts, we have a version of DESeq2 on GenePattern (cloud.genepattern.org) that is designed to produce a normalized counts gct file that will work with GSEA as one of the outputs.
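
The integer-versus-fractional rule of thumb above can be sketched as a quick check in Python. This is only a heuristic under stated assumptions: the function name `looks_like_raw_counts` and the example values are hypothetical, and a matrix that was normalized and then rounded would also pass as "raw".

```python
def looks_like_raw_counts(matrix, tolerance=1e-9):
    """Return True if every value is a non-negative whole number.

    Heuristic only: whole-number values suggest raw counts, while
    fractional values suggest some normalization has been applied.
    """
    for row in matrix:
        for value in row:
            if value < 0 or abs(value - round(value)) > tolerance:
                return False
    return True


# Hypothetical example values, not taken from the data in this thread.
raw_like = [[13, 2750, 0], [41, 98, 7]]
normalized_like = [[13.1, 2750.4, 0.0], [40.7, 98.3, 6.9]]

print(looks_like_raw_counts(raw_like))         # True
print(looks_like_raw_counts(normalized_like))  # False
```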

My inclination is to normalize all the samples together. That should give DESeq2 the best model possible of the count distributions that it needs for normalization.
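
To illustrate "normalize everything together, then extract the samples you need", here is a minimal pure-Python sketch of median-of-ratios size factors, the general idea behind DESeq2's count normalization. It is a simplified stand-in, not DESeq2 itself; the function names and the toy counts are hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors, one per sample (column).

    counts: list of gene rows, each a list of per-sample raw counts.
    Genes with a zero in any sample are skipped, since their geometric
    mean is zero. Raises StatisticsError if no gene is usable.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts:
        if any(c <= 0 for c in row):
            continue
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    return [math.exp(median(r)) for r in log_ratios]


def normalize(counts):
    """Divide each column by its size factor."""
    factors = size_factors(counts)
    return [[c / f for c, f in zip(row, factors)] for row in counts]


def select_columns(matrix, columns):
    """Extract a subset of samples after normalization."""
    return [[row[j] for j in columns] for row in matrix]


# Toy data (hypothetical): samples 1 and 3 were sequenced twice as deep.
all_samples = [
    [10, 20, 10, 20],
    [100, 200, 100, 200],
    [30, 60, 30, 60],
]
normalized = normalize(all_samples)      # size factors use every sample
pair = select_columns(normalized, [0, 1])  # then pull out one per class
```

In this toy case the depth difference is removed, so the two extracted columns match. Computing size factors on the pair alone would use fewer samples to estimate the per-gene reference, which is the reason for normalizing the full matrix first.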

As to your previous question about the Leading Edge analysis failing: if it is an intermittent issue with the exact same data, then it might be GSEA running out of system memory. If you quit and reopen GSEA, then reload the results folder from the filesystem, does it still happen?

-Anthony

On Thu, Nov 9, 2023 at 12:27 PM Caren Yassa <yas...@uci.edu> wrote:

Here's a sample of the counts for all 6 samples (2 classes; sorry I did not include that above) and the error message I get when I try to run the leading edge analysis.
On Thursday, November 9, 2023 at 9:16:11 AM UTC-8 Caren Yassa wrote:
I also have another follow-up question: How would I know if I need to normalize publicly acquired data (with no metadata or description of whether it was processed)? Below is a sample of the counts. Would I be able to determine that from just looking at the data?

Also, I ran the enrichment map with normalized data (using DESeq2 in R and saving the normalized counts matrix) and with the data matrix in the form shown below. I get different pathway enrichment results: with the unnormalized data, gene sets are sometimes enriched in only one phenotype and not the other, whereas when I normalize, I get enriched gene sets in both phenotypes. Would that indicate that the data needed normalization?

If normalization is recommended: in my analysis, I want to compare enrichment pathways between all samples (6 samples, 2 classes) as well as between only 2 samples (1 sample per class). Do I need to normalize the 2 samples (1 sample per class) alone, or is it sufficient to normalize the matrix with all samples (3 samples per class) from both classes and then extract only 1 sample per class from that data?

Thank you,
