Reg query about analysis view


TAMIZHINI LOGANATHAN

Jan 19, 2021, 12:14:20 AM
to gsea...@googlegroups.com
Hi sir,
I have been working on validating hormone-gene predictions. For each hormone, I have a rank file listing its 19318 genes with their ranks.
I am validating against the DisGeNET database for disease-gene prediction. I uploaded the rank file (19318 genes with ranks). The rank is simply the SVM score, and in our case all SVM scores are positive. The table below is the GSEA result. For most of the enriched terms, the NOM p-val is 0 and the FDR q-val is 0. Is this a good or a bad result? I don't know how to interpret it. If the value is "0", is the disease enriched or not? Please help me with this.

Thanking you sir.

| # | GS | SIZE | ES | NES | NOM p-val | FDR q-val | FWER p-val | RANK AT MAX | LEADING EDGE |
|---|----|------|----|-----|-----------|-----------|------------|-------------|--------------|
| 1 | NADH:Q(1) OXIDOREDUCTASE DEFICIENCY | 25 | 0.72 | 3.65 | 0.000 | 0.000 | 0.000 | 5350 | tags=100%, list=28%, signal=138% |
| 2 | MITOCHONDRIAL COMPLEX I DEFICIENCY | 27 | 0.67 | 3.53 | 0.000 | 0.000 | 0.000 | 5350 | tags=93%, list=28%, signal=128% |
| 3 | ABNORMAL MITOCHONDRIA IN MUSCLE TISSUE | 25 | 0.66 | 3.37 | 0.000 | 0.000 | 0.000 | 5350 | tags=92%, list=28%, signal=127% |
| 4 | ACUTE NECROTIZING ENCEPHALOPATHY | 20 | 0.72 | 3.34 | 0.000 | 0.000 | 0.000 | 5350 | tags=100%, list=28%, signal=138% |
| 5 | PROGRESSIVE MACROCEPHALY | 23 | 0.66 | 3.26 | 0.000 | 0.000 | 0.000 | 5350 | tags=91%, list=28%, signal=126% |
| 6 | NICOTINAMIDE ADENINE DINUCLEOTIDE COENZYME Q REDUCTASE DEFICIENCY | 39 | 0.52 | 3.20 | 0.000 | 0.000 | 0.000 | 4855 | tags=72%, list=25%, signal=96% |
| 7 | PALLOR OF OPTIC DISC | 60 | 0.39 | 2.78 | 0.000 | 0.000 | 0.005 | 6242 | tags=65%, list=32%, signal=96% |
| 8 | INCREASED CSF LACTATE | 56 | 0.40 | 2.68 | 0.000 | 0.001 | 0.015 | 3658 | tags=52%, list=19%, signal=64% |
| 9 | IMPAIRED EXERCISE TOLERANCE | 69 | 0.34 | 2.47 | 0.000 | 0.010 | 0.132 | 5819 | tags=57%, list=30%, signal=81% |
| 10 | CEREBRAL EDEMA | 40 | 0.39 | 2.35 | 0.000 | 0.026 | 0.359 | 4855 | tags=58%, list=25%, signal=77% |
| 11 | TRUNCUS ARTERIOSUS, PERSISTENT | 33 | 0.42 | 2.33 | 0.000 | 0.030 | 0.436 | 9872 | tags=91%, list=51%, signal=186% |
| 12 | LEOPARD SYNDROME | 27 | 0.42 | 2.22 | 0.000 | 0.077 | 0.797 | 3792 | tags=56%, list=20%, signal=69% |
| 13 | KETOTIC HYPOGLYCEMIA | 41 | 0.37 | 2.21 | 0.000 | 0.079 | 0.832 | 5220 | tags=56%, list=27%, signal=77% |
| 14 | MECKEL-GRUBER SYNDROME | 23 | 0.44 | 2.19 | 0.000 | 0.091 | 0.898 | 5987 | tags=70%, list=31%, signal=101% |
| 15 | TURNER SYNDROME, MALE | 50 | 0.33 | 2.19 | 0.000 | 0.087 | 0.908 | 5230 | tags=54%, list=27%, signal=74% |
| 16 | CENTRAL SEROUS CHORIORETINOPATHY | 74 | 0.29 | 2.14 | 0.000 | 0.123 | 0.957 | 8303 | tags=66%, list=43%, signal=116% |
| 17 | PAGET DISEASE EXTRAMAMMARY | 27 | 0.41 | 2.14 | 0.003 | 0.119 | 0.960 | 8541 | tags=81%, list=44%, signal=146% |
| 18 | NOONAN SYNDROME | 58 | 0.31 | 2.13 | 0.000 | 0.114 | 0.963 | 7784 | tags=66%, list=40%, signal=109% |
| 19 | HYPOGLYCEMIA | 229 | 0.22 | 2.11 | 0.000 | 0.134 | 0.986 | 6514 | tags=48%, list=34%, signal=72% |
| 20 | AGYRIA | | | | | | | | |

Anthony Castanza

Jan 19, 2021, 1:21:37 AM
to gsea...@googlegroups.com
Hello,

GSEA uses an empirical pValue distribution. As such, a zero value indicates that, in N permutations, a value at least as extreme as the true enrichment score was observed zero times. When running with gene set permutation (as is done automatically when running GSEA Preranked), observing many of these exactly-zero values indicates that the signal in your ranked list is likely quite strong. If you need a numerical (non-zero) pValue, perhaps for a downstream application, your best bet would be to increase the number of permutations tested (for example, to 10,000 instead of the default 1,000), but be aware that this will substantially increase the GSEA runtime.

The pValues only tell you the significance of the result (smaller is better) but the ES (enrichment score) and particularly the NES (normalized enrichment score) tell you how enriched a specific gene set was in the ranked list and in which direction (positive or negative).
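
To make the zero pValue concrete, here is a minimal Python sketch of an empirical p-value and a normalized score computed from a permutation null. It is a simplified illustration under stated assumptions, not GSEA's actual implementation: the function name `empirical_p_and_nes` and the Gaussian null scores are made up for the example, and only a positive observed ES is handled. With N usable null scores the smallest non-zero value is 1/N, so a reported 0 is best read as "smaller than 1/N".

```python
import random


def empirical_p_and_nes(observed_es, null_es):
    """Return (nominal p, NES) for a positive observed ES.

    Simplified sketch: the observed score is compared only against
    null scores of the same (positive) sign, and the NES is the
    observed ES divided by the mean of those same-sign null scores.
    """
    same_sign = [es for es in null_es if es >= 0]
    if not same_sign:
        raise ValueError("no positive null scores to compare against")
    # Count null scores at least as extreme as the observed one.
    hits = sum(1 for es in same_sign if es >= observed_es)
    p_value = hits / len(same_sign)
    nes = observed_es / (sum(same_sign) / len(same_sign))
    return p_value, nes


# Hypothetical null distribution: 1,000 made-up permutation scores.
rng = random.Random(0)
null_es = [rng.gauss(0.0, 0.1) for _ in range(1000)]

p, nes = empirical_p_and_nes(0.72, null_es)
print(f"nominal p = {p:.3f}, NES = {nes:.2f}")
```

Because no made-up null score reaches 0.72, the sketch reports a p-value of exactly 0; adding more permutations only tightens the bound, it does not change the conclusion that the set is strongly enriched.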

-Anthony

Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego


TAMIZHINI LOGANATHAN

Jan 19, 2021, 3:11:02 AM
to gsea...@googlegroups.com
Thanks for the explanation, sir.

Rohitesh Gupta

Feb 17, 2021, 6:26:47 AM
to gsea...@googlegroups.com, acas...@cloud.ucsd.edu
Dear Anthony,

I am getting an error while uploading a .cls file. Could you please check it and let me know what the problem might be?

Thanks and Best Regards,
Rohitesh

<Error Details>

---- Full Error Message ----
There were errors: ERROR(S) #:1
Parsing trouble
java.lang.IllegalArgumentExcepti ...

---- Stack Trace ----
# of exceptions: 1
------Mismatched numbers between unique item id's: 3 [EstHER2, ER, PR] and number of Template.Class's: 4
EstHER2 ER ProHER2 PR ------
java.lang.IllegalArgumentException: Mismatched numbers between unique item id's: 3 [EstHER2, ER, PR] and number of Template.Class's: 4
EstHER2 ER ProHER2 PR
at org.gsea_msigdb.gsea/edu.mit.broad.genome.objects.TemplateImpl.runChecksInit(TemplateImpl.java:327)
at org.gsea_msigdb.gsea/edu.mit.broad.genome.objects.TemplateImpl.assignItems2ClassInOrder(TemplateImpl.java:373)
at org.gsea_msigdb.gsea/edu.mit.broad.genome.objects.TemplateFactory.createTemplate_ordered_assign(TemplateFactory.java:221)
at org.gsea_msigdb.gsea/edu.mit.broad.genome.objects.TemplateFactory.createTemplate(TemplateFactory.java:114)
at org.gsea_msigdb.gsea/edu.mit.broad.genome.parsers.ClsParser._parse_genecluster_style_categorical(ClsParser.java:250)
at org.gsea_msigdb.gsea/edu.mit.broad.genome.parsers.ClsParser.parse(ClsParser.java:226)
at org.gsea_msigdb.gsea/edu.mit.broad.genome.parsers.ParserFactory._readTemplates(ParserFactory.java:341)
at org.gsea_msigdb.gsea/edu.mit.broad.genome.parsers.ParserFactory.readTemplate(ParserFactory.java:292)
at org.gsea_msigdb.gsea/edu.mit.broad.genome.parsers.ParserFactory.read(ParserFactory.java:752)
at org.gsea_msigdb.gsea/edu.mit.broad.genome.parsers.ParserFactory.read(ParserFactory.java:725)
at org.gsea_msigdb.gsea/edu.mit.broad.genome.parsers.ParserWorker.doInBackground(ParserWorker.java:51)
at java.desktop/javax.swing.SwingWorker$1.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.desktop/javax.swing.SwingWorker.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)


miRNA.cls

Anthony Castanza

Feb 17, 2021, 1:16:32 PM
to Rohitesh Gupta, gsea...@googlegroups.com

Hi Rohitesh,

It looks like your CLS file defines 4 sample types (phenotypes) in the 2nd line, but the 3rd line, where you positionally assign phenotypes to your 4 samples, uses only 3 of them (EstHER2, ER, PR); ProHER2 is never assigned. The expectation with a CLS file is that every defined phenotype is assigned to at least one sample.

You can see the guidelines for the CLS format here: https://software.broadinstitute.org/cancer/software/gsea/wiki/index.php/Data_formats#CLS:_Categorical_.28e.g_tumor_vs_normal.29_class_file_format_.28.2A.cls.29

Additionally, GSEA typically expects replicates for each phenotype, so ideally you should have at a bare minimum three samples assigned to each of the phenotypes you’re interested in analyzing.
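
The checks described in this reply can be sketched as a small Python pre-flight script to run before loading a categorical CLS file. This is only an illustration: `check_cls` is a hypothetical helper, not part of GSEA, and the sample text below is invented to reproduce the same kind of mismatch reported in the stack trace; it is not the actual miRNA.cls. It assumes the three-line layout from the linked guidelines (counts on line 1, `#` plus class names on line 2, one label per sample on line 3).

```python
from collections import Counter


def check_cls(text, min_replicates=3):
    """Return a list of problems found in categorical CLS text."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if len(lines) != 3:
        return [f"expected 3 non-empty lines, found {len(lines)}"]

    header = lines[0].split()
    if len(header) != 3 or not all(tok.isdigit() for tok in header[:2]):
        return ["line 1 should read: <num samples> <num classes> 1"]
    n_samples, n_classes = int(header[0]), int(header[1])

    problems = []
    if not lines[1].startswith("#"):
        problems.append("line 2 should start with '#'")
    names = lines[1].lstrip("#").split()
    labels = lines[2].split()
    counts = Counter(labels)

    if len(names) != n_classes:
        problems.append(
            f"line 2 names {len(names)} classes but line 1 declares {n_classes}")
    if len(labels) != n_samples:
        problems.append(
            f"line 3 has {len(labels)} labels but line 1 declares {n_samples} samples")
    if len(counts) != n_classes:
        problems.append(
            f"line 3 uses {len(counts)} distinct labels but line 1 declares {n_classes} classes")
    for label, n in counts.items():
        if n < min_replicates:
            problems.append(
                f"label {label!r} has only {n} sample(s); {min_replicates} or more are advised")
    return problems


# Hypothetical file content: 4 classes declared, only 3 used on line 3.
bad_cls = "4 4 1\n# EstHER2 ER ProHER2 PR\nEstHER2 ER EstHER2 PR\n"
for problem in check_cls(bad_cls):
    print(problem)

# A consistent two-class file with three replicates each passes cleanly.
good_cls = "6 2 1\n# A B\nA A A B B B\n"
print(check_cls(good_cls))
```

The replicate threshold is a parameter rather than a hard rule, since the three-sample minimum is advice about statistical power and not a format requirement.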

 

-Anthony

Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego
