enrichment map does not work and GSEA also doesn't work for certain samples


bisht@osu

Dec 27, 2018, 7:07:39 PM
to gsea-help

1. GSEA and Cytoscape

I am using Gene Set Enrichment Analysis (GSEA, version 3.0) from the Broad Institute (http://software.broadinstitute.org/gsea/index.jsp), which performs gene set enrichment analysis and can visualize the results using Cytoscape (version 3.7.0).

I can run the gene set enrichment analysis (although I am not sure the results are correct), but I am unable to visualize it in Cytoscape. Cytoscape opens when I click on Visualize Enrichment Maps, but it does not let me build the maps.

I also tried loading the GSEA results directly by opening the Enrichment Map app (plugin) in Cytoscape, but even when I load the files in the correct format, I am unable to build the maps. I got the GMT file from MSigDB (c2.all.v2.symbols.gmt), and the chip platform I used was GENE_SYMBOL.chip, available on the GSEA website. I created the class file using the column names of the dataset that I uploaded, and I specified the phenotypes as RB+ (WT) = positive and RB- = negative. Finally, I exported the Excel data as a tab-delimited text file (.txt), which is one of the supported input formats for the GSEA tool.

I am attaching an image of the error code that I get

 

When I searched for the error code online, the results suggested that my connection to Cytoscape probably drops when I try to build the enrichment maps: it makes the initial connection but does not stay connected.

2. Issues with the analysis:

I tried to repeat the analysis with another file (a different sample than above). I made sure that the cls file, the txt input file, and the gmt and chip files mentioned above are in the formats shown in the GSEA documentation on Wikipedia. This time I got an error in the analysis itself.

Attached below is the error.

 

to parse>C:\Users\Madhoolika\Desktop\Work\GSEA\New Trial\CHIP file\GENE_SYMBOL.chip< got: [C:\Users\Madhoolika\Desktop\Work\GSEA\New Trial\CHIP file\GENE_SYMBOL.chip]

INFO - Collapsing dataset was done. Original: 25562x2 (ann: 28506,2,GENE_SYMBOL.chip) collapsed: 320x2 (ann: 320,2,GENE_SYMBOL)

to parse>gseaftp.broadinstitute.org://pub/gsea/gene_sets_final/c2.v2.symbols.gmt< got: [gseaftp.broadinstitute.org://pub/gsea/gene_sets_final/c2.v2.symbols.gmt]

INFO - File download started.  Retrieving c2.v2.symbols.gmt from remote server...

INFO - Download complete

INFO - Begun importing: GeneSetMatrix from: c2.v2.symbols.gmt

INFO - Got gsets: 1687 now preprocessing them ... min: 15 max: 500

Done removeGeneSetsSmallerThan: 15 for: 501 / 1353

Done removeGeneSetsSmallerThan: 15 for: 1001 / 1353

INFO - Done preproc for smaller than: 15

ERROR - Tool exec error

xtools.api.param.BadParamException: After pruning, none of the gene sets passed size thresholds.

               at xtools.api.param.ParamFactory.checkAndBarfIfZeroSets(ParamFactory.java:88)

               at xtools.gsea.Gsea.execute(Gsea.java:111)

               at edu.mit.broad.xbench.tui.TaskManager$ToolRunnable.run(TaskManager.java:436)

               at java.lang.Thread.run(Unknown Source)

INFO - Renaming rpt dir on error to: C:\Users\Madhoolika\gsea_home\output\dec27\error_my_analysis.Gsea.1545954900756

Anthony Castanza

Dec 27, 2018, 7:46:02 PM
to gsea...@googlegroups.com
Hello,

It is difficult to say why you might be encountering that error with Enrichment Map. Enrichment Map is a product of the Bader Lab (http://www.baderlab.org/Software/EnrichmentMap), which we include in GSEA through an agreement with them. However, your second error may have something to do with it, so we can try to work through that first.

The error "After pruning, none of the gene sets passed size thresholds." is usually the result of some sort of improper pruning of the gene list.
I noticed in your output this line: "Collapsing dataset was done. Original: 25562x2 (ann: 28506,2,GENE_SYMBOL.chip) collapsed: 320x2"

Your 28506 initial gene identifiers appear to be getting collapsed to a list of just 320 items, which is too small to run GSEA. I suspect that the cause is your use of the "Gene_Symbol.chip" file to collapse your data. That file is deprecated, is no longer included in the release GSEA software, and is only available through the website for legacy reasons.

What is the original namespace of your data: mouse gene symbols, human gene symbols, Ensembl IDs, Entrez IDs, etc.? MSigDB is available in HUGO symbols or Entrez IDs. We offer chip files to convert some specific IDs through the software (like Affymetrix IDs and Ensembl IDs), and you can easily create other CHIP files through (for example) BioMart. Details of the CHIP format are available here: http://software.broadinstitute.org/cancer/software/gsea/wiki/index.php/Data_formats#CHIP:_Chip_file_format_.28.2A.chip.29
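To illustrate why a collapsed list of only 320 genes can leave no gene sets standing, here is a minimal Python sketch of size-based pruning. It is not GSEA's actual implementation: `read_gmt` and `prune_gene_sets` are hypothetical helper names, the GMT layout is assumed to be name, description, then gene symbols separated by tabs, and the default thresholds (min 15, max 500) are taken from the log quoted above.

```python
def read_gmt(path):
    """Read a GMT file into {set_name: set_of_genes}.

    Assumes each line is: name <TAB> description <TAB> gene1 <TAB> gene2 ...
    """
    gene_sets = {}
    with open(path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 3:
                continue  # skip blank or malformed lines
            name, genes = fields[0], fields[2:]
            gene_sets[name] = {g for g in genes if g}
    return gene_sets


def prune_gene_sets(gene_sets, dataset_genes, min_size=15, max_size=500):
    """Keep only gene sets that still have an acceptable size after
    being restricted to the genes present in the dataset."""
    dataset_genes = set(dataset_genes)
    kept = {}
    for name, members in gene_sets.items():
        overlap = set(members) & dataset_genes
        if min_size <= len(overlap) <= max_size:
            kept[name] = overlap
    return kept


if __name__ == "__main__":
    # Toy data: one set of 20 genes, one set of 10 genes.
    toy_sets = {
        "SET_A": [f"G{i}" for i in range(20)],
        "SET_B": [f"G{i}" for i in range(10)],
    }
    # A dataset that covers all 20 genes keeps SET_A only.
    print(sorted(prune_gene_sets(toy_sets, [f"G{i}" for i in range(20)])))
    # A dataset that lost most of its genes keeps nothing.
    print(sorted(prune_gene_sets(toy_sets, ["G1", "G2"])))
```

Running the same check on the real dataset identifiers and the downloaded GMT file would show how many gene sets survive before starting a full analysis.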

Please try your analysis with a CHIP file mapping the identifiers used in your specific dataset to Human Gene Symbols and let me know if that fixes the error you are observing.
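As a rough sketch of what producing such a CHIP file could look like in Python: this assumes the three-column, tab-delimited layout (Probe Set ID, Gene Symbol, Gene Title) described on the Data formats page linked above, so please check the header against that page before using it. `format_chip` and `write_chip` are hypothetical helper names, and the two example rows are for illustration only.

```python
CHIP_HEADER = ("Probe Set ID", "Gene Symbol", "Gene Title")


def format_chip(rows):
    """Return CHIP file text for rows of (identifier, symbol, title).

    Rows with an empty identifier or symbol are skipped; a missing
    title is written as "NA".
    """
    lines = ["\t".join(CHIP_HEADER)]
    for identifier, symbol, title in rows:
        if not identifier or not symbol:
            continue  # nothing useful to map
        lines.append("\t".join([identifier, symbol, title or "NA"]))
    return "\n".join(lines) + "\n"


def write_chip(rows, path):
    """Write the formatted CHIP text to the given path."""
    with open(path, "w") as handle:
        handle.write(format_chip(rows))


if __name__ == "__main__":
    # Illustrative mapping from Ensembl gene IDs to human gene symbols.
    example_rows = [
        ("ENSG00000141510", "TP53", "tumor protein p53"),
        ("ENSG00000139687", "RB1", "RB transcriptional corepressor 1"),
    ]
    print(format_chip(example_rows), end="")
```

In practice the rows would come from an export of the relevant identifier-to-symbol mapping (for example from BioMart) rather than being typed by hand.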

-Anthony

Anthony S. Castanza, PhD
Curator, MSigDB
Mesirov Lab, Department of Medicine
University of California, San Diego


madhool...@gmail.com

Jan 14, 2019, 12:49:34 AM
to gsea-help
Hello,

Thank you for the reply; that was helpful. However, I am unable to identify which CHIP file I should use, since my RNA-seq data already has gene symbols and I can see that the HG files are specific to the platform used. Should I convert my gene symbols to Affymetrix or Entrez IDs and then use HG_U133.CHIP or one of the other CHIPs? Do let me know. Thank you.

Sincere Regards,
Madhoolika Bisht

Anthony Castanza

Jan 14, 2019, 11:09:04 AM
to gsea-help
Hi,

What species is this RNA-seq data from? If it is already in human gene symbols, you should set the collapse dataset option to false.

If it is in gene symbols for mouse or some other species, the current recommendation is to construct your own chip file for your dataset by exporting the relevant symbol mappings from, for example, Ensembl BioMart. I have attached an example of such a chip file to this reply; however, this file has not been tested, and I am unsure how GSEA will handle the multi-mappings present in it. It is provided without any guarantees whatsoever, and I strongly recommend producing your own chip file to suit your own needs.
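Since the effect of multi-mappings is uncertain, one option is to list them before running GSEA. The following is a minimal Python sketch, assuming the chip rows have already been read as (identifier, symbol) pairs; `find_multi_mappings` is a hypothetical helper name and the identifiers in the example are placeholders, not real genes.

```python
from collections import defaultdict


def find_multi_mappings(rows):
    """Return {identifier: sorted list of symbols} for every identifier
    that maps to more than one distinct symbol."""
    symbols_by_id = defaultdict(set)
    for identifier, symbol in rows:
        symbols_by_id[identifier].add(symbol)
    return {
        identifier: sorted(symbols)
        for identifier, symbols in symbols_by_id.items()
        if len(symbols) > 1
    }


if __name__ == "__main__":
    # Placeholder rows: MouseGeneA maps to two different human symbols.
    example_rows = [
        ("MouseGeneA", "HUMAN1"),
        ("MouseGeneA", "HUMAN2"),
        ("MouseGeneB", "HUMAN3"),
        ("MouseGeneB", "HUMAN3"),  # exact duplicate, not a multi-mapping
    ]
    for identifier, symbols in find_multi_mappings(example_rows).items():
        print(identifier, "->", ", ".join(symbols))
```

How to resolve the reported identifiers (drop them, or pick one symbol per identifier) is a judgment call that depends on the dataset.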

-Anthony

Anthony S. Castanza
Curator, MSigDB
Mesirov Lab, Department of Medicine
University of California, San Diego

DRAFT_Mouse_Gene_Symbol.chip