GSEA 4.0.1 cli -- file not found. gmx contents instead of gmx name

Sol Katzman

Aug 30, 2019, 6:59:27 PM
to gsea-help
Dear GSEA,

I am running the GSEA 4.0.1 CLI under Linux.

I successfully ran the UI version under Windows 10.

I am using local copies of the p53 test files (gct and cls) and the hallmark gene sets (gmt).

I grabbed the command line from the Windows UI run and changed the file pointers for Linux. See the command line below.

I am getting this Java error:

java.io.FileNotFoundException: File not found: /public/groups/wet/Sequences/programs/gsea/GSEA_4.0.1/HALLMARK_G2M_CHECKPOINT    http:/www.gsea-msigdb.org/gsea/msigdb/cards/HALLMARK_G2M_CHECKPOINT        AURKA   CCNA2   TOP2A   CCNB2   CENPA ...

This is part of the contents of the gmt file, h.all.v7.0.symbols.gmt.

The output from java indeed shows -gmx as a comma-separated list of the lines in the gmt file, rather than a pointer to the gmt file.

1581     [INFO  ] - Parameters passed to GSEA tool:
1581     [INFO  ] - gmx    HALLMARK_TNFA_SIGNALING_VIA_NFKB    http://www.gsea-msigdb.org/gsea/msigdb/cards/HALLMARK_TNFA_SIGNALING_VIA_NFKB    JUNB    CXCL2    ATF3    NFKBIA    TNFAIP3    PTGS2    CXCL1    IER3    CD83    CCL20    CXCL3    MAFF    NFKB2    TNFAIP2    HBEGF    KLF6    BIRC3    PLAUR    ZFP36...SERPINB8    MXD1,HALLMARK_HYPOXIA    http://www.gsea-msigdb.org/gsea/msigdb/cards/HALLMARK_HYPOXIA    PGK1    PDK1    GBE1    PFKL...NUDT21    RBX1    SRSF6    GMPR2    DCTN4    COX17    CMPK2    CCNO,HALLMARK_G2M_CHECKPOINT    http://www.gsea-msigdb.org/gsea/msigdb/cards/HALLMARK_G2M_CHECKPOINT    AURKA    CCNA2    TOP2A    CCNB2    CENPA    BIRC5    CDC20    PLK1    TTK    PRC1...
1588     [INFO  ] - res    /public/groups/wet/Sequences/programs/gsea/GSEA_p53_datasets/P53_collapsed_symbols.gct
1588     [INFO  ] - cls    /public/groups/wet/Sequences/programs/gsea/GSEA_p53_datasets/P53.cls#MUT_versus_WT
1588     [INFO  ] - rpt_label    p53_symbols_hallmark_perm100
1588     [INFO  ] - collapse    false

Note that there is no problem with -res or -cls pointers.

Note that G2M_CHECKPOINT is the 9th line in the gmt file.
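
The shape of the logged gmx value can be reproduced with standard shell tools, which may help confirm the diagnosis. The sketch below builds a tiny stand-in GMT file (two sets, abbreviated from the log above; the real h.all.v7.0.symbols.gmt is not used), joins its lines with commas, and reports which line a given set is on. It only mimics the symptom and makes no claim about how GSEA reads the file internally.

```shell
# Build a tiny stand-in GMT file (tab-separated: set name, URL, genes).
# The two sets and their gene lists are abbreviated from the log above.
gmt=$(mktemp)
printf 'HALLMARK_HYPOXIA\thttp://www.gsea-msigdb.org/gsea/msigdb/cards/HALLMARK_HYPOXIA\tPGK1\tPDK1\n' > "$gmt"
printf 'HALLMARK_G2M_CHECKPOINT\thttp://www.gsea-msigdb.org/gsea/msigdb/cards/HALLMARK_G2M_CHECKPOINT\tAURKA\tCCNA2\n' >> "$gmt"

# Join the file's lines with commas: the same shape as the logged gmx value.
joined=$(paste -sd, "$gmt")
printf '%s\n' "$joined"

# Find the line number of one set by matching the first tab-separated field.
g2m_line=$(awk -F'\t' '$1 == "HALLMARK_G2M_CHECKPOINT" { print NR }' "$gmt")
echo "HALLMARK_G2M_CHECKPOINT is on line $g2m_line of the stand-in file"

rm -f "$gmt"
```

If the real log entry looks like this joined string, each line of the file given to -gmx is probably being treated as a separate list entry.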

Any idea what could be wrong?

Thanks,
Sol Katzman
UC Santa Cruz, Genomics Institute.

-------------------------------------------------------------------------------
gsea-cli.sh GSEA \
-res /public/groups/wet/Sequences/programs/gsea/GSEA_p53_datasets/P53_collapsed_symbols.gct \
-cls /public/groups/wet/Sequences/programs/gsea/GSEA_p53_datasets/P53.cls#MUT_versus_WT \
-gmx /public/groups/wet/Sequences/programs/gsea/MSIGDB/h.all.v7.0.symbols.gmt \
-collapse false \
-mode Max_probe \
-norm meandiv \
-nperm 100 \
-permute phenotype \
-rnd_type no_balance \
-scoring_scheme weighted \
-rpt_label p53_symbols_hallmark_perm100 \
-metric Signal2Noise \
-sort real \
-order descending \
-create_gcts false \
-create_svgs false \
-include_only_symbols true \
-make_sets true \
-median false \
-num 100 \
-plot_top_x 20 \
-rnd_seed timestamp \
-save_rnd_lists false \
-set_max 500 \
-set_min 15 \
-zip_report false \
-out /public/groups/wet/Sequences/programs/gsea/TEST_runs



Sol Katzman

Aug 30, 2019, 7:07:48 PM
to gsea-help
I suppose I should give you the rest of the error message, which points to the java code that is apparently attempting to find the "file":

java.io.FileNotFoundException: File not found: ...
        at org.gsea_msigdb.gsea/edu.mit.broad.genome.parsers.ParserFactory.createInputStream(ParserFactory.java:1056)
        at org.gsea_msigdb.gsea/edu.mit.broad.genome.parsers.ParserFactory.read(ParserFactory.java:799)
        at org.gsea_msigdb.gsea/edu.mit.broad.genome.parsers.ParserFactory.read(ParserFactory.java:787)
        at org.gsea_msigdb.gsea/xtools.api.param.GeneSetMatrixChooserAbstractParam._getObjects(GeneSetMatrixChooserAbstractParam.java:138)
        at org.gsea_msigdb.gsea/xtools.api.param.GeneSetMatrixChooserAbstractParam._getGeneSets(GeneSetMatrixChooserAbstractParam.java:182)
        at org.gsea_msigdb.gsea/xtools.api.param.GeneSetMatrixChooserAbstractParam.getGeneSetMatrixCombo(GeneSetMatrixChooserAbstractParam.java:77)
        at org.gsea_msigdb.gsea/xtools.api.param.GeneSetMatrixMultiChooserParam.getGeneSetMatrixCombo(GeneSetMatrixMultiChooserParam.java:15)
        at org.gsea_msigdb.gsea/xtools.gsea.Gsea.execute(Gsea.java:112)
        at org.gsea_msigdb.gsea/xtools.api.AbstractTool.module_main(AbstractTool.java:434)
        at org.gsea_msigdb.gsea/org.genepattern.modules.GseaWrapper.main(GseaWrapper.java:308)
        at org.gsea_msigdb.gsea/xapps.gsea.CLI.main(CLI.java:30)

Sol Katzman

Sep 3, 2019, 12:42:38 PM
to gsea-help
Resolved (sort of)

Apparently the -gmx parameter does not directly take a pointer to a gmt file.

Instead it takes a pointer to a file containing a pointer (or a list of pointers?) to gmt file(s).
Note "echo", not "cat" here:

echo "MSIGDB/h.all.v7.0.symbols.gmt" > gmtList.txt

gsea-cli.sh GSEA \
-gmx gmtList.txt
...

The above works, but is NOT the same as what you get when you click the "command" icon after setting up "Run GSEA" in the Windows GUI version.

That command seems to show that -gmx should directly point to the gmt file:

gsea-cli.bat GSEA \
-gmx D:\Documents\DOWNLOADS\GSEA GeneSetEnrichmentAnalysis\gene_sets\h.all.v7.0.symbols.gmt \
...

Perhaps the unix and Windows versions are different.
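
The list-file workaround above can be wrapped in a small helper. This is a minimal sketch, not part of GSEA: the function name make_gmx_list is hypothetical, and writing an absolute path is an assumption, chosen because the error in the first message shows the entry being looked up under the GSEA_4.0.1 directory, so a relative entry may not resolve where you expect. The demo uses an empty stand-in file rather than a real gene set collection.

```shell
# Hypothetical helper (not part of GSEA): write a list file for -gmx that
# holds the absolute path of one GMT file, after checking the file exists.
make_gmx_list() {
    gmt_file=$1
    list_file=$2
    if [ ! -f "$gmt_file" ]; then
        echo "GMT file not found: $gmt_file" >&2
        return 1
    fi
    # Resolve the directory so the entry does not depend on the working directory.
    gmt_dir=$(cd "$(dirname "$gmt_file")" && pwd)
    printf '%s/%s\n' "$gmt_dir" "$(basename "$gmt_file")" > "$list_file"
}

# Demo with an empty stand-in file in a temporary directory.
demo_dir=$(mktemp -d)
: > "$demo_dir/h.all.v7.0.symbols.gmt"
make_gmx_list "$demo_dir/h.all.v7.0.symbols.gmt" "$demo_dir/gmtList.txt"
cat "$demo_dir/gmtList.txt"
```

The resulting gmtList.txt would then be passed as "-gmx gmtList.txt", as in the command above.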

It would be helpful if one could enter a command like "gsea-cli.sh help" or "gsea-cli.sh command help".

Or if there were more comprehensive documentation of the command parameters, as there is, for example,
for Picard (another set of tools from the Broad).

/Sol


David Eby

Sep 3, 2019, 10:10:55 PM
to gsea...@googlegroups.com
Hi Sol,

Sorry for the delayed response.  I didn't have time to get to this before now due to the US Labor Day weekend.

Thanks for diagnosing this.  It's definitely a bug.  

In the 4.0.0 release, we repurposed the GSEA GenePattern wrapper scripts for general CLI use since they solve a number of other issues we've had over the years.  The calling schemes are different in GenePattern as compared to normal CLI use, however, when multiple files can be involved.  That's the main difference here between -gmx vs. -cls or -res; the latter always point to individual files while the former can technically accept multiple files even though most folks won't use it that way.

If you're OK using this as a workaround, I'll discuss a fix with the team in our next meeting and we'll try to roll that out in an upcoming 4.0.x update.

Thanks,
David


Rui Li

Sep 27, 2019, 2:37:34 PM
to gsea-help
Thanks a lot David! Everything works after the following steps:
1. Download c5.all.v7.0.symbols.gmt.
2. Put the name 'c5.all.v7.0.symbols.gmt' (not the file's contents) into gmt.txt.
3. Run: $path/gsea-cli.sh GSEAPreranked -gmx $db/gmt.txt -norm meandiv -nperm 1000 -rnk $rnk -scoring_scheme classic -rpt_label $label -create_svgs false -make_sets true -plot_top_x $nplots -rnd_seed timestamp -set_max 500 -set_min 15 -zip_report false -out ./output
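
The steps above can be sketched as a dry run that writes the list file and prints the command instead of executing it. All values below are placeholders, not taken from the thread: gsea_dir stands in for $path, and the directories, rank file, label, and plot count are hypothetical. Writing the GMT's full path into gmt.txt is also an assumption; the post only says the file name goes there.

```shell
# Placeholder values; every path and label here is hypothetical.
gsea_dir=/opt/GSEA_4.0.1      # stands in for $path, the directory holding gsea-cli.sh
db=$(mktemp -d)               # directory holding the GMT file and the list file
rnk=ranked_genes.rnk
label=my_preranked_run
nplots=20

# Step 2: the list file holds the path of the GMT file, not its contents.
printf '%s\n' "$db/c5.all.v7.0.symbols.gmt" > "$db/gmt.txt"

# Step 3: assemble the command from the post and print it rather than run it.
set -- "$gsea_dir/gsea-cli.sh" GSEAPreranked -gmx "$db/gmt.txt" \
    -norm meandiv -nperm 1000 -rnk "$rnk" -scoring_scheme classic \
    -rpt_label "$label" -create_svgs false -make_sets true \
    -plot_top_x "$nplots" -rnd_seed timestamp -set_max 500 -set_min 15 \
    -zip_report false -out ./output
echo "$@"
```

Replacing the final echo with "$@" alone would run the command, assuming gsea-cli.sh exists at that location.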

