error 1001: None of the gene sets that you specified passed the size threshold.


Emilie elvira-matelot

Jul 12, 2018, 8:30:19 AM
to gsea...@googlegroups.com
Hi,

I am having trouble running my GSEA analysis. I previously ran it on one of my files and it worked well. Recently I tried to run it again and received the error 1001 message. It was the same file that I loaded into GSEA, so I don't understand why it no longer works.
I have attached the file just in case.


Mac OS 10.10.5
Java 8 update 66
GSEA 3.0

Thank you for your help!

Emilie

IRvsNIR.gct

David Eby

Jul 12, 2018, 9:44:58 PM
to gsea-help
Hi Emilie,

By far the most common cause of the pruning error is a symbol namespace mismatch. It's difficult to say any more without knowing the gene set database used in the analysis, but the key thing is consistency between the symbols in the dataset and those in the gene sets. If you are using the MSigDB GMTs we provide online, just make sure that your gene symbols are in HUGO format. Otherwise, all the genes will be filtered out as not being present in the database.

One other thing to mention is that your expression dataset is a bit small (621 genes), and that might also be the cause, in conjunction with the chosen gene set DB. With so few genes, it's possible that only a small number of genes are left in each gene set after filtering, so that the sets fall below the minimum size threshold. While it's possible to adjust these thresholds, doing so might leave you with only minimally viable gene sets.

In general, we recommend using datasets on the scale of full genomes, or thousands of genes, if possible.
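
As a rough way to check both possible causes outside of GSEA, the sketch below counts how many genes of each gene set are actually present in the dataset and reports which sets would survive a size filter. It is only an illustration, not GSEA's own code: the function names are made up for this example, it assumes the standard tab-delimited GCT layout (two header lines, then a column header row) and GMT layout (name, description, genes), and the 15/500 limits are assumed to match the GSEA default min/max set sizes. The matching is case-sensitive, so a symbol mismatch shows up as an overlap of zero.

```python
import io


def read_gct_symbols(handle):
    """Return the set of row identifiers (first column) of a GCT file."""
    lines = handle.read().splitlines()
    # Line 1 is the version tag, line 2 the dimensions, line 3 the column header.
    return {line.split("\t")[0].strip() for line in lines[3:] if line.strip()}


def read_gmt(handle):
    """Return {gene_set_name: set_of_genes} from a GMT file."""
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        # Field 0 is the set name, field 1 a description, the rest are genes.
        gene_sets[fields[0]] = {gene for gene in fields[2:] if gene}
    return gene_sets


def sets_passing_size_filter(dataset_symbols, gene_sets, min_size=15, max_size=500):
    """Return {set_name: overlap_size} for sets whose overlap with the dataset
    lies within [min_size, max_size]. The defaults are an assumption."""
    passing = {}
    for name, genes in gene_sets.items():
        overlap = len(genes & dataset_symbols)
        if min_size <= overlap <= max_size:
            passing[name] = overlap
    return passing


# Tiny in-memory demo with made-up data; real files would be opened with open().
demo_gct = io.StringIO(
    "#1.2\n3\t2\nNAME\tDescription\tS1\tS2\n"
    "TP53\tna\t1\t2\nMYC\tna\t3\t4\nEGFR\tna\t5\t6\n"
)
demo_gmt = io.StringIO("SET_A\tdesc\tTP53\tMYC\tBRCA1\nSET_B\tdesc\ttp53\tmyc\n")
demo_symbols = read_gct_symbols(demo_gct)
demo_sets = read_gmt(demo_gmt)
print(sets_passing_size_filter(demo_symbols, demo_sets, min_size=2))
```

If every gene set comes back with an overlap of zero, the symbols probably do not match the database; if the overlaps are non-zero but all below the minimum, the dataset is likely too small for that collection.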

Regards,

Emilie elvira-matelot

Jul 13, 2018, 8:33:27 AM
to gsea...@googlegroups.com
Hi David,
Thank you for your answer.
I used the h.all.v6.1.symbols gene set database. I checked the gene symbol format and it's OK.
What I don't understand is that my analysis worked some months ago. Did something change between last February-March and now?

Thanks again,

Emilie


David Eby

Jul 16, 2018, 10:31:09 PM
to gsea-help
Do you mean Feb-March 2017 or 2018?

If you mean 2017, then yes, actually there was a pretty substantial change: we released MSigDB v6.1 in October 2017.  MSigDB v6.0 came out Feb 2017, which is right in that timeframe.  So if that's what you meant then it might be worthwhile to go back and re-run against the MSigDB v6.0 files to see if there's a difference.  I don't *think* the Hallmarks collection changed much, though, so it seems unlikely in any case.

If you meant 2018, then nothing should have changed on our end since then.

If you're working on the same computer, you might try going through the GSEA history directory to find your old analysis for comparison.  Use "Help > Show GSEA home folder" and then look in 'output' for the correct day.  There should be an RPT file with the analysis settings, etc. that could be useful.
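
If there are many dated folders to look through, a small script can list the report files for you. This is only a sketch: `find_rpt_files` is a made-up helper name, the folder you pass in should be whatever "Help > Show GSEA home folder" opens on your machine, and the lowercase `.rpt` extension is an assumption about how the report files are named.

```python
from pathlib import Path


def find_rpt_files(gsea_home):
    """Return all *.rpt files under <gsea_home>/output, newest first."""
    output_dir = Path(gsea_home) / "output"
    # rglob searches every dated subfolder below 'output'.
    return sorted(
        output_dir.rglob("*.rpt"),
        key=lambda path: path.stat().st_mtime,
        reverse=True,
    )
```

Calling `find_rpt_files(...)` with your GSEA home folder and printing the results should make it easier to spot the run from the earlier analysis and compare its settings with the failing one.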

Regards,
David
