numMarkers: 100 cannot be larger than dataset size: 50


ankita lawarde

Jan 15, 2018, 1:39:17 AM
to gsea-help
Dear Sir/ma'am,

I am using GSEA on my data and it is throwing the error "numMarkers: 100 cannot be larger than dataset size: 50". In the parameter section I had set the max size to 50 and the min size to 50. Earlier it showed no such error when I had set the size parameter to the number of rows (54675), but this time it shows the error.

I have attached my files to this mail, but there are no errors in the files.

I look forward to hearing from you.

Regards,
Ankita Lawarde
endothelium_50_chip.chip
endothelium_50_gct.gct
endothelium_50_grp.grp
tumor_non_tumor.cls

David Eby

Jan 16, 2018, 2:10:40 AM
to gsea-help
Hi Ankita,

This looks like a misunderstanding about the purpose of that parameter.  As explained in the User Guide on the Run GSEA Page, this parameter simply controls the "Number of features (gene or probes) to include in the butterfly plot in the Gene Markers section of the gene set enrichment report."  It need not - and in fact should not - match the number of features in the dataset.
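
As a rough illustration of why the error appears, the message is consistent with a simple comparison between the marker count and the number of features in the dataset. The sketch below is not GSEA's actual code; the function name `check_num_markers` is hypothetical, and only the wording of the message is taken from the report above.

```python
def check_num_markers(num_markers: int, dataset_size: int) -> None:
    """Hypothetical stand-in for the validation behind the error message.

    Assumption: the check is a plain comparison of the requested number of
    plot markers against the number of features (rows) in the dataset.
    """
    if num_markers > dataset_size:
        raise ValueError(
            f"numMarkers: {num_markers} cannot be larger than "
            f"dataset size: {dataset_size}"
        )


# 100 markers against a 50-feature dataset triggers the message.
try:
    check_num_markers(100, 50)
except ValueError as err:
    print(err)

# A full dataset (54675 rows, as in the original post) passes silently.
check_num_markers(100, 54675)
```

Under that assumption, the error says nothing about the gene set size thresholds; it only reflects that the dataset has fewer features than the number of markers requested for the plot.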

As that page points out, the entire Advanced Fields section of parameters should be avoided unless you're highly familiar with GSEA.  While this parameter could quite arguably have a better name, it's best to leave all of these set at their defaults in general unless directed otherwise.  I'll make a note of changing the parameter name for a possible future release, though.

Regarding the max & min thresholds, these are related to the number of matches from the Gene Set (GMT, GMX, or GRP) rather than the number of genes in the dataset.  Again, especially with MSigDB Gene Sets, there should be no need to change these other than minor tweaking.
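
To make the distinction concrete, here is a minimal sketch of a size filter that counts matches between each gene set and the dataset, rather than counting dataset rows. It is an illustration only, not GSEA's implementation; `filter_gene_sets` and its arguments are hypothetical names, and the 15 and 500 values are just example thresholds.

```python
def filter_gene_sets(gene_sets, dataset_genes, min_size=15, max_size=500):
    """Keep gene sets whose overlap with the dataset is within the size limits.

    gene_sets: dict mapping a set name to a list of gene identifiers.
    dataset_genes: iterable of gene identifiers present in the dataset.
    The thresholds apply to the number of MATCHED genes, not to the size
    of the dataset itself.
    """
    dataset = set(dataset_genes)
    kept = {}
    for name, members in gene_sets.items():
        matched = set(members) & dataset  # genes found in the dataset
        if min_size <= len(matched) <= max_size:
            kept[name] = sorted(matched)
    return kept


# Toy example with made-up identifiers.
dataset_genes = ["A", "B", "C", "D"]
gene_sets = {
    "S1": ["A", "B", "X"],  # 2 matches: kept
    "S2": ["A", "Y", "Z"],  # 1 match: dropped
}
print(filter_gene_sets(gene_sets, dataset_genes, min_size=2, max_size=3))
```

In this sketch, setting both thresholds to 50 would keep only gene sets with exactly 50 matching genes, which is rarely what is wanted.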

By the way, it's best to work with full datasets (thousands of genes) rather than filtered lists, as filtering blunts the statistical power of GSEA.

Regards,
David Eby

David Eby
Consultant
Cancer Informatics Development
Broad Institute of MIT and Harvard
415 Main St, Cambridge, MA 02142, USA
http://www.broadinstitute.org/cancer
http://www.gsea-msigdb.org
https://twitter.com/GSEA_MSigDB
https://twitter.com/GenePattern

ankita lawarde

Jan 16, 2018, 5:00:47 AM
to gsea...@googlegroups.com
Hi David Eby,

Thank you, sir, for clearing up my doubt. I now understand this point clearly.
Thank you very much.



Regards,
Ankita Lawarde

--
You received this message because you are subscribed to a topic in the Google Groups "gsea-help" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/gsea-help/1E4zDo2qw3k/unsubscribe.
To unsubscribe from this group and all its topics, send an email to gsea-help+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gsea-help/8dd9c9e8-38f5-4631-aeef-79ed5218ce3d%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Lisette Chávez

Jun 17, 2022, 7:54:06 PM
to gsea-help
Hello, I am working with GSEA, but I am receiving the same error: numMarkers: 100 cannot be larger than dataset size: 89. In this case, my list has 89 genes. I have tried to change the parameter from 100 to 80, but 100 is selected by default in the software.
Could you please help me understand this error and why I can't run the program?
Thank you in advance.

Best regards,
Lisette Chávez


Anthony Castanza

Jun 18, 2022, 1:54:37 PM
to gsea-help
Hi Lisette,

89 genes is not a sufficient number of genes to run GSEA. GSEA is not intended to be run with highly restricted datasets; rather, it expects information for all expressed genes.
If you supply a full expression dataset (or ranked list, depending on what mode of GSEA you are running), this error should go away, as that would be within the expected design parameters of the method.

-Anthony

Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego


Lisette Chávez

Jun 20, 2022, 10:55:10 AM
to gsea-help
Dear Anthony,

Thank you very much for the answer. I will try it the way you suggest.
Best regards,

Lisette
