Hello,
Setting an additional NES cutoff would be largely arbitrary, yes. A cutoff of 1.8 is fairly high, but not unreasonable for picking out the strongest enrichment signals in the dataset. As long as it is applied together with the standard nominal p-value and FDR cutoffs, it's not inherently scientifically objectionable (although you may miss weaker signals in the data).
There is a small misconception about the NES here, though. The NES of a set is not affected by any other set in the calculation; only the FDR is a global statistic like that. The NES is affected only by the set's own null distribution: in phenotype permutation mode, the scores of the same set under random permutations of the sample labels; in gene set permutation mode, the scores of random gene sets of identical size. In neither mode does the NES depend on collection size. Moreover, because the null is always generated from a set of the same size, the normalization removes the effect of gene set size on the enrichment score. What can be adversely affected by collection size is the FDR statistic, which is why we recommend running GSEA with the lowest-level subcollection applicable to your analysis.
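To make that concrete, here is a minimal Python sketch of the idea in gene set permutation mode. It is a simplified illustration, not the actual GSEA implementation: the function names, the toy data, and the fixed random seed are all assumptions made for the example. The point is only that the NES function takes the ranked list and one set as input and never sees the rest of the collection.

```python
import numpy as np

def enrichment_score(ranked_scores, in_set, weight=1.0):
    """Simplified weighted running-sum ES over a ranked gene list."""
    in_set = np.asarray(in_set, dtype=bool)
    hit_weights = np.where(in_set, np.abs(ranked_scores) ** weight, 0.0)
    p_hit = np.cumsum(hit_weights) / hit_weights.sum()
    p_miss = np.cumsum(~in_set) / (~in_set).sum()
    running = p_hit - p_miss
    # ES is the running-sum value with the largest absolute deviation from zero
    return running[np.argmax(np.abs(running))]

def normalized_es(ranked_scores, in_set, n_perm=1000, seed=0):
    """NES from the set's own null: random gene sets of identical size."""
    rng = np.random.default_rng(seed)  # fixed seed is an assumption for reproducibility
    in_set = np.asarray(in_set, dtype=bool)
    es = enrichment_score(ranked_scores, in_set)
    # Shuffling the membership mask draws a random set of the same size
    null = np.array([
        enrichment_score(ranked_scores, rng.permutation(in_set))
        for _ in range(n_perm)
    ])
    # Normalize by the mean of the null scores that share the sign of the observed ES
    same_sign = null[np.sign(null) == np.sign(es)]
    return es / np.abs(same_sign).mean()

# Toy ranked list of 500 genes, sorted from most positive to most negative score
data_rng = np.random.default_rng(42)
ranked_scores = np.sort(data_rng.normal(size=500))[::-1]

# Hypothetical set A: 25 genes, mostly concentrated at the top of the list
set_a = np.zeros(500, dtype=bool)
set_a[:15] = True
set_a[np.arange(100, 350, 25)] = True

nes_alone = normalized_es(ranked_scores, set_a)

# Score set A again as part of a larger collection of unrelated random sets
collection = {"A": set_a}
for i in range(20):
    other = np.zeros(500, dtype=bool)
    other[data_rng.choice(500, size=40, replace=False)] = True
    collection[f"other_{i}"] = other

nes_by_set = {name: normalized_es(ranked_scores, s) for name, s in collection.items()}
nes_in_collection = nes_by_set["A"]
```

Here `nes_alone` and `nes_in_collection` are identical, because nothing about the other sets enters the calculation. An FDR computed over `nes_by_set` would change as sets are added or removed, since it pools the observed and null NES values across the whole collection.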
Do let us know if you have any additional questions.
-Anthony
Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego