Can GSEA use DEGs for analysis?

kenny...@gmail.com

May 8, 2018, 5:37:17 AM

to gsea-help

GSEA is a great tool for gene expression analysis, but my colleague and I disagree about how to use it. I think GSEA should be run on the full gene expression dataset, but some published papers run it on DEGs only. Is it correct to use only DEGs? Looking forward to your reply. Thank you.

David Eby

May 9, 2018, 9:31:35 PM

to gsea-help

Hi,

By DEGs, I assume you mean Differentially Expressed Genes. While running GSEA on DEGs alone is possible, it is not recommended. In general, you are correct that GSEA is best run on all expression data without filtering.

This is covered in the Preparing Data Files section of our User Guide, in the discussion of filtering based on expression values. To quote:

During the analysis, genes that are poorly expressed or that have low variance across the dataset populate the middle of the ranked gene list and the use of a weighted statistic ensures that they do not contribute to a positive enrichment score. By removing such genes from your dataset, you may actually reduce the power of the statistic. Processing time is rarely a factor; GSEA can easily analyze 22,000 genes with even modest processing power.
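
To make the quoted point concrete, here is a minimal Python sketch of a weighted running-sum enrichment score of the kind the passage describes. It is an illustration only, not the GSEA implementation: the function name `enrichment_score`, the toy gene names, and the metric values are made up, and the weighting exponent `p=1` is an assumption. Genes in the set step the running sum up in proportion to the absolute value of their ranking metric, so set members near the middle of the list (metric close to zero) add almost nothing.

```python
import numpy as np

def enrichment_score(ranked_genes, metric, gene_set, p=1.0):
    """Weighted running-sum score over a ranked gene list (illustrative sketch).

    ranked_genes: gene names sorted by the ranking metric, highest first.
    metric:       ranking metric values in the same order.
    gene_set:     set of gene names to test.
    p:            weighting exponent (assumed here; p=0 gives equal weights).
    """
    metric = np.asarray(metric, dtype=float)
    in_set = np.array([g in gene_set for g in ranked_genes])
    n_miss = int((~in_set).sum())
    hit_weights = np.where(in_set, np.abs(metric) ** p, 0.0)
    hit_total = hit_weights.sum()
    if hit_total == 0 or n_miss == 0:
        raise ValueError("need at least one weighted hit and one miss")
    # Step up by the normalized weight at each hit, down by 1/n_miss at each miss.
    steps = hit_weights / hit_total - (~in_set) / n_miss
    running = np.cumsum(steps)
    # The score is the running-sum value with the largest absolute deviation from zero.
    es = float(running[np.argmax(np.abs(running))])
    return es, steps, running

# Toy ranked list: strong positive metrics at the top, near zero in the middle.
ranked_genes = [f"g{i}" for i in range(10)]
metric = [3.0, 2.5, 2.0, 1.0, 0.1, 0.0, -0.1, -1.0, -2.0, -3.0]
gene_set = {"g0", "g1", "g4", "g5"}  # two top-ranked members, two mid-list members

es, steps, running = enrichment_score(ranked_genes, metric, gene_set)
print("enrichment score:", round(es, 4))
print("step at top-ranked g0:", round(steps[0], 4))
print("step at mid-list g4:", round(steps[4], 4))
```

In this toy example the two mid-list members together contribute under 2% of the total upward movement, so the score is driven almost entirely by the two top-ranked members.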

We have lately softened this stance slightly in our advice for RNA-Seq data, in that it may be beneficial to filter out low-count measurements. While GSEA can still process unfiltered RNA-Seq data, doing so may lead to "choppy" scoring due to a large number of ties across the middle of the ranking. Aside from this, however, it is best to work with otherwise unfiltered data.
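
As a rough illustration of the low-count filtering mentioned above, the following Python sketch drops genes whose total read count falls below a threshold and counts how many genes are tied at zero under a simple ranking metric before and after. Everything here is hypothetical: the helper names, the toy count matrix, the threshold of 10, and the difference-of-mean-log2 metric are stand-ins chosen for the example, not GSEA defaults or recommendations.

```python
import numpy as np

def filter_low_counts(counts, gene_ids, min_total_count=10):
    """Keep genes whose summed count across samples reaches the threshold.

    min_total_count=10 is an arbitrary illustrative value.
    """
    counts = np.asarray(counts)
    keep = counts.sum(axis=1) >= min_total_count
    kept_ids = [g for g, k in zip(gene_ids, keep) if k]
    return counts[keep], kept_ids

def count_ties_at_zero(counts, group_a, group_b):
    """Count genes whose difference of mean log2(count + 1) is exactly zero."""
    logged = np.log2(np.asarray(counts, dtype=float) + 1.0)
    diff = logged[:, group_a].mean(axis=1) - logged[:, group_b].mean(axis=1)
    return int(np.sum(diff == 0.0))

# Toy matrix: rows are genes, columns are samples (two per phenotype).
gene_ids = ["geneA", "geneB", "geneC", "geneD", "geneE"]
counts = np.array([
    [100, 120, 300, 310],
    [0,   0,   0,   0],
    [0,   1,   1,   0],
    [50,  60,  20,  25],
    [0,   0,   0,   0],
])
group_a, group_b = [0, 1], [2, 3]

ties_before = count_ties_at_zero(counts, group_a, group_b)
filtered, kept_ids = filter_low_counts(counts, gene_ids)
ties_after = count_ties_at_zero(filtered, group_a, group_b)

print("kept genes:", kept_ids)
print("ties at zero before/after filtering:", ties_before, ties_after)
```

Note that this filter looks only at count magnitude, not at differential expression, so it is a different operation from restricting the dataset to DEGs.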

I hope this helps.

Regards,

David

David Eby
Consultant
Cancer Informatics Development
Broad Institute of MIT and Harvard
415 Main St, Cambridge, MA 02142, USA
http://www.broadinstitute.org/cancer
http://www.gsea-msigdb.org
https://twitter.com/GSEA_MSigDB
https://twitter.com/GenePattern