--
You received this message because you are subscribed to the Google Groups "gsea-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gsea-help+...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/gsea-help/5dada5cf-7a8b-4704-96d1-60688ba98da8n%40googlegroups.com.
The nominal p value estimates the statistical significance of the enrichment score for a single gene set. However, when you are evaluating multiple gene sets, you must correct for gene set size and multiple hypothesis testing. Because the p value is not adjusted for either, it is of limited value when comparing gene sets. The Gene Set Enrichment Analysis PNAS paper describes the p value statistic in the section titled Appendix: Mathematical Description of Methods.
The FDR is adjusted for gene set size and multiple hypothesis testing while the p value is not. When a top gene set has a small nominal p value and a high FDR value, it generally indicates that it is not as significant when compared with other gene sets in the empirical null distribution. This could be because you do not have enough samples, the biological signal is subtle, or the gene sets do not represent the biology in question very well. On the other hand, the FDR is based on two distributions of all gene sets; if only one of many gene sets is enriched, that gene set is likely to have a high FDR. Finally, a top gene set with a high nominal p value and a low FDR value generally indicates a negative result: the gene set itself is not significant and other sets are weaker.
In the GSEA report, a p value of zero (0.0) indicates an actual p value of less than 1/number-of-permutations. For example, if the analysis performed 100 permutations, a reported p value of 0.0 indicates an actual p value of less than 0.01. For a more accurate p value, increase the number of permutations performed by the analysis. Typically, you will want to perform 1000 permutations (phenotype or gene_set). (If you attempt to perform significantly more than 1000 permutations, GSEA may run out of memory.)
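To make the "p value of 0.0" rule concrete, here is a minimal Python sketch of an empirical permutation p value and of how a reported zero can be read as an upper bound. This is an illustration only, not GSEA's actual code: the function names `nominal_p_value` and `describe_p_value` are hypothetical, and the null enrichment scores are assumed to be already available as a plain list.

```python
def nominal_p_value(observed_es, null_es):
    """Fraction of same-signed null scores at least as extreme as the observed score.

    Assumption: the null distribution is a list of enrichment scores, one per
    permutation, and only the scores with the same sign as the observed score
    are used for comparison.
    """
    if observed_es >= 0:
        same_sign = [es for es in null_es if es >= 0]
        hits = sum(1 for es in same_sign if es >= observed_es)
    else:
        same_sign = [es for es in null_es if es < 0]
        hits = sum(1 for es in same_sign if es <= observed_es)
    if not same_sign:
        return None  # no same-signed null scores, so no estimate is possible
    return hits / len(same_sign)


def describe_p_value(p, n_permutations):
    """Render a reported p value, reading 0.0 as 'below 1 / n_permutations'."""
    if p == 0.0:
        return f"p < {1 / n_permutations:g}"
    return f"p = {p:g}"


if __name__ == "__main__":
    # With 100 permutations, a reported 0.0 only tells us p < 0.01.
    print(describe_p_value(0.0, 100))
    # With 1000 permutations, the same report tightens to p < 0.001.
    print(describe_p_value(0.0, 1000))
```

The point of the sketch is that no permutation matching or exceeding the observed score gives an estimate of zero, which can only be interpreted as "smaller than the resolution of the permutation test". More permutations make that bound tighter.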
I am in doubt about the interpretation of that sentence.
Hi Anthony,
Thanks a lot for your feedback.
Yes, there is an interesting trend for that gene set in the time series, and I have also found it among the gene sets positively correlated with the phenotype in the GSEA continuous analysis. Now I have a picture of those gene sets that are exclusive to a certain time point and those that show an involvement across the time series.