Dear GSEA Support Team,
In microarray data, multiple probes may target the same gene but sometimes show different expression patterns. I would like to ask how GSEA handles such cases, and what would be the recommended approach on my side.
For example, if one probe for a gene is significantly upregulated while another probe for the same gene is not significant, which probe should I consider for GSEA input? Should I select one based on certain criteria (e.g., highest variance, average expression, most significant p-value), or does GSEA have a preferred or automatic way of resolving such redundancy?
Thank you very much for your guidance.
Best regards,
Yahya Yozbatiran
--
You received this message because you are subscribed to the Google Groups "gsea-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gsea-help+...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/gsea-help/78fe5d3e-628d-4cd0-9132-89ed6c6ffeadn%40googlegroups.com.
Dear Anthony,
Thank you very much for your clear explanation.
As a follow-up, I would like to ask for your recommendation on the preprocessing step. Which strategy do you consider the most appropriate, or the most commonly used? For example, some approaches include selecting the probe with the highest variance across samples, averaging all probes for a gene, or selecting the most significant probe after differential expression testing.
In your experience, would you recommend relying on the default GSEA “max_probe” method, or do you find that preprocessing at the gene level (e.g., variance- or average-based) provides a more robust input for GSEA?
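For concreteness, the gene-level strategies named above can be sketched as follows. This is only an illustration in plain Python: the probe IDs, gene names, expression values, and function names are all hypothetical. The per-sample maximum is included as a rough analogue of max_probe-style collapsing, which is an assumption about its behavior and not a description of GSEA's own implementation. Selection by significance is left out because it depends on the differential expression test used.

```python
from collections import defaultdict
from statistics import mean, variance

# Hypothetical toy data: probe ID -> expression values across three samples.
expr = {
    "probe_a": [5.0, 6.0, 9.0],
    "probe_b": [7.0, 7.5, 7.2],
    "probe_c": [3.0, 3.5, 2.5],
}
# Hypothetical probe-to-gene annotation.
probe_to_gene = {"probe_a": "GENE1", "probe_b": "GENE1", "probe_c": "GENE2"}


def group_probes(expr, probe_to_gene):
    """Return gene -> list of per-probe expression rows."""
    groups = defaultdict(list)
    for probe, values in expr.items():
        groups[probe_to_gene[probe]].append(values)
    return dict(groups)


def collapse_mean(expr, probe_to_gene):
    """Average all probes of a gene, sample by sample."""
    groups = group_probes(expr, probe_to_gene)
    return {g: [mean(col) for col in zip(*rows)] for g, rows in groups.items()}


def collapse_max(expr, probe_to_gene):
    """Take the largest probe value of a gene in each sample (assumed analogue of max_probe)."""
    groups = group_probes(expr, probe_to_gene)
    return {g: [max(col) for col in zip(*rows)] for g, rows in groups.items()}


def collapse_highest_variance(expr, probe_to_gene):
    """Keep the single probe whose values vary most across samples."""
    groups = group_probes(expr, probe_to_gene)
    return {g: max(rows, key=variance) for g, rows in groups.items()}
```

Note that the first two functions build a new per-gene profile from several probes, while the third keeps one probe's measured values unchanged.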
Thank you again for your guidance.
Best regards,
Yahya Yozbatiran
Dear Dr. Anthony Castanza,
Thank you very much for your explanation; it helped me a lot.
I would like to kindly ask a follow-up question:
When preparing input for GSEA from microarray datasets, is it acceptable to provide log2-transformed expression values, or should the input strictly consist of raw signal intensities summarized at the gene level?
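To make the two scales in this question concrete, the following minimal sketch converts between raw intensities and log2 values. The function names and the pseudocount of 1 are assumptions chosen for illustration, not a GSEA requirement or recommendation.

```python
import math


def log2_transform(intensities, pseudocount=1.0):
    """Convert raw signal intensities to log2 scale (pseudocount is an assumed choice)."""
    return [math.log2(x + pseudocount) for x in intensities]


def undo_log2(values, pseudocount=1.0):
    """Convert log2-scale values back to the raw intensity scale."""
    return [2.0 ** v - pseudocount for v in values]


# Hypothetical raw intensities for one gene across three samples.
raw = [1.0, 3.0, 7.0]
logged = log2_transform(raw)
restored = undo_log2(logged)
```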
Thank you again for your valuable guidance. I wish you good health.
Best regards,
Yahya Yozbatiran