Request for clarification on DIA-Analyst vs GSEA results

Fatou Diallo

Aug 12, 2025, 11:30:51 AM
to gsea-help

Hello,

I am reaching out in the context of my master’s thesis, which focuses on the proteomic analysis of head and neck cancers. The aim of my project is to investigate why HPV- cell lines produce more saBgal than HPV+ cell lines, taking into account the two HPV statuses specific to this cancer type.

I first analysed raw mass spectrometry data (DIA acquisition method) using DIA-Analyst, which allowed me to identify several relevant candidates. To better interpret these results, I then incorporated GSEA to highlight the associated signaling pathways.

However, I noticed an inconsistency: the fold change of a candidate in DIA-Analyst does not always match its enrichment score in GSEA. For example, focusing on trafficking in the HPV+ Rx vs HPV- Rx comparison (Rx = irradiation), some candidates with a high fold change and a relevant function (according to UniProt) do not show a high enrichment score in trafficking-related pathways in GSEA, contrary to my assumption.

I suspect there may be a methodological or conceptual aspect I am missing, and I would greatly value your insight on this matter.
Would you be willing to help me clarify this?

Kind regards,
Fatou

Anthony Castanza

Aug 12, 2025, 12:26:02 PM
to gsea-help
Hi Fatou,

I'm not sure I can answer your question here without a lot more information about your experiment. How many genes are present in your data and what namespace are their gene identifiers in? Are you using GSEA Preranked with some metric produced by the DIA-Analyst software? GSEA generally expects ranking, or expression level, information for all expressed genes in the sample, including both significant and non-significant components. We don't have hard recommendations for proteomic data, as this isn't the kind of data that is typically used for GSEA; however, GSEA can operate on any arbitrary ranked list, assuming it meets certain basic structural assumptions.

I don't quite understand what you mean by "the fold change of a candidate in DIA-Analyst does not always match its enrichment score in GSEA". The fold change of a candidate, assuming this was your input to GSEA Preranked, is a single-gene metric that GSEA would use as the weighting factor for that specific gene. The enrichment score, by contrast, is a gene-set-level score derived from a running-sum calculation over all genes in the ranked list: genes in the set are hits that increment the score, and misses (genes not in the set) decrement it.
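
To make the distinction concrete, here is a minimal Python sketch of that running-sum idea. It is an illustration only, not the GSEA implementation: the function name, the gene names, and the toy fold changes are all made up, and it assumes the ranked list is already sorted from highest to lowest metric, with hits weighted by the absolute value of the metric raised to an exponent `p`. The real tool additionally runs permutations to produce normalized scores and significance estimates.

```python
def enrichment_score(ranked, gene_set, p=1.0):
    """Running-sum enrichment score for one gene set (illustrative sketch).

    ranked   -- list of (gene, metric) pairs, sorted by metric, descending
    gene_set -- set of gene identifiers in the same namespace as `ranked`
    p        -- weighting exponent applied to |metric| for hits (assumed)
    """
    hits = [gene in gene_set for gene, _ in ranked]
    n_miss = len(ranked) - sum(hits)
    # Total hit weight, used so that all hit increments sum to 1.
    norm = sum(abs(m) ** p for (_, m), h in zip(ranked, hits) if h)
    if norm == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the list without covering it")

    running = 0.0
    es = 0.0
    for (_, metric), is_hit in zip(ranked, hits):
        if is_hit:
            running += abs(metric) ** p / norm   # hit: increment
        else:
            running -= 1.0 / n_miss              # miss: decrement
        if abs(running) > abs(es):               # keep the largest deviation from 0
            es = running
    return es


# Hypothetical ranked list (gene, fold change); values are invented.
ranked = [("A", 3.0), ("B", 2.5), ("C", 2.0), ("D", 1.0),
          ("E", 0.5), ("F", -0.5), ("G", -1.0), ("H", -2.0)]

concordant = {"A", "B", "C"}   # all members sit at the top of the list
scattered = {"A", "F", "G"}    # one top gene, the rest near the bottom

print(enrichment_score(ranked, concordant))
print(enrichment_score(ranked, scattered))
```

In this toy example gene "A" has the largest fold change in both sets, yet the set whose other members are scattered toward the bottom of the list scores lower than the set whose members all cluster at the top. A single high-fold-change candidate therefore does not by itself guarantee a high enrichment score for the sets that contain it.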

Have you reviewed our user guide? https://docs.gsea-msigdb.org/#GSEA/GSEA_User_Guide/

-Anthony

Anthony S. Castanza, PhD
Curator, Molecular Signatures Database
Mesirov Lab, Department of Medicine
University of California, San Diego