GSEA results in new vs. old version of WebGestalt


Maria Christou

Dec 4, 2024, 10:27:04 AM
to webgestalt
Hello,
I have a set of ranked genes from zebrafish, and I performed GSEA in both the 2019 version and the new version of WebGestalt. In the 2019 version I get many enriched pathways with FDR < 0.05, whereas in the new version I get no significant pathways at all, even though I used the same advanced parameters.
Could you explain to me why this is happening?

Thank you,
Maria C

John Elizarraras

Dec 4, 2024, 3:27:07 PM
to webgestalt
Hello,

The underlying databases were updated, so you should not expect exactly the same results. However, the difference you see is more drastic than I would expect. Which pathway database are you using for enrichment? I can check how much that database changed to see if that explains the lack of enrichment.

Best,
John

Maria Christou

Dec 5, 2024, 3:46:45 AM
to webgestalt
Hi,
I have tried both GO BP and KEGG pathways and I get the same outcome, i.e., some significant pathways in the 2019 version and nothing significant in the new version.
I have attached part of the results as an example.
241205 KEGG results.pdf

Maria Christou

Dec 12, 2024, 4:15:40 AM
to webgestalt
Hello,
Did you have a chance to look at the GSEA results?

Kind regards,
Maria C

Jorke Kamstra

Jan 2, 2025, 10:22:42 AM
to webgestalt
Dear John,

We observed a similar difference between the 2019 and 2024 versions of WebGestalt with a different dataset (Bos taurus). I can imagine that databases change, but not that drastically. We are unsure which output is the right one, 2019 or 2024. Or is there a parameter in the 2024 version that we overlooked?

Best,
Jorke



On Thursday, December 12, 2024 at 10:15:40 UTC+1, Maria Christou wrote:

John Elizarraras

Jan 14, 2025, 2:43:08 PM
to webgestalt
Hello,

Sorry for the long wait for a response.

I have looked into the zebrafish KEGG results. On average, the updated pathways are 15% larger. Additionally, there are 15 new pathways, and 1648 genes/proteins that are not in the 2019 version. This seems like a large difference (almost a quarter of genes are new). I attached a text file that shows the breakdown of the differences for each pathway.
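For anyone who wants to run a similar check on their own organism, here is a minimal sketch of a GMT-to-GMT comparison. It assumes the standard tab-separated GMT layout (set name, description, then gene identifiers). The function names `parse_gmt` and `compare_gmt` are hypothetical, and "mean size change" is only one possible definition of "larger on average"; it is not necessarily how the attached breakdown was produced.

```python
def parse_gmt(text):
    """Parse GMT text into {pathway_name: set_of_genes}.

    Assumes each line is: name <TAB> description <TAB> gene1 <TAB> gene2 ...
    """
    sets = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        sets[fields[0]] = {g for g in fields[2:] if g}
    return sets


def compare_gmt(new, old):
    """Summarize differences between two parsed GMT collections."""
    shared = new.keys() & old.keys()
    # Relative size change per shared pathway (one possible definition).
    growth = [(len(new[p]) - len(old[p])) / len(old[p]) for p in shared if old[p]]
    all_new = set().union(*new.values())
    all_old = set().union(*old.values())
    return {
        "added_pathways": sorted(new.keys() - old.keys()),
        "removed_pathways": sorted(old.keys() - new.keys()),
        "mean_size_change": sum(growth) / len(growth) if growth else 0.0,
        "new_gene_count": len(all_new - all_old),
    }


# Toy data for illustration only (not real KEGG content).
old_text = "A\tdesc\tg1\tg2\nB\tdesc\tg3\tg4"
new_text = "A\tdesc\tg1\tg2\tg5\nB\tdesc\tg3\tg4\nC\tdesc\tg6"
summary = compare_gmt(parse_gmt(new_text), parse_gmt(old_text))
print(summary)
```

With real files, the text would come from reading the two GMT downloads; the toy strings above only show the expected shape of the input.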

I believe that the addition of roughly 1700 new genes is the biggest reason for the drop in significant pathways. Having more genes mapped to a pathway would make the random permutations more stable and would likely lead to lower normalized enrichment scores (NES). Other factors could also be involved, such as the size of your list and how many genes were successfully mapped.
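To make the role of permutations concrete, here is a simplified sketch of how a pre-ranked GSEA enrichment score and its normalized form can be computed. It follows the general published formulation (weighted running sum, normalization by the mean of same-sign null scores from random gene sets of the same size). It is not WebGestalt's actual implementation, and the function names and default values are assumptions for illustration.

```python
import random


def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Weighted running-sum enrichment score for a pre-ranked gene list."""
    hits = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_weight = sum(abs(s) ** p for s, h in zip(scores, hits) if h)
    if hit_weight == 0:
        return 0.0  # no set members in the list (or all-zero scores)
    running, best = 0.0, 0.0
    for s, h in zip(scores, hits):
        if h:
            running += abs(s) ** p / hit_weight  # step up at a set member
        else:
            running -= 1.0 / n_miss              # step down otherwise
        if abs(running) > abs(best):
            best = running                       # keep the largest deviation
    return best


def normalized_es(ranked_genes, scores, gene_set, n_perm=200, seed=0):
    """ES divided by the mean same-sign ES of random sets of equal size."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, scores, gene_set)
    size = sum(1 for g in ranked_genes if g in gene_set)
    null = []
    for _ in range(n_perm):
        random_set = set(rng.sample(ranked_genes, size))
        null.append(enrichment_score(ranked_genes, scores, random_set))
    same_sign = [abs(e) for e in null if (e >= 0) == (observed >= 0)]
    if not same_sign:
        return float("nan")
    return observed / (sum(same_sign) / len(same_sign))


# Toy ranking for illustration only: 20 genes with scores 20 down to 1.
genes = [f"g{i}" for i in range(20)]
values = [20 - i for i in range(20)]
top_set = {"g0", "g1", "g2"}
print(enrichment_score(genes, values, top_set), normalized_es(genes, values, top_set))
```

Because the NES depends on the null distribution as well as the observed score, a change in pathway membership can shift the result even when the ranked list itself is unchanged.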

To answer Jorke's question, I would use the 2024 version, as the pathways are more up to date. The database authors have added new pathways and removed others that are no longer relevant.

There aren't any major changes in the default parameters. The biggest change we made is that, instead of filtering at FDR < 0.05, we now default to showing the top 10 enriched terms. You can change this in the Advanced Parameters menu below where you input your analyte list.
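The difference between the two display modes can be sketched as follows. This is an illustration only: the function name `select_terms`, the result fields, and the choice to rank the "top" mode by FDR are assumptions, not WebGestalt's code.

```python
def select_terms(results, method="top", fdr_threshold=0.05, top_n=10):
    """Pick which enrichment results to report.

    method="fdr": keep only terms below the FDR threshold (may return nothing).
    method="top": keep the top_n best-ranked terms regardless of significance.
    """
    if method == "fdr":
        return [r for r in results if r["fdr"] < fdr_threshold]
    if method == "top":
        # Assumption: rank by FDR, smallest first.
        return sorted(results, key=lambda r: r["fdr"])[:top_n]
    raise ValueError(f"unknown method: {method!r}")


# Hypothetical results for illustration only.
example = [
    {"term": "pathway_a", "fdr": 0.20},
    {"term": "pathway_b", "fdr": 0.01},
    {"term": "pathway_c", "fdr": 0.60},
]
print([r["term"] for r in select_terms(example, method="fdr")])
print([r["term"] for r in select_terms(example, method="top", top_n=2)])
```

Note that in the "top" mode a term can be listed without passing any significance threshold, so the FDR column still needs to be checked when comparing against results filtered at FDR < 0.05.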

Let me know if you have any questions.

Best,
John
zebrafish_comparison.txt

John Elizarraras

Jan 14, 2025, 2:44:01 PM
to webgestalt
I forgot to mention: in the TXT file, gmt1 is the 2024 version of KEGG and gmt2 is the 2019 version.