Calculation of Metapvalue in MultiOmics analysis

Charline Jnnt

Aug 2, 2024, 2:09:32 AM

to webgestalt

Hello,

I am still working on a MultiOmics analysis, and I have a question about the MetaPvalue.

I understand how it is calculated for a pathway found in both analyte lists (integration of the two p-values using Stouffer's Z-score method), but I do not understand how it is calculated for a pathway found in one list but not in the other.

For example, the Lysine degradation pathway is only enriched in List 1, with a p-value of 0.11 and a size of 63. In the combined results, this gene set has a meta-p value of 1 and a size of 88. As I said, this pathway is not enriched in List 2, so how are the meta-p value and the size calculated?

I tried increasing the number of displayed pathways (Sigmethod = "top", topThr=40), but it made no difference: I have only 9 enriched pathways in List 2, and Lysine degradation is not among them.

Can you help me?

Best

John Elizarraras

Aug 2, 2024, 11:44:50 AM

to webgestalt

Hello,

This is a quirk of the way the pathways are filtered. WebGestaltR lets you filter out sets using the minNum and maxNum parameters, which remove sets that contain too few or too many analytes. More importantly, analyte sets with zero overlap with your analyte list are not shown in the report for a single-list analysis.
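The filtering described above can be sketched roughly as follows. This is only an illustration, not WebGestaltR's code: `visible_sets`, `min_num`, and `max_num` are hypothetical names that mirror minNum and maxNum, the thresholds in the example are arbitrary, and the sketch assumes the size limits apply to the number of analytes annotated to the set.

```python
def visible_sets(analyte_sets, analyte_list, min_num, max_num):
    """Return the sets that would appear in a single-list report.

    Hypothetical helper: sets are dropped when their size is outside
    [min_num, max_num] or when they share no analyte with the input list.
    """
    analytes = set(analyte_list)
    visible = {}
    for name, members in analyte_sets.items():
        size = len(set(members))
        if size < min_num or size > max_num:
            continue  # too small or too large: filtered out entirely
        overlap = analytes & set(members)
        if not overlap:
            continue  # zero overlap: hidden from the single-list report
        visible[name] = {"size": size, "overlap": len(overlap)}
    return visible


# Toy data (made up): the first set has no overlap with the list, so it is hidden.
toy_sets = {
    "Lysine degradation": ["g1", "g2", "m1"],
    "Tiny set": ["g1"],
    "Other pathway": ["m1", "m2"],
}
report = visible_sets(toy_sets, ["m2"], min_num=2, max_num=500)
```

In this toy run only "Other pathway" is reported, even though "Lysine degradation" passes the size filter.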

My guess is that the Lysine Degradation Pathway has zero overlap with List 2, so it is not shown in the report for List 2. However, we still use List 2's p-value when calculating the meta-p. That p-value is 1 because there is zero overlap, and when you apply Stouffer's method to List 1's p-value of 0.11 and List 2's p-value of 1, you get a meta-p value of 1 (a p-value of 1 corresponds to a z-score of negative infinity, which dominates the combined score).

Another similar case is where List 1 is a different analyte type from List 2, and the pathway database you enrich against has a pathway for the genes but that pathway is not annotated for metabolites. Metabolites typically have fewer pathways annotated, so this is fairly common. In this scenario only the p-value from List 1 would be used for the meta-p since there is no information about List 2. As you saw, the size of the Lysine Degradation Pathway is larger in the meta-analysis page than in the individual page, so this indicates this pathway has metabolites and genes/proteins annotated.
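The two cases above can be checked with a small sketch of Stouffer's method. It makes several assumptions: the unweighted form of the method, a one-sided conversion z = Φ⁻¹(1 − p), and `None` as a stand-in for "pathway not annotated for this analyte type". The function name `stouffer_meta_p` is hypothetical, and WebGestaltR's actual implementation may differ (for example in weighting).

```python
from math import sqrt
from statistics import NormalDist


def stouffer_meta_p(p_values):
    """Combine p-values with unweighted Stouffer's method (a sketch).

    None entries (assumed to mean "no annotation for this list") are skipped.
    """
    usable = [p for p in p_values if p is not None]
    if not usable:
        return None  # nothing to combine
    # p = 1 maps to z = -infinity, which forces the combined p-value to 1.
    if any(p >= 1.0 for p in usable):
        return 1.0
    # p = 0 maps to z = +infinity (only reached when no p-value equals 1).
    if any(p <= 0.0 for p in usable):
        return 0.0
    norm = NormalDist()
    z_scores = [norm.inv_cdf(1.0 - p) for p in usable]
    combined_z = sum(z_scores) / sqrt(len(z_scores))
    return 1.0 - norm.cdf(combined_z)


# Zero overlap in List 2 (p = 1): the meta-p is 1.
zero_overlap = stouffer_meta_p([0.11, 1.0])
# Pathway not annotated for List 2: only List 1 contributes, so the result is about 0.11.
not_annotated = stouffer_meta_p([0.11, None])
```

Under these assumptions, the first call reproduces the meta-p of 1 from the question, while the second stays at List 1's own p-value.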

As you note, this process can be confusing and sometimes opaque. We hide the pathway in List 2’s individual results because there is no overlap, but it is still important to show it on the meta-analysis page, since List 1 has some overlap. We include List 2’s p-value in the meta-p because it shows that, although List 1 has some enrichment for this pathway, that enrichment is not supported by all of the lists/omic types.

I think we can improve the way the multi-list results page displays the results. For example, even if we do not show the Lysine Degradation pathway in List 2’s individual results, it would be helpful to have List 2’s p-value on the meta-analysis page so you can see why the meta-p value is high or low. We are still working on improving the visualizations of the multi-list results page to address this. One example is a heat map next to the bar chart that shows the logP value of the pathway in each list, so you can see which lists contain the pathway and how significant it is in each. A quick demo can be seen here: https://codepen.io/iblacksand/full/mdgLvJE.

A small note, which may just be a typo caused by auto-correct: Sigmethod = "top" should be sigMethod = "top".

Let me know if you have any questions, or suggestions about how to improve how we can improve the displays of results.

Best,
John

John Elizarraras

Aug 2, 2024, 11:46:07 AM

to webgestalt

The last line should read:

Let me know if you have any questions, or suggestions about how to improve the displays of results.

Charline Jnnt

Aug 5, 2024, 4:33:32 AM

to webgestalt

Hi,

In this case, the lack of overlap between my metabolite list and the Lysine Degradation Pathway would explain the meta-p value, since it is equal to 1 (and not to List 1's p-value of 0.11).

To verify this hypothesis, I searched for metabolites in the Lysine Degradation Pathway in the KEGG database (KEGG PATHWAY: ko00310 (genome.jp)), and indeed I did not find my metabolites in this pathway. However, the number of genes/compounds I find in the pathway does not match the size indicated in the enrichment result, and I don't understand why.

If you want to improve the visualization of the Multi-Omics analysis, a heatmap showing the logP value of both lists would be a good idea. You could also add information on the size and overlap of these pathways, which would be of interest to me personally. In the second scenario, when the pathway is not annotated in the database for a specific analyte type, you could put "NA" in the size and p-value columns to make this clearer to users.