Interpretation of Monte Carlo value from transform_coordinate_matrices.py


Paul

Aug 6, 2013, 4:58:04 PM
to qiime...@googlegroups.com
I just completed a transform_coordinate_matrices.py analysis that included a Monte Carlo simulation. I understand how the analysis is run, but I am unclear on the output format. Below is a copy of the data. Which value is the p-value, and what do the other numbers mean?

FP1 FP2 Included_dimensions MC_p_value Count_better M^2
bray_curtis_16SDD_first24_pc.txt bray_curtis_MGRAST_samplenumberedit_pc.txt 3 0.000 0 0.385


Thanks, Paul

Daniel McDonald

Aug 7, 2013, 12:26:55 PM
to qiime...@googlegroups.com
Hey Paul,

Have you tried opening the file in Excel or Google Spreadsheets? The columns should be clearer there. Let me know if you need more specifics on the numbers, though.
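
If a spreadsheet is not handy, a minimal Python sketch can do the same thing by pairing each header with the value in its column. It assumes the columns are separated by whitespace (the file itself may use tabs) and that the file names contain no spaces; `parse_results` is a made-up helper name, not part of QIIME.

```python
# Output copied from the post above; assumed to be whitespace-separated.
raw = """FP1 FP2 Included_dimensions MC_p_value Count_better M^2
bray_curtis_16SDD_first24_pc.txt bray_curtis_MGRAST_samplenumberedit_pc.txt 3 0.000 0 0.385"""


def parse_results(text):
    """Return one dict per data row, keyed by the header names."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


rows = parse_results(raw)
for row in rows:
    for name, value in row.items():
        # Print each column name next to its value
        print(f"{name:>20}: {value}")
```

Read this way, the fourth column (MC_p_value) is the p-value, and the last column is the M^2 value.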

Best,
Daniel
--
 
---
You received this message because you are subscribed to the Google Groups "Qiime Forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to qiime-forum...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

David Case

Nov 18, 2014, 1:13:39 PM
to qiime...@googlegroups.com
Hello,

I would like to follow up on this topic from 2013. I am clear on the identity of values in the output of transform_coordinate_matrices.py, but I don't feel I have a good, intuitive sense of what the p-value represents. The QIIME index says the p-value "estimate[s] the probability of seeing an M^2 value as extreme as the actual M^2." I've downloaded the original Gower 1975 Procrustes paper, but still don't quite get it. Can you explain in a few sentences what the p-value is representing in this QIIME analysis?

Thanks,
David

Luke Ursell

Nov 18, 2014, 1:17:17 PM
to qiime...@googlegroups.com
Hey David,

Any p-value essentially represents how often you'd get a result at least that good by random chance. Said another way, imagine you scattered the sample labels across the points in PCoA space at random: how many times out of 100 would you get a better M^2 value? If you got a better M^2 value only 5 times out of 100, then your p-value of 0.05 is basically saying that a fit this good is unlikely to come from random chance.
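
For anyone who prefers code, here is a minimal sketch of that label-shuffling idea. It is not QIIME's own implementation: it uses `scipy.spatial.procrustes`, whose disparity is an M^2 where smaller means a closer fit, and it assumes the p-value is simply the fraction of shuffles that beat the real M^2. The function name `monte_carlo_procrustes` and the toy data are made up for illustration.

```python
import numpy as np
from scipy.spatial import procrustes


def monte_carlo_procrustes(coords1, coords2, trials=99, seed=0):
    """Shuffle the sample order of coords2 and count how often M^2 improves.

    Assumption: a shuffle is "better" when its M^2 is strictly smaller
    than the observed M^2.
    """
    rng = np.random.default_rng(seed)
    _, _, observed = procrustes(coords1, coords2)
    count_better = 0
    for _ in range(trials):
        # Break the sample-to-sample pairing by permuting the rows
        shuffled = coords2[rng.permutation(len(coords2))]
        _, _, m2 = procrustes(coords1, shuffled)
        if m2 < observed:
            count_better += 1
    return observed, count_better, count_better / trials


# Toy data: 20 samples in 3 dimensions, and a rotated, rescaled, noisy copy
rng = np.random.default_rng(1)
coords1 = rng.normal(size=(20, 3))
rotation, _ = np.linalg.qr(rng.normal(size=(3, 3)))
coords2 = 2.0 * coords1 @ rotation + rng.normal(scale=0.05, size=(20, 3))

observed, count_better, p_value = monte_carlo_procrustes(coords1, coords2)
print(f"M^2 = {observed:.4f}, Count_better = {count_better}, p = {p_value:.3f}")
```

Because the second matrix really is a copy of the first, random shuffles should essentially never fit better, so the count of better shuffles and the p-value come out at or near zero.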

Hope this helps,
Luke



David Case

Nov 18, 2014, 1:24:49 PM
to qiime...@googlegroups.com
Great, thanks for the clarifying response. And forgive my lack of statistics background, but what is the M^2 value, exactly? I have a sense it is a diagnostic value that is a function of the locations of the datapoints in coordinate space.

Thanks again,
David

Luke Ursell

Nov 18, 2014, 1:36:15 PM
to qiime...@googlegroups.com
The M^2 value is a measure of closeness of fit between the two sets of coordinate points you’re trying to overlap. A better fit = higher M^2 value.

Put another way, you can have a significant difference where the difference is small (say two groups with means of 1.0 and 1.1, p < 0.05),
or you can have a significant difference that is more obvious (mean 1.0 vs. mean 100.0, p < 0.05).

So usually we look for a significant p-value and M^2 > 0.3. This is only a guideline, and not likely publishable as gospel.

Luke

David Case

Nov 18, 2014, 2:47:31 PM
to qiime...@googlegroups.com
Great, thanks!

David

Luke Ursell

Nov 18, 2014, 3:06:31 PM
to qiime...@googlegroups.com
David,

I've been told that lower M^2 values indicate a better fit, so I was wrong. I'm about to get on a plane but will follow up later.

Luke. 


David Case

Nov 19, 2014, 9:00:08 PM
to qiime...@googlegroups.com
Hi Luke,

Ah, I would love clarification on the M^2 values, and especially on whether high values are better or worse -- thanks. I was going to ask, is there a range of possible M^2 values, perhaps 0 to 1, as for an R^2 value in a linear regression?

Thanks,

Luke Ursell

Nov 19, 2014, 10:04:47 PM
to qiime...@googlegroups.com
Yes, lower values are better, and the range is from 0 to 1.
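
A quick way to see both ends of that range is the sketch below. It uses `scipy.spatial.procrustes` rather than QIIME itself, on the assumption that its disparity is the same kind of M^2 (both point sets are centered and scaled to unit size before fitting); the data are random toy values.

```python
import numpy as np
from scipy.spatial import procrustes

rng = np.random.default_rng(42)
a = rng.normal(size=(30, 3))

# Same configuration: rotated 90 degrees about the z-axis, doubled, shifted
rot = np.array([[0.0, -1.0, 0.0],
                [1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0]])
same_shape = 2.0 * a @ rot + 5.0
_, _, m2_same = procrustes(a, same_shape)

# Unrelated configuration: independent random points
unrelated = rng.normal(size=(30, 3))
_, _, m2_unrelated = procrustes(a, unrelated)

print(f"same shape: M^2 = {m2_same:.6f}")      # essentially 0: perfect fit
print(f"unrelated:  M^2 = {m2_unrelated:.3f}")  # much closer to 1: poor fit
```

Rotation, scaling and shifting do not change M^2, so the first comparison comes out at essentially zero, while unrelated point sets land much nearer to 1.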

Luke

David Case

Nov 20, 2014, 12:20:14 PM
to qiime...@googlegroups.com
Thanks for the clarification. I have an analysis for which M^2 is 0.6 but the p-value is very low (actually 0.002). To make sure I'm understanding everything, does this indicate the two datasets which were compared are not a close fit, and that the result is significant?

Thanks, David

Luke Ursell

Nov 20, 2014, 12:21:30 PM
to qiime...@googlegroups.com
That is correct. Depending on the number of points it can be fairly easy to achieve a significant p-value, hence the importance of looking at the M2 value.

AS

Apr 4, 2021, 3:56:45 AM
to Qiime 1 Forum
I got an M^2 value of 0.8 based on the weighted and unweighted UniFrac coordinates, while the Monte Carlo p-value is 0.295. I also received a warning: "p-values in this file are NOT currently adjusted for multiple comparisons." How can I interpret this result? Is it right to just say that the abundance-weighted and the diversity-based unweighted approaches indicate different separation (clustering) across the samples, and that both output PCoA plots show the difference?