# Interpretation of Monte Carlo value from transform_coordinate_matrices.py


### Paul

Aug 6, 2013, 4:58:04 PM
I just completed a transform_coordinate_matrices.py analysis that included a Monte Carlo simulation. I understand how the analysis is run, but I am unclear on the output format. Below is a copy of the data. Which value is the p-value, and what do the other numbers mean?

| FP1 | FP2 | Included_dimensions | MC_p_value | Count_better | M^2 |
| --- | --- | --- | --- | --- | --- |
| bray_curtis_16SDD_first24_pc.txt | bray_curtis_MGRAST_samplenumberedit_pc.txt | 3 | 0.000 | 0 | 0.385 |

Thanks, Paul
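For readers with the same question, here is a minimal sketch that pairs each column name in the output above with its value. Splitting on whitespace is an assumption that only works because neither file name contains a space; the variable names are illustrative and not part of QIIME.

```python
# Header and data row copied from the post above.
header = "FP1 FP2 Included_dimensions MC_p_value Count_better M^2"
row = ("bray_curtis_16SDD_first24_pc.txt "
       "bray_curtis_MGRAST_samplenumberedit_pc.txt 3 0.000 0 0.385")

# Pair each column name with the value in the same position.
result = dict(zip(header.split(), row.split()))

for name, value in result.items():
    print(f"{name:>20}: {value}")

# Convert the numeric columns so they can be compared or filtered.
p_value = float(result["MC_p_value"])
count_better = int(result["Count_better"])
m_squared = float(result["M^2"])
```

Read this way, the p-value is the `MC_p_value` column (0.000 here), `Count_better` is 0, and `M^2` is 0.385.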

### Daniel McDonald

Aug 7, 2013, 12:26:55 PM
Hey Paul,

Have you tried opening the file in Excel or Google Spreadsheets? It should be clearer there. Let me know if you need more specifics on the numbers, though.

Best,
Daniel

### David Case

Nov 18, 2014, 1:13:39 PM
Hello,

I would like to follow up on this topic from 2013. I am clear on the identity of values in the output of transform_coordinate_matrices.py, but I don't feel I have a good, intuitive sense of what the p-value represents. The QIIME index says the p-value "estimate[s] the probability of seeing an M^2 value as extreme as the actual M^2." I've downloaded the original Gower 1975 Procrustes paper, but still don't quite get it. Can you explain in a few sentences what the p-value is representing in this QIIME analysis?

Thanks,
David

### Luke Ursell

Nov 18, 2014, 1:17:17 PM
Hey David,

Any p-value essentially represents how often you’d get that result by random chance. Said another way, imagine you randomly scattered the labels across the points in PCoA space: how many times out of 100 would you get a better M^2 value? If you only got a better M^2 value 5 out of 100 times, then your p-value of 0.05 is basically saying that a fit as good as yours is unlikely to arise by random chance.

Hope this helps,
Luke
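The label-shuffling idea described above can be sketched in a few lines. This is an illustration under stated assumptions, not QIIME's implementation: the function names, the toy data, and the placeholder misfit score (`paired_sq_dist`, standing in for M^2) are all hypothetical, and the p-value is assumed to be the count of better shuffles divided by the number of trials, which is consistent with the `Count_better` of 0 and `MC_p_value` of 0.000 in the original post.

```python
import random


def monte_carlo_p(coords_a, coords_b, fit_stat, trials=1000, seed=0):
    """Estimate how often shuffled labels fit better than the real pairing.

    fit_stat(a, b) must return a misfit score where lower means a better fit.
    """
    rng = random.Random(seed)
    actual = fit_stat(coords_a, coords_b)
    count_better = 0
    for _ in range(trials):
        shuffled = coords_b[:]   # copy, then scramble which sample is which
        rng.shuffle(shuffled)
        if fit_stat(coords_a, shuffled) < actual:
            count_better += 1
    return actual, count_better, count_better / trials


def paired_sq_dist(a, b):
    """Placeholder misfit: sum of squared distances between paired points."""
    return sum((x1 - x2) ** 2 + (y1 - y2) ** 2
               for (x1, y1), (x2, y2) in zip(a, b))


# Hypothetical data: set B is set A plus a little noise, so the true pairing
# should fit far better than almost any shuffled pairing.
data_rng = random.Random(1)
set_a = [(data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)) for _ in range(12)]
set_b = [(x + data_rng.gauss(0, 0.05), y + data_rng.gauss(0, 0.05))
         for x, y in set_a]

actual, count_better, p = monte_carlo_p(set_a, set_b, paired_sq_dist)
print(f"misfit={actual:.3f} count_better={count_better} p={p:.3f}")
```

With 5 better shuffles out of 100 trials, the same arithmetic gives 5 / 100 = 0.05, matching the example in the post.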

### David Case

Nov 18, 2014, 1:24:49 PM
Great, thanks for the clarifying response. And forgive my lack of statistics background, but what is the M^2 value, exactly? I have a sense it is a diagnostic value that is a function of the locations of the datapoints in coordinate space.

Thanks again,
David

### Luke Ursell

Nov 18, 2014, 1:36:15 PM
The m2 value is a closeness of fit between the two sets of coordinate points you’re trying to overlap. A better fit = higher m2 value.

Put another way, you can have a significant difference that is small (say, two groups with means of 1.0 and 1.1, p < 0.05), or you can have a significant difference that is more obvious (mean 1.0 vs. mean 100.0, p < 0.05).

So usually we look for a sig p-value, and M2 > 0.3. Only a guideline, and not likely publishable as gospel.

Luke

### David Case

Nov 18, 2014, 2:47:31 PM
Great, thanks!

David

### Luke Ursell

Nov 18, 2014, 3:06:31 PM
David,

I've been told that I was wrong: lower m2 values indicate a better fit. I'm about to get on a plane but will follow up later.

Luke.

### David Case

Nov 19, 2014, 9:00:08 PM
Hi Luke,

Ah, I would love clarification on the M^2 values, and especially on whether high values are better or worse -- thanks. I was going to ask, is there a range of possible M^2 values, perhaps 0 to 1, as for an R^2 value in a linear regression?

Thanks,

### Luke Ursell

Nov 19, 2014, 10:04:47 PM
Yes, lower values are better, and the range is from 0 to 1.

Luke
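To see why lower is better and why the value stays between 0 and 1, here is a minimal sketch of a Procrustes M^2 for 2-D points only. It assumes M^2 is the sum of squared differences left over after both point sets are centered, scaled to unit size, and one is optimally rotated (or mirrored) and rescaled onto the other; whether QIIME computes its value in exactly this way is not confirmed in this thread. The function names and toy coordinates are hypothetical, and treating points as complex numbers is just a shortcut that avoids a linear algebra library.

```python
import math


def standardize(points):
    """Center 2-D points at the origin and scale to unit sum of squares."""
    z = [complex(x, y) for x, y in points]
    mean = sum(z) / len(z)
    z = [v - mean for v in z]
    norm = math.sqrt(sum(abs(v) ** 2 for v in z))
    return [v / norm for v in z]


def m_squared_2d(points_a, points_b):
    """Residual misfit after the best rotation/reflection and rescaling."""
    a = standardize(points_a)
    b = standardize(points_b)
    # Agreement achievable by rotating a onto b.
    rotation = abs(sum(u.conjugate() * v for u, v in zip(a, b)))
    # Agreement achievable by mirroring a first, then rotating.
    reflection = abs(sum(u * v for u, v in zip(a, b)))
    best = max(rotation, reflection)
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 1.0 - best ** 2)


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
# The same square rotated 90 degrees, doubled in size, and shifted.
moved = [(5, 5), (5, 7), (3, 7), (3, 5)]
# The same four corners, but with two labels swapped.
swapped = [(0, 0), (1, 1), (1, 0), (0, 1)]

print("same shape, moved:", m_squared_2d(square, moved))
print("labels swapped:   ", m_squared_2d(square, swapped))
```

The moved copy has the same shape, so nothing is left over after fitting and the value is essentially 0. Swapping labels breaks the correspondence between points, which pushes the value toward 1.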

### David Case

Nov 20, 2014, 12:20:14 PM
Thanks for the clarification. I have an analysis for which M^2 is 0.6 but the p-value is very low (actually 0.002). To make sure I'm understanding everything, does this indicate the two datasets which were compared are not a close fit, and that the result is significant?

Thanks, David

### Luke Ursell

Nov 20, 2014, 12:21:30 PM