sample.z scores in compare.pls output


Grant Haines

Apr 7, 2026, 8:10:26 AM
to geomorph R package
Hi all,
I have a multi-module landmark set with males and females in which I am testing integration within each sex using integration.test, then comparing the integration strengths between them with compare.pls. I am trying to understand why there is a difference between sample effect sizes in the integration.test and subsequent compare.pls outputs.

In the two-module pupfish example from the compare.pls documentation, the effect sizes for each group in the sample.z portion of the output are the same as the z-scores in each sample's integration.test output (e.g., integ.tests$Marsh.F$Z == group.Z$sample.z["Marsh.F"]). However, this isn't true when there are more than two modules. I understand why two-module and multi-module integration.tests calculate effect size differently, but how are the sample Z-scores being calculated in multi-module compare.pls tests?

I would like to be able to report the difference in sample effect sizes between males and females, but right now I can't figure out which pair of effect sizes is more appropriate to report.

thanks in advance,
Grant

Mike Collyer

Apr 7, 2026, 11:56:11 AM
to geomorph R package
Dear Grant,

This is unintentional, but the integration test finds the mean of the pairwise Z scores.  The compare.pls function finds the Z of the combined distributions.  So, for example, if there are three modules and thus 3 pairwise module comparisons, the former would find Z from 1,000 random values, three times, and then take the mean of the three Z scores.  The latter would find Z from 3,000 random values.  These converge for two modules because there is only one comparison.
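For anyone following along, the difference between the two procedures can be sketched numerically. This is purely schematic (Python for illustration, with made-up r_PLS values and tiny permutation distributions); geomorph's actual effect-size calculation involves additional standardization steps:

```python
import statistics

def z_score(observed, random_values):
    # Standardized effect size: distance of the observed statistic from
    # the centre of its permutation distribution, in standard deviations.
    mu = statistics.mean(random_values)
    sd = statistics.stdev(random_values)
    return (observed - mu) / sd

# Three pairwise module comparisons, each with an observed r_PLS and a
# (tiny, made-up) permutation distribution of random r_PLS values.
observed = [0.9, 0.8, 0.85]
nulls = [
    [0.1, 0.2, 0.3, 0.4, 0.5],
    [0.2, 0.3, 0.4, 0.5, 0.6],
    [0.3, 0.4, 0.5, 0.6, 0.7],
]

# Procedure 1 (as described for the integration test):
# a Z per comparison, then the mean of those Z scores.
mean_of_z = statistics.mean(z_score(o, n) for o, n in zip(observed, nulls))

# Procedure 2 (as described for compare.pls):
# pool all random values, then compute one Z for the mean observed statistic.
pooled = [v for n in nulls for v in n]
z_of_pooled = z_score(statistics.mean(observed), pooled)

print(round(mean_of_z, 3), round(z_of_pooled, 3))
```

Even with identical inputs, averaging per-comparison Z scores and computing a single Z against the pooled distribution generally give different answers, which is the discrepancy described above.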

I have a slight philosophical problem with the former.  I think when the function was originally made there really was no better way to do it, and I am not sure there is now, but it rather assumes that modules have approximately equal numbers of points.  The latter might be more appropriate because it might capture the range of random values one could expect when module sizes differ, which for my example produced more conservative Z scores.  The latter also tends more toward a pooled standard error, but is not exactly that.

We will need to discuss this and update it, but in the interim I would trust the compare.pls values more than an unweighted mean for Z scores that should be weighted by module size.  Both are just estimates, and I hope they don't vary wildly, but unfortunately they use different estimation procedures.  At least for now.

Hope that helps,
Mike

--
You received this message because you are subscribed to the Google Groups "geomorph R package" group.
To unsubscribe from this group and stop receiving emails from it, send an email to geomorph-r-pack...@googlegroups.com.
To view this discussion, visit https://groups.google.com/d/msgid/geomorph-r-package/edba732a-df36-4bc9-96ad-55e31df20bc9n%40googlegroups.com.

Grant Haines

Apr 7, 2026, 2:32:12 PM
to geomorph R package
Dear Mike, 
Thanks very much. This explanation fits with my results comparing my own data (with very unbalanced module sizes) to the pupfish data (in which I created a third module by splitting the tail module, resulting in more balanced module sizes). My more unbalanced data produced effect sizes from integration.test 2-3x larger than those from compare.pls, while the integration.test effect sizes were maybe 1.5-2x larger for the pupfish data.

thanks again,
Grant

Mike Collyer

Apr 7, 2026, 3:32:42 PM
to geomorph R package
Dear Grant,

Thanks for confirming this.  I also used the pupfish data and created a random split into three modules.  I found only slightly higher Z scores from integration.test, but they were higher across the board.  Your other results confirm my fears about simple averaging across modules.

Thanks again,
Mike

Mike Collyer

Apr 9, 2026, 7:24:04 PM
to geomorph R package
Dear Grant, and everyone,

I have a couple of updates to announce.  First, some clarifications.  I misspoke to you before.  I said that "the integration test finds the mean of the pairwise Z scores."  This was imprecise (well, just wrong).  The function finds the Z of the mean of the r_pls statistics.  I'm sorry if I caused confusion.  Additionally, I said, "The compare.pls function finds the Z of the combined distributions."  Although this was not incorrect, there was actually a bug, in which the first pairwise r_pls value was used instead of the mean of the pairwise r_pls values.  This was not intentional, but it would have contributed to Z-score differences between the methods.  I found this, actually, while updating both of these functions, as well as phylo.integration.

Now any integration-pls function that has 3+ modules uses a weighted average rather than a simple mean of the r_pls statistics.  For those interested in the weights, they are found as MC/2 * (p1 + p2) / sum(p), where MC is the number of pairwise module comparisons (the number of weights to use) and p refers to the numbers of landmarks per module.  This produces vectors of weights centered on 1, which are greater than 1 if module pairs have more landmarks and less than 1 if module pairs have fewer landmarks.  The weights won't be disparate unless the numbers of landmarks within modules are also disparate, and they will all be 1 if modules have the same number of landmarks.
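As a rough illustration of the updated procedure as described here (one Z computed from the weighted mean of the pairwise r_pls values), a sketch in Python with made-up numbers; this is not geomorph's actual code, and the r_PLS values and permutation rows are invented:

```python
import statistics
from itertools import combinations

def pair_weights(p):
    # Weights from the post above: MC/2 * (p1 + p2) / sum(p),
    # one weight per pairwise module comparison.
    pairs = list(combinations(range(len(p)), 2))
    mc = len(pairs)
    total = sum(p)
    return [mc / 2 * (p[i] + p[j]) / total for i, j in pairs]

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Three modules with equal landmark counts: every weight comes out to 1,
# so the weighted mean reduces to the simple mean.
p = [10, 10, 10]
w = pair_weights(p)

obs_r = [0.9, 0.8, 0.85]   # observed pairwise r_PLS (made up)
perm_r = [                 # one row per permutation, one column per pair
    [0.3, 0.4, 0.5],
    [0.2, 0.5, 0.4],
    [0.4, 0.3, 0.6],
    [0.5, 0.2, 0.3],
]

# One observed statistic (the weighted mean of pairwise r_PLS values),
# compared against the permutation distribution of the same statistic.
obs_stat = weighted_mean(obs_r, w)
perm_stats = [weighted_mean(row, w) for row in perm_r]
z = (obs_stat - statistics.mean(perm_stats)) / statistics.stdev(perm_stats)
```

A real test would of course use hundreds or thousands of permutation rows rather than four; the point is only that a single r statistic per permutation is formed before any Z is computed.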

Why do this?  Imagine four modules of 5, 5, 30, and 60 landmarks.  The pairwise Z scores for all comparisons are around 2-3, except between the first two modules, where Z = 10.  A comparison involving only 10% of the landmarks should not have equal weight in estimating the overall Z.  With the new weighting, that comparison's r_pls weight would be 6/2 * (5 + 5)/100 = 0.30 (since there are 6 pairwise comparisons in total), instead of 1.0.
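The weight arithmetic in this example can be checked directly (Python for illustration):

```python
from itertools import combinations

# The example above: four modules with 5, 5, 30, and 60 landmarks.
p = [5, 5, 30, 60]
pairs = list(combinations(p, 2))   # the 6 pairwise module comparisons
mc = len(pairs)                    # MC = 6
weights = [mc / 2 * (p1 + p2) / sum(p) for p1, p2 in pairs]

# weights[0] is the (5, 5) pair: 6/2 * 10/100 = 0.30, as in the example;
# pairs involving the larger modules get weights above 1.
print(weights)
```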

This adjustment will probably have little effect in most cases, but it should mitigate cases where variable module sizes produce variable effect sizes.  Also, compare.pls now produces results consistent with integration.test and phylo.integration!  These updates can be installed from the version on GitHub.

Best,
Mike

Grant H.

Apr 10, 2026, 4:06:51 PM
to geomorph-...@googlegroups.com
Hi Mike,
Thanks very much for working to fix this so quickly! I have tried to install the updated stable version from GitHub using devtools::install_github("geomorphR/geomorph", ref = "Stable"), but it appears to include the old versions of the updated functions, despite the updates being visible on GitHub and the commit number (314c9e0) being correct in the installed version. I'm not fluent enough with GitHub to know what is going on here; is there something different I should be doing?

best
Grant


Mike Collyer

Apr 10, 2026, 5:24:24 PM
to geomorph-...@googlegroups.com
Hi Grant,

You might consider adding the argument force = TRUE to your install_github() call, in case devtools is unwilling to overwrite the current version.  I have occasionally had trouble installing remote packages while a current version is in use, and have overcome it by restarting R.  These are guesses.  I just did a force-install from GitHub using the Stable branch and it appears to have worked for me.

Mike
