Dear Noé and everyone,
I was able to diagnose the problem (bug) in Noé’s analyses and have made a couple of important updates, which are currently available on GitHub (see below). The issue was not in geomorph itself but in the support code for the lm.rrpp function in the RRPP package, on which procD.lm relies.
The problem results from a programming change made a while back to rely on core R functions, where possible, rather than reinvent wheels. Missing data is one such context. When using the lm function with a data.frame, one can use the “na.action” argument to decide how to handle missing values. The default is to delete them. So, rather than having excessive code to deal with missing values, it is easier to ask R to try to create a linear model with the input variables. If it works, great! If not, stop and alert the user. This is how Noé got the error. The reason for the error, however, was that R dealt with the missing values in a way I had not anticipated (choosing to retain them).
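For anyone unfamiliar with na.action, here is a minimal sketch of how it changes what lm returns. The data are made up for illustration, and this is plain base R, not the RRPP internals:

```r
# Toy data with one missing response value (made up for illustration)
df <- data.frame(y = c(1.2, 2.3, NA, 4.1, 5.0),
                 x = c(1, 2, 3, 4, 5))

# na.omit (the usual default): the incomplete row is dropped before fitting
fit.omit <- lm(y ~ x, data = df, na.action = na.omit)
length(residuals(fit.omit))      # 4, one residual per complete row

# na.exclude: same fit, but residuals are padded with NA to the original length
fit.excl <- lm(y ~ x, data = df, na.action = na.exclude)
length(residuals(fit.excl))      # 5, with NA in position 3

# na.fail: refuse to fit and signal an error instead
fit.fail <- try(lm(y ~ x, data = df, na.action = na.fail), silent = TRUE)
inherits(fit.fail, "try-error")  # TRUE
```

The point is that the same call can return residuals of different lengths, or none at all, depending on a setting that is easy to overlook.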
After some consideration, I realized that RRPP does not make sense if there are no residuals to randomize, so I updated the code to force removal of missing values. This was not a trivial bit of coding for a trivial task, but I think it works now (at least it did when I tested it with Noé’s data).
However, I wish to add one bit of advice. There are several functions in R that have ways to handle missing data. The best way to handle missing data, though, is to actively handle it before analysis. When working with data in R, I try not to ask R to fix things within a function that could be addressed outside of it. Doing so means trusting that it was fixed correctly, and I would rather verify the correction than trust functions that could have programming bugs.
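As a rough sketch of what I mean by handling it beforehand, using an ordinary data.frame with invented variable names (size, group) and only base R functions:

```r
# Toy data.frame with missing values in two different columns (made up)
df <- data.frame(size  = c(10.1, NA, 12.4, 11.8),
                 group = factor(c("a", "b", NA, "b")))

# Identify complete rows explicitly, so you can see what will be dropped
keep <- complete.cases(df)
which(!keep)                     # rows 2 and 3 have missing values

# Subset, then drop any factor levels that no longer occur
df.clean <- df[keep, , drop = FALSE]
df.clean$group <- droplevels(df.clean$group)

# Verify the correction rather than trusting it
stopifnot(!anyNA(df.clean))
nrow(df.clean)                   # 2
```

Doing it this way means you know exactly which observations were removed before any model is fit.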
To make this easier, I will try to add an na.omit.geomorph.data.frame function to geomorph soon; i.e.,
updated.gdf <- na.omit(gdf) # gdf is a geomorph.data.frame
However, this is a tricky function to create. Whereas na.omit.data.frame simply has to omit any row with an NA value in a data.frame object (which looks like a matrix), geomorph data frames might have phylo trees, covariance matrices, and certainly data in an array rather than a matrix. Therefore, the updates for lm.rrpp and procD.lm are ready to go, but additional functions might take some time to develop.
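In the meantime, something like the following can be done by hand. This is only a sketch under assumptions: gdf.like is a plain list standing in for a geomorph.data.frame, the names coords and species are invented, and it does not attempt to prune trees or covariance matrices, which would also need to match the retained specimens.

```r
# Hypothetical stand-in for a geomorph.data.frame: a plain list holding
# a p x k x n coordinate array and one specimen-level factor
p <- 3; k <- 2; n <- 4
coords <- array(as.numeric(seq_len(p * k * n)), dim = c(p, k, n),
                dimnames = list(NULL, NULL, paste0("spec", 1:n)))
coords[2, 1, 3] <- NA            # specimen 3 has a missing coordinate
gdf.like <- list(coords  = coords,
                 species = factor(c("A", "B", "A", NA)))

# Flag specimens with a missing value in any component
bad.coords <- apply(gdf.like$coords, 3, anyNA)
bad.factor <- is.na(gdf.like$species)
keep <- !(bad.coords | bad.factor)

# Subset every component along the specimen dimension
cleaned <- list(coords  = gdf.like$coords[, , keep, drop = FALSE],
                species = droplevels(gdf.like$species[keep]))
dim(cleaned$coords)              # 3 2 2
```

The array has to be subset along its third dimension while the factor is subset as a vector, which is part of why a general method is harder to write than na.omit.data.frame.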
As a reminder, to install from GitHub:
devtools::install_github("mlcollyer/RRPP", build_vignettes = TRUE)
devtools::install_github("geomorphR/geomorph", ref = "Stable", build_vignettes = TRUE)
cheers!
Mike