Importing GPA and EFA data from Momocs into geomorph for phylo PCA

165 views
Skip to first unread message

laura.au...@btopenworld.com

unread,
Jan 5, 2022, 9:56:21 AM1/5/22
to geomorph R package
Hello! 

I am trying to import my GPA and EFA data from momocs into geomorph for use in a phylogenetic PCA. Momocs does not have this function but I have used it for the EFA analysis however I now need to add a phylogenetically corrected PCA to my work so need to find a way of doing this on my Momocs data. 

When trying to read the momocs data for gm.prcomp with or without the tree I get this:
> pca <- gm.prcomp(align_ldk.gpa$coo)
Error in base::colMeans(x, na.rm = na.rm, dims = dims, ...) :
  'x' must be numeric

> gm.prcomp(align_ldk.gpa$coo, phy = FEP.tree, GLS = TRUE)
Error in h(simpleError(msg, call)) :
  error in evaluating the argument 'x' in selecting a method for function 'as.matrix': subscript out of bounds. 

I have tried exporting the GPA coordinates as a csv file and then importing into geomorph and carrying out pca. This does produce a plot, however it has analysed it incorrectly and PC1 accounts for 99% of the variance, and I cannot figure out why it doesn't plot correctly.

Has anyone attempted something similar or know way to get these packages working together?

Many thanks,

Laura

Mike Collyer

unread,
Jan 5, 2022, 10:14:42 AM1/5/22
to geomorph R package
Dear Laura,

I believe the $coo objects are lists rather than arrays, so R is not sure how to process the list as data.  Using simplify2array — e.g.,  gm.prcomp(simplify2array(align_ldk.gpa$coo)) — might work to convert then to arrays for analysis.

Mike

--
You received this message because you are subscribed to the Google Groups "geomorph R package" group.
To unsubscribe from this group and stop receiving emails from it, send an email to geomorph-r-pack...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/geomorph-r-package/e2ddd6bd-1353-475b-b024-616d07fa7bd2n%40googlegroups.com.

laura.au...@btopenworld.com

unread,
Jan 6, 2022, 10:22:33 AM1/6/22
to geomorph R package
Dear Mike, 

Thank you so much for that suggestion, it does now correctly read the data and perform the PCA. However, when I try to add the tree I get the error:
> array.gpa <- simplify2array(align_ldk.gpa$coo)
> pca <- gm.prcomp(array.gpa)
> phypca <- gm.prcomp(array.gpa, phy = FEP.tree, GLS = TRUE)
Error in edge[, 3] : subscript out of bounds

Both have 24 characters and the exact same name for tree tips and specimen ID so I don't understand what is causing this. In the FEP.tree object edge is listed as integer [45 x 2]. The tree is rooted with no branch lengths.

Thanks,

Laura

Mike Collyer

unread,
Jan 6, 2022, 10:45:41 AM1/6/22
to geomorph R package
The error suggests you have a different number of observations than taxa in the tree, or contrary to your assertion, the names are not exactly the same.  This often happens with an extraneous space, e.g., “taxon 1 “ vs. “taxon 1”

You can try 

match(dimnames(array.gpa)[[3]], FEP.tree$tip.label)

Also maybe try switching the two arguments around.  If you get NA for any portion of the output, you have either a naming issue or a missing taxon issue.

Mike

laura.au...@btopenworld.com

unread,
Jan 12, 2022, 9:31:17 AM1/12/22
to geomorph R package
Hi, 

I have checked the matching and both sets of data appear to be the same, so I an not sure why the error is occuring. 

> pca <- gm.prcomp(array.gpa)
> phypca <- gm.prcomp(array.gpa, phy = FEP.tree, GLS = TRUE)
Error in edge[, 3] : subscript out of bounds
> match(FEP.tree$tip.label, dimnames(array.gpa)[[3]])
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
> match(dimnames(array.gpa)[[3]], FEP.tree$tip.label)
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

Cheers, 

Laura

Bryan H. Juarez

unread,
Jan 12, 2022, 11:29:39 AM1/12/22
to geomorph-...@googlegroups.com
Hi Laura,

You mentioned that your tree has no branch lengths. I can replicate your error by removing branch lengths from the tree (setting $edge.lengths to NULL).

library(geomorph)
data(plethspecies)
Y.gpa <- gpagen(plethspecies$land)  

plethspecies$phy$edge.length <- NULL #remove branch lengths
PCA.w.phylo <- gm.prcomp(Y.gpa$coords, phy = plethspecies$phy)
Error in edge[, 3] : subscript out of bounds

If this is the case with your data, branch lengths are necessary for the calculation of the phylogenetic covariance matrix essential for most phylogenetic comparative methods.

Hope that helps.

Bryan



--
Bryan H. Juarez, PhD 
He/Him
NSF Postdoctoral Scholar
O'Connell and Hadly Labs
Biology Dept.
Stanford University
Twitter: @bhjuarez


laura.au...@btopenworld.com

unread,
Jan 15, 2022, 10:24:45 AM1/15/22
to geomorph R package
Ah that makes sense, thank you! I will see if I can locate this information. 

Many thanks, 

Laura

Reply all
Reply to author
Forward
0 new messages