Error with gm.prcomp

475 views
Skip to first unread message

Leonardo Hostos Olivera

unread,
Feb 11, 2022, 10:08:46 AM2/11/22
to geomorph R package
Dear all,

I am trying to use the gm.prcomp function for performing a phyloPCA, but I get an error and can't figure out what's causing it.

I have a tree with 44 taxa in the Newick format, I've read it with read.tree, performed the procrustes fit with the landmark data and included this into a geomorph data frame:

> phylo <- read.tree("crocs_tree2022")
> data <- readland.tps("crocs2022.tps", specID="ID")
> gpa <- gpagen(data, PrinAxes=TRUE, verbose=TRUE)
> gdf <- geomorph.data.frame(gpa)

Then when I perform the phyloPCA, I get this error: 

> phyPCA <- gm.prcomp(gdf$coords, phy=phylo, GLS=TRUE)
Error in h(simpleError(msg, call)) :
  error in evaluating the argument 'x' in selecting a method for function 't': NAs are not allowed in subscripted assignments

This only happens when I include the phylogeny, because when performing a standard PCA without the tree, there is no error.
> gm.prcomp(gdf$coords)

I tried doing the landmark digitalization again to generate a new TPS file, but I keep getting the same error and I'm not sure of what might be causing it. Any help would be greatly appreciated.

Thanks in advance

--
Leonardo Hostos Olivera
Licenciado en Biología (Lic. Blgo.)
Laboratorio de Paleontología y Evolución de Vertebrados
Universidad Peruana Cayetano Heredia, Lima, Perú

Adams, Dean [EEOB]

unread,
Feb 11, 2022, 11:44:50 AM2/11/22
to geomorph-...@googlegroups.com

Leonardo,

 

Please review the syntax and examples in the help file, and in the vignette about gm.prcomp.  Usually, such issues are a data input or matching issue (e.g., make certain that the names of your specimens match the names on the phylogeny).


Dean

 

Dr. Dean C. Adams (he/him)

Distinguished Professor of Evolutionary Biology

Director of Graduate Education, EEB Program

Department of Ecology, Evolution, and Organismal Biology

Iowa State University

https://faculty.sites.iastate.edu/dcadams/

phone: 515-294-3834

--
You received this message because you are subscribed to the Google Groups "geomorph R package" group.
To unsubscribe from this group and stop receiving emails from it, send an email to geomorph-r-pack...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/geomorph-r-package/1cff8a21-7444-4964-b434-1d79ce5cd8c4n%40googlegroups.com.

Leonardo Hostos Olivera

unread,
Feb 11, 2022, 2:45:46 PM2/11/22
to geomorph R package
Dear Dean, thank you for the replay.

I have checked my TPS and tree files, and they look exactly like other files I've worked with fine (I attached them just in case).
Also, I checked that the names of specimens match the names on the phylogeny:

> match(dimnames(gdf$coords)[[3]], phylo$tip.label)
 [1]  5 13 35 34 36  1  3 44 42 28 22 31 32 29 27 26 30 43  6 19 21 20 18 17 16  2 33
[28] 37 25 41 23 14 15 38 40 39  8 11  9 12 10  4  7 24

> dimnames(gdf$coords)[[3]] %in% phylo$tip.label
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[17] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[33] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

I don't know what else might be causing the problem.

Thanks for your time, hope to find the solution
Leonardo

--
Leonardo Hostos Olivera
Licenciado en Biología (Lic. Blgo.)
Laboratorio de Paleontología y Evolución de Vertebrados
Universidad Peruana Cayetano Heredia, Lima, Perú
crocs2022.tps
crocs_tree2022

María Constanza Marchesi

unread,
Feb 11, 2022, 2:48:46 PM2/11/22
to geomorph-...@googlegroups.com

Hello everybody,

I am new using R, and I’ve been struggling for some time now.

I am working on a data set containing 3D landmarks on vertebrae for 24 dolphin species, that are grouped in 3 subfamilies. In total there are 116 individuals.

We are exploring vertebral morphology within dolphins from the biomechanical point of view and trying to evidence morphological similarities in dolphin species with similar habitat requirements. For that we have stablished the categorial factor “Habitat”.

After performing GPA on our raw data. We first created a data frame containing the proc. coord and our categorical factors (that include species and subfamily apart from Habitat) and then calculated meanshapes for each species and a dataframe with the mean shape for each species and the categorial values for the means input from a cvs file.

Everything semmed to be fine this far, but when we try to plot a phylomorphospace and color the tips according to the habitat of each species, it is impossible for us to match species with the color representing their habitat, and species that are coastal are colored as if they were oceanic for eample (it messes up most of the colors)

We are assuming there is something wrong in the matching of the data, but we can’t seem to find the problem….

 

Hopefully someone can help us with this. 

I am copying the code we are using below , but I do not know if that is enough to try to help us.

Any comments will be welcome!

 

Thanks in advance


Dra. María Constanza (Kata) Marchesi
Becaria Posdoctoral / Postdoctoral fellow

Laboratorio de Mamíferos Marinos (LAMAMA)
Centro para el Estudio de los Sistemas Marinos (CESIMAR)
CCT CONICET-CENPAT
Blvd. Brown 2915
U9120ACD Puerto Madryn, Chubut, Argentina
Phone: +54 280 488-3184/488-3185 (ext 1330)

Women in Marine Mammal Science (WIMMS) 


Ever tried. Ever failed. No matter. Try again. Fail again. Fail better"
#———

Delphinidae_Data <- read.table("01Th_delphinidae_SoloAd.txt", header=T, dec =".")

Delphinidae_Clasif <-read.csv(file = "01Th_IDdelphinidae_SoloAd.csv", header = T,sep = ";",dec = ".")

 

Delphinidae_Clasif$Sp<-as.factor(Delphinidae_Clasif$Sp)

Delphinidae_Clasif$Subfamily<-as.factor(Delphinidae_Clasif$Subfamily)

Delphinidae_Clasif$Habitat<-as.factor(Delphinidae_Clasif$Habitat)

 

Delphinidae_3Darray <- arrayspecs(Delphinidae_Data[,2:ncol(Delphinidae_Data)], 35, 3)

 

Delphinidae_GPA <- gpagen(Delphinidae_3Darray)

 

gdf_delph<-geomorph.data.frame(Delphinidae_GPA,SP=Delphinidae_Clasif$Sp,Subfamily=Delphinidae_Clasif$Subfamily,Habitat=Delphinidae_Clasif$Habitat)

 

#Filogenia de la Familia

TreeDelphinidae<- '(((((((((((Slong,Sclym),Lhose),Ddelp),Scoeu),((Satte,Sfron),(TtrunO,TtrunC))),Schin),Sfluv),((((Pelec,Fatte),Gmacro),Ggris),Sbred)),((((Ccomm,Chect),((L. aust, Lcruc),Cheav)),Lobsc),Liper)),Lalbi),Lacut);'

 

Delphinidae_phy<-read.tree(text=TreeDelphinidae)

 

#Mean shapes

Y1 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "Lacut")])

Y2 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "Lalbi")])

Y3 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "Ccomm")])

Y4 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "Cheav")])

Y5 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "Chect")])

Y6<- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "Laust")])

Y7 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "Lcruc")])

Y8 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "Lipero")])

Y9 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "Lobsc")])

Y10 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "Fatte")])

Y11 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "Ggris")])

Y12 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "Gmacro")])

Y13 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "Pelec")])

Y14 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "Sbred")])

Y15 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "Lhose")])

Y16 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "Satte")])

Y17 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "Schin")])

Y18 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "Ddelp")])

Y19 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "Sclym")])

Y20 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "Scoeu")])

Y21 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "Sfluv")])

Y22 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "Sfron")])

Y23 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "Slong")])

Y24 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "TtrunC")])

Y25 <- mshape(gdf_delph$coords[,,which(gdf_delph$SP == "TtrunO")])

 

YT <- array(c(Y25, Y24, Y23, Y22, Y21, Y20, Y19, Y18, Y17, Y16, Y15, Y14, Y13, Y12, Y11, Y10, Y9, Y8, Y7, Y6, Y5, Y4, Y3, Y2,Y1 ), dim = c(35,3,25), dimnames = list(c(),c(),Delphinidae_phy$tip.label))

 

mean_Clasif <-read.csv(file = "01Th_ID_SoloAd.csv", header = T,sep = ";",dec = ".")

mean_Clasif$Sp<-as.factor(mean_Clasif$Sp)

mean_Clasif$Subfamily<-as.factor(mean_Clasif$Subfamily)

mean_Clasif$Habitat<-as.factor(mean_Clasif$Habitat)

 

summary(mean_Clasif$Habitat)

 

# branch lenghts

Delphinidae_phy <- compute.brlen(Delphinidae_phy)

 

meanshape_gdf<-geomorph.data.frame(coords=YT,CS=CS, SP=mean_Clasif$Sp,Subfamily=mean_Clasif$Subfamily,Habitat=mean_Clasif$Habitat,Phy= Delphinidae_phy)

 

#filoPCA

Delphinidae_PCA_meanshapes<-gm.prcomp(A=meanshape_gdf$coords,phy =meanshape_gdf$Phy)

 

gp <- as.factor (meanshape_gdf$Habitat)

col.gp<-c("purple", "lightblue","red","orange","green","black","grey","blue")

names(col.gp) <- levels(gp)

col.gp1<- col.gp[match(gp, names(col.gp))]

summary(gp)

levels(gp)

 

#plot

plot(Delphinidae_PCA_meanshapes, axis1=1, axis2= 2, phylo =TRUE,col= col.gp1,  phylo.par= list(tip.labels = TRUE, node.labels=FALSE, edge.color= "black", edge.width= 1, tip.txt.cex =0.5, tip.txt.adj = c(-0.1, -0.1) ))

legend('topleft',legend= levels(gp),  fill= c("purple", "lightblue","red","orange","green","black","grey","blue"),cex=0.3)

title("Vertebra PCA",cex.main=1.5)







Bryan H. Juarez

unread,
Feb 11, 2022, 3:20:05 PM2/11/22
to geomorph-...@googlegroups.com
Hola Dra. Marchesi,

A plot of the issue would help determine where the coding issue lies. However, from what I can tell there seems to be a mismatch between the order of the colors in your legend(fill = ) argument and col.gp1. You can try sorting the fill argument or manually changing the order of colors by rearranging within c().

Specifically, I think the habitat:color matching between levels(gp) and c("purple", "lightblue"..) in your legend does not match what you've plotted using col.gp1 where you've manually rearranged the colors to be in the order of the color-matched habitats/input levels(gps). (i.e., col.gp1<- col.gp[match(gp, names(col.gp))]).

Hope that helps or at least spurs ideas of where the mismatch is coming from.

Best,
Bryan Juarez

--
You received this message because you are subscribed to the Google Groups "geomorph R package" group.
To unsubscribe from this group and stop receiving emails from it, send an email to geomorph-r-pack...@googlegroups.com.


--
Bryan H. Juarez, PhD 
He/They
NSF Postdoctoral Scholar
O'Connell and Hadly Labs
Biology Dept.
Stanford University

Twitter: @bhjuarez


Bryan H. Juarez

unread,
Feb 11, 2022, 3:53:39 PM2/11/22
to geomorph-...@googlegroups.com
Leonardo,

The following indicates the coordinate data are not in the same order as in the phylogeny.
"> match(dimnames(gdf$coords)[[3]], phylo$tip.label)
 [1]  5 13 35 34 36  1  3 44 42 28 22 31 32 29 27 26 30 43  6 19 21 20 18 17 16  2 33
[28] 37 25 41 23 14 15 38 40 39  8 11  9 12 10  4  7 24".

You need to use this information to rearrange the data you supply gm.prcomp.

I would do something like this:
coords <- coords[,,match(phylo$tip.label, dimnames(gdf$coords) [[3]] ) ] #rearrange data
then I would make the geomorph.data.frame and then use gm.prcomp. You'd just need to change my syntax to match your own objects. If that does not work, reorder the two inputs in match(), though I think I got the order of input objects for the function correct right this time.

Best, 
Bryan Juarez


María Constanza Marchesi

unread,
Feb 11, 2022, 3:56:08 PM2/11/22
to geomorph-...@googlegroups.com
Helo Bryan! Thank you for responding!

So I kept working and included what you said (or I think I did) but it is not working either…..

So, this is what I get when calling the habitat, and it looks ok, it is in the order it should be according to the order of the species in the .cvs file

> meanshape_gdf$Habitat

 [1] Shelf/Oceanic    Shelf/Oceanic    Coastal          Coastal        

 [5] Coastal          Coastal          Oceanic          Oceanic        

 [9] Shelf            Oceanic          OceanicDD        OceanicDD      

[13] Oceanic          Shelf            Oceanic          Oceanic        

[17] Coastal/Riverine Shelf/Oceanic    Oceanic          Oceanic        

[21] Riverine         Coastal          Shelf/Oceanic    Shelf           

[25] Coastal     

 

But then….I have tried two ways for matching the right color with the right species (according to that species habitat)

 

>Delphinidae_PCA_meanshapes<-gm.prcomp(A=meanshape_gdf$coords,phy =meanshape_gdf$Phy)

 

> gp <- as.factor (meanshape_gdf$Habitat)

> col.gp<-c("purple", "lightblue","red","orange","green","black","grey","blue")

> names(col.gp) <- levels(gp)

> col.gp<- col.gp[match(gp, names(col.gp))]

> summary(gp) #this also seems to be ok

         Coastal Coastal/Riverine          Oceanic        OceanicDD         Riverine

               6                1                8                2                1

           Shelf    Shelf/Oceanic

               3                4

> levels(gp)

[1] "Coastal"          "Coastal/Riverine" "Oceanic"          "OceanicDD"      

[5] "Riverine"         "Shelf"            "Shelf/Oceanic"  

 

when I run this script:

> plot(Delphinidae_PCA_meanshapes, axis1=1, axis2= 2, phylo =TRUE,col= col.gp,  phylo.par= list(tip.labels = TRUE, node.labels=FALSE, edge.color= "black", edge.width= 1, tip.txt.cex =0.5, tip.txt.adj = c(-0.1, -0.1) ))

> legend('bottomright',legend= levels(gp),  fill= c("purple", "lightblue","red","orange","green","black","grey","blue"),cex=0.3)

> title("Vertebra PCA",cex.main=1.5)

 

I get this plot…. in which species do not have the color they should

 

If I run this:

> plot(Delphinidae_PCA_meanshapes, axis1=1, axis2= 2, phylo =TRUE,col= c("purple", "lightblue","red","orange","green","black","grey","blue" ),  phylo.par= list(tip.labels = TRUE, node.labels=FALSE, edge.color= "black", edge.width= 1, tip.txt.cex =0.5, tip.txt.adj = c(-0.1, -0.1) ))

> legend('topleft',legend= levels(gp),  fill= c("purple", "lightblue","red","orange","green","black","grey","blue"),cex=0.3)

> title("Vertebra PCA",cex.main=1.5)

I get a similar plot, but colors do not match the habitats of each species either…for example Sfluv should be riverine but it is not green….

 


Thank you for taking the time to help me with this, I have been struggling for hours now!


Regards

Kata

Mike Collyer

unread,
Feb 11, 2022, 4:36:59 PM2/11/22
to geomorph-...@googlegroups.com
It’s possible that when matching names, a mistake can be made by not considering both directions.  For example:

> a <- 1:4
> b <- 1:3
> match(b, a)
[1] 1 2 3
> b %in% a
[1] TRUE TRUE TRUE


> match(a, b)
[1]  1  2  3 NA
> a %in% b
[1]  TRUE  TRUE  TRUE FALSE

I am not sure this is the issue, as it would likely be detected with a better error message.  Bryan’s suggestion can’t hurt but the function should be able to reorder a covariance matrix, internally.  I tried to replicate your error with a tree with polytomies (no error) and missing branch lengths (different kind of error).  The closest I came to yours was mismatched names, but although the error was similar, it not exactly the same. 

I believe the issue is in the tree but it is not possible to resolve with the information you have provided.  Could you provide a summary and/or plot of the tree?

Mike

Bryan H. Juarez

unread,
Feb 11, 2022, 4:39:29 PM2/11/22
to geomorph-...@googlegroups.com
De nada. 

In neither case do I expect the legend colors/habitats to match the plot. In your first plot there is the issue I sent in my previous email. In the second plot there's the issue of the supplied col = vector not matching the length of your # of species, meaning the color vector will be resampled in sequence to assign colors to the circles on the tips.

A general solution is to inspect col.gp to know the exact habitat-color matching that was plotted, THEN get the legend to match that. Currently the legend does not match the habitat-color matching because in the plot, you've resorted the habitat-color scheme to match the order of habitats in alphanumerical order (I think) and this does not match your supplied fill = argument in legend(fill = c("purple", "lightblue","red","orange","green","black","grey","blue")).

In short, this means that col.gp does not match the data you've supplied legend() for the combination of the legend= and fill= arguments. This is why I suggested you inspect col.gp and then rearrange your supplied legend(fill = ), given the order of habitats in legend(legend = levels(gp)).

Best,
Bryan

Leonardo Hostos Olivera

unread,
Feb 11, 2022, 4:40:59 PM2/11/22
to geomorph R package
Hi Bryan! Thanks for your help

I applied your suggestions for rearrangement and got this:

> gpa <- gpagen(data, PrinAxes=TRUE, verbose=TRUE, print.progress=FALSE)
> gpa$coords <- gpa$coords[,,match(phylo$tip.label, dimnames(gpa$coords)[[3]])]
> match(phylo$tip.label, dimnames(gpa$coords)[[3]])
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
[28] 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44

But then I made the geomorph.data.frame, used gm.prcomp and got the same error as the beginning:

> gdf <- geomorph.data.frame(gpa)
> phyPCA <- gm.prcomp(gdf$coords, phy=phylo, GLS=TRUE)
Error in h(simpleError(msg, call)) :
  error in evaluating the argument 'x' in selecting a method for function 't': NAs are not allowed in subscripted assignments

Also, I tried reordering the two inputs in match(), but I don't get a correct rearrangement whit this:
> gpa <- gpagen(data, PrinAxes=TRUE, verbose=TRUE, print.progress=FALSE)
> gpa$coords <- gpa$coords[,,match(dimnames(gpa$coords)[[3]], phylo$tip.label)]
> match(dimnames(gpa$coords)[[3]], phylo$tip.label)
 [1] 36 32 40 38 39  5 35 24  4 37 20 23 14 25 33  2 41  7  1  6 21 19 43 30 26 13 15
[28]  8 16 10 18 29 27 11 12  9 44 22 42 31 28 34  3 17

The problem persists and seems that it's not related to the names.
I'd be thankful if you help me to find another solution, truly thanks for your time.

Kind regards

Bryan H. Juarez

unread,
Feb 11, 2022, 4:52:00 PM2/11/22
to geomorph-...@googlegroups.com
Ah my apologies! I cannot keep track of functions in different libraries that do or do not sort for you so I always sort beforehand.

Mike, just to be certain, is it currently safe to assume all functions in the currrent versions of geomorph and RRPP do sorting of data and tree for you? I do know one important note is that different variables need to be sorted prior to combining them into a single geomorph.data.frame.

Thanks!
Bryan

Leonardo Hostos Olivera

unread,
Feb 11, 2022, 4:56:21 PM2/11/22
to geomorph R package
Dear Mike, thanks for replying!
Here I provide a summary and plot (attached) of the tree

> summary(phylo)

Phylogenetic tree: phylo

  Number of tips: 44
  Number of nodes: 40
  Branch lengths:
    mean: 2.53012
    variance: 11.22774
    distribution summary:
   Min. 1st Qu.  Median 3rd Qu.    Max.
      1       1       1       2      17
  No root edge.
  First ten tip labels: Bernissartia
                        Iharkutosuchus
                        Bsternbergii
                        Tmacrorhynus
                        Agermanicus
                        Dollosuchoides
                        Tschlegelii
                        Scordovai
                        Tantiqua
                        Thcarolinense
  No node labels.

Regards

crocs_treeplot.png

Adams, Dean [EEOB]

unread,
Feb 11, 2022, 5:09:07 PM2/11/22
to geomorph-...@googlegroups.com

Yes that is correct. The example ‘plethspecies’ has data and a tree out of order, but still runs.

 

One possible issue here is that Leonardo’s tree has polytomies. For most R-based comparative methods analyses, one must first use ‘multi2di’ in ‘ape’ to resolve them via zero-length branches. That does not change the resulting phylogenetic covariance matrix, but makes a difference for some analyses.

 

A quick check though with this data does not seem to resolve it.  But I bring it up because it is good coding practice to remember this step: it can make a difference.  

 

I will investigate the current dataset challenge more when I can.

 

Dean

 

Dr. Dean C. Adams (he/him)

Distinguished Professor of Evolutionary Biology

Director of Graduate Education, EEB Program

Department of Ecology, Evolution, and Organismal Biology

Iowa State University

https://faculty.sites.iastate.edu/dcadams/

phone: 515-294-3834

 

Mike Collyer

unread,
Feb 11, 2022, 5:26:29 PM2/11/22
to geomorph-...@googlegroups.com
Any function that means using a covariance matrix and a GLS type of estimation will attempt to sort the covariance matrix by data names.  However, pre-sorting data is not a bad idea because there are some places where an attempt to sort will be made but the analysis will not be arrested if it cannot happen.  This is not true of geomorph function but is true of RRPP functions, where a covariance matrix might not be made from a phylogenetic tree.  We try to arrest analyses in geomorph, where the covariance matrix is made from a tree, so it is more essential that data and tree labels match.  

Despite best efforts though, something could slip through the cracks, so sorting data is a good strategy.

Regarding variable, yes, this is essential for building a data frame!

Cheers!
Mike

Mike Collyer

unread,
Feb 11, 2022, 6:35:53 PM2/11/22
to geomorph-...@googlegroups.com
The support functions in geomorph do attempt to use multi2di as a safety step, but these functions do not take the extra step to see if it worked as anticipated.  The ape function, ape::is.binary, might be helpful.  Because multi2di induces branch lengths of 0, I can imagine a covariance matrix that is not full rank or problems with ancestral state estimation arising.

I apologize for broken thread responses here.  I was not receiving responses sometimes, but was receiving responses to responses.  I did not receive Leonardo’s response but see that Dean received it and pointed out what I would have seen (polytomies).  I just found the original emails in my junk box.  I hope this did not cause any confusion.  

So I just now saw the data.  I can confirm that multi2di does not make a workable tree in this case and the function error occurs for ancestral state estimation, due to an inability to calculate phylogenetically independent contrasts.  Why this happens is not immediately clear.  So the problem is certainly in the tree but I’ll have to investigate what it might be.

Mike 

 

Mike Collyer

unread,
Feb 11, 2022, 7:02:06 PM2/11/22
to geomorph-...@googlegroups.com
Leonardo,

You have a tree with 44 tips and 40 internal nodes, with multiple polytomies.  Attempting to use multi2di produces a tree with 44 tips and 45 internal nodes, with multiple polytomies.  Using multi2di — which geomorph support functions do — should produce a tree with N tips and N - 1 internal nodes.  Without this condition, an error occurs down the line.

I’m not sure what to suggest but this is where the problem lies.  The tree is nether workable nor easily manipulated into a workable tree.  You might have to manipulate your original tree file to make it work.  We can try to think of simple solutions for geomorph for the future but I am not having any luck when searching for solutions for a tree with more internal nodes than tips.

Sorry, the best I can do is diagnose the problem but I have no solution to offer.

Mike

Leonardo Hostos Olivera

unread,
Feb 11, 2022, 7:37:44 PM2/11/22
to geomorph R package
Mike, thank you so much for the explanation.

Now I can understand where the error comes from. Hope this could be solved in the future in geomorph, since I guess it's not unusual to use trees with polytomies, especially in fields like paleontology.

Besides that, I tried performing a phylogenetic regression analysis using the procD.pgls() function to evaluate the effect of size on shape, and it ran fine.
> procD.pgls(coords ~ Csize, phy=phylo, data=gdf, iter=9999)
Why was there no error in this case? This function does not include multi2di?

Regards,
Leonardo

Adams, Dean [EEOB]

unread,
Feb 12, 2022, 1:23:52 PM2/12/22
to geomorph-...@googlegroups.com

Leonardo,

 

The difference between PGLS and gm.prcomp is that the latter requires ancestral state estimation. That requires a resolved phylogeny, which is where the polytomies come back into this.

 

As Mike stated, mathematically there are only N-1 nodes for a fully resolved tree of N tips. For some reason, the ‘multi2di’ function in ape (which geomorph uses) is returning N+1 nodes for your tree. I again suspect that is part of the problem as that will cascade into ancestral state estimation (though I do not have the full story yet).

 

I will try to investigate this when I can, but both Mike and I have confirmed it is a particular problem related to your tree. And even though have pinpointed where, we do not know why it occurs, or how to resolve this.  I will try to find time to investigate the latter within geomorph, but I make no promises (it is rather hard to devise a solution to a problem that should not exist mathematically!).

 

Please be patient while we continue to dig.

Mike Collyer

unread,
Feb 12, 2022, 3:50:23 PM2/12/22
to geomorph-...@googlegroups.com
Leonardo,

I have found that using a combination of ape::multi2di and ape::collapse.singles will manipulate your tree such that it can be used successfully in gm.prcomp.  I was turned onto this solution by seeing its use in phytools::fastAnc.  This might be something that we could build into geomorph functions in the future, but first we have to determine if it should be included in all cases.  (Some tree manipulation should be a user-level, philosophical decision.)

In the interim, it will allow you use any geomorph functions that involve estimating ancestral states.

Best,
Mike

Adams, Dean [EEOB]

unread,
Feb 13, 2022, 12:47:37 PM2/13/22
to geomorph-...@googlegroups.com

Leonardo et al.,

 

After quite a bit of digging, the issue was not related to the polytomies. Instead, Leonardo’s tree had two internal ‘nodes’ that were embedded within branches. That is, there were nodes with only a single descendant; meaning they were found along an internal branch.  To see these try this:

 

phytools:::plotTree.singletons(phylo)

 

Such issues are rare, and why they occur are not immediately clear to me. They are external to geomorph, and likely to R. As alluded to earlier, it is a tree issue.

 

I have updated gm.prcomp (or rather its internal functions) to catch such oddities, and have confirmed that for Leonardo’s tree things now longer throw an error. Please install from Github as:

 

devtools::install_github(‘geomorphR/geomorph’, ref = ‘Stable’)

Leonardo Hostos Olivera

unread,
Feb 14, 2022, 1:48:12 AM2/14/22
to geomorph-...@googlegroups.com
Dear Dean and Mike,
Thank you so much for your help and explanations!

I certainly hadn't noticed the two internal nodes located within the branches of my tree. After removing them, I could finally perform the analysis well. I guess it may have been a mistake while constructing the tree file in Newick format, as one can get confused by using too many parentheses.
Since it's an issue that could happen to someone else, I find pertinent to include perhaps inside the function a short evaluation of the presence of nodes along internal branches and a warning whether they are found. In this way, one could notice the exact error, check the location of problematic nodes using for example phytools::plotTree.singletons (as you suggested) and then correct the tree file.

Thanks again for everything!
Kind regards

--
Leonardo Hostos Olivera
Licenciado en Biología (Lic. Blgo.)
Laboratorio de Paleontología y Evolución de Vertebrados
Universidad Peruana Cayetano Heredia, Lima, Perú
You received this message because you are subscribed to a topic in the Google Groups "geomorph R package" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/geomorph-r-package/114YaHRT99Y/unsubscribe.
To unsubscribe from this group and all its topics, send an email to geomorph-r-pack...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/geomorph-r-package/CH2PR04MB6617009C8183B01B53FC0674A2329%40CH2PR04MB6617.namprd04.prod.outlook.com.


--
created with MySignature.io
Leonardo Hostos Olivera   Lic. Biología
Asistente Técnico de Ciencia de Datos en WCS Perú
Presidente de la organización AUPFAS
Miembro de T.E.A.M. Paleo - Lab PALEV (UPCH)
tlf.:  +51 943650661
email:  lhostos...@gmail.com
created with MySignature.iocreated with MySignature.iocreated with MySignature.io

Adams, Dean [EEOB]

unread,
Feb 14, 2022, 7:41:51 AM2/14/22
to geomorph-...@googlegroups.com
That is exactly what the update does


From: geomorph-...@googlegroups.com <geomorph-...@googlegroups.com> on behalf of Leonardo Hostos Olivera <lhostos...@gmail.com>
Sent: Monday, February 14, 2022 12:47:56 AM
To: geomorph-...@googlegroups.com <geomorph-...@googlegroups.com>

María Constanza Marchesi

unread,
Feb 18, 2022, 8:20:52 AM2/18/22
to geomorph-...@googlegroups.com
Hi Bryan! I wanted to let you know that I finally mannaged to plot and color my data.
The problem was in the data frame. As I was creating a new dataframe with meanshapes for each species, the order of my species in that matrix was not the same as in the original dataframe. I fixed it manually by rearranging the order of my classifiers and reloading the classifiers file! I am sure there could be a code for doing that, but it was easier for me to fix it manually.

Thank you for your help!

<clip_image001.png>

 

If I run this:

> plot(Delphinidae_PCA_meanshapes, axis1=1, axis2= 2, phylo =TRUE,col= c("purple", "lightblue","red","orange","green","black","grey","blue" ),  phylo.par= list(tip.labels = TRUE, node.labels=FALSE, edge.color= "black", edge.width= 1, tip.txt.cex =0.5, tip.txt.adj = c(-0.1, -0.1) ))

> legend('topleft',legend= levels(gp),  fill= c("purple", "lightblue","red","orange","green","black","grey","blue"),cex=0.3)

> title("Vertebra PCA",cex.main=1.5)

I get a similar plot, but colors do not match the habitats of each species either…for example Sfluv should be riverine but it is not green….

 

<clip_image002.png>

Reply all
Reply to author
Forward
0 new messages