Creating dendrograms with aboot function from distance matrix


Νίκος Τουρβάς

Aug 20, 2018, 6:14:22 AM
to po...@googlegroups.com
As far as I understand, the aboot function (when the distance option is set to nei.dist) creates a dendrogram based on Nei's standard genetic distance. I would therefore expect a tree created with aboot's built-in nei.dist (tree_1) and a tree created with aboot from a distance matrix produced by hierfstat's genet.dist function (method = "Ds") (tree_2) to be identical. This is not the case in the example below. What am I missing?

Thanks in advance,

Nikos Tourvas


library(adegenet)
library(poppr)
library(hierfstat)
library(magrittr)

data("nancycats")

# Tree using aboot function
tree_1 <- nancycats %>%
          genind2genpop %>%
          aboot(sample = 1000, cutoff = 50,
                distance = nei.dist,
                tree = "upgma")


# Tree from distance matrix
distance.nei <- genet.dist(nancycats, method = "Ds")
distance.nei <- as.matrix(distance.nei)

# added because hierfstat doesn't display pop names by default
pop_vector <- popNames(nancycats)
colnames(distance.nei) <- pop_vector
rownames(distance.nei) <- pop_vector

tree_2 <- aboot(distance.nei, sample = 1000, cutoff = 50,
                 tree = "upgma")

> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=Greek_Greece.1253  LC_CTYPE=Greek_Greece.1253    LC_MONETARY=Greek_Greece.1253
[4] LC_NUMERIC=C                  LC_TIME=Greek_Greece.1253

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] hierfstat_0.04-29 poppr_2.8.0       adegenet_2.1.1    ade4_1.7-11

loaded via a namespace (and not attached):
 [1] splines_3.5.1     gtools_3.8.1      shiny_1.1.0       assertthat_0.2.0  expm_0.999-2
 [6] sp_1.3-1          yaml_2.2.0        pegas_0.11        LearnBayes_2.15.1 pillar_1.3.0
[11] backports_1.1.2   lattice_0.20-35   glue_1.3.0        quadprog_1.5-5    phangorn_2.4.0
[16] digest_0.6.15     promises_1.0.1    colorspace_1.3-2  htmltools_0.3.6   httpuv_1.4.5
[21] Matrix_1.2-14     plyr_1.8.4        pkgconfig_2.0.1   devtools_1.13.6   gmodels_2.18.1
[26] purrr_0.2.5       xtable_1.8-2      sca


Zhian Kamvar

Aug 20, 2018, 8:53:27 AM
to nikost...@gmail.com, poppr
Hello,

A couple of things:

1. Your use of aboot to construct tree_2 is incorrect. The function re-samples columns of the incoming data with replacement, calculates the distance, uses this distance to calculate a tree, and returns the tree. When you pass a genetic distance matrix to aboot, you are effectively trying to calculate a genetic distance from a genetic distance matrix. This is one of the reasons the results don't look the same. Instead, you should just use phangorn::upgma() or ape::nj() to calculate the tree from the distance matrix.

2. That being said, there IS a difference in how genet.dist() and nei.dist() behave in the case of nancycats. In this data set, population 17 is missing locus fca45. However, genind2genpop() sees this as simply 0 alleles recorded at that locus while hierfstat sees this as missing data and adjusts accordingly, so this affects how population 17 behaves relative to all others:

suppressPackageStartupMessages({
  library(poppr)
  library(hierfstat)
})

data("nancycats")

nei.poppr     <- nei.dist(genind2genpop(nancycats, quiet = TRUE))
nei.hierfstat <- genet.dist(nancycats, method = "Ds")
print.table(round(as.matrix(nei.hierfstat - nei.poppr), 2), zero.print = ".")
#>     P01   P02   P03   P04   P05   P06   P07   P08   P09   P10   P11   P12
#> P01   .     .     .     .     .     .     .     .     .     .     .     .
#> P02   .     .     .     .     .     .     .     .     .     .     .     .
#> P03   .     .     .     .     .     .     .     .     .     .     .     .
#> P04   .     .     .     .     .     .     .     .     .     .     .     .
#> P05   .     .     .     .     .     .     .     .     .     .     .     .
#> P06   .     .     .     .     .     .     .     .     .     .     .     .
#> P07   .     .     .     .     .     .     .     .     .     .     .     .
#> P08   .     .     .     .     .     .     .     .     .     .     .     .
#> P09   .     .     .     .     .     .     .     .     .     .     .     .
#> P10   .     .     .     .     .     .     .     .     .     .     .     .
#> P11   .     .     .     .     .     .     .     .     .     .     .     .
#> P12   .     .     .     .     .     .     .     .     .     .     .     .
#> P13   .     .     .     .     .     .     .     .     .     .     .     .
#> P14   .     .     .     .     .     .     .     .     .     .     .     .
#> P15   .     .     .     .     .     .     .     .     .     .     .     .
#> P16   .     .     .     .     .     .     .     .     .     .     .     .
#> P17   .  0.01  0.03  0.03 -0.02  0.07  0.04  0.03 -0.03 -0.02 -0.02  0.03
#>       P13   P14   P15   P16   P17
#> P01     .     .     .     .     .
#> P02     .     .     .     .  0.01
#> P03     .     .     .     .  0.03
#> P04     .     .     .     .  0.03
#> P05     .     .     .     . -0.02
#> P06     .     .     .     .  0.07
#> P07     .     .     .     .  0.04
#> P08     .     .     .     .  0.03
#> P09     .     .     .     . -0.03
#> P10     .     .     .     . -0.02
#> P11     .     .     .     . -0.02
#> P12     .     .     .     .  0.03
#> P13     .     .     .     .  0.03
#> P14     .     .     .     .  0.02
#> P15     .     .     .     .  0.02
#> P16     .     .     .     .  0.03
#> P17  0.03  0.02  0.02  0.03     .
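
As a rough illustration of point 1, the sketch below builds trees straight from a distance matrix instead of passing the matrix to aboot. The four-population matrix is made-up illustration data (not output from nancycats), and hclust() with average linkage stands in for phangorn::upgma() only to keep the dependencies down to ape.

```r
# Minimal sketch: build trees directly from a distance matrix.
# The matrix below is made-up illustration data, not output from nancycats.
library(ape)

pops <- c("P01", "P02", "P03", "P04")
d <- matrix(c(0.00, 0.10, 0.40, 0.45,
              0.10, 0.00, 0.35, 0.40,
              0.40, 0.35, 0.00, 0.15,
              0.45, 0.40, 0.15, 0.00),
            nrow = 4, dimnames = list(pops, pops))
d <- as.dist(d)

# Neighbour-joining tree, one of the two options named in the reply.
tree_nj <- ape::nj(d)

# UPGMA is average-linkage clustering. phangorn::upgma(d) is the call named
# in the reply; hclust() is used here only to avoid an extra dependency.
tree_upgma <- ape::as.phylo(stats::hclust(d, method = "average"))

plot(tree_upgma)
```

With real data, the same calls should apply to the object returned by genet.dist(nancycats, method = "Ds"), which is already a distance object.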


--
You received this message because you are subscribed to the Google Groups "poppr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to poppr+un...@googlegroups.com.
To post to this group, send email to po...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/poppr/4d9b2743-2341-4ac0-a862-279d3ebdef7c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Νίκος Τουρβάς

Aug 20, 2018, 7:29:23 PM
to poppr
Thank you very much for the speedy reply.

My ultimate goal in asking the previous question was to use aboot to create a bootstrapped dendrogram based on the distances offered by the genet.dist function in hierfstat. That's why I was trying to create a tree based on Nei's distance this way: to verify that it worked! Unfortunately, even after reading the aboot documentation, I still don't understand how to do this. I would be grateful if someone could provide an example of using aboot in combination with genet.dist.

Again thanks for the time you spent on this silly question. :) I appreciate it a lot!




Zhian Kamvar

Aug 21, 2018, 6:01:45 AM
to nikost...@gmail.com, po...@googlegroups.com
Hello,

The best way of doing this would be to use boot.phylo from ape (which aboot calls internally). The manual specifies all the gory details of what you need, but there's a bit of a trick to it with this situation: you want to shuffle all but the first column. I wrote up a quick tutorial of how to accomplish this here: https://gist.github.com/zkamvar/22ee313c7351cd03cd90b913bf3ad46a#file-bootstrap-genet-dist-md

One reason I don't recommend doing this with aboot is that it was initially designed to handle the genind structure (see "Bootstrapping" in https://www.frontiersin.org/articles/10.3389/fgene.2015.00208/full#h3 for details). Since hierfstat stores each locus as a single column, the mechanisms in aboot are no longer necessary.
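
For readers without access to the linked gist, here is a hedged sketch of the idea of resampling every column except the first. It is not a copy of the gist and does not call boot.phylo; it writes out the resample, distance, tree loop by hand and then counts clades with ape::prop.clades(). The helper name ds_tree is hypothetical, UPGMA is done with hclust() average linkage, and the population labels are assigned on the assumption (also made in the original post) that genet.dist returns populations in the same order as popNames().

```r
library(adegenet)   # provides the nancycats data set and popNames()
library(hierfstat)  # genind2hierfstat(), genet.dist()
library(ape)        # as.phylo(), prop.clades()

data("nancycats", package = "adegenet")

# hierfstat format: the first column is the population, the rest are loci.
dat  <- genind2hierfstat(nancycats)
pop  <- dat[, 1]
loci <- dat[, -1, drop = FALSE]

# Hypothetical helper: UPGMA tree from Nei's Ds for a set of locus columns.
ds_tree <- function(loci_cols) {
  d <- as.matrix(genet.dist(data.frame(pop = pop, loci_cols), method = "Ds"))
  # Assumption: rows/columns follow the order of popNames(nancycats).
  dimnames(d) <- list(popNames(nancycats), popNames(nancycats))
  as.phylo(hclust(as.dist(d), method = "average"))
}

set.seed(999)
B <- 100                      # number of bootstrap replicates
obs_tree <- ds_tree(loci)     # tree from the observed data

boot_trees <- lapply(seq_len(B), function(i) {
  # Resample loci with replacement; the population column is never resampled.
  idx <- sample.int(ncol(loci), replace = TRUE)
  ds_tree(loci[, idx, drop = FALSE])
})

# Count how often each clade of the observed tree recurs in the replicates.
clade_counts <- prop.clades(obs_tree, boot_trees, rooted = TRUE)
obs_tree$node.label <- round(100 * clade_counts / B)

plot(obs_tree, show.node.label = TRUE)
```

Writing the loop out by hand keeps the population column fixed while only loci are resampled; the same helper could presumably be passed as FUN to boot.phylo, subject to the details in its manual.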

I hope that helps.

Best,
Zhian


Νίκος Τουρβάς

Aug 21, 2018, 3:28:29 PM
to poppr
Well, this helps a lot! Thanks!

Buddhika Amarasinghe Dahanayaka

Jul 7, 2020, 11:25:14 PM
to poppr
Hi Zhian and all,

I have a small issue with a dendrogram drawn using Nei's genetic distance, so I thought I would add my question to this conversation.

I need to change the font size (population names, node values, and scale bar) and the position of the scale bar in the dendrogram. I used the tutorial "Population genetics and genomics in R" as my guide. I drew two dendrograms (Images 1 and 2, attached). In Image 2 I was able to change the font size of the node labels and the position of the scale bar, but the population names disappeared.
I used the following functions.

Image 1 

set.seed(999)
tree <- genchar %>%
  genind2genpop(pop = ~field/location()) %>%
  aboot(cutoff = 50, quiet = TRUE, sample = 1000, distance = nei.dist, tree = "upgma")

Image 2

library("ape")
cols <- rainbow(7)
plot.phylo(tree, cex = 0.8, font = 2, adj = 0, tip.color = cols[tree$grp],
           label.offset = 0.0125)
nodelabels(tree$node.label, adj = c(1.3, -0.2), frame = "n", cex = 0.6,
           font = 3, xpd = TRUE)
axisPhylo(1)  

   
 I would be grateful, if you could assist.

Thank you :)

Regards
Buddhika  

Imgae 1.JPG
image 2.JPG

Zhian N. Kamvar

Jul 16, 2020, 7:47:50 PM
to Buddhika Amarasinghe Dahanayaka, poppr

Hello,


The cols vector only contains colors, so subsetting it with tree$grp will give no colors. You should add your population names to the cols vector like:

names(cols) <- setPop(genchar, ~field/location) %>% popNames()

Otherwise, I would go through the examples in the plot.phylo function from the {ape} package. It has some good examples.
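
To illustrate why the names matter, here is a small sketch with a made-up three-population tree standing in for the aboot() result. The population names popA, popB and popC are hypothetical, and the lookup uses the tip labels rather than tree$grp, since the contents of tree$grp are not shown in this thread.

```r
library(ape)

# Hypothetical three-population tree standing in for the aboot() result.
tree <- read.tree(text = "((popA:0.1,popB:0.1):0.2,popC:0.3);")

cols <- rainbow(3)

# Without names, indexing by label finds nothing and returns NA.
unnamed_lookup <- cols[tree$tip.label]

# With names, each tip label picks out its own colour.
names(cols) <- c("popA", "popB", "popC")
named_lookup <- cols[tree$tip.label]

plot.phylo(tree, cex = 0.8, font = 2, tip.color = named_lookup,
           label.offset = 0.0125)
axisPhylo(1)
```

In plot.phylo, cex controls the size of the tip labels, so it is the argument to adjust for the population names.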

Best,

Zhian
