Does gl.pcoa() use the distance matrix from gl.dist.pop()?


Dallas

Jul 9, 2021, 11:27:42 AM
to dartR
Hi there,

The PCoA plot I keep getting for my data in dartR is quite strange (all the points cluster along the axes), and I've spent much of my time over the last month trying to figure out if it's biologically correct, or if something is amiss with my code, etc.

I am wondering if the gl.pcoa() function actually uses the distance matrix I create with gl.dist.pop() to run the pcoa? Because I've done it without calculating a distance matrix first, and I get the same graph.

I also created a pcoa graph with just the ape package, and that pcoa, though graphically inferior, seems to show more of what I was expecting from my pcoa as far as separation between species. The pcoa() function in ape does require me to input the distance matrix, so I'm just wondering if my dartR pcoa is coming out weird because it isn't using the distance matrix?

Attached are images for the pcoa I get with dartR (in color), and the pcoa I get with ape (in black and white).

Thank you for any insight you can provide!
Dallas
ape_pcoa.png
dartR_pcoa.png

Arthur Georges

Jul 9, 2021, 8:44:40 PM
to da...@googlegroups.com
Hi Dallas,

That PCA of yours is quite weird, and so it does look as if there is something awry there. To start at the beginning of your questions, by my understanding:

(a) gl.pcoa is a wrapper for glPca of adegenet. This takes the SNP matrix from the genlight object and applies a PCA in the sense of Pearson 1901, that is, it calculates the correlation matrix (or what is essentially equivalent, the Euclidean distance matrix) across individuals and then generates the ordination from that. So with PCA, the interindividual genetic distance is the fodder for the analysis, which of course incorporates the interpopulation distances.

(b) gl.pcoa will also take a distance matrix, in which case it is a wrapper for PCoA in ape, that is, in the sense of Gower 1966. I think the distance matrix you handed to ape was calculated for populations, so the ordination is built from population-level distances rather than from the individual-level distances used in PCA. That is why you are getting a different graph.
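
To illustrate points (a) and (b), here is a minimal sketch in base R on simulated genotypes. It is not dartR's or ape's internal code: the names `geno` and `pop` and the allele frequencies are made up for illustration, and `cmdscale()` (classical scaling) is used as a stand-in for a PCoA without any eigenvalue correction. It suggests that a PCA on individuals and a PCoA on Euclidean distances between individuals give the same coordinates up to axis sign, while a PCoA on a population-level matrix yields only one point per population.

```r
# Simulated stand-in for a SNP matrix: 12 individuals x 50 loci, coded 0/1/2.
# Individuals 1-4, 5-8 and 9-12 belong to hypothetical populations A, B, C.
set.seed(1)
pop <- factor(rep(c("A", "B", "C"), each = 4))
geno <- matrix(
  rbinom(12 * 50, size = 2, prob = rep(c(0.2, 0.5, 0.8), each = 4)),
  nrow = 12,
  dimnames = list(paste0("ind", 1:12), NULL)
)

# (a) PCA on the individuals-by-loci matrix (centred, not scaled).
pca_scores <- prcomp(geno, center = TRUE, scale. = FALSE)$x[, 1:2]

# (b) PCoA (classical scaling) on Euclidean distances between individuals.
pcoa_ind <- cmdscale(dist(geno), k = 2)

# Axis signs are arbitrary, so compare absolute coordinates.
max_diff <- max(abs(abs(pca_scores) - abs(pcoa_ind)))

# PCoA on a population-level matrix (here: distances between the mean
# genotypes of each population) gives one row, i.e. one point, per population.
pop_means <- rowsum(geno, pop) / as.vector(table(pop))
pcoa_pop <- cmdscale(dist(pop_means), k = 2)

cat("largest PCA vs PCoA difference:", max_diff, "\n")
cat("points from individual matrix:", nrow(pcoa_ind), "\n")
cat("points from population matrix:", nrow(pcoa_pop), "\n")
```

With a non-Euclidean distance the individual-level PCoA would generally no longer match the PCA, so the equivalence shown here holds only for Euclidean distances.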

So a first port of call would be to generate a distance matrix for the individuals in your dataset and hand it to ape and see if you reproduce the result you got from dartR.

If the two approaches (glPca via dartR, and ape) give the same result, then the interindividual variation in your dataset is what is causing the difference between your two graphs.

If the two approaches give a different result, then we have a problem that will need some forensics.
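
One way to make the "same result or not" check concrete is sketched below in base R. The helper `same_ordination()` is hypothetical (it is not part of dartR or ape), and the data are simulated. It compares two sets of ordination scores axis by axis using the absolute correlation, so it ignores axis sign and overall scale; the tolerance is an arbitrary choice.

```r
# Hypothetical helper: do two ordinations agree axis by axis,
# ignoring reflection (sign) and scale?
same_ordination <- function(a, b, tol = 1e-6) {
  stopifnot(identical(dim(a), dim(b)))
  r <- vapply(
    seq_len(ncol(a)),
    function(j) abs(cor(a[, j], b[, j])),
    numeric(1)
  )
  all(r > 1 - tol)
}

# Simulated genotypes (20 individuals x 40 loci), for illustration only.
set.seed(42)
geno <- matrix(rbinom(20 * 40, size = 2, prob = 0.4), nrow = 20)

pca_scores <- prcomp(geno)$x[, 1:2]                              # PCA on individuals
pcoa_euc   <- cmdscale(dist(geno), k = 2)                        # PCoA, Euclidean
pcoa_man   <- cmdscale(dist(geno, method = "manhattan"), k = 2)  # PCoA, Manhattan

agree_euclidean <- same_ordination(pca_scores, pcoa_euc)
agree_manhattan <- same_ordination(pca_scores, pcoa_man)
```

In this sketch the Euclidean PCoA should agree with the PCA, whereas a PCoA built on a different distance is not expected to. Both score matrices need their rows in the same individual order for the comparison to be meaningful.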

Let me know how you go.

Arthur


Peter Kriesner

Jul 10, 2021, 1:17:10 AM
to da...@googlegroups.com
Hopefully a simple question: how do I sort or reorganise the individuals in my genlight object by pop (and then by individual id, although that isn't essential)? I'm wanting to produce a compoplot without the individuals all jumbled up as they are at the moment based on my initial data import.

Thanks,
Peter

Jose Luis Mijangos

Jul 11, 2021, 10:43:16 PM
to dartR
Hi Peter,

Can you try the following code? Please let us know whether it works for you.

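library(dartR)
library(tidyr)
library(ggplot2)
library(RColorBrewer)

# convert the genlight object to genind and run a DAPC
# (dapc() asks interactively how many PCs and discriminant functions to
# retain unless n.pca and n.da are supplied)
genind_test <- gl2gi(bandicoot.gl)
dapc_res <- dapc(genind_test)

# posterior membership probabilities, one row per individual
dapc.results <- as.data.frame(dapc_res$posterior)
dapc.results$pop <- pop(genind_test)
dapc.results$indNames <- rownames(dapc.results)

# long format: one row per individual x assigned population
dapc.results <- pivot_longer(dapc.results, -c(pop, indNames))
colnames(dapc.results) <- c("Original_Pop","Sample","Assigned_Pop","Posterior_membership_probability")

# one colour per population ("Dark2" provides at most 8 colours)
cols <- brewer.pal(n = nPop(genind_test), name = "Dark2")

# stacked bars, with individuals grouped by their original population
p <- ggplot(dapc.results, aes(x=Sample, y=Posterior_membership_probability, fill=Assigned_Pop))
p <- p + geom_bar(stat='identity')
p <- p + scale_fill_manual(values = cols)
p <- p + facet_grid(~Original_Pop, scales = "free")
p <- p + theme(axis.text.x = element_text(angle = 90, hjust = 1, size = 8))
p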
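
For the sorting part of the question (by population, then by individual id), a minimal sketch of the ordering step in base R follows. The data frame `meta` and its values are made up; with a real genlight object the two columns would come from pop() and indNames(). The last comment shows how the same index might be applied to a genlight object, assuming that row subsetting with `[` reorders individuals; that line is an untested assumption and is not run here.

```r
# Made-up per-individual metadata standing in for pop(gl) and indNames(gl).
meta <- data.frame(
  id  = c("s07", "s02", "s11", "s05", "s01", "s09"),
  pop = factor(c("West", "East", "West", "North", "East", "North")),
  stringsAsFactors = FALSE
)

# order() sorts by the first key (population, in factor-level order)
# and breaks ties with the second key (individual id).
idx <- order(meta$pop, meta$id)
sorted <- meta[idx, ]
print(sorted)

# Assumption, not run here: the same index applied to a genlight object,
#   gl_sorted <- gl[order(pop(gl), indNames(gl)), ]
```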

Cheers,
Luis

Arthur Georges

Jul 12, 2021, 1:05:51 AM
to dartR
I have added this to the issues on GitHub, and we will look at developing a function to do this for the next release.

Dallas

Jul 21, 2021, 5:07:41 PM
to dartR
OK, so I went ahead and created a new distance matrix based on individuals, and then re-ran my ape PCoA based on that. Unfortunately, the PCoA looks the same, except for more clutter thanks to all the individual labels. I've attached it for your reference.

Dallas

ape_pcoa_ind.PNG

Jose Luis Mijangos

Jul 22, 2021, 1:53:21 AM
to dartR
Hi Dallas,

We have updated the PCoA functions, which might now do the analysis the way you want (see code below). To use the updated functions, install the dev version of dartR as shown in the code.

Please let us know whether it worked for you.

Cheers,
Luis

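# install the development version of dartR from GitHub
library(devtools)
install_github("green-striped-gecko/dartR@dev")
library(dartR)
# distance matrix between individuals
test <- gl.dist.ind(bandicoot.gl)
# PCoA on that distance matrix, retaining two axes
test_2 <- gl.pcoa(test,nfactors=2)
# plot the ordination, using the genlight object for the population labels
res <- gl.pcoa.plot(test_2,bandicoot.gl)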

Dallas

Aug 17, 2021, 11:07:00 AM
to dartR
Hey Luis,

Sorry it's taken so long for me to try this! The PCoA this code generated definitely looks more like what we were expecting; however, it only shows one point per species. Do you know if there is a way to have all individuals represented on the PCoA?

I've attached an image of the PCoA for your reference.

Thank you so much for your help!

Dallas

forum_pcoa.PNG

Jose Luis Mijangos

Aug 23, 2021, 8:05:41 PM
to dartR

Hi Dallas,

I have looked at your data. I think you are working with knotweed, which is a polyploid species. dartR is designed to analyse diploid data, which is why you are getting weird results. A nice article describing software for analysing polyploid data is: Meirmans, P. G., Liu, S., and van Tienderen, P. H. (2018). "The analysis of polyploid genetic data." Journal of Heredity 109(3): 283-296.

Cheers,
Luis

Dallas

Aug 26, 2021, 2:08:06 PM
to da...@googlegroups.com
Thanks for the reference article. I appreciate your help!
