Hello,
First off, thank you for sharing poppr; we are really enjoying using it for our research. I just have a few questions regarding mlg.filter().
I have used poppr to identify MLGs in a GBS dataset using mlg.filter() with bitwise.dist() and the farthest threshold.
For the same dataset, we have also performed a maximum likelihood phylogenetic analysis using RAxML.
We mapped the MLGs onto the maximum likelihood tree and found one instance where an MLG appears to be nested within another.
The screen grab below shows the clade in question; an "X" on a branch denotes that the clade can be collapsed as an MLG.
After looking at the MLGs in more detail, it seems that two separate MLGs were identified.
MLG1: 000364 and 000149
MLG2: 000548, 000469, 000357, 000396, 000441, and 000297 (all remaining)
Am I correct in assuming this is simply due to differences in the relative relationships inferred by distance vs maximum likelihood?
I would be slightly more confident in the relative relationships inferred by our maximum likelihood tree. With this in mind, would it be appropriate to use pairwise branch lengths from the maximum likelihood tree as input for mlg.filter()?
I tried this using pairwise branch lengths calculated using cophenetic.phylo() from the R package ape.
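For reference, the general shape of this approach can be sketched as follows. This is only a rough illustration, not my exact code: the random tree from rtree(), the tip labels, and the `sample_order` vector are toy stand-ins, and the commented mlg.filter() call assumes its `distance` argument accepts a precomputed distance matrix and uses a placeholder `my_threshold`.

```r
library(ape)

# Toy stand-ins (assumptions): a random tree, plus the sample names in the
# order they would appear in the genotype object, e.g. indNames(genlight).
set.seed(1)
tre <- rtree(6, tip.label = c("000364", "000149", "000548",
                              "000469", "000357", "000396"))
sample_order <- sort(tre$tip.label)

# Pairwise branch-length (patristic) distances between all tips.
tree_mat <- cophenetic.phylo(tre)

# cophenetic.phylo() follows the tip order of the tree, so reorder rows and
# columns to match the sample order of the genotype data.
tree_mat <- tree_mat[sample_order, sample_order]
tree_dist <- as.dist(tree_mat)

# Scale of the tree-based distances, for comparison with bitwise.dist().
mean(tree_dist)

# With poppr loaded, the matrix would then replace the distance function
# (my_threshold is a placeholder):
# mlgs <- mlg.filter(genlight, threshold = my_threshold,
#                    distance = tree_mat, algorithm = "farthest_neighbor")
```

The reordering step matters because the tip order of the tree does not necessarily match the sample order of the genotype data.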
With my first attempt, the plot didn't seem to work (see below).
Comparing the cophenetic.phylo() distance matrix with the one produced by bitwise.dist(), the cophenetic.phylo() distances were much smaller:
> mean(bitwise.dist(genlight))
[1] 0.09135868
> mean(as.dist(cophenetic.phylo(tre)))
[1] 0.001694099
If I crudely multiply the cophenetic.phylo() distance matrix values by 10 and repeat, I get a plot more similar to what I would expect, along with reasonable MLGs.
Do the distance values need to be above a certain magnitude to be used?
I hope this makes sense; I am happy to send example code if that is easier.
Best wishes,
Ollie