Hi,
I have another question about BUCKy. I’m trying to see whether the rationale behind BUCKy really fits my work.
Let me explain. The principal task in my work is to “delineate new species”. I work on fungi. Fungal molecular taxonomists generally do this by looking for highly supported monophyletic clades, a practice based on the Phylogenetic Species Concept (PSC). As the number of markers increases, the criterion for recognizing such species becomes the concordance between different genes: we expect the branches leading to different species to be concordant, and the branches within species to conflict, because members of a species recombine (see the attached figure, taken from Taylor et al. (2000)). Although they conflict with one another, the branches within a species should always support the monophyly of that species, hence the expectation of a highly supported node at the origin of the species under the classical framework of molecular phylogenetics. That is where I find it difficult to fit in Bayesian Concordance Analysis (BCA).
If I understand BCA correctly, the concordance between genes for a “clade” is evaluated based on the “split” at the origin of that clade (this is what I understand from Ané et al. (2007)). While a split characterizes the separation between two groups of branches, a monophyletic clade classically represents a group sharing a common ancestor. When several recombining individuals belong to the same species, the split at the origin of the species can actually have a low concordance factor.
For example, in the attached figure, under the phylogenetic species criterion, ABCD and WXYZ belong to two different species, and the split ABCD|WXYZ should have a high concordance factor as well as high bootstrap support/posterior probability. Within species, however, the clades [ABCD] and [WXYZ] can each have a low concordance factor, because inside them there can be different splits (A|BCD, AB|CD or ABC|D for the clade [ABCD], and W|XYZ, WX|YZ or WXY|Z for the clade [WXYZ]) supported by different sets of genes. Yet these clades should have high bootstrap support/posterior probability under the classical phylogenetic framework.
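To make the example concrete, here is a small Python sketch that counts how often each split appears across a handful of gene trees. The four gene trees are made up for illustration and are not my data. The proportion it computes is only a naive sample frequency, not the Bayesian concordance factor that BUCKy estimates. Splits are written over the full taxon set, so AB means AB|CDWXYZ.

```python
from collections import Counter

TAXA = frozenset("ABCDWXYZ")

def split(side):
    """Return a canonical bipartition: the unordered pair {side, complement}."""
    side = frozenset(side)
    return frozenset({side, TAXA - side})

# Hypothetical gene trees, each given as the set of its non-trivial splits.
# Every tree contains ABCD|WXYZ but resolves the inside of each group differently.
gene_trees = [
    {split("ABCD"), split("AB"), split("CD"), split("WX"), split("YZ")},
    {split("ABCD"), split("AC"), split("ABC"), split("WY"), split("WXY")},
    {split("ABCD"), split("AD"), split("BC"), split("WZ"), split("XY")},
    {split("ABCD"), split("AB"), split("CD"), split("WY"), split("WXY")},
]

def split_frequencies(trees):
    """Proportion of trees in which each split occurs (naive, not Bayesian)."""
    counts = Counter(s for tree in trees for s in tree)
    return {s: c / len(trees) for s, c in counts.items()}

freqs = split_frequencies(gene_trees)
for name in ["ABCD", "AB", "CD", "AC", "WX", "WY"]:
    print(f"{name}|rest: {freqs.get(split(name), 0.0):.2f}")
```

In this toy data the split ABCD|WXYZ is present in every gene tree, while each split inside [ABCD] or [WXYZ] is present in at most half of them.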
When I started using BUCKy for my work, I assumed that the clade representing a species of interest should have a high concordance factor, since the clade should be supported throughout the genome. However, a thorough examination of the results showed the contrary in some cases, while simple ML or Bayesian analyses of single or concatenated genes tended to give these clades high support.
So my interpretation is that BUCKy is tailored for building “species trees” in which few samples are drawn from each species, and that when there are several individuals per species it may not be the best tool to “designate” new species.
I would appreciate views from other users and from the developers of the software and of the algorithm behind it.
Cited references.
Ané, C., Larget, B., Baum, D.A., Smith, S.D., Rokas, A., 2007. Bayesian estimation of concordance among gene trees. Mol. Biol. Evol. 24, 412–426. https://doi.org/10.1093/molbev/msl170
Taylor, J.W., Jacobson, D.J., Kroken, S., Kasuga, T., Geiser, D.M., Hibbett, D.S., Fisher, M.C., 2000. Phylogenetic Species Recognition and Species Concepts in Fungi. Fungal Genet. Biol. 31, 21–32. https://doi.org/10.1006/fgbi.2000.1228