--
You received this message because you are subscribed to a topic in the Google Groups "beast-users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/beast-users/3e_IKShCH3s/unsubscribe.
To unsubscribe from this group and all its topics, send an email to beast-users...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/beast-users/cc073415-9cad-4bd4-ac7c-fdfda8f003afn%40googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/beast-users/a3a40437-a60f-461d-99b7-dace48c35e74n%40googlegroups.com.
Luke,
Don’t worry, I do understand you, believe me. I agree with most of what you say: using epsilon to inform species delimitation is a bad idea and was never the intention. That use is based on a misinterpretation of what epsilon is, so I will spend some text trying to explain it.
Coalescent theory models allele histories in populations, and in its simplest form it assumes no recombination, no natural selection, and no gene flow or population structure. Rannala and Yang (2003) connected such populations in a tree. This model is now known as the multispecies coalescent (MSC) model. The choice of the word “multispecies” is unfortunate, and likely an important source of the confusion in this discussion and elsewhere, because what the model really does is connect multiple ideal populations in a tree, which is not what most biologists think of when they use the term ‘species’. As we all know, there are many species concepts, and there is also disagreement about whether ‘species’ is a real biological entity with unique properties or not. But at the least it is a taxonomic rank: taxonomic species are given binomial names, and species status can have legal implications. Some species (Homo sapiens) are paradigmatic, whereas others are controversial. For example, Darwin, despite the title of his most well-known book, considered “… species as one arbitrarily given for the sake of convenience to a set of individuals closely resembling each other, and that it does not essentially differ from the term variety”. In contrast, both Linnaeus and Ernst Mayr considered species to be real units, created by God in the former case and the basic units of evolution in the latter. Much of modern biology is highly influenced by the Mayrian concept. Nevertheless, a unified, explicit, and universal model that biologists can use with their data to test hypotheses is lacking, and even on the conceptual side there is a lot of disagreement (even if the ‘lineage’ concept sensu de Queiroz and others is gaining acceptance, it lacks operational criteria).
So, we are faced with a semantic problem here. Although we agree that the MSC models populations, the inclusion of the term ‘species’ has caused confusion. Say we have a set of alleles. Assuming that these alleles evolved according to the coalescent assumptions, were they sampled from one or from several such populations? We can use the MSC to compare the likelihood of our data under the different models, for example with model selection methodology such as the likelihood ratio test. In a Bayesian framework, reversible-jump MCMC can be performed, where branches are collapsed and expanded. In a simple case, we may just compare one population (‘species’) versus two. The likelihood of the “two” case will also be affected by the depth of the split. In DISSECT/STACEY/Speedemon the number of tips is kept constant, which has some mathematical and computational advantages. This is implemented by using a modified branching model. Instead of the usual birth (or birth/death) model, in which the probability of branching events is a priori assumed to be constant across the branches, very high prior probabilities are assigned to very shallow branching events (defined by epsilon), so shallow that they can be assumed to approximate zero depth. So, if our sampled alleles are informative enough about deeper branching events, we will conclude that we have two populations (‘species’). If our data are not informative, then we conclude that we have one (actually, it would be more correct to say that we do not have evidence for more than one, and that the statistical power is low). Note though that, for computational reasons, there must be a measurable split height, which is defined by epsilon, and as is clearly stated in the STACEY/DISSECT documentation, this should be as small as possible. This is the intended use of epsilon, and IT IS NOT ARBITRARY in this sense. It is zero split height (approximately). I hope that I have made myself clear now.
Conversely, you are absolutely right that using it as an arbitrary threshold is not qualitatively different from using single-locus thresholds, gdi, or whatever else you like.
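To make the role of epsilon concrete, here is a minimal Python sketch of a prior on a single split height that puts a lot of mass on near-zero heights. It is a toy, not the density actually used in DISSECT/STACEY/Speedemon: the exponential "slab" merely stands in for the birth/death part, and the function and parameter names (epsilon, collapse_weight, rate) are my own assumptions for illustration.

```python
import math


def spike_and_slab_log_prior(height, epsilon=1e-4, collapse_weight=0.5, rate=10.0):
    """Log prior density of one split height under a toy two-part mixture.

    Spike: Uniform(0, epsilon), carrying prior mass `collapse_weight`.
    Slab:  Exponential(rate), a stand-in (assumption) for a birth/death density.
    """
    if height < 0:
        return -math.inf
    slab = (1.0 - collapse_weight) * rate * math.exp(-rate * height)
    spike = collapse_weight / epsilon if height < epsilon else 0.0
    return math.log(spike + slab)


def is_collapsed(height, epsilon=1e-4):
    """A split shallower than epsilon is treated as having (approximately) zero height."""
    return height < epsilon


def posterior_collapse_fraction(sampled_heights, epsilon=1e-4):
    """Fraction of posterior samples in which the split is collapsed (one population)."""
    collapsed = sum(1 for h in sampled_heights if is_collapsed(h, epsilon))
    return collapsed / len(sampled_heights)


if __name__ == "__main__":
    # Hypothetical posterior samples of the split height between two sets of alleles.
    heights = [5e-5, 0.02, 3e-5, 0.1]
    print(posterior_collapse_fraction(heights))
```

The smaller epsilon is, the narrower and taller the spike becomes, so "collapsed" comes closer to meaning zero split height. Raising epsilon to some biologically motivated value would turn it into exactly the kind of arbitrary threshold discussed above.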
So why should you use DISSECT/STACEY/Speedemon if you don’t want to delimit ideal populations? I think the main advantage is that you can infer multilocus phylogenies under fully parameterized stochastic models without a priori knowledge of which populations your samples should be assigned to. It should be obvious that using traditional taxonomic species delimitations would usually be a bad idea, as those are usually expected to be more inclusive than local, perfect or near-perfect populations. But if you define species as the branches of an MSC tree, you can certainly also delimit them with empirical data. The future will tell if we can model the branches more like what we perceive species to be, or if they are better viewed as clades in the phylogenetic tree framework, or as something more elaborate.
So, in summary, epsilon should be used as an approximation of zero, and even if it technically can be used as an arbitrary threshold (which would not be an approximation of zero), such a threshold has no theoretical justification. It seems to me that you and others have interpreted “species delimitation” as something which should be applied to real Mayrian entities in nature. Personally, I lean towards the Darwinian interpretation of the word, a convenient taxonomic rank among others.