My question is more about experimental design than program usage.
Has anybody explored the effect of sampling bias on tMRCA estimation?
I have seen published statements that the data set was pruned to avoid
unequal taxonomic repetition, but no reference was given. Intuitively this
seems sensible, but is there evidence?
In my case this is relevant for two reasons:
1) I'm planning to use calibration points from a sister group (also
estimated using BEAST).
The question is now whether I should add the entire data set of this sister
group to my data, or whether it is sufficient to include the minimum number
of taxa that represent the nodes for which tMRCA estimates exist.
2) In my own data set, a number of species are represented by multiple
haplotypes extracted from large phylogeographic data sets whereas other
species are only represented by a single individual. Should I prune the
data set so that each species and clade is equally represented?