Hi Rehan,
1. Missing character states are treated as a lack of knowledge of what the state is -- as if you simply hadn't looked at the character. This can be said in several equivalent ways:
- Pr(missing data|s) = 1 for every state s
- a missing state is the union of all possible states. For DNA, N = A \union T \union C \union G.
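As a rough illustration of why these statements are equivalent, here is a minimal Python sketch. It is not RevBayes code; the function names are hypothetical, and the Jukes-Cantor (JC69) model and the branch length of 0.3 are assumptions chosen only to keep the example small. A missing tip gets a partial-likelihood vector of all ones, so the term it passes up its branch is 1 for every parent state.

```python
import math

STATES = "ACGT"

def tip_partial(observed):
    """Pr(observation | true state s) for each s in ACGT.

    A missing state ("N" or "?") is compatible with every state,
    so its vector is all ones.
    """
    if observed in ("N", "?"):
        return [1.0, 1.0, 1.0, 1.0]
    return [1.0 if s == observed else 0.0 for s in STATES]

def jc69_matrix(t):
    """JC69 transition probabilities for a branch of length t (assumed model)."""
    e = math.exp(-4.0 * t / 3.0)
    same = 0.25 + 0.75 * e
    diff = 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]

def branch_message(P, partial):
    """Pr(data below the branch | parent state i) = sum_j P[i][j] * partial[j]."""
    return [sum(P[i][j] * partial[j] for j in range(4)) for i in range(4)]

P = jc69_matrix(0.3)
missing_msg = branch_message(P, tip_partial("N"))   # rows of P sum to 1
observed_msg = branch_message(P, tip_partial("A"))  # depends on the parent state
```

Because every entry of `missing_msg` is 1, multiplying it into the likelihood at the parent node changes nothing, whereas `observed_msg` does depend on the parent state.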
2. Missing states are presumed to be "missing completely at
random" (MCAR), so that the probability of being missing is
independent of both observed and unobserved features of the data.
This assumption might be violated if character states are more
likely to be missing from (for example) faster-evolving
characters. I see you have gamma rate heterogeneity in your
model, so if characters are preferentially missing in one of the
gamma bins, that could possibly be problematic. It's not a problem
for some taxa to have more missing characters than others, though.
3. If the model is not violated -- including the missing-completely-at-random assumption -- then missing data shouldn't "pull" taxa anywhere. Instead, you might consider whether (a) the prior favors balanced topologies over unbalanced ones (or the reverse) and (b) the odd features of the tree estimate are weakly supported. People often show consensus trees, but if the posterior probability of a feature is only (say) 0.7, then it is only weakly supported.
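To make the 0.7 example concrete, here is a small Python sketch of how the posterior probability of a clade can be estimated as the fraction of sampled trees that contain it. The representation (each sampled tree as a set of clades, each clade a frozenset of taxon names), the function name, and the toy sample are all assumptions for illustration; this is not how RevBayes stores or summarizes trees.

```python
def clade_support(sampled_trees, clade):
    """Fraction of sampled trees that contain the given clade."""
    clade = frozenset(clade)
    hits = sum(1 for tree in sampled_trees if clade in tree)
    return hits / len(sampled_trees)

# Toy posterior sample (made up): 7 trees group A with B, 3 group A with C.
ab = frozenset({"A", "B"})
ac = frozenset({"A", "C"})
samples = [{ab}] * 7 + [{ac}] * 3

support = clade_support(samples, {"A", "B"})
```

Here the (A, B) grouping would appear in a consensus tree even though 30% of the sampled trees contradict it.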
Does that help?
-BenRI