Hi,
I am a computational linguist working on applying phylogenetic methods to language and trying to understand them.
I was trying to understand how to apply the pruning algorithm to an FBD (fossilized birth-death) tree when an internal node is sampled. In the case of languages, a sampled ancestor is an ancient language such as Latin, which has the modern Romance languages as extant descendants. In linguistic phylogenetics, we typically work with binary data.
Here is my understanding of the pruning algorithm applied to an FBD tree. If Latin is an internal node with one incoming edge and only one outgoing
edge, then the likelihood vector at the Latin node is first calculated as
usual, as if the site's state were unobserved. Then the likelihood vector
is multiplied element-wise by the observed vector for Latin. "Observed" here means the sequence for Latin given in the NEXUS file.
For instance, let's say the observed Latin value at a site (a column in
the alignment matrix) is 1. This means that the character state vector for
Latin at that site is [0, 1].
Let's say the computed likelihood vector at the Latin node for the states 0 and
1 is [x, y], where x and y are real numbers, usually less than 1. Then the
element-wise product of the observed and the computed likelihood
vectors is [0*x, 1*y], which is [0, y]. If the site has a missing character
for Latin (encoded as [1, 1]), then the element-wise product
would be [x, y], since multiplying by [1, 1] does not change the computed
likelihood vector. Is this the way the pruning algorithm is applied to an FBD tree with sampled ancestors?
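
To make the question concrete, here is a minimal Python sketch of the calculation I have in mind. The symmetric two-state substitution model, the rate, the branch length and the child's likelihood vector are all made-up assumptions for illustration, and the function names are hypothetical; none of this is taken from an actual FBD implementation.

```python
import math


def transition_matrix(rate, t):
    """P(t) for a symmetric two-state model (an assumption for this sketch)."""
    e = math.exp(-2.0 * rate * t)
    same, diff = 0.5 * (1.0 + e), 0.5 * (1.0 - e)
    return [[same, diff], [diff, same]]


def propagate(child_partial, rate, t):
    """Usual pruning step along one branch: likelihood vector at the parent end."""
    P = transition_matrix(rate, t)
    return [sum(P[i][j] * child_partial[j] for j in range(2)) for i in range(2)]


def sampled_ancestor_partial(child_partial, observed, rate, t):
    """Node with a single child: compute [x, y] as if unobserved,
    then multiply element-wise by the observed vector."""
    computed = propagate(child_partial, rate, t)  # this is [x, y]
    return [o * c for o, c in zip(observed, computed)]


# Hypothetical likelihood vector coming up from the single child below Latin.
child = [0.2, 0.7]
x, y = propagate(child, rate=0.5, t=1.0)

# Latin observed as 1 at this site: observed vector [0, 1] gives [0, y].
latin_state_1 = sampled_ancestor_partial(child, [0.0, 1.0], rate=0.5, t=1.0)

# Latin missing at this site: observed vector [1, 1] leaves [x, y] unchanged.
latin_missing = sampled_ancestor_partial(child, [1.0, 1.0], rate=0.5, t=1.0)
```

In this sketch, `latin_state_1` is [0, y] and `latin_missing` is [x, y], and whichever vector results is then passed up the incoming edge of Latin in the same way as for any other node.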
Best,
Taraka.