Implementation of triple mixture model


Keren Halabi

Aug 12, 2019, 7:49:18 AM
to Bio++ Usage Help Forum
Dear Bio++ team,

I would be grateful for some consultation regarding an implementation I am struggling with.

I would like to implement a codon model with 3 levels of mixture, in decreasing level of computation:
1) mixture over branches
2) mixture over sites between two scenarios (hypotheses) - the two scenarios differ in the value of an exponent parameter k: in one scenario the omegas used in (3) are raised to the k'th power, and in the other they keep their original values.
3) mixture over sites with respect to the selective pressure (i.e., omega)

Please find attached a figure that attempts to further clarify what I want. The approach described in the document matches the first solution I considered.

The first and third levels are easy to apply:
1 - can be done in Bio++ with a non-homogeneous model with multiple copies of my sub-model (see (2)) that have shared (aliased) parameters
3 - can be done by having my sub-model inherit from YNGP_M and call a MixtureOfASubstitutionModel instance in the constructor (as done in YNGP_M2, for example).

I am struggling specifically with the second level. I considered three options, and here are my obstacles in each one. In all cases, I add a proportion parameter.

* Add another layer of sub-model to wrap the existing one, in which the added parameter will be included
Let the original model which consists of a mixture over omega categories be M. I create a new model MProp in which the proportion parameter is included, and from its constructor I create a mix over 2 models of type M where in one k=1 and in the other it is y (similarly to how mixture over YN98 models is done in YNGP_M2, for example).
The problem: If I want to restrict the likelihood computation such that the omega category doesn't change between branches per site, I need to be able to access the mixture level in the sub-model M. I don't know how to do that or if it's even possible.

* Join levels 2 and 3:
When using the MixtureOfASubstitutionModel instance in my sub-model's constructor, instead of setting x omega categories, I will set 2x, and adjust their proportions to be a product of the omegas proportion and the added proportion parameter such that:
p(omega0_scenario1) = p0*prop
p(omega1_scenario1) = p1*prop
p(omega2_scenario1) = p2*prop
p(omega0_scenario2) = p0*(1-prop)
p(omega1_scenario2) = p1*(1-prop)
p(omega2_scenario2) = p2*(1-prop)
The problem: I don't know how to restrict the theta parameters that are created automatically with a dynamic parameter such as the added one
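To make the constraint concrete, here is a minimal sketch of the joint proportions listed above, together with one possible way of deriving theta values from them. It is plain C++ and does not use Bio++; the helper names `jointProbabilities` and `stickBreakingThetas` are hypothetical, and the stick-breaking form theta_i = q_i / (1 - sum of earlier q_j) is an assumption about how the automatically created thetas are parameterized, to be checked against the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Bio++): joint probabilities of the 2x
// categories, ordered as all scenario-1 categories (weight prop) followed
// by all scenario-2 categories (weight 1 - prop).
std::vector<double> jointProbabilities(const std::vector<double>& p, double prop)
{
  assert(prop >= 0.0 && prop <= 1.0);
  std::vector<double> q;
  q.reserve(2 * p.size());
  for (double pi : p) q.push_back(pi * prop);
  for (double pi : p) q.push_back(pi * (1.0 - prop));
  return q;
}

// Hypothetical helper: converts n probabilities into n-1 theta values using
// a stick-breaking rule, theta_i = q_i / (1 - sum_{j<i} q_j).
// ASSUMPTION: this parameterization is illustrative, not confirmed here.
std::vector<double> stickBreakingThetas(const std::vector<double>& q)
{
  std::vector<double> theta;
  double remaining = 1.0;  // probability mass not yet assigned
  for (std::size_t i = 0; i + 1 < q.size(); ++i)
  {
    theta.push_back(remaining > 0.0 ? q[i] / remaining : 0.0);
    remaining -= q[i];
  }
  return theta;
}
```

Under this assumption, all 2x - 1 thetas become deterministic functions of p0..p(x-1) and prop, so only x free parameters remain (x - 1 for the omega proportions, one for prop).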

* Consider level 2 manually:
Override the functions that handle transition matrices:
double Pij_t    (size_t i, size_t j, double d) const;
double dPij_dt  (size_t i, size_t j, double d) const;
double d2Pij_dt2(size_t i, size_t j, double d) const;
const Matrix<double>& getPij_t(double t) const;
const Matrix<double>& getdPij_dt  (double d) const;
const Matrix<double>& getd2Pij_dt2(double d) const;
In all the above functions, I could consider the proportion parameter by computing a mixed transition matrix over the two scenarios (similarly to what is done in AbstractMixedSubstitutionModel::getPij_t).
The problem here is also related to the hypernodes: When using my sub-model in a non-homogeneous framework the likelihood computation is done with the function RNonHomogeneousMixedTreeLikelihood::computeTransitionProbabilitiesForNode which computes the transition probabilities by directly accessing the transition matrix of each omega category separately, thereby ignoring the prop parameter which is incorporated into the functions listed above, in the instance of the sub-model.
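For reference, the mixing step I have in mind for this third option can be sketched as follows. This is plain C++ rather than Bio++ code: the `Mat` alias and the function `mixTransitionMatrices` are hypothetical, and the two input matrices stand for the transition matrices of the same omega category under each scenario.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Mat = std::vector<std::vector<double>>;

// Hypothetical helper (not Bio++ API): element-wise mixture
//   P(t) = prop * P1(t) + (1 - prop) * P2(t)
// where P1 and P2 are the transition matrices under scenario 1 and 2.
Mat mixTransitionMatrices(const Mat& p1, const Mat& p2, double prop)
{
  assert(prop >= 0.0 && prop <= 1.0);
  assert(p1.size() == p2.size());
  Mat mixed(p1.size());
  for (std::size_t i = 0; i < p1.size(); ++i)
  {
    assert(p1[i].size() == p2[i].size());
    mixed[i].resize(p1[i].size());
    for (std::size_t j = 0; j < p1[i].size(); ++j)
      mixed[i][j] = prop * p1[i][j] + (1.0 - prop) * p2[i][j];
  }
  return mixed;
}
```

Because prop does not depend on t, the same weighted sum applies to the first and second derivative matrices, and each row of the mixed matrix still sums to 1 when the rows of both inputs do.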

Any solution to any of the options I presented here would help me a great deal.

Many thanks!
Keren
 


tripleMixtureStructFig.pdf

Laurent Guéguen

Aug 21, 2019, 6:55:34 AM
to Bio++ Usage Help Forum
Hi Keren,

I took some time to think about it, and in my opinion the best practical way would be the second solution,
with a mixture over both parameters.

The best (and most elegant) solution would be the first, i.e. a mixture of mixtures, but as you guessed, in the current implementation it would not
be possible to link the omegas (and I have no time these days to do this).

So, for the second solution, could you specify to me your problem with parameters more formally? There may be a tricky solution to handle this.


Also, for level 3, why does your submodel inherit from YNGP_M? YNGP_M already holds a link to a mixed model, so I do not
understand why your submodel should do the same.

Cheers,
Laurent




