Hi Yifei,
I will let Laurent answer the codon part, I'll take the optimization one :)
On Mon, Oct 29, 2012 at 6:05 PM, Yifei Huang <
huangy...@gmail.com> wrote:
> 1) I am wondering how constraints are handled by newton-like optimizers. If
> my understanding is correct, in each step of optimization most of the
> newton-like algorithms firstly calculate a searching direction based on the
> gradient and Hessian and then perform a linear search. How is the searching
> direction determined if the constraint is reached?
In all optimizers, there are two general way to handle constraints
1) the "push-it-to-the-bound" way (default): after each optimization
step, the optimizer proposes a new set of values. If some of these
values fall outside the allowed interval, the corresponding value is
set to the closest bound. The new set of parameters is then forwarded
to the function for evaluation.
2) the "parameter transform" way, which is the only
mathematically-correct way, yet slower in practice. This uses
parameter transforms to get a function defined on -inf,+inf.
>
> 2) Which algorithm is used to perform linear search in multiple dimensional
> optimizers?
The "direction" procedure is only used for Gradient functions (and
Powell's method). There it uses a Brent optimization procedure. The
newton method however does not use such linear search: it approximates
the local curve by a paraboloid, using the hessian matrix. Then it
gets the minimum of the paraboloid and uses it as a new point. Newton
algorithms are most efficient for function that are "paraboloid-like"
(it finds the minimum of a paraboloid function in one step only). In
practice, homogeneous likelihoods are very close to paraboloid curves,
hence the success of the method.
The "push-it-to-the-bound" method works well with newton. It has,
however, horrible results with direction functions, as the constraints
break the linearity of the transform.
>
> 3) I am not sure how to use MetaOptimizer. I think you mentioned somewhere
> that in MetaOptimizer we can use newtown-like optimizers firstly and then
> refine the result by derivative-free optimizers when we are close to the
> Maximum. This method might be faster because numerical instability of
> numerical differentiation might hurt the convergence. However, I don't know
> how to implement it. Could you please tell me where I can find an example?
A MetaOptimizer does not do this. What you suggest can be done by
calling two optimizers in a raw, the first one with a higher tolerance
(for instance 1e-4), and the second one with a finer one (for instance
1e-6).
What the MetaOptimizer does is to use different optimization procedure
for various set of parameters. Typically, a Newton optimizer for
parameters for which we have analytical derivatives, and a Brent
optimizer for parameters for which no analytical derivative is
available.
I don't think there is an example of use available. A MetaOptimizer is
used in OptimizationTools::optimizeNumericalParameters, maybe that can
help? The idea is that we have to tell the MetaOptimizer which
parameter should be optimized with what. This is achieved through a
class called "MetaOptimizerInfo", which has a addOptimizer method.
Once you have instanciated such an info class, you pass it to the
constructor of the MetaOptimizer together with the function.
Hope this helps a bit, do not hesitate if you have further questions.
>
> Sorry about so many questions. I learnt a lot from you guys and the source
> code of Bio++. Without Bio++, it is almost impossible for me to develop my
> own models because of the very steep learning curve of numerical algorithms
> and the obscure code in most C based phylogenetic softwares. I appreciate
> your works a lot.
This is good to know :)
Cheers,
Julien.