
What is the stopping rule for numerical optimization in PAML?


Tae-Kun Seo

Oct 7, 2024, 2:22:34 AM
to PAML discussion group
Dear Ziheng (or users who know this issue well), 

When PAML searches for maximum likelihood estimates (MLEs) by climbing the surface of the likelihood function, what is the stopping rule for the numerical optimization? Is it convergence of the MLEs or convergence of the log-likelihood (I would guess the latter), and what is the tolerance used to determine convergence?

I'm trying to analyze rather large data sets (several dozen taxa) with codeml under the branch model (i.e., each branch has its own omega value). However, it takes a very long time (for some data sets it never stops), and I'm thinking of making the stopping rule (or tolerance) less strict to reduce computation time. Can I do this easily by editing the control file, or by changing and recompiling the source files?

Also, it seems that gamma rate heterogeneity is not allowed for the branch model. Can I enable it by changing and recompiling the source files? I understand it would require much more computation time.


Thanks in advance. 
Tae-Kun Seo

Sandra AC

Oct 31, 2024, 4:01:48 AM
to PAML discussion group

Dear Tae-Kun,

Thanks for your message! Please find below some suggestions that Ziheng has given on previous occasions:

*  *  *

The following code can be found in `tools.c`:

```c
int H_end(double x0[], double x1[], double f0, double f1,
          double e1, double e2, int n)
/* Himmelblau termination rule.  Return 1 for stop, 0 otherwise. */
```

The stopping rule used in CODEML is the Himmelblau termination rule, as the comment above indicates. The rule is based on the idea that both the changes in the variable values and the change in the objective function should be small. That said, the tolerance used by default in CODEML is probably too stringent (i.e., the small value used in the algorithm is actually too small). If you want to change this setting, open the file `codeml.c` and find the following lines:

```c
if (com.method == 1)
   j = minB(noisy > 2 ? frub : NULL, &lnL, x, xb, e, com.space);
else if (com.method == 3)
   j = minB2(noisy > 2 ? frub : NULL, &lnL, x, xb, e, com.space);
else
   j = ming2(noisy > 2 ? frub : NULL, &lnL, com.plfun, NULL, x, xb, com.space, e, np);
```


Once you locate the code above, update it as shown below and recompile the program:

```c
if (com.method == 1)
   j = minB(noisy > 2 ? frub : NULL, &lnL, x, xb, e*10, com.space);
else if (com.method == 3)
   j = minB2(noisy > 2 ? frub : NULL, &lnL, x, xb, e*10, com.space);
else
   j = ming2(noisy > 2 ? frub : NULL, &lnL, com.plfun, NULL, x, xb, com.space, e, np);
```


In the code, `e` is set to `1e-8`, so `e*10` is `1e-7`. You may also want to try `e*100`, which could work well too. Please note that the stopping rule uses relative error and absolute error at the same time. The idea is roughly

|x* - x| < (|x| + 1) e

so that relative error is used when x is large (much larger than 1) and absolute error is used when x is close to 0.
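
As a rough illustration of the idea above (a minimal sketch, not the exact code in `tools.c`; the name `himmelblau_stop` is made up here), a Himmelblau-style test combining the parameter-change and function-change checks might look like this:

```c
#include <math.h>

/* Sketch of a Himmelblau-style termination test: stop (return 1) only if
   the change in every parameter and the change in the objective function
   are both small relative to their current magnitudes, with an absolute
   fallback when a magnitude is close to 0.  `e` plays the role of the
   tolerance discussed above (e.g. 1e-8 by default, 1e-7 if relaxed). */
int himmelblau_stop(const double x0[], const double x1[],
                    double f0, double f1, double e, int n)
{
   int i;
   for (i = 0; i < n; i++)
      if (fabs(x1[i] - x0[i]) >= (fabs(x0[i]) + 1) * e)
         return 0;                       /* some parameter is still moving */
   if (fabs(f1 - f0) >= (fabs(f0) + 1) * e)
      return 0;                          /* objective is still changing */
   return 1;                             /* both criteria satisfied: stop */
}
```

In a test of this kind, relaxing `e` directly trades optimization accuracy for run time, which is why multiplying `e` by 10 or 100 in `codeml.c` shortens very long runs.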

If you use site models such as M7 or M8, which have many parameters to optimise (or if other users see this message and want to run M7 or M8 on large datasets where the analyses take too long to finish), you can first estimate the branch lengths under M0 while fixing the tree topology of interest. Then use the same tree topology with those branch lengths (estimated under M0) as initial values, making sure you add the option `fix_blength = 1` in the control file to enable this (see the illustrative control-file excerpt after the notes below). Nevertheless, enabling `fix_blength = 1` can still lead to long runs because the stopping rule is quite stringent, so you can modify the source code as mentioned above, recompile the program, and run again. Some notes on options `method` and `fix_blength`:

- When using `method = 1`, the program optimises the branch lengths with the other parameters (kappa, p, q, w) fixed, and then optimises the other parameters (using BFGS) with the branch lengths fixed. The iteration is very slow because the two sets of parameters are correlated, so the algorithm zigzags forever; this process will, however, give you the MLEs under M7/M8. If you want to save time and get approximate values under M7/M8, you can use `fix_blength = 3`. With this option, CODEML assumes that the M7/M8 branch lengths are proportional to the branch lengths in the tree file (in other words, the branch lengths under M0), so it estimates a scale factor c together with the parameters p, q, w, etc. of M7/M8. The iteration takes place under BFGS with 4 parameters for M7 (c, kappa, p, q) and 7 for M8 (c, kappa, prob, p, q, w_s). From previous observations, the lnL may be higher when you set `fix_blength = 1` (i.e., the estimates are MLEs) than when you set `fix_blength = 3` (the estimates are approximate).
- In the PAML documentation (p. 26), there is a note about option `fix_blength`: "The value 3 (proportional) means that branch lengths will be proportional to those given in the tree file, and the proportionality factor is estimated by ML". Note that no systematic tests have been done, so it is hard to know when the approximation will be good or bad. Nevertheless, it is likely to be good when sequences are similar and/or most sites are conserved with w < 1.
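
To put the two-step workflow above together, the relevant lines of the control file for the second (M7/M8) run might look roughly like the excerpt below. This is only an illustration: the file names are placeholders, only the options discussed here are shown, and everything else in your usual `codeml.ctl` would stay unchanged.

```
      seqfile = alignment.phy             * placeholder sequence file
     treefile = m0_tree_with_brlens.nwk   * tree with branch lengths estimated under M0
        model = 0                         * one omega for all branches (required for site models)
      NSsites = 7 8                       * M7 (beta) and M8 (beta&omega)
  fix_blength = 1                         * use branch lengths in the tree file as initial values
                                          * (use 3 to treat them as proportional, estimating c)
       method = 0                         * simultaneous optimisation; method = 1 alternates
                                          * branch lengths and the other parameters
```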

*  *  *

Hope you find these suggestions useful, and please let us know whether updating and recompiling the code as suggested above works!

All the best,
Sandy
