Initial omega values for Codeml Models

Vigzy 77

Mar 14, 2019, 10:23:42 AM

to PAML discussion group

Hi,

I'm running codeml branch and branch-site analyses and have noticed that many tutorials recommend running each model at least twice, once with a starting omega less than 1 and once with a starting omega greater than 1, and checking that the results are consistent.

I am estimating branch lengths under the M0 model and using them as initial values (fix_blength=1) for the ML iteration under these more complex models. Since I am fixing my initial values, will running the analysis several times change the output?

I noticed that the reason we run it more than once is that codeml uses random starting values, unlike in my case.

Regards,

Vigzy

cajawe

Mar 15, 2019, 4:22:30 PM

to PAML discussion group

> I am getting branch lengths with M0 model and using them as initial values (fix_blength=1) for ML iteration for these complicated models. Since I am fixing my initial values, will running it many times change the output?
> I noticed the reason we run it more than once is because Codeml uses random values as starting values, unlike my case.

You aren't fixing your initial values. By using fix_blength=1, you're telling codeml to use the M0 branch lengths as starting estimates, not fixed estimates. If you want to fix the branch lengths, you should use fix_blength=2. Note that this fixes only the branch lengths, not the substitution matrix parameters (i.e., omega, kappa).
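
To make the difference concrete, here is a minimal Python sketch that renders the relevant control-file lines for the two settings. The helper `ctl_fragment` and the starting omega of 0.5 are illustrative assumptions, not part of PAML; only the option names (`fix_blength`, `fix_omega`, `omega`) come from codeml.

```python
def ctl_fragment(settings):
    """Render 'key = value' lines in the style of a codeml control file."""
    return "\n".join(f"{key} = {value}" for key, value in settings.items())


# fix_blength = 1: M0 branch lengths in the tree file are only starting
# estimates, so they are re-optimized under the new model.
start_from_m0 = ctl_fragment({"fix_blength": 1, "fix_omega": 0, "omega": 0.5})

# fix_blength = 2: branch lengths in the tree file are held fixed.
# omega and kappa are still estimated unless fix_omega / fix_kappa say otherwise.
fixed_to_m0 = ctl_fragment({"fix_blength": 2, "fix_omega": 0, "omega": 0.5})

if __name__ == "__main__":
    print(start_from_m0)
    print()
    print(fixed_to_m0)
```

In both fragments omega is left free (fix_omega = 0), which is why fixing the branch lengths alone does not fix the rest of the model.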

Anyway, the effect of changing initial omega varies depending on the codon model.

- In some cases, codeml uses the specified initial value as provided.
    - I think this is the case for M0.
    - However, if you set an impossible starting value (i.e., less than 0), then I think it gets reset to some acceptable value.
- In some cases, codeml adds some random noise to the specified initial value.
    - I think this is the case for M2a, where initial omega controls the w2 parameter.
    - So, if you put initial omega as 2, codeml might start from 2.13 in one replicate run, 1.95 in another, etc., whereas if you put initial omega as 100 it would start from, e.g., 100.13, 99.95, etc.
- In some cases, setting initial omega doesn't have any effect.
    - For example, the M7 model doesn't have an omega parameter.
- Finally, in some cases it might not be possible to vary the initial omega value because it needs to be fixed.
    - This is the case for the branch-site null model.

These are my recollections based on a check of the codeml source code from several years ago (hence, several updates ago). Perhaps someone can check the current codeml source code and report back with how initial omega settings affect each model.
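
As a rough illustration of how replicate runs with different starting omegas could be set up, here is a minimal Python sketch that builds control-file text for the branch-site alternative model (model = 2, NSsites = 2, omega free) and for the null model (omega fixed at 1). The file names, the list of starting omegas, `CodonFreq = 2`, and the helper functions are assumptions made for the example. The sketch only writes the settings and does not model whether codeml perturbs the starting value internally.

```python
# Settings shared by all runs. File names are hypothetical placeholders.
BASE = {
    "seqfile": "alignment.phy",  # hypothetical codon alignment
    "treefile": "m0_tree.nwk",   # hypothetical tree with M0 branch lengths
    "runmode": 0,                # user tree
    "seqtype": 1,                # codon sequences
    "CodonFreq": 2,              # assumed choice (F3x4)
    "model": 2,                  # branch-specific omega classes
    "NSsites": 2,                # with model = 2: branch-site model A
    "fix_kappa": 0,
    "kappa": 2,
    "fix_blength": 1,            # tree-file branch lengths as starting values
}


def control_text(outfile, omega, fix_omega=0):
    """Return control-file text for one run."""
    settings = dict(BASE, outfile=outfile, fix_omega=fix_omega, omega=omega)
    return "\n".join(f"{k} = {v}" for k, v in settings.items()) + "\n"


def replicate_controls(start_omegas=(0.5, 1.5, 2.0)):
    """Map control-file names to their text: one alternative-model run per
    starting omega, plus a single null run with omega fixed at 1."""
    runs = {
        f"alt_w{w}.ctl": control_text(f"alt_w{w}.out", w)
        for w in start_omegas
    }
    runs["null.ctl"] = control_text("null.out", 1, fix_omega=1)
    return runs


if __name__ == "__main__":
    for name, text in replicate_controls().items():
        print(f"--- {name} ---")
        print(text)
```

Each generated text would be saved under its file name and passed to codeml as the control file. The null model gets only one omega setting because its omega is fixed rather than estimated.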

Beyond varying omega, you have other options, though none is ideal:

- Instead of omega, you can always change initial kappa.
    - This parameter is a part of all of the codeml codon models.
    - My guess is that varying kappa will generally have less influence on likelihood optimization than varying omega for most models.
- You can use random starting branch lengths via fix_blength=-1.
    - However, for large data sets and complex models, this will often be inefficient and, I suspect, potentially misleading.
    - Random branch lengths are going to be really unrealistic and my hunch is that using them will increase the risk of climbing suboptimal peaks compared to your chances when using reasonable branch length estimates obtained from, say, the M0 model.
- Finally, you can use the in.codeml option to manually adjust whichever parameters or branch lengths you think are relevant for a given data set and model.

Vigzy 77

Mar 22, 2019, 6:13:47 AM

to PAML discussion group

Hi,

Thanks a lot for the detailed explanation. It was of great help to me.

The concepts are now very clear to me on this matter.

I am running my analysis with the M0 branch lengths and fix_blength=1, and the results seem reasonable and consistent across multiple runs with different starting omegas.

Regards,

Vigzy

Orlagh

Aug 31, 2020, 12:05:11 PM

to PAML discussion group

Hello! I am a new user with a similar question.

I am running the branch and branch-site models for several gene families, and I am unsure how to set the initial kappa and omega values. So far, I have run the M0 model first for each gene family with omega and kappa estimated (I set initial kappa to 2 and initial omega to 0.5). I'm not sure what initial values to use for omega and kappa in the other models I'm trying to run. Should they be the kappa and omega estimates reported in the M0 results file?

Additionally, given the inherent randomness, would you recommend running each model three times and selecting the best log-likelihood value across the replicates?

Thank you very much.

Ziheng

Nov 8, 2020, 7:21:11 AM

to PAML discussion group

yes, running the same analysis 3 times is a good idea.

if you use different initial values and find that the results do not change, then you know that the initial values are not so important and you don't have to agonize over them so much.

best, ziheng
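
The consistency check across replicate runs can be sketched in Python as follows. This assumes the log-likelihood appears in the codeml output on a line of the form `lnL(ntime: 11  np: 16):  -2399.231285      +0.000000`. The regular expression, the helper names, and the 0.01 tolerance are assumptions chosen for illustration, not values recommended in this thread.

```python
import re

# Assumed format of the log-likelihood line in a codeml output file.
LNL_PATTERN = re.compile(r"lnL\(ntime:\s*\d+\s+np:\s*\d+\):\s*(-?\d+\.\d+)")


def extract_lnl(text):
    """Return the first log-likelihood value found in codeml output text."""
    match = LNL_PATTERN.search(text)
    if match is None:
        raise ValueError("no lnL line found")
    return float(match.group(1))


def summarize(lnls, tolerance=0.01):
    """Given {run_name: lnL}, return the best run and whether all runs agree
    to within `tolerance` log-likelihood units (an arbitrary threshold)."""
    best = max(lnls, key=lnls.get)
    spread = max(lnls.values()) - min(lnls.values())
    return best, spread <= tolerance


if __name__ == "__main__":
    sample = "lnL(ntime: 11  np: 16):  -2399.231285      +0.000000"
    print(extract_lnl(sample))
    print(summarize({"run1": -2399.231285, "run2": -2399.231290}))
```

If the runs agree, the starting values evidently matter little for that data set and model. If they disagree, the run with the highest lnL is the one to keep, and further replicates may be worthwhile.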
