How can I get single omega value for the given alignment file

153 views
Skip to first unread message

Yazhini arangas

unread,
Nov 15, 2021, 1:08:20 PM11/15/21
to PAML discussion group
Dear all, 
          I wanted to compare the dN/dS value for two domains present within a protein. I have split the alignment for each domain and performed the phylogenetic tree estimation separately. Also, obtained corresponding codon sequences using PAL2NAL program. After running the codeml, I find two omega values. For example in Model 2, I find the following omega values 

MLEs of dN/dS (w) for site classes (K=3)

p:   0.93842  0.06158  0.00000
w:   0.01138  1.00000 294.86691 <- this one
(note that p[2] is zero)

dN & dS for each branch

 branch          t       N       S   dN/dS  <- this one     dN      dS  N*dN  S*dS
  18..1       4.467    117.2     44.8   0.0723   0.3273   4.5298   38.4  202.9
  18..19      3.900    117.2     44.8   0.0723   0.2857   3.9540   33.5  177.1
  19..20      0.666    117.2     44.8   0.0723   0.0488   0.6756    5.7   30.3
  20..21      1.986    117.2     44.8   0.0723   0.1455   2.0142   17.1   90.2
  21..22      2.419    117.2     44.8   0.0723   0.1773   2.4533   20.8  109.9
  22..23      1.414    117.2     44.8   0.0723   0.1036   1.4337   12.1   64.2
  23..2      30.568    117.2     44.8   0.0723   2.2397  30.9952  262.5 1388.2
  23..14      5.019    117.2     44.8   0.0723   0.3677   5.0891   43.1  227.9
  22..8       2.571    117.2     44.8   0.0723   0.1884   2.6068   22.1  116.7
  21..24      4.849    117.2     44.8   0.0723   0.3553   4.9166   41.6  220.2
  24..5       4.188    117.2     44.8   0.0723   0.3069   4.2468   36.0  190.2
  24..15     18.222    117.2     44.8   0.0723   1.3351  18.4766  156.5  827.5
  20..7      13.172    117.2     44.8   0.0723   0.9651  13.3561  113.1  598.2
  19..25      1.169    117.2     44.8   0.0723   0.0856   1.1851   10.0   53.1
  25..26      1.735    117.2     44.8   0.0723   0.1271   1.7590   14.9   78.8
  26..27      0.000    117.2     44.8   0.0723   0.0000   0.0000    0.0    0.0
  27..28     12.414    117.2     44.8   0.0723   0.9095  12.5872  106.6  563.7
  28..3      13.625    117.2     44.8   0.0723   0.9983  13.8155  117.0  618.7
  28..9      50.000    117.2     44.8   0.0723   3.6634  50.6985  429.4 2270.6
  27..10      8.237    117.2     44.8   0.0723   0.6035   8.3525   70.7  374.1
  26..4       8.395    117.2     44.8   0.0723   0.6151   8.5119   72.1  381.2
  25..29      6.015    117.2     44.8   0.0723   0.4407   6.0995   51.7  273.2
  29..30      2.948    117.2     44.8   0.0723   0.2160   2.9890   25.3  133.9
  30..6      17.800    117.2     44.8   0.0723   1.3042  18.0491  152.9  808.4
  30..31      3.448    117.2     44.8   0.0723   0.2526   3.4957   29.6  156.6
  31..11     19.655    117.2     44.8   0.0723   1.4401  19.9296  168.8  892.6
  31..13     20.189    117.2     44.8   0.0723   1.4792  20.4710  173.4  916.8
  29..32      5.014    117.2     44.8   0.0723   0.3673   5.0836   43.1  227.7
  32..16     19.585    117.2     44.8   0.0723   1.4349  19.8586  168.2  889.4
  32..17      5.647    117.2     44.8   0.0723   0.4137   5.7257   48.5  256.4
  18..12      6.866    117.2     44.8   0.0723   0.5031   6.9620   59.0  311.8

I understand that the below one is for each branch in the tree but what are omega values at top represent. Which omega value should I consider for reporting dN/dS for the given input sequence.

Also, I have '(note that p[2] is zero)' message. What does it mean?

Could some one explain these results?

Thank you.

Yazhini arangas

unread,
Nov 15, 2021, 1:50:37 PM11/15/21
to PAML discussion group
Also, I would like to confirm whether p-value <0.05 is considered statistically significant like used in other statistical tests.

Thank you.

Janet Young

unread,
Nov 15, 2021, 3:22:43 PM11/15/21
to PAML discussion group
Hi Yazhini,

If you want a single overall 'average' dN/dS for each domain, I would run model 0 rather than model 2.  Model 2 is used more to examine whether dN/dS varies over the different codons within an alignment, which is why you get 3 omega values (model 2 allows 3 classes). In contrast model 0 assumes a single class of site and sounds better suited to what you are trying to do. 

The 'p' you see in the output is not a p-value - it is actually the proportion of codons with each omega, so in your example:
p:   0.93842  0.06158  0.00000
w:   0.01138  1.00000 294.86691
model 2's best guess is that 93.8% of codons have dN/dS of 0.1138, 6.2% of codons have dN/dS of 1, and 0% of codons have dN/dS of 294.

all the best,

Janet Young
Fred Hutchinson Cancer Research Center
Seattle, WA, USA

Ziheng

unread,
Feb 17, 2022, 12:42:58 PM2/17/22
to PAML discussion group
yes, as janet says, it is simpler and more appropriate to simply run model M0 (model = 0, NSsites = 0) using the two datasets (two domains), and then look at how different omega or dN/dS values are between the two datasets.

if you want to make things more complicated, there are some models in codeml that assign different omega values to different partitions of sites in the same alignment.  the paper is the following, and the example data files are in the examples\MHC.Swanson2002MBE/ folder in the paml release.  you will need to try to reproduce the results in the paper first to make sure that the program is working before trying to do the same on your own data.
Yang Z, Swanson WJ. 2002. Codon-substitution models to detect adaptive evolution that account for heterogeneous selective pressures among site classes. Mol Biol Evol 19:49-57.
best, ziheng
Reply all
Reply to author
Forward
0 new messages