BEB and NEB Result

sunny kevin

Jul 13, 2022, 4:45:31 AM
to PAML discussion group
Hello everyone,


I am a little confused. Do we have to compute the NEB and BEB probabilities for each ortholog group?

I ran the branch-site model with specific branches marked as foreground (#1).

Output from codeml run -
out.mlc file -

248 Detailed output identifying parameters
249
250 kappa (ts/tv) =  2.46840
251
252
253 dN/dS (w) for site classes (K=4)
254
255 site class             0        1       2a       2b
256 proportion       0.00000  0.00000  0.23914  0.76086
257 background w     0.02808  1.00000  0.02808  1.00000
258 foreground w     0.02808  1.00000 999.00000 999.00000
259
260
261 Naive Empirical Bayes (NEB) analysis (please use the BEB results.)
262 Positive sites for foreground lineages Prob(w>1):
263
264      1 C 1.000**
265      2 T 1.000**
266      3 W 1.000**
267      4 F 1.000**
268      5 R 1.000**
269      6 T 1.000**
270      7 T 1.000**
271      8 C 1.000**
272      9 T 1.000**
273     10 W 1.000**
274     11 T 1.000**
275     12 S 1.000**
276     13 G 1.000**
277     14 G 1.000**
278     15 S 1.000**
279     16 S 1.000**
280     17 T 1.000**
281     18 A 1.000**
282     19 C 1.000**
283    20 A 1.000**
284     21 S 1.000**
285     22 G 1.000**
286     23 R 1.000**
287     24 P 1.000**
288     25 T 1.000**
289     26 E 1.000**
290     27 S 1.000**
291     28 S 1.000**
292     29 C 1.000**
293     30 S 1.000**
294     31 G 1.000**
295     32 A 1.000**
296     33 A 1.000**
297     34 G 1.000**
298     35 S 1.000**
299     36 V 1.000**
300     37 C 1.000**
301     38 G 1.000**

337 Bayes Empirical Bayes (BEB) analysis (Yang, Wong & Nielsen 2005. Mol. Biol. Evol. 22:1107-1118)
338 Positive sites for foreground lineages Prob(w>1):
339
340
341 The grid (see ternary graph for p0-p1)
342
343 w0:   0.050  0.150  0.250  0.350  0.450  0.550  0.650  0.750  0.850  0.950
344 w2:   1.500  2.500  3.500  4.500  5.500  6.500  7.500  8.500  9.500 10.500
345
346
347 Posterior on the grid
348
349 w0:   0.141  0.148  0.141  0.123  0.103  0.085  0.073  0.067  0.063  0.056
350 w2:   0.090  0.092  0.094  0.096  0.099  0.101  0.103  0.106  0.108  0.111
351
352 Posterior for p0-p1 (see the ternary graph)
353
354  0.007
355  0.007 0.009 0.015
356  0.007 0.010 0.017 0.018 0.021
357  0.008 0.011 0.019 0.020 0.022 0.021 0.018
358  0.009 0.013 0.021 0.022 0.022 0.020 0.016 0.015 0.011
359  0.010 0.015 0.024 0.024 0.020 0.018 0.013 0.012 0.009 0.009 0.006
360  0.013 0.019 0.025 0.023 0.016 0.015 0.010 0.009 0.006 0.006 0.005 0.005 0.004
361  0.017 0.024 0.021 0.019 0.010 0.010 0.006 0.006 0.004 0.004 0.003 0.003 0.003 0.003 0.002
362  0.025 0.027 0.011 0.011 0.005 0.006 0.003 0.004 0.003 0.003 0.002 0.002 0.002 0.002 0.002 0.002 0.002
363  0.012 0.012 0.003 0.004 0.002 0.003 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.002 0.001 0.002 0.001 0.001 0.001
364
365 sum of density on p0-p1 =   1.000000
366
367 Time used:  0:39
                                                       

In my rst file -

       dN/dS (w) for site classes (K=4)
  5
  6 site class             0        1       2a       2b
  7 proportion       0.00000  0.00000  0.23914  0.76086
  8 background w     0.02808  1.00000  0.02808  1.00000
  9 foreground w     0.02808  1.00000 999.00000 999.00000
 10
 11 Naive Empirical Bayes (NEB) probabilities for 4 classes
 12 (amino acids refer to 1st sequence: lumpus)
 13
 14    1 C   0.00000 0.00000 0.48878 0.51122 ( 4)
 15    2 T   0.00000 0.00000 0.40988 0.59012 ( 4)
 16    3 W   0.00000 0.00000 0.52935 0.47065 ( 3)
 17    4 F   0.00000 0.00000 0.01894 0.98106 ( 4)
 18    5 R   0.00000 0.00000 0.51067 0.48933 ( 3)
 19    6 T   0.00000 0.00000 0.40988 0.59012 ( 4)
 20    7 T   0.00000 0.00000 0.42107 0.57893 ( 4)
 21    8 C   0.00000 0.00000 0.33620 0.66380 ( 4)
 22    9 T   0.00000 0.00000 0.40988 0.59012 ( 4)
 23   10 W   0.00000 0.00000 0.02790 0.97210 ( 4)
 24   11 T   0.00000 0.00000 0.28071 0.71929 ( 4)
 25   12 S   0.00000 0.00000 0.44899 0.55101 ( 4)
 26   13 G   0.00000 0.00000 0.02666 0.97334 ( 4)
 27   14 G   0.00000 0.00000 0.01144 0.98856 ( 4)
 28   15 S   0.00000 0.00000 0.01397 0.98603 ( 4)
 29   16 S   0.00000 0.00000 0.40878 0.59122 ( 4)
 30   17 T   0.00000 0.00000 0.28071 0.71929 ( 4)
 31   18 A   0.00000 0.00000 0.47605 0.52395 ( 4)
 32   19 C   0.00000 0.00000 0.33620 0.66380 ( 4)
 33   20 A   0.00000 0.00000 0.42819 0.57181 ( 4)
 34   21 S   0.00000 0.00000 0.00069 0.99931 ( 4)
 35   22 G   0.00000 0.00000 0.02184 0.97816 ( 4)
 36   23 R   0.00000 0.00000 0.02522 0.97478 ( 4)
 37   24 P   0.00000 0.00000 0.02310 0.97690 ( 4)
 38   25 T   0.00000 0.00000 0.01847 0.98153 ( 4)
 39   26 E   0.00000 0.00000 0.02106 0.97894 ( 4)
 40   27 S   0.00000 0.00000 0.02409 0.97591 ( 4)
 41   28 S   0.00000 0.00000 0.02613 0.97387 ( 4)
 42   29 C   0.00000 0.00000 0.48878 0.51122 ( 4)
 43   30 S   0.00000 0.00000 0.52182 0.47818 ( 3)
 44   31 G   0.00000 0.00000 0.00002 0.99998 ( 4)
 45   32 A   0.00000 0.00000 0.29580 0.70420 ( 4)
 46   33 A   0.00000 0.00000 0.29580 0.70420 ( 4)
 47   34 G   0.00000 0.00000 0.43515 0.56485 ( 4)
 48   35 S   0.00000 0.00000 0.40878 0.59122 ( 4)
 49   36 V   0.00000 0.00000 0.01685 0.98315 ( 4)
 50   37 C   0.00000 0.00000 0.48878 0.51122 ( 4)
 51   38 G   0.00000 0.00000 0.45295 0.54705 ( 4)


        lnL =  -558.649776
 89
 90 Bayes Empirical Bayes (BEB) probabilities for 4 classes (class)
 91 (amino acids refer to 1st sequence: lumpus)
 92
 93    1 C   0.30360 0.33920 0.17070 0.18651 ( 2)
 94    2 T   0.27729 0.36551 0.15595 0.20125 ( 2)
 95    3 W   0.31656 0.32625 0.17795 0.17924 ( 2)
 96    4 F   0.15498 0.48780 0.08747 0.26975 ( 2)
 97    5 R   0.30943 0.33337 0.17396 0.18324 ( 2)
 98    6 T   0.27729 0.36551 0.15595 0.20125 ( 2)
 99    7 T   0.28093 0.36185 0.15800 0.19921 ( 2)
100    8 C   0.25257 0.39019 0.14212 0.21511 ( 2)
101    9 T   0.27729 0.36551 0.15595 0.20125 ( 2)
102   10 W   0.17334 0.46946 0.09785 0.25934 ( 2)
103   11 T   0.23233 0.41043 0.13078 0.22647 ( 2)
104   12 S   0.29016 0.35263 0.16317 0.19404 ( 2)
105   13 G   0.17053 0.47227 0.09626 0.26094 ( 2)
106   14 G   0.13459 0.50816 0.07590 0.28134 ( 2)
107   15 S   0.14239 0.50038 0.08034 0.27689 ( 2)
108   16 S   0.27702 0.36576 0.15581 0.20141 ( 2)
109   17 T   0.23233 0.41043 0.13078 0.22647 ( 2)
110   18 A   0.29902 0.34377 0.16813 0.18907 ( 2)
111   19 C   0.25257 0.39019 0.14212 0.21511 ( 2)
112   20 A   0.28347 0.35931 0.15942 0.19779 ( 2)
113   21 S   0.10928 0.53350 0.06149 0.29573 ( 2)
114   22 G   0.16132 0.48147 0.09106 0.26615 ( 2)
115   23 R   0.16874 0.47405 0.09526 0.26195 ( 2)
116   24 P   0.16388 0.47891 0.09251 0.26470 ( 2)
117   25 T   0.15391 0.48886 0.08687 0.27036 ( 2)
118   26 E   0.16081 0.48196 0.09078 0.26644 ( 2)
119   27 S   0.16643 0.47637 0.09395 0.26325 ( 2)
120   28 S   0.16952 0.47326 0.09570 0.26151 ( 2)
121   29 C   0.30360 0.33920 0.17070 0.18651 ( 2)
122   30 S   0.31421 0.32859 0.17664 0.18056 ( 2)
123   31 G   0.07964 0.56316 0.04444 0.31276 ( 2)
124   32 A   0.23790 0.40486 0.13390 0.22334 ( 2)
125   33 A   0.23790 0.40486 0.13390 0.22334 ( 2)
126   34 G   0.28547 0.35733 0.16053 0.19667 ( 2)
127   35 S   0.27702 0.36576 0.15581 0.20141 ( 2)
128   36 V   0.15059 0.49219 0.08499 0.27223 ( 2)
129   37 C   0.30360 0.33920 0.17070 0.18651 ( 2)
130   38 G   0.29129 0.35150 0.16380 0.19341 ( 2)


Is this the NEB and BEB output?

Do I have to do any further analysis?

How do I interpret the output, taking both NEB and BEB into consideration?
From outfile.mlc -          
264      1 C 1.000**
265      2 T 1.000**
266      3 W 1.000**
267      4 F 1.000**
268      5 R 1.000**
269      6 T 1.000**
270      7 T 1.000**
271      8 C 1.000**
          
    These are the positive sites (99 % probability).

Alternative: lnL = -558.649776, np = 22
Null (fixed): lnL = -559.449764, np = 21

LRT = 2 × (lnL_alt − lnL_null) = 2 × (−558.649776 − (−559.449764)) = 1.599976, truncated to 1.59 below

Is the LRT calculation correct?

df = np_alt − np_null = 22 − 21 = 1

chi2 1 1.59

df =  1  prob = 0.207326134 = 2.073e-01
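
The arithmetic above can be checked with a few lines of Python. This is only a sketch: it assumes the np = 22 run is the alternative model and the np = 21 ("Fixed") run is the null, and the function name `lrt_chi2_df1` is made up for illustration. For 1 degree of freedom the chi-square tail probability has the closed form erfc(sqrt(x/2)), so no extra library is needed.

```python
import math

def lrt_chi2_df1(lnl_null, lnl_alt):
    """Likelihood-ratio statistic and chi-square (df = 1) p-value.

    Hypothetical helper: the statistic is 2 * (lnL_alt - lnL_null),
    which should not be negative when the models are nested.
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    # Guard against tiny negative values caused by optimisation noise.
    stat = max(stat, 0.0)
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# lnL values quoted in this post (assumed roles: np = 21 null, np = 22 alternative).
stat, p = lrt_chi2_df1(lnl_null=-559.449764, lnl_alt=-558.649776)
print(f"2*delta(lnL) = {stat:.6f}, p = {p:.4f}")
```

The statistic comes out positive, and the p-value is close to the one chi2 reported for the truncated value 1.59.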

If the p-value is significant, can I report that the gene is under positive selection?

Suggestions appreciated.

What are the other steps?

Thanks
Kevin

Ziheng

Jul 22, 2022, 2:04:42 PM
to PAML discussion group
Our recommendation is to ignore the NEB results and focus on the BEB results.
In the output, there is one block with the heading "NEB". After that there is another block with the heading "BEB".
There are examples in the examples/ folder, which you can run first to get familiar with the interpretation of the output.
Best, Ziheng
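
To focus on the BEB block, one way to read the rst file is to add the two class-2 columns (2a and 2b) for each site, which gives Prob(w>1) on the foreground lineages. The NEB numbers in the thread are consistent with this: the 2a and 2b columns sum to 1.000 at every site, matching the "1.000**" entries in the mlc file. The sketch below uses a handful of BEB lines copied from the rst excerpt above, with the editor line numbers removed. The regular expression and the function name `foreground_positive_prob` are illustrative assumptions, not part of PAML.

```python
import re

# A few BEB lines copied from the rst excerpt in this thread
# (editor line numbers stripped). Columns: site, amino acid,
# P(class 0), P(class 1), P(class 2a), P(class 2b), (most likely class).
RST_BEB = """\
   1 C   0.30360 0.33920 0.17070 0.18651 ( 2)
   4 F   0.15498 0.48780 0.08747 0.26975 ( 2)
  21 S   0.10928 0.53350 0.06149 0.29573 ( 2)
  31 G   0.07964 0.56316 0.04444 0.31276 ( 2)
"""

# Assumed line layout, based only on the excerpt quoted above.
LINE = re.compile(
    r"^\s*(\d+)\s+(\S)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+\(\s*(\d+)\)"
)

def foreground_positive_prob(text):
    """Map site number -> P(class 2a) + P(class 2b)."""
    probs = {}
    for line in text.splitlines():
        m = LINE.match(line)
        if not m:
            continue  # skip headers and blank lines
        site = int(m.group(1))
        probs[site] = float(m.group(5)) + float(m.group(6))
    return probs

probs = foreground_positive_prob(RST_BEB)
for site, prob in sorted(probs.items()):
    # Same flags as the mlc file: * for > 0.95, ** for > 0.99.
    flag = "**" if prob > 0.99 else "*" if prob > 0.95 else ""
    print(f"site {site:3d}  Prob(w>1) = {prob:.3f} {flag}")
```

For the sampled sites the BEB sums are all well below 0.95, which fits the empty BEB site list in the mlc output.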