How to narrow down the range of 95% HPD using mcmctree in PAML14.9j

177 views
Skip to first unread message

Guo Cen

unread,
Jul 14, 2021, 5:37:31 AM7/14/21
to PAML discussion group
Hi Ziheng and everyone,

I'm using mcmctree in PAML14.9j to estimate divergent time, the control file is attached below:
          seed = -1
       seqfile = 11bg.txt
      treefile = 11bg.trees
      outfile = 11bg_time

         ndata = 1
       seqtype = 0  * 0: nucleotides; 1:codons; 2:AAs
       usedata = 1    * 0: no data; 1:seq like; 2:use in.BV; 3: out.BV
         clock = 3    * 1: global clock; 2: independent rates; 3: correlated rates
       RootAge = '<1.0'  * safe constraint on root age, used if no fossil for root.

         model = 4    * 0:JC69, 1:K80, 2:F81, 3:F84, 4:HKY85
         alpha = 0    * alpha for gamma rates at sites
         ncatG = 5    * No. categories in discrete gamma

     cleandata = 0    * remove sites with ambiguity data (1:yes, 0:no)?

       BDparas = 1 1 0    * birth, death, sampling
   kappa_gamma = 6 2      * gamma prior for kappa
   alpha_gamma = 1 1      * gamma prior for alpha

   rgene_gamma = 2 5  * gamma prior for overall rates for genes
  sigma2_gamma = 1 10   * gamma prior for sigma^2     (for clock=2 or 3)

      finetune = 1: 0.1  0.1  0.1  0.08 0.08  * auto (0 or 1) : times, musigma2, rates, mixing, paras, FossilErr

         print = 1
        burnin = 200000
      sampfreq = 100
       nsample = 20000

*** Note: Make your window wider (100 columns) before running the program.





And the tree is looks like:

24 1

(Oa,((RuH,OaH),((((GnB,OlB),RiB),((DiB,BmB),MuB)),(((PdD,HaD),AuD),(((BaA,DsA),MhA),((((GaC,OlC),RhC),((DC,BC),MC)),(AC,(PC,HC))))))))'>.486<.54';

//end of file




The 95% HPD results in the out.file are:

Posterior mean (95% Equal-tail CI) (95% HPD CI) HPD-CI-width

t_n25          0.4226 (0.2035, 0.5556) (0.2270, 0.5669) 0.3399 (Jnode 46)
t_n26          0.3983 (0.1921, 0.5366) (0.2152, 0.5516) 0.3364 (Jnode 45)
t_n27          0.2179 (0.1018, 0.3175) (0.1077, 0.3231) 0.2154 (Jnode 44)
t_n28          0.3214 (0.1542, 0.4397) (0.1692, 0.4487) 0.2795 (Jnode 43)
t_n29          0.2058 (0.0973, 0.2926) (0.1006, 0.2951) 0.1945 (Jnode 42)
t_n30          0.1616 (0.0761, 0.2328) (0.0764, 0.2332) 0.1567 (Jnode 41)
t_n31          0.0812 (0.0373, 0.1239) (0.0360, 0.1222) 0.0862 (Jnode 40)
t_n32          0.1344 (0.0625, 0.1974) (0.0630, 0.1977) 0.1347 (Jnode 39)
t_n33          0.0910 (0.0418, 0.1371) (0.0413, 0.1363) 0.0950 (Jnode 38)
t_n34          0.3026 (0.1451, 0.4154) (0.1599, 0.4256) 0.2657 (Jnode 37)
t_n35          0.1142 (0.0523, 0.1756) (0.0501, 0.1725) 0.1225 (Jnode 36)
t_n36          0.1078 (0.0492, 0.1657) (0.0474, 0.1636) 0.1162 (Jnode 35)
t_n37          0.2877 (0.1376, 0.3960) (0.1506, 0.4049) 0.2543 (Jnode 34)
t_n38          0.1226 (0.0561, 0.1870) (0.0541, 0.1846) 0.1306 (Jnode 33)
t_n39          0.0805 (0.0364, 0.1263) (0.0344, 0.1238) 0.0894 (Jnode 32)
t_n40          0.2066 (0.0972, 0.2898) (0.1063, 0.2974) 0.1911 (Jnode 31)
t_n41          0.1863 (0.0875, 0.2626) (0.0925, 0.2665) 0.1740 (Jnode 30)
t_n42          0.1526 (0.0715, 0.2176) (0.0755, 0.2202) 0.1448 (Jnode 29)
t_n43          0.0761 (0.0350, 0.1136) (0.0359, 0.1144) 0.0785 (Jnode 28)
t_n44          0.1250 (0.0583, 0.1812) (0.0610, 0.1831) 0.1221 (Jnode 27)
t_n45          0.0834 (0.0385, 0.1238) (0.0395, 0.1244) 0.0850 (Jnode 26)
t_n46          0.1207 (0.0560, 0.1767) (0.0576, 0.1782) 0.1206 (Jnode 25)
t_n47          0.1153 (0.0534, 0.1694) (0.0555, 0.1711) 0.1156 (Jnode 24)
mu             0.4377 (0.2552, 0.8668) (0.2300, 0.7629) 0.5329
mu_bar         0.4514 (0.1275, 1.0843) (0.0822, 0.9460) 0.8637
sigma2         0.5494 (0.2715, 1.1309) (0.2231, 0.9894) 0.7663
sigma2_bar     0.2533 (0.0849, 0.5653) (0.0608, 0.5033) 0.4425
kappa          2.4009 (2.3611, 2.4408) (2.3602, 2.4396) 0.0793
lnL        -382082.3904 (-382092.9230, -382073.9700) (-382092.2470, -382073.4790) 18.7680



As we can see the range of the 95% HPD for each node is quite wide, is there any way to narrow down the range?

Thanks and best wishes,
Guo Cen



Ziheng

unread,
Nov 2, 2021, 8:06:06 AM11/2/21
to PAML discussion group
you have a pair of bounds on the root:>.486<.54
the bounds are by default implemented using 2.5% of tail probability. if you are confident that the age should not be younger than the minimum, you can use a small tail probability, like 1e-100. read about fossil calibrations in the document.
more fossil calibrations will help, but they are hard to come by.
when the clock is violated, assuming the strict clock will produce seriously biased date estimates, but including more partitions will help in reducing the posterior CI. however, having many partitions helps to improve the precision, but not necessarily the accuracy. here are 2 papers that discuss those issues.  the maths are about toy examples, which you can ignore, but you can look at the biological conclusions.
best wishes, ziheng


dos Reis M, Yang Z. 2013. The unbearable uncertainty of Bayesian divergence time estimation. J Syst Evol 51:30-43.

Zhu T, dos Reis M, Yang Z. 2015. Characterization of the uncertainty of divergence time estimation under relaxed molecular clock models using multiple loci. Syst Biol 64:267-280.
Reply all
Reply to author
Forward
0 new messages