Discrepancies between combined trace and combined trees?

Ziv Lieberman
May 19, 2026, 11:27:41 PM
to PAML discussion group
Hi all,
Upon reviewing some files I noticed an unusual discrepancy which I wonder if anyone might help explain. As always I am happy to provide outputs, but would prefer to do so off of the public forum for data privacy.
This follows up on a query from 7 January 2026 about how to combine traces.

Context: I inferred divergence dates in MCMCtree (with approximate likelihood), using three independent runs. I then combined the MCMC traces (.log) across the three runs using LogCombiner from the BEAST package. I summarized trees across runs in MCMCtree using the print = -1 setting in the control file.

Problem: Posterior summaries differ slightly between the combined trace and the combined tree. For example, at a particular node, the trace has a mean age of 54.36 million years, 95% HPD [61.76, 47.59]. At the same node, the tree has a mean of 54.32 [62.38, 47.9].
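One way to put the two summaries on an equal footing is to recompute the mean and the 95% interval directly from the same set of samples. The following is a minimal sketch, not the method used by either LogCombiner or MCMCtree; the function names are hypothetical. It also computes an equal-tail interval, because an HPD interval and an equal-tail interval generally differ on skewed posteriors. Whether the two programs report different interval types here is an assumption to check, not a confirmed cause.

```python
import math


def mean_and_hpd(samples, mass=0.95):
    """Return (mean, lower, upper); [lower, upper] is the narrowest
    interval that contains a fraction `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("no samples")
    # Number of samples inside the interval (small epsilon guards float error).
    k = min(n, max(1, math.ceil(mass * n - 1e-9)))
    # Slide a window of k consecutive sorted samples; keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return sum(xs) / n, xs[start], xs[start + k - 1]


def _quantile(xs_sorted, q):
    """Quantile by linear interpolation between neighbouring order statistics."""
    pos = q * (len(xs_sorted) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(xs_sorted) - 1)
    return xs_sorted[lo] + (pos - lo) * (xs_sorted[hi] - xs_sorted[lo])


def equal_tail(samples, mass=0.95):
    """Return (lower, upper) cutting (1 - mass) / 2 from each tail."""
    xs = sorted(samples)
    if not xs:
        raise ValueError("no samples")
    tail = (1.0 - mass) / 2.0
    return _quantile(xs, tail), _quantile(xs, 1.0 - tail)
```

Applying both functions to the node-age column of the combined samples shows how much of a discrepancy can come from the interval definition alone, as opposed to differences in which samples were kept.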

I presume this may be attributable to some difference in how LogCombiner and MCMCtree combine the traces. I am just wondering if anyone might have insight into these differences so they can be avoided in the future. Fortunately, almost all of the discrepancies fall within rounding error to the nearest million years.
Thank you!
-Ziv

Sandra AC
May 20, 2026, 6:56:52 AM
to PAML discussion group
Dear Ziv,

Thanks for your message! 

Unfortunately, I am not familiar with LogCombiner, nor have I used it to summarise sampled values from chains run with MCMCtree. When running multiple independent chains, I use a bash script I wrote some time ago to combine the "mcmc.txt" files from chains that pass QC filters during MCMC diagnostics (please note that, if a run was interrupted, there will be an incomplete line at the end of its "mcmc.txt" file, which needs to be removed; the script does this automatically). Then, I use `print = -1` to summarise the samples in the combined "mcmc.txt" file and generate the combined timetree. If you are using other software to combine or summarise the samples, it may make its own assumptions about how the sampled values are used. I am not sure this is the case, but the program may discard part of the samples as burn-in (either a percentage or the first X sampled values), which would explain why the mean time estimates and corresponding CIs differ. If that is not the case, I am not sure what the problem might be. If the issue persists, you may want to use the aforementioned bash script ("Combine_MCMC.sh") and follow the guidelines in our latest MCMCtree tutorial (see from the section "MCMC diagnostics" onwards).
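For illustration only, the combining step described above can be sketched in Python. This is not the "Combine_MCMC.sh" script, and it makes assumptions that should be checked against your own files: that each "mcmc.txt" is tab-separated with a single header line, and that a truncated final line can be recognised by having fewer fields than the header. The function names and the `burnin_fraction` parameter are hypothetical. The parameter defaults to 0 so that no samples are silently discarded.

```python
from pathlib import Path


def read_chain(path, burnin_fraction=0.0):
    """Return (header, rows) for one chain file, skipping malformed lines."""
    lines = Path(path).read_text().splitlines()
    if not lines:
        raise ValueError(f"{path}: file is empty")
    header = lines[0].split("\t")
    rows = [line.split("\t") for line in lines[1:]]
    # An interrupted run can leave a truncated last line. Rows with the wrong
    # number of fields, or with an empty field, are dropped. A line cut off in
    # the middle of its final value cannot be detected this way.
    rows = [r for r in rows
            if len(r) == len(header) and all(f.strip() for f in r)]
    skip = int(len(rows) * burnin_fraction)  # optional per-chain burn-in
    return header, rows[skip:]


def combine_chains(paths, out_path, burnin_fraction=0.0):
    """Concatenate the samples of several chains under a single header.

    Returns the number of samples written.
    """
    combined_header, combined = None, []
    for path in paths:
        header, rows = read_chain(path, burnin_fraction)
        if combined_header is None:
            combined_header = header
        elif header != combined_header:
            raise ValueError(f"{path}: header differs from the first chain")
        combined.extend(rows)
    if combined_header is None:
        raise ValueError("no input files given")
    with open(out_path, "w") as handle:
        handle.write("\t".join(combined_header) + "\n")
        for row in combined:
            handle.write("\t".join(row) + "\n")
    return len(combined)
```

Comparing the returned sample count with the number of samples in the LogCombiner output is a quick way to see whether one of the tools dropped samples as burn-in or resampled them.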

Perhaps other PAML users can give you more insight into how LogCombiner works and what may be going on but, in the meantime, I hope the above helps somewhat!

All the best,
Sandy
