treeannotator: big difference in tree heights between median/mean and "Common Ancestor"

Mike Famulare

Apr 26, 2017, 2:56:21 PM
to beast-users
For my MCC tree from contemporary tip-dated viral sequences collected over the last few years, the "Common ancestor" node heights are systematically older than either the median or mean heights (which are similar).  At deeper nodes, the difference can be 20% of the total tree height. For my science question, I need to understand why they are so different and which is "more trustworthy".

My understanding is the common ancestor option is somehow calculating the mean pairwise MRCA (but under what model and from what data, I don't know, couldn't find a reference). I would've guessed that this was similar to the mean heights from the tree ensemble, which I understand as the MRCA under the whole-tree coalescent.  The difference is large enough to affect the interpretation of my results. So, my questions:
  1. What is the "common ancestor" option calculating?
  2. How does it relate to the tree ensemble heights?
  3. When the "common ancestor" heights are very different from the mean/median, which should I consider more meaningful?
Thanks,

--Mike

Remco Bouckaert

Apr 26, 2017, 5:18:19 PM
to beast...@googlegroups.com
Hi Mike,

Mean estimates are calculated as the mean MRCA time for those trees in the set where the clade is monophyletic. So, if there is very low posterior support for a clade, the mean estimate can be based on a small subset of the trees from the tree set.

Common ancestor estimates for a clade are calculated as the average MRCA time for that clade over all trees in the set, and are not necessarily based on monophyletic clades.

Perhaps you can check the posterior support for those clades that have large deviations between common ancestor and mean estimates, and based on that determine whether you trust the mean height estimates.
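To make the distinction concrete, here is a minimal Python sketch of the two summaries described above, plus the posterior support check. It is an illustration only, not TreeAnnotator's code: the tree representation (a list of tip-set/height pairs per tree), the toy heights, and all function names are assumptions made up for this example.

```python
from statistics import mean

# Hypothetical toy posterior: each tree is a list of (tip_set, height) pairs,
# one per internal node. Not TreeAnnotator's data structure.
def node(tips, height):
    return (frozenset(tips), height)

trees = [
    [node("AB", 1.0), node("ABC", 3.0)],  # ((A,B),C)
    [node("AC", 1.0), node("ABC", 4.0)],  # ((A,C),B): {A,B} is not monophyletic
    [node("AB", 2.0), node("ABC", 5.0)],  # ((A,B),C)
]
target = frozenset("AB")

def mrca_height(tree, target):
    """Height of the smallest node whose tips include every member of target."""
    containing = [(len(tips), height) for tips, height in tree if target <= tips]
    return min(containing)[1]

def posterior_support(trees, target):
    """Fraction of trees in which target appears as a monophyletic clade."""
    hits = sum(any(tips == target for tips, _ in tree) for tree in trees)
    return hits / len(trees)

def mean_height(trees, target):
    """Mean height over only the trees where target is monophyletic."""
    heights = [h for tree in trees for tips, h in tree if tips == target]
    return mean(heights) if heights else None

def common_ancestor_height(trees, target):
    """Mean MRCA height of target over all trees, monophyletic or not."""
    return mean(mrca_height(tree, target) for tree in trees)

print(f"{posterior_support(trees, target):.2f}")       # 0.67
print(f"{mean_height(trees, target):.2f}")             # 1.50
print(f"{common_ancestor_height(trees, target):.2f}")  # 2.33
```

In this toy set, the one tree where {A,B} is not monophyletic contributes the height of a deeper node (4.0) to the common ancestor average, which is why that estimate comes out older than the mean taken over the monophyletic subset.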

Cheers,

Remco



Mike Famulare

Apr 27, 2017, 2:44:48 PM
to beast-users
Hi Remco,

Thanks. Yes, the posterior supports for the clades with large deviations are small. (Which was expected--our downstream analyses are working with the tree ensemble for this reason, but a single visualization is nice to have around.) 

Follow-up question for confirmation. For the common ancestor average MRCA, the clade from each tree is defined as the clade closest to the tips containing all the members of the target clade?

Thanks again,

--Mike

Mike Famulare

Apr 27, 2017, 2:50:01 PM
to beast-users

Also, there appears to be a bug in TreeAnnotator (v1.8.4 at least) in that the "height_95%_HPD" values for the "common ancestor" MCC tree are computed from the monophyletic clades only (as in mean/median). Disregard if this has been fixed, but I can't test it myself because the BEAST 2 TreeAnnotator chokes on my BEAST 1.8.4 trees file (the "[& ...]" stuff).


Thanks for your quick support!

--Mike

Remco Bouckaert

Apr 30, 2017, 7:50:18 PM
to beast...@googlegroups.com
Hi Mike,

> Follow-up question for confirmation. For the common ancestor average MRCA, the clade from each tree is defined as the clade closest to the tips containing all the members of the target clade?

MRCA stands for most recent common ancestor, so your description is pretty accurate :-)


On 28/04/2017, at 6:50 AM, Mike Famulare <mikefa...@gmail.com> wrote:


> Also, there appears to be a bug in TreeAnnotator (v1.8.4 at least) in that the "height_95%_HPD" for the "common ancestor" MCC tree are for the monophyletic clades only (as in mean/median).  Disregard if this has been fixed, but I can't test myself because Beast2 treeAnnotator chokes on my Beast 1.8.4 trees file (the "[& ...]" stuff).


Looks like there is the same problem in v2.4 — thanks for pointing this out (I raised an issue for this #691).
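For readers unfamiliar with the interval in question: a 95% HPD is commonly estimated from samples as the narrowest interval that contains 95% of them. The Python sketch below shows that generic technique under that assumption; it is not TreeAnnotator's implementation, and the function name and sample values are hypothetical. The point relevant to the report above is that the interval depends entirely on which heights are passed in (monophyletic trees only, or MRCA heights from all trees).

```python
import math

def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # number of samples the interval must cover
    # Slide a window of k consecutive sorted samples and keep the narrowest one.
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]

# Hypothetical node heights: 19 similar values and one much older outlier.
heights = [float(i) for i in range(1, 20)] + [100.0]
print(hpd_interval(heights))  # (1.0, 19.0)
```

Feeding this function the heights from the monophyletic subset while reporting a point estimate averaged over all trees would pair a node height with an interval computed from a different sample, which is the mismatch described above.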

Cheers,

Remco

Dawson White

Sep 22, 2020, 2:21:13 PM
to beast-users
Hello Remco, 
I am observing significantly shorter (4-7x) CA heights compared to median heights (MH) among two datasets generated with SNAPP and TreeAnnotator 2.6.2. This seems to be a reversal of expectations, and I am lacking confidence in my understanding of how they were generated and should be interpreted. 
1) How can the CA height be shorter than the median height when it summarizes the "average MRCA time for that clade over all trees in the set" rather than only the subset where the given clade is monophyletic?
2) The first "populations" inference is very messy with low PP across the 5 nodes, so it makes sense that there could be a difference between CA and MH. However, the second "species" inference has perfect PP across two nodes, so how can we understand a 5x difference between CA and MH?
Thanks! Dawson

sara

Feb 3, 2021, 4:07:45 PM
to beast-users
Hi Remco (et al.)!
A related question; hope someone is listening :S

I have doubts about the meaning of all the "node labels" options in FigTree (1.4.4).
I ran TreeAnnotator (2.6.2) for an MCC tree with "median heights" as "node heights". When I check all the values in TreeView, I have this (example of a node):

node age: -0.269
node height (raw): 0.269   # these match the node height in the figure
height: 0.288
height median = 0.252

I know these differences are small, but why do they occur, and what exactly is being calculated in each?
Is there any place I can see it?

many thanks in advance,
