Do the consensus tree branch lengths need to be rescaled manually?


Karolis Ramanauskas

Jul 6, 2021, 9:18:05 AM
to bali-phy-users
Hi,

I am running BAli-Phy version 3.6.0 and it is fantastic. I am a little confused about the reported branch lengths, however. For the example pictured below I ran this command:

bali-phy -n chain --align aln.fasta --alphabet DNA --smodel=gtr+Rates.gamma[4] --imodel=rs07 --iterations 100000000

And then I ran bp-analyze --subsample 2 chain-1 chain-2 ... chain-9. At this time there were around 5000 samples.

In the image you can see a part of the greedy.tree (left) produced by bp-analyze as well as the FastTree tree (right) produced from the P1.max.fasta file. The scale bar is set at 0.075. The branch lengths in the greedy.tree file are much shorter than I expect. Do I need to manually rescale this tree?

The tree length for the FastTree tree is 2.13 subs/site. For the BP tree, the length is 0.076. The reported scale for the one (and only) partition is 30.43. Multiplying these numbers appears to scale the tree to roughly what I would expect: 0.076 * 30.43 ≈ 2.31. I want to confirm that this is how we are expected to do this. Did I miss something in the manual?

Thank you very much,
Karolis

baliphy_branch_lengths.png
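
The arithmetic in the post above can be checked with a short script. This is only a sketch under stated assumptions: the toy Newick string is hypothetical (chosen so that its branch lengths sum to 0.076, it is not the real greedy.tree), the regex assumes plain non-negative branch lengths and taxon names without colons, and only the scale 30.43 comes from the post.

```python
import re

# A branch length is a colon followed by a non-negative number
# (optionally in scientific notation). Assumes no colons inside taxon names.
BRANCH_LENGTH = re.compile(r":\s*([0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?)")


def tree_length(newick):
    """Return the sum of all branch lengths found in a Newick string."""
    return sum(float(m.group(1)) for m in BRANCH_LENGTH.finditer(newick))


# Hypothetical toy tree, NOT taken from the actual analysis.
toy_tree = "((A:0.01,B:0.02):0.005,C:0.03,D:0.011);"
scale = 30.43  # partition scale reported in the post

unscaled = tree_length(toy_tree)
print(f"unscaled tree length: {unscaled:.3f}")
print(f"scaled tree length:   {unscaled * scale:.3f} subs/site")
```

Because tree length is a plain sum, multiplying the total by the scale gives the same result as multiplying each branch first and then summing.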

Benjamin Redelings

Jul 6, 2021, 10:04:19 AM
to bali-ph...@googlegroups.com

Hi Karolis,

That is a good question -- you are right, the branch lengths for the saved trees should approximately sum to 1.  So you need to multiply by the scale to get the branch lengths in terms of substitutions per site.  You should be able to scale the trees with the included program tree-tool, for example "tree-tool --scale=30.43 newick.tree".  And yes, I should put this in the manual!

Hmmm.... I suppose that I could modify bp-analyze to compute these scaled trees.  We would need to print a tree for each partition, since each gene has its own scale.  The trees for all the partitions would be the same except for the scale.

Another solution would be to always log scaled trees for each partition.  This could take a lot of disk space if there are many partitions.

What do you think?

-BenRI


Karolis Ramanauskas

Jul 15, 2021, 3:50:20 PM
to bali-phy-users
Thanks, Benjamin,

Personally, just knowing that the branches should be rescaled by a partition-specific scale factor to produce a tree (or trees) with the desired scale (substitutions per site) is quite enough. On the other hand, if it is common for people to need a posterior set with all the trees rescaled, it may be a useful feature to add to the bp-analyze script (maybe a command-line option). But, if I understand correctly, the only operation involved is a simple multiplication of branch_length[n] * scaling_factor[part]? If so, the rescaling could be done at any point in any software. Or does tree-tool --scale perform some additional operations?

Thanks,
Karolis
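
For the "simple multiplication" reading, a minimal Python sketch might look like the following. It is not a description of what tree-tool --scale does internally, which the thread leaves open. The toy tree and the partition name "P1" in the mapping are hypothetical, the regex assumes non-negative branch lengths and no colons inside taxon names, and the 6-significant-digit output format is an arbitrary choice.

```python
import re

# A branch length is a colon followed by a non-negative number
# (optionally in scientific notation). Assumes no colons inside taxon names.
BRANCH_LENGTH = re.compile(r":\s*([0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?)")


def rescale_newick(newick, factor):
    """Multiply every branch length in a Newick string by `factor`.

    The topology and labels are left untouched; only the numbers
    after each colon are rewritten (to 6 significant digits).
    """
    def _scale(match):
        return f":{float(match.group(1)) * factor:.6g}"

    return BRANCH_LENGTH.sub(_scale, newick)


# Hypothetical toy tree and partition-to-scale mapping; only the value
# 30.43 comes from the thread.
toy_tree = "((A:0.01,B:0.02):0.005,C:0.03);"
partition_scales = {"P1": 30.43}

# One rescaled tree per partition: same topology, different scale.
for name, scale in partition_scales.items():
    print(name, rescale_newick(toy_tree, scale))
```

With several partitions, the loop would emit one tree per partition that differs only in its branch lengths, which matches the per-partition output Benjamin describes.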