Making a tree ultrametric

112 views
Skip to first unread message

Yan Wong

unread,
Dec 3, 2014, 6:22:27 AM12/3/14
to dendrop...@googlegroups.com
Is there a DendroPy routine for forcing a tree to be ultrametric, even using something a simple as extending all the terminal edges? If not, what's the quickest way to find the shallowest and deepest leaf in a tree?

p.s. I just tried reading in a blank file into a TreeList and got an "UnexpectedEndOfStreamError". I know the answer is probably "don't do that", but I'm reading in many files in a loop, some of which may be empty. Perhaps simply an empty tree list should be returned? OTOH I can work around it by using a try/except block, so it's not particularly onerous.

Jeet Sukumaran

unread,
Dec 5, 2014, 11:16:22 PM12/5/14
to dendrop...@googlegroups.com
"Making a tree ultrametric": under which model? For a strict molecular
clock you could try ``dendropy.paup.estimate_ultramtric_tree()``: if
given a tree and a character matrix it estimates branch length.

If you mean you just want to stretch a tree out so it looks good (i.e.,
the "Photoshop" model), you will have to roll your own.

With regards the blank file: no, I am not going to handle that in
another way.

DendroPy is a programmer's library, not R.

It is the responsibility of the programmer to ensure that input fulfills
the contract of the function to get the promised output, and it is in
the interests of the programmer to be alerted to problems as soon as
they are encountered, instead of hoping the library takes care of
intended or unintended slop either sensibly or otherwise and then having
to deal with surreal downstream.

-- jeet
> --
> You received this message because you are subscribed to the Google
> Groups "DendroPy Users" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to dendropy-user...@googlegroups.com
> <mailto:dendropy-user...@googlegroups.com>.
> For more options, visit https://groups.google.com/d/optout.

--



--------------------------------------
Jeet Sukumaran
--------------------------------------
jeetsu...@gmail.com
--------------------------------------
Blog/Personal Pages:
http://jeetworks.org/
GitHub Repositories:
http://github.com/jeetsukumaran
Photographs (as stream):
http://www.flickr.com/photos/jeetsukumaran/
Photographs (by galleries):
http://www.flickr.com/photos/jeetsukumaran/sets/
--------------------------------------

Jeet Sukumaran

unread,
Dec 5, 2014, 11:50:31 PM12/5/14
to dendrop...@googlegroups.com
When you say "shallow" do you mean closest to the root or furthest from
the root?

If your metric is based on distance from the root (i.e., "level"), then
a preorder iteration of the tree, where the first node (the root) given
an attribute `level=1`, and then each subsequent node's `level`
attribute is assigned one plus the value of its parent `level`
attribute. You will use two variables keep track of the leaf with the
minimum level and the maximum level attribute, respectively, updating as
you encounter new leaves as needed.

The `Node.distance_from_root()` provides for calculating the weighted
path length of each node from the root. You could use that as an example
to get started.

-- jeet

On 12/3/14, 6:22 AM, Yan Wong wrote:
> If not, what's the quickest way to find the shallowest and deepest leaf
> in a tree?

Yan Wong

unread,
Dec 9, 2014, 6:20:19 PM12/9/14
to dendrop...@googlegroups.com
On Saturday, 6 December 2014 04:50:31 UTC, Jeet Sukumaran wrote:
When you say "shallow" do you mean closest to the root or furthest from
the root?

Good point. Closest in this case, but the principle is the same whatever.
 
The `Node.distance_from_root()` provides for calculating the weighted
path length of each node from the root. You could use that as an example
to get started.

Thanks. I was wondering if there was a measure built-in to the tree metadata, or a routine that did it for me without having to iterate through nodes and so revisit the same edges multiple times? I guess calc_node_ages() is the thing I need to calculate values over the whole tree, even though I only need to store the max and min lengths. I'm probably just being overly worried about optimisation, but then I am expecting to deal with trees with > 100,000 nodes.

Point taken in the other reply to this post about programmers having responsibility for input.

Cheers

Yan

Jeet Sukumaran

unread,
Dec 9, 2014, 6:35:22 PM12/9/14
to dendrop...@googlegroups.com
You should definitely avoid revisiting the same edges multiple times. If
all you want is min/max age *values* (as opposed to the nodes withs
those values), then just call `tree.internal_node_ages()`. This returns
a (sorted) list of internal node ages, where "age" is defined as sum of
edge lengths from the tips. I am assuming that you want to exclude tip
nodes (age == 0), and that's why I suggested "tree.internal_node_ages()"
as opposed to just "tree.node_ages()". You also probably want to exclude
to root age:

node_ages = tree.internal_node_ages()
n2 = [a for a in node_ages if a < tree.seed_node.age]
min_age = min(n2)
max_age = max(n2)

(NOTE: I have not actually run the above fragment, and just wrote it off
the the top of my head, so it may need tweaking to work).

Things get a little more tricky if you actually want the nodes
associated with those ages. You can just make a second pass through the
tree to find them (via the age attribute), but if efficiency is a
concern, you almost definitely want to roll your own custom age
calculating function based on `tree.calc_node_ages()` that gets you the
nodes directly.


On 12/9/14, 6:20 PM, Yan Wong wrote:
>
> Thanks. I was wondering if there was a measure built-in to the tree
> metadata, or a routine that did it for me without having to iterate
> through nodes and so revisit the same edges multiple times? I
> guess calc_node_ages() is the thing I need to calculate values over the
> whole tree, even though I only need to store the max and min lengths.
> I'm probably just being overly worried about optimisation, but then I am
> expecting to deal with trees with > 100,000 nodes.

Yan Wong

unread,
Dec 10, 2014, 12:14:20 PM12/10/14
to dendrop...@googlegroups.com
On Tuesday, 9 December 2014 23:35:22 UTC, Jeet Sukumaran wrote:
If  all you want is min/max age *values* (as opposed to the nodes withs
those values), then just call `tree.internal_node_ages()`.

That's exactly what I want. Thanks.
Reply all
Reply to author
Forward
0 new messages