Is it possible to output a table of bipartitions with associated edge lengths?

Edward Braun

Aug 12, 2015, 9:24:01 PM
to DendroPy Users
I hope there isn't an obvious answer to this that I'm missing, but is it possible to output the bipartitions in a tree along with the length of the branch that generates each bipartition?

For example, if you have the tree:

[&R] (A,(B,(C,(D,E))));

With some associated branch lengths (e.g.):

[&R] (A:10.2,(B:9.8,(C:4.1,(D:4.02,E:4.02):0.08):5.7):0.4):0.0;

Would it be possible to output something like the following?

11110  0.4
11100  5.7
11000  0.08
10000  4.02
etc.

A table using the * and . notation that is generated by some programs (like phylip consense) would also be usable. But I'd like branch lengths from a single tree, not the number of trees in a consensus.

Jeet Sukumaran

Aug 13, 2015, 10:08:34 AM
to dendrop...@googlegroups.com
The bipartition associated with an edge on a tree is available through
the ``Edge.bipartition`` attribute. So you can just traverse the tree
and access the edge length and corresponding bipartition object:

~~~
#! /usr/bin/env python

import dendropy

# Parse the example tree from a Newick string.
tree = dendropy.Tree.get(
        data="[&R] (A:10.2,(B:9.8,(C:4.1,(D:4.02,E:4.02):0.08):5.7):0.4):0.0;",
        schema="newick")

# Compute the bipartition associated with every edge on the tree.
tree.encode_bipartitions()
# Also can do: for edge in tree.preorder_edge_iter():
for node in tree:
    print("{}: {}".format(
        node.edge.bipartition.leafset_as_bitstring(),
        node.edge.length))
~~~

You can also access the ``Tree.bipartition_edge_map`` attribute, which is
a dictionary mapping bipartitions to edges, but this is less efficient
than the above approach.

~~~
tree.encode_bipartitions()
for bipartition in tree.bipartition_edge_map:
    print("{}: {}".format(
        bipartition.leafset_as_bitstring(),
        tree.bipartition_edge_map[bipartition].length))
~~~
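
If a tab-separated table in the * and . notation mentioned in the original question is preferred, the bitstrings printed above can be post-processed with plain Python. The following is only a rough sketch: ``bitstring_to_pattern`` and ``format_table`` are hypothetical helper names, not part of DendroPy, and the ``reverse`` option is there because the left-to-right order of taxa in the bitstring is an assumption that should be checked against your own taxon namespace.

```python
def bitstring_to_pattern(bitstring, present="*", absent=".", reverse=False):
    """Convert a 0/1 bitstring such as '11100' to a pattern such as '***..'.

    Hypothetical helper, not a DendroPy function. Any character other
    than '1' is treated as absent. Set reverse=True to flip the order
    in which the taxa are read.
    """
    if reverse:
        bitstring = bitstring[::-1]
    return "".join(present if bit == "1" else absent for bit in bitstring)


def format_table(rows, reverse=False):
    """Format (bitstring, length) pairs as tab-separated lines.

    Edges without a length (None) are written as 'NA'.
    """
    lines = []
    for bitstring, length in rows:
        length_text = "NA" if length is None else str(length)
        pattern = bitstring_to_pattern(bitstring, reverse=reverse)
        lines.append("{}\t{}".format(pattern, length_text))
    return "\n".join(lines)


# Pairs copied from the example table in the original question; in
# practice they would be collected inside the traversal loop above.
rows = [("11110", 0.4), ("11100", 5.7), ("11000", 0.08)]
print(format_table(rows))
```

Keeping the conversion separate from the tree traversal means the same helpers work whether the pairs come from iterating over nodes or from the bipartition-to-edge dictionary.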

-- jeet

Edward Braun

Aug 13, 2015, 11:24:32 AM
to DendroPy Users
Thanks Jeet! That does exactly what I wanted.

Derrick Zwickl

Aug 13, 2015, 5:15:54 PM
to dendrop...@googlegroups.com

I think that is functionality that I got Jeet to implement in Sumtrees a few years ago, if you want something a bit simpler. It is the -e option, if I recall.


Jeet Sukumaran

Aug 13, 2015, 8:01:28 PM
to dendrop...@googlegroups.com
Note that with DendroPy 4.x, this option has changed. ``-e`` now
specifies the edge-length summarization strategy.

Instead of ``-e``, to get the table of bipartitions and associated edge
lengths (as well as a whole lot of other information), you need to
specify ``-x <PREFIX>`` or ``--extended-output <PREFIX>``. From the help:

~~~
-x PREFIX, --extended-output PREFIX
        If specified, extended summarization information will
        be generated, consisting of the following files:
        - '<PREFIX>.topologies.trees'
            A collection of topologies found in the sources
            reported with their associated posterior
            probabilities as metadata annotations.
        - '<PREFIX>.bipartitions.trees'
            A collection of bipartitions, each represented
            as a tree, with associated information as
            metadata annotations.
        - '<PREFIX>.bipartitions.tsv'
            Table listing bipartitions as a group pattern as
            the key column, and information regarding each
            the bipartitions as the remaining columns.
        - '<PREFIX>.edge-lengths.tsv'
            List of bipartitions and corresponding edge
            lengths. Only generated if edge lengths are
            summarized.
        - '<PREFIX>.node-ages.tsv'
            List of bipartitions and corresponding ages.
            Only generated if node ages are summarized.
~~~

So ``<PREFIX>.edge-lengths.tsv`` will come the closest to what the
OP was requesting. Note that each bipartition gets a single row, and the
edge lengths are a single field in that row, with commas separating
individual entries. So parsing out the edge lengths requires an
additional layer of processing (a script to read the tab-delimited
input, extract the bipartition bitstring and edge lengths field, and
then split the edge lengths field into its individual edge lengths). If
the OP does not require the actual tree summarization, then the more
direct approach of reading in trees and traversing the edges will be
more efficient.
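
The extra layer of processing described above might look roughly like the following sketch. The column positions, the presence of a header row, and the sample rows are all assumptions made for illustration (check them against the actual ``<PREFIX>.edge-lengths.tsv`` that SumTrees writes), and ``parse_edge_lengths`` is a hypothetical helper name.

```python
import csv
import io


def parse_edge_lengths(handle, bipartition_col=0, lengths_col=1, has_header=True):
    """Read a tab-delimited table and return {bitstring: [edge lengths]}.

    ASSUMPTIONS: the bipartition bitstring is in column `bipartition_col`,
    the comma-separated edge lengths are in column `lengths_col`, and the
    first row is a header. Adjust these to match the real file.
    """
    reader = csv.reader(handle, delimiter="\t")
    if has_header:
        next(reader, None)  # skip the header row
    table = {}
    for row in reader:
        if not row:
            continue  # ignore blank lines
        # Split the single comma-separated field into individual lengths.
        lengths = [float(x) for x in row[lengths_col].split(",") if x.strip()]
        table[row[bipartition_col]] = lengths
    return table


# Made-up sample content standing in for a real edge-lengths file.
sample = io.StringIO(
    "bipartition\tedge_lengths\n"
    "11110\t0.4,0.5\n"
    "11100\t5.7\n"
)
table = parse_edge_lengths(sample)
for bitstring, lengths in table.items():
    print("{}\t{}".format(bitstring, lengths))
```

For a real file, pass an open file handle (for example ``open("myprefix.edge-lengths.tsv")``, where the prefix is whatever was given to ``-x``) instead of the in-memory sample.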

-- jeet



