Hi Rich,
This does not address the immediate problem, but let us first get this
out of the way: you have to ensure that all phylogenetic data objects in
the same "universe" share the same operational taxonomic unit references:
https://pythonhosted.org/DendroPy/tutorial/taxa.html
OTU's in DendroPy are distinct from their string labels, and just
because two taxonomic entities have the same label, it does not mean
that they are the same operational taxonomic concept. In particular,
when creating or reading trees (or any other phylogenetic data object),
if you want to do any operations with them, you have to make sure that
they reference the same set of OTU's as represented by TaxonSet objects
[NOTE: in DendroPy 4, this will be called "TaxonNamespace" to emphasize
this concept]. By default, unless reading via a managed collection
(e.g., a TreeList or a DataSet), every tree read in will get its own
taxon namespace reference. You have to explicitly specify the TaxonSet
to use to ensure all trees that are compared or operated on share the
same TaxonSet.
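To make the label-versus-identity point concrete, here is a toy sketch in plain Python. It does not use DendroPy at all: the `Taxon` and `Namespace` classes and the `require()` method are hypothetical stand-ins I made up for illustration, not the library's API. The only thing it tries to show is that two taxa can carry the same label and still be different objects unless they come from the same namespace.

```python
# Toy model only: these classes are hypothetical and are NOT DendroPy's.

class Taxon(object):
    """An operational taxonomic unit; the label is just an attribute."""
    def __init__(self, label):
        self.label = label


class Namespace(object):
    """Hands out exactly one Taxon object per label."""
    def __init__(self):
        self._by_label = {}

    def require(self, label):
        # Reuse the existing Taxon for this label, or create it once.
        if label not in self._by_label:
            self._by_label[label] = Taxon(label)
        return self._by_label[label]


shared = Namespace()
a1 = shared.require("A")
a2 = shared.require("A")   # same namespace: same object

other = Namespace()
a3 = other.require("A")    # different namespace: new object

print(a1 is a2)              # True
print(a1 is a3)              # False
print(a1.label == a3.label)  # True
```

Two trees built against `shared` would agree on what "A" is; a tree built against `other` would not, even though the labels match.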
In DendroPy 3, the `symmetric_difference()` operation, as well as many
other binary tree operations, generally coerces both trees to have the
same TaxonSet reference by "remapping" the OTU's from one tree to
another. That is why you still got meaningful results instead of
an error. In DendroPy 4, things will be a lot more strict: if you try to
compare two trees with different TaxonNamespace references, you will get
an error.
Ok, with that out of the way, let us address your problem. It comes from
the fact that the initial trees are compared as rooted. But when writing
out using the "as_newick_string()" function, the rooting statement is
not written, and thus when reading it back in again the tree is read as
unrooted. So, if you go this route, you will have to explicitly specify
that the tree is treated as rooted by:
nt2 = dendropy.Tree.get_from_string(
        t2.as_newick_string(),
        as_rooted=True,
        schema='newick')
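To see why the rooting state changes the distance at all, here is a toy sketch in plain Python. It is not how DendroPy computes `symmetric_difference()`; the nested-tuple tree encoding and the helper names (`clades`, `rooted_splits`, `unrooted_splits`) are assumptions made up for this illustration. A rooted comparison counts clades, while an unrooted comparison counts bipartitions, so the same pair of trees can differ under one reading and be identical under the other.

```python
# Toy model only: trees are nested tuples of leaf labels, not DendroPy objects.

def clades(tree):
    """Return one frozenset of leaf labels per internal node."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*[leaves(child) for child in node])
        found.add(below)
        return below

    leaves(tree)
    return found


def rooted_splits(tree):
    """Non-trivial clades: what a rooted comparison looks at."""
    found = clades(tree)
    all_leaves = max(found, key=len)
    return set(c for c in found if 1 < len(c) < len(all_leaves))


def unrooted_splits(tree):
    """Non-trivial bipartitions: what an unrooted comparison looks at."""
    found = clades(tree)
    all_leaves = max(found, key=len)
    anchor = min(all_leaves)
    splits = set()
    for c in found:
        # Name each bipartition by the side that lacks the anchor leaf,
        # so the two sides of one split are not counted twice.
        side = c if anchor not in c else all_leaves - c
        if 1 < len(side) < len(all_leaves) - 1:
            splits.add(side)
    return splits


t1 = (("A", "B"), ("C", "D"))
t2 = ("A", ("B", ("C", "D")))

rooted_distance = len(rooted_splits(t1) ^ rooted_splits(t2))
unrooted_distance = len(unrooted_splits(t1) ^ unrooted_splits(t2))
print(rooted_distance)    # 2
print(unrooted_distance)  # 0
```

These two trees differ only in where the root sits, so they are 2 apart when read as rooted and 0 apart when read as unrooted. That is the same kind of mismatch you get when the original trees are rooted and the round-tripped ones are not.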
But you should not go this route. There is a good reason
"as_newick_string()" writes out an incomplete string: it is not a public
function for public usage, but an internal one used for debugging. In
fact, in DendroPy 4, this method no longer exists, but is replaced by
"tree._as_newick_string()" to emphasize that this is not a public method.
You should be using the "as_string()" function, which *is* a public
function, and the recommended way to get a string representation of a
phylogenetic data object. You can see the difference in the two
representations here:
t2 = dendropy.Tree.get_from_string("[&R](A,(B,C));", "newick")
print(t2.as_newick_string())
print(t2.as_string("newick"))
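If you want a quick sanity check before round-tripping a string, something like the following sketch can tell you whether the rooting token is there. It is plain Python, not part of DendroPy; `rooting_token` is a hypothetical helper name, and it only looks at a `[&R]` or `[&U]` comment at the very start of the string, so it will miss a token that comes after other leading comments.

```python
import re


def rooting_token(newick):
    """Return 'rooted', 'unrooted', or None if no leading rooting comment."""
    match = re.match(r"\s*\[&([RrUu])\]", newick)
    if match is None:
        return None
    return "rooted" if match.group(1).upper() == "R" else "unrooted"


print(rooting_token("[&R] (A,(B,C));"))  # rooted
print(rooting_token("[&U] (A,(B,C));"))  # unrooted
print(rooting_token("(A,(B,C));"))       # None
```

A result of `None` means whoever reads the string back has to fall back on its own default for rooting.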
The "as_string()" function is *very* rich, and supports lots of features
that control how the tree is rendered. The "as_newick_string()" does not:
t2 = dendropy.Tree.get_from_string("[&R] (A,(B,C));", "newick")
for nd in t2:
    nd.annotations.add_new("!color", "#ff6600")
t2.annotations.add_new("foo", "bar")
print(t2.as_newick_string())
print(t2.as_string("newick", suppress_annotations=False))
So, putting it all together, we get:
(1) Trees have to reference the same taxonomic namespace, as given by
the TaxonSet object reference.
(2) Use the proper "as_string('newick')" function to serialize the trees.
##########################################################
#! /usr/bin/env python
import dendropy
taxon_namespace = dendropy.TaxonSet()

# Simulate 100 pairs of trees, all sharing the same TaxonSet.
trees = []
for i in range(100):
    d = []
    for j in range(2):
        t = dendropy.treesim.birth_death(
                birth_rate=1.0,
                death_rate=0.5,
                ntax=10,
                taxon_set=taxon_namespace)
        d.append(t)
    trees.append(d)

for t1, t2 in trees:
    d1 = t1.symmetric_difference(t2)
    # Round-trip through the public as_string(), reusing the same TaxonSet.
    nt1 = dendropy.Tree.get_from_string(
            t1.as_string("newick"),
            schema='newick',
            taxon_set=taxon_namespace)
    nt2 = dendropy.Tree.get_from_string(
            t2.as_string("newick"),
            schema='newick',
            taxon_set=taxon_namespace)
    nd1 = nt1.symmetric_difference(nt2)
    if nd1 != d1:
        print("WEIRD: the symmetric difference of the original "
              "trees is %s but after converting them to newick "
              "representation then back, it is %s" % (d1, nd1))
    else:
        print("OK")
#############################
--
--------------------------------------
Jeet Sukumaran