As I understand it, the resolve_polytomies() method is intended to create a resolution of a polytomy so that each topology is equiprobable. But )on my computer) that doesn't seem to work for the simple case of resolving a 4-tip tree into one of the 15 possible topologies. Here's some code to demonstrate:
import collections
import dendropy
from dendropy.simulate import treesim
import random
taxa = dendropy.TaxonNamespace(['a', 'b', 'c', 'd'])
random.seed(123)
trees = collections.defaultdict(int)
def lexicographic_sort_tree(tree):
"""
A hacky way to get a unique identifier for each topology - there must be a better way
"""
for nd in tree.postorder_node_iter():
nd._child_nodes.sort(key=lambda x: x._sort_key)
nd._sort_key = nd.taxon.label if nd.taxon and nd.taxon.label else "\n".join(ch._sort_key for ch in nd._child_nodes)
return tree
for i in range(500000):
tree = treesim.star_tree(taxa)
tree.resolve_polytomies(rng=random)
trees[str(lexicographic_sort_tree(tree))] += 1
for k, v in sorted(trees.items()):
print(k, v)
For which I get the following counts, which are clearly biased in favour of the 3 evenly balanced topologies. Do others find this?
(((a,b):0.0,c):0.0,d) 31438
(((a,b):0.0,d):0.0,c) 31474
(((a,c):0.0,b):0.0,d) 31134
(((a,c):0.0,d):0.0,b) 31108
(((a,d):0.0,b):0.0,c) 31151
(((a,d):0.0,c):0.0,b) 31246
((a,(b,c):0.0):0.0,d) 31442
((a,(b,d):0.0):0.0,c) 31234
((a,(c,d):0.0):0.0,b) 31373
((a,b):0.0,(c,d):0.0) 41816
((a,c):0.0,(b,d):0.0) 41283
((a,d):0.0,(b,c):0.0) 41680
(a,((b,c):0.0,d):0.0) 31119
(a,((b,d):0.0,c):0.0) 31142
(a,(b,(c,d):0.0):0.0) 31360