Non-equiprobable breaking of polytomies

Yan Wong

unread,

Jun 26, 2020, 11:01:21 AM6/26/20

to DendroPy Users

As I understand it, the resolve_polytomies() method is intended to create a resolution of a polytomy so that each topology is equiprobable. But )on my computer) that doesn't seem to work for the simple case of resolving a 4-tip tree into one of the 15 possible topologies. Here's some code to demonstrate:

import collections

import dendropy

from dendropy.simulate import treesim

import random

taxa = dendropy.TaxonNamespace(['a', 'b', 'c', 'd'])

random.seed(123)

trees = collections.defaultdict(int)

def lexicographic_sort_tree(tree):

"""

A hacky way to get a unique identifier for each topology - there must be a better way

"""

for nd in tree.postorder_node_iter():

nd._child_nodes.sort(key=lambda x: x._sort_key)

nd._sort_key = nd.taxon.label if nd.taxon and nd.taxon.label else "\n".join(ch._sort_key for ch in nd._child_nodes)

return tree

for i in range(500000):

tree = treesim.star_tree(taxa)

tree.resolve_polytomies(rng=random)

trees[str(lexicographic_sort_tree(tree))] += 1

for k, v in sorted(trees.items()):

print(k, v)

For which I get the following counts, which are clearly biased in favour of the 3 evenly balanced topologies. Do others find this?

(((a,b):0.0,c):0.0,d) 31438
(((a,b):0.0,d):0.0,c) 31474
(((a,c):0.0,b):0.0,d) 31134
(((a,c):0.0,d):0.0,b) 31108
(((a,d):0.0,b):0.0,c) 31151
(((a,d):0.0,c):0.0,b) 31246
((a,(b,c):0.0):0.0,d) 31442
((a,(b,d):0.0):0.0,c) 31234
((a,(c,d):0.0):0.0,b) 31373
((a,b):0.0,(c,d):0.0) 41816
((a,c):0.0,(b,d):0.0) 41283
((a,d):0.0,(b,c):0.0) 41680
(a,((b,c):0.0,d):0.0) 31119
(a,((b,d):0.0,c):0.0) 31142
(a,(b,(c,d):0.0):0.0) 31360

mholder

unread,

Jun 26, 2020, 8:30:52 PM6/26/20

to DendroPy Users

Thanks for the bug report!

This is fixed in https://github.com/jeetsukumaran/DendroPy/tree/equiprobable at least when you are calling the function with the default limit=2 as you were.

Jeet, I'd recommend that we just note that the distribution is not equiprobable when limit > 2. I can implement an equiprobable version of for that case if we really need it, but I think that it is a bit tedious to do correctly...

Mark

Yan Wong

unread,

Jun 27, 2020, 4:08:39 AM6/27/20

to DendroPy Users

Thanks so much for this Mark. It's really helpful. I'm a little surprised that others hadn't noticed this before, but perhaps it's not a very commonly used function.

Reply all

Reply to author

Forward