>>> import dendropy
>>> treelist = dendropy.TreeList.get_from_path("true-trees-1X.tre", "newick")
>>> l = treelist.taxon_set.labels()
>>> l
['MON', 'MAC', 'TUR', 'BOS', 'SUS', 'VIC', 'FEL', 'CAN', 'EQU', 'MYO', 'PTE', 'SOR', 'ERI', 'CAV', 'RAT', 'MUS', 'CAL', 'TAR', 'NEW', 'PAN', 'HOM', 'GOR', 'PON', 'MIC', 'OTO', 'TUP', 'SPE', 'DIP', 'ORY', 'OCH', 'CHO', 'DAS', 'PRO', 'LOX', 'ECH', 'ORN', 'GAL']
>>> splita = 'MON', 'MAC'
>>> splitb = 'MON', 'TUR'
>>> splitc = 'MON', 'BOS'
>>> treelist.frequency_of_split(labels=splita)
0.94825
>>> treelist.frequency_of_split(labels=splitb)
0.0
>>> treelist.frequency_of_split(labels=splitc)
0.0
>>>
I was expecting the three numbers to total one, because all the trees in treelist are binary and include the full taxon set. Here is my "proof", though I am sure there is a better way to check for polytomies and for the full taxon set:
>>> polytomycheck = [[e for e in treelist[i].postorder_edge_iter()] for i in range(len(treelist))]
>>> len(polytomycheck[0])
72
>>> len(l)
37
>>> check2 = [len(polytomycheck[j]) == 72 for j in range(4000)]
>>> False in check2
False
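A note on why 72 is the expected number: a fully resolved unrooted tree on n leaves has n - 2 internal nodes, so 2n - 2 nodes in total, and 2 * 37 - 2 = 72 if each node (including the root) contributes one edge to the iterator, which is an assumption about postorder_edge_iter. A more direct check is to look at the number of children of every internal node. The sketch below does that on plain nested tuples rather than DendroPy trees; is_fully_resolved and expected_edge_count are hypothetical helper names, not part of DendroPy.

```python
def expected_edge_count(n_leaves):
    """Node (and per-node edge) count of a fully resolved unrooted tree."""
    return 2 * n_leaves - 2


def is_fully_resolved(tree, is_root=True):
    """Check a nested-tuple tree for polytomies.

    Leaves are strings; internal nodes are tuples of children.
    The root may have 2 children (rooted) or 3 (unrooted, basal
    trifurcation); every other internal node must have exactly 2.
    """
    if isinstance(tree, str):
        return True
    allowed = (2, 3) if is_root else (2,)
    if len(tree) not in allowed:
        return False
    return all(is_fully_resolved(child, is_root=False) for child in tree)
```

Counting edges alone can be fooled by nodes with a single child, so the per-node check is the safer of the two.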
I also expected the same frequency when using the other side of split a. Split a is for the quartet (1) 'MON', 'MAC' | 'TUR', 'BOS', so using the other side I expected to see 0.94825 again, but I don't:
>>> othersidesplita = 'TUR','BOS'
>>> treelist.frequency_of_split(labels=othersidesplita)
0.73825
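One possible reading of these numbers, assuming frequency_of_split treats the labels as one side of a split of the full 37-taxon set: the other side of 'MON', 'MAC' is then all 35 remaining taxa, not just 'TUR', 'BOS', and 'TUR', 'BOS' defines a different split with its own frequency. The small sketch below only illustrates that bookkeeping; other_side_of_split is a hypothetical helper, not a DendroPy function.

```python
def other_side_of_split(all_labels, side):
    """Return every label that is not on the given side of a split."""
    chosen = set(side)
    return [label for label in all_labels if label not in chosen]


labels = ['MON', 'MAC', 'TUR', 'BOS', 'SUS', 'VIC', 'FEL', 'CAN', 'EQU',
          'MYO', 'PTE', 'SOR', 'ERI', 'CAV', 'RAT', 'MUS', 'CAL', 'TAR',
          'NEW', 'PAN', 'HOM', 'GOR', 'PON', 'MIC', 'OTO', 'TUP', 'SPE',
          'DIP', 'ORY', 'OCH', 'CHO', 'DAS', 'PRO', 'LOX', 'ECH', 'ORN',
          'GAL']

# The complement of {'MON', 'MAC'} on the full taxon set has 35 members.
rest = other_side_of_split(labels, ['MON', 'MAC'])
print(len(rest))  # 35
```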
Any suggestions are much appreciated! I am not attached to doing this a certain way; I just need to write a function that estimates, for a list of trees, the relative frequency with which each of the three possible unrooted quartet topologies on four taxa appears as an induced subtree.
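To make the goal concrete, here is a minimal sketch of such a function that does not use DendroPy at all. It assumes each tree is given as nested tuples of label strings, and it relies on the fact that a tree induces ab|cd exactly when some subtree contains a and b but neither c nor d (or the reverse). The names leaf_sets, quartet_topology and quartet_frequencies are hypothetical.

```python
from collections import Counter


def leaf_sets(tree):
    """Return (leaves of tree, list of leaf sets of every subtree)."""
    if isinstance(tree, str):
        leaves = frozenset([tree])
        return leaves, [leaves]
    leaves, found = frozenset(), []
    for child in tree:
        child_leaves, child_found = leaf_sets(child)
        leaves |= child_leaves
        found.extend(child_found)
    found.append(leaves)
    return leaves, found


def quartet_topology(tree, a, b, c, d):
    """Name the unrooted topology that tree induces on a, b, c, d."""
    quartet = frozenset([a, b, c, d])
    names = {
        frozenset([a, b]): "ab|cd", frozenset([c, d]): "ab|cd",
        frozenset([a, c]): "ac|bd", frozenset([b, d]): "ac|bd",
        frozenset([a, d]): "ad|bc", frozenset([b, c]): "ad|bc",
    }
    for clade in leaf_sets(tree)[1]:
        # Restrict the subtree's leaves to the four taxa of interest.
        name = names.get(clade & quartet)
        if name is not None:
            return name
    return "unresolved"  # only possible if the tree has a polytomy


def quartet_frequencies(trees, a, b, c, d):
    """Relative frequency of each induced quartet topology."""
    counts = Counter(quartet_topology(t, a, b, c, d) for t in trees)
    return {name: n / len(trees) for name, n in counts.items()}
```

For fully resolved trees that all contain the four taxa, the three frequencies computed this way do sum to one, which is the property expected above.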
Thanks so much!
--
You received this message because you are subscribed to the Google Groups "DendroPy Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dendropy-user...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
def dendropy_score_triples(quartets, treefiles, oryza_names=False, write_trees=False, output_dir='./', filter_file=None, gene_order_file=None):
I am puzzled about what input format the quartets are in. I looked in
and I see a reference to a "quartet file". What does that file look like?
I am also confused by the bitmasks. From my experiments below, it seems that different subsets of taxa are identified by different bitmasks, but I still don't understand how to give DendroPy the right bitmask for a given split in the frequency_of_split function.
>>> L = list(treelist.taxon_set.labels())
>>> L
['MON', 'MAC', 'TUR', 'BOS', 'SUS', 'VIC', 'FEL', 'CAN', 'EQU', 'MYO', 'PTE', 'SOR', 'ERI', 'CAV', 'RAT', 'MUS', 'CAL', 'TAR', 'NEW', 'PAN', 'HOM', 'GOR', 'PON', 'MIC', 'OTO', 'TUP', 'SPE', 'DIP', 'ORY', 'OCH', 'CHO', 'DAS', 'PRO', 'LOX', 'ECH', 'ORN', 'GAL']
>>> treelist.taxon_set.get_taxa_bitmask(labels = 'MON')
0
>>> treelist.taxon_set.get_taxa_bitmask(labels = 'PAN')
0
>>> treelist.taxon_set.get_taxa_bitmask(labels = [L[1], L[3]])
10
>>> treelist.taxon_set.get_taxa_bitmask(labels = [L[1], L[3], L[5]])
42
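The values 10 and 42 are consistent with a simple scheme in which the taxon at position i in the label list owns bit i: L[1] and L[3] give 2 + 8 = 10, and adding L[5] gives 10 + 32 = 42. The sketch below reproduces that arithmetic in plain Python. It is only a model of the observed output, not DendroPy's implementation, and taxa_bitmask is a hypothetical name.

```python
def taxa_bitmask(all_labels, wanted):
    """Set bit i for every wanted label found at index i of all_labels."""
    index = {label: i for i, label in enumerate(all_labels)}
    mask = 0
    for label in wanted:
        mask |= 1 << index[label]
    return mask


# First six labels from the session above, in the same order.
L = ['MON', 'MAC', 'TUR', 'BOS', 'SUS', 'VIC']
print(taxa_bitmask(L, [L[1], L[3]]))        # 10
print(taxa_bitmask(L, [L[1], L[3], L[5]]))  # 42
```

This model does not account for the 0 returned when a single label is passed as a bare string.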
e = root.edge
cm = e.split_bitmask
>>> trees = dendropy.TreeList.get_from_path("true-trees-1X.tre", "newick")
>>> edges1 = [e for e in trees[1].postorder_edge_iter()]
>>> edges1[1].split_bitmask
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Edge' object has no attribute 'split_bitmask'
There are a few things I still don't understand:
1. How to get a list of bitmasks representing all the splits of a tree.
2. How to mask the taxa I don't care about when I pass the information to masked_frequency_of_splits.
3. What an example looks like where you actually give a mask as an input to either is_compatible or masked_frequency_of_splits.
I can't seem to get DendroPy to tell me what the split_bitmask is for any edge or split. For example, if I want to feed a split_bitmask to the function split_bitmask_string(), I don't know where to find my split_bitmask info in the first place.
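For what it is worth, the idea behind questions 1 and 2 can be sketched without DendroPy. The code below assumes nested-tuple trees and the bit-per-taxon scheme suggested by the get_taxa_bitmask output; split_bitmasks and masked_splits are hypothetical names. On the AttributeError: as far as I recall, DendroPy 3 attaches split_bitmask to edges only after the tree's splits have been encoded (for example with dendropy.treesplit.encode_splits(tree)), but that is an assumption to check against the documentation for the installed version.

```python
def split_bitmasks(tree, bit_of):
    """Return (mask of tree's leaves, list of masks of every subtree)."""
    if isinstance(tree, str):
        mask = bit_of[tree]
        return mask, [mask]
    mask, found = 0, []
    for child in tree:
        child_mask, child_found = split_bitmasks(child, bit_of)
        mask |= child_mask
        found.extend(child_found)
    found.append(mask)
    return mask, found


def masked_splits(tree, labels, keep):
    """Nontrivial splits of tree restricted to the taxa in keep."""
    bit_of = {label: 1 << i for i, label in enumerate(labels)}
    keep_mask = 0
    for label in keep:
        keep_mask |= bit_of[label]
    restricted = set()
    for mask in split_bitmasks(tree, bit_of)[1]:
        side = mask & keep_mask   # drop the taxa we do not care about
        other = side ^ keep_mask  # complement within the kept taxa
        # Keep only splits with at least two kept taxa on each side.
        if bin(side).count("1") >= 2 and bin(other).count("1") >= 2:
            restricted.add(min(side, other))  # one canonical side
    return restricted
```

With labels ['MON', 'MAC', 'TUR', 'BOS', 'SUS'] and keep set to the first four, a tree that groups MON with MAC yields the single restricted split 3 (binary 0011), whichever side of the split the subtree happens to be on.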
Thanks as always!
Ruth