comparison to Phylip


Lee Katz

Aug 9, 2016, 9:05:27 AM
to bio-phylo
Hi, I am comparing the symmetric difference and branch length distance metrics in Bio::Phylo against those in the Phylip package.

When I run treedist (Phylip) on two of my trees with distance type=symmetric distance and rooted=yes, I get 89. When I run the code below, I get either 70 or 21, depending on which tree is the ref and which is the query. The treedist result does not change when I swap ref and query.

perl -MBio::Phylo::IO -e '$ref=Bio::Phylo::IO->parse(-file=>"Lyve-SET.flattened.dnd")->first; $query=Bio::Phylo::IO->parse(-file=>"RealPhy.flattened.dnd")->first; print $query->calc_symdiff($ref)."\n";'

Similarly, when I run treedist with distance type=branch score distance and rooted=yes, I get 0.117 regardless of which tree is the ref and which is the query. With the Bio::Phylo code below, however, I get 0.07, again regardless of which tree is ref or query.

perl -MBio::Phylo::IO -e '$ref=Bio::Phylo::IO->parse(-file=>"Lyve-SET.flattened.dnd")->first; $query=Bio::Phylo::IO->parse(-file=>"RealPhy.flattened.dnd")->first; print $ref->calc_branch_length_distance($query)."\n";'

Is there an explanation of the differences in algorithms?

Lee Katz

Aug 11, 2016, 8:24:06 AM
to bio-phylo
I finally noticed that I was sometimes getting random results, so I looked around for what the random factor could be. It looks like I had multifurcating trees. Using resolve() fixed it (and also deroot()). Additionally, I noticed that the symmetric difference was then consistently half of what treedist gave me, which makes sense if Bio::Phylo is doing a one-sided score. Therefore I am doubling what Bio::Phylo gives me so that it agrees with other tools.
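To illustrate why a one-sided score would behave this way, here is a minimal, self-contained Perl sketch. It does not use Bio::Phylo and is not taken from its source; the helper names (clades, clade_set, one_sided) and the toy trees are hypothetical, and the idea that calc_symdiff counts clades in only one direction is an assumption based on the numbers above, not something confirmed from the code. Trees are written as nested array references, and the score counts the clades of one tree that are missing from the other.

```perl
use strict;
use warnings;

# Walk a nested-array tree. Leaves are plain strings, internal nodes are
# array references. Records the sorted leaf set under every internal node.
sub clades {
    my ($node, $seen) = @_;
    return ($node) unless ref $node;            # leaf: report its own name
    my @leaves = map { clades($_, $seen) } @$node;
    @leaves = sort @leaves;
    $seen->{ join ',', @leaves } = 1;           # clade keyed by its leaf names
    return @leaves;
}

# Return a hash reference whose keys are the clades of the tree.
sub clade_set {
    my ($tree) = @_;
    my %seen;
    clades($tree, \%seen);
    return \%seen;
}

# One-sided score: number of clades in $x that do not occur in $y.
sub one_sided {
    my ($x, $y) = @_;
    return scalar grep { !exists $y->{$_} } keys %$x;
}

# Two fully resolved toy trees and one with a polytomy (hypothetical data).
my $resolved_a = clade_set([ [ [ 'A', 'B' ], 'C' ], [ 'D', 'E' ] ]);
my $resolved_b = clade_set([ [ [ 'A', 'C' ], 'B' ], [ 'D', 'E' ] ]);
my $polytomy   = clade_set([ [ 'A', 'B', 'C' ], [ 'D', 'E' ] ]);

printf "resolved trees: %d vs %d\n",
    one_sided($resolved_a, $resolved_b), one_sided($resolved_b, $resolved_a);
printf "with polytomy:  %d vs %d\n",
    one_sided($resolved_a, $polytomy), one_sided($polytomy, $resolved_a);
```

With the two resolved trees, both directions give the same count, so doubling either one equals the two-sided sum. With the polytomy, the two directions differ, which is the kind of ref/query asymmetry seen before the trees were resolved.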

However, the branch length distance still does not match treedist's branch score distance (Kuhner-Felsenstein). Any ideas?