Robinson-Foulds distances

Tallen

Feb 10, 2012, 11:25:03 PM
to raxml
Hi Alexis,

I am trying to calculate the Robinson-Foulds distance between two
trees in RAxML v7.2.8, but got this error message. Did I miss anything
here? Thanks a lot!

Tallen



Found a total of 58 taxa in first tree of tree collection test.tre
Expecting all remaining trees in collection to have the same taxon set

This is RAxML version 7.2.8 released by Alexandros Stamatakis in
October 2010.

RAxML computation of RF distances for all pairs of trees in a set of
trees

RAxML was called as follows:
raxmlHPC -f r -z test.tre -m GTRGAMMA -s test.phy -n test.txt

Found 2 trees in File test.tre
raxmlHPC: treeIO.c:921: addElementLen: Assertion `!readNodeLabels'
failed.
Aborted

Andre J. Aberer

Feb 11, 2012, 6:34:19 AM
to ra...@googlegroups.com
Hi Tallen,

Do you want to compute the RF distance of rooted trees? This is not
possible with RAxML, and if I am not mistaken, the RF distance is only
defined for unrooted trees.

Maybe just unroot the trees (newick-utils can do this:
http://cegg.unige.ch/newick_utils) and feed them into RAxML again.
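
A quick way to check whether a Newick tree has a bifurcating root before passing it to RAxML is to count the children of the outermost node. The following is only a minimal sketch, not part of RAxML or newick-utils: the function names are made up for illustration, and it assumes plain Newick strings without quoted labels or bracketed comments that contain parentheses or commas.

```python
def root_degree(newick):
    """Count the children of the outermost node of a Newick string."""
    depth = 0
    children = 1
    for ch in newick.strip():
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 1:
            # A comma directly inside the outermost parentheses
            # separates two children of the top-level node.
            children += 1
    return children


def looks_rooted(newick):
    """Two children at the top level indicate a bifurcating root."""
    return root_degree(newick) == 2


print(looks_rooted("((A,B),(C,D));"))  # rooted: two children at the top
print(looks_rooted("((A,B),C,D);"))    # unrooted: trifurcation at the top
```

A tree for which looks_rooted returns True would need to be unrooted first, for example with newick-utils as suggested above.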


--
Best regards,
Andre J. Aberer

M.Sc. (Bioinformatics)
Scientific Computing Group

Heidelberg Institute for Theoretical Studies (HITS gGmbH)
Schloss-Wolfsbrunnenweg 35
D-69118 Heidelberg

Tel.: +49 6221 533 264
Fax: +49 6221 533 298
Email: andre....@h-its.org
WWW: http://www.exelixis-lab.org
http://www.h-its.org/english/research/sco/index.php

Amtsgericht Mannheim / HRB 337446
Managing Directors: Dr. h.c. Dr.-Ing. E.h. Klaus Tschira, Prof. Dr.-Ing. Andreas Reuter

zx1106

Feb 11, 2012, 2:18:20 PM
to ra...@googlegroups.com
Hi Andre,

Thanks a lot for the help! Sorry that I forgot to unroot the trees
first. May I ask what each column means and how to interpret the
output from RAxML with -f r?

Tallen


0 1: 6 0.054545, 1.360000 0.012364, 3.580000 0.032545

Andre J. Aberer

Feb 12, 2012, 5:14:23 AM
to ra...@googlegroups.com
Hi Tallen,

> Thanks a lot for the help! Sorry that I forgot to unroot the tree
> first. Can I ask that what each column means and how we interpret the
> output from RAxML with -f r?
>
> Tallen
>
>
> 0 1: 6 0.054545, 1.360000 0.012364, 3.580000 0.032545
>
>

It means that tree one and two have a RF-distance of 6. Just see
http://en.wikipedia.org/wiki/Robinson%E2%80%93Foulds_metric
for the meaning of this. This is the most important number. Though in
RAxML the distance is divided by 2 (in contrast to the wiki-definition).

0.054545 should be the RF distance normalized to a value between
0 and 1. The values after the first comma are (I guess) one-sided
and weighted RF distances. Since they were dropped in the latest RAxML
release, they are probably not particularly useful in day-to-day
phylogenetic analysis.

-Andre

--
Best regards,
Andre

Olaf R.P. Bininda-Emonds

Feb 12, 2012, 3:39:51 PM
to ra...@googlegroups.com
Hi,

Just a quick correction to something that Andre wrote yesterday. Even if RAxML can only calculate it for unrooted trees, the RF-distance is indeed defined for both rooted and unrooted trees, with the normalization being slightly different between the two of them (2n - 4 versus 2n - 6, respectively, where n = number of taxa).
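
The normalization can be checked against the numbers posted in this thread. The snippet below is just an illustrative sketch with made-up function names, using the maxima quoted above (2n - 4 for rooted and 2n - 6 for unrooted trees); it assumes n is large enough for the denominator to be positive.

```python
def max_rf(n_taxa, rooted=False):
    """Normalization constant: 2n - 4 (rooted) or 2n - 6 (unrooted)."""
    return 2 * n_taxa - 4 if rooted else 2 * n_taxa - 6


def normalized_rf(rf, n_taxa, rooted=False):
    """Scale a plain RF distance to the range 0..1."""
    return rf / max_rf(n_taxa, rooted)


# 58 taxa, RF distance 6, as in the output line posted earlier
print(round(normalized_rf(6, 58), 6))  # 0.054545
```

With 58 taxa the unrooted maximum is 110, and 6 / 110 reproduces the 0.054545 that RAxML printed.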

Cheers,

Olaf

-----------------------------------------------------------------------------------------
Prof. Dr. Olaf R.P. Bininda-Emonds
AG Systematik und Evolutionsbiologie
IBU - Fakultät V
Carl von Ossietzky Universität Oldenburg
26111 Oldenburg
Germany

Phone: +49 441 798 3965
Fax: +49 441 798 193965
e-mail: olaf.b...@uni-oldenburg.de
WWW: http://www.molekularesystematik.uni-oldenburg.de/

Alexis

Feb 15, 2012, 1:15:56 PM
to raxml
Thanks Olaf and Andre :-)

The meaning of this line:

0 1: 6 0.054545, 1.360000 0.012364, 3.580000 0.032545

is, as Andre correctly wrote, the RF distance between the first and
second tree in the tree file.

The first two values are the plain RF distance (the number of
bipartitions unique to one of the two trees) and the normalized
(relative) RF distance. The two following pairs of numbers represent
different flavors of the weighted RF distance (WRF), which also uses
the bipartition support values on the respective trees.
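
For post-processing, a line of this output can be split into its fields roughly as follows. This is a sketch only: the field names (rf, relative_rf, wrf1, wrf2) are hypothetical labels chosen here, not names used by RAxML, and the two WRF pairs are optional because later outputs shown in this thread contain only the first two values.

```python
def parse_rf_line(line):
    """Split one '-f r' output line into named fields (names are made up)."""
    pair, values = line.split(":", 1)
    i, j = (int(x) for x in pair.split())
    # Comma-separated groups, each holding an absolute and a relative value.
    groups = [[float(v) for v in g.split()] for g in values.split(",")]
    rf, relative_rf = groups[0]
    result = {"trees": (i, j), "rf": int(rf), "relative_rf": relative_rf}
    # Optional weighted RF pairs, present only in some RAxML versions.
    for name, group in zip(("wrf1", "wrf2"), groups[1:]):
        result[name], result["relative_" + name] = group
    return result


print(parse_rf_line("0 1: 6 0.054545, 1.360000 0.012364, 3.580000 0.032545"))
```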

There are two ways of defining the WRF:

1. Compute the sum of the supports of the unique bipartitions (and the
relative counterpart thereof)
2. Compute the sum of the supports of the unique bipartitions plus the
difference of support values among the shared bipartitions

As far as I remember, Olaf and I couldn't agree on which definition is
more appropriate, so I decided to implement both ;-)
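
To make the two WRF definitions concrete, here is a small sketch. It is not RAxML's code: it assumes each tree is given as a dictionary mapping a bipartition (written as a canonical string) to its support value, and the function name and the example supports are invented for illustration.

```python
def weighted_rf(support_a, support_b):
    """Return (wrf1, wrf2) for two dicts mapping bipartition -> support."""
    unique = set(support_a) ^ set(support_b)   # present in only one tree
    shared = set(support_a) & set(support_b)   # present in both trees
    # Definition 1: sum of the supports of the unique bipartitions.
    wrf1 = sum(support_a.get(b, 0.0) + support_b.get(b, 0.0) for b in unique)
    # Definition 2: definition 1 plus the support differences
    # among the shared bipartitions.
    wrf2 = wrf1 + sum(abs(support_a[b] - support_b[b]) for b in shared)
    return wrf1, wrf2


tree_a = {"AB|CDE": 0.9, "DE|ABC": 0.6}
tree_b = {"AB|CDE": 0.7, "CD|ABE": 0.5}
wrf1, wrf2 = weighted_rf(tree_a, tree_b)
print(round(wrf1, 6), round(wrf2, 6))
```

In this toy example the unique bipartitions contribute 0.6 + 0.5, and the shared bipartition AB|CDE adds a support difference of 0.2 under the second definition.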

Alexis



dilly.desilva

Sep 4, 2013, 7:47:07 AM
to ra...@googlegroups.com
Dear Alexis,

I'm comparing the RF distances between a consensus tree generated from a partitioned analysis and the trees generated from running RAxML on the individual partitions without forcing any topology.
For example, using

raxmlHPC-SSE3 -f r -z trees.txt -m GTRGAMMA -n RFdist


gives output like this:
0 1: 6 0.750000
0 1: 4 0.500000
0 1: 2 0.250000
0 1: 0 0.000000
etc.

(I looped through each partition, hence 0 1 on each line, with 0 representing the consensus tree.)

The number of taxa in my data is very small (n = 7), and there are about 2400 partitions.
I find that only 27% of the partitions have an RF distance of 0 (i.e. the same topology as the consensus tree), 40% have a distance of 2, and so on. I'm trying to understand how the consensus tree is built if the majority rule requires more than half of the data to conform to the bipartitions in the consensus tree, while in this example only 27% are exactly the same. Or does it take into account the 40% of partitions with RF distance 2 as well?

Thanks

Dilrini




I have a question about how the consensus tree is built in a partitioned analysis

Andre J. Aberer

Sep 4, 2013, 8:06:37 AM
to ra...@googlegroups.com
Hi Dilrini,

Let me quickly rephrase what you did to make sure I understood
correctly:

* you inferred a tree for each of the 2400 partitions
* you created the consensus tree of these 2400 trees
* you computed RF distances between each of the 2400 trees and the
consensus tree

And your question is whether trees with an RF distance > 0 are
considered in the consensus tree as well?

Yes, those trees are considered, since the consensus tree is not about
tree equality, but about the frequencies of bipartitions/branches in the
tree set. The majority-rule consensus tree displays only those
bipartitions (branches) that occur in more than 50% of the trees.

So if 60% of your trees had a RF distance of 0 to the consensus tree,
this means that each of the branches of the consensus tree must have a
support of >= 60%. For each unit of absolute RF-distance > 0, the
support in the consensus tree decreases somewhere.
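
A toy example may help here. The sketch below is not RAxML's consensus code: it assumes each tree is represented simply as a set of non-trivial bipartitions written as canonical strings, and the five trees are invented. Both bipartitions of the consensus occur in 60% of the trees, yet only one tree out of five is identical to the consensus.

```python
from collections import Counter


def majority_rule(trees):
    """Return {bipartition: frequency} for bipartitions in > 50% of trees."""
    counts = Counter(split for tree in trees for split in tree)
    return {s: c / len(trees) for s, c in counts.items()
            if c / len(trees) > 0.5}


def rf(tree_a, tree_b):
    """Plain RF distance: bipartitions found in exactly one of the trees."""
    return len(set(tree_a) ^ set(tree_b))


X, Y = "AB|CDE", "DE|ABC"
trees = [{X, Y}, {X, "CD|ABE"}, {X, "CE|ABD"}, {Y, "AC|BDE"}, {Y, "BC|ADE"}]

consensus = majority_rule(trees)
print(consensus)                           # both X and Y with frequency 0.6
print([rf(t, consensus) for t in trees])   # [0, 2, 2, 2, 2]
```

So trees at RF distance 2 still contribute to the support of the branch they share with the consensus tree.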

--
Best regards,
Andre

dilly.desilva

Sep 4, 2013, 8:54:28 AM
to ra...@googlegroups.com, andre....@googlemail.com
Thank you Andre, that answers my question - I was under the impression that tree equality was the determining factor. Apologies for not being clear.

Best

D