AleRax Feature Request - additional majority rule consensus tree info

Joshua Mayoral

Jan 24, 2025, 3:40:12 PM
to GeneRax
Greetings,

Currently, AleRax v1.2.0 produces a "FAM__consensus_50.newick" tree in the "reconciliations/summary" directory. If I understand correctly, for a given gene family this tree is the majority-rule consensus of the gene tree topologies across all sampled reconciliations, which seems useful for downstream analysis. I was wondering if it would be possible/practical to add two features to this consensus tree:

1) Compute the average branch lengths from the gene tree distributions using only those gene trees that match the consensus gene tree topology.

2) Label internal nodes to match the labels of the species tree, so as to indicate where the ancestral genes reside in the species tree framework (perhaps as a separate file, so as not to confuse them with the support values currently used as internal node labels). I realize that this information can be discerned from the .xml reconciliation files and by using visualization software like ThirdKind, but I think it would be helpful to include this information in the consensus tree as well.
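
As a rough illustration of proposal 1 (not AleRax code), the averaging could look something like the sketch below. It assumes the sampled gene trees have already been parsed into dictionaries that map each clade (a frozenset of leaf names) to the length of the branch above it, and it treats a sample as "matching" when it contains every clade of the consensus. The function name and the data layout are hypothetical.

```python
from statistics import mean


def average_branch_lengths(sampled_trees, consensus_clades):
    """Average branch lengths over samples that contain every consensus clade.

    sampled_trees: list of dicts {frozenset(leaf names): branch length}
                   (hypothetical layout, one dict per sampled gene tree).
    consensus_clades: iterable of frozensets, the clades of the consensus tree.
    Returns {clade: mean branch length}, or {} if no sample matches.
    """
    consensus_clades = set(consensus_clades)
    # Keep only the samples whose clade set includes all consensus clades.
    matching = [t for t in sampled_trees if consensus_clades.issubset(t.keys())]
    if not matching:
        return {}
    # Average each consensus branch over the matching samples only.
    return {clade: mean(t[clade] for t in matching) for clade in consensus_clades}
```

Because a 50% majority-rule consensus can contain multifurcations, the subset test is used here instead of strict equality of clade sets; requiring equality would be the stricter reading of "match".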

Please let me know your thoughts, and thank you to all the contributors for developing this neat method! 

Best,
Josh Mayoral

Stefan Flaumberg

Jan 25, 2025, 2:44:50 AM
to GeneRax
Hi Josh,

I can't say anything about the first proposal, but the second one seems a bit algorithmically troublesome to me.
As you correctly mentioned, the consensus gene tree topology is produced from sampled reconciled gene trees. In these reconciled gene trees, internal nodes can be labeled according to the species node where an event took place (and they are labeled this way in the AleRax test version I referred you to in November, please see the all/*.alerec files there). But the reconciled gene trees can have different topologies (which is obvious), and they can also have different reconciliations (scenarios) even when they share the same topology (which is less obvious, but is crucial here).
So to label a consensus tree, we would first have to consider only the reconciled trees whose topologies match the consensus tree, and then somehow (majority rule again?) pick a single scenario among the trees considered.
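
The two steps above can be sketched as follows. This is only an illustration under stated assumptions, not what AleRax does: each sample is assumed to be a pair of a topology (a frozenset of clades) and a scenario (a dict mapping each clade to a species node label), and "majority rule" is taken to mean the most frequent complete scenario. The function name and data layout are hypothetical.

```python
from collections import Counter


def majority_scenario(samples, consensus_topology):
    """Pick the most frequent scenario among samples with the consensus topology.

    samples: list of (topology, scenario) pairs, where topology is a frozenset
             of clades and scenario is a dict {clade: species node label}
             (hypothetical layout).
    Returns (scenario, frequency among matching samples), or (None, 0.0).
    """
    # Step 1: keep only the scenarios whose tree matches the consensus topology.
    matching = [scen for topo, scen in samples if topo == consensus_topology]
    if not matching:
        return None, 0.0
    # Step 2: count identical scenarios; a frozenset of items is hashable.
    counts = Counter(frozenset(scen.items()) for scen in matching)
    best, n = counts.most_common(1)[0]
    return dict(best), n / len(matching)
```

Note that the winning scenario may still have low frequency, and ties are broken by whichever scenario was seen first, which is part of why a single labeling of the consensus tree is hard to justify.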

I guess both of your proposals stem from the need to describe the evolution of a gene family with a single reasonable scenario. For this purpose I've implemented a feature that infers the maximum likelihood reconciliation for each gene family. I was going to propose it once my current PR on GitHub is accepted by the developers.

Best,
Stefan

Joshua Mayoral

Jan 25, 2025, 8:02:56 AM
to GeneRax
Hi Stefan,

Thank you for the quick and thoughtful response.

Indeed, I overlooked that gene trees with the same topology might have different reconciliation scenarios. And you correctly described my goal: obtaining a maximum likelihood reconciliation from AleRax. I'll be awaiting the maximum likelihood proposal over on GitHub and further thoughts from the devs.

Cheers,
Josh 