GeneRax on 100 gene trees in NEXUS format

jigyasa arora

Jul 20, 2021, 6:44:01 AM
to GeneRax
Thank you again for the amazing software and the easy-to-follow wiki!
I am interested in running GeneRax on a NEXUS gene-tree file that contains 100 trees. Since GeneRax accepts Newick format, I was wondering whether it's possible to use a NEXUS file as the gene-tree input. Or would you suggest creating a consensus gene tree?

I would like to obtain a sort of "average" number of DTL (duplication, transfer, loss) events that accounts for the variation in gene-tree topology.

Benoit Morel

Jul 28, 2021, 7:28:32 AM
to GeneRax
My answer assumes that the 100 gene trees were inferred from the same gene alignment. I do not recommend creating a consensus tree: GeneRax only accepts strictly binary trees, and a consensus tree usually contains multifurcations.
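To see whether a given Newick tree is strictly binary, a small check along the following lines could help. This is only a sketch, not part of GeneRax: the function names are made up for illustration, it assumes plain Newick without quoted labels or bracketed comments, and it assumes that a top-level node with either two children (rooted) or three children (unrooted) counts as binary.

```python
def child_counts(newick):
    """Return the number of children of each internal node.

    Nodes are listed in the order their closing parenthesis appears,
    so the outermost (top-level) node comes last.
    Assumes no quoted labels and no [comments] in the string.
    """
    counts = []
    stack = []
    for ch in newick:
        if ch == "(":
            stack.append(1)      # a new internal node has at least one child
        elif ch == "," and stack:
            stack[-1] += 1       # each comma adds one sibling
        elif ch == ")":
            counts.append(stack.pop())
    return counts


def is_strictly_binary(newick):
    """True if every internal node is bifurcating.

    Assumption: the top-level node may have 2 children (rooted tree)
    or 3 children (unrooted tree).
    """
    counts = child_counts(newick)
    if not counts:
        return False
    return all(c == 2 for c in counts[:-1]) and counts[-1] in (2, 3)
```

For example, `is_strictly_binary("((A,B),C,D);")` holds under these assumptions, while a polytomy such as `"((A,B,C),D);"` is rejected.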

One solution is to export the 100 trees into 100 Newick files and run GeneRax on each of them. If you do this, there are two possibilities:
- You want GeneRax to correct the gene tree topologies, in which case each gene tree will be the "starting tree" of the GeneRax tree search. GeneRax might always find the same "final tree" from those 100 starting trees, in which case you won't really get an average over the various topologies (but it's hard to tell without trying). However, the final trees should be more realistic than your current topologies, if you got them from an alignment.
- You want to reconcile those gene tree topologies without correcting them. The problem with this approach is that the number of DTL events tends to be overestimated when reconciling inaccurate gene trees, so the average will also be overestimated.
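The export step could be sketched in Python as below. This is a minimal, standard-library-only illustration rather than an official converter: the function names and the `tree_001.newick` naming scheme are made up, and it assumes one `tree ... = ...;` statement per line, unquoted tree names and taxon labels, and an optional `translate` block as written by tools such as MrBayes. Bracketed comments such as `[&U]` are dropped.

```python
import re
from pathlib import Path

# Matches a NEXUS tree statement and captures everything after "=" up to ";".
TREE_RE = re.compile(r"^\s*tree\s+\S+\s*=\s*(.+;)\s*$", re.IGNORECASE)


def nexus_to_newick_strings(nexus_text):
    """Return one Newick string per 'tree' statement found in the NEXUS text."""
    translate = {}
    in_translate = False
    newicks = []
    for line in nexus_text.splitlines():
        stripped = line.strip()
        if stripped.lower().startswith("translate"):
            in_translate = True
            stripped = stripped[len("translate"):].strip()
        if in_translate:
            # Entries look like "1 Taxon_name," and the block ends with ";".
            for entry in stripped.rstrip(";").split(","):
                parts = entry.split()
                if len(parts) == 2:
                    translate[parts[0]] = parts[1]
            if stripped.endswith(";"):
                in_translate = False
            continue
        match = TREE_RE.match(line)
        if match:
            # Remove [comments], then map translated leaf labels back to names.
            newick = re.sub(r"\[[^\]]*\]", "", match.group(1)).strip()
            newick = re.sub(
                r"(?<=[(,])([^(),:;\s]+)",
                lambda m: translate.get(m.group(1), m.group(1)),
                newick,
            )
            newicks.append(newick)
    return newicks


def write_newick_files(nexus_path, out_dir):
    """Write each tree of a NEXUS file to its own Newick file; return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    trees = nexus_to_newick_strings(Path(nexus_path).read_text())
    for i, newick in enumerate(trees, start=1):
        path = out / f"tree_{i:03d}.newick"
        path.write_text(newick + "\n")
        paths.append(path)
    return paths
```

Calling `write_newick_files("genetrees.nex", "newick_trees")` (both names are placeholders) would then give one file per tree, each of which can be used as the gene tree in a separate GeneRax run.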

A second solution is to use the tool ALE, developed by another team. ALE takes as input a rooted species tree and a distribution of n unrooted gene trees, and samples m reconciled rooted gene trees (you can set the number of samples m). It implements two reconciliation models, dated and undated; the undated one corresponds to the model implemented in GeneRax. If you decide to use ALE, an even better approach is to generate the n input gene trees with a Bayesian tool such as MrBayes (maybe this is how you got the NEXUS file?).

My feeling is that ALE best fits this use case. It would be interesting to compare both approaches (ALE and GeneRax with gene tree correction): they should (in theory!!) converge to the same results.


jigyasa arora

Jul 28, 2021, 10:10:09 PM
to GeneRax
I did exactly as you suggested: I exported the 100 trees as separate nwk files and ran GeneRax on each of them.
It's interesting that you mentioned the average might be an overestimate. I ended up finding that taking an average would not be the best way to consolidate the results. I will also compare ALEml results with the GeneRax results. Thanks for the advice!
