Gene origin inferred at a non-spine node of the species tree: is this expected behavior?

Deepti Varshney

May 26, 2026, 8:22:52 AM
to GeneRax
Hello Team,

You may remember me from our previous email exchange about the Galaxy integration of AleRax. I have a new question about the tool's behavior.

I am using AleRax (v1.2.0) for gene family evolution analysis in land plants and have a question about interpreting gene origin nodes in the reconciled tree.

Specifically, I am analyzing the YUCCA1 gene family. AleRax infers the origin of the gene at a node that does not lie on the spine of the species tree. I am trying to understand under what conditions this is expected behavior and how such a result should be interpreted biologically.

Our workflow:
- I used IQ-TREE to generate 1,000 bootstrap gene trees from a trimmed alignment of YUCCA1.
- These 1,000 trees were supplied as input to AleRax along with the species tree.

My question is: Is it expected and valid for AleRax to place the origin of a gene at a node outside the spine of the species tree? If so, what biological or algorithmic scenario leads to this, and how should I interpret the tree-network diagram in this context?

I have attached the following data for your reference: the species tree, the AleRax command, the summary output for this family, and the graphical view of the origination scores.

Looking forward to your response!
YUCCA1.zip

Benoit Morel

May 27, 2026, 9:18:24 AM
to Deepti Varshney, GeneRax
Hello Deepti,
It is always hard to tell for sure, but I think this can simply happen when you reach the limits of reconstruction methods and end up with an implausible result.
AleRax and other methods make no assumptions about where a gene family is more likely to originate (e.g. "on the spine of the species tree"). They use statistics to sample scenarios that fit the sampled gene trees (here from IQ-TREE) and that also fit the reconciliation probabilities (a scenario with one transfer is more likely than one with 50). Ideally, they would always find the real scenario, and this scenario should be biologically plausible.
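
As a rough illustration of that weighting, here is a toy Python sketch. It is not AleRax's actual model: the per-event probabilities, the node labels (`root`, `inner_clade`) and the event counts are all made-up assumptions, chosen only to show how a scenario with few events can outweigh one that needs many, whichever node it places the origin on.

```python
import math
from collections import defaultdict

# Hypothetical per-event probabilities (assumed for illustration,
# NOT parameters estimated or used by AleRax).
EVENT_PROB = {"speciation": 0.80, "duplication": 0.10, "loss": 0.07, "transfer": 0.03}

def log_scenario_prob(events):
    """Log-probability of a scenario given as {event_name: count}."""
    return sum(count * math.log(EVENT_PROB[name]) for name, count in events.items())

def origination_support(scenarios):
    """Normalize scenario probabilities and sum them per origin node."""
    logs = [log_scenario_prob(events) for _, events in scenarios]
    top = max(logs)  # subtract the maximum to avoid numerical underflow
    weights = [math.exp(value - top) for value in logs]
    total = sum(weights)
    support = defaultdict(float)
    for (origin, _), weight in zip(scenarios, weights):
        support[origin] += weight / total
    return dict(support)

# Toy scenarios: (origin node label, event counts). All values are invented.
scenarios = [
    ("root", {"speciation": 10, "loss": 6}),         # early origin, many losses
    ("inner_clade", {"speciation": 6, "transfer": 1}),  # late origin, one transfer
    ("root", {"speciation": 10, "transfer": 50}),    # early origin, 50 transfers
]

support = origination_support(scenarios)
for node, value in sorted(support.items(), key=lambda item: -item[1]):
    print(f"{node}: {value:.6f}")
```

In this toy setting the origin on `inner_clade` gets almost all of the support, simply because that scenario needs the fewest unlikely events. Nothing in the scoring knows or cares whether the node lies on the spine.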

But this can easily fail, for instance if the true gene tree is not covered by the gene tree distribution, or if there were so many gene transfers that finding the correct scenario is simply too hard.
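
One way to get a feeling for how well the gene tree distribution covers a given topology is to count how often each split (bipartition) appears among the sampled trees. The sketch below is a minimal pure-Python version under several assumptions: the trees are plain Newick strings without quoted labels, the toy trees and taxon names (`A` to `D`) are invented, and with real data you would read the strings from your own bootstrap tree file, assuming it holds one Newick tree per line.

```python
from collections import Counter

def clades(newick):
    """Return the leaf sets of all internal nodes of one Newick tree."""
    found = []           # leaf sets of completed internal nodes
    stack = []           # one set of leaf names per currently open "("
    token = ""           # characters of the label being read
    after_close = False  # True when the label follows ")" (support value, not a leaf)
    for ch in newick.strip():
        if ch in "(),;":
            name = token.split(":")[0].strip()  # drop the branch length
            if name and not after_close and stack:
                stack[-1].add(name)
            token = ""
            after_close = False
            if ch == "(":
                stack.append(set())
            elif ch == ")":
                done = stack.pop()
                found.append(frozenset(done))
                if stack:
                    stack[-1].update(done)
                after_close = True
        else:
            token += ch
    return found

def splits(newick):
    """Return the non-trivial splits of a tree, independent of its rooting."""
    sets = clades(newick)
    leaves = max(sets, key=len)  # the outermost node holds every leaf
    ref = min(leaves)            # fixed reference leaf to orient each split
    result = set()
    for clade in sets:
        side = leaves - clade if ref in clade else clade
        if len(side) > 1 and len(leaves) - len(side) > 1:
            result.add(frozenset(side))
    return result

def split_frequencies(trees):
    """Fraction of trees in which each split occurs."""
    counts = Counter(s for tree in trees for s in splits(tree))
    return {s: n / len(trees) for s, n in counts.items()}

# Toy distribution of three gene trees (invented for illustration).
trees = ["((A:1,B:1)90:0.5,(C:1,D:1));", "((A,B),C,D);", "((A,C),(B,D));"]
freqs = split_frequencies(trees)

# Splits of a hypothesized topology that the sample never contains.
missing = [s for s in splits("((A,D),(B,C));") if s not in freqs]
print(freqs)
print("splits absent from the distribution:", missing)
```

If a topology you consider plausible has splits with zero or near-zero frequency, no reconciliation method that samples from this distribution can recover it, which is one of the failure modes described above.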

I hope this helps,
Benoit

