DTL rates in EVAL mode

Giacomo Mutti

Nov 9, 2021, 5:22:51 AM
to GeneRax
Dear Benoit,

I am forced to use the EVAL mode of GeneRax due to computational constraints. However, it is not completely clear to me how this mode works. In the wiki it says: "EVAL does not optimize the tree topology, and just evaluates the likelihood, the DTL rates and the reconciliation of the starting gene trees."

I ran GeneRax (v2.0.2 from GitHub) with the --per-family option activated to get DTL rates for each gene tree; however, with EVAL nothing gets saved to the results folder. Is this because EVAL just takes the default starting DTL parameters? But then why does it say that the DTL rates are evaluated? If so, what are the starting DTL rates?

Thanks again and have a nice day!
Giacomo Mutti

Benoit Morel

Nov 9, 2021, 2:48:53 PM
to GeneRax
Dear Giacamo,

I think this is unintended behavior that I haven't found the time to fix yet. I finally have more time to invest in GeneRax, and I hope I can fix this problem this week. I'll keep you posted!
Unrelated to your issue: do you think you could share your data with me? It would be interesting to have a "challenging" dataset to try potential code optimizations.

Best,
Benoit

Benoit Morel

Nov 10, 2021, 4:21:17 AM
to GeneRax
Dear Giacomo, (sorry for misspelling your name!)

I had a look at the code. The EVAL mode does not do what it's supposed to do. But there is an undocumented mode (RECONCILE instead of EVAL) that does. Here is the current state:
- EVAL does not optimize anything. It uses the default DTL rates (0.2 for each) and only computes the reconciliations.
- RECONCILE does not optimize the gene trees, but it does optimize the DTL rates. If you use --per-family-rates, the per-family rates and reconciliation likelihood scores are written to the stats files.
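
For anyone scripting runs in both modes, here is a minimal Python sketch that only assembles the command line, so the mode can be switched in one place. The mode names and --per-family-rates come from this thread; the other flag names (--families, --species-tree, --rec-model, --strategy, --prefix), the UndatedDTL model name, and all file names are assumptions to check against `generax --help` for your version. The helper build_generax_command is hypothetical and not part of GeneRax.

```python
from typing import List

# Mode names discussed in this thread, plus the usual tree search (SPR, an assumption).
KNOWN_STRATEGIES = {"SPR", "EVAL", "RECONCILE"}


def build_generax_command(
    families: str,
    species_tree: str,
    prefix: str,
    strategy: str = "EVAL",
    per_family_rates: bool = True,
    rec_model: str = "UndatedDTL",  # assumed model name, check your version
) -> List[str]:
    """Return the argument list for one GeneRax run (nothing is executed)."""
    if strategy not in KNOWN_STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    cmd = [
        "generax",
        "--families", families,          # assumed flag name
        "--species-tree", species_tree,  # assumed flag name
        "--rec-model", rec_model,        # assumed flag name
        "--strategy", strategy,          # assumed flag name
        "--prefix", prefix,              # assumed flag name
    ]
    if per_family_rates:
        # Flag quoted in this thread: one set of DTL rates per gene family.
        cmd.append("--per-family-rates")
    return cmd


if __name__ == "__main__":
    # Hypothetical input and output names, for illustration only.
    for mode in ("EVAL", "RECONCILE"):
        print(" ".join(build_generax_command(
            "families.txt", "speciesTree.newick", f"out_{mode}", strategy=mode)))
```

The printed lines can be pasted into a shell or passed to subprocess.run once the flags have been verified.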

In both modes, the total reconciliation likelihood in the logs is wrong (it always prints 0). I have fixed this in the latest commit, but the fix is only available from GitHub (not from BioConda).

Let me know if that's enough for your purpose. I hope this will be fast enough, because the RECONCILE mode that estimates the DTL rates should be slower than the EVAL mode that only computes the reconciliations...

I plan to rewrite this poorly written part of the code from scratch and to update the documentation in the next few days. I am sorry for the mess...

One last observation: if your gene trees were inferred from the sequences only, and not corrected with GeneRax (or with any other species-tree-aware method), the DTL rates are very likely to be substantially overestimated. I have often observed that GeneRax and all existing reconciliation software tend to compensate for gene tree reconstruction errors with transfers or DL events.

Best,
Benoit

Benoit Morel

Nov 10, 2021, 2:28:45 PM
to GeneRax
FYI, both EVAL and RECONCILE modes now have the same behavior (the one described in the wiki). This change is available in the GitHub repository, but not in BioConda yet. To update your local copy, you can run ./gitpull.sh and ./install.sh from your repository.

I won't remove the RECONCILE mode, but once you have the fix I recommend using the EVAL mode, to be consistent with the documentation.

Giacomo Mutti

Nov 11, 2021, 7:00:39 AM
to GeneRax
Dear Benoit, thanks for your very quick reaction! 

This is good to know. I'll try to rerun the analysis with this update and let you know if it's fast enough. I tested it on a small amount of data and it now prints the DTL rates per gene and the reconciliation likelihood as it should. I noticed that the speciesEventCounts and the eventCounts file are exactly the same as the ones I found with the old EVAL mode, would this be normal? (The xml, nhx, orthogroups and transfer files are different, though.)
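
In case it helps anyone reproduce this kind of check, below is a rough Python sketch of comparing two results folders file by file using content hashes. It assumes nothing about the GeneRax output layout: it just walks both directories and labels each relative path as identical, different, or present on one side only. The function names and the directory names in the usage lines are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict, Set


def file_digest(path: Path) -> str:
    """SHA-256 of the file content (reads the whole file into memory)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def list_files(root: Path) -> Set[str]:
    """Relative POSIX-style paths of all regular files under root."""
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def compare_runs(old_dir: str, new_dir: str) -> Dict[str, str]:
    """Map each relative path to 'identical', 'different', 'only in old' or 'only in new'."""
    old_root, new_root = Path(old_dir), Path(new_dir)
    old_files, new_files = list_files(old_root), list_files(new_root)
    report: Dict[str, str] = {}
    for rel in sorted(old_files | new_files):
        if rel not in new_files:
            report[rel] = "only in old"
        elif rel not in old_files:
            report[rel] = "only in new"
        elif file_digest(old_root / rel) == file_digest(new_root / rel):
            report[rel] = "identical"
        else:
            report[rel] = "different"
    return report


if __name__ == "__main__":
    # Hypothetical folder names for the run before and after the update.
    for rel, status in compare_runs("results_old_eval", "results_new_eval").items():
        print(f"{status:12s} {rel}")
```

Byte-identical files only show that the written output did not change between the two runs; they do not say which of the two runs is correct.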

Yes, indeed: part of my conclusion is that gene tree errors will very likely lead to overestimated DTL events (I also tested ecceTERA), so what you told me makes perfect sense.

Thanks again!
Giacomo Mutti

Benoit Morel

Nov 11, 2021, 9:01:46 AM
to GeneRax
Hi Giacomo,


"I noticed that the speciesEventCounts and the eventCounts file are exactly the same as the ones I found with the old EVAL mode, would this be normal?"

This is weird, indeed. I plan to rewrite from scratch this part of the code because it has become a mess...

Benoit