AleRax species tree error

Kiran More

May 30, 2024, 2:04:59 PM
to GeneRax
Hi,

I'm attempting to run AleRax on my own data after successfully following the related tutorial, and I'm getting the following error when the program initializes my species tree:

[00:00:00] Run settings:
Family file: data/test/families.txt
Starting species tree: specified by user: data/test/speciesTree.txt
Output directory: output1
MPI Ranks: 1
Number of reconciled gene trees to sample: 100
Random seed: 123
Reconciliation model: UndatedDTL
Model parametrization:  global to all species and families
Rate optimizer: GRADIENT
Memory savings: OFF
Transfer constraints: transfers to parents are forbidden
Prune species mode is disabled
Gene tree rooting: all gene tree root positions are considered with the same probability
Origination strategy: gene families can originate from each species with the same probability
Species tree search: skipping species tree search
AleRax will exclude gene families covering less than 1 species

[00:00:00] Generating ccp files...
[00:00:00] Filtering families
[00:00:00] Families: 1
[00:00:00] Triming families covering less than 1 species
[00:00:00] Remaining families: 1
[00:00:00] Initializing starting species tree...
alerax: /home/kmore/AleRax/ext/GeneRaxCore/src/IO/RootedNewickParser.cpp:293: corax_rtree_t* build_rtree(RTreeParser*): Assertion `internal_index == p->nodes_number' failed.
[gra-login3:01978] *** Process received signal ***
[gra-login3:01978] Signal: Aborted (6)
[gra-login3:01978] Signal code:  (-6)
[gra-login3:01978] [ 0] /cvmfs/soft.computecanada.ca/gentoo/2023/x86-64-v3/usr/lib64/libc.so.6(+0x38790)[0x2b8ae3f4b790]
[gra-login3:01978] [ 1] /cvmfs/soft.computecanada.ca/gentoo/2023/x86-64-v3/usr/lib64/libc.so.6(+0x867ac)[0x2b8ae3f997ac]
[gra-login3:01978] [ 2] /cvmfs/soft.computecanada.ca/gentoo/2023/x86-64-v3/usr/lib64/libc.so.6(gsignal+0x12)[0x2b8ae3f4b6f2]
[gra-login3:01978] [ 3] /cvmfs/soft.computecanada.ca/gentoo/2023/x86-64-v3/usr/lib64/libc.so.6(abort+0xd3)[0x2b8ae3f354b2]
[gra-login3:01978] [ 4] /cvmfs/soft.computecanada.ca/gentoo/2023/x86-64-v3/usr/lib64/libc.so.6(+0x223d5)[0x2b8ae3f353d5]
[gra-login3:01978] [ 5] /cvmfs/soft.computecanada.ca/gentoo/2023/x86-64-v3/usr/lib64/libc.so.6(+0x31662)[0x2b8ae3f44662]
[gra-login3:01978] [ 6] build/bin/alerax(_Z11build_rtreeP11RTreeParser+0x1b3)[0x583063]
[gra-login3:01978] [ 7] build/bin/alerax(_Z25custom_rtree_parse_newickPKcbP12ParsingError+0x65)[0x5830f5]
[gra-login3:01978] [ 8] build/bin/alerax[0x4fd3dd]
[gra-login3:01978] [ 9] build/bin/alerax(_ZN13PLLRootedTreeC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEb+0x16)[0x571cf6]
[gra-login3:01978] [10] build/bin/alerax(_ZN11SpeciesTreeC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEb+0x12)[0x576212]
[gra-login3:01978] [11] build/bin/alerax(_Z23initStartingSpeciesTreeR12AleArgumentsRSt6vectorI10FamilyInfoSaIS2_EE+0x37c)[0x46d6cc]
[gra-login3:01978] [12] build/bin/alerax(_Z3runR12AleArguments+0x8ad)[0x471fdd]
[gra-login3:01978] [13] build/bin/alerax(_Z11alerax_mainiPPcPv+0x12b)[0x47271b]
[gra-login3:01978] [14] /cvmfs/soft.computecanada.ca/gentoo/2023/x86-64-v3/usr/lib64/libc.so.6(+0x2394a)[0x2b8ae3f3694a]
[gra-login3:01978] [15] /cvmfs/soft.computecanada.ca/gentoo/2023/x86-64-v3/usr/lib64/libc.so.6(__libc_start_main+0x85)[0x2b8ae3f36a05]
[gra-login3:01978] [16] build/bin/alerax(_start+0x21)[0x468aa1]
[gra-login3:01978] *** End of error message ***
Aborted

This is the species tree that is causing the problem: (((Nmar,(Ncur,Nqua)),(Nste,(MAST4C,((Acan,Pinf),(Spar,Tcla)))))),((Aign,((Msph,(Rram,Drot)),((Cfas,Ddis),(Ttra,(Patl,(Cowc,(Aque,(Tadh,(Hsap,Lana))))))))));

Stepan Puhov

May 30, 2024, 5:36:56 PM
to GeneRax
Hi Kiran,

Your input species tree is misformatted: each of the two major clades is wrapped in an extra pair of parentheses, while the outer pair of parentheses that should enclose the whole tree (and define its root) is missing.

Try the following corrected version, which should hopefully work:
(((Nmar,(Ncur,Nqua)),(Nste,(MAST4C,((Acan,Pinf),(Spar,Tcla))))),(Aign,((Msph,(Rram,Drot)),((Cfas,Ddis),(Ttra,(Patl,(Cowc,(Aque,(Tadh,(Hsap,Lana))))))))));
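
For anyone who wants to catch this kind of problem before launching a run, here is a minimal Python sketch of a bracket-structure check. It is not part of AleRax or GeneRax; the function name `diagnose_newick` and the toy trees are made up for illustration. It assumes a plain Newick string without quoted labels or bracketed comments, and it only looks at parentheses and commas, so it does not replace a real parser.

```python
def diagnose_newick(newick: str) -> list[str]:
    """Return a list of structural problems found in a plain Newick string.

    Hypothetical helper: only parentheses and commas are inspected, so
    quoted labels containing these characters are not supported.
    """
    text = newick.strip().rstrip(";")
    problems = []
    child_counts = []  # one entry per open "(": number of children seen so far
    for col, ch in enumerate(text, start=1):
        if ch == "(":
            child_counts.append(1)
        elif ch == ",":
            if child_counts:
                child_counts[-1] += 1
            else:
                # A comma at depth 0 means nothing encloses the whole tree.
                problems.append(f"column {col}: comma outside all parentheses (outer pair missing?)")
        elif ch == ")":
            if not child_counts:
                problems.append(f"column {col}: ')' without a matching '('")
            elif child_counts.pop() == 1:
                # A group with one child is a redundant wrapper around it.
                problems.append(f"column {col}: group has a single child (redundant parentheses?)")
    if child_counts:
        problems.append(f"{len(child_counts)} '(' never closed")
    return problems


# Toy trees with the same shape as the problem described above.
broken = "((A,B)),((C,D));"   # each clade double-wrapped, no outer pair
fixed = "((A,B),(C,D));"
for name, tree in [("broken", broken), ("fixed", fixed)]:
    print(name, diagnose_newick(tree) or "no structural problems found")
```

Running the same function on the contents of your own species tree file should flag redundant wrappers and a missing outer pair in the same way.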

A more tentative comment:
The taxon sampling is quite sparse (if I interpret the species names correctly), which can reduce the accuracy of gene history inference. Long branches can bias phylogenetic likelihood estimation, while the lack of genes from more closely related taxa can lead to underestimation of duplications and losses.

Best,
Stepan

Benoit Morel

Jun 6, 2024, 1:32:31 PM
to Stepan Puhov, GeneRax
Thank you for your help Stepan :-)

Kiran More

Jun 12, 2024, 12:35:35 AM
to GeneRax
Thanks, Stepan. Your reformatted tree seems to have solved that issue. Regarding sampling: this is a toy example to make sure I understood how to use the program first, which apparently I needed!

Cheers,
Kiran