Problems with parallelization


Николай Джеус

Jul 1, 2021, 1:07:44 PM
to GeneRax
Hi Benoit,

I am having problems running GeneRax with mpiexec. After startup, each line printed to the console is repeated several times, and the process ends with a LibpllException error.

""""
[00:00:00] [Initialization] Initial optimization of the starting random gene trees
[00:00:00] [Initialization] All the families will first be optimized with sequences only

[00:00:00] [Initialization] Initial optimization of the starting random gene trees
[00:00:00] [Initialization] All the families will first be optimized with sequences only

[00:00:00] [Initialization] Initial optimization of the starting random gene trees
[00:00:00] [Initialization] All the families will first be optimized with sequences only

[00:00:00] [Initialization] Initial optimization of the starting random gene trees
[00:00:00] [Initialization] All the families will first be optimized with sequences only
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 0 on node tux exited on signal 6 (Aborted).
--------------------------------------------------------------------------

terminate called after throwing an instance of 'LibpllException'
what(): Error while reading tree (file is empty) from file: resultsWithError/startingTrees/OG0004469.newick
terminate called after throwing an instance of 'LibpllException'
""""
When I run GeneRax in single-threaded mode, there are no problems.

Have you got any idea what could go wrong?
I have attached an archive with the data and the GeneRax output, in case you need them:
https://drive.google.com/file/d/1Pb6Z5EVP1Bj33yI2ws08UtTWFbrLQ9BK/view?usp=sharing

Thanks a lot!
Nick

Benoit Morel

Jul 2, 2021, 4:50:40 AM
to Николай Джеус, GeneRax
Hi Nick,

GeneRax fails to map the genes to the species. You should either provide a mapping file for each gene family, or prefix the gene names with the species names followed by an underscore (but this only works if the species name does not contain any underscore, which is not the case in your dataset, so I would rather go for mapping files). More information here: https://github.com/BenoitMorel/GeneRax/wiki/Gene-to-species-mapping
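
To see why the prefix convention breaks down when species names contain underscores, here is a minimal Python sketch. It assumes the species is read as everything before the first underscore of the gene name; that is an illustration of the rule described above, not GeneRax's actual code, and `species_from_prefix` and `prefix_mapping_is_safe` are hypothetical helper names.

```python
from typing import Iterable


def species_from_prefix(gene_name: str) -> str:
    """Return the text before the first underscore of a gene name.

    Assumption: the species prefix ends at the FIRST underscore.
    This mimics the naming convention only; it is not GeneRax code.
    """
    prefix, separator, _ = gene_name.partition("_")
    if not separator:
        raise ValueError(f"no underscore in gene name: {gene_name!r}")
    return prefix


def prefix_mapping_is_safe(species_names: Iterable[str]) -> bool:
    """True if no species name contains an underscore."""
    return all("_" not in name for name in species_names)


if __name__ == "__main__":
    # The species "Homo_sapiens" is cut short at its own underscore.
    print(species_from_prefix("Homo_sapiens_gene1"))
    print(prefix_mapping_is_safe(["Homo_sapiens", "mouse"]))
```

With a species called `Homo_sapiens`, the gene `Homo_sapiens_gene1` would be attributed to `Homo`, which is why mapping files are the safer choice for such a dataset.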

I don't know why GeneRax doesn't detect the problem. Ideally, it should exit from the very start with a proper error message. I will try to fix this for the next release. Note that the sequential run (without mpiexec) does not crash, but does not seem to do anything, because the likelihood scores do not change from one step to another.

Also, thanks for having taken the time to provide all the data + a readme. This really helps us developers to quickly reproduce and identify the issues, and to fix them :-)

Let me know if this helps,
Benoit



Николай Джеус

Jul 5, 2021, 9:40:19 AM
to GeneRax
Hi Benoit,

I have created the mapping files and referenced them in the families file. I have also removed all "_" characters from the species and gene names (just to be on the safe side). The single-threaded results now look more plausible, but the parallelization problem remains. Can you guess what might have gone wrong? Should the mapping file names carry a specific extension? (At the moment the file names have no extension, but their content is written in the TreeRecs format.)

P.S. My data: https://drive.google.com/file/d/1ubFYCx74PWtXF9YfXlcRoofoVIDrQm6I/view?usp=sharing

Many thanks,
Nick



Benoit Morel

Jul 5, 2021, 4:04:02 PM
to Николай Джеус, GeneRax

Dear Nick,

I think that your data is now fine (it works on my machine, with and without MPI).
However, there could be an issue with your installation. I think that the compiler did not find MPI when GeneRax was installed, which would explain why your log lines are printed 4 times and why the logs report that GeneRax is running without MPI.

Could you please try to reinstall, and send me all the logs coming from the installation script?
In the git repository, remove the build directory and then run the install script again.
(If you installed with bioconda, let me know; the reinstallation procedure is different.)
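
A quick way to spot this symptom in a log is to count how often the same message is repeated. The sketch below is a hypothetical helper, not part of GeneRax: it strips a leading `[hh:mm:ss]` timestamp (an assumption based on the log format quoted earlier in this thread) and returns the highest repeat count.

```python
import re
from collections import Counter

# Assumption: log lines start with a "[hh:mm:ss]" timestamp, as in the
# excerpt quoted in this thread.
_TIMESTAMP = re.compile(r"^\[\d{2}:\d{2}:\d{2}\]\s*")


def max_repeat_count(log_text: str) -> int:
    """Return how many times the most frequent log message occurs."""
    messages = (_TIMESTAMP.sub("", line.strip()) for line in log_text.splitlines())
    counts = Counter(message for message in messages if message)
    return max(counts.values(), default=0)


if __name__ == "__main__":
    sample = "[00:00:00] [Initialization] Initial optimization\n" * 4
    print(max_repeat_count(sample))
```

A count equal to the number of ranks passed to mpiexec suggests that several independent copies of the program are running instead of one MPI job, although a log can also repeat a line for legitimate reasons.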

Best,
Benoit

Николай Джеус

Jul 7, 2021, 5:52:13 AM
to GeneRax
Dear Benoit,
I guess it was the MPI version.
I had tried reinstalling GeneRax several times, but it did not lead to any improvement. This time, after reading your message, I first installed a newer version of MPI into the conda environment (the old one was 2.1.1, the new one is 3.4.2) and then reinstalled. Now everything works.

Since you asked me to send the installation logs, I ran a small experiment: I installed and ran GeneRax with both the old and the new MPI version. The run with the new version gave no errors. The run with the old version gave an error, but a different, non-critical one (:D). An archive with the installation and run results is here: https://drive.google.com/file/d/1aET6Ipa3dn9gX6Vojh2Tk9BLq6KzlpL_/view?usp=sharing

Thank you for your help!
Nick


Benoit Morel

Jul 7, 2021, 6:02:40 AM
to Николай Джеус, GeneRax
Dear Nick,

Thanks for the feedback, I am glad it worked :-)

Best,
Benoit
