Likelihood computation for multiple sites in parallel

Keren Halabi

Oct 2, 2018, 9:52:36 AM
to Bio++ Usage Help Forum
Dear Bio++ team,

I was wondering whether there is an existing option to parallelize the likelihood computation across independent sequence positions, in order to speed it up.

If yes, how can I use it? 
If not, what would be the best approach to implement it, in your opinion?

Many thanks!
Keren

Laurent Guéguen

Oct 4, 2018, 3:12:24 AM
to Bio++ Usage Help Forum
Dear Keren,

No, it is not implemented. Using MPI for this is somewhere on the to-do list, and it would not be
too difficult to do. But it is a matter of time, as usual.

Cheers,
Laurent

Keren Halabi

Nov 1, 2018, 11:45:12 AM
to Bio++ Usage Help Forum
Dear Laurent,

I have incorporated OpenMP into the likelihood computation over sequence sites in my local version.

As a result of the parallelization, the duration of my fitting procedure (when using the optimization method FullD(derivatives=Newton,nstep=10)) is reduced by a factor of approximately 2 when using 4 threads. Specifically, the duration of the likelihood computation alone is reduced by a factor of 4. Thus, I wonder whether there are any additional code segments that could benefit from parallelization.
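
A rough consistency check on these two figures, assuming Amdahl's law applies and that the likelihood computation is the only part that was parallelized (both are assumptions for illustration, not measurements from this thread). If a fraction $p$ of the original runtime is sped up by a factor $s$, the overall speedup $S$ satisfies:

```latex
S = \frac{1}{(1-p) + p/s}, \qquad S = 2,\ s = 4
\;\Longrightarrow\; (1-p) + \frac{p}{4} = \frac{1}{2}
\;\Longrightarrow\; p = \frac{2}{3}.
```

Under these assumptions, about two thirds of the original runtime was spent in the likelihood computation, and the remaining serial third now makes up about two thirds of the shortened runtime, since $(1/3)/(1/2) = 2/3$. That serial part is where further parallelization would pay off.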

Since this change is not likely to be merged effectively into the newlik version, I wonder whether I should share it. If you would like me to do so, please let me know and I will open a pull request.

Many thanks!
Keren

Laurent Guéguen

Nov 5, 2018, 11:07:09 AM
to Bio++ Usage Help Forum
Dear Keren,

This is a great idea.
Where do you apply the parallelization? Do you split the alignment into several parts?

Cheers,
Laurent

Keren Halabi

Nov 6, 2018, 8:57:28 AM
to Bio++ Usage Help Forum
Dear Laurent,

I actually embedded OpenMP in the loops that compute the likelihood over sequence sites, which allows automatic parallelization without having to split the alignment manually.

You can view this implementation in the relevant commit on my local development branch.
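
For readers without access to that commit, the general idea of parallelizing a per-site loop with OpenMP can be sketched as follows. This is a minimal illustration and not the code from the commit: `siteLikelihood` and `logLikelihood` are hypothetical helpers, not Bio++ functions, and the data layout (one vector of root conditional likelihoods per site) is an assumption made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Bio++ function): combines the root conditional
// likelihoods of one site with the root state frequencies.
double siteLikelihood(const std::vector<double>& rootCond,
                      const std::vector<double>& freqs)
{
  assert(rootCond.size() == freqs.size());
  double l = 0.0;
  for (std::size_t s = 0; s < freqs.size(); ++s)
    l += freqs[s] * rootCond[s];
  return l;
}

// Hypothetical helper: total log-likelihood as a sum over independent sites.
// Each iteration reads only its own site, so the loop can be shared among
// threads; the reduction clause merges the per-thread partial sums.
double logLikelihood(const std::vector<std::vector<double>>& rootCondBySite,
                     const std::vector<double>& freqs)
{
  double logL = 0.0;
  const long nSites = static_cast<long>(rootCondBySite.size());
  #pragma omp parallel for reduction(+:logL) schedule(static)
  for (long i = 0; i < nSites; ++i)
    logL += std::log(siteLikelihood(rootCondBySite[i], freqs));
  return logL;
}
```

Compiled without OpenMP support (for example without `-fopenmp` on GCC), the pragma is ignored and the loop runs serially, so the same source serves both cases. Because the order of the floating-point additions depends on the thread count, results may differ in the last digits between runs with different numbers of threads.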

Please let me know if you would like me to open a pull request (I will create a separate branch for the pull request and commit to devel, according to Julien's guidance).

Cheers,
Keren

Laurent Guéguen

Nov 7, 2018, 7:54:49 AM
to Bio++ Usage Help Forum
Dear Keren,

As far as I understand it, the parallelization is at a rather low level, i.e. directly on the likelihood arrays.
I suppose it is much easier to do it that way than to parallelize at a higher level, i.e. on the
alignment?

It is fine with me if you open a pull request. I will let Julien check whether everything is fine for him, since I am
not used to parallelization.

Thanks,
Laurent

Keren Halabi

Nov 8, 2018, 7:32:50 AM
to Bio++ Usage Help Forum
Dear Laurent,

Indeed, OpenMP is both easier to use and allows additional features, such as dynamic threading bounded by a given upper limit on the number of threads.
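
As an illustration of dynamic threading with an upper bound, the sketch below uses the standard OpenMP runtime calls `omp_set_dynamic` and `omp_set_num_threads`. The helper names `configureThreads`, `teamSize` and `sumSiteLogLikelihoods` are hypothetical and are not taken from the pull request; the `_OPENMP` guards are there so that the code also builds without OpenMP.

```cpp
#include <cassert>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: caps the team size at maxThreads and lets the OpenMP
// runtime choose fewer threads if it sees fit (dynamic adjustment).
void configureThreads(int maxThreads)
{
  assert(maxThreads >= 1);
#ifdef _OPENMP
  omp_set_dynamic(1);               // the runtime may shrink the team
  omp_set_num_threads(maxThreads);  // upper bound for later parallel regions
#else
  (void)maxThreads;                 // serial build: nothing to configure
#endif
}

// Hypothetical helper: reports how many threads a parallel region actually gets.
int teamSize()
{
  int n = 1;
#ifdef _OPENMP
  #pragma omp parallel
  {
    #pragma omp single
    n = omp_get_num_threads();      // written by exactly one thread
  }
#endif
  return n;
}

// Hypothetical helper: sums per-site log-likelihoods with the configured team.
double sumSiteLogLikelihoods(const std::vector<double>& siteLogLik)
{
  double total = 0.0;
  const long n = static_cast<long>(siteLogLik.size());
  #pragma omp parallel for reduction(+:total)
  for (long i = 0; i < n; ++i)
    total += siteLogLik[i];
  return total;
}
```

Calling `configureThreads(4)` before the likelihood loops would let the runtime use anywhere between one and four threads. The same bound can usually be set without recompiling through the `OMP_NUM_THREADS` and `OMP_DYNAMIC` environment variables.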

I have opened a pull request adding the OpenMP usage to the devel branch, for your convenience.

Cheers!
Keren