Hi again,
thanks for this evaluation! Do you mind sending your alignment and output files to my e-mail?
In general, RAxML-NG can be slower than RAxML for some specific datasets and settings, although those are really rare
individual cases. I observed it just a couple of times among the dozens of diverse datasets I tested. In particular, it can
happen on very small alignments like the one you've tested, since the improved flexibility of NG comes at a (small) constant
cost.
With respect to your results, I have several comments:
- we put limited effort into SSE3 optimization in NG, since nowadays there are rather few machines without AVX support,
and their number will only decrease with time (sorry)
- raxml8-pthreads will always use *at least* 2 threads, even if you specify "-T 1"
- regarding the optimal number of threads: unfortunately, it is very difficult to estimate it accurately given different
data types and models, partitioning, CPU architectures, RAM latency/bandwidth etc. etc. In your case, even
oversubscribing physical CPU cores and using as few as 20 AA sites/thread doesn't seem to result in performance
degradation; however, in many other cases it does, and sometimes very badly. So I added this error/warning to prevent
the wasting of resources.
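
To make the numbers in that warning concrete, here is a minimal sketch of the rule of thumb quoted in the log below (at least 200 alignment patterns per thread). The function names and the even-split assumption are hypothetical and only illustrate the arithmetic; this is not RAxML-NG's actual check.

```python
def patterns_per_thread(n_patterns, n_threads):
    """Return (min, max) patterns per thread, assuming an even split."""
    base, extra = divmod(n_patterns, n_threads)
    return base, base + (1 if extra else 0)


def max_threads_by_rule_of_thumb(n_patterns, min_patterns_per_thread=200):
    """Largest thread count that keeps every thread at or above the
    rule-of-thumb minimum; never less than 1."""
    return max(1, n_patterns // min_patterns_per_thread)


# 199 unique patterns, as in the alignment discussed here
print(patterns_per_thread(199, 4))        # matches "patterns/thread: 49-50"
print(max_threads_by_rule_of_thumb(199))  # the log also suggests 1 thread
```

With 10 threads the same split gives 19-20 patterns per thread, which is the "20 AA sites/thread" case mentioned above.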
There is also another aspect of it: we might want to optimize time-to-solution (your example), or we might want to
optimize throughput/efficiency. According to your results, using 4 threads yields a ~2x speedup, so parallel efficiency is
only 50%. Now imagine you want to analyze multiple datasets, or you want to run multiple tree searches/bootstraps in
parallel. Obviously, in this scenario it will be more efficient (and faster!) to use just a single thread for each search.
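
A rough sketch of that throughput argument, under idealized assumptions: 4 cores, no contention between concurrent runs, and a hypothetical 100 s per single-threaded search. Only the ~2x speedup on 4 threads comes from the results above; the function names and the timing value are made up for illustration.

```python
import math


def parallel_efficiency(speedup, n_threads):
    """Speedup divided by the number of threads used."""
    return speedup / n_threads


def batch_wall_time(n_searches, t_single, n_cores, threads_per_search, speedup):
    """Wall time for a batch of independent searches, assuming searches run
    in 'waves' of as many concurrent jobs as fit on the available cores."""
    concurrent = n_cores // threads_per_search
    waves = math.ceil(n_searches / concurrent)
    return waves * t_single / speedup


print(parallel_efficiency(2.0, 4))  # 0.5, i.e. 50%

# 20 tree searches on 4 cores; t_single = 100 s is a hypothetical value
one_at_a_time = batch_wall_time(20, 100.0, 4, threads_per_search=4, speedup=2.0)
four_at_a_time = batch_wall_time(20, 100.0, 4, threads_per_search=1, speedup=1.0)
print(one_at_a_time, four_at_a_time)  # 1000.0 vs 500.0 seconds
```

So at 50% efficiency, running four single-threaded searches side by side finishes the batch in half the time of running them one after another with 4 threads each.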
Hope this helps!
Thanks,
Alexey
On 21.12.2017 00:07, Sergios-Orestis Kolokotronis wrote:
> A comparison using a few threads on a Mac. I saw in a discussion thread a note from the developers asking for updates on
> comparisons, so here's one. I noticed the searches using NG were taking a bit longer than I would have expected and,
> given the warning on thread usage pasted below, I thought I'd look into it. Of course, this is merely an isolated case.
> Maybe it has to do with the phylogenetic informativeness of the dataset. Trees estimated with v8.2.11 all exhibited low
> support for deep and mid-level nodes.
>
> Dataset: AA, 27 taxa, 251 positions
> Gaps: 2.24 %
> Invariant sites: 24.70 %
> ML search with 20 MP start trees
> Alignment patterns: 199
> Model: LG+Γ4+F
>
> _Run in NG 0.5.1b_
> run mode: ML tree search
> start tree(s): parsimony (20)
> random seed: 1513791424
> tip-inner: ON
> pattern compression: ON
> per-rate scalers: OFF
> site repeats: OFF
> fast spr radius: AUTO
> spr subtree cutoff: 1.000000
> branch lengths: ML estimate (linked)
> SIMD kernels: SSE3
> parallelization: NONE/sequential
>
> raxmlng051b --msa myALN.fasta --model LG+G4+F --tree pars{20} --threads 1
> then --threads 2 and so on.
>
> When I try running 4 threads in NG, it exits with a "Too few patterns per thread" error.
> Data distribution: partitions/thread: 1-1, patterns/thread: 49-50
> WARNING: You are using too many threads (4) for your alignment with 199 unique patterns.
> NOTE: Please consider using 1 threads ('--threads 1' option) for the optimal performance.
> NOTE: As a general rule-of-thumb, please assign at least 200-1000 alignment patterns per thread.
> ERROR: Too few patterns per thread! RAxML-NG will terminate now to avoid wasting resources.
> NOTE: Please reduce the number of threads (see guidelines above).
> NOTE: This check can be disabled with the '--force' option.
> I forced 4 and 10 threads and it worked, yielding a time improvement.
> *So, how are more threads negatively impacting the overall search duration here? Please let me know if I'm missing
> something.*
>
> _Run in v8.2.11 SSE3 Pthreads_
> raxml8211 -T 1 -s myALN.fasta -m PROTGAMMALGF -p 12345 -N 20 -n myALN.rx8.T1
> then -T 2 and so on.
>
> [Results table not preserved. Columns: Alpha; Final Log-Lik; Time per tree search (min-max, s); Total time (s)]