Hi Artem,
Thanks for the files; they helped me identify the problem:
Branch length stealing, like a couple of other RAxML commands, implicitly disables alignment pattern
compression, and with compression disabled RAxML does not work with non-contiguous partitions
(e.g. p1=1-10,16-20 p2=11-15).
As a quick workaround, I'd suggest either:
1. Re-sort your alignment columns so that every partition is a contiguous segment (e.g. p1=1-15 p2=16-20).
You can trick RAxML into doing this for you by, e.g., adding a dummy gap-only sequence to the alignment and running
the "-f c" command.
2. Hack the code by commenting out line 2298 in axml.c:
--- a/axml.c
+++ b/axml.c
@@ -2298,7 +2298,7 @@ static void sitesort(rawdata *rdta, cruncheddata *cdta, tree *tr, analdef *adef)
index[0] = -1;
- if(adef->compressPatterns)
+ // if(adef->compressPatterns)
I'm not sure this solution is 100% bullet-proof, but it seems to work for me (maybe Alexis can comment on this since he
is more familiar with the code).
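For option 1, if you would rather re-sort the columns yourself than rely on the "-f c" trick, the idea can be sketched in a few lines of Python. This is only an illustration, not something taken from RAxML: the function name make_partitions_contiguous and the in-memory data layout (a dict of sequences plus a list of 1-based inclusive ranges) are my own assumptions, and reading/writing the PHYLIP and partition files is left out.

```python
def make_partitions_contiguous(seqs, partitions):
    """Reorder alignment columns so each partition is one contiguous block.

    seqs:       dict mapping taxon name -> aligned sequence string
    partitions: list of (name, [(start, end), ...]) with 1-based inclusive ranges
    Returns (new_seqs, new_partitions); new_partitions is a list of
    (name, start, end) in the reordered, 1-based inclusive coordinates.
    """
    order = []            # 0-based original column indices, in their new order
    new_partitions = []
    for name, ranges in partitions:
        first = len(order) + 1                    # new 1-based start of this partition
        for start, end in ranges:
            order.extend(range(start - 1, end))   # convert to 0-based, end-exclusive
        new_partitions.append((name, first, len(order)))
    new_seqs = {taxon: "".join(seq[i] for i in order) for taxon, seq in seqs.items()}
    return new_seqs, new_partitions


# Toy alignment (hypothetical data) using the partition layout from the example above
seqs = {"t1": "AAAAAAAAAACCCCCGGGGG", "t2": "AAAAAAAAAATTTTTGGGGG"}
partitions = [("p1", [(1, 10), (16, 20)]), ("p2", [(11, 15)])]
new_seqs, new_parts = make_partitions_contiguous(seqs, partitions)
for name, start, end in new_parts:
    print(f"{name} = {start}-{end}")
```

The new partition ranges would then go into the partition file in place of the old ones, alongside the re-sorted alignment.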
Please note that you should use a vectorized and parallelized version of RAxML (e.g. raxmlHPC-PTHREADS-AVX2) on such a
big dataset.
Finally, since your alignment contains very few undetermined characters (2%), branch length stealing is unlikely to
change the original tree.
Hope this helps,
Alexey