RAXML-NG: reading large tree raises ERROR: ERROR reading tree file (LIBPLL-111): memory exhausted.

Metin Balaban

Oct 26, 2020, 8:31:07 PM
to raxml
Hi raxml-ng team,
I faced an issue similar to the one Dimitri faced in this old thread:
https://groups.google.com/u/1/g/raxml/c/HebqaMcPojE/m/Tleo3XCfAwAJ

I am trying to run raxml-ng on large alignments with a ton of repeated sequences, starting from a single user tree. The starting tree (which contains 23613 sequences), the log file, the input alignment, and the reduced alignment that raxml-ng outputs are in the following Google Drive folder:

https://drive.google.com/drive/folders/150uIFaII54PS58kzeVT8dOVCK3RKr9Vi?usp=sharing

raxml-ng raises the following error:
ERROR: ERROR reading tree file (LIBPLL-111): memory exhausted.
I see the error with any number of threads. How do I fix the issue?

I don't want to manually prune species with identical sequences because raxml-ng does it for me, and doing it myself would complicate my application.

Best Regards,
-Metin



Alexey Kozlov

Oct 26, 2020, 9:24:58 PM
to ra...@googlegroups.com
Hi Metin,

seems like your tree file is malformed: I tried to open it with Dendroscope, DendroPy, and Phangorn
- and all three tools failed.

Apart from this, >96% of your sequences (22708 out of 23613) are exact duplicates. It *really* does
not make any sense to include them in tree inference, since with standard likelihood models as
implemented in e.g. raxml-ng or fasttree, the branching order of identical sequences is arbitrary.
And it will make raxml-ng run MUCH longer.

So I would recommend just using the reduced alignment generated by raxml-ng.
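
To check how many exact duplicates an alignment contains before running an inference, something like the following could be used. This is only a minimal sketch: the function name `group_identical` and the toy alignment are hypothetical, sequences are compared case-insensitively (an assumption), and raxml-ng's own duplicate detection may differ in details such as the handling of gaps or ambiguity codes.

```python
from collections import defaultdict


def group_identical(seqs):
    """Group taxon names by exact sequence.

    Returns a dict mapping the first name seen for each distinct
    sequence (the representative) to the list of its duplicates.
    """
    groups = defaultdict(list)
    for name, seq in seqs.items():
        # Assumption: upper/lower case is not meaningful in the alignment.
        groups[seq.upper()].append(name)
    return {names[0]: names[1:] for names in groups.values()}


# Toy alignment (hypothetical data): taxon name -> aligned sequence.
alignment = {
    "t1": "ACGT",
    "t2": "ACGT",
    "t3": "ACGA",
    "t4": "acgt",
}

reps = group_identical(alignment)
n_dup = sum(len(dups) for dups in reps.values())
print(f"{len(reps)} unique sequences, {n_dup} duplicates")
```

The number of representatives is what the reduced alignment should contain (905 in the case discussed here).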

Best,
Alexey

On 27.10.20 01:31, Metin Balaban wrote:
> Hi raxml-ng team,
> I faced an issue similar to Dimitri faced in this old thread:
> https://groups.google.com/u/1/g/raxml/c/HebqaMcPojE/m/Tleo3XCfAwAJ
>
> I am trying to run raxml-ng on large alignments with a ton of repeated sequences starting from a
> single user tree. The starting tree, which contains 23613 sequences, the log file, input alignment,
> and the reduced alignment raxml-ng outputs is attached in the following google drive.
>
> https://drive.google.com/drive/folders/150uIFaII54PS58kzeVT8dOVCK3RKr9Vi?usp=sharing
>
> raxml-ng raises the following error:
> *ERROR: ERROR reading tree file (LIBPLL-111): memory exhausted.*
> I see the error with any number of threads. How do I fix the issue?
>
> I don't want to manually prune species with identical sequences because raxml-ng does it for me and
> if I do it it'd complicate my application.
>
> Best Regards,
> -Metin

Metin Balaban

Oct 26, 2020, 10:29:57 PM
to raxml
Hi Alexey,

Thanks for the quick response!

TreeSwift and Newick Utilities open the tree file, while FigTree can't. Maybe the tools that fail cannot process the large polytomy in the tree.

Anyway, I didn't know it would make raxml-ng slower. I thought raxml-ng would use the reduced phy file and a reduced newick file internally, which would contain only 905 species in this case. I will follow your recommendation.

-Metin

Alexey Kozlov

Oct 28, 2020, 6:47:32 AM
to ra...@googlegroups.com
Hi Metin,

> Thanks for the quick response!

you are welcome :)

> Treeswift and newick utilities opens the tree file, while FigTree can't. Maybe these tools cannot
> process the large polytomy in the tree.

we figured out the problem, which was caused by the "ladder-like" shape of the tree (and its size).
this has now been fixed in the github version of raxml-ng. thanks for reporting!
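
For readers wondering why tree shape matters here: in a ladder-like (caterpillar) tree, the parenthesis nesting depth of the Newick string grows linearly with the number of taxa, while in a balanced tree it grows only logarithmically. A parser that keeps one stack entry per nesting level can therefore need far more stack space for a ladder-like tree of the same size. The thread does not spell out the exact mechanism inside raxml-ng, so the sketch below only measures nesting depth. The helper names `newick_depth` and `ladder` are hypothetical, and the code assumes labels contain no parentheses.

```python
def newick_depth(newick):
    """Return the maximum parenthesis nesting depth of a Newick string."""
    depth = max_depth = 0
    for ch in newick:
        if ch == "(":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == ")":
            depth -= 1
    return max_depth


def ladder(n):
    """Build a caterpillar tree with n leaves: (((t1,t2),t3),...,tn);"""
    tree = "t1"
    for i in range(2, n + 1):
        tree = f"({tree},t{i})"
    return tree + ";"


balanced_depth = newick_depth("((a,b),(c,d));")  # 4 leaves, balanced
ladder_depth = newick_depth(ladder(1000))        # 1000 leaves, ladder-like
print(balanced_depth, ladder_depth)              # 2 999
```

Running `newick_depth` on a problematic starting tree gives a quick indication of whether extreme nesting might be involved.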

> Anyway, I didn't know it would make raxml-ng slower. I thought raxml-ng would use reduced phy file
> and reduced newick file internally, which would be only 905 species in this case. I will follow your
> recommendation.

yes I understand this could be misleading, so in a future version, we will likely change the default
behavior to remove duplicate sequences during tree search, and re-insert them into the final tree
at the end.
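
Until such a default exists, the re-insertion step can be approximated outside raxml-ng. The following is a rough sketch, not raxml-ng's implementation: it takes the Newick string of a tree inferred on the reduced alignment plus a representative-to-duplicates mapping, and replaces each representative leaf with a zero-length polytomy containing the representative and its duplicates. The function name `reinsert_duplicates`, the mapping, and the example tree are hypothetical, and the code assumes unquoted labels with no whitespace.

```python
import re


def reinsert_duplicates(newick, duplicates):
    """Expand each representative leaf into a zero-length polytomy.

    `duplicates` maps a representative taxon name to the names of the
    sequences identical to it (hypothetical input format).
    """
    def expand(match):
        name = match.group(1)
        dups = duplicates.get(name, [])
        if not dups:
            return name
        members = ",".join(f"{n}:0.0" for n in [name, *dups])
        return f"({members})"

    # Leaf labels directly follow "(" or ","; internal node labels follow
    # ")" and branch lengths follow ":", so neither is matched.
    return re.sub(r"(?<=[(,])([^(),:;]+)", expand, newick)


reduced = "((t1:0.1,t3:0.2):0.05,t5:0.3);"
full = reinsert_duplicates(reduced, {"t1": ["t2", "t4"]})
print(full)
# (((t1:0.0,t2:0.0,t4:0.0):0.1,t3:0.2):0.05,t5:0.3);
```

The original branch length of the representative is kept on the new internal node, so distances between non-identical taxa are unchanged.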

Best,
Alexey