RAxML 8.2.9 - EPA (tree format problem?)

Alice Lévesque

Apr 11, 2017, 5:15:16 PM
to raxml
Hello,

I want to use RAxML 8.2.9 to place short sequences (OTUs translated into amino acids) into a reference tree.
Here is the command line I'm using :

raxmlHPC-PTHREADS-SSE3 -f v -m PROTGAMMALG -s  Mafft_align_g23.phy -t Tree_gp23_refseq.newick -n EPA_final -T 2

I have my alignment in PHYLIP format (126 sequences) and my reference tree in NEWICK format (55 sequences).
I previously tested the best substitution model to use, and I got LG. 

However, when I run that command, it gives me a FAIL message after 1 sec. 
I am running RAxML on a server by submitting the job. So the output of "the job" is :

raxmlHPC-PTHREADS-SSE3: treeIO.c:1413: treeReadLen: Assertion `0' failed.
/var/spool/slurm/slurmd/job11271/slurm_script: line 13:  3462 Aborted                 raxmlHPC-PTHREADS-SSE3 -f v -m PROTGAMMALG -s Mafft_align_g23.phy -t Tree_gp23_refseq.newick -n EPA_final -T 2

RAxML also creates 2 files in my output directory: the *.reduced file and the *.EPA_final file, which contains the summary of what RAxML did (attached to this message). From this last file, I understand that I had multiple identical sequences, but that should not cause the job to fail, right? Or is my tree the problem (treeReadLen)?

I don't know what to try next. I thought that my command line was alright... Sorry, I am just a beginner with RAxML.

Thanks for your help!

Alice Levesque
Master student Microbiology 
RAxML_summ.txt

Alexandros Stamatakis

Apr 12, 2017, 12:11:15 AM
to ra...@googlegroups.com
There seems to be a problem with your tree format. Does the tree
visualize correctly?

If it does, please send me the tree file so that I can check.

Alexis

--
Alexandros (Alexis) Stamatakis

Research Group Leader, Heidelberg Institute for Theoretical Studies
Full Professor, Dept. of Informatics, Karlsruhe Institute of Technology
Adjunct Professor, Dept. of Ecology and Evolutionary Biology, University
of Arizona at Tucson

www.exelixis-lab.org

Alice Lévesque

Apr 12, 2017, 9:14:27 AM
to raxml
Hi Alexis,

Thanks for your quick reply!

I generated my tree with Geneious v8.1.7, and then exported it in NEWICK format. I can visualize the tree correctly in MEGA7.

I will send you the file at your personal e-mail address.

Thanks a lot!

Alice

Alexandros Stamatakis

Apr 13, 2017, 12:32:59 AM
to ra...@googlegroups.com
can you send me the alignment file as well please?

it might be (can't test without the alignment) that RAxML can't handle
taxon names of this form:

'YP_009037594.1 (2)'

try replacing these, e.g., by:

YP_009037594.1_2
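
A minimal sketch of that renaming in Python, assuming the labels follow the quoted pattern above (a name without spaces, then a space and a number in parentheses, optionally wrapped in single quotes). The function name and the regular expression are illustrative and not part of RAxML. The same renaming would have to be applied to both the tree file and the alignment so that the taxon names still match.

```python
import re

# Hypothetical pattern for labels such as 'YP_009037594.1 (2)':
# optional quote, a name without spaces or Newick punctuation,
# a space, "(N)", optional closing quote. Adjust to your own labels.
LABEL = re.compile(r"'?([^\s'(),:;]+) \((\d+)\)'?")


def sanitize_labels(text):
    """Rewrite labels like 'YP_009037594.1 (2)' to YP_009037594.1_2."""
    return LABEL.sub(r"\1_\2", text)


# Made-up Newick string, only to show the effect of the substitution.
newick = "('YP_009037594.1 (2)':0.1,(A:0.2,B:0.3):0.05,C:0.4);"
print(sanitize_labels(newick))
```

Labels that do not match the pattern (here A, B and C) are left untouched.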


Alexis

Alice Lévesque

Apr 13, 2017, 11:10:48 AM
to raxml
I just tried to change taxon names as you suggested. Now, I have a new error message (attached here).

I guess this is the confirmation that something is wrong with my tree format... How could I fix it? Manually seems a bit excessive... Is there an easier way to change all the problematic "," into ")" ?

Thanks a lot!

Alice
RAxML_error.txt

Lucas Czech

Apr 13, 2017, 11:16:02 AM
to ra...@googlegroups.com

Hi Alice,

as the error message states, RAxML is expecting to read a strictly bifurcating tree. Could you check whether this is the case, for example using Geneious or some other tree viewer?

If you just change the problematic "," into ")", you are changing the tree topology, which is probably not what you want. One way to resolve this is to infer a new tree (for example, using RAxML). How did you obtain your tree in the first place?

Cheers
Lucas

Alice Lévesque

Apr 13, 2017, 11:54:06 AM
to raxml
Hi Lucas,

I generated that Neighbour-joining tree using Geneious, with an outgroup and 1000 bootstrap replicates.
And sorry, I'm a bit new to bioinformatics vocabulary, but what is a bifurcating tree? I have attached to this message the tree as visualized in MEGA7. Is this how it is supposed to look? I thought it looked okay...

Thanks!

Alice
Tree_MEGA7.PNG

Lucas Czech

Apr 13, 2017, 12:02:24 PM
to ra...@googlegroups.com

Hi Alice,

thanks, there is the issue: the tree is indeed not bifurcating. "Bifurcating" means that each inner node splits into exactly two child nodes (see also https://en.wikipedia.org/wiki/Phylogenetic_tree#Bifurcating_tree). In other words, your tree contains polytomies. Maximum likelihood methods (such as the RAxML EPA) usually require bifurcating trees (no polytomies). So, in order to run EPA, you first need to infer a bifurcating tree.
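
As a rough way to check a Newick file for polytomies without a tree viewer, here is a minimal Python sketch. It only counts the children of each parenthesised group. It assumes that labels contain no parentheses, commas or quotes, and that the tree is written unrooted, so that three children are accepted at the top level. The function name is made up for illustration and this is not how RAxML itself validates trees.

```python
from typing import List


def find_polytomies(newick: str) -> List[int]:
    """Return the child counts of all inner nodes with too many children.

    Simplifying assumption: labels contain no parentheses, commas or quotes.
    """
    stack = []            # one child counter per currently open "("
    multifurcations = []
    for ch in newick:
        if ch == "(":
            stack.append(1)      # an open group has at least one child
        elif ch == ",":
            stack[-1] += 1       # each comma adds one sibling
        elif ch == ")":
            children = stack.pop()
            is_top_level = not stack
            # Assumption: an unrooted tree has three children at the top level.
            limit = 3 if is_top_level else 2
            if children > limit:
                multifurcations.append(children)
    return multifurcations


# Made-up examples: the first is bifurcating, the second has one polytomy.
print(find_polytomies("((A,B),(C,D),E);"))
print(find_polytomies("((A,B,C),D,E);"))
```

An empty list means that no group exceeds the allowed number of children; any other result lists the sizes of the offending groups.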

Hope that helps, so long
Lucas

Alice Lévesque

Apr 13, 2017, 2:07:48 PM
to raxml
That makes sense. I will try to produce a bifurcating tree now. 
Will let you know if it does not work.

Thanks a lot for your help!

Alice

Alexandros Stamatakis

Apr 13, 2017, 4:53:38 PM
to ra...@googlegroups.com
dear alice,

why don't you just infer the reference tree (the one that needs to be
bifurcating) with RAxML? that one can then be read directly by EPA ...

alexis

Alice Lévesque

Apr 18, 2017, 10:24:54 AM
to raxml
Hi Alexis,

Thanks for the tips. 
Which function should I select to infer my tree? (-f ?) I read the manual but I did not identify the right algorithm to use...

Best regards,

Alice Lévesque

Alexandros Stamatakis

Apr 18, 2017, 2:14:16 PM
to ra...@googlegroups.com
It depends how thoroughly you want to do it, see, for instance:

RAxML manual at
http://sco.h-its.org/exelixis/resource/download/NewManual.pdf especially
pages 53 and following.

There's also a web-based step by step tutorial here:

http://sco.h-its.org/exelixis/web/software/raxml/hands_on.html

There are also a couple of easy to use web-servers available for RAxML:

http://sco.h-its.org/exelixis/web/software/raxml/index.html#web

Alexis

Alice Lévesque

Apr 19, 2017, 11:08:54 AM
to raxml
Hi Alexis,

Thanks for the tips. I did solve the problem for one of my two EPA trees (I have OTUs from the amplification of 2 viral genes).

I generated bifurcating trees using the -f a option and used the RAxML_bipartitions.XX tree for the EPA (-f v). It worked for one of the genes. But for the other, even though the tree seems bifurcating to me, I still get the same error message as before (from my server):
 
raxmlHPC-PTHREADS-SSE3: treeIO.c:1409: treeReadLen: Assertion `0' failed.
/var/spool/slurm/slurmd/job11389/slurm_script: line 13: 44124 Aborted                 raxmlHPC-PTHREADS-SSE3 -f v -m PROTGAMMAILG -s Mafft_align_chlvd_aa.phy -t RAxML_bipartitions.Refseq_tree_chlvd -n EPA_final -T 2

I attached to this message the summary of what RAxML did and a screenshot of my bifurcating tree.
So apparently there is another problem with this tree...

What do you think?

Thanks a lot!

Alice Levesque
Tree.PNG
RAxML_summ.txt

Alexandros Stamatakis

Apr 19, 2017, 5:28:00 PM
to ra...@googlegroups.com
you should use the RAxML_bestTree. file as input, this must work ... I hope

alexis

Alice Lévesque

Apr 20, 2017, 11:06:26 AM
to raxml
Thanks a lot! Finally, it worked! I have my 2 EPA trees now.
Just one last question: the RAxML_labelledTree_XX file contains values on tree nodes. As I read in another message on the forum, these values are not bootstrap values but likelihood weights... What is the difference? Are they as reliable as bootstrap support? And I guess that, for the weights, the higher the value, the better?

Thanks again
Best regards,

Alice

Lucas Czech

Apr 20, 2017, 11:56:32 AM
to ra...@googlegroups.com

Hi Alice,

that sounds great; glad it worked out.

I'm not entirely sure, but it sounds like you are mixing up a couple of things here:

  • The RAxML_originalLabelledTree contains the original reference tree, where the inner nodes of the tree are labelled with consecutive numbers, preceded by "I", e.g., "I123". So, those are not bootstrap values, just unique identification labels. The same is true for the RAxML_labelledTree, which is the reference tree plus single branches for each placed query sequence. This tree can be used to get a quick overview of the placement positions. (However, if you have many query sequences, this can become a mess. In this case, you can try my visualization: http://doc.genesis-lib.org/demos_visualize_placements.html)

  • The RAxML_portableTree.*.jplace on the other hand is the standardized output format for placements, see http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0031009
    It also contains the reference tree, this time labelled with the so called edge_num, using brackets "{123}",  which is also a unique number for each branch. (I think, the numbers are however not one-to-one mapped between this tree and the labelled tree...).

    The jplace file then contains a list of all your queries, listing possible placement positions for each of them. Those positions have a "likelihood weight", which tells you how certain this position is for a given query. For each query, those weights sum up to 1.0 over all branches of the tree - but usually only the more probable ones are written to the file, in order to keep it small.

In summary, the likelihood weights can be seen as a probability distribution of placement positions for a query on the branches of the tree.
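
To make this concrete, here is a small Python sketch that reads the placement part of a jplace document and reports, for each query, the branch with the highest likelihood weight and the sum of the weights that were written to the file. It assumes the field names of the jplace standard ("edge_num", "like_weight_ratio") and query names stored under "n" or "nm". The function name and the example data are made up for illustration and are not real RAxML output.

```python
def summarize_placements(jplace):
    """Map each query name to its best branch and the total written weight."""
    fields = jplace["fields"]
    edge_idx = fields.index("edge_num")
    lwr_idx = fields.index("like_weight_ratio")
    summary = {}
    for placement in jplace["placements"]:
        rows = placement["p"]
        best = max(rows, key=lambda row: row[lwr_idx])
        total = sum(row[lwr_idx] for row in rows)
        # Names are stored under "n", or under "nm" as [name, multiplicity] pairs.
        names = placement.get("n") or [pair[0] for pair in placement.get("nm", [])]
        for name in names:
            summary[name] = {
                "best_edge": best[edge_idx],
                "best_weight": best[lwr_idx],
                "written_weight": total,
            }
    return summary


# Hand-made example in jplace layout (values invented for illustration).
example = {
    "tree": "((A:0.1{0},B:0.1{1}):0.1{2},C:0.1{3},D:0.1{4});",
    "fields": ["edge_num", "likelihood", "like_weight_ratio",
               "distal_length", "pendant_length"],
    "placements": [
        {"p": [[2, -1234.5, 0.7, 0.01, 0.02],
               [0, -1235.8, 0.2, 0.03, 0.02]],
         "n": ["query_1"]},
    ],
    "version": 3,
    "metadata": {},
}

for name, info in summarize_placements(example).items():
    print(name, info)
```

For a real run, the dictionary would come from `json.load` on the .jplace file instead of the hand-made example. A written weight below 1.0 simply reflects that the less probable branches were left out of the file.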

Does that make sense to you?

Lucas

Alice Lévesque

Apr 20, 2017, 12:10:24 PM
to raxml
Hi Lucas,

Yes, it is really likely that I might be mixing up a few things...
I attached to that message how I see my tree. So these values (at each node) are not likelihood weights if I understand correctly...

So, the *.jplace file is a tree containing the likelihood weights, and it is that file I should use (and visualize).
How can I obtain bootstrap values for my EPA tree? It's just that I see EPA trees with bootstrap values in the literature.

Thanks a lot for taking the time to answer my (multiple) questions!

Alice
Tree_gp23.PNG

Lucas Czech

Apr 20, 2017, 12:35:55 PM
to ra...@googlegroups.com

Hi hi,

yep, the tree you posted indeed just shows unique branch numbers, no bootstrap supports or likelihood weights.

The jplace file is not a tree, but contains it, followed by the placement data - you can simply open it in a text editor and have a look, if you want ;-)

I guess you want to show bootstrap values of the reference tree, right? (As opposed to bootstrap support for the query sequences - this would be a totally different problem)
In this case, obtaining those bootstrap values can be done independently of the placements. Just use your original alignment and call RAxML so that it calculates a bootstrap tree (see the RAxML manual for details on how to do that).

Then, you get a tree that has bootstrap support values for the inner branches. Use this tree as a reference tree for EPA (I'm afraid you have to re-run it, unless your current reference tree already has bootstrap support values. If so, read on).

Finally, you end up with a reference tree with bootstrap values, and the placement files (labelled tree and jplace file), which do not have the bootstrap values. So, you need to combine those trees.

We had exactly this issue a couple of weeks ago in the Google group, see https://groups.google.com/forum/#!topic/raxml/V7ZS5dhffgQ
I proposed to use a custom tool for this, which you find here: http://doc.genesis-lib.org/demos_labelled_tree.html

Let me know if this is what you want to achieve.

Lucas

Alice Lévesque

Apr 20, 2017, 1:04:23 PM
to raxml
Oh wow! Thanks a lot Lucas!
It is exactly what I'm looking for!

So, I already have my reference tree with bootstrap values, I just need to combine it with the placement files. This is perfect!

Best regards,

Alice

Lucas Czech

Apr 20, 2017, 1:07:06 PM
to ra...@googlegroups.com

Nice! Let me know if you have any trouble running it!

Lucas

Alice Lévesque

Apr 21, 2017, 1:22:07 PM
to raxml
Hi,

Just to be sure, is it normal to have polytomies on the final EPA tree? If short sequences were really similar for example...

Thanks a lot!
Also, I want to let you know that I am really grateful for your help! 

Alice

Lucas Czech

Apr 21, 2017, 1:28:44 PM
to ra...@googlegroups.com

Hi Alice,

are you observing polytomies only at the new branches that are added for the query sequences, or does the underlying reference tree suddenly have them?

In the first case, this is one of two ways of adding query sequences as new branches. See here for details: http://doc.genesis-lib.org/namespacegenesis_1_1placement.html#af6874a6171b7665adedce46c8ffb55a4

In the documentation of the program, it is also mentioned how to change this to the second kind of behaviour, see http://doc.genesis-lib.org/demos_labelled_tree.html

If however the second is the case, i.e., you are suddenly getting polytomies in your reference sequences, then something went wrong. In that case, please send me your files so that I can investigate.

Cheers and have a nice weekend
Lucas

Alice Lévesque

Apr 21, 2017, 1:37:04 PM
to raxml
It is indeed the first case: only the query sequences produce polytomies. Is that a problem (for a publication, for example), so that I need to resolve the placements? Or can I keep it this way?

Thanks!!

Alice

Lucas Czech

Apr 21, 2017, 1:43:37 PM
to ra...@googlegroups.com

Well, that depends on what you want to show with those placements. I chose to merge all queries per branch into a polytomy for two reasons:

  1. It is easier to see and collapse them into one node in a tree viewer, in cases where you do not want to show all of them as single branches.
  2. It is what RAxML does with the labelled tree.

If you instead want to fully resolve them into single branches, like pplacer guppy does, the tree will be a bit more crowded and collapsing is not that easy any more. Your choice. See the two links for details and for how to change the behaviour.

Lucas

Alexandros Stamatakis

Apr 22, 2017, 4:45:04 PM
to ra...@googlegroups.com


On 21.04.2017 20:37, Alice Lévesque wrote:
> It is indeed the first case, only the query sequences produce
> polytomies. Is that a problem (for a publication for example?) and I
> need to resolve the placements? Or I can keep it this way.

you should keep it that way, it's actually the idea of the whole
algorithm ... please have a look at the original paper at:

https://academic.oup.com/sysbio/article/60/3/291/1667010/Performance-Accuracy-and-Web-Server-for


alexis