Is it okay to run raxmlGUI 1.5b with GTR when the best-fit model is HKY (according to jModelTest)?

jbak

Jun 22, 2015, 4:12:23 PM
to ra...@googlegroups.com, bakonyi...@agrar.mta.hu
Dear raxml users,

I'd like to run raxmlgui 1.5b using my partitioned DNA dataset.
jModelTest 2.0 selected HKY as the best-fit model for all partitions (without G and/or I). Indeed, the sequences are closely related, with only a few polymorphic sites and a limited number of nucleotide substitutions (the 3 pathogens studied are really closely related).

RAxML only implements GTR-based models of nucleotide substitution. Could you give any advice? What will happen if I use GTR, instead of HKY? Will the phylogeny be good?

Thanks a lot,

József


Alexandros Stamatakis

Jun 22, 2015, 5:15:38 PM
to ra...@googlegroups.com
Dear Jozsef,

In general I'd say yes. We have done some tests recently, and it turned
out that using simpler models than GTR can sometimes change the topology
on rather small datasets. There is now an option to specify a K80 model
in RAxML, which is similar to HKY85.

If you have an SNP dataset, I'd recommend using the ascertainment bias
correction as well.
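
As a rough illustration of what "SNP dataset" means here: ascertainment bias correction is meant for alignments that contain only variable sites. Below is a minimal Python sketch, assuming a toy alignment held in a dict, that checks whether an alignment still contains invariant columns. The function name `invariant_columns` and the taxon names are made up for illustration. Treating gaps and ambiguity codes as missing data is a simplifying assumption and not necessarily how RAxML itself classifies sites.

```python
def invariant_columns(alignment):
    """Return 0-based indices of columns with at most one distinct unambiguous base.

    alignment: dict mapping taxon name -> aligned DNA sequence (equal lengths).
    Gaps and ambiguity codes are ignored (a simplifying assumption).
    """
    seqs = list(alignment.values())
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")

    invariant = []
    for i in range(length):
        # Keep only the unambiguous nucleotides observed in this column.
        bases = {s[i].upper() for s in seqs} & set("ACGT")
        if len(bases) <= 1:
            invariant.append(i)
    return invariant


# Hypothetical toy alignment, not taken from the dataset discussed in this thread.
toy = {"taxonA": "ACGTAC", "taxonB": "ACGTTC", "taxonC": "ACCTAC"}
print(invariant_columns(toy))
```

If the returned list is non-empty, the alignment is not a pure SNP alignment in this sense.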

Cheers,

Alexis

--
Alexandros (Alexis) Stamatakis

Research Group Leader, Heidelberg Institute for Theoretical Studies
Full Professor, Dept. of Informatics, Karlsruhe Institute of Technology
Adjunct Professor, Dept. of Ecology and Evolutionary Biology, University
of Arizona at Tucson

www.exelixis-lab.org

Ingo Michalak

Jun 23, 2015, 4:34:08 AM
to ra...@googlegroups.com
Dear József,
Please note that both options Alexis mentions are not (yet) implemented in raxmlGUI. Therefore, you should use command-line RAxML for now.

Good luck
Ingo

jbak

Jul 3, 2015, 3:05:20 AM
to ra...@googlegroups.com

Dear Alexis and Ingo,

Thanks a lot for your comments and instructions.

In raxmlGUI 1.5, when I used pure GTR (with no rate heterogeneity among sites) for my dataset (for which jModelTest chose HKY), I got many warning messages like this:

WARNING the alpha parameter with a value of XXXXXXX estimated by RAxML for partition number Y with the name "ZZZZZ" is larger than 10.000000. You should do a model test and confirm that you actually need to incorporate a model of rate heterogeneity! You can run inferences with a plain substitution model (without rate heterogeneity) by specifyng the CAT model and the "-V" option!


As you suggested, I have successfully run the command-line version 8.1.21 with model K80 and no rate heterogeneity among sites (-V) for each partition.

I used this command line for the best ML tree search:
raxmlHPC-PTHREADS-AVX -T 8 -M -m GTRCAT -V -p 12345 -# 10 -q partitions.txt -s MG_raxmlinfile.phy --K80 -o WAC11137 -n outputT1

I still got several of these warning messages at the stage of GAMMA-based final tree optimization:

WARNING the alpha parameter with a value of XXXXXXX estimated by RAxML for partition number Y with the name "ZZZZ" is larger than 10.000000. You should do a model test and confirm that you actually need to incorporate a model of rate heterogeneity! You can run inferences with a plain substitution model (without rate heterogeneity) by specifyng the CAT model and the "-V" option!

Does this warning message matter at all if I used the above command line? Or would you advise using -F?

Additionally, I am still a bit worried about the equal base frequencies defined by the K80 model. They are definitely not equal in my dataset according to jModelTest. Is there any way to tell RAxML to estimate base frequencies while still using --K80 and -V?


Unfortunately, PhyML does not work with partitioned data, as far as I know.

Thanks a lot once more for your help.

Cheers,

József

Alexandros Stamatakis

Jul 3, 2015, 5:59:56 AM
to ra...@googlegroups.com
Hi József,

> Does this warning message matter at all if I used the above command line?
> Or would you advise using -F?

Well, it is still valid, since the final optimization is done under
GAMMA, but that's okay. This warning is mainly intended to notify users
that they could get the largest part of the analysis done without the
additional computational cost induced by using +GAMMA.
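
To illustrate why a large alpha is harmless: under the GAMMA model, per-site rates follow a gamma distribution with shape alpha and mean 1, so their variance is 1/alpha. With alpha above 10 the rates are all close to 1, which is nearly the same as having no rate heterogeneity. The following is a minimal sketch using only Python's standard library. It samples from the continuous distribution, whereas phylogenetic software typically uses a few discrete rate categories, and the function name `sample_site_rates` is hypothetical; this is not RAxML code.

```python
import random
import statistics


def sample_site_rates(alpha, n=50_000, seed=1):
    """Draw n site rates from a gamma distribution with shape alpha and mean 1."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale); scale = 1/alpha gives mean 1.
    return [rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n)]


for alpha in (0.5, 10.0, 100.0):
    rates = sample_site_rates(alpha)
    mean = statistics.fmean(rates)
    variance = statistics.pvariance(rates)
    # The sample variance should be close to the theoretical value 1/alpha.
    print(f"alpha={alpha:6.1f}  mean={mean:.3f}  variance={variance:.4f}  1/alpha={1 / alpha:.4f}")
```

The smaller the printed variance, the less the rates differ between sites, so the +GAMMA component adds computational cost without changing the model much.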

> Additionally, I am still a bit worried about the equal base frequencies
> defined by the K80 model. They are definitely not equal in my dataset
> according to jModelTest. Is there any way to tell RAxML to estimate
> base frequencies while still using --K80 and -V?

With estimated base frequencies, this is not K80 any more but HKY85. I've
just implemented this in RAxML. It is part of version 8.2.0, which I will
release soon, maybe today.
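
The relationship between the three models can be sketched in a few lines of Python: HKY85 is GTR with a single transition/transversion rate ratio (kappa), and K80 is HKY85 with all base frequencies fixed at 0.25. This is an illustrative sketch under those textbook definitions, not RAxML's implementation; the function names and the `FREE_PARAMETERS` table are made up for this example, and the rate matrices are left unnormalized.

```python
def gtr_rate_matrix(exchange, freqs):
    """Build an unnormalized GTR rate matrix Q for states ordered A, C, G, T.

    exchange: symmetric exchangeabilities keyed by the six unordered pairs
              ("AC", "AG", "AT", "CG", "CT", "GT").
    freqs:    four base frequencies (A, C, G, T) summing to 1.
    """
    states = "ACGT"
    q = [[0.0] * 4 for _ in range(4)]
    for i, x in enumerate(states):
        for j, y in enumerate(states):
            if i != j:
                pair = "".join(sorted(x + y))
                q[i][j] = exchange[pair] * freqs[j]
        # Diagonal is set so that each row sums to zero.
        q[i][i] = -sum(q[i])
    return q


def hky85(kappa, freqs):
    """HKY85: transitions (A<->G, C<->T) get rate kappa, transversions rate 1."""
    exchange = {"AC": 1.0, "AG": kappa, "AT": 1.0,
                "CG": 1.0, "CT": kappa, "GT": 1.0}
    return gtr_rate_matrix(exchange, freqs)


def k80(kappa):
    """K80: HKY85 with equal base frequencies."""
    return hky85(kappa, [0.25, 0.25, 0.25, 0.25])


# Free parameters: relative exchange rates plus free base frequencies.
FREE_PARAMETERS = {"K80": 1, "HKY85": 1 + 3, "GTR": 5 + 3}

# With equal frequencies, HKY85 and K80 produce the same matrix.
print(hky85(2.0, [0.25] * 4) == k80(2.0))
print(FREE_PARAMETERS)
```

Because HKY85 is a special case of GTR, fitting GTR to data that jModelTest assigns to HKY costs a few extra parameters per partition rather than using a different kind of model.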

Alexis

jbak

Jul 3, 2015, 9:49:39 AM
to ra...@googlegroups.com
Dear Alexis,

A huge thank-you! It would be more than fantastic for me to see HKY85 in RAxML.
I am looking forward to seeing the new release.

Cheers,
József