IUPAC ambiguous (R, M, K, etc.) nucleotides in alignment


Fae KRZ

Aug 5, 2015, 1:36:40 PM
to raxml
I know that in RAxML, ambiguous characters are treated as missing data. But I am not sure whether R, M, K, V, D, H, etc. are considered ambiguous characters, or whether only non-IUPAC nucleotides (like X) are considered ambiguous.
In addition, what does it mean that ambiguous characters are treated as missing data? Does it mean that a column containing an ambiguous character will be deleted from the alignment and from the subsequent phylogenetic analysis?

Many thanks,
fae

Alexandros Stamatakis

Aug 6, 2015, 2:05:19 AM
to ra...@googlegroups.com
hi fae,

> I know that in RAxML, ambiguous characters will be treated as missing
> data. But I am not sure if R, M, K, V, D, H etc are considered as
> ambiguous characters or only non IUPAC nucleotides (like X) are
> considered as ambiguous.

R, Y, etc. are treated as ambiguous characters as well, but in a different
way. Here is how it works: if an ambiguous character represents A or C,
then the probabilities of A and C are set to 1.0 at the tips; if it
represents A or C or G or T, all probabilities at the tips are set to
1.0 ....
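
The rule described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not RAxML's actual code: the names `IUPAC`, `BASES`, and `tip_vector` are made up for this example, the state order A, C, G, T is an arbitrary choice, and mapping `-` and `?` to the same vector as N is an assumption about how "missing data" is commonly encoded in likelihood programs.

```python
# Hypothetical sketch: map each IUPAC nucleotide code to the bases it can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
    # Assumption: gaps and '?' are encoded like N, i.e. as fully undetermined.
    "-": "ACGT", "?": "ACGT",
}

BASES = "ACGT"  # arbitrary but fixed state order for this example


def tip_vector(symbol):
    """Return the tip likelihood vector for one alignment character.

    Every base compatible with the symbol gets 1.0, every other base 0.0.
    """
    allowed = IUPAC[symbol.upper()]
    return [1.0 if base in allowed else 0.0 for base in BASES]


if __name__ == "__main__":
    for s in "AMRN-":
        print(s, tip_vector(s))
```

Note that the entries are not normalized to sum to 1: they are likelihoods of the observation given each possible state, not a probability distribution over states.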

> In addition, what does it mean that ambiguous characters will be
> treating as missing data?

No, it is gaps that are treated as missing data ...

> Does it mean the column containing ambiguous
> character will be deleted in alignment and following phylo analysis?

No.
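
To see why such a column does not need to be deleted, here is a small sketch of what a fully undetermined tip (all entries 1.0) contributes to the likelihood at its parent node. It uses the Jukes-Cantor model purely as a simple stand-in (the thread itself uses GTRGAMMA), and the names `jc69_matrix` and `branch_contribution` are hypothetical helpers for this example. Because each row of a transition probability matrix sums to 1, the undetermined tip contributes a factor of 1 for every parent state, so that sequence adds no information at that site while the other sequences in the column still do.

```python
import math


def jc69_matrix(t):
    """Jukes-Cantor transition probabilities for a branch of length t
    (expected substitutions per site). Illustrative model choice only."""
    e = math.exp(-4.0 * t / 3.0)
    same = 0.25 + 0.75 * e   # probability of ending in the same base
    diff = 0.25 - 0.25 * e   # probability of ending in one specific other base
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def branch_contribution(P, tip):
    """For each parent state i, sum over child states j of P[i][j] * tip[j]."""
    return [sum(P[i][j] * tip[j] for j in range(4)) for i in range(4)]


P = jc69_matrix(0.1)
undetermined = [1.0, 1.0, 1.0, 1.0]   # gap / N: every base is compatible
observed_a = [1.0, 0.0, 0.0, 0.0]     # unambiguous A

if __name__ == "__main__":
    print(branch_contribution(P, undetermined))  # the same value for every parent state
    print(branch_contribution(P, observed_a))    # differs between parent states
```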

Alexis


--
Alexandros (Alexis) Stamatakis

Research Group Leader, Heidelberg Institute for Theoretical Studies
Full Professor, Dept. of Informatics, Karlsruhe Institute of Technology
Adjunct Professor, Dept. of Ecology and Evolutionary Biology, University
of Arizona at Tucson

www.exelixis-lab.org

Fae KRZ

Aug 7, 2015, 10:36:31 AM
to raxml

Hi Alexis,

Thank you so much for your kind reply.
Would you please send me a link to a reference that explains how ambiguous nucleotides are treated in RAxML? I would like to cite it in my paper, as my sequence database contains a large number of ambiguous positions.

The resulting RAxML tree (GTRGAMMA) is not bifurcating, while the resulting MEGA tree is bifurcating (with complete deletion of gaps and missing data). As my sequences have gaps of up to 50 nucleotides (40 sequences) and contain ambiguous positions, could this difference be due to the different treatment of gaps and ambiguous positions in MEGA and RAxML?


Bests,
Fae

Alexandros Stamatakis

Aug 10, 2015, 2:50:23 AM
to ra...@googlegroups.com
Dear Fae,

> Thank you so much for your kind reply.
> Would you please send me the link of a reference that has explained
> treating ambiguous nucleotides in RAxML? I would like to cite it in my
> paper as my sequence database contain a large number of ambiguous
> positions.

There is no such reference available; what is implemented in RAxML is
standard practice in all likelihood-based codes (i.e., ML and Bayesian).

> The resulting RAxML tree (GTRGAMMA) is not bifurcating

Is this a consensus tree? Otherwise, RAxML output trees are always
bifurcating ...

> while the
> resulting MEGA tree is bifurcating (with complete deletion of gaps and
> missing data). As my sequences have up to 50 nucleotides gap (40
> sequences) and contain ambiguous positions, could this difference be
> because of the difference in treating gaps and amb positions in MEGA and
> RAxML please?

No, the difference is probably because you are using different input
alignments.

Alexis
