extract variable sites


Sergios-Orestis Kolokotronis

Jan 26, 2015, 2:07:14 PM
to ra...@googlegroups.com
Hello. I am using a variable sites-only alignment containing A/C/G/T/- and I still get this message:

For partition No Name Provided you specified that the likelihood score shall be corrected for invariant sites
via an ascertainment bias correction. However, some sites in this partition are already invariant.
This is not allowed, please remove all invariant sites and try again, exiting ...

Command used: raxml8116 -T 25 -s input.fasta -m ASC_GTRGAMMA --asc-corr=lewis -b 12345 -N 500 -k -o outgroup -n stuff

Is there a way for RAxML to extract variable sites?

Thank you,
Sergios

Alexandros Stamatakis

Jan 26, 2015, 2:22:39 PM
to ra...@googlegroups.com
Hello Sergio,

> Hello. I am using a variable sites-only alignment containing A/C/G/T/-

Are you 100% sure? Do you maybe have some ambiguous characters left in
your alignment?

> and I still get this message:
>
> For partition No Name Provided you specified that the likelihood score
> shall be corrected for invariant sites
> via an ascertainment bias correction. However, some sites in this
> partition are already invariant.
> This is not allowed, please remove all invariant sites and try again,
> exiting ...
>
> Command used: raxml8116 -T 25 -s input.fasta -m ASC_GTRGAMMA
> --asc-corr=lewis -b 12345 -N 500 -k -o outgroup -n stuff
>
> Is there a way for RAxML to extract variable sites?

No, that's not possible; such things are better handled via a script.
While it sounds easy in principle, it's rather complicated internally.
But please send me the alignment: I can try to improve the error message
and maybe have RAxML print the problematic alignment site ...

You're welcome,

Alexis

>
> Thank you,
> Sergios
>
> --
> You received this message because you are subscribed to the Google
> Groups "raxml" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to raxml+un...@googlegroups.com
> <mailto:raxml+un...@googlegroups.com>.
> For more options, visit https://groups.google.com/d/optout.

--
Alexandros (Alexis) Stamatakis

Research Group Leader, Heidelberg Institute for Theoretical Studies
Full Professor, Dept. of Informatics, Karlsruhe Institute of Technology
Adjunct Professor, Dept. of Ecology and Evolutionary Biology, University
of Arizona at Tucson

www.exelixis-lab.org

Alexis

Feb 8, 2015, 5:46:43 AM
to ra...@googlegroups.com
Hi Sergio,

I just released RAxML v8.1.17; it now explicitly prints out the offending site. In your case it is:

Partition 0 with name "No Name Provided" is to be analyzed using ascertainment bias correction, but it has at least one invariable site!
This is is not allowed! RAxML will print the offending site and then exit.

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA-

Keep in mind that '-' is modeled as a completely undetermined character under ML, hence I count such sites as invariable.

alexis

Sergios-Orestis Kolokotronis

Feb 8, 2015, 8:48:23 PM
to ra...@googlegroups.com
Thanks Alexi,

Columns with gaps must have escaped my filter. Thanks for pointing that out.
Any chance it can tell which column (number) it is, so that I can go in and delete it? Or produce the "REDUCED" alignment?

Thanks again,
S.

Alexandros Stamatakis

Feb 9, 2015, 4:21:38 AM
to ra...@googlegroups.com
Dear Sergio,

> columns with gaps must have escaped my filter. Thanks for pointing that
> out.
> Any chance it can tell which column (number) it is, so that I can go in and
> delete it? Or produce the "REDUCED" alignment?

No, that's a bit more complicated than you would think and would
needlessly increase code complexity and decrease maintainability, i.e.,
it's better handled by a pre-processing script.

Alexis
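
A pre-processing script along these lines could look like the sketch below. It is not an official RAxML tool. It assumes that a column must be dropped whenever one nucleotide is compatible with all of its characters (gaps, '?' and N being compatible with everything), and that the whole FASTA file fits in memory. The function names read_fasta, drop_invariant_columns and filter_fasta are hypothetical.

```python
import io

# Nucleotides allowed by each IUPAC code; '-', '?' and N are assumed undetermined.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT", "?": "ACGT", "-": "ACGT",
}


def read_fasta(handle):
    """Parse FASTA text from an open handle into a list of (name, sequence)."""
    records, name, chunks = [], None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def is_variable(column):
    """A column is variable if no nucleotide fits all of its characters."""
    shared = set("ACGT")
    for ch in column:
        shared &= set(IUPAC[ch.upper()])  # unknown characters raise KeyError
    return not shared


def drop_invariant_columns(records):
    """Return the filtered records and the 0-based indices of kept columns."""
    seqs = [seq for _, seq in records]
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("sequences differ in length; input is not aligned")
    keep = [i for i, col in enumerate(zip(*seqs)) if is_variable(col)]
    return [(n, "".join(s[i] for i in keep)) for n, s in records], keep


def filter_fasta(in_path, out_path):
    """Read an aligned FASTA file and write a copy without invariant columns."""
    with open(in_path) as fh:
        filtered, _ = drop_invariant_columns(read_fasta(fh))
    with open(out_path, "w") as out:
        for name, seq in filtered:
            out.write(f">{name}\n{seq}\n")


# Small in-memory demonstration (columns 2 and 4 are invariant under the rule).
demo = io.StringIO(">s1\nAAT-\n>s2\nCWTA\n>s3\nGACA\n")
filtered, kept = drop_invariant_columns(read_fasta(demo))
print(kept)      # 0-based indices of the columns that were kept
print(filtered)
```

On real data one would call filter_fasta("input.fasta", "variable_only.fasta") and pass the second file to RAxML; both file names are placeholders.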

Alexis

Apr 10, 2015, 9:46:26 AM
to ra...@googlegroups.com
Hello Sergio,

RAxML v8.1.18 (just released) now indicates the offending (invariable) column numbers in the original alignment.

Alexis

Sergios-Orestis Kolokotronis

Apr 10, 2015, 11:14:13 AM
to ra...@googlegroups.com
Sweet! Thanks!! Will check it out today.

Zhang Dz

Aug 29, 2016, 2:34:49 AM
to raxml
Dear Sergios-Orestis, 
I have run into the same problem as you did. The updated RAxML does indicate the invariable columns, but there are too many of them: over 10k sites. How did you handle this?



On Friday, April 10, 2015 at 11:14:13 PM UTC+8, Sergios-Orestis Kolokotronis wrote:

Sergios-Orestis Kolokotronis

Sep 7, 2016, 12:08:47 PM
to raxml
Hello Zhang,

Do you mean how I removed the invariant sites from the alignment? One easy way is to run it through FABOX at http://users-birc.au.dk/biopv/php/fabox/alignment_variable_sites.php

Best,
Sergios

Carlos Munoz Ramirez

Jan 19, 2017, 2:30:26 PM
to raxml
I think this program will not remove sites that are invariant by RAxML's standards. For example, it treats sites with gaps as variable, and it also treats sites with singleton ambiguities as variable. RAxML interprets these sites as invariant. You can test what I am saying using this short alignment.

>one
aaaaaagaaaaacc-tttt
>two
waaaaataaaaaccctttt
>three
aaaaaaaaaaaaccctttt
>four
taaaaaaaaaaaccctttt
>five
aaaaaaaaawaaccctttt
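
One way to run that test without RAxML is sketched below in Python. It compares a naive filter, which treats every distinct character as its own state (the behaviour described above), with an ambiguity-aware one. The second rule is an assumption about how RAxML decides: a column is variable only if no single nucleotide is compatible with all of its characters. The function names are hypothetical.

```python
# Nucleotides allowed by each IUPAC code; '-', '?' and N are assumed undetermined.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT", "?": "ACGT", "-": "ACGT",
}

# The short test alignment posted above.
alignment = {
    "one": "aaaaaagaaaaacc-tttt",
    "two": "waaaaataaaaaccctttt",
    "three": "aaaaaaaaaaaaccctttt",
    "four": "taaaaaaaaaaaccctttt",
    "five": "aaaaaaaaawaaccctttt",
}


def naive_variable(column):
    """Variable if the column shows more than one distinct character."""
    return len(set(column)) > 1


def strict_variable(column):
    """Variable only if no nucleotide is compatible with every character."""
    shared = set("ACGT")
    for ch in column:
        shared &= set(IUPAC[ch])
    return not shared


# Transpose the alignment into columns (upper-cased to match the table).
columns = list(zip(*(seq.upper() for seq in alignment.values())))

# Report 1-based column numbers, as an alignment viewer would.
naive = [i + 1 for i, col in enumerate(columns) if naive_variable(col)]
strict = [i + 1 for i, col in enumerate(columns) if strict_variable(col)]

print("variable by naive character count:   ", naive)
print("variable after resolving ambiguities:", strict)
print("kept by the naive filter only:       ", sorted(set(naive) - set(strict)))
```

The columns in the last list are the ones a naive filter would leave in and that would then trigger the ascertainment bias error.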

Alexandros Stamatakis

Mar 13, 2017, 5:01:15 AM
to ra...@googlegroups.com, Sergios-Orestis Kolokotronis
I think Sergios also had a script, but it might well be that 1/3 of the
sites in your dataset are invariable once ambiguous characters etc. are
taken into account.

If you don't trust the script, why don't you try it on a small test
dataset? Please also see the previous discussions about this on here.

alexis

On 11.03.2017 22:30, Cansu Çetin wrote:
> Can you suggest a script for removing invariable sites for RAxML
> ascertainment bias correction?
> I tried the "phrynomics" package in R but it removed one third of my SNP
> sites. I couldn't be sure that it removed the 'correct' sites.
>
> https://github.com/bbanbury/phrynomics/blob/master/R/RemoveInvariantSites.R
>
> Thank you,
> Cansu
>
>
> On Thursday, January 19, 2017 at 1:30:26 PM UTC-6, Carlos Munoz Ramirez

Alexandros Stamatakis

Mar 13, 2017, 3:29:21 PM
to ra...@googlegroups.com
> Thank you very much for your reply!

:-)

> Actually, I had 2/3 of my sites removed, leaving me with around 10.000
> SNPs out of 30.000 SNPs; I typed incorrectly in the previous comment.
>
> I will try running R package phrynomics on a smaller dataset soon, thank
> you!

:-) I hope the script works correctly,

alexis

philipp....@gmail.com

Nov 9, 2017, 3:56:11 AM
to raxml
The discussion seems a bit old already, but I have had a similar problem: I had many columns in my alignment that consisted of singletons and Ns only, which were recognized as invariant by RAxML standards.

It's true that Phrynomics is not going to help here. I used the remove-invariant-sites function of this package to create my final alignments, but it did not remove the sites mentioned above.

I found a solution that is not super straightforward but worked quite well. I grepped all the sites RAxML was complaining about into a text file, imported the input FASTA alignment in tabular format (conversion in FaBox) into R, removed the sites according to the list, and exported the alignment again. And voilà, it works.

Cheers,
Philipp
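
The same workflow can be sketched in Python instead of grep plus R. The sketch assumes that RAxML lists each offending column on its own line as "Site <n>" with 1-based numbering, which matches the output quoted elsewhere in this thread but should be checked against your own log. The function names offending_sites and remove_columns are hypothetical.

```python
import re


def offending_sites(raxml_output):
    """Collect the column numbers from lines of the form 'Site <n>'."""
    pattern = re.compile(r"^[ \t]*Site[ \t]+(\d+)[ \t]*$", re.MULTILINE)
    return sorted({int(m.group(1)) for m in pattern.finditer(raxml_output)})


def remove_columns(sequences, sites_1_based):
    """Return a copy of the alignment without the listed 1-based columns."""
    drop = {site - 1 for site in sites_1_based}  # convert to 0-based indices
    return {
        name: "".join(ch for i, ch in enumerate(seq) if i not in drop)
        for name, seq in sequences.items()
    }


# Tiny made-up example: column 2 (A, W, A) is the one RAxML would report.
log = (
    "Pattern: AWA\n"
    "Pattern occurs at the following sites of the input alignment: \n"
    "Site 2 \n"
)
sequences = {"s1": "AAT", "s2": "CWT", "s3": "GAC"}

sites = offending_sites(log)
cleaned = remove_columns(sequences, sites)
print(sites)
print(cleaned)
```

Because all columns are removed in one pass against the original numbering, the order of the reported sites does not matter.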


Alexandros Stamatakis

Nov 9, 2017, 4:31:43 AM
to ra...@googlegroups.com
many thanks for sharing this :-)

alexis

Sumudu S

May 8, 2018, 8:20:29 AM
to raxml
Hi Philipp,

I recently realized that my FASTA alignment has many invariant sites. I tried to remove them as you suggested but have not been successful yet. I need to run the ascertainment bias correction on my data. Would you be able to illustrate this a bit, e.g. which package you used in R? That would be a great help.

Thank you
Best
Sumudu

Alexandros Stamatakis

May 8, 2018, 11:40:25 PM
to ra...@googlegroups.com
Hi Sumudu,

If you search the google group, I think there was someone who had posted
a script to remove the invariant sites.

Alexis

Sumudu S

May 9, 2018, 8:49:34 AM
to raxml
Hi

I searched but couldn't find a proper solution yet. I found this: https://github.com/biologyguy/BuddySuite/wiki/AlignBuddy. It works for the test dataset they provide, but on my data set it only removes invariant sites that consist of a single base. I couldn't track down any issue with my data, though. I hope I can come up with something to solve this.

Thank you very much for the guidance. It really helps. 

Best 
Sumudu 

Alexandros Stamatakis

May 9, 2018, 2:19:56 PM
to ra...@googlegroups.com
that's the old post, I just found it:

"you mean how did I remove the invariant sites from the alignment? One
easy way is to run it through FABOX at
http://users-birc.au.dk/biopv/php/fabox/alignment_variable_sites.php"

alexis


Sumudu Samarasinghe

May 10, 2018, 9:52:05 PM
to ra...@googlegroups.com
Hi,

Well I found this as well. But I came across another comment that this doesn't work as intended. So I didn't try. I'll give it a try anyway.
If I find a solution I will post it here. Thank you very much.
Best
Sumudu


Gil Yardeni

Aug 10, 2018, 9:37:52 AM
to raxml
Hi everyone,

I encountered the same issue but found no solution in the comments.
Has anyone been able to solve it and can describe the solution?

Thanks.


On Monday, January 26, 2015 at 8:07:14 PM UTC+1, Sergios-Orestis Kolokotronis wrote:

Alexandros Stamatakis

Aug 10, 2018, 11:58:11 PM
to ra...@googlegroups.com
Doesn't this help?

https://groups.google.com/d/msg/raxml/1eCZ0uh9nN8/nX-y6mUwCgAJ

Alexis

Gil Yardeni

Aug 20, 2018, 5:25:40 AM
to raxml
FABOX definitely didn't remove the invariable sites, unfortunately. I'm still looking into it and hope to find a workaround.

Gil

Gil Yardeni

Aug 22, 2018, 11:57:15 AM
to raxml
Hello Alexis (and other members). 
I narrowed down the dataset to find the issue, and it seems that RAxML (with model -m ASC_GTRGAMMA) simply treats heterozygous sites as non-variable. Example:

Pattern: AAAAAAAWWWAAAAAAAAWAAAAWWAA
Pattern occurs at the following sites of the input alignment: 
Site 2 

I'm quite confused: is the model intended to use only homozygous sites? It is possible for us to remove all heterozygous sites, but our data comes from diploids, so that would mean not using a significant part of the data.

Thank you.
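
To see how much data is affected before deleting anything, columns can be sorted into three groups, as in the Python sketch below. It assumes that a heterozygous code such as W (A or T) is treated as "either base is possible", so a column of A's and W's is still compatible with all-A. The labels and the function name classify are made up for this illustration and are not RAxML terminology.

```python
from collections import Counter

# Nucleotides allowed by each IUPAC code; '-', '?' and N are assumed undetermined.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT", "?": "ACGT", "-": "ACGT",
}


def classify(column):
    """Label a column as 'variable', 'constant' or 'ambiguity-only'."""
    chars = set(column.upper())
    shared = set("ACGT")
    for ch in chars:
        shared &= set(IUPAC[ch])
    if not shared:
        return "variable"        # no single nucleotide fits every character
    if len(chars) == 1 and chars <= set("ACGTU"):
        return "constant"        # one unambiguous base in every sequence
    return "ambiguity-only"      # differs only through ambiguity codes or gaps


columns = [
    "AAAAAAAWWWAAAAAAAAWAAAAWWAA",  # the pattern reported above
    "AAAA",
    "ACAA",
    "AWTA",
]
summary = Counter(classify(col) for col in columns)
print(classify(columns[0]))
print(dict(summary))
```

The "ambiguity-only" count is the part of the data that would be lost under this reading, which may help when deciding how to code heterozygous calls.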

Ana

Sep 9, 2018, 1:53:50 PM
to raxml
Hi,

I'm facing exactly the same issue. My data does not have gaps or missing data, but it does have heterozygous sites that RAxML called invariant. I'm confused about it too. Should these sites be considered invariant and deleted?

Alexandros Stamatakis

Sep 10, 2018, 12:46:20 AM
to ra...@googlegroups.com
yes, they should be considered as invariant and be deleted ...

alexis