--asc-corr=stamatakis how to provide number of invariant sites

Sean

Apr 13, 2018, 9:29:47 PM
to raxml
Hello,

I am working with RAD data and have a PHYLIP file containing only variable sites. How do I set the number of invariant sites when using the correction --asc-corr=stamatakis? The manual says to use the partition file. So would I calculate the total number of sites (T) and just create a partition file with DNA, p1:1-T (and the invariant sites will be inferred)?

Also, what is the best way to calculate the number of invariable sites from a vcf file?

Thanks,
Sean

Alexey Kozlov

Apr 15, 2018, 2:22:13 PM
to ra...@googlegroups.com
Hi Sean,

> I am working with RAD data and have a PHYLIP file containing only variable sites. How do I set the number of invariant
> sites when using the correction --asc-corr=stamatakis? The manual says to use the partition file. So would I calculate
> the total number of sites (T) and just create a partition file with DNA, p1:1-T (and the invariant sites will be inferred)?

T must be the number of (variable) sites in your PHYLIP file. Please see p. 44 of the RAxML manual for an explanation of
how to specify the number of invariant sites:

https://sco.h-its.org/exelixis/php/countManualNew.php
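
As a rough illustration of the point above, the sketch below reads T from a PHYLIP header and builds a one-partition file that covers exactly those T variable sites. The `[asc~p1.txt], ASC_DNA, p1=1-T` layout and the space-separated A/C/G/T counts file are written from memory of the RAxML 8 manual and are an assumption here, so please check them against p. 44 before use. The file name `p1.txt`, the helper names, and the example counts are hypothetical.

```python
def phylip_site_count(header_line):
    """Return T, the number of sites, from a PHYLIP header line 'ntaxa nsites'."""
    return int(header_line.split()[1])


def asc_partition_line(n_sites, counts_file="p1.txt", name="p1"):
    """Build one partition line spanning sites 1..T.

    Assumed layout (verify in the RAxML manual): the bracketed prefix
    names a separate file that holds the invariant-site counts.
    """
    return f"[asc~{counts_file}], ASC_DNA, {name}=1-{n_sites}"


def stamatakis_counts_line(n_a, n_c, n_g, n_t):
    """Content of the counts file: invariant A, C, G, T counts (assumed order)."""
    return f"{n_a} {n_c} {n_g} {n_t}"


# Hypothetical example: 12 taxa, 18000 variable sites in the PHYLIP file.
n_sites = phylip_site_count("12 18000")
print(asc_partition_line(n_sites))
print(stamatakis_counts_line(1000, 1200, 1200, 1000))  # made-up counts
```

The partition range refers only to the variable sites that are actually in the alignment; under the assumed layout, the invariant-site counts live in the separate file rather than in the range.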

> Also, what is the best way to calculate the number of invariable sites from a vcf file?

VCF files usually contain variable sites only (SNPs), so I don't think there is any way to *calculate* the number of
invariant sites from them. However, this information could in theory be specified in one of the header fields (but I am
not sure about this).

Best,
Alexey
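
One possible workaround for the VCF question, sketched under strong assumptions: if the header happens to carry `##contig=<...,length=N>` lines, the total reference length minus the number of distinct variant positions gives an upper bound on the invariant sites. This assumes every reference position was callable in every sample, which is rarely true for RAD data, and the function name is hypothetical.

```python
import re


def invariant_sites_from_vcf(vcf_lines):
    """Estimate invariant sites as total contig length minus distinct variant sites.

    Assumes the header contains ##contig lines with a length= field and that
    all non-variant reference positions are truly invariant across samples.
    """
    total_length = 0
    variant_sites = set()
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##contig="):
            match = re.search(r"length=(\d+)", line)
            if match:
                total_length += int(match.group(1))
        elif line and not line.startswith("#"):
            # Data record: the first two tab-separated columns are CHROM and POS.
            chrom, pos = line.split("\t")[:2]
            variant_sites.add((chrom, int(pos)))
    if total_length == 0:
        raise ValueError("no ##contig length found in the VCF header")
    return total_length - len(variant_sites)


# Usage with a file on disk (path is a placeholder):
# with open("variants.vcf") as handle:
#     print(invariant_sites_from_vcf(handle))
```

Positions listed in more than one record are counted once, and the per-base breakdown that the Stamatakis correction needs would still require the reference sequence itself.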

Mariana López

Jul 10, 2018, 8:05:31 AM
to raxml
Hi!!
I need some help: I want to check whether I am interpreting the ASC_STAM option correctly.

I work with a species whose genome is 4.4 million bp (positions). I did whole-genome sequencing, but after the SNP identification pipeline I have an alignment.fasta file with ONLY variable sites (18000 variable positions).
If I am right, when using ASC_FELS the number of invariant sites will be 4382000 (= 4400000 - 18000). Is that OK?
If I use ASC_STAM instead, I must specify how many of those 4382000 positions are A, C, G, and T. Is that OK? I know the proportion of each nucleotide in the genome, but I would have to determine the number of variable As and so on...

I am asking because I discussed the parameters with a colleague, and he defined the invariant sites as the total number of positions in the whole genome that are A/C/G/T, without subtracting the variable sites. In other words, he counted the entire genome as invariant sites, in addition to the variable sites present in the FASTA file. That is why I need to confirm whether I am right or wrong!

Thanks in advance!
sincerely
Mariana

Alexey Kozlov

Jul 10, 2018, 8:28:58 AM
to ra...@googlegroups.com
Hi Mariana,

- for the Felsenstein correction, you put the *total* number of *invariable* sites, i.e. ASC_FELS{4382000}

- for the Stamatakis correction, you put the number of *invariable* sites with As, Cs, Gs, and Ts, i.e.
ASC_STAM{nA/nC/nG/nT}, where nA + nC + nG + nT = 4382000

Since you have whole-genome sequences, all those numbers can be easily computed by simple character counting.

Best,
Alexey
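
The character counting mentioned above could look roughly like the following sketch. It assumes the reference genome is available as a single string, that the 1-based positions of the variable sites are known, and that every other position is identical across all samples; the function names are hypothetical. Only the `ASC_FELS{...}` and `ASC_STAM{nA/nC/nG/nT}` notation is taken from the reply.

```python
from collections import Counter


def invariant_base_counts(reference, variable_positions):
    """Count A/C/G/T in `reference` at positions not listed as variable.

    reference: genome sequence as one string (FASTA header lines removed).
    variable_positions: 1-based positions of the SNP sites in the alignment.
    Ambiguous characters (N, gaps, ...) are skipped, so the total can be
    lower than genome length minus the number of variable sites.
    """
    variable = set(variable_positions)
    counts = Counter(
        base
        for pos, base in enumerate(reference.upper(), start=1)
        if pos not in variable and base in "ACGT"
    )
    return tuple(counts[b] for b in "ACGT")


def asc_fels(counts):
    """Felsenstein correction: total number of invariable sites."""
    return f"ASC_FELS{{{sum(counts)}}}"


def asc_stam(counts):
    """Stamatakis correction: invariable sites per base, in A/C/G/T order."""
    return "ASC_STAM{" + "/".join(str(c) for c in counts) + "}"


# Toy example: a 10 bp "genome" with SNPs at positions 1 and 4.
counts = invariant_base_counts("AACCGGTTAN", {1, 4})
print(asc_fels(counts))  # ASC_FELS{7}
print(asc_stam(counts))  # ASC_STAM{2/1/2/2}
```

For the genome discussed here, the four counts would be expected to sum to roughly 4382000, with any shortfall coming from ambiguous bases in the reference.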


Mariana López

Jul 10, 2018, 9:37:08 AM
to ra...@googlegroups.com
Thanks in advance!!
Mari

Alexandros Stamatakis

May 11, 2019, 2:36:46 PM
to ra...@googlegroups.com
Are you using RAxML-NG or standard RAxML?

Would be helpful if you could paste in the command line and the exact
error you are getting.

Alexis

On 11.05.19 17:04, Hollie Topliffe wrote:
> Hi,
>
> Sorry to jump on this thread, but I'm trying to do the same thing with
> probably the same type of data (SNP data from TB genomes!), and when I
> specify the ASC_STAM parameter it tells me this is no longer possible.
> How do you go about this with the "-m ASC_GTRGAMMA --asc-corr=stamatakis
> -q txtfile" setup? Does the number of invariable sites go into the txt
> file specified by -q? How is this formatted in the file?
>
> Thanks,
> Hollie

--
Alexandros (Alexis) Stamatakis

Research Group Leader, Heidelberg Institute for Theoretical Studies
Full Professor, Dept. of Informatics, Karlsruhe Institute of Technology

www.exelixis-lab.org