How to calculate nucleotide diversity π using a VCF file


zyu...@gmail.com

Jan 16, 2018, 9:56:31 AM
to Stacks
Hi 

I am trying to obtain nucleotide diversity for two populations. The average per-site nucleotide diversity came out at 0.3305 for one population and 0.3289 for the other. But in the papers I have looked at, π is about 0.001 when all nucleotide positions are analysed. How do I get nucleotide diversity considering all nucleotide positions?

Here is a simplified version of the command I used to calculate nucleotide diversity π:
stacks-1.48/bin/populations -V vcf -M popmap -fstats -k

Many thanks


Yu

Julian Catchen

Jan 17, 2018, 6:04:35 AM
to stacks...@googlegroups.com, zyu...@gmail.com
Hi Yu,

Nucleotide diversity is calculated by the populations program for both
variant and all sites. You can find it in the
populations.sumstat_summary.tsv file.

Best,

julian


zyu...@gmail.com

Jan 21, 2018, 8:50:43 AM
to Stacks

Hi Julian,

So if the SNPs are the same in both subpopulations, then variant sites = all sites, right?

Best, 

Julian Catchen

Jan 23, 2018, 10:36:48 AM
to stacks...@googlegroups.com, zyu...@gmail.com
Given two populations, 'variant sites' refers to sites that are variable
somewhere across the two populations combined, so a particular site may
be polymorphic in only one of the two populations, or it may be
polymorphic in both. If the set of SNPs is exactly the same across the
two populations, then the variant sites will be polymorphic in both
populations.

'All sites' refers to variant *and* fixed sites, so every position found
in the RAD locus is included in the calculations of pi and other stats.

julian
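
To illustrate why the two averages in this thread differ by orders of magnitude, here is a minimal Python sketch. It is not the populations source code. It assumes the common per-site estimator pi = 1 - sum(C(n_i, 2)) / C(n, 2) over allele counts n_i, and the allele counts and the number of fixed sites are hypothetical values chosen for illustration.

```python
from math import comb


def site_pi(allele_counts):
    """Per-site nucleotide diversity from allele counts at one site.

    Assumed estimator: the probability that two chromosomes drawn
    without replacement carry different alleles.
    """
    n = sum(allele_counts)
    if n < 2:
        return 0.0
    same = sum(comb(c, 2) for c in allele_counts)
    return 1.0 - same / comb(n, 2)


def mean_pi(variant_site_counts, n_fixed_sites):
    """Return (mean pi over variant sites, mean pi over all sites).

    Fixed sites contribute pi = 0, so they only enlarge the denominator.
    """
    total = sum(site_pi(c) for c in variant_site_counts)
    n_variant = len(variant_site_counts)
    return total / n_variant, total / (n_variant + n_fixed_sites)


# Hypothetical data: 3 SNPs scored on 20 chromosomes, plus 897 fixed sites.
snps = [(14, 6), (10, 10), (17, 3)]
variant_mean, all_mean = mean_pi(snps, n_fixed_sites=897)
print(f"variant sites: {variant_mean:.4f}  all sites: {all_mean:.4f}")
```

With these made-up numbers the variant-site mean is about 0.41 while the all-site mean is about 0.0014, the same kind of gap as between the 0.33 and 0.001 figures mentioned earlier in the thread.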


Jenni McIntyre

May 4, 2018, 9:55:22 AM
to Stacks
Hi,
Just wondering: how could I get the overall nucleotide diversity for all positions (variant and non-variant) between populations?
For the mean overall nucleotide diversity at variant positions, I used the populations.fst_X-Y.tsv file and simply averaged the 'overall pi' column.
Thanks,
Jenni