private alleles


Giulia Trauzzi

Jun 8, 2021, 6:02:47 AM
to stacks...@googlegroups.com
Hello everyone,

I have been using Stacks de novo to call SNPs on my data set.

I used the wrapper denovo_map.pl to set the core parameters on a subset, and then I ran the modules individually on the whole data set (686 individuals).

I have noticed that I get a certain number of private alleles when running the small subset of 180 individuals with the wrapper, but I get 0 private alleles in all populations when I run the modules individually on the whole data set. The only explanations I could think of are that the increase in sample size shows that, in truth, there are no private alleles, or that those private alleles do not pass the HWE filter that I apply when running the whole data set.

Do you have any other explanation for this? Does this have something to do with the wrapper vs. running the modules individually?

Thanks

Giulia


Julian Catchen

Jun 8, 2021, 5:24:00 PM
to stacks...@googlegroups.com, Giulia Trauzzi
Hi Giulia,

The core Stacks pipeline (ustacks/cstacks/sstacks/gstacks) is agnostic
when it comes to population structure. It assembles all the loci across
the dataset without much thought to specific biology (with the exception
of SNP calling). So running the wrapper vs. running the individual
components will not affect private alleles.

It is the populations program that adds the population frame to the data,
and this includes private alleles. These alleles will be defined
based on your population map -- if you have a single population in the
analysis then the definition is as you would expect. However, if you
have multiple populations then private alleles are specific to each
population. If you change the population map (say moving from a
geographically based population definition to populations based on sex),
you will see different alleles identified as private.
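
To illustrate the point about the population map, here is a minimal Python sketch. It is not how Stacks computes private alleles; the function name, the toy genotypes, and the two maps are all made up for illustration. It simply treats an allele as private when every sample carrying it belongs to one population, and shows that the same genotypes give different answers under different maps.

```python
from collections import defaultdict

def private_alleles(genotypes, popmap):
    """Return {population: set of (locus, allele)} carried by only that population.

    genotypes: {sample: {locus: (allele, allele)}}  (hypothetical toy format)
    popmap:    {sample: population}
    """
    # Record which populations carry each (locus, allele) pair.
    carriers = defaultdict(set)
    for sample, loci in genotypes.items():
        pop = popmap[sample]
        for locus, alleles in loci.items():
            for allele in alleles:
                carriers[(locus, allele)].add(pop)

    # An allele is private when exactly one population carries it.
    private = {pop: set() for pop in set(popmap.values())}
    for key, pops in carriers.items():
        if len(pops) == 1:
            private[next(iter(pops))].add(key)
    return private

# Toy data: four diploid samples genotyped at one locus.
genotypes = {
    "s1": {"loc1": ("A", "A")},
    "s2": {"loc1": ("A", "G")},
    "s3": {"loc1": ("A", "A")},
    "s4": {"loc1": ("A", "T")},
}

# Two alternative population maps for the same samples.
by_site = {"s1": "north", "s2": "north", "s3": "south", "s4": "south"}
by_sex = {"s1": "F", "s2": "M", "s3": "F", "s4": "M"}

print(private_alleles(genotypes, by_site))
print(private_alleles(genotypes, by_sex))
```

With the geographic map, G is private to "north" and T to "south"; with the sex-based map, both G and T are private to "M" and "F" has none.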

Of course, as your sample size increases you will also see more private
alleles, but this is related to your power to detect them or the
probability of sampling a low frequency allele in the population.
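
As a rough illustration of the sampling argument, the sketch below computes the chance of seeing at least one copy of an allele at a given frequency. It assumes diploid individuals and independent random draws of gene copies, which is a simplification; the function name is hypothetical.

```python
def detection_probability(freq, n_diploid):
    """Probability of sampling at least one copy of an allele.

    freq:      true allele frequency in the population (0..1)
    n_diploid: number of diploid individuals sampled (2 gene copies each)
    Assumes gene copies are drawn independently at random.
    """
    n_copies = 2 * n_diploid
    # Complement of "none of the sampled copies carry the allele".
    return 1.0 - (1.0 - freq) ** n_copies

# Chance of observing an allele at 1% frequency for several sample sizes.
for n in (10, 50, 100):
    print(n, round(detection_probability(0.01, n), 3))
```

Under these assumptions, an allele at 1% frequency is seen in roughly 18% of samples of 10 individuals but in roughly 87% of samples of 100.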

Relatedly, the other parameter which can affect this definition is the
minor allele frequency (MAF) filter (and the related minor allele count, or MAC, filter). Obviously,
private alleles are often at low frequency, so if you filter out all low
frequency alleles, you will see a commensurate drop in private alleles.
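
The effect of a frequency filter can be sketched with toy numbers. This is a simplified stand-in, not the filter as implemented in populations: the data layout, the function name, and the rule "keep a private allele only if its overall frequency reaches the threshold" are assumptions made for illustration.

```python
def surviving_private_alleles(allele_counts, min_maf):
    """List private alleles whose overall frequency is at least min_maf.

    allele_counts: {locus: {allele: {population: copies}}}  (hypothetical format)
    Returns a list of (locus, allele, population) tuples.
    """
    kept = []
    for locus, alleles in allele_counts.items():
        # Total gene copies observed at this locus across all populations.
        total = sum(sum(per_pop.values()) for per_pop in alleles.values())
        for allele, per_pop in alleles.items():
            copies = sum(per_pop.values())
            pops_with = [pop for pop, count in per_pop.items() if count > 0]
            # Private: present in exactly one population.
            if len(pops_with) == 1 and copies / total >= min_maf:
                kept.append((locus, allele, pops_with[0]))
    return kept

# Toy counts: 20 diploid samples (40 gene copies) split over two populations.
allele_counts = {
    "loc1": {"A": {"north": 18, "south": 20}, "G": {"north": 2, "south": 0}},
    "loc2": {"C": {"north": 20, "south": 19}, "T": {"north": 0, "south": 1}},
}

for threshold in (0.0, 0.03, 0.1):
    print(threshold, surviving_private_alleles(allele_counts, threshold))
```

In this toy example both private alleles survive with no filter, only the one at 5% frequency survives a 3% threshold, and neither survives a 10% threshold.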

Best,

julian
