Apparent excess of mid-frequency alleles in 1d-spectrum

Katharine Prata

Feb 13, 2024, 11:50:18 PM
to dadi-user
Hi Ryan & all,

We are using ANGSD to generate 1d-spectra and keep getting spectra with small 'peaks' at the mid-frequencies.

We assumed this was due to either misassigned ancestral alleles or paralogs.

After applying stricter depth filters (per-group maximum depth) and comparing two choices of ancestral state, the major allele versus the reference allele, we get the attached results for groups 1 and 2. There are two other groups, but their results are essentially the same.

These groups belong to one species, and the WGS reads were aligned to the genome of a closely related species. We did this so that the reference allele is most likely the ancestral allele.

We are now regenerating these spectra with reads aligned to a reference from the same species as the groups, to see whether it makes any difference.

Also attached is 'Amil', the spectrum of our outgroup species aligned to the reference of that same species. It looks more normal, but there is a weird drop at 0.5.

Note that these are all folded spectra! I understand that the ancestral state is not needed for folded spectra, but ANGSD requires an ancestral reference when making either folded or unfolded spectra. The choice doesn't seem to make a difference, though. We were planning to use unfolded spectra, but we were getting these weird patterns, so we switched to folded.
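To illustrate why the ancestral state drops out of a folded spectrum, here is a minimal sketch in plain Python. It is not ANGSD or dadi code; the `fold` helper and the toy counts are made up for illustration. Folding adds the entry for derived-allele count i to the entry for n - i, so swapping which allele is called ancestral leaves the folded spectrum unchanged.

```python
def fold(unfolded):
    """Fold an unfolded SFS (index = derived-allele count, 0..n)
    into minor-allele-count bins 0..n//2."""
    n = len(unfolded) - 1
    folded = [0] * (n // 2 + 1)
    for i, count in enumerate(unfolded):
        folded[min(i, n - i)] += count
    return folded


# Toy unfolded spectrum for n = 6 chromosomes (hypothetical counts).
unfolded = [0, 120, 60, 40, 30, 24, 0]

# Flipping the polarization of every site reverses the unfolded spectrum.
flipped = unfolded[::-1]

print(fold(unfolded))
print(fold(flipped))
```

Both calls print the same list, which is consistent with the two choices of ancestral state giving matching folded spectra.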

Any advice or comments on what artefact could create this pattern would be helpful, as would opinions on whether these spectra look OK and reflect a natural phenomenon. Tajima's D for these spectra (groups 1 and 2) was also curiously negative, the opposite of what I'd expect from the look of the spectra, so the low-frequency classes are in fact more abundant than one would expect under a neutral spectrum.
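A negative Tajima's D alongside a visible mid-frequency bump is not contradictory, because D depends on the whole spectrum. Below is a minimal sketch, in plain Python, of the standard Tajima's D formula applied to a folded spectrum. The function names and toy counts are hypothetical, and this is not how ANGSD computes its estimate (ANGSD works from site allele frequency likelihoods).

```python
import math


def tajimas_d_folded(folded, n):
    """Tajima's D from a folded SFS.
    folded[i] = number of sites with minor-allele count i (i = 0..n//2);
    n = number of sampled chromosomes. Bin 0 (monomorphic) is ignored."""
    seg = [(i, c) for i, c in enumerate(folded) if 0 < i <= n // 2]
    S = sum(c for _, c in seg)                       # segregating sites
    pairs = n * (n - 1) / 2
    pi = sum(c * i * (n - i) for i, c in seg) / pairs  # mean pairwise diffs
    a1 = sum(1 / k for k in range(1, n))
    a2 = sum(1 / k**2 for k in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))


def neutral_folded(n, theta):
    """Expected folded SFS under the standard neutral model (theta / i)."""
    out = [0.0] * (n // 2 + 1)
    for i in range(1, n):
        out[min(i, n - i)] += theta / i
    return out


n = 20
neutral = neutral_folded(n, 1000.0)

# Toy spectrum: a small bump at frequency 0.5 plus many extra singletons.
bumped = list(neutral)
bumped[10] += 50
bumped[1] += 1500

print(tajimas_d_folded(neutral, n))
print(tajimas_d_folded(bumped, n))
```

In this toy case the extra singletons outweigh the bump, so D comes out negative even though the plot would show a peak at 0.5.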

I will post the new results once groups 1-4 have been aligned to the same-species reference genome.

Kind regards,
Kat

out.fold.group1.saf.anc1D.png
out.fold.Amil.saf.major1D.png
out.fold.Amil.saf.anc1D.png
out.fold.group2.saf.major1D.png
out.fold.group1.saf.major1D.png
out.fold.group2.saf.anc1D.png

Katharine Prata

Feb 15, 2024, 6:12:31 PM
to dadi-user
Update: the new SFS, generated using a reference from the same species, show a similar pattern.
ahya.fold.group2.major1D.png
ahya.fold.group4.major1D.png
ahya.fold.Amil.major1D.png
ahya.fold.group3.major1D.png
ahya.fold.group1.major1D.png

Ryan Gutenkunst

Feb 16, 2024, 6:52:28 PM
to dadi...@googlegroups.com
Hello Kat,

With folded spectra, the ancestral state doesn’t matter at all, so I’m surprised in this case the outgroup reference doesn’t matter. (For unfolded, you definitely want a real outgroup.) That bump in the folded spectra also can’t be ancestral state misidentification. It’s not obvious to me what could be happening. For paralog issues, filtering based on Hardy-Weinberg might be helpful, since paralogs should generate excess heterozygotes.
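As a rough illustration of the Hardy-Weinberg idea, the sketch below computes the per-site inbreeding coefficient F = 1 - H_obs / H_exp from called genotype counts and flags sites with a strong heterozygote excess. This is an assumption-laden simplification: ANGSD works with genotype likelihoods rather than hard calls, and the function names and the `f_min` threshold here are hypothetical.

```python
def inbreeding_f(n_hom_ref, n_het, n_hom_alt):
    """F = 1 - H_obs / H_exp for one biallelic site.
    Negative F means more heterozygotes than Hardy-Weinberg predicts."""
    n = n_hom_ref + n_het + n_hom_alt
    p = (2 * n_hom_ref + n_het) / (2 * n)   # reference allele frequency
    h_exp = 2 * p * (1 - p)
    if h_exp == 0:                          # monomorphic site
        return 0.0
    h_obs = n_het / n
    return 1 - h_obs / h_exp


def flag_excess_het(sites, f_min=-0.5):
    """Return indices of sites whose F falls below f_min (hypothetical cutoff)."""
    return [k for k, counts in enumerate(sites) if inbreeding_f(*counts) < f_min]


# Toy genotype counts per site: (hom ref, het, hom alt) among 20 individuals.
sites = [
    (5, 10, 5),    # matches Hardy-Weinberg at p = 0.5
    (0, 20, 0),    # every individual heterozygous, as a collapsed paralog might look
    (16, 4, 0),    # low-frequency variant
]

print([inbreeding_f(*s) for s in sites])
print(flag_excess_het(sites))
```

A site where every individual is called heterozygous has an allele frequency of exactly 0.5, which is one way collapsed paralogs could pile up in the mid-frequency bins.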

Best,
Ryan

Katharine Prata

Jun 3, 2024, 12:16:16 AM
to dadi-user
Hi,

I thought I'd update everyone in the group about how we (hopefully) resolved this issue.

We looked at where these high-heterozygosity SNPs were positioned in the genome, to see whether they clustered in potentially paralogous regions; however, these sites appeared to come from everywhere.

We used the ANGSD het filter; however, it created a zig-zag pattern in the AFS.

After reading the discussion in the ANGSD issue tracker (https://github.com/ANGSD/angsd/issues/156), we followed Mikhail Matz's advice and projected down to 80% of the sample size after applying the het filter. This smoothed out all of our 1D and 2D SFS.
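For readers unfamiliar with projection, here is a minimal sketch of the hypergeometric down-projection of a 1D spectrum, written in plain Python. In practice one would presumably use dadi's own `Spectrum.project`; the `project` helper and the zig-zag toy counts below are hypothetical, and the sketch assumes an unfolded spectrum indexed by derived-allele count.

```python
from math import comb


def project(unfolded, m):
    """Project an unfolded SFS from n chromosomes down to m <= n.
    A site with i derived alleles among n contributes to bin j of the
    smaller sample with hypergeometric probability
    C(i, j) * C(n - i, m - j) / C(n, m)."""
    n = len(unfolded) - 1
    out = [0.0] * (m + 1)
    for i, count in enumerate(unfolded):
        if count == 0:
            continue
        for j in range(m + 1):
            # math.comb returns 0 when the lower argument exceeds the upper
            out[j] += count * comb(i, j) * comb(n - i, m - j) / comb(n, m)
    return out


# Toy zig-zag spectrum for n = 10 chromosomes (hypothetical counts).
zigzag = [0, 100, 10, 60, 5, 40, 5, 30, 5, 20, 0]

# Project down to 80% of the sample size.
smoothed = project(zigzag, 8)
print([round(x, 1) for x in smoothed])
```

Because each projected bin averages over several neighbouring bins of the original spectrum, alternating high and low entries tend to even out. Note that some sites become monomorphic in the smaller sample, so the end bins gain counts while the total stays the same.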

Attached examples.

Cheers,
Kat


ahya.unfold.het05.Amil_snp69661171Dfoldproject.png
ahya.unfold.het05.group3.group4_snp69661172Dfoldproject.png