Hi Ryan & all,
We are using ANGSD to generate 1D spectra and keep getting spectra with small 'peaks' at the mid-frequencies.
We assumed this might be due either to mis-assigned ancestral alleles or to paralogs.
After applying stricter depth filters (per-group maximum-depth filtering) and comparing against runs that set the major allele to the ancestral allele, using the reference as ancestral, we get these results for groups 1 and 2 (see attached). There are two other groups, but their results are basically the same.
These groups are within one species, and the WGS sequences have been aligned to the reference of a closely related species. We did this so that the reference allele is most likely the ancestral allele.
We are now regenerating these spectra with the sequences aligned to a reference from the same species as the groups, to see whether there is any difference.
Also, the attached 'amil' plot shows the spectrum of our outgroup species when it is aligned to a reference of the same species... this looks more normal, but with a weird drop at 0.5.
Note, these are all folded spectra! I understand that the ancestral state is not needed for folded spectra, but ANGSD requires an ancestral reference when making either folded or unfolded spectra. It doesn't look like this makes a difference, though. We were planning to use unfolded spectra but kept getting these weird patterns, so we switched to folded.
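To illustrate why the ancestral assignment should not matter in principle once a spectrum is folded, here is a minimal Python sketch. The function name `fold_sfs` and the toy counts are made up for illustration; this is not ANGSD's code, and it assumes an unfolded spectrum with one entry per derived-allele count from 0 to n.

```python
def fold_sfs(unfolded):
    """Fold an unfolded SFS (entries for derived counts 0..n) into
    minor-allele-count classes 0..n//2."""
    n = len(unfolded) - 1                # number of sampled chromosomes
    folded = [0.0] * (n // 2 + 1)
    for i, count in enumerate(unfolded):
        # A derived count of i has a minor-allele count of min(i, n - i).
        folded[min(i, n - i)] += count
    return folded


# Toy unfolded spectrum for n = 4 chromosomes (hypothetical counts).
toy = [10, 5, 3, 2, 1]

# Swapping ancestral and derived at every site reverses the unfolded
# spectrum, but the folded spectrum is unchanged.
print(fold_sfs(toy))        # [11.0, 7.0, 3.0]
print(fold_sfs(toy[::-1]))  # [11.0, 7.0, 3.0]
```

This only covers the folding arithmetic; it says nothing about how the choice of ancestral reference affects which sites ANGSD keeps in the first place.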
Any advice or comments on what artefact could create this pattern would be helpful, as would opinions on whether these spectra look OK and reflect a natural phenomenon. Tajima's D for these spectra (groups 1 and 2) was also curiously negative (the opposite of what I'd expect from the look of these spectra), so the low-frequency classes are in fact more abundant than one would expect under a neutral spectrum.
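For reference, this is roughly how the sign of Tajima's D follows from a folded spectrum. It is a minimal Python sketch using the standard Tajima (1989) constants, not ANGSD's own calculation; the function name `tajimas_d_folded` and the example counts are hypothetical, and it assumes the spectrum holds site counts per minor-allele count 0..n//2 for n sampled chromosomes.

```python
import math


def tajimas_d_folded(folded, n):
    """Tajima's D from a folded SFS.

    folded[i] is the number of sites with minor-allele count i
    (i = 0..n//2); class 0 (monomorphic sites) is ignored.
    """
    if len(folded) != n // 2 + 1:
        raise ValueError("folded SFS must have n//2 + 1 entries")
    seg = folded[1:]
    S = sum(seg)                          # number of segregating sites
    if S <= 0:
        raise ValueError("no segregating sites")

    # Mean pairwise differences: a site with minor count i contributes
    # i * (n - i) differing pairs out of n * (n - 1) / 2 pairs.
    pairs = n * (n - 1) / 2
    pi = sum(c * i * (n - i) for i, c in enumerate(seg, start=1)) / pairs

    # Tajima (1989) constants.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    theta_w = S / a1                      # Watterson's estimator
    return (pi - theta_w) / math.sqrt(e1 * S + e2 * S * (S - 1))


# Hypothetical folded spectra for n = 10 chromosomes.
excess_rare = [1000, 50, 5, 2, 1, 1]   # many singletons -> negative D
mid_peak = [1000, 5, 5, 5, 5, 30]      # mid-frequency excess -> positive D
print(tajimas_d_folded(excess_rare, 10) < 0)  # True
print(tajimas_d_folded(mid_peak, 10) > 0)     # True
```

So a spectrum can show a small bump at mid-frequencies and still give a negative D if the low-frequency classes dominate the total number of segregating sites.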
I will post the new results once groups 1-4 have been aligned to the reference genome of the same species.
Kind regards,
Kat