Power With Low SNPs and High Numbers of Individuals

Alex Krohn

Sep 13, 2023, 1:20:18 PM

to dadi-user

Hi there,

I have a RADseq dataset from a large-genomed amphibian. I have hundreds of individuals, but unfortunately, only a few hundred SNPs shared among those hundreds of individuals. I was wondering if any testing has been done on the number of SNPs needed to correctly infer parameters in models using dadi.

It seems intuitive that power would decrease with increasing model complexity (number of parameters, and number of populations), regardless of the number of individuals. Is that true?

Is it worth it to increase power by decreasing the number of individuals, either by subsampling individuals or "populations"?

For example, the strongest population structure is at the regional level (across multiple states in the US). My data show two admixed populations at this level, with hundreds of individuals in each population. I could either subsample those two populations down to a few dozen individuals each, or zoom in to potential subpopulations within those two major populations to do my analyses.

For what it's worth, I'm mostly concerned about testing whether these two populations (or subpopulations within them) are expanding or contracting, while considering their shared population history, most similar to the "simple models plus instantaneous size change" from Dan Portik. I would be most interested in which model best fit the data using AIC, and whether nu[1,2]a or nu[1,2]b was larger.
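For reference, the AIC comparison mentioned above can be sketched as follows. This is a minimal illustration, not dadi code: the model names, log-likelihoods, and parameter counts are made-up placeholders, and only the standard formula AIC = 2k - 2 ln(L) is assumed.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: AIC = 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical fits: model name -> (maximized log-likelihood, free parameters).
# These numbers are placeholders, not results from any real dataset.
fits = {
    "no_size_change": (-1250.0, 3),
    "size_change": (-1240.0, 5),
}

scores = {name: aic(ll, k) for name, (ll, k) in fits.items()}

# Lower AIC is preferred; delta AIC is measured relative to the best model.
best = min(scores, key=scores.get)
delta = {name: score - scores[best] for name, score in scores.items()}

print(best)   # size_change
print(delta)  # {'no_size_change': 16.0, 'size_change': 0.0}
```

With these placeholder values the extra two parameters are justified, since the gain in log-likelihood outweighs the penalty.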

It seems like subsampling and/or zooming in would remove important variation and thus conflate population structure/admixture with potential changes in population size. Is that right, or is this an acceptable practice?

Thanks for your help,

Alex

Ryan Gutenkunst

Sep 19, 2023, 5:34:32 PM

to dadi-user

Hello Alex,

The most systematic analysis I’m aware of is this: http://doi.org/10.1186/s12862-014-0254-4

But I think in your case there’s a more important issue than number of samples. Simply, unmodeled population substructure could bias your inferences in unknown but potentially strong ways. So I would not look toward combining subpopulations to increase power, because you would simultaneously be increasing the potential for bias.

Best,
Ryan

> --
> You received this message because you are subscribed to the Google Groups "dadi-user" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to dadi-user+...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/dadi-user/8c7519d6-f276-48cb-a4d1-77cc1e3d8f64n%40googlegroups.com.

Alex Krohn

Sep 26, 2023, 10:16:50 AM

to dadi...@googlegroups.com

Hi Ryan,

That is an informative paper, thanks for sending it.

Regarding unmodelled population structure, do you think this bias is present in studies that use dadi to assess migration among species? There is likely unmodelled population structure within each species, yet there are numerous studies using dadi to infer ancient migration rates among species with only about 40 samples per species. Or is it that the overwhelming pattern of interspecies divergence swamps the intraspecific population structure?

In my case, the strongest overall pattern is population-level structuring across a biogeographic barrier. There may be substructuring below that, but it is very hard to visualize. Unfortunately, many "subpopulations" are very isolated, and inbreeding is high. So, below this biogeographic population (visible with PCA, Structure, etc.), the overwhelming pattern becomes one of drift and divergence through inbreeding.

Thanks again for your response.

-Alex


Ryan Gutenkunst

Sep 28, 2023, 7:59:21 PM

to dadi-user

Hello Alex,

I’m not on top of the interspecific migration literature, but certainly if people are lumping subpopulations, this could introduce bias. There is also a nice paper showing that not modeling size changes can introduce bias: https://doi.org/10.1093/molbev/msab047

My instinct would be to pick a few subpopulations and use those to look for migration between your species. If you consistently see no or very little inferred gene flow using different pairings, that would be evidence to me of isolation. (Even indirect gene flow through other subpopulations would show up in a pairwise analysis.) It would be interesting to compare this approach with lumping the subpopulations.
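The pairwise bookkeeping described above could be organized roughly like this. It is only a sketch under stated assumptions: the subpopulation labels, the migration estimates, and the threshold for "very little" gene flow are all hypothetical, and the model fitting itself (which would be done in dadi) is not shown.

```python
from itertools import product


def cross_group_pairs(group_a, group_b):
    """List every pairing of one subpopulation from each major group."""
    return list(product(group_a, group_b))


def consistently_low(migration_by_pair, threshold):
    """True if every pairing's inferred migration rate is below the threshold."""
    return all(m < threshold for m in migration_by_pair.values())


# Hypothetical subpopulation labels within the two major populations.
east = ["E1", "E2"]
west = ["W1", "W2"]

pairs = cross_group_pairs(east, west)

# Placeholder migration estimates, one per pairing. In practice each value
# would come from a separate two-population fit; these are made up.
inferred_m = dict(zip(pairs, [0.01, 0.02, 0.00, 0.03]))

# The threshold is an arbitrary illustrative cutoff, not a recommended value.
print(consistently_low(inferred_m, threshold=0.1))
```

Keeping the per-pair estimates in one dictionary also makes it easy to compare them against a single fit on the lumped populations.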

Best,
Ryan


Alex Krohn

Sep 29, 2023, 9:17:34 AM

to dadi...@googlegroups.com

Thanks again for the suggestions. I really appreciate the guidance.

-Alex

