Hi there,
I have a RADseq dataset from a large-genomed amphibian. I have hundreds of individuals, but unfortunately only a few hundred SNPs are shared among them. I was wondering whether any testing has been done on the number of SNPs needed to correctly infer model parameters using dadi.
It seems intuitive that power would decrease with increasing model complexity (number of parameters, and number of populations), regardless of the number of individuals. Is that true?
Is it worth it to increase power by decreasing the number of individuals, either by subsampling individuals or "populations"?
For example, the strongest population structure is at the regional level (across multiple states in the US). My data show two admixed populations at this level, with hundreds of individuals in each population. I could either subsample those two populations down to a few dozen individuals each, or zoom in on potential subpopulations within those two major populations to do my analyses.
It seems like subsampling and/or zooming in would remove important variation and thus risk confounding population structure/admixture with changes in population size. Is that right, or is this an acceptable practice?
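To make the subsampling option concrete, here is a rough sketch of the trade-off I have in mind. It does not use dadi's API; it is plain Python with hypothetical function names (`project_site`, `projected_sfs`) and made-up toy data, and it assumes derived alleles are known at each site. Each site is projected down to a smaller number of chromosomes using hypergeometric sampling, and sites genotyped in fewer chromosomes than the target are dropped, so a smaller target sample size retains more SNPs.

```python
from math import comb


def project_site(n_called, n_derived, n_target):
    """Probability of seeing j derived alleles (j = 0..n_target) when
    n_target chromosomes are drawn without replacement from n_called
    chromosomes, n_derived of which carry the derived allele."""
    if n_target > n_called:
        raise ValueError("cannot project up to a larger sample size")
    total = comb(n_called, n_target)
    return [
        comb(n_derived, j) * comb(n_called - n_derived, n_target - j) / total
        for j in range(n_target + 1)
    ]


def projected_sfs(sites, n_target):
    """Sum per-site projections into a frequency spectrum of size n_target + 1.
    Sites with fewer than n_target called chromosomes are skipped.
    Returns (spectrum, number_of_sites_kept)."""
    sfs = [0.0] * (n_target + 1)
    kept = 0
    for n_called, n_derived in sites:
        if n_called < n_target:
            continue  # too much missing data for this target size
        kept += 1
        for j, p in enumerate(project_site(n_called, n_derived, n_target)):
            sfs[j] += p
    return sfs, kept


# Toy data (invented for illustration):
# (chromosomes genotyped at the site, derived allele count)
sites = [(40, 3), (12, 1), (38, 20), (8, 4)]

for n_target in (8, 12, 38):
    sfs, kept = projected_sfs(sites, n_target)
    print(f"target={n_target}: kept {kept} of {len(sites)} sites")
```

In this toy example, lowering the target sample size keeps more sites, but each retained site contributes a coarser spectrum, which is exactly the trade-off I am unsure about.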
Thanks for your help,
Alex