Dear Dr. Pavlidis,
Thank you very much for your response. This information is very helpful. I do have a follow-up question: what modifications should a user make for pooled data?
In my case, I'm working with low-coverage data pooled across ten individuals per population, and in some cases pooled across multiple populations. I formatted my input according to the SweepFinder format, with n = the total number of reads at the site in the population and x = the number of reads carrying the derived allele. As a result, when I compute the osfs, the integer values range from 0 to 750 (my filtering caps the number of reads per site at 750). However, most sites have far fewer than 750 reads; on average, we expect about 75x coverage.
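To make the setup concrete, here is a minimal sketch in Python of how read-count input like this can be tallied. It assumes a tab-separated file with the header columns position, x, n and folded; the function name and the three example sites are hypothetical and for illustration only, not taken from SweeD or from my actual data.

```python
from collections import Counter
import io


def tally_pooled_sites(handle):
    """Tally derived-read counts (x) and read depths (n) across sites.

    Expects a whitespace-separated table whose header contains the
    columns 'x' and 'n' (assumed layout: position, x, n, folded).
    """
    sfs = Counter()     # number of sites observed with each value of x
    depths = Counter()  # number of sites observed with each value of n

    header = handle.readline().split()
    x_col, n_col = header.index("x"), header.index("n")

    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        x, n = int(fields[x_col]), int(fields[n_col])
        if not 0 <= x <= n:
            raise ValueError(f"derived count {x} exceeds depth {n}")
        sfs[x] += 1
        depths[n] += 1
    return sfs, depths


# Hypothetical sites: x = 3 appears at depths 75 and 60, so the same
# count corresponds to different derived-read fractions.
example = io.StringIO(
    "position\tx\tn\tfolded\n"
    "100\t3\t75\t0\n"
    "250\t40\t750\t0\n"
    "900\t3\t60\t0\n"
)
sfs, depths = tally_pooled_sites(example)
```

The point of the sketch is that sites with the same x but different n land in the same bin of the spectrum, which is the source of my concern about variable coverage.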
Do you recommend this input format for pooled data? Do you think the variability in coverage per site will affect SweeD's computation? Do I need to apply some kind of scaling?
Thank you very much for all of your help!
All the best,
Halie