'error in sample_g' message when running polyfreqs()

13 views
Skip to first unread message

john spring

unread,
Dec 14, 2016, 3:27:09 PM12/14/16
to polyfreqs-users

Hi,

     I'm hoping to run polyfreqs on a small ddRAD-seq dataset from an autotetraploid plant species (733 loci, 89 individuals), and have run into an error that I don't understand. I believe everything is installed correctly as the polyfreqs function works as the tutorial shows using the ref_reads and total_reads dataset. When I attempt to do the same using my data however, I get the following error message:

> out1<-polyfreqs(totalreadmatrix,referencereadmatrix,ploidy = 4)
Starting MCMC...

Error in sample_g(tM, rM, gM_init, pV_init, ploidy, error) : 
  NAs not allowed in probability

I have a fair amount of missing data, but they are entered into the matrices of read counts as 0, not NA, so I am unsure where the NAs are generated or how to resolve the issue. Any help would be much appreciated.

Thanks,

John

Paul Blischak

unread,
Dec 14, 2016, 3:35:12 PM12/14/16
to polyfreqs-users
Hi John,

I've been talking with someone else who has also been getting this error. The main reason she was getting the error is because her read counts are really high and there are problems with computational underflow. How high are your read counts? 

Paul

john spring

unread,
Dec 14, 2016, 4:02:14 PM12/14/16
to polyfreqs-users
They are pretty variable, but many are probably high. Quickly glancing at the first few loci, the individuals with the greatest read depths have counts of roughly 300 to 900 in the total reads table, although most are much lower. Across all individuals and loci (and excluding 0 values), the mean total read count is 63 with sd 82 and median 35.
 

Paul Blischak

unread,
Dec 14, 2016, 5:02:43 PM12/14/16
to polyfreqs-users
Hi John,

I messed around with the coding for the sample_g function and pushed the results to GitHub. If you install polyfreqs from there you can try it out and see if it works now. The two commands below should do the trick but let me know if they don't.

install.packages("devtools")
devtools
::install_github("pblischak/polyfreqs")


Please give this a try and let me know if it works. Thanks!

Paul

john spring

unread,
Dec 14, 2016, 6:38:12 PM12/14/16
to polyfreqs-users
The install commands worked, as did the new version of the package, at least as far as I can tell. It runs, and gives output files that look right when I compare them to the test dataset.
I'd be happy to send copies of my read count files or whatever else if you want to look at it in more detail.

I really appreciate your prompt help. Thanks!

john spring

unread,
Dec 14, 2016, 6:58:18 PM12/14/16
to polyfreqs-users
On closer inspection, it appears that the output covers all loci, but only for the first 10 individuals.

Auto Generated Inline Image 1

Paul Blischak

unread,
Dec 16, 2016, 10:14:40 AM12/16/16
to polyfreqs-users
Hi John,

Sorry I didn't get back to you yesterday. The number of rows in the posterior_freqs, het_obs, and het_exp output isn't the number of individuals because all of these quantities are calculated per locus across individuals. The number of rows in these matrices is the number of samples saved from the posterior during the MCMC. If you want to get genotypes for all of the individuals too run the polyfreqs command with 'genotypes=TRUE' as an argument. I would also suggest saving more than 10 posterior samples so that you can get better parameter estimates. Running the analysis with 10,000 MCMC generations and sampling every 10th or 20th generation should give good results. You can check effective sample sizes with the R package "coda". Code for doing this should be in the vignette but if it isn't I'd be happy to help you check the ouput.

Paul


john spring

unread,
Dec 16, 2016, 12:34:07 PM12/16/16
to polyfreqs-users
Paul,

Okay, thanks, that clears things up for me. I suppose I should have read the documentation more closely and been able to figure that out on my own, but thanks for explaining it to me nonetheless.

I really appreciate your help.

John

Reply all
Reply to author
Forward
0 new messages