Dear All,
I'm setting up an analysis broadly following the BFD* tutorial, and I have two questions:
1. Regarding prior specification: very little is known about the species I work on, so it is difficult to make sensible estimates of tree height and population sizes (for the lambda and theta values), or to choose a sensible distribution for them. My plan is therefore to let SNAPP estimate these values in the MCMC chain (with broad upper and lower bounds). I would then check the results with an additional run using modified priors, as suggested in Bryant et al. (2012). Does this sound like an appropriate way forward?
2. My preliminary runs are taking prohibitively long. For example, I set one up 4 days ago to run for 100,000 MCMC steps, and the likelihood.log file still contains only the header line "Sample Likelihood", i.e. (I think) it has not yet completed even 1 of these 100,000 steps. My data consist of 115 individuals and 1762 SNPs. From reading around on this group, the general advice seems to be to experiment with threading, which I have done without much success. Failing that, I have just set up a run with 34 individuals, in the hope that this will give more feasible runtimes. Is there a minimum number of individuals per species needed for BFD*? And does anyone have any advice for speeding up runtimes beyond the above?
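For anyone wanting to try the same kind of reduction, the subsampling mentioned above could be sketched roughly as follows. This is only an illustration in plain Python, not part of SNAPP or the BFD* tutorial: the function name, the individual and species labels, and the cap per species are all hypothetical, and it assumes you already have a mapping from each individual to its putative species.

```python
import random
from collections import defaultdict


def subsample_per_species(species_of, max_per_species, seed=0):
    """Keep at most `max_per_species` randomly chosen individuals per species.

    species_of: dict mapping individual ID -> species label (hypothetical input).
    Returns a dict mapping species label -> sorted list of retained individuals.
    """
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible

    # Group individuals by species; sorting makes the result independent
    # of the insertion order of the input dict.
    by_species = defaultdict(list)
    for individual, species in sorted(species_of.items()):
        by_species[species].append(individual)

    kept = {}
    for species, individuals in by_species.items():
        if len(individuals) <= max_per_species:
            # Small groups are kept whole.
            kept[species] = list(individuals)
        else:
            # Larger groups are randomly thinned without replacement.
            kept[species] = sorted(rng.sample(individuals, max_per_species))
    return kept


# Toy example with made-up labels (not real data).
species_of = {
    "ind01": "spA", "ind02": "spA", "ind03": "spA", "ind04": "spA",
    "ind05": "spB", "ind06": "spB",
    "ind07": "spC",
}
kept = subsample_per_species(species_of, max_per_species=2, seed=1)
for species, individuals in sorted(kept.items()):
    print(species, individuals)
```

The retained IDs would then be used to filter the alignment before building the SNAPP XML. Whether a given cap per species is adequate for BFD* is exactly the open question above, so the value 2 here is purely for illustration.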
Thanks very much,
Jane