Nested Sampling capabilities and citations


Kate Naughton

Aug 14, 2019, 1:29:07 AM
to beast...@googlegroups.com
Hi all, 

After a good deal of testing and finagling, I'm finally using Nested Sampling to compare three models of species delimitation under StarBeast. It has taken a while to fine-tune the sub-chain lengths, and I have a couple of questions.

For interest's sake: 24 individuals, 50 exon sequences, linked site model (JC69), strict clock. Two of the models contain three taxa, the third model contains only two. Sub-chain length 300k, information content ranges from 1400-1600, and I'm running with 396 particles (18 threads, 22 particles per thread, on a c5.9xlarge instance on Amazon Web Services).

(1) Auto-calibration:
There appears to be an auto-calibration option for the sub-chain length, although it's not possible to use it in a multi-threaded analysis - so unless you're running a fairly small number of particles, I'm assuming the main usefulness (in large analyses) is to run it once, with one particle, to see where the sub-chain length maxes out. We've done this on my end, and we've ended up with a sub-chain length of 300,000 - which is a bit of a slog.

Is there a reason this feature hasn't been included in the tutorials or the "how-to-use" on GitHub? I've run a few tests with single particles and the results are consistent, so it seems to have been validated fairly well - and had I been certain of it, it would have saved some time. A few tests suggested 200k would be sufficient (i.e., running with increasing sub-chain lengths until consistent estimates are produced), and if one of those runs hadn't popped up with an odd result that was very different from the previous one, I might have relied on that, with no reason to doubt the results.
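The consistency check described here - rerunning with increasing sub-chain lengths until replicate estimates agree - can be sketched as a small script. This is only an illustration of the protocol, not part of the BEAST2 NS package: `run_ns` is a hypothetical stand-in for launching a nested sampling analysis and parsing its log marginal-likelihood estimate, and the lengths and tolerance are made-up defaults.

```python
# Sketch of the sub-chain-length calibration protocol described above:
# run replicates with increasing sub-chain lengths until the estimates
# agree within a tolerance. `run_ns` is a hypothetical callable that
# runs one nested sampling analysis and returns its log-Z estimate.

def calibrate_subchain(run_ns, lengths=(50_000, 100_000, 200_000, 300_000),
                       replicates=3, tolerance=1.0):
    """Return the shortest sub-chain length whose replicate log-Z
    estimates agree within `tolerance` log units, plus the estimates."""
    for length in lengths:
        estimates = [run_ns(subchain_length=length, seed=s)
                     for s in range(replicates)]
        if max(estimates) - min(estimates) <= tolerance:
            return length, estimates
    # none of the tested lengths gave consistent estimates
    return None, []
```

In practice each `run_ns` call is a full analysis (days of CPU time here), so one would typically run the replicates in parallel rather than in this sequential loop.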

(2) Stop-factor:
Another parameter we ended up playing with was the stopFactor. With my data, the default value (2) means that an analysis would stabilise, unable to improve the likelihood, and then sit on those values for about a third of the run time. Repeated tests have shown that the runs stabilised at approximately 1.10 * H * N, and dropping the stopFactor to 1.25 has saved a decent number of CPU hours (given that this analysis has a sub-chain length of 300k, these runs take over a week across 18 CPUs), while still showing a lengthy stabilisation period (which is reassuring).

Is there any reason we shouldn't do this? I'm assuming that a safe stopFactor will vary with the depth and complexity of the data, and the shape of the likelihood space.

(3) resume capability:
Apologies if this has been picked up already, I see it's on the list of features to add - I just wanted to ask if there was any progress there. I sadly don't have a UPS available for my desktop machine (hence the AWS time), which means that a power outage of less than ten seconds lost ten days of processing. I realise that it's probably a lot more complex to add this functionality than it looks, but it would be extremely reassuring to have it in place!

(4) citations:
I'm starting to put together the paper, and I had a quick look for other papers that have used Nested Sampling under BEAST2. I couldn't find any on Web of Science (either by keyword or citation search) - which isn't surprising, given how new the package is, but it's possible I've missed something. Do you know of any so far?

Thanks again for the help - in spite of some of the difficulties listed above, I'm very excited about this particular analysis and the options it opens up for future research questions.

cheers,
-Kate

Remco Bouckaert

Aug 19, 2019, 10:31:23 PM
to beast...@googlegroups.com
Hi Kate,

Some answers below:

On 14/08/2019, at 5:28 PM, Kate Naughton <kmnau...@gmail.com> wrote:
> (1) Auto-calibration:
> There appears to be an auto-calibration option for the sub-chain length, although it's not possible to use it in a multi-threaded analysis - so unless you're running a fairly small number of particles, I'm assuming the main usefulness (in large analyses) is to run it once, with one particle, to see where the sub-chain length maxes out. We've done this on my end, and we've ended up with a sub-chain length of 300,000 - which is a bit of a slog.
>
> Is there a reason this feature hasn't been included in the tutorials or the "how-to-use" on GitHub? I've run a few tests with single particles and the results are consistent, so it seems to have been validated fairly well - and had I been certain of it, it would have saved some time. A few tests suggested 200k would be sufficient (i.e., running with increasing sub-chain lengths until consistent estimates are produced), and if one of those runs hadn't popped up with an odd result that was very different from the previous one, I might have relied on that, with no reason to doubt the results.

The auto-calibration was an idea we tried in order to reduce the runtime of the analysis. Unfortunately, it turned out either to produce biased estimates or to take longer than runs with a fixed sub-chain length. Please do not use it.


> (2) Stop-factor:
> Another parameter we ended up playing with was the stopFactor. With my data, the default value (2) means that an analysis would stabilise, unable to improve the likelihood, and then sit on those values for about a third of the run time. Repeated tests have shown that the runs stabilised at approximately 1.10 * H * N, and dropping the stopFactor to 1.25 has saved a decent number of CPU hours (given that this analysis has a sub-chain length of 300k, these runs take over a week across 18 CPUs), while still showing a lengthy stabilisation period (which is reassuring).
>
> Is there any reason we shouldn't do this? I'm assuming that a safe stopFactor will vary with the depth and complexity of the data, and the shape of the likelihood space.

The default stop factor of 2 is the one recommended in the original nested sampling paper by Skilling, and there is some theoretical motivation that the results can be trusted once you have run that many steps. It is possible for some consecutive steps to make no progress in likelihood simply because the posterior landscape is very flat, so stopping as soon as the likelihood stops changing can be premature.

I believe the stop factor is rather conservative, as you suggested. If you have run an analysis with a small number of points without the likelihood ever getting stuck for a while and then increasing again, that suggests the posterior landscape is well behaved, and you may be able to stop earlier than the stop factor suggests.
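The stopping rule under discussion can be seen in a toy nested sampling run. The sketch below is purely illustrative and is not the BEAST2 NS implementation: it assumes a one-dimensional Gaussian likelihood, a uniform prior on [-10, 10], and simple rejection sampling in place of the constrained MCMC sub-chains, and it terminates once the iteration count exceeds stopFactor * H * N.

```python
import math
import random

# Toy nested sampling run illustrating Skilling's stopping rule:
# terminate after stop_factor * H * N iterations, where H is the
# running information estimate and N the number of live points.

def log_likelihood(x):
    return -0.5 * x * x          # unnormalised 1-D Gaussian

def logaddexp(a, b):
    if a == float("-inf"):
        return b
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def nested_sampling(n_points=100, stop_factor=2.0, seed=1):
    rng = random.Random(seed)
    live = [rng.uniform(-10.0, 10.0) for _ in range(n_points)]
    log_z = float("-inf")                              # running log-evidence
    h = 0.0                                            # running information H
    log_w = math.log(1.0 - math.exp(-1.0 / n_points))  # first prior-shell weight
    i = 0
    while True:
        i += 1
        worst = min(live, key=log_likelihood)
        log_l = log_likelihood(worst)
        log_zw = log_w + log_l                         # this shell's evidence share
        log_z_new = logaddexp(log_z, log_zw)
        # update H with Skilling's running formula
        if log_z == float("-inf"):
            h = math.exp(log_zw - log_z_new) * log_l - log_z_new
        else:
            h = (math.exp(log_zw - log_z_new) * log_l
                 + math.exp(log_z - log_z_new) * (h + log_z)) - log_z_new
        log_z = log_z_new
        # replace the worst point by one with higher likelihood
        # (stand-in for the MCMC sub-chain in the real algorithm)
        while True:
            x = rng.uniform(-10.0, 10.0)
            if log_likelihood(x) > log_l:
                live[live.index(worst)] = x
                break
        log_w -= 1.0 / n_points                        # shrink the prior shell
        # Skilling's stopping rule, as discussed above
        if i > stop_factor * h * n_points:
            # fold in the evidence still carried by the live points
            log_l_live = [log_likelihood(p) for p in live]
            m = max(log_l_live)
            log_rest = (-i / n_points) + m + math.log(
                sum(math.exp(l - m) for l in log_l_live) / n_points)
            return logaddexp(log_z, log_rest), h, i
```

For this toy problem the true log-evidence is log(sqrt(2*pi)/20), and the run stops after roughly stop_factor * H * N iterations; lowering stop_factor trades a shorter run against the risk of quitting during a flat stretch of the likelihood.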

> (3) resume capability:
> Apologies if this has been picked up already, I see it's on the list of features to add - I just wanted to ask if there was any progress there. I sadly don't have a UPS available for my desktop machine (hence the AWS time), which means that a power outage of less than ten seconds lost ten days of processing. I realise that it's probably a lot more complex to add this functionality than it looks, but it would be extremely reassuring to have it in place!

I raised an issue for this (nested-sampling package issue #5), but I'm afraid it needs a bit of thought to be able to restore everything efficiently.

> (4) citations:
> I'm starting to put together the paper, and I had a quick look for other papers that have used Nested Sampling under BEAST2. I couldn't find any on Web of Science (either by keyword or citation search) - which isn't surprising, given how new the package is, but it's possible I've missed something. Do you know of any so far?

The nested sampling package should be cited as:

Patricio Maturana, Brendon J. Brewer, Steffen Klaere, Remco Bouckaert. Model selection and parameter inference in phylogenetics using Nested Sampling. Systematic Biology, syy050, 2018
doi:10.1093/sysbio/syy050

Cheers,

Remco
