distributions to approximate posteriors in Bayesian updating


Quresh Latif

Nov 10, 2023, 2:11:22 PM
to nimble-users
Hi all. I'm wondering if anyone has thoughts or can suggest resources describing which of the distributions available in Nimble are flexible enough to approximate posterior distributions for use as priors in Bayesian updating. The conundrum I am facing is neatly described in this post. There seems to be plenty of theory available for Bayesian updating, but much less practical guidance on implementation, especially if we want to use available software such as Nimble. As an example, I have attached histograms showing the actual posterior distribution of a parameter from a fitted model alongside a normal approximation of it. As you can see, they look quite different, so a normal distribution does not appear to be a good choice for approximating this posterior. On the other hand, I don't know how closely the approximation needs to match the actual posterior to function effectively (i.e., for estimates following analysis of all data to look similar regardless of whether we use Bayesian updating or simply analyze all of the data at once).
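For concreteness, a minimal R sketch of the kind of moment-matched normal approximation at issue (illustrative only; post_draws stands in for real MCMC output):

## Moment-matching: approximate a posterior by a normal with the same
## mean and SD as the MCMC draws. post_draws is a stand-in for real output.
post_draws <- rgamma(10000, shape = 2, rate = 4)

mu_hat <- mean(post_draws)   ## posterior mean
sd_hat <- sd(post_draws)     ## posterior SD

## overlay the normal approximation on the actual posterior
hist(post_draws, freq = FALSE, breaks = 50,
     main = "Posterior vs. normal approximation")
curve(dnorm(x, mean = mu_hat, sd = sd_hat), add = TRUE, lwd = 2)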

Any thoughts or references on this topic would be much appreciated.
Attachments: Posterior_actual.png, Posterior_normalApprox.png

Daniel Turek

Nov 19, 2023, 7:36:34 AM
to Quresh Latif, nimble-users
Quresh, one thought: if you're comfortable installing nimble from a different branch, then on branch "prior_samples_sampler" I drafted a sampler which uses an existing set of samples (presumably the posterior from a pre-existing MCMC run) as the prior for one or more parameters in a new model. On each MCMC iteration, this sampler will use one of those (existing) samples as the value for the target node(s) to which this sampler is assigned. The numeric samples are provided to the sampler (at the time of sampler assignment) as an array with dimension (nSamples x nDim), where nSamples is the number of samples you have from the previous MCMC run, and nDim is the number of dimensions in the target node(s). The sampler is assigned as:

conf$addSampler(targetNodes, type = "prior_samples", samples = samplesArray)

There's certainly a chance of running into memory issues for large dimensions or large numbers of samples, but for reasonably sized problems it should work fine. Maybe this will help with your problem.
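To make the assignment concrete, here is a hedged end-to-end sketch of how the sampler might be used, assuming the branch is installed (the toy model and all object names are illustrative, not taken from the branch itself):

library(nimble)

## toy model: beta gets a placeholder prior here (see the documentation
## below for the recommended right-hand-side-only alternative)
code <- nimbleCode({
  beta ~ dnorm(0, sd = 100)
  for (i in 1:n) {
    y[i] ~ dnorm(beta * x[i], sd = 1)
  }
})
model <- nimbleModel(code,
                     constants = list(n = 10, x = 1:10),
                     data = list(y = rnorm(10)),
                     inits = list(beta = 0))

conf <- configureMCMC(model)
conf$removeSamplers("beta")   ## drop the default sampler on beta

## samplesArray: (nSamples x nDim) array of draws from the earlier MCMC run;
## random values stand in here for real posterior samples
samplesArray <- matrix(rnorm(1000), ncol = 1)
conf$addSampler("beta", type = "prior_samples", samples = samplesArray)

mcmc <- buildMCMC(conf)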

I'm also copying the documentation for the "prior_samples_sampler" below, so you can read more about it.

@section prior_samples sampler:

The prior_samples sampler uses a provided set of numeric values (\code{samples}) to define the prior distribution of one or more model nodes.  On every MCMC iteration, the prior_samples sampler takes value(s) from the numeric values provided, and stores these value(s) into the target model node(s).  This allows one to define the prior distribution of model parameters empirically, using a set of numeric \code{samples}, presumably obtained previously using MCMC.  The \code{target} node may be either a single scalar node (the scalar case), or a collection of model nodes.

The prior_samples sampler provides two options for selecting the value to use on each MCMC iteration.  The default behaviour is to take sequential values from the \code{samples} vector (scalar case), or sequential rows of the \code{samples} matrix (multidimensional case).  The alternative behaviour, set via the control argument \code{randomDraws = TRUE}, instead uses random draws from the \code{samples} vector (scalar case), or randomly selected rows of the \code{samples} matrix (multidimensional case).

If the default of sequential selection of values is used, and the number of MCMC iterations exceeds the length of the \code{samples} vector (scalar case) or the number of rows of the \code{samples} matrix, then \code{samples} will be recycled as necessary for the number of MCMC iterations.  A message to this effect is also printed at the beginning of the MCMC chain.

Logically, prior_samples samplers might want to operate first, in advance of other samplers, on every MCMC iteration.  By default, at the time of MCMC building, all prior_samples samplers are re-ordered to appear first in the list of MCMC samplers.  This behaviour can be subverted, however, by setting nimbleOptions(MCMCorderPriorSamplesSamplersFirst = FALSE).

The prior_samples sampler can be assigned to non-stochastic model nodes (nodes which are not assigned a prior distribution in the model). In fact, it is recommended that nodes being assigned a prior_samples sampler are not given a prior distribution in the model, and instead appear only on the right-hand side of model declaration lines.  If a prior_samples sampler is assigned to a node that does have a prior distribution, the prior distribution will be overridden by the sample values provided to the sampler; however, the node will still be a stochastic node for other purposes: it will contribute to the model joint density (evaluating the provided sample values against the prior distribution), it will have an MCMC sampler assigned to it by default, and this may introduce potential for confusion.  In this case, a message is issued at the time of MCMC building.
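As an illustration of the recommended pattern (a hypothetical model, not from the documentation), beta below has no prior declaration and appears only on the right-hand side, so its value on each iteration comes entirely from the prior_samples sampler:

code <- nimbleCode({
  sigma ~ dunif(0, 10)
  for (i in 1:n) {
    y[i] ~ dnorm(beta * x[i], sd = sigma)   ## beta: right-hand side only
  }
})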

The prior_samples sampler accepts the following control list elements:
\code{samples}. A numeric vector or matrix.  When the \code{target} node is a single scalar-valued node, \code{samples} should be a numeric vector.  When the \code{target} node specifies d model dimensions, with d > 1, \code{samples} should be a matrix containing d columns.  The \code{samples} control argument is required.
\code{randomDraws}. A logical argument, specifying whether to use a random draw from \code{samples} on each iteration.  If \code{samples} is a matrix, then a randomly-selected row of the \code{samples} matrix is used.  When \code{FALSE}, sequential values (or sequential matrix rows) are used (default = \code{FALSE}).
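To illustrate, hedged example calls for the two options above (conf is an existing MCMC configuration and samplesMatrix a hypothetical (nSamples x d) matrix of draws for a d-dimensional target):

## default: sequential rows, recycled if the chain outruns the samples
conf$addSampler(target = c("beta[1]", "beta[2]"), type = "prior_samples",
                control = list(samples = samplesMatrix))

## alternative: a randomly selected row on each iteration
conf$addSampler(target = c("beta[1]", "beta[2]"), type = "prior_samples",
                control = list(samples = samplesMatrix, randomDraws = TRUE))

## optionally keep prior_samples samplers in their assigned positions,
## rather than re-ordering them to run first (the default behaviour)
nimbleOptions(MCMCorderPriorSamplesSamplersFirst = FALSE)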


Quresh Latif

Nov 19, 2023, 4:19:34 PM
to Daniel Turek, nimble-users

Wow! That sounds like the perfect feature for Bayesian updating. I thought about my current problem a bit more and decided that multivariate normal approximations should suffice for priors, as I am not actually drawing inference from the posteriors but rather running simulations to estimate and compare power under alternative sampling and trend scenarios. However, our team has discussed incorporating Bayesian updating into the analysis for our long-term monitoring program to avoid having to re-analyze all of the data every year, and thus improve the efficiency of the analysis. Very cool!
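For reference, a minimal sketch of that multivariate normal approximation (illustrative names; postSamples stands in for a (nSamples x nParams) matrix of draws from the earlier fit):

postSamples <- matrix(rnorm(3000), ncol = 3)   ## stand-in for real draws
mu  <- colMeans(postSamples)                   ## posterior mean vector
Sig <- cov(postSamples)                        ## posterior covariance

## in the updated model, the corresponding informative prior could be
## written as, e.g.:
##   beta[1:3] ~ dmnorm(mu[1:3], cov = Sig[1:3, 1:3])
## with mu and Sig passed in through the model's constants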


Quresh S. Latif 
Research Scientist
Bird Conservancy of the Rockies

Phone: (970) 482-1707 ext. 15

www.birdconservancy.org

Quresh Latif

Nov 19, 2023, 4:27:41 PM
to Daniel Turek, nimble-users

Oh, I now realize what you meant by a branch. You're saying this isn't yet available in the main version of Nimble on CRAN? When do you expect this feature to be incorporated into the main branch?

If you're looking for folks to test it out, I could try it in my simulations. I have never installed a package from a branch, but I could give it a shot.

Quresh S. Latif 
Research Scientist
Bird Conservancy of the Rockies

Phone: (970) 482-1707 ext. 15

www.birdconservancy.org



Daniel Turek

Nov 21, 2023, 9:32:02 AM
to Quresh Latif, nimble-users
Quresh, thanks for the kind words, and apologies that this feature isn't available in the main package just yet.  TBD on when it will get pushed into a release.

In the meantime, you can give it a go by installing nimble from the "prior_samples_sampler" branch.  This is straightforward to do by running the code below:

## start a new R session, then remove any existing nimble installation
remove.packages("nimble")
## install.packages("remotes")  ## uncomment if 'remotes' is not yet installed
library(remotes)
## install nimble from the development branch on GitHub
remotes::install_github("nimble-dev/nimble", ref = "prior_samples_sampler", subdir = "packages/nimble")
library(nimble)


Please let me know if this works for you, or if you have any questions, or if you try using the new prior_samples sampler.

Cheers,
Daniel
