Convergence depending on initial values

Rémi

Jun 22, 2021, 8:53:12 AM
to nimble-users

Hi all,

On the same model as my previous post, some NIMBLE code generates chains that converge to a different point depending on the initial values of some parameters, even though, as I understand it, these parameters are first updated from their full conditional posterior distribution, so the initialization shouldn't change anything:

MODEL:
I'm working on CMR (capture-mark-recapture) with a latent multinomial model that handles misidentification: latent histories are filled with 0 (not captured), 1 (captured and correctly identified) and 2 (captured and misidentified), and the vector x (which counts the number of times each latent history has been observed) follows a multinomial distribution.
The parameters are the capture probability "p" and the identification probability "a".

METROPOLIS ALGORITHM:
1) Update p and a (sample them from their posteriors conditional on x).
-> This step does not depend on the initial values of p and a.

2.1) Sample a proposal x_prop.
2.2) Compute the Metropolis ratio r = f(x_prop | p, a) / f(x | p, a).
2.3) Accept x_prop with probability min(1, r).
-> This step depends only on the values of p and a that were just generated in step 1, not on their initial values, right?
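
For concreteness, here is a rough plain-R sketch of one iteration of this scheme. The helpers sample_pa(), propose_x() and loglik() are placeholders standing in for my actual model code:

# One iteration of the two-step sampler described above (placeholder helpers).
one_iteration <- function(x, p, a, sample_pa, propose_x, loglik) {
  # Step 1: draw p and a from their full conditional given the current x
  pa <- sample_pa(x)
  p  <- pa$p
  a  <- pa$a
  # Step 2: Metropolis update of the latent-history counts x
  x_prop <- propose_x(x)
  log_r  <- loglik(x_prop, p, a) - loglik(x, p, a)
  if (log(runif(1)) < log_r) x <- x_prop
  list(x = x, p = p, a = a)
}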

So my question is: how is it possible that when I change the initial values of p and a that I give to NIMBLE, the chains converge to different distributions? (The initialization of x does not change anything; all chains converge to the same posterior for identical initializations of p and a.)
Is there something I have misunderstood about how the initial values are used in the algorithm (for example, is the ratio r computed with the values from the previous iteration rather than the current values of the model)?

Thank you,

Best regards,
Rémi

David Pleydell

Jun 22, 2021, 11:13:49 AM
to nimble-users
Hi Rémi

When a posterior distribution is multi-modal, standard MCMC algorithms will often converge to one of those modes and then struggle to make jumps between modes. 

One alternative is adaptive parallel tempering.  

I have implemented many of the ideas presented in these two papers in a nimble package called nimbleAPT.  
The package is available here https://github.com/DRJP/nimbleAPT (installation instructions are at the bottom of that page) 

Once installed, in R you can get a quick overview of what the package can do by running 'vignette("nimbleAPT-vignette")'. You can also run 'demo(APT_demo)' for further insight. 
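
For completeness, the whole sequence in R looks roughly like this (installing via the remotes package is just one option; the GitHub page has the authoritative instructions):

# install.packages("remotes")               # if not already installed
remotes::install_github("DRJP/nimbleAPT")   # install from GitHub
library(nimbleAPT)
vignette("nimbleAPT-vignette")              # quick overview of the package
demo(APT_demo)                              # further insight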

If you use the package and find it useful, please cite it as specified here https://zenodo.org/record/1049549#.YNH1znXnikA

Any feedback concerning the package would be much appreciated

Best regards
David

Chris Paciorek

Jun 22, 2021, 3:38:29 PM
to Rémi, nimble-users
Hi Rémi,

As David says, if you start from different initial values, then multimodality in the posterior could make it look like you are getting different posteriors. I don't know whether there is multimodality in your case or not.

As for your question about the initialization not changing anything: if your configuration uses Metropolis (i.e., NIMBLE's RW or RW_block samplers or related samplers), initialization DOES matter. E.g., if you have a Metropolis sampler on 'p', the first proposal for 'p' will be based on its initial value, and acceptance/rejection will depend on the current values of the other parameters in the model that affect the conditional posterior for 'p'. The same thing happens when the MCMC then proposes 'a' using another Metropolis sampler. So saying "That step does not depend on the initial values of p and a" is not correct, assuming I am understanding you correctly. On the other hand, if the updates for 'p' and 'a' were based on conjugate samplers, then for whichever parameter is sampled first, the initial value of that parameter won't matter. But the update from that first sampler will still differ depending on the initial value of the second parameter, if the second parameter is involved in the conditional posterior for the first.
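
One quick way to see which case applies to your model is to print the samplers that NIMBLE assigned to 'p' and 'a' in your MCMC configuration, for example:

conf <- configureMCMC(model)      # 'model' being your nimbleModel object
conf$printSamplers(c("p", "a"))   # shows conjugate vs. RW/Metropolis assignments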

Hope that helps.

-chris

Jose Jimenez Garcia-Herrera

Jun 23, 2021, 3:50:34 PM
to David Pleydell, nimble...@googlegroups.com

Hi David. I've been browsing your nimbleAPT package. In some of my code the improvements are very substantial. I have a question about it: how does one deal with conjugate samplers?

Best regards,

Jose

David Pleydell

Jun 24, 2021, 6:57:38 PM
to nimble-users
Hi Jose

Many thanks for this feedback. It's a good question!

The package was not designed with conjugate updates in mind. Moreover, having just taken a brief look at nimble's 'MCMC_conjugacy.R' file for the first time, I suspect it could be quite a challenge to adapt nimbleAPT to include conjugate updates. If that's a challenge you (or anyone else reading this) would like to look into, then we could certainly discuss it further off-group.

Right now, the easiest option would be to drop the conjugate updates (removeSamplers()) and switch to Metropolis-Hastings, e.g. using one or more 'RW_block_tempered' samplers (for continuous parameters). I realise the list of samplers in the package could be more extensive, so if there are any samplers in particular that you would like to see added, do let me know.
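
A minimal sketch of that swap might look like the following; the sampler type and buildAPT arguments are written from memory, so please check the vignette for the exact names:

# Sketch only: replace the default (possibly conjugate) samplers on p and a
# with a tempered block RW sampler, then build and run the APT MCMC.
library(nimble)
library(nimbleAPT)

conf <- configureMCMC(model)                  # 'model' is your nimbleModel object
conf$removeSamplers(c("p", "a"))              # drop the samplers assigned by default
conf$addSampler(target = c("p", "a"),
                type   = "sampler_RW_block_tempered")
aptMCMC  <- buildAPT(conf, Temps = 1:5)       # five rungs in the temperature ladder
cModel   <- compileNimble(model)
cAptMCMC <- compileNimble(aptMCMC, project = model)
cAptMCMC$run(10000)                           # run the tempered MCMC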

Best regards
David 
