Convergence diagnostics


Chris

Nov 13, 2023, 12:05:14 PM
to PEUQSE-users
Hey Ashi! I have a question about getting convergence diagnostics. I apologize that it is a little long.

To fit within the time constraints of the HPC cluster we're using, we repeatedly run a relatively short MCMC chain with the "continue_sampling" option turned on. The problem is that our convergence diagnostics plots (Geweke and autocorrelation plots) do not cover the entire run, only each smaller set of samples.

For Ensemble Slice Sampling with many separate MPI runners, it seems that the function 'getConvergenceDiagnostics' requires a 3D array of data whose dimensions are the walker sample number, the actual walker/MPI process number, and the parameter values (i.e. numSamples, numChains, numParameters).

I was wondering if there was a meaningful way to calculate the convergence diagnostics if we only have the cumulative post_burn_in_samples. We unfortunately did not set the "mcmc_store_samplingObject" option to True, which I think would have given us more information for each individual runner.

I tried the following code using the saved parameter estimation object, and it did give me an answer, but I am not sure if the plots that it gave me are at all useful, since it is not checking each runner independently:
```
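import PEUQSE  # pe_object (the saved parameter estimation object) is assumed to be loaded already

param_names = pe_object.UserInput.model['parameterNamesAndMathTypeExpressionsDict']
# post_burn_in_samples is 2D here: (numSamples, numParameters)
num_samples, num_params = pe_object.post_burn_in_samples.shape
# Insert a singleton chain axis: (numSamples, 1, numParameters)
chain_samples = pe_object.post_burn_in_samples.reshape(num_samples, 1, num_params)
PEUQSE.InverseProblem.calculateAndPlotConvergenceDiagnostics(
    chain_samples,
    param_names,
    pe_object.UserInput.scatter_matrix_plots_settings,
    pe_object.UserInput.directories['graphs'],
    showFigure=False,
)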
```
This is the output:
[Two attached screenshots of the resulting convergence diagnostics plots: "Screen Shot 2023-11-13 at 11.49.56 AM.png" and "Screen Shot 2023-11-13 at 11.51.23 AM.png"]

TL;DR: we would like to evaluate how well our model converged using the convergence statistics, but we only have the cumulative post-burn-in samples. We are wondering whether that is possible without having to re-run our model.
Chris B.

Aditya Ashi Savara

Nov 13, 2023, 12:56:52 PM
to Chris, PEUQSE-users
As background for anyone reading this: I am assuming that you have looked at the convergence diagnostics section in the document below, which I should reference more clearly from the manual.


As noted there, convergence is in any case something to be a bit wary of, because it doesn't tell us whether there is some other solution that has not been found.
Also as noted there, when the diagnostics look good the Geweke plot should go to zero, and the ACT (autocorrelation time) plot should first be sloped and then flat.
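
For anyone who wants to see what those two plots are measuring, here is a minimal NumPy sketch for a single parameter's chain. It is not PEUQSE's implementation: the function names `geweke_z` and `integrated_act` are made up for illustration, the Geweke score uses plain sample variances rather than a spectral-density estimate, and the autocorrelation sum is simply truncated at the first non-positive lag.

```python
import numpy as np

def geweke_z(chain, first=0.1, last=0.5):
    """Simplified Geweke z-score for a 1D chain.

    Compares the mean of the first 10% of the chain with the mean of the
    last 50%. Values near zero suggest the two segments agree.
    """
    chain = np.asarray(chain, dtype=float)
    n = chain.size
    a = chain[: int(first * n)]           # early segment
    b = chain[int((1.0 - last) * n):]     # late segment
    std_err = np.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)
    return (a.mean() - b.mean()) / std_err

def integrated_act(chain, max_lag=None):
    """Rough integrated autocorrelation time: 1 + 2 * sum of rho_k.

    The sum stops at the first lag whose autocorrelation is not positive.
    Assumes the chain is not constant (nonzero variance).
    """
    x = np.asarray(chain, dtype=float)
    x = x - x.mean()
    n = x.size
    if max_lag is None:
        max_lag = n // 2
    var = np.dot(x, x) / n
    tau = 1.0
    for k in range(1, max_lag):
        rho = np.dot(x[:-k], x[k:]) / (n * var)  # lag-k autocorrelation
        if rho <= 0:
            break
        tau += 2.0 * rho
    return tau

if __name__ == "__main__":
    # Synthetic, correlated chain (AR(1)), only for demonstration.
    rng = np.random.default_rng(0)
    chain = np.zeros(5000)
    for i in range(1, chain.size):
        chain[i] = 0.9 * chain[i - 1] + rng.normal()
    # Evaluate both diagnostics on growing prefixes of the chain,
    # which is roughly what the diagnostic plots show along the x-axis.
    for n in (500, 1000, 2000, 5000):
        print(n, geweke_z(chain[:n]), integrated_act(chain[:n]))
```

Evaluating the two functions on growing prefixes of the chain gives the kind of curves described above: a Geweke score that settles near zero, and an ACT estimate that changes at first and then levels off.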

What if we are applying these to an ensemble of walkers, and not a single chain?

Let's consider what ESS and EJS (aka AIES) do.  Those ensembles of walkers communicate and have position movement / exchange. They don't ignore each other.  Both of those algorithms are expected to ultimately have all walkers in the HPD.
Actually, even if they were all independent walkers, all or most walkers would end up inside the HPD.
So, at least qualitatively, we'd expect the aggregate chain from all walkers to have the same behavior.
Since convergence is anyway a "nothing is certain" criterion, and since I happen to know your posteriors looked great, I think it is fine.
We see the correct behavior in your convergence graphs. I've seen a downward slope like the one in your ACT graph followed by another mode coming in, but I've also seen it happen without another mode coming in, and there is always a chance of another mode coming in.
I would just disclose in the paper that the convergence diagnostics were run on the ensemble's full chain, and not worry further.

My personal view goes something like this: we do what is computationally reasonable, we look for reasonable results, we check that the convergence diagnostics look reasonable, and we report if they do. 
Some problems you can run for a year and the convergence diagnostics still would not look reasonable, for a variety of reasons that I won't explain in this email.

If anyone is really interested, they can look at the two PDFs linked in the issues card below. I'm particularly fond of the first PDF because the author is very honest about the reality, which is that all you can do is "be paranoid" yet accept "maybe" as the answer about whether convergence has occurred. But none of us should go into the level of detail he does in trying to ensure convergence; that is not appropriate for the types of problems we work on. This is no different from being unsure whether there are other solutions in conventional parameter estimation. The key is to make reasonable efforts to explore parameter space, check that this specific case has converged reasonably for the computational resources and time we have available, then report what our best answer is so far as a society, and then move on to the next research problem.






Aditya Ashi Savara

Nov 13, 2023, 1:03:32 PM
to Chris, PEUQSE-users
That reminds me: there is a feature we wanted to add to PEUQSE that we have not gotten around to yet. I don't know if it's in that unmerged branch or not. I wanted a feature that would show the *shape* of the posterior across time, so that we could see whether a new mode was coming in, because you'd literally see a new peak growing into your posterior. That would let you know if you should run for longer. The same applies if the peak shape were starting to change. Actually, that's also similar to the idea at the bottom of the issues card from my last email, because looking at the "evidence" is similar to comparing the shape of the posterior to that of the prior. But I really wanted to compare the posterior to itself over time, and I still do want that feature in PEUQSE, even though I may not have made an issues card for it.
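
To make the idea concrete, here is a rough NumPy sketch of comparing the posterior to itself over time for one parameter. This is not an existing PEUQSE feature: `windowed_histograms` and `shape_change` are hypothetical helper names, and the sketch assumes the samples are stored in the order they were drawn, which may not hold exactly for cumulative ensemble samples.

```python
import numpy as np

def windowed_histograms(samples, num_windows=4, num_bins=30):
    """Histogram a 1D chain in consecutive time windows on shared bin edges."""
    samples = np.asarray(samples, dtype=float)
    edges = np.linspace(samples.min(), samples.max(), num_bins + 1)
    windows = np.array_split(samples, num_windows)
    hists = np.array(
        [np.histogram(w, bins=edges, density=True)[0] for w in windows]
    )
    return edges, hists

def shape_change(edges, hists):
    """Total-variation distance between each pair of consecutive windows.

    0 means identical histograms; values near 1 mean almost no overlap.
    """
    bin_width = edges[1] - edges[0]
    return 0.5 * np.abs(np.diff(hists, axis=0)).sum(axis=1) * bin_width

if __name__ == "__main__":
    # Synthetic example: a second mode near 6 appears halfway through.
    rng = np.random.default_rng(1)
    early = rng.normal(0.0, 1.0, size=4000)
    late = np.where(
        rng.random(4000) < 0.5,
        rng.normal(0.0, 1.0, size=4000),
        rng.normal(6.0, 1.0, size=4000),
    )
    edges, hists = windowed_histograms(np.concatenate([early, late]))
    print(shape_change(edges, hists))
```

With real data one could pass a single column of the samples, for example `pe_object.post_burn_in_samples[:, 0]`, and plot the rows of `hists` on top of each other. A jump in `shape_change` between windows would be a hint that a new peak is growing in or that the peak shape is still changing.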

Chris

Nov 13, 2023, 1:33:06 PM
to PEUQSE-users
That makes sense. As always, thank you for the patience and the detailed explanation. I think the convergence statistics for the full run will generally satisfy our audience for the paper.

Regards,
Chris B.