As background for anyone reading this: I am assuming that you have looked at the convergence diagnostics section in the document below, which I should reference more clearly from the manual.
As noted, convergence is in any case something to be a bit wary of, because it doesn't tell us whether there is some other solution that has not been found.
Also as noted, for the convergence diagnostics the Geweke plot should go to zero, and the ACT plot should first be sloped and then flat.
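For anyone who wants to see what "the Geweke plot should go to zero" means in practice, here is a minimal sketch. It is not the implementation used in our software: the function name `geweke_z`, the 10% / 50% split, and the synthetic chains are all assumptions made for illustration, and the variance estimate is the naive one that ignores autocorrelation (a fuller Geweke diagnostic uses spectral density estimates instead).

```python
import numpy as np

def geweke_z(chain, first=0.1, last=0.5):
    """Simplified Geweke z-score: compare the mean of the early part of a
    chain with the mean of the late part. Hypothetical helper, using a
    naive variance estimate that ignores autocorrelation."""
    chain = np.asarray(chain, dtype=float)
    n = chain.size
    a = chain[: int(first * n)]        # early segment
    b = chain[n - int(last * n):]      # late segment
    var_a = a.var(ddof=1) / a.size     # variance of the early-segment mean
    var_b = b.var(ddof=1) / b.size     # variance of the late-segment mean
    return (a.mean() - b.mean()) / np.sqrt(var_a + var_b)

# Synthetic chains, for illustration only.
rng = np.random.default_rng(0)
stationary = rng.normal(size=5000)                       # no drift
drifting = stationary + np.linspace(3.0, 0.0, 5000)      # still moving toward its final region

z_stationary = geweke_z(stationary)
z_drifting = geweke_z(drifting)
print(f"stationary chain z = {z_stationary:.2f}")
print(f"drifting chain   z = {z_drifting:.2f}")
```

A z-score near zero says the early and late parts of the chain agree; a large one says the chain was still moving. As above, that only says the chain is consistent with itself, not that no other solution exists.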
What if we are applying these to an ensemble of walkers, and not a single chain?
Let's consider what ESS and EJS (aka AIES) do. The walkers in those ensembles communicate and move or exchange positions; they don't ignore each other. Both of those algorithms are expected to ultimately have all walkers in the HPD.
Actually, even if they were all independent walkers, all or most walkers would end up inside the HPD.
So, at least qualitatively, we'd expect the aggregate chain from all walkers to have the same behavior.
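To make the "sloped, then flat" ACT expectation concrete for an ensemble, here is a rough sketch. It takes ACT to mean integrated autocorrelation time, which is my assumption here. It is only one possible way to do this, and I am not claiming it is how your diagnostics were computed: the helper name `ensemble_act`, the `(n_steps, n_walkers)` layout, the choice to average the autocorrelation function over walkers, the cutoff at the first negative value, and the synthetic AR(1) walkers are all assumptions for illustration.

```python
import numpy as np

def ensemble_act(chain):
    """Rough integrated autocorrelation time for one parameter of an
    ensemble chain with shape (n_steps, n_walkers). Hypothetical helper."""
    chain = np.asarray(chain, dtype=float)
    n_steps, n_walkers = chain.shape
    centered = chain - chain.mean(axis=0)
    # Autocovariance per walker via FFT, zero-padded to avoid wrap-around.
    f = np.fft.rfft(centered, n=2 * n_steps, axis=0)
    acov = np.fft.irfft(f * np.conjugate(f), n=2 * n_steps, axis=0)[:n_steps]
    # Average over walkers, then normalize so that rho[0] == 1.
    rho = acov.mean(axis=1)
    rho = rho / rho[0]
    # Crude truncation: stop summing at the first negative autocorrelation.
    negative = np.flatnonzero(rho < 0)
    cutoff = negative[0] if negative.size else n_steps
    return 1.0 + 2.0 * rho[1:cutoff].sum()

# Synthetic ensemble, for illustration only: independent AR(1) walkers.
rng = np.random.default_rng(1)
n_steps, n_walkers, phi = 5000, 8, 0.9
chain = np.empty((n_steps, n_walkers))
chain[0] = rng.normal(size=n_walkers) / np.sqrt(1 - phi**2)
for t in range(1, n_steps):
    chain[t] = phi * chain[t - 1] + rng.normal(size=n_walkers)

# Re-estimate the ACT on growing prefixes of the run.
lengths = [100, 250, 500, 1000, 2500, 5000]
act_estimates = [ensemble_act(chain[:n]) for n in lengths]
for n, tau in zip(lengths, act_estimates):
    print(f"{n:5d} steps: ACT estimate = {tau:.1f}")
```

Plotting the estimate against the number of steps gives the kind of curve we look at: it changes while the run is too short to pin the ACT down, and levels off once it is long enough. For this synthetic AR(1) case the theoretical value is (1 + phi) / (1 - phi) = 19.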
Since convergence is in any case a "nothing is certain" criterion, and since I happen to know your posteriors looked great, I think it is fine.
We see the correct behavior in your convergence graphs. I have previously seen a downward slope like the one in your ACT graph be followed by another mode coming in, but I've also seen it happen without another mode coming in, and there is always a chance of another mode coming in.
I would just disclose in the paper that the convergence diagnostics were run on the ensemble's full chain, and not worry further.
My personal view goes something like this: we do what is computationally reasonable, we look for reasonable results, we check that the convergence diagnostics look reasonable, and we report if they do.
Some problems can be run for a year and the convergence diagnostics still would not look reasonable, for a variety of reasons that I won't explain in this email.
If anyone is really interested, they can look at the two PDFs linked in the issues card below. I'm particularly fond of the first PDF linked because the author is very honest about the reality, which is that all you can do is "be paranoid" yet accept "maybe" as the answer about whether convergence has occurred or not. But none of us should go into the level of detail he does in trying to ensure convergence; that's not appropriate for the types of problems we work on. This is no different from being unsure whether there are other solutions with conventional parameter estimation. The key is to take reasonable efforts to explore parameter space, check that this specific case has converged reasonably for the computational resources and time we have available, then report what our best answer is so far as a society, and then move to the next research problem.