Hello again, and thank you for the useful HDDM toolbox, which I just recently discovered.
My current problem is more conceptual. It relates to the assumed distribution of z_j (inverse logit of a Normal), which may be problematic (even with non-informative priors) when comparing biased-drift and biased-starting-point models, as I do.
This is because even with the non-informative prior on z_j, σ_z cannot take values much larger than 0.7. So if, for example, μ_z=0, then assuming for simplicity that all subjects have the same threshold --> z_j would always have a unimodal distribution (see attached figure, z=θ/IC).
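
To make the unimodality point concrete, here is a minimal sketch in plain Python (not HDDM code) that counts the modes of invlogit(Normal(μ_z, σ_z)) on a grid. The helper names logit_normal_pdf and count_modes are hypothetical, and the grid search is only a numerical check. For μ_z=0 the logit-normal density should become bimodal only when σ_z² > 2 (σ_z above roughly 1.41), so values around 0.7 stay unimodal.

```python
import math


def logit_normal_pdf(x, mu, sigma):
    """Density of invlogit(N(mu, sigma)) at a point x in (0, 1)."""
    logit_x = math.log(x / (1.0 - x))
    norm = sigma * math.sqrt(2.0 * math.pi) * x * (1.0 - x)
    return math.exp(-((logit_x - mu) ** 2) / (2.0 * sigma ** 2)) / norm


def count_modes(mu, sigma, n_grid=2001):
    """Count strict interior local maxima of the density on a grid over (0, 1)."""
    xs = [0.001 + 0.998 * i / (n_grid - 1) for i in range(n_grid)]
    pdf = [logit_normal_pdf(x, mu, sigma) for x in xs]
    return sum(
        1
        for i in range(1, n_grid - 1)
        if pdf[i] > pdf[i - 1] and pdf[i] > pdf[i + 1]
    )


if __name__ == "__main__":
    # sigma = 0.7 is roughly the upper end discussed above; 2.0 is well beyond it.
    for sigma in (0.7, 2.0):
        print("sigma_z =", sigma, "-> modes:", count_modes(0.0, sigma))
```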

To see why this is problematic, consider the distribution of p_bias over the sample (the number of trials in which alternative A was chosen divided by the total number of trials, for each participant) for the simple case of equal thresholds and no trial-to-trial variability. You can show analytically that for the 'biased z' model, p_bias will be distributed the same as z_j. For equal thresholds and no trial-to-trial variability, this is:

So if z_j is expected to have a unimodal distribution (due to the priors), then p_bias is also expected to have a unimodal distribution. If the observed p_bias distribution is then uniform or bimodal, neither informative nor non-informative priors can account for the data well.
This is not the case for the 'biased v' model, where the distribution of p_bias can be either unimodal or bimodal with a Normal distribution of v_j and σ_v within the range of both informative and non-informative priors (see the figure below, where observed = original data and expected is given by PPC data, based on each of the two models).
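
As a rough illustration of why the two models differ, the sketch below uses the standard absorption probability of a Wiener process between two boundaries. It assumes the noise scale is fixed at 1, no trial-to-trial variability, a single drift per subject (no stimulus-dependent drift), and z expressed as a fraction of the boundary separation a. The function name p_upper is hypothetical and this is not HDDM code.

```python
import math


def p_upper(v, a, z):
    """Probability of hitting the upper boundary first.

    v: drift rate, a: boundary separation, z: relative starting point in (0, 1).
    Assumes unit diffusion noise and no trial-to-trial variability.
    """
    if abs(v) < 1e-12:
        # Zero-drift limit: the choice probability equals the starting point.
        return z
    # (1 - exp(-2*v*z*a)) / (1 - exp(-2*v*a)), written with expm1 for stability.
    return math.expm1(-2.0 * v * z * a) / math.expm1(-2.0 * v * a)


def invlogit(x):
    return 1.0 / (1.0 + math.exp(-x))


if __name__ == "__main__":
    a = 2.0
    # Biased z, zero drift: p_bias simply mirrors z_j.
    print(p_upper(0.0, a, 0.3))
    # Biased v, unbiased start: p_bias equals invlogit(a * v_j).
    print(p_upper(1.0, a, 0.5), invlogit(a * 1.0))
```

Under these assumptions, a Normal v_j with z fixed at 0.5 gives p_bias = invlogit(a*v_j), which is logit-normal with scale a*σ_v. For μ_v=0 that scale exceeds sqrt(2), and the distribution turns bimodal, once σ_v > sqrt(2)/a, whereas the biased z model is capped by the range of σ_z itself.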

In principle, I could just modify the code so that σ_z can take larger values, allowing the distribution of expected p_bias to become bimodal. However, I am not sure whether the historical reason for using invlogit(Normal(μ_z,σ_z)) is mainly the 0<z<1 transform (so that σ_z is limited only because we have no reason to assume that z_j is extremely wide or bimodal), or whether it carries some additional assumption that could still allow the use of a bimodal z_j distribution.
Any thoughts on this matter would be highly appreciated, and I would be very happy to discuss it further.
Best,
Lior