Hi Ben,
Thanks again for such a quick turnaround. It is so great to have the model running now without errors! Apologies for the confusion over the -Inf values; I just needed to convince myself that some of the problems I was having weren't due to the exponents. Also, thank you for clarifying that additional dimensions are allowed beyond the stated requirement for dDHMMo.
I ran the model in Nimble using both the marginalized and unmarginalized (discrete) versions of the code you updated for unequal time intervals. Interestingly, the discrete version gets most of the posteriors close, with the exception of sdS and mean.r, but the posteriors for the marginalized version are wonky (they don't match the simulated values). Yet when I run the model in JAGS with the same simulated values (the discrete version as outlined in BPA Ch. 9.5, where y[i,t] ~ dcat(po[z[i,t], i, t-1, ])), the posteriors are close for all params. So it seems the model is still not translating well going from JAGS to Nimble? Here's the output for JAGS and Nimble (red vertical dashed lines on the density plots indicate the simulated values):
JAGS (discrete, unequal time intervals):
2 chains, each with 7000 iterations (first 500 discarded), n.thin = 3
n.sims = 4332 iterations saved
          mu.vect sd.vect     2.5%      25%      50%      75%    97.5%   Rhat n.eff
ann.s[1]    0.531   0.058    0.417    0.493    0.532    0.570    0.646  1.001  4300
ann.s[2]    0.499   0.045    0.413    0.468    0.499    0.529    0.588  1.002  1500
mean.R      0.381   0.020    0.343    0.367    0.381    0.394    0.420  1.005   330
mean.RR     0.947   0.009    0.927    0.941    0.947    0.953    0.963  1.001  4300
mean.p      0.275   0.015    0.246    0.264    0.275    0.285    0.307  1.005   350
mean.r      0.611   0.046    0.520    0.579    0.612    0.643    0.698  1.001  4300
mean.s      0.910   0.025    0.852    0.896    0.913    0.928    0.949  1.007   360
s[1]        0.912   0.051    0.788    0.884    0.923    0.950    0.981  1.001  2100
s[2]        0.959   0.027    0.890    0.945    0.965    0.979    0.993  1.001  4300
s[3]        0.893   0.039    0.806    0.868    0.897    0.922    0.956  1.002   960
s[4]        0.952   0.030    0.878    0.937    0.958    0.974    0.991  1.009   300
s[5]        0.678   0.054    0.570    0.642    0.678    0.715    0.781  1.001  2300
s[6]        0.922   0.031    0.851    0.904    0.926    0.945    0.971  1.001  4300
s[7]        0.925   0.027    0.863    0.909    0.929    0.945    0.969  1.006   460
s[8]        0.909   0.029    0.844    0.891    0.913    0.930    0.958  1.001  4300
s[9]        0.943   0.034    0.862    0.923    0.949    0.969    0.989  1.003   860
s[10]       0.646   0.047    0.555    0.614    0.645    0.678    0.736  1.003   810
s[11]       0.940   0.026    0.879    0.924    0.944    0.959    0.980  1.006   360
s[12]       0.905   0.031    0.839    0.886    0.907    0.928    0.956  1.003   780
s[13]       0.905   0.031    0.836    0.885    0.909    0.927    0.954  1.001  4300
s[14]       0.919   0.072    0.725    0.894    0.940    0.968    0.990  1.056   450
sdS         0.542   0.806    0.000    0.054    0.248    0.687    2.739  1.001  4300
deviance 2422.954  21.151 2383.323 2408.372 2421.992 2436.450 2466.201  1.031    56



Gray circles/purple shading = JAGS estimated values
Black circles = values used to simulate data
Nimble (discrete, unequal time intervals):
   user  system elapsed
 852.58    5.12  881.03
           mean     sd   2.5%    50%  97.5%  Rhat  n.eff
ann.s[1]  0.551  0.057  0.440  0.552  0.660  1.04   2123
ann.s[2]  0.510  0.043  0.427  0.510  0.599  1.01   2361
mean.R    0.364  0.018  0.328  0.364  0.399  1.04   2129
mean.RR   0.947  0.010  0.926  0.947  0.964  1.00   9975
mean.p    0.265  0.015  0.236  0.265  0.295  1.01   1968
mean.r    0.891  0.015  0.861  0.892  0.918  1.06   1924
mean.s    0.947  0.023  0.889  0.950  0.983  1.03     58
s[1]      0.941  0.046  0.823  0.953  0.995  1.02   1028
s[2]      0.983  0.018  0.932  0.989  1.000  1.01   1314
s[3]      0.921  0.033  0.843  0.925  0.972  1.01   1632
s[4]      0.991  0.009  0.966  0.993  1.000  1.00   1525
s[5]      0.618  0.051  0.515  0.620  0.715  1.02   2344
s[6]      0.940  0.027  0.876  0.944  0.981  1.00   1800
s[7]      0.944  0.023  0.892  0.946  0.981  1.00   1999
s[8]      0.935  0.023  0.882  0.937  0.971  1.00   1916
s[9]      0.993  0.007  0.974  0.995  1.000  1.00   1635
s[10]     0.587  0.043  0.501  0.586  0.669  1.03   1692
s[11]     0.966  0.018  0.921  0.969  0.991  1.00   1651
s[12]     0.937  0.023  0.884  0.939  0.973  1.01   1525
s[13]     0.938  0.021  0.891  0.940  0.972  1.01   1228
s[14]     0.994  0.006  0.978  0.996  1.000  1.00   1571
sdS       1.754  0.480  1.056  1.670  2.930  1.00    608



Gray circles/purple shading = Nimble estimated values
Black circles = values used to simulate data
Nimble (marginalized, unequal time intervals):
   user  system elapsed
1011.45    2.65 1036.67
           mean     sd   2.5%    50%  97.5%  Rhat  n.eff
ann.s[1]  0.955  0.020  0.915  0.956  0.994  1.00   2595
ann.s[2]  0.879  0.026  0.826  0.879  0.927  1.00   2860
mean.R    0.096  0.006  0.085  0.096  0.107  1.00   2231
mean.RR   0.947  0.009  0.927  0.947  0.964  1.00   9506
mean.p    0.173  0.007  0.161  0.173  0.187  1.02   2331
mean.r    0.989  0.009  0.966  0.992  1.000  1.01   1157
mean.s    0.984  0.007  0.968  0.985  0.995  1.09     90
s[1]      0.995  0.004  0.985  0.995  0.999  1.00   1925
s[2]      0.997  0.003  0.989  0.998  1.000  1.00   1429
s[3]      0.983  0.007  0.967  0.984  0.994  1.00   2277
s[4]      0.997  0.003  0.989  0.997  1.000  1.01   1472
s[5]      0.931  0.015  0.899  0.932  0.958  1.00   2239
s[6]      0.985  0.007  0.968  0.986  0.995  1.00   2394
s[7]      0.982  0.008  0.964  0.983  0.993  1.00   2553
s[8]      0.974  0.009  0.953  0.975  0.989  1.00   2301
s[9]      0.996  0.003  0.988  0.997  1.000  1.00   1609
s[10]     0.886  0.020  0.844  0.887  0.922  1.00   2259
s[11]     0.986  0.007  0.969  0.987  0.997  1.00   2329
s[12]     0.970  0.011  0.945  0.971  0.987  1.00   2021
s[13]     0.965  0.012  0.938  0.966  0.985  1.00   2342
s[14]     0.996  0.004  0.985  0.997  1.000  1.00   1759
sdS       1.501  0.413  0.878  1.442  2.425  1.02    608



Gray circles/purple shading = Nimble estimated values
Black circles = values used to simulate data
It seems like we are really close... maybe just missing something simple? As you can see in the discrete Nimble version, mean.r is being estimated really high for some reason, as is sdS (I'm using a gamma prior on it, but the posterior doesn't reflect that distribution). Then things go really haywire when we try to use dDHMMo with the marginalized version. I should also say that I tried running the marginalized version without a random effect of time on survival, and that didn't seem to improve things. I also tried indexing on individual with both the discrete and marginalized versions, and that didn't change things either.
I hope this helps to make the problem clearer... ahh, it's difficult to do this over email! :) It would be really great to get the marginalization going, as it should theoretically save a lot of computational time when it comes to running the actual data. As the code is currently written, the run time is longest for the marginalized Nimble version and shortest for the JAGS run (though to be fair, I had to really bump up the iterations in Nimble to reach convergence).
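For reference, here is a minimal sketch of what the marginalization is meant to compute, written in Python only to keep it self-contained. It is not the attached model: it assumes a simplified two-state (alive/dead) model with a single detection probability, and it assumes that survival over an interval is the per-unit-time survival raised to the interval length. All function names are hypothetical. The point is only that the marginalized (forward-algorithm) likelihood should equal the sum over every latent state path, which is a check that could in principle be repeated on one capture history from the real model.

```python
from itertools import product

def interval_survival(phi_unit, dt):
    # Assumption: survival over an interval of length dt is phi_unit ** dt.
    return phi_unit ** dt

def transition(phi):
    # Rows: from-state (0 = alive, 1 = dead); columns: to-state.
    return [[phi, 1.0 - phi], [0.0, 1.0]]

def emission(p):
    # Rows: latent state; columns: observation (0 = not seen, 1 = seen).
    return [[1.0 - p, p], [1.0, 0.0]]

def forward_likelihood(y, phi_unit, p, dts):
    """Marginalized likelihood of y[1:], conditional on being alive at y[0]."""
    E = emission(p)
    alpha = [1.0, 0.0]  # known alive at first capture
    for t in range(1, len(y)):
        T = transition(interval_survival(phi_unit, dts[t - 1]))
        alpha = [
            sum(alpha[i] * T[i][j] for i in range(2)) * E[j][y[t]]
            for j in range(2)
        ]
    return sum(alpha)

def brute_force_likelihood(y, phi_unit, p, dts):
    """Same quantity, summing explicitly over every latent state path."""
    E = emission(p)
    total = 0.0
    for path in product(range(2), repeat=len(y) - 1):
        prob, prev = 1.0, 0  # start in the alive state
        for t, z in enumerate(path, start=1):
            T = transition(interval_survival(phi_unit, dts[t - 1]))
            prob *= T[prev][z] * E[z][y[t]]
            prev = z
        total += prob
    return total

# Made-up capture history and interval lengths, for illustration only.
y = [1, 0, 1, 0, 0]
dts = [1.0, 0.5, 2.0, 1.0]
lik_fwd = forward_likelihood(y, 0.9, 0.3, dts)
lik_bf = brute_force_likelihood(y, 0.9, 0.3, dts)
print(lik_fwd, lik_bf)
```

If the two numbers disagree for a given history, the transition or observation arrays are probably indexed differently in the two versions.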
The version of code I used to generate the results above is attached. The JAGS code I sent earlier remains unchanged.
Erica