Thanks for posting this example. Here are some suggestions.
Try using "calculate=FALSE" in the call to nimbleModel. For such a large model, the uncompiled calculation can be a substantial portion of the time. When you get to configureMCMC, try using "useConjugacy=FALSE" to make that faster.
PSI[ 1 , 2 ,i,j]<- phi[1,i,j] * ( SW[j] * (psiSW1[1]/(sum(psiSW1[1:8]) )) + WS[j] * (psiWS1[1]/(sum(psiWS1[1:8]) )) )
PSI[ from_state , to_state ,i,j]<- phi[from_state,i,j] * ( SW[to_state] * (psiSW[from_state, to_state]/(sum_psiSW[from_state])) + WS[to_state] * (psiWS[from_state, to_state]/(sum_psiWS[from_state])))
(which would be enclosed an additional for-loops for from_state and to_state)
with the following in an appropriate for-loop
sum_psiSW[from_state] <- sum(psiSW[from_state, 1:8])
sum_psiWS[from_state] <- sum(psiWS[from_state, 1:8])
The reason for creating the sum_psiXX variables is to reduce redundant calculation and simplify the model graph. The reason to write fewer lines of model code with more indexing is that every line of model code creates overhead.
I would remove the E0 declarations and provide E0 in the constants list.
A big change you could try would be to use the dDHMM (dynamic hidden Markov model) in nimbleEcology, reducing the need for the many individual latent states and sampling of dcat variables, which is typically not very efficient.
Similarly I would change this
log(psiSW1[s]) <- betaSW1[s] # from state 1 in summer -> winter
betaSW1[s] ~ dnorm(0,1)
to
log(psiSW[k, s]) <- betaSW1[k, s] # from state 1 in summer -> winter
betaSW1[k, s] ~ dnorm(0,1)
to avoid embedding indices in variable names and have many fewer declarations.
This is a small thing, but you don't need the taus. Instead you can use the sigmas, e.g.
cov[i,1] ~ dnorm(mu1[i], sd = sig1x)
If this still doesn't go well, there are some more in-depth things you could try. For example, if you don't want to use the dDHMM because you want to sample the individual latent states, you could skip over all of the PSI and E arrays and write a custom distribution that uses the ingredients directly. It would look something like:
alive[i,j] ~ dMyCRmodel(phi[ alive[i, j-1], i, j], psiSW[1:9, 1:8], psiWS[1:9, 1:8] ). # I may not have that just right, but the idea is to pass the ingredients as arguments / parent nodes
which internally could use the elements needed without ever forming the large arrays. (The alive[i,j] should still be sampled by a categorical sampler, but check the configureMCMC output to be sure.) Some of the slow performance in JAGS or NIMBLE is likely due to the large model graph from the large arrays. A benefit of NIMBLE's extensibility (you can write new functions and distributions) is to avoid that kind of inefficient (but common) use of the model language.
Separately, or in addition, one could write a sampler that makes use of latent state structure of the model.
I hope some of that helps. Let us know how it goes.
Perry