Thanks for posting to the list.
In general it is hard to predict what will be feasible or how much computational effort it will take. I think you can do it, but it's hard to say for sure. Here are some thoughts on this problem.
If it is size-dependent survival with imperfect detection, would dCJS be sufficient? That would compute faster than an HMM. The survival and/or detection probabilities can be filled with values calculated from the size-dependence and year random effects.
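For concreteness, here is a minimal sketch of that kind of model using dCJS_vv from nimbleEcology. All names (y, size, first, nYears, etc.) are hypothetical placeholders, so adapt to your setup:

library(nimble)
library(nimbleEcology)

code <- nimbleCode({
  sigma ~ dunif(0, 5)
  beta0 ~ dnorm(0, sd = 2)     # survival intercept
  beta1 ~ dnorm(0, sd = 2)     # size effect on survival
  alpha0 ~ dnorm(0, sd = 2)    # detection intercept
  for (t in 1:(nYears - 1)) {
    eps[t] ~ dnorm(0, sd = sigma)   # year random effect on survival
  }
  for (i in 1:nInd) {   # assumes first[i] < nYears for every fish
    for (t in first[i]:(nYears - 1)) {
      logit(phi[i, t]) <- beta0 + beta1 * size[i, t] + eps[t]
    }
    for (t in first[i]:nYears) {
      logit(p[i, t]) <- alpha0
    }
    # marginalized CJS likelihood over fish i's capture history
    y[i, first[i]:nYears] ~ dCJS_vv(
      probSurvive = phi[i, first[i]:(nYears - 1)],
      probCapture = p[i, first[i]:nYears],
      len = nYears - first[i] + 1)
  }
})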
From a purely computational perspective, the year random effects undercut the benefit of the marginalization partially but not entirely. That is because when a year random effect is updated by MCMC, the dependent calculations include the detection history of every fish (possibly) alive during that year, and each of those calculations covers the fish's entire capture history, not just that year. Marginalization still has the benefit of removing the discrete latent states after the last time a fish is known to be alive.
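If you want to see this dependency structure concretely, you can query a built model (continuing the hypothetical names from the sketch above):

# after model <- nimbleModel(code, constants = ..., data = ...):
model$getDependencies("eps[3]")
# lists phi[i, 3] for every fish i at risk in year 3, plus each such fish's
# entire dCJS node -- the whole capture history recomputes, not just year 3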
In the case of using individual latent states, a potentially very useful trick is to set the latent states for non-detection years that fall between detection years as *data*. The animal must have been alive between detections, and providing this information as data avoids the wasted computation of sampling those states (which can never change value) via MCMC.
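A small sketch of that data preparation, assuming a detection matrix y and latent alive states z (names hypothetical). NIMBLE treats NA entries in a data object as unknowns, so only the states after the last detection would get samplers:

# z is known to be 1 from first through last detection; NA elsewhere
zKnown <- matrix(NA, nrow = nInd, ncol = nYears)
for (i in 1:nInd) {
  det <- which(y[i, ] == 1)   # every fish has >= 1 detection in a CJS data set
  zKnown[i, det[1]:det[length(det)]] <- 1
}
# model <- nimbleModel(code, data = list(y = y, z = zKnown), ...)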
The idea of using a weighted likelihood for individuals with identical capture histories is a good one, but it would not work if the individuals have different sizes or lived in different years. If you can use it, the trick was reported in Turek et al. 2016 and was recently given as a worked example in the capture-recapture workshop materials put together by Olivier Gimenez (see "Class 8 live demo"); a rough sketch is also below.
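Here is an untested sketch of that weighting idea as a user-defined distribution wrapping dCJS_ss from nimbleEcology, where mult is the number of individuals sharing a history (names hypothetical, and note it only works with shared survival and detection probabilities):

dCJS_weighted <- nimbleFunction(
  run = function(x = double(1), probSurvive = double(0),
                 probCapture = double(0), mult = double(0),
                 len = integer(0, default = 0),
                 log = integer(0, default = 0)) {
    returnType(double(0))
    # one unique history's log-likelihood, multiplied by its frequency
    ll <- mult * dCJS_ss(x, probSurvive, probCapture, len, log = 1)
    if (log) return(ll)
    return(exp(ll))
  }
)
# In the model, each unique history h then appears only once:
#   y[h, 1:nYears] ~ dCJS_weighted(probSurvive = phi, probCapture = p,
#                                  mult = nCopies[h], len = nYears)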
Also, do you have river effects? Block sampling could be useful, perhaps for coefficients (do you have more covariates than size?), perhaps for sets of adjacent year effects.
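For example, with NIMBLE's MCMC configuration (parameter names hypothetical):

conf <- configureMCMC(model)
# jointly sample the regression coefficients
conf$removeSamplers(c("beta0", "beta1"))
conf$addSampler(target = c("beta0", "beta1"), type = "RW_block")
# and/or jointly sample blocks of adjacent year effects
conf$removeSamplers("eps")
conf$addSampler(target = "eps[1:5]", type = "RW_block")
conf$addSampler(target = "eps[6:10]", type = "RW_block")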
Sometimes it is effective to cut the matrices or arrays of transition or detection probabilities out of the model. For example, when using dCJS, dHMM (or dDHMM), or simply dcat, a common scheme is to use deterministic declarations to fill the entries of a large matrix or array (indexed by individual, time, and/or stage), rows or slices of which are then used in dCJS, dHMM, or dcat. That can create a very large number of nodes in a case like yours with 40K fish. An alternative is to write a customized version of dCJS (or one of the others) that takes as input the underlying parameters and/or covariates used to calculate the entries of the large matrix or array. Sometimes many of the values are 0s and there are very few actual inputs, so the dCJS (or other) steps can be written directly in terms of those inputs, and the large matrix or array never needs to be formed in the model or in the customized distribution. However, this approach could run into limitations for a large number of covariates.
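An untested sketch of that last idea, writing a dCJS variant that takes the covariates and parameters directly so the large phi and p matrices never exist as model nodes (names hypothetical):

dCJS_size <- nimbleFunction(
  run = function(x = double(1), size = double(1), eps = double(1),
                 beta0 = double(0), beta1 = double(0), p = double(0),
                 len = integer(0, default = 0),
                 log = integer(0, default = 0)) {
    returnType(double(0))
    # build the survival and detection vectors on the fly
    probSurvive <- numeric(len - 1)
    for (t in 1:(len - 1)) {
      probSurvive[t] <- expit(beta0 + beta1 * size[t] + eps[t])
    }
    probCapture <- numeric(len)
    for (t in 1:len) {
      probCapture[t] <- p
    }
    ll <- dCJS_vv(x, probSurvive, probCapture, len, log = 1)
    if (log) return(ll)
    return(exp(ll))
  }
)
# Used in the model as, e.g.:
#   y[i, first[i]:nYears] ~ dCJS_size(
#     size = size[i, first[i]:nYears], eps = eps[first[i]:(nYears - 1)],
#     beta0 = beta0, beta1 = beta1, p = p0, len = nYears - first[i] + 1)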
I hope that helps!