I'm wondering if there might really be two issues here: (1) how are the conditional estimates for sites determined when visit information is missing for one or more years, and (2) can one construct estimates at all when the visit information is missing?
The answer to the second question is a definite "yes", since it is just a matter of making predictions from the estimated coefficients and the available data. One can certainly make predictions for sites not visited at all, so I'm not seeing a problem with such predictions. And getting an estimate for a year with no visits doesn't mean that the site/year combination contributed any information to the estimation.
If a site has all NAs in a single-year dataset fit with occu, then, as you note, that site is not used in estimating the coefficients. But one can still make a prediction for that site. With colext, only the years with visits for a site are used to construct the estimates.
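To make the single-year case concrete, here is a rough Python sketch of the usual conditional (empirical Bayes) occupancy calculation for one site. It is not unmarked's code: the function name, the use of None for a missing visit, and the numbers are all made up for illustration, and psi and p are treated as already-estimated values.

```python
from math import prod

def conditional_occupancy(psi, p, y):
    """Pr(site occupied | detection history) for one site, single season.

    psi : estimated occupancy probability for the site
    p   : estimated detection probability for each visit
    y   : detection history (1, 0, or None for a missing visit)
    """
    # Keep only the visits that actually happened.
    visits = [(pj, yj) for pj, yj in zip(p, y) if yj is not None]

    # Any detection means the site is known to be occupied.
    if any(yj == 1 for _, yj in visits):
        return 1.0

    # Probability of no detections over the observed visits, given occupancy.
    # With no observed visits this is an empty product, which equals 1.
    q = prod(1.0 - pj for pj, _ in visits)

    return psi * q / (1.0 - psi + psi * q)

# Hypothetical values: psi = 0.6, p = 0.5 on each of three visits.
three_visits = conditional_occupancy(0.6, [0.5, 0.5, 0.5], [0, 0, 0])
one_missing = conditional_occupancy(0.6, [0.5, 0.5, 0.5], [0, None, 0])
all_missing = conditional_occupancy(0.6, [0.5, 0.5, 0.5], [None, None, None])
```

When every visit is missing, the data drop out and the conditional estimate is just psi, the prediction from the coefficients. Each visit with no detection pulls the estimate further below psi.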
But maybe I've completely missed the point.
If I have time tomorrow, I'll attach an explicit example of how the conditional estimates are calculated.
Jim