Hi Sara,
Thanks for the note and great question! I agree that this is a situation where that error message is a bit confusing. Before stPGOcc() runs the model, it will create a new data object that gets rid of all site/year/visit combinations in "data$y" for which there is an NA value. Because the situations where you have NA values in the site-level covariates lines up with situations where you don't have observed detection-nondetection data, the values you put in there won't influence the actual model fitting process (i.e., the parameter estimates won't change). However, for all multi-season models in spOccupancy, the model will predict and return occurrence probability (psi) and latent occurrence (z) for all combinations of site and year, regardless of if there was any data used to fit the model. So, in the "psi.samples" and "z.samples" components of the resulting object from "stPGOcc()", there will be predictions for the unsampled site/year combinations, and those inherently depend on the covariate values that you supply to "occ.covs". If you aren't super interested in the "psi.samples" or "z.samples" at the non-sampled site/year combinations, then you can just replace the NA values in "occ.covs" with some non-NA value and then just make sure you don't interpret those site/year combinations in any subsequent interpretation of those when you're summarizing the results from the model.
Now in a situation where you did want to get the estimates at non-sampled site/year combinations (or in a situation where you had missing covariate values at site/years that were actually surveyed), it does become relevant how you impute the missing value into the model. My perspective on this is that there isn't one way that works best in all situations and rather it depends on the specific covariate that you're working with. I would just say you should do the approach that you think will most accurately represent the covariate at those locations (and of course acknowledging that you did impute the covariate value if you end up using predictions at that site/year combination). For example, if you have a site/year level covariate with missing values in some years, it might be best to use the mean across that site (row mean) if there is very little change in the covariate over the years and you expect there to be substantial correlation in the values at the site across different years. Alternatively, if you had some covariate like "time of day" as a survey-level covariate on detection probability and it wasn't always recorded, it could be better to use the overall mean value if the time the survey takes place is fairly random and not tied to the specific site. You should calculate the means using the values that you supply in "occ.covs" (i.e., in your case use the means on the actual scale of the covariate since you're standardizing in the formula).
Let me know if I can clarify any of that!
Cheers,
Jeff