Dear all,
I am currently fitting a two-year, single-species dynamic occupancy model (DOM) using unmarked. The data come from the SABAP2 citizen science project, in which different sites are surveyed a different number of times each year. The examples I have seen all use balanced datasets with the same sites surveyed each year; however, that is not the case for my data.
My situation is as follows: I have 3935 sites surveyed in 2010 and 3636 sites surveyed in 2011. Over these two years there are 5492 unique sites surveyed, and 2079 of these sites were surveyed in BOTH 2010 and 2011. The number of surveys per season is capped at 50 for each site.
This leads me to ask questions regarding how to format my data and some more technical questions about how the package treats sites that were surveyed in one year and not another.
I was hoping to set up my data in terms of the 5492 unique sites surveyed over these two years. This means that my detection history, siteCovs, obsCovs and yearlySiteCovs are all objects with 5492 rows, one for each unique site.
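To make the layout concrete, here is a minimal sketch in R with toy dimensions (6 unique sites and at most 3 surveys per year, rather than 5492 and 50). The survey pattern, the covariate name elev and the simulated detections are all made up for illustration. The one assumption it illustrates is that a site not surveyed in a given year gets NA in every survey column for that year. Whether the model treats those all-NA periods sensibly is exactly what I am asking about.

```r
# Toy dimensions (hypothetical): the real data would use 5492 sites,
# 2 primary periods (years) and up to 50 surveys per site per year.
set.seed(1)
n_sites <- 6
n_years <- 2
max_surveys <- 3

# Hypothetical survey pattern: sites 1-4 surveyed in year 1,
# sites 3-6 in year 2, so sites 3 and 4 are surveyed in both years.
surveyed <- cbind(year1 = c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE),
                  year2 = c(FALSE, FALSE, TRUE, TRUE, TRUE, TRUE))

# Same bookkeeping as 3935 + 3636 - 2079 = 5492 in the real data.
n_both <- sum(surveyed[, 1] & surveyed[, 2])
stopifnot(sum(surveyed[, 1]) + sum(surveyed[, 2]) - n_both == n_sites)

# Detection history: one row per unique site; columns are the
# year-1 surveys followed by the year-2 surveys (simulated 0/1 here).
y <- matrix(rbinom(n_sites * n_years * max_surveys, 1, 0.4),
            nrow = n_sites, ncol = n_years * max_surveys)

# Set a whole primary period to NA for sites not surveyed that year.
for (yr in seq_len(n_years)) {
  cols <- ((yr - 1) * max_surveys + 1):(yr * max_surveys)
  y[!surveyed[, yr], cols] <- NA
}

# Site covariates: one row per unique site (elev is a made-up covariate).
site_covs <- data.frame(elev = rnorm(n_sites))

# Build the data object only if unmarked is installed.
if (requireNamespace("unmarked", quietly = TRUE)) {
  umf <- unmarked::unmarkedMultFrame(y = y, siteCovs = site_covs,
                                     numPrimary = n_years)
  # The fit I have in mind would then be along the lines of:
  # unmarked::colext(psiformula = ~ elev, gammaformula = ~ 1,
  #                  epsilonformula = ~ 1, pformula = ~ 1, data = umf)
}
```

In this toy version, sites 5 and 6 play the role of the sites with no 2010 data, and sites 1 and 2 the role of those with no 2011 data.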
However, since only 3935 sites were surveyed in 2010, I am concerned about how initial occupancy (psiformula) is estimated for the sites that were not surveyed in 2010. I cannot give siteCovs and the other arguments different numbers of rows, otherwise I get an error message about mismatched dimensions.
For the first primary period, does the model estimate the initial-occupancy coefficients using only the sites that were surveyed in that period, and then project the fitted relationship onto the sites that were not surveyed in year 1? Similarly, for sites with no data in year 2, I would imagine the model uses the sites that do have year 2 data to estimate colonisation and extinction, and then projects these onto the unsurveyed sites in order to estimate their occupancy. Or does it estimate these parameters using only the subset of sites surveyed in both years?
Should I instead restrict my data to the sites surveyed in every season of the study (2079), or to the sites surveyed in the first year of the study (3935)?
I hope my question makes sense!
Thanks
James