effective open multi-season model by stacking years (so site/years = sites)


L. Jay Roberts

Jun 3, 2014, 3:50:29 PM
to unma...@googlegroups.com
I have read Andy and others suggest, typically as a solution to problems associated with sparse data, including multiple seasons in a single-season-type model by treating each season-site combination as a separate site, and then including year as a site covariate. You end up with more "sites" and thus a larger sample size for fitting occupancy and detection covariate relationships. I've tried this with a couple of datasets and it seems worthwhile: the modeled associations with occupancy covariates like habitat and topography have been cleaner, and occupancy estimates are not very different from models using the more standard site x visit data structure.
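The reshaping Jay describes can be sketched generically, outside unmarked. This is a minimal illustration with made-up detection data (all names and dimensions here are hypothetical): a sites × years × visits array is stacked so each site-year becomes its own row, and year becomes an ordinary site-level covariate.

```python
import numpy as np

rng = np.random.default_rng(1)
n_sites, n_years, n_visits = 50, 3, 4

# Hypothetical detection histories: sites x years x visits (0/1 data)
y = rng.binomial(1, 0.3, size=(n_sites, n_years, n_visits))

# "Stack" the years: every site-year combination becomes its own row,
# giving n_sites * n_years pseudo-sites with n_visits columns each
y_stacked = y.reshape(n_sites * n_years, n_visits)

# ...and year becomes an ordinary site covariate on the stacked frame
year_cov = np.tile(np.arange(n_years), n_sites)
```

With NumPy's row-major reshape, row 0 is (site 1, year 1), row 1 is (site 1, year 2), and so on, which is why `year_cov` is tiled rather than repeated.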

I have a few general questions that I would appreciate opinions on:

1) Do we consider the use of each site multiple times across seasons (and thus the assumption that they are independent) pseudo-replication? And, as a result, is it inappropriate to calculate seasonal occupancy, and turnover/persistence/colonization/etc., from subsets of sites by summarizing site-specific estimates by season? We are not accounting for the fact that occupancy at sites is correlated across seasons, after all. I suspect this is true, but I'm not sure which way the bias might push the estimates. My guess is that it will bias occupancy estimates lower than a traditional multi-season model, since the multi-season model can inflate the occupancy estimate at a site during seasons in which the species was not detected but was detected in an adjacent season.

2) Though site occupancy might be subject to some pseudo-replication issues, the fitting of covariate relationships for both occupancy and detectability will benefit from more opportunities to evaluate the covariate sets, especially when high-quality habitat sites are consistently occupied across seasons and lower-quality habitat sites are only occupied in some seasons. In the latter case, the covariates at the "low quality" sites will be attributed to unoccupied in years when the species was not detected, which seems more appropriate than in the traditional multi-season model, where those covariates are more likely to be attributed to occupied.

3) Finally, can anyone point me to publications that have used this data structure? I haven't been able to find any. I recently received reviewer comments on a manuscript where I used this method, and they flagged the idea of stacking seasons as a violation of the closure assumption, which is interesting because it actually assumes a completely open population across seasons. So either I need to clarify the methods, or the reviewer is not intimately familiar with occupancy, or a bit of both.

Jeffrey Royle

Jun 3, 2014, 6:24:10 PM
to unma...@googlegroups.com
hi Jay,

(1)  I don't see a problem computing seasonal occupancy from this "stacked" model, ditto turnover/persistence etc., although I think the meaning of those things is less clear since the model destroys the temporal dynamic structure when you use the stacked data.
 I don't believe there should be a systematic bias due to using the stacked data.

(2) I don't know

(3) I can't think of a reference that uses this idea.  Anyone?  I would disagree with the reviewer comments on your manuscript.  The closure assumption is the same whether you stack the data and model fixed time effects or whether you use the fully dynamic model!

regards
andy



--
You received this message because you are subscribed to the Google Groups "unmarked" group.
To unsubscribe from this group and stop receiving emails from it, send an email to unmarked+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Dan Linden

Jun 4, 2014, 12:02:24 AM
to unma...@googlegroups.com
Hey Jay, here are my thoughts:

1) Like Andy said, you wouldn't expect bias. You might underestimate the error in some covariate relationships and potentially have problems with unmodeled heterogeneity, but this would depend on the degree of pooling you employed (e.g., year-specific parameters vs. other variations). A random effects structure could address this but you'd have to go the BUGS route to accomplish it easily.

2) Due to the stacking (and depending on pooling), you are now estimating the probability of occurrence across space and time as opposed to just space. What you just described is a system where the relationship between occurrence and habitat quality across space in a given year is less pronounced than when you incorporate multiple years. A dynamic multi-season model would be able to pick that up if habitat quality were specified as a covariate on extinction probability. Without using the multi-season structure you lose the ability to further distinguish the mechanisms driving the observed data and instead have to lump occurrence and persistence into a single probability. That may not be ideal but I would not say there is anything necessarily wrong with it - just a compromise to accommodate data limitations.

3) May not be the best example (and an unintentional plug), but I stacked 2 years of data for an occupancy model in the paper listed below.  I did it not because of sparse data but because I felt a multi-season model was overkill for only 2 years, especially given that I was not interested in dynamic components (it was also multi-scale and multi-species).

Linden, D.W., and G.J. Roloff. 2013. Retained structures and bird communities in clearcut forests of the Pacific Northwest, USA. Forest Ecology and Management 310:1045-1056.

Kery Marc

Jun 4, 2014, 3:26:32 AM
to unma...@googlegroups.com

Hi all,

 

I find this an interesting discussion. Basically, I feel that stacking is an OK, but approximate, way of dealing with a sparse-data situation so that you can fit the model in unmarked. However, you do not fully accommodate the dependence structure due to repeated measurements at the same sites; that is, yes, you are committing pseudo-replication in the sense of Hurlbert (1984). In analogy with other cases where you have unmodelled spatial or temporal dependency in ecology (cf. models with or without spatial autocorrelation), I would say that you don't get biased parameter estimates, but simply too small SEs. It would perhaps be better to use BUGS to fully incorporate the dependencies by adding a site random effect, and it might be an interesting exercise to compare the inferences between the two approaches. But I don't think that you have to.

 

An extreme case of stacking is when you have no spatial replication at all, only temporal replication, e.g. 9 years of data for a single site. Check some recent publications by Yuichi Yamaura and colleagues on community occupancy models, where they fit them in exactly this setting. For instance, in Journal of Applied Ecology 2011, 48, 67-75 they write: "We applied our model to 9 years of bird monitoring data after a forest fire at a single site (N36°35′32″, E140°37′20″) by substituting time (e.g. years) for space (e.g. sites)."

 

I have run some limited simulations for the related N-mixture model to convince myself that the estimates for a time-for-space substitution are fine. And they were. – BTW, simulation is SUCH a great tool to investigate such questions for yourself.

 

Regards  --  Marc


Gretchen Nareff

Jul 22, 2018, 3:22:03 PM
to unmarked
I'm hoping after four years, people may know of more publications using the stacked data method (the OP's third question)? 

Dan, I just downloaded the paper you cited here; I'm submitting to the same journal.

Thanks.

John Clare

Jul 22, 2018, 3:40:38 PM
to unmarked
My memory is that Dan, Andy, and Angela Fuller stacked across years for recent fisher papers:
  • Fuller, Angela K., Daniel W. Linden, and J. Andrew Royle. "Management decision making for fisher populations informed by occupancy modeling." The Journal of Wildlife Management 80.5 (2016): 794-802.
  • Linden, Daniel W., et al. "Examining the occupancy–density relationship for a low‐density carnivore." Journal of applied ecology 54.6 (2017): 2043-2052.

Jeffrey Royle

Jul 22, 2018, 4:53:49 PM
to unma...@googlegroups.com
hi Gretchen,
 In addition to what John said, I think also the Crum et al. (2017) paper on moose occupancy in NY (JWM paper, maybe 2016?) also used a stacked type of model.   Surely there must be some more published references... people will know, I hope.
 (at any rate, we talk about this a lot in chapter 13 of AHM v. 2 which will go to press in January....)

regards,
andy



Gretchen Nareff

Jul 23, 2018, 6:30:12 PM
to unmarked
Thank you both!

L. Jay Roberts

Jul 24, 2018, 12:24:05 PM
to unmarked
I've recently used it with distsamp:

And also the original question referred to the analysis and review of this paper:

And prior to that we also used a stacked data structure to look at habitat associations in this paper:



Gretchen Nareff

Jul 25, 2018, 8:02:52 PM
to unmarked
Thank you, Jay

mira sytsma

Nov 13, 2018, 11:17:29 PM
to unmarked
Hi all,

This is exactly the thread I've been looking for! I'm having this issue with my data as well: I've stacked my detection histories and am now running into questions about pseudoreplication and how to deal with it. I have been able to get random effects incorporated into RMark, but I am much more used to unmarked and am struggling with RMark quite a bit. Those of you who have published using this stacked approach: have reviewers dinged you for it? Or how did you go about justifying your decision to use this approach?

Thanks!
Mira


Chris Merkord

Nov 14, 2018, 1:15:25 PM
to unma...@googlegroups.com
Another example of a stacked years analysis:


Kery Marc

Nov 14, 2018, 1:22:52 PM
to unma...@googlegroups.com
Hi all,

I agree that the stacking method ignores some variability by assuming that measurements from the same site are independent across years. However, it's probably not such a big deal. And: people got away (and still get away all the time) with treating pieces of individual capture histories as independent in the m-array formulation of fitting the Cormack-Jolly-Seber model: I have never heard anybody complain about pseudoreplication there (i.e., that you have to fit a random individual effect). Hence, I would try to argue that stacking is often a reasonable thing to do, because it allows you to very easily fit models with trends across years. Just treat the significance tests with a little caution, since they are likely to be a little too liberal.

Best regards  ---- Marc





Aaron Grade

Mar 20, 2019, 7:31:29 AM
to unmarked
Hello,

On this same topic (using a single-season occupancy model for multiple years): I have a limited number of visits per year (4 or fewer) for 3 years. Would it be appropriate to treat each site as the same site (rather than stacking by site-year) and then add year as a covariate on detection and occupancy, or is that entering the territory of pseudoreplication/severe violations of model assumptions? If not, would adding year as a random effect in a mixed-model framework reduce the pseudoreplication issues?

Thank you,

Aaron

Kery Marc

Mar 20, 2019, 8:20:02 AM
to unmarked

Dear Aaron,

 

(Max 4 visits is not unusual at all.) And no, putting the yearly data sideways will not give you the right answers, because you would then assume closure over the entire 3-year period. OK, this may not be disastrous, but you would then estimate some sort of probability of use rather than probability of permanent presence. Unless you have a strong reason not to, I find stacking the better way to deal with multi-season data in a closed model.

 

Best regards  -- Marc

Jim Baldwin

Mar 20, 2019, 10:45:03 AM
to unma...@googlegroups.com
For whatever it's worth here are my two cents:

1.  As others have said, for many situations ignoring the correlation structure doesn't bias estimates. (x-bar is still a good estimate of the mean even if observations are serially correlated, assuming one has enough observations.)
2.  However, ignoring the correlation structure does have a big effect on estimates of precision. (My soapbox speech is "A statistic without an appropriate measure of precision is at best of unknown value.")
3.  Maybe it was mentioned in the previous threads (and if so I apologize for missing it), but why not perform a bootstrap to estimate the precision of any particular quantity, with site as the basic unit of bootstrap selection? That would attempt to include the inherent correlation/dependence structure (i.e., all years/seasons for any site would always be included together in a bootstrap sample) and result in justifiable estimates of precision.

Jim


Message has been deleted

Kery Marc

Mar 22, 2019, 4:52:42 AM
to unma...@googlegroups.com
Hi Dan and Jim,

interesting discussion. I tend to feel like Dan that I wouldn't see underestimated precision as a death-blow for "stacking". After all, in almost ANY statistical analysis that we run, we ignore at least half a dozen dependencies (in time, space, etc.), which will always cause our SEs to be too narrow.

On the other hand, I find Jim's suggested solution, nonparametric bootstrapping, excellent, and it should be easy enough to implement in R.

Best regards  --- Marc




From: unma...@googlegroups.com [unma...@googlegroups.com] on behalf of Dan Linden [danl...@gmail.com]
Sent: 21 March 2019 03:32
To: unmarked

Subject: Re: [unmarked] Re: effective open multi-season model by stacking years (so site/years = sites)

Not sure I agree with Jim here.  Ignoring correlation can have an effect on precision, but it does not have to be "big".  Underestimating uncertainty is not necessarily a fatal flaw.

One of the points of stacking sites/years is that there is value in knowing that a site with particular attributes has 3 years of occurrence vs. 1 or 2. You will miss out on the mechanism, since a site with persistent occurrence across 3 years is different from 3 separate sites being occupied in a given year. But having information across multiple years is still more valuable than a single year. Everything is an approximation, right?

Dan Linden

Mar 22, 2019, 8:00:54 AM
to unmarked
Not sure why my message got deleted.  Andy, what kind of unit are you running here? :)

Jeffrey Royle

Mar 22, 2019, 6:42:41 PM
to unma...@googlegroups.com
hey Dan, not sure what happened man -- it's not you!

Aaron Grade

Mar 26, 2019, 1:40:23 PM
to unmarked
This is very helpful, thank you!

Stacking appears to be the way to go; my colext models don't do a great job of converging, but this is a worthwhile dataset to explore.

-Aaron
Message has been deleted
Message has been deleted

Jeffrey Royle

Apr 3, 2019, 11:38:15 PM
to unma...@googlegroups.com
hi Gretchen,
to be honest I didn't exactly follow the structure of your data here. My understanding is that each site was sampled over 3 years, and within each year it was sampled 3 times. So let's call y_{ijk} the count at site i, rep j, year k. The data are like:
          ---- year 1 -----year 2-------year 3-----
  site1:   y11 y21 y31  y12 y22 y32  y13 y23 y33

 So from this you create a data set like this:

  "site1"   y11 y21 y31
  "site2"   y12 y22 y32
  "site3"   y13 y23 y33

correct?   

This is how I would think of describing the idea of "stacking" -- it breaks the temporal structure and, as noted by Jim Baldwin and Marc and others, it introduces some dependence that is not accounted for. I would say that the method of stacking does NOT assume closure across years. Rather, it avoids having to assume that... it allows N[site,year] to be different while using a closed model... but the stated precision of the MLEs is perhaps a little too optimistic. In the new book (AHM2) we added a short illustration of Jim Baldwin's idea of using a nonparametric bootstrap to account for this. That's a brilliant idea which apparently Ian or Richard had thought of 10 years ago when they wrote unmarked, because in fact we have a nonparboot() function which I had never thought to use before. So I learn something new every day by reading the unmarked email list....

I hope this helps out a little bit but, if not, please ask for follow-up and will try to help out...

regards
andy



On Mon, Apr 1, 2019 at 4:38 PM Gretchen Nareff <marsh...@gmail.com> wrote:
I didn't see a response to Aaron's question about his specific setup. I did what he is describing—I have 474 three-visit samples from 187 point count stations (points), assuming closure among the three visits and using pcount. I'm stacking because I have different levels of years-post-harvest (YPH) among points (some only have data from one year post while others have data from two or three years post). I'm interested in the effect YPH has on abundance. If I combined points with YPH as you all are talking about, I would have 474 "sites" and my models don't converge. So I ran my models with my 187 points stacked (into 474 samples) and used a covariate for YPH. So the two are linked, but my "sites" repeat for however many years. 

For example, point count "KY_E1" would have three rows, tied to three different levels of SiteCov YPH, rather than having three different sites: "KY_E1_1", "KY_E1_2", "KY_E1_3". And then point count "SJ_E1" only has one YPH so it has one row.

Is this not appropriate? Thank you.
 


Gretchen Nareff

Apr 3, 2019, 11:53:23 PM
to unma...@googlegroups.com
Thanks, Andy. I tried to delete my messages today because they were a result of lack of sleep and subsequent panicking about a manuscript I submitted. I figured out today that what I did was correct. After reading Aaron’s message, I thought I had to somehow tell unmarked that each row was a unique point, but that is inherent in the stacked structure.

I may not have described it clearly, but that structure you wrote out is what my data look like. I also understand that it doesn’t assume closure across years because there is no connection between site x in year one and site x in year 2. I’m using dynamic N-mix models for another dataset so my thoughts were muddled. 

I appreciate you taking the time to respond!

--
Gretchen E. Nareff

U.S. Fish and Wildlife Service
PhD Candidate
WV Cooperative Fish and Wildlife Research Unit
West Virginia University


John

Jun 7, 2019, 1:30:29 PM
to unmarked
Hi everyone,

I have a follow up question for these stacked single-season models:

Should you include year in every model (as was done in the previously cited fisher and moose papers), or is it acceptable to include year in only a subset of the models?

For my analysis (surveys from 2009-2018, points surveyed for 5 days, then resurveyed every 2-4 years depending on the point), I'm not interested in changes by year, just the associated habitat variables.

Thus, is this model set acceptable, or should year be included in each model?

m1  <- occu(data = CAT_UMF, ~ 1 ~ 1)
m2  <- occu(data = CAT_UMF, ~ 1 ~ 1 + YEAR)
m3  <- occu(data = CAT_UMF, ~ 1 ~ 1 + BDE)
m4  <- occu(data = CAT_UMF, ~ 1 ~ 1 + BPR)
m5  <- occu(data = CAT_UMF, ~ 1 ~ 1 + IMP)
m6  <- occu(data = CAT_UMF, ~ 1 ~ 1 + FPA)
m7  <- occu(data = CAT_UMF, ~ 1 ~ 1 + YEAR + BDE)
m8  <- occu(data = CAT_UMF, ~ 1 ~ 1 + YEAR + BPR)
m9  <- occu(data = CAT_UMF, ~ 1 ~ 1 + YEAR + IMP)
m10 <- occu(data = CAT_UMF, ~ 1 ~ 1 + YEAR + FPA)
m11 <- occu(data = CAT_UMF, ~ 1 ~ 1 + BDE + BPR)
m12 <- occu(data = CAT_UMF, ~ 1 ~ 1 + BDE + IMP)
m13 <- occu(data = CAT_UMF, ~ 1 ~ 1 + BDE + FPA)
m14 <- occu(data = CAT_UMF, ~ 1 ~ 1 + BPR + IMP)
m15 <- occu(data = CAT_UMF, ~ 1 ~ 1 + BPR + FPA)
m16 <- occu(data = CAT_UMF, ~ 1 ~ 1 + IMP + FPA)


Thanks!


L. Jay Roberts

Jul 15, 2019, 2:41:22 PM
to unmarked
In general I would say you want to include year as a covariate in the occupancy model. Ideally this would be a random effect with year coded as a factor, but since unmarked doesn't do random effects it has to be a fixed effect. The year factor covariate would account for some year-to-year variation that affects all your points equally. Where there could be issues is if there is a trend or some other year-to-year influence that is important to your conclusions, or that "interferes" with the interpretation of another variable in some way. You may want to check the collinearity of all your variables and see if any are highly collinear with year (and, more generally, this should always be done in any regression model). E.g. if one of your variables has a strong trend across years, it may be difficult to interpret the model fit when both that variable and year are included in the same model. The "vif" statistic from the package HH is one good way to check.

Alternatively, you can code year as a numerical value, in which case it accounts for a linear trend in your data. If that makes sense for your study then I think it is appropriate.

In conclusion, the answer (to a great many stats questions, as it turns out) is: it depends. In my opinion, year can be included in a subset of the models you are evaluating as long as you have good reason to do so. I have run occupancy models using stacked data without year as a covariate when I had other variables that happened to be collinear with year (e.g. climate data).
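Jay mentions the "vif" statistic from the R package HH. The same quantity can be computed by hand in any language: VIF_j = 1 / (1 - R²_j), where R²_j comes from regressing covariate j on the remaining covariates. A generic sketch with made-up covariates (a simulated "year" variable, a second variable strongly trending with it, and an unrelated one):

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of covariate matrix X."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        yj = X[:, j]
        # Regress column j on the remaining columns (plus an intercept)
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, yj, rcond=None)
        resid = yj - others @ beta
        r2 = 1 - (resid @ resid) / ((yj - yj.mean()) ** 2).sum()
        out[j] = 1 / (1 - r2)
    return out

rng = np.random.default_rng(0)
year = rng.normal(size=200)
trend = 0.9 * year + 0.1 * rng.normal(size=200)  # strongly collinear with year
indep = rng.normal(size=200)                     # unrelated covariate
v = vif(np.column_stack([year, trend, indep]))
# v[0] and v[1] are large; v[2] is near 1
```

A common rule of thumb flags VIF values above 5 or 10 as problematic, which is the situation Jay describes when a habitat or climate variable trends across years.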

eric.d...@nasa.gov

Aug 27, 2019, 2:01:52 PM
to unmarked
Andy;
Would you please sketch out how to do the non-parametric bootstrap to properly estimate the variance when using stacked years? Is it just a case of re-sampling the points across all years with replacement and re-fitting the model?
Thanks,
Eric

Kery Marc

Aug 27, 2019, 2:15:04 PM
to unma...@googlegroups.com
Dear Eric,

I would do this:
- from the n sites surveyed over multiple years, sample n sites with replacement. Hence, some sites will appear in the bootstrap data multiple times and some not at all.
- then do the stacking and fit the model, saving the MLEs (no need for SEs --- that speeds things up)
- repeat this 1000 or so times and take the SD of the estimates as the bootstrap SE and the 2.5th and 97.5th percentiles as a bootstrapped 95% CI
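The steps above can be sketched generically. This is an illustration only: the `naive_occupancy` function below is a placeholder estimator (the proportion of site-years with at least one detection); in a real analysis you would refit the occupancy model on each bootstrap dataset and save its MLEs instead.

```python
import numpy as np

def naive_occupancy(y_stacked):
    # Placeholder for the model fit: proportion of rows with >= 1 detection.
    return (y_stacked.sum(axis=1) > 0).mean()

def site_bootstrap(y, n_boot=1000, seed=0):
    """Nonparametric bootstrap over SITES: all years of a resampled site
    stay together, preserving the within-site dependence."""
    rng = np.random.default_rng(seed)
    n_sites, n_years, n_visits = y.shape
    est = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n_sites, size=n_sites)  # sample sites w/ replacement
        yb = y[idx].reshape(n_sites * n_years, n_visits)  # then stack
        est[b] = naive_occupancy(yb)
    se = est.std(ddof=1)                 # bootstrap SE
    ci = np.percentile(est, [2.5, 97.5]) # bootstrapped 95% CI
    return se, ci

rng = np.random.default_rng(42)
y = rng.binomial(1, 0.25, size=(60, 3, 4))  # hypothetical detection data
se, ci = site_bootstrap(y, n_boot=500)
```

The key point is that resampling happens at the site level, not the site-year level, so the dependence between years at the same site is carried into every bootstrap replicate.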

That's what we do in the upcoming AHM2 book too.

Best regards  --- Marc




eric.d...@nasa.gov

Aug 30, 2019, 11:19:44 AM
to unmarked
Thank you!





Jasmine Cutter

Oct 11, 2019, 2:19:37 PM
to unmarked
Hi all,
I'm new to unmarked & trying to wrap my brain around formatting my data for the y matrix for gdistsamp.

I have 72 transects that I've sampled 3x per year for 3 years. My initial inclination was to arrange my data as you suggested for Gretchen, but it was then recommended to me to have the first column be transect-year instead of transect, with only y1 y2 y3. I'm trying to wrap my brain around what that means for the analysis/assumptions.

It makes sense to me to do it as recommended for Gretchen, because these are sites that are being visited multiple times (basically 9 times). But if I expect some differences in availability each sampling period (within-year visits), would it make more sense to use transect-year, so that observations aren't summed for the whole transect? Either way, numPrimary should be 3, because year is the primary time period regardless, correct?

Finally, when I run summary on my unmarkedFrameGDS, even when it's arranged by transect, it tells me there are 3 primary sampling periods but only 1 secondary period... I assume that should be 3 also? What am I missing in my matrix arrangement and/or unmarkedFrameGDS call to tell it that there are 3 secondary periods nested in 3 primary periods?

Thanks for any advice! 
Jasmine

Option 1:
          ---- year 1 -----year 2-------year 3-----
  site1:   y11 y21 y31  y12 y22 y32  y13 y23 y33

 So from this you create a data set like this:

  "site1"   y11 y21 y31
  "site2"   y12 y22 y32
  "site3"   y13 y23 y33

Option 2:
                       SampPd1     SampPd2     SampPd3
transect1.yr1
transect1.yr2
transect1.yr3
transect2.yr1


Justin T Mann

May 20, 2020, 2:07:36 PM
to unmarked
Hi everyone! 

Sorry to crash an older thread, but...

I just applied this bootstrap to a year-stratified N-mixture model. This was recommended to me by Andy, who graciously sent me the AHM2 chapter proof that described both the "stacking" method and the non-parametric bootstrap. My data come from a short, two-year experiment. I'm not interested in the effects of year, and when included as a fixed effect, year didn't explain much of any variation. I understand that the bootstrap simulation essentially "breaks" up the temporal structure of the dataset to compensate for the non-independence of reused survey sites. 

So my question is: since the temporal structure isn't of interest, is there anything wrong with me simply reporting the mean of the bootstrapped estimates with the bootstrapped standard errors and confidence intervals? There are no differences in the interpretation of my model, whether bootstrapped or not. Reporting the bootstrapped estimates would just provide a more conservative estimation of the variance.

Am I thinking about this correctly? I'm sorry if this is an embarrassingly naive question. I am no expert and just need a sanity check.

Many thanks,
Justin   

    


Edward Trout

Sep 8, 2020, 2:05:19 PM
to unmarked
Hi everyone,

This thread has been a great help and I appreciate the responsiveness and help of this community in general!

I have a question about stacking for my data. I have camera data from the summers of two years that differ in survey length (July - October 2018, June - October 2019). I currently have weeks set as my observation period, so I have 17 observations for the first year and 22 observations for the second year.

In stacking as I understand it, each site*year would become a separate "site". My detection histories (y data) would then be of different lengths, since the first year has fewer observations than the second (2018: 43 x 17; 2019: 48 x 22), such as this:

"site 1"   y11 y21 y31
"site 2"   y12 y22 y32 y42 y52

What is the appropriate way of dealing with this?

I would imagine you would fill the spaces with NAs. At that point is it better to:
           1) align the detection histories by time so that each observation has a fixed point in season (e.g. July 1st 2018 and July 1st 2019 share the same observation column)
               "site 1"  NA   NA   y11  y21   y31
               "site 2"  y12  y22  y32  y42   y52
OR
          2) align the first dates of survey so that each observation has a relative point in each season (e.g. July 1st 2018 and June 1st 2019 share the same observation column)
             "site 1"  y11  y21  y31  NA  NA
             "site 2" y12  y22   y32  y42  y52


Thanks so much for taking a look at this and setting me right!

Best,

Edward

ken.k...@gmail.com

Sep 8, 2020, 4:31:13 PM
to unmarked
Hi Edward,

It shouldn't matter how you pad with NAs; either approach is fine. Just make sure that your observation-level covariates (if you have them) line up the same way.
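Either padding can be sketched generically (the function name and the toy two-row example here are hypothetical; in unmarked, NA cells are simply treated as missing visits):

```python
import numpy as np

def pad_histories(histories, align="left"):
    """Pad ragged detection histories with NaN to a common length.

    align="left"  : align first survey dates (Edward's option 2)
    align="right" : align the ends of the seasons (option 1, when the
                    shorter year starts later in the season)
    """
    width = max(len(h) for h in histories)
    out = np.full((len(histories), width), np.nan)
    for i, h in enumerate(histories):
        if align == "left":
            out[i, :len(h)] = h       # NAs trail the shorter history
        else:
            out[i, width - len(h):] = h  # NAs lead the shorter history
    return out

# Toy data: a 3-week history (shorter year) and a 5-week history
hist = [[1, 0, 1], [0, 1, 0, 0, 1]]
left = pad_histories(hist, "left")
right = pad_histories(hist, "right")
```

As Ken notes, the model sees the same detections either way; what matters is that any weekly observation covariates are padded with the same alignment as the y matrix.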

Ken

Mike Allen

unread,
Nov 2, 2020, 5:46:33 AM
to unmarked
Hi all,
Related to Justin's previous question on this thread: what would you do if sites were surveyed irregularly across years (i.e., some were surveyed in multiple years and others not, in an inconsistent fashion) and some years have a small number of sites that are not spatially representative?

The data below show the number of detections by year (out of 3-6 annual surveys) for each site. About 200 sites were surveyed in each of 2015 and 2016, but only about 40 sites each in 2018 and 2019. Furthermore, the 2018 and 2019 sites were chosen in an ad hoc, spatially clustered fashion (and had VERY few detections), so I don't trust that the "occupancy" estimates for those sites in those years represent "truth" for the whole region (though all points from all years, together, are fairly representative). As a result, a stacked approach with a "year" covariate does not produce satisfactory results (i.e., the separate occupancy estimates by year are not trustworthy). Also, I'm not interested in changes in occupancy across years, just the overall effects of the environmental covariates that may be driving occupancy.

Is a Bayesian approach with site (pointid) as a random factor the way to go to investigate factors driving occupancy?

Thanks,

Mike Allen
Rutgers Ecology & Evolution
(PS - This is the best thread.)

pointid 2015 2016 2018 2019
   site1    0    0   NA   NA
   site2    0    0   NA   NA
   site3    0    0   NA   NA
   site4    0    0   NA   NA
   site5    0   NA   NA   NA
   site6    0   NA   NA   NA
   site7    0   NA   NA   NA
   site8    0   NA   NA   NA
   site9   NA    0   NA   NA
  site10    0    0   NA   NA
  site11    0    0   NA   NA
  site12    0   NA   NA   NA
  site13    0   NA   NA   NA
  site14    0   NA   NA   NA
  site15    0   NA   NA   NA
  site16    0   NA   NA   NA
  site17    0   NA   NA   NA
  site18    0   NA   NA   NA
  site19    0   NA   NA   NA
  site20    0   NA   NA   NA
  site21    0   NA   NA   NA
  site22    0   NA   NA   NA
  site23    0   NA   NA   NA
  site24    0   NA   NA   NA
  site25    0   NA   NA   NA
  site26   NA    2   NA   NA
  site27   NA    2   NA   NA
  site28   NA    0   NA   NA
  site29   NA    0   NA   NA
  site30   NA    0   NA   NA
  ...
 site125    0    2   NA   NA
 site126    0    0   NA   NA
 site127    0    0   NA   NA
 site128    0    0   NA   NA
 site129    0   NA   NA   NA
 site130    0   NA   NA   NA
 site131    0   NA   NA   NA
 site132    0   NA   NA   NA
 site133    0   NA   NA   NA
 site134    0   NA   NA   NA
 site135    0   NA   NA   NA
 site136    0   NA   NA   NA
 site137   NA   NA    0   NA
 site138   NA   NA    0    0
 site139   NA   NA    0   NA
 site140   NA   NA   NA    0
 site141    1    0   NA    0
 site142    0    0   NA   NA
 site143    0    1    0    0
 site144    0   NA   NA   NA
 site145    0   NA   NA   NA
 site146    0    0   NA   NA
 site147    0    0   NA   NA
 site148    1    1    0    0
 site149   NA   NA    0   NA
 site150    0   NA   NA   NA
... ~ 200 more rows...

Juani Reppucci

unread,
Nov 3, 2020, 3:44:25 PM
to unmarked
Hi Mike
I'm in a similar situation...
Take a look at the ubms package; it might be what you are looking for.
Here is the vignette for the package: https://kenkellner.com/blog/ubms-vignette.html
Cheers

Juan

 

Ken Kellner

unread,
Nov 6, 2020, 1:05:03 PM
to unmarked
Juan,

Thanks for suggesting ubms, just wanted to add I have a proper website for it now here: https://kenkellner.com/ubms/

There are now a number of unmarked-equivalent models you can fit in ubms while also including random effects: versions of occu(), colext(), pcount(), distsamp(), multinomPois(), and occuTTD(). I am slowly grinding away at final issues and hope to submit to CRAN soon.
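For the stacked-data situation discussed earlier in this thread, a random intercept for the original site can absorb the dependence among repeated site-years. A minimal sketch (the frame `umf`, the occupancy covariate `elev`, and the site identifier `pointid` are hypothetical placeholders, not from any specific dataset above):

```r
library(ubms)

# Stacked single-season occupancy model: constant detection, one example
# occupancy covariate, and a random intercept on the original site ID
# to account for reusing each site across years
fm <- stan_occu(~1 ~ elev + (1 | pointid), data = umf,
                chains = 3, iter = 2000)
summary(fm, "state")   # posterior summaries for the occupancy submodel
```

The `(1 | pointid)` term uses the lme4-style random-effect syntax that ubms accepts in its formulas.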

Ken

Richard Chandler

unread,
Nov 6, 2020, 1:10:30 PM
to unmarked
Ken, congrats on the new package. It looks great!

Richard

From: unma...@googlegroups.com <unma...@googlegroups.com> on behalf of Ken Kellner <ken.k...@gmail.com>
Sent: Friday, November 6, 2020 1:05 PM
To: unmarked <unma...@googlegroups.com>

Subject: Re: [unmarked] Re: effective open multi-season model by stacking years (so site/years = sites)
 

Kery Marc

unread,
Nov 6, 2020, 1:38:11 PM
to unma...@googlegroups.com
I agree. This is super-cool work! Thanks, Ken.

Best regards  --- Marc


From: unma...@googlegroups.com [unma...@googlegroups.com] on behalf of Richard Chandler [rcha...@warnell.uga.edu]
Sent: 06 November 2020 19:10

Ye Htet Lwin

unread,
May 2, 2024, 8:29:26 AM
to unmarked

Dear all,

I hope you're doing well. I recently read the helpful discussion above, which reminded me of a problem I'm facing. Like Mike, I have lots of camera trap data from different years (2016 to 2022), but the sampling periods are inconsistent: some traps were deployed for a whole year, while others ran for only a few months, and the survey seasons also differed. My primary research objective is to fit the multi-year data with a single-season N-mixture/Royle-Nichols model, since I am not interested in a dynamic model.

Currently, I am deciding between two methods: using a "stacked" approach, OR treating year as a random effect via the ‘ubms’ package in R. The stacked approach seems great, but it might not work well because my data were collected at different times of the year. On the other hand, treating year as a random effect might be better, but I'm not sure how to do it effectively.
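From the ubms documentation, I think the random-effect option might look something like this (just a sketch to show what I mean; `umf`, the covariate `forest`, and the `year` site covariate are placeholders for my real data, and `stan_occuRN` is the ubms version of the Royle-Nichols model):

```r
library(ubms)

# Sketch of the second option: stacked site-years with year as a
# random intercept in a Bayesian Royle-Nichols model
fm <- stan_occuRN(~1 ~ forest + (1 | year), data = umf, chains = 3)
```

Is this roughly the right way to set it up?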

I would really appreciate your advice on which method would be best for my situation; any suggestions or comments are highly appreciated. Your help would mean a lot to me.

Thanks,

Lwin
