Abundance multinompois using predict, sample.int error

58 views
Skip to first unread message

Isabelle Ranson

unread,
Jul 10, 2022, 7:17:32 AMJul 10
to unmarked
Hi everyone,
I just tried using this code example for multinompois removal provided by Ken Kellner:

https://groups.google.com/g/unmarked/c/l9CCdAplmAo/m/H9r42ErIBAAJ

But I'm getting an error message, any idea how I can fix it?
I've already completed a run on detection probability, but now I want to estimate abundance in different regions of the same population.

#for context

#sept.pois has been simplified for sharing
sept.pois <- unmarkedFrameMPois(y=dp.sept,
                                          siteCovs = NULL,
                                          obsCovs = (Method, region, effort), type ="removal")
                                         
sept2 <- multinomPois(~Method +region +effort ~1,sept.pois) #best fit

r <- ranef(sept2)

groupfunc <- function(x){
out <- numeric(5)
out[1] <- mean(x[obsCovs$region=="Marsh"])
out[2] <- mean(x[obsCovs$region=="FloatingDock"])
out[3] <- mean(x[obsCovs$region=="MuddyHatchery"])
out[4] <- mean(x[obsCovs$region=="NorthernExtent"])
out[5] <- mean(x[obsCovs$region=="SouthernExtent"])
}

pr <- predict(r,func=groupfunc,nsims=100)
Error in sample.int(length(x), size, replace, prob) :
NA in probability vector


Any ideas would be appreciated! Thank you!

Marc Kery

unread,
Jul 10, 2022, 7:57:46 AMJul 10
to unmarked
Dear Isabelle,

I think that for this you need to add 'region' as a site covariate also in your unmarked data frame and then add it into the abundance model that you fit. Then, you should be able to use predict() to obtain an estimate of the mean lambda for each region.

An alternative (though perhaps less common and perhaps even to be discouraged) that should also work with a model where you don't have 'region' as a factor in the (sub-model) for lambda would be to use ranef() on your fitted model object now and then summarize the resulting estimates of the realized population sizes at each site by region. [Perhaps somebody has an opinion on that ? I think adding 'region' in the abundance model and then using predict() is probably better, since it's an inference to the regions as wholes, and not restricted in its scope to the exact set of sites studied in each region, right ?]

Then, I have a question. Thanks for your other email where you summarized some of the obstacles you had and how you overcame them. That may well be useful for others. However, I think there you say you had no site covariates .... but what about 'region' ? I don't see how this could not be a site covariate. If it was an observational covariate then that would mean that sites can change membership to a region over the duration of the sampling, and I don't see how this could happen.

Best regards  --- Marc


From: unma...@googlegroups.com <unma...@googlegroups.com> on behalf of Isabelle Ranson <isabell...@gmail.com>
Sent: Sunday, July 10, 2022 13:17
To: unmarked <unma...@googlegroups.com>
Subject: [unmarked] Abundance multinompois using predict, sample.int error
 
--
*** Three hierarchical modeling email lists ***
(1) unmarked (this list): for questions specific to the R package unmarked
(2) SCR: for design and Bayesian or non-bayesian analysis of spatial capture-recapture
(3) HMecology: for everything else, especially material covered in the books by Royle & Dorazio (2008), Kéry & Schaub (2012), Kéry & Royle (2016, 2021) and Schaub & Kéry (2022)
---
You received this message because you are subscribed to the Google Groups "unmarked" group.
To unsubscribe from this group and stop receiving emails from it, send an email to unmarked+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/unmarked/f8616ec2-20d1-4c20-b5a2-3c6695ea8281n%40googlegroups.com.

Isabelle Ranson

unread,
Jul 10, 2022, 9:43:58 AMJul 10
to unmarked

Hi Marc, 

It took me a long time to discover region (in this dataset) made more sense as an obs covariate. It may be hard to explain my reasoning, and of course I may be incorrect! It’s not my preferred option, but it was the only one that appeared to work with this dataset and produced results.

This dataset is regarding an invasive species in a 750 acre area, and the traps being deployed were moved around on a daily basis. The managers constantly adapted their strategies to maximize capture. This meant there were no traps which remained in a given spot (GPS) over the duration of sampling, and thus no consistently sampled sites (M). I defined five regions based on habitat differences within the 750 acres to see if the detection probability or abundance estimates were different in these different habitats. In order to guarantee I had zeros in my count data (Y=1 or Y=0), the data could not be condensed in a way that created permanent sites. The square matrix also made this a troubleshooting nightmare (so many pivot tables and reorganization experiments). These means that every site or M had to be defined as every individual trap opening, therefore the region changes with each M. This was the only method that appeared to work.

In the context of this data:

M = sites. Individual trap “checks”.

Y = count data or detection data (Y=1 or Y=0) at each of the sites (M). For this report, Y represents the catch of each trap, (0 - >100)

siteCovs: = columns for site-level covariates, things that remain constant for every site (M) per visit (J). NULL.

obsCovs: columns for observational-level covariates, things that differ per trap opening (M). Region, Method, and Effort.

 

Perhaps that explains it better? What do you think? 

Marc Kery

unread,
Jul 11, 2022, 3:41:37 AMJul 11
to unmarked

Dear Isabelle,

 

wow, not that is a fairly complex design. I must admit that I do have some concerns about whether either standard occupancy models (i.e., function occu() in unmared) or (even more) multinomial N-mixture models (e.g., function multinomPois()) are appropriate for the data produced by it. The two complications seem to be:

  • Traps are moved around, such that you do not seem to get the replicate observations that are required to fit all of these models in their traditional applications
  • And the captured animals are killed, right ? So, you have what is called a removal design: i.e., the population is not closed, but rather does change over the duration of study, due to the capture-removal

 

I think that there may be ways in which the data produced by this design can be subsumed into a design assumed by the model fitting functions in unmarked, but I am not sure.

 

Best regards  --- Marc

Isabelle Ranson

unread,
Jul 11, 2022, 11:56:09 AMJul 11
to unmarked

Hi Marc,

I agree, it's unusual and there has been a lot of compromise to see if this type of analysis is applicable to the situation. But at the end of the day, the goal of my report is to determine if models like those discussed in unmarked, could fit this data (I’m not planning on publishing the results). And while I have detection probability results, I believe they represent a simulation, I’m not confident the results are accurate. I'll have a lot to unwrap in my report discussion.  

You're correct, the individuals are removed and frozen, so this is a removal study. I reviewed the assumptions of a removal model using multinomPois()) and I believe this dataset fits the assumptions regarding migration, and reproduction/death. It is a funky enclosed aquatic system.  

Do you think there’s a way for me to attempt an abundance result, without having a site covariate? I'm going to experiment with this for another month.

.

I’ve included two sources which helped me understand the assumptions of the models. For anyone lost, these are informative places to start. There are loads more resources out there. 

Williams, B.K., Nichols, J.D. and Conroy, M.J., 2002. Analysis and management of animal populations. Academic press. 

Fiske, I. and Chandler, R., 2011. Unmarked: an R package for fitting hierarchical models of wildlife occurrence and abundance. Journal of statistical software, 43, pp.1-23. 

Marc Kery

unread,
Jul 12, 2022, 3:37:48 AMJul 12
to unmarked
Dear Isabelle,

thanks for these clarifications.

I am sorry to say that I don't believe that any of the models in unmarked are applicable for your data (at least in a standard application that excludes single-visit data; but see Lele, Solymos et al., around 2012). The reason for that is that you do not seem to have any replicate observations *at the same sites*. For instance, to adopt a removal sampling design for the model fit by multinomPois, you need at least two such removal counts per site. You don't need replicated data at all of the sites, but at least from some, because the sites with reps provide ~99% of the information about detection probability. In the removal design, this information comes from the decline (overall) of the counts with increasing depletion of a population over successive removal passes.

Now the situation would be different if you had at least some removal counts where the gear remained at the same location for at least two "harvests" and if the proportion of the data that have replicates is at least 5-10% or so (this percentage is just my wild guess here). Such "mixed" designs, where some sites do have replicates and most do not, are amenable to inferences using the models in unmarked.

In the absence of any sites with replicates I would adopt the usual fallback position and just analyse "relative occurrence" (in the case of occupancy) or "relative abundance" (in the case of using counts) in relation to the covariates of interest. This is what the vast majority of ecology etc still does. Good practice then requires (IMO) that you are upfront about what your modelled quantities mean (a combination of what you're really interested in, occupancy or abundance, and of what you're less interested in, detection probability) and that you discuss possible confounding of the patterns of interest by "nuisance patterns" in detection.

Best regards  -- Marc

PS: all: comments/objections welcome as always !
 


Sent: Monday, July 11, 2022 17:56
To: unmarked <unma...@googlegroups.com>
Subject: Re: [unmarked] Abundance multinompois using predict, sample.int error
 

Isabelle Ranson

unread,
Jul 12, 2022, 8:02:33 AMJul 12
to unmarked
Hi Marc,

I may have some sites which remained in the same places. There was one trap type that didn't move around like the others. I'll take a look at those and see if I have enough to salvage. Update soon!

Isabelle Ranson

unread,
Jul 20, 2022, 9:35:55 AMJul 20
to unmarked

Hi Marc!

I have created a new excel sheet using only the stationary traps. I wasn't even sure I had enough to work with but the set appears big enough to get some results, but not all I was hoping for. At this point I can now make region and method sitecovs! Effort remains obscovs. Making the switch was the right decision. My results would have been inaccurate otherwise.

Two of my region outputs appear fine but three are unable to provide SE values. I’m not sure why this is, especially when looking at the raw data. The three regions lacking SE are larger than those that output SE values. There are Nas in the dataset, as some traps have stationary data for less days than others. I’m going to try to find an answer to this, but if you have any hunches, I’d be grateful. 

At the end of the day, this dataset may just be incompatible for continued modelling. 

Here are two of the outputs: one with SE and one without.

> septpredMarsh

   Predicted          SE       lower      upper effort region Method

1   0.01421498   0.002745395   0.009726411   0.02073157      1  Marsh Shrimp

> septpredNorth

   Predicted  SE lower upper effort         region Method

1   0.01202877 NaN   NaN   NaN      1 NorthernExtent Shrimp

Marc Kery

unread,
Jul 21, 2022, 3:06:22 AMJul 21
to unma...@googlegroups.com

Dear Isabelle,

 

thanks for the update. I am pretty sure that using only the data from stationary traps, with at least some repeated measurements, is the right decision. Now these missing SEs are usually a sign that a data set is very small or that there are very few detections for a species. Or that a model is too complex given the amount of information in the data. Sometimes, it’s also a consequence of the optimization not finding the global maximum and that may then be mended by using better starting values.

 

Best regards  -- Marc

Reply all
Reply to author
Forward
0 new messages