occuMulti questions

1,727 views
Skip to first unread message

Katie Grabowski

unread,
Jul 29, 2020, 10:02:27 PM7/29/20
to unmarked
Hi!

First post here--I went through old questions and didn't see these, so I hope this is alright! I'm using the Rota et al (2016) model for my master's thesis, running it with unmarkedframeoccuMulti() and occuMulti().  It's going, but I've reached a few questions:

- How do you extract data from the S4 object that occuMulti produces? I want to take the estimates and standard errors and export them without copy/paste, but as.data.frame, as_tibble, tidy, as.matrix don't work as far as I can tell. as.data.frame(coef(fit)) [fit is what I called the resultant object from occuMulti()] lets me pull just the estimates, but not the SE. 

- Why does the stan code that Rota et al included in their dryad code take 12 hours to run for a single model, while the occuMulti function takes around a minute or so (for the pretty basic model I ran just to see that it was working)? I have a very limited stats background and I'm wondering if they're doing the same thing with that big a time difference? 

- When I run occuMulti(), I get the following warning: In .local(umf, ...) :
  Missing detections at sites: 11, 12, 21, 23, 25, 47, 49, 52, 56. These sites have some NAs for days when the cameras were down. It still gives all the values I'd expect from fitting the model (the estimates, SEs, AIC, etc.), so I'm thinking it's probably fine but I wanted to check. 

- I set the max order of interactions at 2 because I only want to consider up to pairwise interactions for my 4 species: [my code: fit <- occuMulti(detFormulas, stateformulas = occFormulas, data, maxOrder = 2]. With four species and all pairs, there are 10 formulas in my stateformulas argument. However, when I try to run predict() on the model [my code: lapply(predict(fit,'state'),head)], I get the following error:
Error in .local(umf, ...) : 
  15 formulas are required in stateformulas list. 
Any thoughts?

Thank you in advance for your help! I'm so glad this group exists :)

Cheers,
Katie

Ken Kellner

unread,
Jul 30, 2020, 9:20:58 AM7/30/20
to unmarked
Hi Katie,

1) Try using summary() on the output object. You should get a list with one element for the state model and one element for the detection model, and each should be a data frame containing the info you want. You can also extract the standard errors directly with SE().

2) Rota's paper used Bayesian approach and MCMC with Stan to estimate parameters. Unmarked uses maximum likelihood, which is usually much, much faster. The Bayesian approach has other advantages, for example, you can add random effects or customize the model structure (not possible in unmarked).

3) Missing values at some cameras is fine, as long as you are getting reasonable results don't worry about this warning.

4) This looks like a bug to me. I'll work on a fix. In the meantime, you should be able to make prediction work in this situation by fitting your initial model with the alternative approach to limiting interactions - set the formulas for parameters above order 2 to '0' or '~0'. So your stateformulas would be the 10 formulas you have now, plus 4 '0' s for the third-order interactions, plus 1 '0' for the fourth-order interaction (15). See the "fit3" example in the occuMulti help. Re-fit the model and then try predicting and it should work.

Ken

Katie Grabowski

unread,
Jul 30, 2020, 11:11:52 AM7/30/20
to unmarked
Thank you so much for your quick and helpful response! That worked to get the data from the model object, and I appreciate your explanations for the second two questions. 

For my last question:
I added 5 '0's to my stateformulas argument:
occFormulas <- c('~urema_dist','~tree_hansen','~termite.large.count.100m','~lion_camera','~1','~1','~1','~1','~1','~1', '0', '0', '0', '0', '0')

When I re-fit and tried to run predict() again, I get a new error:
Error in eval(predvars, data, env) : object 'detect.obscured' not found

I only have siteCovs in my unmarkedFrameOccuMulti object (no obsCovs), and the siteCovs include both occupancy (i.e. distance to lake) and detection (i.e. vegetation cover) covariates. The occupancy covariates go into my stateformulas argument (above), and I use the detection covariates to create my detformulas object:
detFormulas <- c('~detect.obscured+cover.ground', '~detect.obscured+cover.ground', '~detect.obscured+cover.ground', '~detect.obscured+cover.ground')
As you can see, detect.obscured is one of my detection covariates. Any thoughts on what's throwing this new error?

Ken Kellner

unread,
Jul 30, 2020, 11:55:04 AM7/30/20
to unmarked
Looks like another bug for me to fix, sorry!

You should be able to work around this by putting the detection covs into obsCovs instead. Basically you just need to repeat each site-level value a number of times equal to your number of sampling occasions. So if your siteCovs are 

data.frame(cov1=c(1,2,3,4))

and you have 3 sampling occasions, the equivalent obsCovs would be

data.frame(cov1_detect=c(1,1,1,2,2,2,3,3,3,4,4,4))

This is what occuMulti is doing behind the scenes, but it's obviously not working properly for predict at the moment.

Ken

Katie Grabowski

unread,
Jul 30, 2020, 11:59:56 AM7/30/20
to unmarked
Got it, thank you again for your quick response! I will move my detection covariates and try again. Thanks!

Katie Grabowski

unread,
Jul 30, 2020, 2:19:33 PM7/30/20
to unmarked
Hi Ken,

Just wanted to let you know that I decided to reinstall the unmarked package (I understood your recommendation to get around the error, but was having trouble implementing it in R) and the error about not finding detect.obscured went away! So it seems like the issue was on my end, not the function's. 

Thanks again!
Katie

Emma Evers

unread,
Sep 14, 2020, 12:34:18 PM9/14/20
to unmarked
Hi,

I am having a very similar problem when including detection formulas. 

I called:
occFormulas <- c('~1', '~1', '~1', '~1', '~1', '~1', '0')
detFormulas <- c('~baited', '~baited', '~baited')
fit <- occuMulti(detFormulas, occFormulas, UMFMulti)
fit
The fit file looks good but as soon as I proceed with plot(fit) I get the following:
Error in eval(predvars, data, env) : object 'baited' not found

I tried uninstalling and re-installing unmarked but that has not worked. When I try run the model with no detection covs I get 
Error in pmat[j, not_na] <- ps : 
  number of items to replace is not a multiple of replacement length

Not too sure how to proceed from here.

Thanks in advance,
Emma

Ken Kellner

unread,
Sep 14, 2020, 12:41:59 PM9/14/20
to unmarked
Hi Emma,

I think the first error is a known recent bug, when you use a site-level covariate as a detection covariate. Fitting the model works fine but predicting from it (which is happening at some point when you plot) causes problems.

This should be fixed in the dev version in the next couple days and on CRAN hopefully soon. In the meantime you can try to using the attached script. Start a completely new R session, load unmarked, source in the script, and then run your model and plot again, and it should work.

I'm not sure about that second error, if it keeps happening I'll need more info to look into it.

Ken
om_predict_fix.R

Emma Evers

unread,
Sep 14, 2020, 1:18:27 PM9/14/20
to unmarked
Hi Ken,

I'm still getting the same error message. 
Just to make sure I did it correctly (sorry very new to R) my first two lines are:

library(unmarked)
source('~/Downloads/om_predict_fix.R')
 
and then I ran my script as I had it? It does allow me to do a traceback when I get to the plot(fit) error message.

If I run it without any detFormulas and no maxOrder I am still getting the following error: 
Error in pmat[j, not_na] <- ps : 
  number of items to replace is not a multiple of replacement length

Emma


Ken Kellner

unread,
Sep 14, 2020, 1:26:26 PM9/14/20
to unmarked
Hi Emma,

I'll probably have to run this myself to solve it unfortunately. I'll email you off-list to see if that's possible.

If I end up finding a solution I'll also post it here.

Ken

Emma Evers

unread,
Sep 14, 2020, 2:10:11 PM9/14/20
to unmarked
Hi Ken,

Ok that would be great, happy to provide anything you might need.

Thanks so much for your help,
Emma

ken.k...@gmail.com

unread,
Sep 14, 2020, 3:21:39 PM9/14/20
to unmarked
Emma,

The reason for this error:

Error in pmat[j, not_na] <- ps : 
  number of items to replace is not a multiple of replacement length

turned out to be that the location of missing values did not match for all species across sites. There was at least one site where there was an observation for say, species 1 but the corresponding observation was missing for another species. Unfortunately the model can't handle that scenario properly, either all species have to be non-missing or all have to be missing for a given observation. The solution is to just fill in NAs for all species for any observation where there is any NA for any species.

For example if your observation y-matrices are leopardmat, spottedmat, and lionmat:

any_na <- is.na(leopardmat) | is.na(spottedmat) | is.na(lionmat) #find observations where at least one observation across species is NA
leopardmat[any_na] <- NA #overwrite observations that don't match with NA for each species
spottedmat[any_na] <- NA
lionmat[any_na] <- NA

Then build your y-list and your unmarked frame and everything should work. I'll add a check for this to unmarkedFrameOccuMulti() with an informative warning so that it is handled automatically in the future.

Ken


Henry Hart

unread,
Sep 16, 2020, 6:51:07 PM9/16/20
to unmarked
Hi Ken,

I have been working away at getting the parameters ready for my occuMulti() function. Through trial and error I am finally starting to get some results. However, there are a few parameters (based on r doc examples & the above discussion), that I have been including but do not fully understand. Now that I am beginning to get ready for adding covariates, it would be great to get some clarification to ensure I am using them correctly moving forward.

In my case, I have 4 species, 59 sites (camera trap locations), and 157 observations(sampling period in days). 

occuMulti (detformulas,  stateformulas,  datastarts

1) detformulas: Character vector of formulas for the detection models, one per species. 
          For this parameter, I've noticed that you have added a '~1' for each species type. In my case, will this always be detFormulas <- c('~1', '~1', '~1', '~1')? Do I need to know anymore than this moving forward? I'm confused how ~1 represents a                  formula for the detection model? (Maybe I am overthinking this).

2) stateformulas: Character vector of formulas for the natural parameters. To fix a natural parameter at 0, specify the corresponding formula as "0" or "~0".
           Length should match number/order of columns in fDesign              e.g)    occFormulas <- c('~occ_cov1','~occ_cov2','~occ_cov3','~1','~1','~1','~1')

           I understand including the covariates here, however I am still confused about the meaning of ~1 or ~0. When would be a case in which I would want to fix a natural parameter at '~0'? Again, I'm sorry for my lack of understanding. I have tried                 to explore the meaning of these parameters, but simply can't seem to wrap my head around them (again, maybe overthinking them).

3) Data: I have my data already made. (No issues here!)  

4) Starts: Vector of parameter starting values. 
            Are there specific cases in which starting values would be required?

Again, any insight here would be greatly appreciated. Essentially, I am just confused as what the '~1' and '~0' values represent when included in both the detformula/stateformula  parameters.

Thank you in advance for you help!

Cheers,

Henry

ken.k...@gmail.com

unread,
Sep 17, 2020, 9:37:55 AM9/17/20
to unmarked
Hi Henry,

1) The "~1" formula represents an intercept-only model (the 1 represents the intercept in R formulas) for the parameter in question, with no covariates. So when you set detFormulas to c('~1', '~1', '~1', '~1') you are specifying that you want to model unique, but constant, detection probabilities for each species. If you aren't interested in covariate effects on detection probability in your models, then yes, this will always remain the same for you.

2) Again, for the formulas on natural parameters, ~1 represents an intercept-only model with no covariates. The "0" or "~0" structure for some natural parameters is a unique feature of occuMulti. To fully understand what each natural parameter is doing in the model I would suggest reading Chris Rota's papers on the model. The short version is that the multispecies occupancy model estimates interaction effects among species. If you have two species in your model it will estimate the pairwise interaction. If you have three species it will measure each of the three pairwise interactions (species 1&2, 1&3, 2&3) and the three-way interaction (species 1&2&3), and so on if you have >3 species. In many cases you may not be interested in these higher-order interactions among species. They require a lot of data to get good estimates and (in my opinion) are often hard to interpret. So, you can tell occuMulti to just not estimate these interactions by specifying "0" or "~0" in the corresponding formula.

Say you had 3 species, just to keep things simpler. Your state formulas should contain 7 formulas: 3 formulas for the single-species occupancy parameters, 3 formulas for the 3 pairwise interactions (1&2, 1&3, 2&3), and one formula for the three-way interaction (1&2&3) in that order. If they were all intercept only with no covariates this would be:

c("~1", "~1", "~1", "~1", "~1", "~1", "~1")

If you decide you didn't want to estimate the three-way interaction term, you would replace the corresponding formula with "~0":

c("~1", "~1", "~1", "~1", "~1", "~1", "~0")

And it would no longer be estimated or appear in the model output.

3) Supplying starting values it not usually necessary, and generally it's better to avoid it. Sometimes if your model does not converge, it can be helpful to supply the starting values, but picking good values can be quite a challenge and is not worth worrying about until you decide it's necessary.

Ken

Henry Hart

unread,
Sep 17, 2020, 1:30:21 PM9/17/20
to unmarked
Ken--

Thank you so much for the quick response. This is so much clearer now!

I have a quick side question, regarding the co-variates, that I hope you could help me out with.

Let's say, for example, I want to include the elevation at each site as a covariate. This will be in the form of a (M:# of sites)x(# of cov's) matrix: for this example, a Mx1 matrix.

MxCov Matrix
1   500m
2   650m
3   150m
...
59  230m

Does the unmarkedFrameOccuMulti function accept any value for the covariates column? Or should I normalize these values? If so, do you recommend the -1 to 1 normalization, or just the 0 to 1 normalization?

Also, one last question regarding the covariates. I have been considering including the landcover classification of each site area as a covariate. Would the function accept a 'landcover' column, with shorten character values, such as:

open coniferous = ocon
closed coniferous = ccon
... etc.

i.e.
MxCov Matrix
1   ocon
2   ccon
3   ccon
...
59  ocon



Thanks again, I really appreciate all of your help!

Cheers,

Henry

ken.k...@gmail.com

unread,
Sep 17, 2020, 1:55:46 PM9/17/20
to unmarked
(1) Yes,  technically occuMulti (or other unmarked functions) will accept any numeric values, although in your example you would need to remove the "m" from each value. It's generally a good idea to normalize covariates to facilitate model convergence, especially if they are otherwise very large. I would suggest normalizing your covariates to have a mean of 0 and a standard deviation of 1, e.g. with the scale() function in R.

(2) Yes this is possible. unmarked will automatically convert your character land class data into an R factor variable. This land class factor will then be automatically converted into dummy variables so they can be used in the model.

Ken

Elicia Bell

unread,
Nov 25, 2020, 7:26:49 AM11/25/20
to unmarked

Hello Ken,

I'm working on a multi-species occupancy analysis that forms part of my master’s thesis. I’m familiar with the Rota et al. 2016 paper and am looking at 4 focal species. I have a couple of questions that I hope you might be able to help with. The first relates to issues with back-transforming occupancy estimates on outputs with site covariates.

I've been unsuccessful with the backtransform function even when I am running with the model that excludes covariates (see error below). I have also come to understand that the backtransform function is only possible when no covariates are present. Could you advise me on what steps can be taken to back transform occupancy estimate outputs, in particular when covariates are in play?

Example code:

data <- unmarkedFrameOccuMulti(y=y, siteCovs=site.covs, obsCovs=NULL)

occFormulas <- c('~land_cover','~land_cover','~land_cover','~land_cover','~0','~0','~0','~0','~0','~0', '~0','~0','~0', '~0', '~0')

detFormulas <- c('~1','~1','~1','~1')

fit <- occuMulti(detFormulas,occFormulas,data)

Occupancy:

                                                                Estimate                     SE                   z          P(>|z|) 

[Coyote] (Intercept)                                        -1.456             8.21e-01  -1.772965  0.0762

[Coyote] land_coverGrassland                      -0.715             1.10e+00 -0.649255  0.5162

[Coyote] land_coverOpen Coniferous            -1.031             1.32e+00 -0.780580  0.4350

[Coyote] land_coverShrub Dominated             0.643             1.49e+00  0.432732  0.6652

[Coyote] land_coverWetlands                        -18.220           2.87e+04 -0.000634  0.9995

etc.

backTransform(fit, type="state")

e.g. error: Cannot directly backTransform an unmarkedEstimate with length > 1.

My second question pertains to conditional occupancy. I’d like to be able to apply site covariates to conditional occupancy probabilities. For example, what is the probability that a coyote will occupy a site given wolf presence/absence for different landcover classifications. I believe I understand how to look at pairwise interactions  with covariates using occuFormulas…

e.g. occFormulas <- c('~1','~1','~1','~1','~land_cover','~land_cover','~land_cover','~land_cover','~land_cover','~land_cover','~land_cover','~0','~0', '~0', '~0')

[Coyote:`Red Fox`] (Intercept)                                  8.959   38.049    0.23545   0.813860

[Coyote:`Red Fox`] land_coverGrassland                  0.860    1.185    0.72568   0.468038

[Coyote:`Red Fox`] land_coverOpen Coniferous      -18.145   5931.796  -0.00306   0.997559

[Coyote:`Red Fox`] land_coverShrub Dominated     -11.767   655.041   -0.01796   0.985667

[Coyote:`Red Fox`] land_coverWetlands                  -9.422   265.672   -0.03547   0.971708

etc.

...and how to predict conditional occupancies…

e.g. cond_occ <- predict(fit,'state',species='Coyote',cond='Wolf')

            cond_occ2 <- predict(fit,'state',species='Coyote',cond='-Wolf')

…but not how to generate conditional occupancy probabilities with site level covariate effects?  

Thanks in advance for any insight.

Elicia

Ken Kellner

unread,
Nov 25, 2020, 8:14:21 AM11/25/20
to unmarked
Elicia,

For your first question, backTransform() can't handle occuMulti output since there are more than 2 possible occupancy states. If you just call predict() on your model, e.g.

predict(fit, type='state')

You will get occupancy probabilities for each possible state (0000, 1000, 1100, etc.) at each possible site (taking into account each site's covariate values) along with standard errors. I think that's what you're after? This will still work if you don't have site-level covariates, you'll still get occupancy probabilities for each state and site, each site will just have the same predicted occupancy values as the other sites.

I'm not sure I  understand your second question. The code you shared should generate conditional occupancy probabilities for Coyote based on wolf status at each of your sites, given the specific set of site covariate values at each site. Maybe you are asking about calculating conditional occupancy probability for a specific set of site covariate values that aren't necessarily associated with any one site? If so you can use the `newdata` argument to predict(), supplying to newdata a data frame containing one row per probability you want to calculate and one column for each site covariate included in your model corresponding to those probabilities.

Ken
Reply all
Reply to author
Forward
0 new messages