NMixture model with missing data in detectability covariate.....sort of.

91 views
Skip to first unread message

Brian Battaile - NOAA Affiliate

unread,
Aug 1, 2022, 9:02:31 PM8/1/22
to hmecology: Hierarchical Modeling in Ecology
Hi All,

I'm doing an NMixture model ala. Marc Key et al. Bayesian Population Analysis Using WinBUGS: A Hierarchical Perspective.  I'm basically following the chapter 12.3.3 method.  I have data at 69 locations but between 2 and 7 replicates at each area, so it's pretty ragged.  Alone this does not seem to be an issue, however I would like to add a detectability covariate, X1.  I have a value for X1 for ALL of my count replicates at ALL areas, no missing data in that sense, but there are still NA's in the X1 input matrix [69,7] where replicates for an area are less than 7.  Just to be clear, I have NA's in the exact same places in my X1 matrix and my count data input matrix, no more, no less.  So I end up with an error message saying my covariate "is partially observed and partly missing".  I think it's kinda funny JAGS can handle missing values in the count data, but not in the covariate.  Can anyone point me towards a potential solution?

Thanks,
Bian

Alejandro de la Fuente Piñero

unread,
Aug 1, 2022, 9:37:15 PM8/1/22
to hmecology: Hierarchical Modeling in Ecology
Hi Brian,

If I understood correctly, when you are constructing a symmetrical matrix to store your data, you need to include missing surveys as NA in those sites with less than seven replicates. Following some examples in the AHM books, I noticed that Kery and Royle inputted the missing values in detection as zeroes after standardising the covariate (centred at 0). They called it "Innocuous mean imputation". That way you would be assuming that the detection covariates during the missing surveys just follow the average conditions. Would that make sense for your case?

Check this example from the AHM vol.2 (it can be run in R, you just need to install the AHMbook package): 

I hope that helps.

All the best,

Alejandro


Marc Kery

unread,
Aug 2, 2022, 6:02:32 AM8/2/22
to Alejandro de la Fuente Piñero, hmecology: Hierarchical Modeling in Ecology
Hi there,

I agree that the simplest solution is to just fill the entire covariate array with some non-NA content, e.g., zeroes. This is innocuous since it will not affect the estimates (there will never be any corresponding non-NA elements in the response so this will not contribute to the likelihood).

The only cost of this approach is that the algorithm will then estimate the missing responses associated with every value of the covariate that is originally missing. If your data set is not large, then this cost will be minor. If your data set is large or very large then you should avoid this wasteful updating, e.g., by putting all data in vectors and getting rid of any NAs; see Section 4.10.2 in the AHM2 book (and you don't absolutely have to buy it, but you can look up the code on the website).

Best regards  ---- Marc


From: hmec...@googlegroups.com <hmec...@googlegroups.com> on behalf of Alejandro de la Fuente Piñero <alejandrof...@gmail.com>
Sent: Tuesday, August 2, 2022 03:37
To: hmecology: Hierarchical Modeling in Ecology <hmec...@googlegroups.com>
Subject: Re: NMixture model with missing data in detectability covariate.....sort of.
 
--
*** Three hierarchical modeling email lists ***
(1) unmarked: for questions specific to the R package unmarked
(2) SCR: for design and Bayesian or non-bayesian analysis of spatial capture-recapture
(3) HMecology (this list): for everything else, especially material covered in the books by Royle & Dorazio (2008), Kéry & Schaub (2012), Kéry & Royle (2016, 2021) and Schaub & Kéry (2022)
---
You received this message because you are subscribed to the Google Groups "hmecology: Hierarchical Modeling in Ecology" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hmecology+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/hmecology/fe199eb2-e786-4ccb-a13f-d43be4a4ce5dn%40googlegroups.com.

Brian Battaile - NOAA Affiliate

unread,
Aug 2, 2022, 1:46:33 PM8/2/22
to hmecology: Hierarchical Modeling in Ecology
Hi all,

So it was suggested to me by a couple members to simply loop through the ragged array, and avoid the NA's all together, and it took 5min to set up so easy peasy, right?  

The following example code to execute this is care of Thomas Riecke
visits <- c(7, 3, 2, 7, ... , 7)
for (i in 1:n.sites){
     N[i] ~ dpois(mu)
  for (j in 1:visits[i]){
     logit(p[i,j]) = beta0 + beta1 * x1[i,j]
     y[i,j] ~ dbin(p[i,j], N[i])
}}

Funny thing though,  JAGS still crashed and admonished me for being "unable to resolve the following parameters" (and listed all the p[i,j] where the NA's are in the data matrix) that I was no longer looping over.  LOL!  

I did do some testing with code that previously worked ( basically replace logit(p[i,j]) = beta0 + beta1 * x1[i,j] from above with logit(p[i,j])~dnorm(beta, tau.p) )and got a similar response looping over the ragged array, which leads me to believe that internally JAGS keeps track of things in non-ragged arrays so this solution is not possible.  Does that make sense to people with more experience?

This ragged array loop is conceptually the same solution as Marcs vectorization suggestion (to remove/ignore the NA's), but a different implementation that JAGS can handle?

   

Marc and Alejandro,
Going to work on your suggestion to replace NA's with 0's now.  My data set is small so this may be the path of least resistance.


Thank you!
Brian

Marc Kery

unread,
Aug 2, 2022, 1:51:39 PM8/2/22
to hmecology: Hierarchical Modeling in Ecology
Dear Brian,

I agree, this is another solution to this problem. And it should work of course ... are you sure there are not any missing covariate values at some position of the ragged array that IS evaluated for the likelihood ? And, well, while you try to debug this code,  you could always try to run the "vectorized" version and see whether that goes.

Best regards --- Marc






From: 'Brian Battaile - NOAA Affiliate' via hmecology: Hierarchical Modeling in Ecology <hmec...@googlegroups.com>
Sent: Tuesday, August 2, 2022 19:46

Brian Battaile - NOAA Affiliate

unread,
Aug 3, 2022, 6:52:33 PM8/3/22
to Marc Kery, hmecology: Hierarchical Modeling in Ecology
Dear All,

The ragged array solution has worked, but I needed to add some code
and change some code with respect to the summary fit statistics
described on page 406 of Bayesian Population Analysis Using WinBUGS.
I needed to be specific when summarizing over the ragged array
otherwise JAGS would assume the array was not ragged and wonder where
the missing values were.

For instance, within the j loop, instead of
ik.p[i] <- mean(p[i,])
I would have
ik.p[i]<-mean(p[i,1:visits[i])

in the j loop I add
fiti[i]<-sum(E[i,1:visits[i])
fit.newi<-sum(E.new[i,1:visits[i])

and then outside the loops replace
fit<-sum(E[,])
with
fit<-sum(fiti)

and
fit.new<-sum(E.new[,])
with
fit.new<-sum(fit.newi)

Brian
> https://groups.google.com/d/msgid/hmecology/fe199eb2-e786-4ccb-a13f-d43be4a4ce5dn%40googlegroups.com<https://groups.google.com/d/msgid/hmecology/fe199eb2-e786-4ccb-a13f-d43be4a4ce5dn%40googlegroups.com?utm_medium=email&utm_source=footer>.
>
> --
> *** Three hierarchical modeling email lists ***
> (1) unmarked: for questions specific to the R package unmarked
> (2) SCR: for design and Bayesian or non-bayesian analysis of spatial
> capture-recapture
> (3) HMecology (this list): for everything else, especially material covered
> in the books by Royle & Dorazio (2008), Kéry & Schaub (2012), Kéry & Royle
> (2016, 2021) and Schaub & Kéry (2022)
> ---
> You received this message because you are subscribed to the Google Groups
> "hmecology: Hierarchical Modeling in Ecology" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to
> hmecology+...@googlegroups.com<mailto:hmecology+...@googlegroups.com>.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/hmecology/c78e58ea-12dc-4b16-b6ad-778af329dd0dn%40googlegroups.com<https://groups.google.com/d/msgid/hmecology/c78e58ea-12dc-4b16-b6ad-778af329dd0dn%40googlegroups.com?utm_medium=email&utm_source=footer>.
>
> --
> *** Three hierarchical modeling email lists ***
> (1) unmarked: for questions specific to the R package unmarked
> (2) SCR: for design and Bayesian or non-bayesian analysis of spatial
> capture-recapture
> (3) HMecology (this list): for everything else, especially material covered
> in the books by Royle & Dorazio (2008), Kéry & Schaub (2012), Kéry & Royle
> (2016, 2021) and Schaub & Kéry (2022)
> ---
> You received this message because you are subscribed to a topic in the
> Google Groups "hmecology: Hierarchical Modeling in Ecology" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/hmecology/3fVt2xvLcmw/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> hmecology+...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/hmecology/ZR0P278MB0869C8C8BC7A294B4BAEC70CEB9D9%40ZR0P278MB0869.CHEP278.PROD.OUTLOOK.COM.
>

Thomas Riecke

unread,
Aug 5, 2022, 11:18:27 AM8/5/22
to Marc Kery, hmecology: Hierarchical Modeling in Ecology
Hi Brian,

I attached a working example showing some nested indexing 'tricks.' I'm not sure why you're still seeing errors, but following this as a guide may be helpful.

The first example deals with 'left-justified' ragged arrays (so the data look like: 5 7 3 NA NA NA NA if a site was only visited 3 times). 

The second (starting line 125) deals with missing data. i.e., the same data can look like NA 5 NA 7 NA 3 NA or 5 7 3 NA NA NA NA.

Best wishes,

Thomas



--
Thomas Riecke (he/him)
postdoc
Swiss Ornithological Institute

example_script_N_mix_ragged.R
Reply all
Reply to author
Forward
0 new messages