Calculating WAIC for hierarchical models

1,097 views
Skip to first unread message

Natasha Karenyi

unread,
May 9, 2017, 3:45:17 AM5/9/17
to hmecology: Hierarchical Modeling in Ecology
Good day all,

I have a question regarding how I would calculate the WAIC for a hierarchical model.

I estimated species richness at each of my sites using a capture-recapture population model. and then fed the means and SE of that analysis into a number of hierarchical linear models. For model selection I used DIC since I ran my analyses in JAGS. I would now like to calculate the WAIC as well since this model selection criterion is supposed to be better for Bayesian analyses. Can anyone advise me on how to do this? The only information I could find was that I would have to re-run my models in STAN then use 'waic' in the 'loo' package. Is this the only way to calculate WAIC for hierarchical models?

Looking forward to hearing from you.
Natasha

John Clare

unread,
May 9, 2017, 12:20:18 PM5/9/17
to hmecology: Hierarchical Modeling in Ecology
Hi Natasha,

There is some discussion on this a bit further down the board, but there are a couple of ways to do it. Among others, Gelman has a paper  (http://www.stat.columbia.edu/~gelman/research/published/waic_understand3.pdf) that describes the derivation of WAIC: formulas 5 (the lpd of the model) and 12 (the penalty, based upon variance in the lpd) are most germane. Although not terribly well documented, JAGS has a couple functions to calculate the log probability of specific observations (i.e., loglike[i, j, k]<-logdensity.bern(p[i, j, k]).  An alternative is to do so in r after the fact (loglike[i, j, k, sim]<-dbinom(1, size, prob=p[i, j, k, sim]). 

Cheers,

John

Andrea Goijman

unread,
May 12, 2017, 11:27:18 AM5/12/17
to John Clare, hmecology: Hierarchical Modeling in Ecology
Hi Natasha, 

I'm dealing with the same problem here for Multi-species Occupancy models.

John, in the alternative you mention for calculating WAIC in R after running the model... How would you combine the likelihood calculation if it comes from two probabilities, say p and psi?
I understand in this one, you can only calculate it for p, right?

(loglike[i, j, k, sim]<-dbinom(1, size, prob=p[i, j, k, sim]). 

Thanks!

Andrea

--
*** Three hierarchical modeling email lists ***
(1) unmarked: for questions specific to the R package unmarked
(2) SCR: for design and Bayesian or non-bayesian analysis of spatial capture-recapture
(3) HMecology (this list): for everything else, especially material covered in the books by Royle & Dorazio (2008), Kéry & Schaub (2012), Kéry & Royle (2015)
---
You received this message because you are subscribed to the Google Groups "hmecology: Hierarchical Modeling in Ecology" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hmecology+unsubscribe@googlegroups.com.
To post to this group, send email to hmec...@googlegroups.com.
Visit this group at https://groups.google.com/group/hmecology.
To view this discussion on the web visit https://groups.google.com/d/msgid/hmecology/786f9aa7-0c4f-46c5-a65b-1462f0552c88%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.



--
-----
Andrea Paula Goijman, Ph.D
Grupo Ecología, Biodiversidad y Gestión Ambiental en Agroecosistemas
Instituto de Recursos Biológicos
CIRN - INTA Castelar, Argentina

De Los Reseros y Las Cabañas S/N
HB1712WAA Hurlingham 
54 11 4621 0840/ 4621 1819 (int. 4)
Cel. 54 11 6293 6214
US Cell +706.389.4356
Skype: andrea.goijman

John Clare

unread,
May 12, 2017, 1:01:03 PM5/12/17
to hmecology: Hierarchical Modeling in Ecology, john....@maine.edu, goijman...@inta.gob.ar
Hi Andrea (and Natasha),

Sorry, that was a terribly lazy description: don't calculate it as (loglike[i, j, k, iteration]<-dbinom(y[i, j, k], size, prob=p[i, j, k, iteration]).  

Instead, a better way might be loglike[i, j, k, iteration]<-dbinom(y[i, j, k], size, prob=p[i, j, k, iteration]*psi[i, j, iteration], log=T), which corresponds to the probability of detection given presence; for sites in which there are no detections at all, another part of the likelihood corresponding to the probability of not being present should also be incorporated...so maybe the overall likelihood is something like loglike[i, j, k, iteration]<-dbinom(y[i, j, k], size, prob=p[i, j, k, iteration]*psi[i, j, iteration]+indicator_all_yij_are_zero[i,j]*(1-psi[i, j, k]), log=T)...where the indicator is 1 if all y[i, j,] are 0, and 0 if otherwise.  Kristin Broms et al.'s recent paper regarding MSOM-selection (Appendix 1) lays out the likelihood; I believe Royle and Dorazio 2008 has some code (in the occupancy chapter).

Cheers,

John




On Friday, May 12, 2017 at 10:27:18 AM UTC-5, Andrea Goijman wrote:
Hi Natasha, 

I'm dealing with the same problem here for Multi-species Occupancy models.

John, in the alternative you mention for calculating WAIC in R after running the model... How would you combine the likelihood calculation if it comes from two probabilities, say p and psi?
I understand in this one, you can only calculate it for p, right?

(loglike[i, j, k, sim]<-dbinom(1, size, prob=p[i, j, k, sim]). 

Thanks!

Andrea
On Tue, May 9, 2017 at 1:20 PM, John Clare <john....@maine.edu> wrote:
Hi Natasha,

There is some discussion on this a bit further down the board, but there are a couple of ways to do it. Among others, Gelman has a paper  (http://www.stat.columbia.edu/~gelman/research/published/waic_understand3.pdf) that describes the derivation of WAIC: formulas 5 (the lpd of the model) and 12 (the penalty, based upon variance in the lpd) are most germane. Although not terribly well documented, JAGS has a couple functions to calculate the log probability of specific observations (i.e., loglike[i, j, k]<-logdensity.bern(p[i, j, k]).  An alternative is to do so in r after the fact (loglike[i, j, k, sim]<-dbinom(1, size, prob=p[i, j, k, sim]). 

Cheers,

John

--
*** Three hierarchical modeling email lists ***
(1) unmarked: for questions specific to the R package unmarked
(2) SCR: for design and Bayesian or non-bayesian analysis of spatial capture-recapture
(3) HMecology (this list): for everything else, especially material covered in the books by Royle & Dorazio (2008), Kéry & Schaub (2012), Kéry & Royle (2015)
---
You received this message because you are subscribed to the Google Groups "hmecology: Hierarchical Modeling in Ecology" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hmecology+...@googlegroups.com.

To post to this group, send email to hmec...@googlegroups.com.
Visit this group at https://groups.google.com/group/hmecology.

John Clare

unread,
May 12, 2017, 2:00:00 PM5/12/17
to hmecology: Hierarchical Modeling in Ecology, john....@maine.edu, goijman...@inta.gob.ar
Or rather, something like loglike[i, j, k, iteration]<-log(dbinom(y[i, j, k], size = 1, prob=p[i, j, k, iteration])*psi[i, j, iteration]+indicator_all_yij_are_zero[i,j]*(1-psi[i, j, iteration])). Sorry, Friday.

Michelle E. Thompson

unread,
Jun 10, 2017, 11:34:21 PM6/10/17
to hmecology: Hierarchical Modeling in Ecology, john....@maine.edu, goijman...@inta.gob.ar
Hi,

I am also trying to calculate WAIC for (single species) occupancy models. Once I have the likelihood calculated, how to I use the output to calculate WAIC?

Thanks!
Michelle

John Clare

unread,
Jun 11, 2017, 10:50:39 PM6/11/17
to hmecology: Hierarchical Modeling in Ecology
See the Gelman paper in the first reply (eq's. 5 and 12).  This paper (http://www.stat.columbia.edu/~gelman/research/unpublished/waic_stan.pdf) also goes over some additional detail (in Stan, but similar things could be done in JAGS, etc.).

I guess it's worth noting as a general point that despite a lot of the initial optimism regarding WAIC (e.g., the Hooten and Hobbes monograph), follow up work (such as the paper linked to above, Kristin Broms paper on model selection for community models, etc.) should temper expectations.  Just my opinion, but barring taking a thorough out of sample (like, really out of sample) validation approach, it's unclear whether WAIC has better predictive performance than approaches more theoretically similar to BIC/Bayesian Model averaging (variable selection processes, etc.), and the latter don't necessarily require repeatedly fitting models. 
Reply all
Reply to author
Forward
0 new messages