'if' statement conditional on indexed value of a constant

244 views
Skip to first unread message

Ciar Noble

unread,
Jul 18, 2023, 4:53:54 PM7/18/23
to nimble-users
Hello,

I'm fitting a zero-inflated beta regression, and as such need to specify a different likelihood structure for zero and non-zero observations. Currently, I have structured the likelihood section like so:

#Phi (only varies with year not observation)
  for(y in 1:nYrs){ log(phi[y]) <- a[3] + u[y, 3] }
 
  for(n in 1:N){
   
    #Zi
    yz[n] ~ dbern(zi[n])
    logit(zi[n]) <- a[1] + u[yr_n[n], 1] + inprod(bzi[1:Kzi], X[n, 1:Kzi])
   
    if(yz[n] == 0){
     
      #Observed value (non-zero)
      y[n] ~ dbeta(shape[1, n], shape[2, n])
      shape[1, n] <- mu[n] * phi[yr_n[n]]
      shape[2, n] <- (1 - mu[n]) * phi[yr_n[n]]
     
      #Mu
      logit(mu[n]) <- a[2] + u[yr_n[n], 2] + inprod(bmu[1:Kmu], Z[n, 1:Kmu]) 
     
    }
  }

Here, yz is a dummy binary variable indicating whether (1) or not (0) each observation (y) is equal to zero, and is provided as a vector in the model constants. However, when I try to compile the model, I get the error I'd expect when tring to apply an if statement to an estimated parameter:

Condition must be able to be evaluated based on values in 'constants'

I presume this is because part of the condition is not included in the constants (i.e., the index, n). Is there a way to make this work, I feel like there should be?

One option I've thought of is to create another vector including the index of all non-zero observations, to which I can compare n, which in R would be:

if(n %in% non_zeros){...}

But obviously there is no %in% operator in NIMBLE - is there a NIMBLE alternative to %in%?

Of course, I think the recommended solution is probably to create a nimbleFunction. I've had a go at this:

zibr_likelihood <- nimbleFunction(
  run = function(y = double(0), yz = double(0), u = double(1), phi = double(0),
                 s = double(0), X = double(1), Z = double(1)) {
   
    if(!yz){
     
      #Zi
      yz ~ dbern(zi)
      logit(zi) <- a[1] + u[1] + inprod(bzi, X)
     
      #Observed value (non-zero)
      y ~ dbeta(shape[1], shape[2])
      shape[1] <- mu * phi
      shape[2] <- (1 - mu) * phi
     
      #Mu
      logit(mu) <- a[2] + u[2] + inprod(bmu, Z) + s
     
    } else {
     
      #Zi
      yz ~ dbern(zi)
      logit(zi) <- a[1] + u[1] + inprod(bzi, X)
     
    }
   
  })

However, I think due to the fact some of the parameters of my model are only included in the likelihood (i.e., the section in the function) I get an error when compiling:

Defining model
  [Note] 'yz' is provided in 'constants' but not used in the model code and is being ignored.
Error in processBUGScode(recurseCode, nextContextID, lineNumber = lineNumber,  :
  Error: zibr_likelihood not allowed in BUGS code in zibr_likelihood(y = y[n], yz = y[n], u = u[yr_n[n], ], phi = phi[yr_n[n]], s = s[yr_n[n], loc_n[n]], X = X[n, 1:Kzi], Z = Z[n, 1:Kmu])

Ideally I'd like to write the likelihood directly within the nimbleModel (i.e., not use a nimbleFunction), although this is purely for aesthetic reasons so I'm perfectly happy for someone to point out the errors in my function.

Thanks very much for any help,

Ciar

Perry de Valpine

unread,
Jul 18, 2023, 5:13:36 PM7/18/23
to Ciar Noble, nimble-users
Hi Ciar,

Thanks for posting the question. I am not sure I completely follow what you want, but let me see if I can say something helpful anyway.

It is not allowed (does not make sense) to have constants follow a distribution.  Also the definition-time if-then-else can't change for each value in the for loop.

Are the yz[i] elements supposed to be latent states sampled by MCMC (but then they can't be constants, but maybe you were just trying things)? 

If you need different model declarations for different elements of y (known in advance), nested indexing would be a way to do that:

for(i in 1:n_zeros) {
  y[i_zero[i] ] ~ dbeta(...)
}
for(i in 1:n_ones) {
  y[i_one[i] ] ~ dbeta(...)
}

where you would set up the index vector i_zero and i_one and provide them as constants.

If you don't know which is which in advance because it depends on a latent state such as yz[i], you could write a user-defined distribution (but maybe you can clarify more what you need if that is the case).  In simple cases such as zero-inflated Bernoulli, you can multiply the Bernoulli probability by the indicator variable, which sets it to zero if the indicator is zero, but that is a special case of zero-inflation.

HTH
Perry


--
You received this message because you are subscribed to the Google Groups "nimble-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/nimble-users/44c152bc-1e5b-4fcd-abaa-aa8f9ad35b57n%40googlegroups.com.

Ciar Noble

unread,
Jul 18, 2023, 6:03:42 PM7/18/23
to nimble-users
Hi Perry,

Thanks for the quick response. Sorry I should have been clearer: yz is not a latent state, it is simply the result of ifelse(y == 0, 1, 0). So, it's a constant, known parameter, which simply indicates whether each of my observations were 0s. I'm modelling a continuous proportion ( y ∈[0,1) )  with a reasonably high number of 0 observations, and I'm particularly interested in the effects of covariates on the probability of obtaining a 0, hence the seperate zi component. Terming this 'zero-inflated' did seem slightly counter-intuitive to me, as I'm not necessarily trying to account for the disproportionate density at 0, rather I'm treating zero observations as arising from a distinct process to non-zero observations. However, 'zero-inflated beta regression' seems to be the established name for this in the literature. 

Anyway, your index vector idea is exactly what I'm looking for, thanks very much. 

Also, I now realise representing the response variable as y, and using y as an index for year in the previous 'for' loop could lead to some confusion  - I'll change that.

Best wishes,

Ciar

Ciar Noble

unread,
Jul 18, 2023, 6:24:27 PM7/18/23
to nimble-users
It's also just clicked as to what you were saying about modelling yz despite specifying it as a constant, I've shifted it to the data.

Perry de Valpine

unread,
Jul 18, 2023, 6:38:06 PM7/18/23
to Ciar Noble, nimble-users
That all makes sense.  Good luck with it!

Reply all
Reply to author
Forward
0 new messages