Covariate-dependent DPM using NIMBLE


Meng Qiu

May 28, 2023, 3:13:00 PM
to nimble-users
Dear all,

Good afternoon!

I'm currently working on applying a Dirichlet process mixture (DPM) model to a real dataset that includes several informative covariates. I am thinking about incorporating covariates into the DPM, i.e., allowing the weights in the stick-breaking construction to depend on covariates. 

Do Nimble's samplers support such functionality? If Nimble does offer this capability, I would appreciate it if you could provide me with a simple demonstration. 

Thank you very much for your time and assistance!


Best,
Chris


PierGianLuca

May 30, 2023, 3:34:50 AM
to Meng Qiu, nimble-users
Hi Chris,

This is not a proper answer, as I do not know whether the stick-breaking weights can be made to depend on covariates.

But I just wanted to point out, as a workaround if necessary, that a Dirichlet-process mixture with concentration parameter α can be approximated arbitrarily closely by a finite Dirichlet mixture with a large number N of components and concentration parameter α/N. According to my experience from the past two years this approach gives some extra flexibility and – at least in my specific applications – mixes better. In my case N≈64 has worked perfectly so far; you can monitor how many of the N components the data actually use and increase N if necessary.

All this can be seamlessly implemented in Nimble!

See
http://www3.stat.sinica.edu.tw/statistica/J12n3/j12n316/j12n316.htm

and § 4 here:
https://doi.org/10.2307/3315951
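For concreteness, here is one way the finite-Dirichlet approximation might look as NIMBLE model code. This is a minimal sketch with a simple Gaussian kernel; the names (`approxDPM`, `Ncomp`, `a`) and the priors are illustrative, not from the thread:

```r
library(nimble)

# Sketch: finite Dirichlet mixture approximating a DPM with concentration alpha.
# Pass a = rep(alpha/Ncomp, Ncomp) as a constant; as Ncomp grows, the finite
# mixture approaches the DPM.
approxDPM <- nimbleCode({
  w[1:Ncomp] ~ ddirch(a[1:Ncomp])    # symmetric Dirichlet weights, a[k] = alpha/Ncomp
  for(k in 1:Ncomp) {
    mu[k] ~ dnorm(0, sd = 10)        # illustrative prior on component means
  }
  for(i in 1:N) {
    z[i] ~ dcat(w[1:Ncomp])          # component membership; monitor z to see how
                                     # many of the Ncomp components the data use
    y[i] ~ dnorm(mu[z[i]], sd = 1)   # simple Gaussian kernel for illustration
  }
})
```

Monitoring `z` and counting the distinct occupied components after burn-in is one way to check whether `Ncomp` is large enough.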

Cheers,
Luca

Meng Qiu

May 30, 2023, 12:11:44 PM
to PierGianLuca, nimble-users
Hi Luca,

Thank you very much for introducing me to this approach and the references. I'm wondering how one can incorporate covariates into this approach. Should we treat it as a finite mixture model? 

I've been doing some digging, and it looks like the covariate-dependent DPM is often referred to as the dependent DPM. I came across a paper by Chung and Dunson (2009) that proposed a probit stick-breaking process, whose input can be a function of the covariates. This appears to be feasible in Nimble, but I am not entirely sure. 

Sorry for bombarding you with these questions, but I'm still trying to wrap my head around all of this. 


Thanks again,
Chris
--

Meng (Chris) Qiu

Graduate Student, Quantitative Psychology

Statistical Methods for Real Data Lab|University of Notre Dame

Chris Paciorek

Jun 5, 2023, 9:32:46 PM
to Meng Qiu, nimble-users
Hi Chris,

You should be able to use the stick-breaking approach to do this (see here in our manual), where you'll need to specify how to construct the elements of 'v' (using the notation in the manual section) based on covariates.
I don't have an example of this, but I think it would be straightforward to translate equation 5 of Chung and Dunson into NIMBLE model code. We do have a Gaussian CDF function you can use in model code.
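As a rough illustration of that translation, the following sketch builds probit stick-breaking weights from a single covariate using NIMBLE's `iprobit()` (the standard normal CDF) together with `stick_breaking()`. The linear predictor in `x`, the names `b0`/`b1`, the truncation `H`, and the Gaussian kernel are all assumptions for illustration, not code from the thread:

```r
library(nimble)

# Sketch: probit stick-breaking weights depending on a covariate x,
# in the spirit of Chung & Dunson's equation 5.
psbCode <- nimbleCode({
  for(h in 1:(H-1)) {
    b0[h] ~ dnorm(0, sd = 10)
    b1[h] ~ dnorm(0, sd = 10)
  }
  for(h in 1:H) {
    mu[h] ~ dnorm(0, sd = 10)               # illustrative component means
  }
  for(i in 1:N) {
    for(h in 1:(H-1)) {
      # iprobit() is the Gaussian CDF, so each stick fraction lies in (0,1)
      v[i,h] <- iprobit(b0[h] + b1[h]*x[i])
    }
    w[i,1:H] <- stick_breaking(v[i,1:(H-1)]) # covariate-dependent weights
    z[i] ~ dcat(w[i,1:H])
    y[i] ~ dnorm(mu[z[i]], sd = 1)           # simple Gaussian kernel for illustration
  }
})
```

Replacing `iprobit()` with `ilogit()` gives the logistic stick-breaking variant.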

-Chris


Meng Qiu

Jun 5, 2023, 9:48:15 PM
to paci...@stat.berkeley.edu, nimble-users
Hi Chris,

Thank you so much for the information! I totally agree that implementing the probit or logistic stick-breaking process should be straightforward in NIMBLE. I'll share the code with the community once it is completed. 


Best,
Meng

Meng Qiu

Jan 26, 2024, 1:13:25 AM
to nimble-users
Hi Chris,

I implemented one type of dependent Dirichlet process mixture model, namely the logistic stick-breaking process mixture, in NIMBLE. The model works, but it reports warnings about the initialization of the membership variables, e.g., `warning: problem initializing stochastic node z[9]: logProb is -Inf`. I am confused, since this is how I usually initialize membership variables.

I would appreciate your suggestions. Thank you very much. 

# Nimble model
ddpm <- nimbleCode({
  # likelihood
  for(i in 1:N) {
    for(j in 1:J) {
      y[i,j] ~ dcat(ip[z[i], j, 1:C])
    }
    z[i] ~ dcat(w[i,1:H])
  }
  # prior on item parameters
  for(h in 1:H) {
    for(j in 1:J) {
      ip[h, j, 1:C] ~ ddirch(beta[1:C])
    }
  }
  # logistic stick-breaking process
  for(h in 1:(H-1)) {
    b0[h] ~ dnorm(0, tau=1.0E-03)
    b1[h] ~ dnorm(0, tau=1.0E-03)
  }
  for(i in 1:N) {
    for(h in 1:(H-1)) {
      logit(v[i,h]) <- b0[h] + b1[h]*x[i]
    }
    w[i,1:H] <- stick_breaking(v[i,1:(H-1)])
  }
})
# Prepare data
library(poLCA)
data(election)
dat <- election[,-13] # remove outcome variable VOTE
dat <- dat[complete.cases(dat), ]
N <- 200
data_N <- dat[sample(nrow(dat),N),]
data_N <- lapply(data_N, function(x) as.numeric(x))
data_N <- as.data.frame(data_N)
# Run the MCMC
Y <- data_N[,1:12]
X <- data_N[,13]
C <- 4 # number of categories for each item
H <- 10 # truncation
N <- nrow(Y) # sample size
J <- ncol(Y) # number of items
nMCMC <- 10000 # number of MCMC iterations
consts <- list(N=N, J=J, H=H, C=C, beta=rep(1, C))
set.seed(123)
ip <- array(0, dim=c(H, J, C)) # initial values for ip: one set of item parameters per component
for(h in 1:H) {
  for(j in 1:J) {
    ip[h,j,] <- MCMCpack::rdirichlet(1, alpha=rep(1,C))
  }
}
inits <- list(ip=ip, b0=rep(0.5,H-1), b1=rep(0.5,H-1),
              z=sample(1:5, size=consts$N, replace=TRUE))
datList <- list(y=Y, x=X)
model <- nimbleModel(code=ddpm, data=datList, inits=inits, constants=consts)
cmodel <- compileNimble(model)
conf <- configureMCMC(model, monitors=c('z','b0','b1','ip'), print=TRUE)
mcmc <- buildMCMC(conf)
cmcmc <- compileNimble(mcmc, project=model)
samples <- runMCMC(cmcmc, niter=nMCMC, nburnin=1000, thin=1, setSeed=TRUE)

Chris Paciorek

Jan 28, 2024, 1:51:05 PM
to Meng Qiu, nimble-users
Hi Meng,

We have some suggestions on initialization here in the manual.

If you are getting an invalid logProb for `z[9]`, I would look at the values of `w[9,]` and `v[9,]` as that is likely where the problem comes from. It's possible something funny is happening with initialization because of the `logit` on the left-hand side of the declaration for `v[i,h]`.

chris

Meng Qiu

Jan 28, 2024, 2:26:47 PM
to paci...@stat.berkeley.edu, nimble-users
Hi Chris,

Thank you so much for the suggestions! I made several attempts and found that:

1) This initialization issue does not seem to impact the sampling process. I still get the posterior samples of the model parameters. However, it might introduce certain problems of which I am not aware. 
2) After standardizing the covariate X, the warning messages disappeared, and everything returned to normal. 

One more question: Does adding covariates to the data list make any difference compared to adding them to the constants? In my case, both ways work well. 


Best,
Meng
--

Meng (Chris) Qiu (he/him/his)

Ph.D. Candidate, Quantitative Psychology

Statistical Methods for Real Data Lab 
E414 Corbett Family Hall | University of Notre Dame
Notre Dame, IN 46556

Chris Paciorek

Jan 30, 2024, 12:45:24 PM
to Meng Qiu, nimble-users
Hi Meng,

In many cases, NIMBLE's MCMC can recover from a set of inconsistent initial values, but it's generally safest to make sure that the initial values are self-consistent and don't lead to infinities in `model$calculate()`.

Regarding covariates as data vs. constants, it doesn't really matter in most cases, and in fact you can set them via `inits` as well. If you're never going to change them, providing them as constants can make sense. We have a bit of info about this here in the manual.

Regarding the initial values in your particular case, the results below are not exactly what you saw because the random number seed is not the same, but they illustrate what can happen here - a row of the `w` matrix is inconsistent with the value for the corresponding `z`. I believe that is the cause of what you saw.

> model$calculate('z[45]')
[1] -Inf
> model$z[45]
[1] 3
> model$w[45,]
 [1] 1 0 0 0 0 0 0 0 0 0
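One way to avoid this kind of inconsistency is to draw the initial `z` from the initial weights themselves, so no observation starts in a component with zero probability. A base-R sketch (the helper name `sample_z_init` and the toy `w` matrix are illustrative; in practice `w` would be the N x H matrix of initial stick-breaking weights computed from the initial `b0`, `b1`, and `x`):

```r
# Draw one membership per row of w, proportional to that row's weights;
# components with zero weight can never be selected.
sample_z_init <- function(w) {
  apply(w, 1, function(wi) sample.int(length(wi), size = 1, prob = wi))
}

# Toy example: row 1 puts all mass on component 1, so z[1] must be 1.
w <- rbind(c(1, 0, 0), c(0.5, 0.5, 0), c(0.2, 0.3, 0.5))
z <- sample_z_init(w)
all(w[cbind(seq_len(nrow(w)), z)] > 0)  # TRUE: every z lands on positive weight
```

Passing such a `z` in `inits` keeps `model$calculate()` finite at the starting values.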

Meng Qiu

Jan 31, 2024, 12:43:44 PM
to paci...@stat.berkeley.edu, nimble-users
Hi Chris,

Thank you so much for showing me how to diagnose the initialization issue. This is extremely helpful!

I apologize, but I may need to bother you with another question. I am applying this model to a real dataset in which the predictor X also has missing values (NAs). In my model, the X variable is not included in the likelihood and, if I understand correctly, is treated as a hyperparameter of the stick-breaking. Is it possible to impute the NAs of X, perhaps by assigning a sampler to X?


Best,
Meng

Chris Paciorek

Feb 2, 2024, 4:27:41 PM
to Meng Qiu, nimble-users
In general, if X has missing values, then in order for it to be sampled, it must have a prior. In that case, the log posterior density of the model includes the log prior density for X.

I haven't looked back at your model, so I can't say anything regarding the role of X in your model.
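As a sketch of what giving X a prior might look like in the model code (the normal form and the `mu_x`/`sd_x` hyperparameters are assumptions for illustration, not from the thread):

```r
library(nimble)

# Sketch: a prior on the covariate x so that NIMBLE assigns samplers
# to its missing (NA) entries.
code <- nimbleCode({
  mu_x ~ dnorm(0, sd = 10)
  sd_x ~ dunif(0, 10)
  for(i in 1:N) {
    x[i] ~ dnorm(mu_x, sd = sd_x)  # missing x[i] become unobserved nodes
  }
  # ... rest of the stick-breaking model, with x[i] entering the weights ...
})
```

For this to work, x (including the NAs) must be supplied in `data` rather than `constants`, so that NIMBLE treats the missing entries as unobserved stochastic nodes to be sampled.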

Meng Qiu

Feb 2, 2024, 6:29:12 PM
to paci...@stat.berkeley.edu, nimble-users
Hi Chris,

I get your point. Thank you so much!


Best,
Meng