MCEM for Mixture Cure Rate Model

Alt, Ethan Mackenzie

unread,

Feb 23, 2024, 3:40:29 PMFeb 23

to nimble...@googlegroups.com

Hello,

I am having difficulty obtaining reasonable estimates for the MLE of a mixture cure rate model (although the MCMC results look good).

The model is

cure ~ Bernoulli(pcure)

t | cure == 1 = Inf

log t | cure == 0 ~ N(mu, sigma^2)

log c ~ censoring distribution

y = min(t, c)

nu = I(log t <= log c)

If someone experienced the event (i.e., if nu == 1), they cannot have been cured (so the cure indicator is known for these people). If they did not experience the event, then either they were censored or cured, and the cure indicator is latent for these people.

Below is a simple reproducible example of the MCMC, MCEM with box constraints, and unconstrained MCEM. Both MCEM algorithms produce bogus results, while the MCMC is reasonably close to the truth in terms of the posterior mean.

library(nimble)

library(posterior)

library(tidyverse)

#' Complete data likelihood for censored individuals

#'

#' @param x censoring time

#' @param cure whether individidual was cured (`cure == 1`) or uncured (`cure==0`)

#' @param mu mean of normal distribution for uncured portion

#' @param sigma SD of normal distribution for uncured portion

#' @param log whether to return log density

dnormcurecen <- nimbleFunction(

run = function(x = double(0), cure = integer(0), mu = double(0), sigma = double(0), log = integer(0, default = 0)) {

returnType(double(0))

res <- 0

if ( !cure )

res <- pnorm(x, mu, sd = sigma, lower.tail = 0, log.p = 1)

if ( log )

return(res)

return(exp(res))

}

)

## Nimble code for mixture cure rate model

code <- nimbleCode({

sigma ~ dhalfflat()

mu ~ dflat()

pcure ~ dbeta(1, 1)

for (i in 1:nobs){

cure_obs[i] ~ dbern(pcure)

yobs[i] ~ dnorm(mu, sd = sigma)

}

for ( i in 1:ncen ) {

cure_cen[i] ~ dbern(pcure)

ycen[i] ~ dnormcurecen(cure_cen[i], mu, sigma)

}

})

## Generate data from mixture cure model:
## log t ~ normal(2, 1.5^2)

## cure ~ Bernoulli(0.25)

n <- 50

pcure <- 0.25

mu <- 2

sigma <- 1.5

cure <- rbinom(n, 1, pcure)

logt <- rnorm(n, mu, sigma)

logt[cure==1] <- Inf

logc <- rnorm(n, mu + sigma, sigma)

y <- pmin(logt, logc)

event <- logt <= logc

yobs <- y[event]

ycen <- y[!event]

nobs <- length(yobs)

ncen <- length(ycen)

const <- list('nobs' = nobs, 'ncen' = ncen)

data <- list('yobs' = yobs, 'ycen' = ycen, 'cure_obs' = rep(0, nobs))

inits <- list('mu' = 0, 'sigma' = 1, 'pcure' = 0.5, 'cure_cen' = rbinom(ncen, 1, 0.5))

model <- nimbleModel(code, const, data, inits)

cmodel <- compileNimble(model)

## Run MCMC

mcmcConf <- configureMCMC(cmodel)

mcmc <- buildMCMC(mcmcConf)

cmcmc <- compileNimble(mcmc)

sample <- runMCMC(cmcmc, 12000, 2000)

sample %>% summarize_draws()

## MCEM with bounded parameters

latentName <- paste0('cure_cen[1:', ncen, ']')

box <- list( list('sigma', c(0.01, Inf)), list('pcure', c(0.01, 0.99)) ) ## Constraints for the parameters

mcem <- buildMCEM(model = cmodel, latentNodes = latentName, boxConstraints = box)

mle <- mcem$run(initM = 5000)

## MCEM with unbounded parameters

code2 <- nimbleCode({

logsigma ~ dflat()

mu ~ dflat()

logitpcure ~ dflat()

sigma <- exp(logsigma)

pcure <- expit(logitpcure)

for (i in 1:nobs){

cure_obs[i] ~ dbern(pcure)

yobs[i] ~ dnorm(mu, sd = sigma)

}

for ( i in 1:ncen ) {

cure_cen[i] ~ dbern(pcure)

ycen[i] ~ dnormcurecen(cure_cen[i], mu, sigma)

}

})

inits2 <- inits

inits2$logsigma <- log(inits$sigma)

inits2$logitpcure <- logit(inits2$pcure)

model2 <- nimbleModel(code2, constants = const, data = data, inits = inits2)

cmodel2 <- compileNimble(model2)

latentName <- paste0('cure_cen[1:', ncen, ']')

mcem2 <- buildMCEM(model = cmodel2, latentNodes = latentName)

mle2 <- mcem2$run(initM = 5000)

Perry de Valpine

unread,

Feb 27, 2024, 4:40:14 PMFeb 27

to Alt, Ethan Mackenzie, nimble...@googlegroups.com

Dear Ethan,

Thanks for posting this and also the compact reproducible example. That is very helpful.

Unfortunately I can confirm you've revealed a bug in the MCEM algorithm. The problem arises due to a part of the model not involving latent states, which is incorrectly omitted from the maximization step.

We (nimble development team) are not under the impression that nimble's MCEM has been used very much. The last query on nimble-users relating to MCEM was in 2019, and the vast majority of nimble-using publications we see involve solely MCMC. This is a somewhat glaring bug and low usage may explain why it has not surfaced previously. It is disconcerting, and we will fix it soon.

It turns out MCEM is primed for other improvements as well, including better efficiency and use of AD for (potentially much) faster optimization.

Thanks again for the useful report of this.

Perry

--
You received this message because you are subscribed to the Google Groups "nimble-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/nimble-users/BN7PR03MB374687029AC9FB2483D282AEEF552%40BN7PR03MB3746.namprd03.prod.outlook.com.

Perry de Valpine

unread,

Jun 17, 2024, 7:38:01 PM (12 days ago) Jun 17

to Alt, Ethan Mackenzie, nimble...@googlegroups.com

Hi Ethan and all,

I just wanted to circle back to this message and note that the newly released nimble 1.2.0 (see here: https://r-nimble.org/version-1-2-0-of-nimble-released) has a completely revamped MCEM that should be faster (it can use nimble's automatic derivative system if the model is amenable to it) and also not have the problem you ran into.