posterior distribution with no variation (same value sampled repeatedly)

220 views
Skip to first unread message

Enrico Ryunosuke Crema

unread,
Oct 25, 2021, 9:19:47 AM10/25/21
to nimble-users

Dear All,

I am trying to fit a quantile regression model with measurement error and I occasionally end up with no variation in the posteriors.  I provide a reproducible example below, but there are a couple of things to bear in mind that make everything slightly more complicated.

  • The data consists of radiocarbon ages and this requires a calibration process. In practical terms observed data are in C14ages while the model is in calendar time. This conversion is done by using nimbleCarbon’s interpLin(), which is basically the equivalent of JAGS’ interp.lin() function.

  • The calibration however is bound to calendar dates between 1 and 55000. This means that if the sampled values of theta are outside this range the model crashes (it generates a  “*** caught segfault *** address 0x0, cause 'invalid permissions'” error that tracebacks at  “.Call("CALL_MCMC_run", niter, reset, resetMV, time, progressBar,     nburnin, thin, thin2, .basePtr)”. This makes sense as any value of theta outside that range would produce an NA.

  • To address the problem above I truncated the Asymmetric Laplace Distribution to be between 1 and 55000. I’ve run some simulations and the censoring seems to work fine, and the model runs without crashing. However, as shown below for results1 sometimes the posterior of beta (the slope parameter) does not seem to change at all.

  • The problem above does not happen all the time. In the example below using a seed equal to 123 causes the problem but everything is fine when it is set to 456 (results2). I also noticed that changing the truncation from T(...,1,55000) to T(...,500,55000) seem to solve the problem (results3).

  • The solution above works, but I really don't know why. And whilst in this particular dataset, I can justify the truncation, there are other situations where that might not be the case...

Thanks in advance!

All Best,
Enrico

#### Code to replicate issue ####

library(nimble)

library(nimbleCarbon) #using version 0.2.1 from Github


#generate observed data

dat <- list()

dat$cra <- c(2022,2085,2029,1610,2518,1260,2103,2562,1118,2010,1682,2340,1950,2185,2530,1124,2320,1806,1989,2216,2390,1990,2250,2120,2494,1845,1860,2020,2145,2077,2140,2060,2014,2490,2060,2165,1311,2315,1850,1917,1138,1490,2220,2465,2010,1226,1251,2710,1540,1316,1350,1167,1824,2125,2165,1605,2093,1750,1275,2327,1780,1193,2140,2190,2540,2127,1348,1812,1180,1879,1920,1212,2210,2098,1728,1311,2450,1160,2200,1090,2274,2515,1255,1586,1726,2087,1940,1130,2450,1893,1827,1962,1036,1530,1270,1174,1150,1820,2260,1526,2000,1293,2235,1737,1149,2680,1530,2091,1093,2107,2540,1915,1334,1890,1335,1240,2145,1144,2660,2150,1450,1985,1876,1895,1595)

dat$cra_error <- c(41,20,21,50,18,23,23,30,34,22,33,50,40,25,50,20,40,28,20,22,40,40,30,40,29,25,25,30,35,20,40,40,19,50,40,20,19,35,115,20,20,40,40,25,20,32,27,24,15,21,18,25,18,18,20,25,20,40,37,30,40,22,17,40,30,20,29,20,23,23,35,40,30,36,19,20,20,40,40,30,20,50,22,19,19,23,17,30,90,18,21,27,26,40,40,17,30,40,30,17,40,25,20,38,19,80,120,25,37,20,30,30,26,30,41,24,35,18,40,40,15,35,19,20,40)


#define constants

data(intcal20)

constants <- list()

constants$N.dates <- 125

constants$dist_org <- c(429.320076767307,45.4500564582831,322.671537712194,905.579074066578,428.955334297091,818.130733835863,57.7551514074897,553.348924893019,914.355776343483,736.956879213358,1004.51058778946,95.876702405384,732.493279501873,157.289326517452,871.971052575082,1237.45900132812,872.503064161858,535.775426836309,915.108590947556,95.8211622306369,102.086453653302,197.682951730513,677.922286125642,914.056093886234,818.975691751595,838.390437714983,689.016859729808,55.3048547352654,729.186022635382,61.2568016655655,41.5130043675428,205.205115147396,553.625036725479,553.54017699813,202.199807423367,1233.99108085542,51.1902351867664,363.779720148363,952.481106113845,910.131152495908,1228.91046941717,101.169157783038,678.627746631143,499.914492936387,1020.90873495549,913.185627614995,946.22900586691,6.2894926602453,204.059528496027,915.082230490806,1265.78072750436,1107.45046168738,617.85042958057,870.28488531245,202.406778923027,60.1156852481389,60.232546685893,1079.93419454084,677.99634554784,463.843280633457,198.430697242824,907.942698419647,828.953172155422,214.662038425399,428.127805544043,48.9395235742186,760.285999030207,746.231669093426,1210.0884377714,753.597599067191,924.716574543807,101.51356196022,86.4279270847293,63.4579343200094,796.406791834446,242.18764532981,37.410013462252,1252.69672506524,663.900117837543,1185.26101155236,47.647204698256,47.9006974290247,223.263611281168,52.4278202987916,191.601773473245,836.561942440207,835.840409069826,1118.238027713,56.6483553793198,794.047571185558,925.420851551298,893.640899763591,1224.85495981368,1131.24029478372,1090.34417141112,117.115887593255,882.472759424415,202.528497833281,666.824642408286,80.7364858881542,883.054010765992,1178.1979567042,160.491815196279,351.88173005541,904.807019281615,0.400044992340342,1258.7330890533,838.237650541803,1002.86553484861,216.932967388995,543.263273124691,908.81524695607,690.767375579083,854.354771266472,986.591883294761,1043.1818154292,728.731912733023,1240.16146703068,48.0689767529771,437.230783706002,88.0740363141035,405.756535991697,827.838111734014,648.911159036423,72.5765429000875)

constants$calBP <- intcal20$CalBP

constants$C14BP <- intcal20$C14Age

constants$C14err <- intcal20$C14Age.sigma

constants$tau <- 0.99

#define sensible initialisation for cra

theta.init <- c(1960,2043,1965,1478,2585,1223,2064,2718,1016,1950,1573,2365,1875,2236,2595,1014,2339,1703,1924,2227,2426,1922,2227,2086,2585,1749,1770,1958,2124,2035,2117,2018,1958,2574,2018,2161,1229,2336,1763,1832,1020,1365,2227,2579,1952,1145,1213,2807,1400,1248,1288,1083,1723,2088,2161,1473,2052,1638,1218,2345,1658,1111,2119,2215,2623,2092,1281,1712,1103,1783,1835,1132,2229,2061,1613,1232,2513,1068,2223,993,2314,2587,1221,1467,1611,2046,1863,1020,2532,1792,1726,1890,942,1404,1214,1101,1047,1721,2231,1390,1934,1226,2222,1626,1028,2804,1432,2051,998,2068,2623,1827,1271,1793,1256,1162,2124,1022,2772,2134,1329,1917,1781,1794,1468)



model <- nimbleCode({

  for (i in 1:N.dates){

    # Model

    mu[i] <- alpha - beta*dist_org[i]

    theta[i] ~ T(dAsymLaplace(mu=mu[i],sigma=sigma,tau=tau),m,M) 

    c14age[i] <- interpLin(z=theta[i], x=calBP[], y=C14BP[]);

    sigmaCurve[i] <- interpLin(z=theta[i], x=calBP[], y=C14err[]);

    sigmaDate[i] <- (cra_error[i]^2+sigmaCurve[i]^2)^(1/2);

    cra[i] ~ dnorm(mean=c14age[i],sd=sigmaDate[i]);

  }

  #priors

  alpha ~ dnorm(3000,sd=200);

  beta ~ dexp(1)

  sigma ~ dexp(0.001)

}) 


# MCMC setup

niter  <- 10000

nburnin  <- 5000

thin  <- 1

# Censor to 1-55000

constants$m <- 1

constants$M <- 55000


# This does not work

seed = 123

set.seed(seed)

inits  <- list(alpha=rnorm(1,3000,200),beta=rexp(1,1),sigma=rexp(1,0.001),theta=theta.init)

model.asymlap <- nimbleModel(model,constants = constants,data=dat,inits=inits)

cModel.asymlap <- compileNimble(model.asymlap)

conf.asymlap <- configureMCMC(model.asymlap)

conf.asymlap$addMonitors('theta')

MCMC.asymlap <- buildMCMC(conf.asymlap)

cMCMC.asymlap <- compileNimble(MCMC.asymlap)

results1 <- runMCMC(cMCMC.asymlap, niter = niter, thin=thin,nburnin = nburnin,samplesAsCodaMCMC = T, setSeed = seed)

range(results1[,'beta']) #no variation in beta!!! all 2.65812. Notice this is different from inits$beta which is 1.329


# This works

seed = 456

set.seed(seed)

inits  <- list(alpha=rnorm(1,3000,200),beta=rexp(1,1),sigma=rexp(1,0.001),theta=theta.init)

model.asymlap <- nimbleModel(model,constants = constants,data=dat,inits=inits)

cModel.asymlap <- compileNimble(model.asymlap)

conf.asymlap <- configureMCMC(model.asymlap)

conf.asymlap$addMonitors('theta')

MCMC.asymlap <- buildMCMC(conf.asymlap)

cMCMC.asymlap <- compileNimble(MCMC.asymlap)

results2 <- runMCMC(cMCMC.asymlap, niter = niter, thin=thin,nburnin = nburnin,samplesAsCodaMCMC = T, setSeed = seed)

range(results2[,'beta']) #beta ranges between 0.27 and 0.70



#This works

seed = 123

constants$m <- 500 #Changing the censoring to 500-55000

set.seed(seed)

inits  <- list(alpha=rnorm(1,3000,200),beta=rexp(1,1),sigma=rexp(1,0.001),theta=theta.init)

model.asymlap <- nimbleModel(model,constants = constants,data=dat,inits=inits)

cModel.asymlap <- compileNimble(model.asymlap)

conf.asymlap <- configureMCMC(model.asymlap)

conf.asymlap$addMonitors('theta')

MCMC.asymlap <- buildMCMC(conf.asymlap)

cMCMC.asymlap <- compileNimble(MCMC.asymlap)

results3 <- runMCMC(cMCMC.asymlap, niter = niter, thin=thin,nburnin = nburnin,samplesAsCodaMCMC = T, setSeed = seed)

range(results3[,'beta']) #beta ranges between 0.18 and 0.75


Enrico Ryunosuke Crema

unread,
Oct 25, 2021, 12:37:46 PM10/25/21
to nimble-users
Just a brief follow-up on this. I now know why the sampler is stuck to a single value. It seems that the log Prob of some theta are positive infinite (see the code below which takes the first output of the MCMC from results1 and feeds this back into the model). What I don't understand is that these theta values are not outside the truncated range, and they when using a non-truncated version of dAsymLaplace() they return non-infinite log-probabilities... Indeed given that mu.post.i[1] is negative, the log-probabilities of positive values of theta should be very small and not large (see the last block with the plot function). I also noticed that if I remove the truncation (or if I set constants$m to -Inf and constants$M to +Inf) the problems goes away and the non-infinite log-probs are returned. Am I missing something obvious?

Enrico

###

alpha.post = results1[1,1]
beta.post = results1[1,2]
sigma.post = results1[1,3]
theta.post = results1[1,4:128]
init.post = list(alpha=alpha.post,beta=beta.post,sigma=sigma.post,theta=theta.post)
model.post <- nimbleModel(model,constants = constants,data=dat,inits=init.post)
model.post$logProb_theta #Some logProb are +Infinite, so the model cannot improve. 
i = which(model.post$logProb_theta==Inf)
theta.post[i] #all values are between 1 and 50000. 
dAsymLaplace(theta.post[i[1]],mu.post.i[1],sigma.post,tau=constants$tau,log=T) #logProb on non-truncated version is not infinite

x = -1000:2000
d = numeric(length(x))
for (j in 1:length(x))
{
  d[j] = dAsymLaplace(x[j],mu.post.i[1],sigma.post,tau=constants$tau,log=T)
}
plot(x,d,type='l',xlab='theta',ylab='log-prob')
abline(v=1,lty=2)






Chris Paciorek

unread,
Oct 27, 2021, 9:10:13 PM10/27/21
to Enrico Ryunosuke Crema, nimble-users
Hi Enrico,

I haven't gone through this completely thoroughly, but here is what I am seeing. I hope that this information will allow you to work out a solution, and I'm happy to discuss this further here.

If you use x ~ T(dsome_density(a,b),m,M), nimble calculates the density as the original density dsome_density(a,b) divided by the probability of a value less than M minus the probability of a value less than m. I.e., we normalize the original density by the probability of being in the interval (m,M).  This is done on the log scale as the log of the original density minus the log of the difference in probabilities, where those probabilities are calculated on the original (non-log) scale. 

In some cases in your problem, as the MCMC is starting to take samples (probably in the first or second iteration) it looks like a value mu[i] is a negative number that is fairly large in magnitude (e.g., -400). In that case, because the density is concentrated at large magnitude negative values, the two probabilities we are subtracting are both numerically equal to 1 and so the difference is 0.  The result is that you have some density divided by 0, which is Infinity, for theta[i]. This then causes all sorts of weird things to happen with whether proposals are accepted.

I don't know that it will fix things in this case, but in general, for numerical accuracy, it's best to calculate the various quantities in your d,p,q functions on the log scale and then only if log=FALSE exponentiate the result, rather than calculating on the original scale and then taking the log if log=TRUE. In this case, given how far 1 and 55000 can be into the right tail, this might still end up giving you probabilities that are exactly 1. 

One other thought is that it seems like a value of mu[i] that is, say, -400 is not a reasonable parameter value (particularly given the sigma and tau values) given that the thetas are positive. So perhaps different initial values for alpha and beta would make sense, or constraints or different priors for alpha and beta. 

On the nimble side, we may want to think harder about making sure that when this happens with T() that it is handled more robustly, perhaps such that the density evaluates to 0 instead of Inf and therefore that proposals that lead to this would be rejected.

-chris

--
You received this message because you are subscribed to the Google Groups "nimble-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/nimble-users/073ac809-2cf6-4a78-84bd-e26365b49e20n%40googlegroups.com.

Enrico R Crema

unread,
Oct 28, 2021, 4:10:44 AM10/28/21
to Chris Paciorek, nimble-users

Hi Chris,

Many thanks for this - it all makes sense. And indeed it happens only with either bad priors or very long chains (I initially discovered this with longer chains of 2 million iterations). I was also playing around and noticed this happens with other distributions as well (see examples below, my concerns are the two examples in the middle). I ended using a work-around by expanding artificially the interpolation range and then adding a dconstraint which fixed my specific problem, but will definitely update my d,p,q functions. Thanks!

All Best,

Enrico

###

library(nimble)
library(truncnorm)
obs.theta <- 14
sim  <-  nimbleCode({theta ~ T(dnorm(mean=0,sd=1),m,M)})
nimbleModel(sim,constant=list(m=-Inf,M=Inf),data=list(theta=obs.theta))$logProb_theta #returns # -98.91894
log(dtruncnorm(obs.theta,mean=0,sd=1)) #returns  -98.91894

#m=10
nimbleModel(sim,constant=list(m=10,M=Inf),data=list(theta=obs.theta))$logProb_theta #returns +Inf
log(dtruncnorm(obs.theta,mean=0,sd=1,a=10)) #returns -Inf

#m=10, M=15
nimbleModel(sim,constant=list(m=10,M=15),data=list(theta=obs.theta))$logProb_theta #returns +Inf
log(dtruncnorm(obs.theta,mean=0,sd=1,a=10,b=15)) #returns -1.609438

#m=15, M=30
nimbleModel(sim,constant=list(m=15,M=30),data=list(theta=obs.theta))$logProb_theta #returns -Inf
log(dtruncnorm(obs.theta,mean=0,sd=1,a=15,b=30)) #returns -Inf

Reply all
Reply to author
Forward
0 new messages