No conjugacy in augmented probit latent space network model (and likely other issues)


Jennifer Kampe

Oct 18, 2023, 4:35:59 PM
to nimble-users
Hi all, 

I'm very new to NIMBLE and I suspect there may be many errors in my approach, so apologies in advance. 

I'm attempting to fit a simple inner product network model in the style of Hoff, Raftery, and Handcock (2002). My standard approach would be a Gibbs sampler for Polya-gamma data-augmented logistic regression, but the Polya-gamma is not in NIMBLE's distribution library, so conjugacy is not detected. I'm attempting probit data augmentation instead, but conjugacy is still not detected, and I'm getting an "Infinite values were detected in model variable: logProb_Y" error for all edges (Y_ij == 1). Alternatively, using nimbleHMC (and dropping the data augmentation), I get reasonable recovery. So it seems I'm not specifying the data augmentation scheme correctly in my NIMBLE model, and I would like to get the conjugate sampler working if possible.

My code is below. 

Thanks, 

Jennifer

#---------------------------------------------- SIMULATE DATA -----------------------------------------#

# Simulate data from an inner product network model
library(mvtnorm) # for rmvnorm()
set.seed(1234)

# Data (and latent space) structure
V <- 100 # Number of vertices
H <- 10 # Dimension of latent space

# Hyperparameters
a1 <- 2.5 # Gamma shrinkage parameter for factor 1
a2 <- 2.5 # Gamma shrinkage parameters for factors 2:H
sd.e <- 0.1 # Gaussian noise added to linear predictor
meanP <- 0.25 # Moderately sparse network
mu0 <- qnorm(meanP) # Prior mean for intercept (qnorm is the probit link)
sd.mu <- 0.1 # Prior sd for intercept: consider probit scale

# Simulate multiplicative gamma shrinkage process
U <- rep(NA,H)
U[1] <- rgamma(1, a1, 1)
U[2:H] <- rgamma(H-1, a2, 1)

Tau <- rep(NA, H)
for(h in 1:H){
  Tau[h] <- prod(U[1:h])
}

# Simulate latent factors
X <- matrix(NA, nrow = V, ncol = H)
for(h in 1:H){
  X[,h] <- rmvnorm(n = 1, mean = rep(0, V), sigma = diag(Tau[h]^(-1), nrow = V))
}

# Simulate intercept
mu <- rnorm(1,mean = mu0, sd = sd.mu) # Normal prior for baseline connection score

# Compute the linear predictor and Z latent connection score
M.Z <- mu + X %*% t(X)
vecE <- rnorm(V*(V-1)/2, 0, sd.e)
E <- matrix(NA, V,V)
E[upper.tri(E)] <- vecE
E[lower.tri(E)] <- t(E)[lower.tri(E)]
diag(E) <- 0
Z <- M.Z + E # Data augmentation approach

# Simulate edges and non-edges
Y <- ifelse(Z > 0, 1, 0)
diag(Y) <- diag(Z) <- 0 # No self-edges
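A quick sanity check on this augmentation scheme (the value `m = 0.05` below is an arbitrary illustrative entry, not from the simulation): since the noise is e ~ N(0, sd.e^2), the implied edge probability is P(Y_ij = 1) = P(M.Z_ij + e > 0) = pnorm(M.Z_ij / sd.e).

```r
# Probit augmentation identity: P(Y = 1) = P(m + e > 0) = pnorm(m / sd.e).
# Monte Carlo check of the identity for a single arbitrary entry m = 0.05,
# with sd.e = 0.1 as in the simulation above.
set.seed(1)
m <- 0.05
e <- rnorm(1e5, mean = 0, sd = 0.1)
mc <- mean(m + e > 0)        # Monte Carlo estimate of P(Y = 1)
exact <- pnorm(m / 0.1)      # closed-form probit probability
```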

#----------------------------------------- DEFINE NIMBLE MODEL---------------------------------------------#
fmCode <- nimbleCode({
 
  # Intercept
  mu ~ dnorm(mu0, sd = sd.mu)
 
  # Shrinkage process
  U[1] ~ dgamma(a1, 1)
  for(h in 2:H){
    U[h] ~ dgamma(a2, 1)
  }

  for(h in 1:H){
    Tau[h] <- prod(U[1:h])
  }

  # Latent factors
  for(h in 1:H){
    for(v in 1:V){
      X[v,h] ~ dnorm(M[v] , sd = sqrt(Tau[h]^(-1)))
    }
  }
 
  # Compute linear predictor
  M.Z[1:V,1:V] <- mu + X[1:V,1:H] %*% t(X[1:V,1:H]) # Multivariate nodes need explicit index ranges

  # Model
  for (i in 2:V){
    for (j in 1:(i-1)){ # Self-edges not allowed
      E[i,j] ~ dnorm(mean = 0, sd = sd.e)
      Z[i,j] <- M.Z[i,j] + E[i,j]
      indZ[i,j] <- getIndicator(Z[i,j]) # Function required to pass indices to if statement
      Y[i,j] ~ dbin(size = 1, prob = indZ[i,j]) 
    }
  }

})


# Function to get indicator
getIndicator <- nimbleFunction(
  run = function(Zij = double(0)) {  
    if(Zij > 0){
      Yij <- 1
    }else{
      Yij <- 0
    }
    return(Yij)
    returnType(double(0))
  })


# Define the constants
mu0 <- qnorm(mean(Y, na.rm = TRUE)) # Reasonable guess for intercept prior mean (qnorm = probit)

fmConsts <- list(V = V,
                 H = H,
                  a1 = a1, a2 = a2,
                  mu0 = mu0, sd.mu = sd.mu,
                  M = rep(0, V), sd.e = sd.e)

# Define the data
fmData <- list(Y = Y)

# Set initialization parameters
fmInits <- list(X = matrix(0, V, H),
                U = 1:H,
                mu = mu0,
                E = matrix(0,V,V),
                Z = ifelse(Y==1, 1, -1),
                indZ = Y)

fmDims <- list(Tau = H, X = c(V, H), E = c(V, V)) # mvCov is not a model variable

# Create NIMBLE model
fm <- nimbleModel(code = fmCode, name = "fm", constants = fmConsts, data = fmData,
                   dimensions = fmDims, inits = fmInits)

# Check conjugacy
configureMCMC(fm, print = TRUE) # no conjugacy

# ------------------------------------ FIT MODEL AND EVALUATE----------------------------------------#
niter <- 2000
nchains <- 2
mcmc.out <- nimbleMCMC(code = fmCode, constants = fmConsts,
                       data = fmData, inits = fmInits,
                       nchains = nchains, niter = niter,
                       summary = TRUE, WAIC = TRUE,
                       monitors = c('mu', 'Z', 'Tau')) 

# Check recovery of the latent connection scores (the monitors include 'Z', not probabilities)
fm.samples <- do.call(rbind, mcmc.out$samples)
z.cols <- grep("^Z\\[", colnames(fm.samples), value = TRUE)
z.post <- colMeans(fm.samples[, z.cols, drop = FALSE])

# Match each monitored node to the simulated truth by parsing the "Z[i, j]" names
idx <- do.call(rbind, lapply(z.cols, function(nm)
  as.integer(strsplit(gsub("Z\\[|\\]| ", "", nm), ",")[[1]])))
plot(z.post, Z[idx])




Chris Paciorek

Oct 23, 2023, 9:22:38 PM
to Jennifer Kampe, nimble-users
Hi Jennifer, a few comments here.

1) For a basic probit data augmentation, you could write something like this:

```
y[i,j] ~ dinterval(z[i,j], 0)
z[i,j] ~ dnorm(beta0 + beta1*x[i,j], 1)
beta0 ~ dnorm(0,1)
beta1 ~ dnorm(0,1)
```
and conjugacy should be detected in nimble for sampling the beta parameters.  `dinterval` is basically a trick to enforce that z[i,j] > 0 if y[i,j] = 1 and z[i,j] <= 0 if y[i,j] = 0. (One could also use `dconstraint`.) See Section 5.2.7 of the manual for more details on `dinterval` and `dconstraint`.
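One practical consequence of the `dinterval` construction (a plain-R sketch, independent of nimble itself): initial values for `z` must fall in the interval implied by `y`, or the model's initial log-probability is `-Inf`.

```r
# Sketch: draw initial z values consistent with binary y, so that
# dinterval(z, 0) has finite log-probability at initialization.
set.seed(1)
y <- rbinom(10, 1, 0.3)
# positive draws where y = 1, non-positive draws where y = 0
z_init <- ifelse(y == 1, abs(rnorm(10)), -abs(rnorm(10)))
```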

2) Without delving into Hoff and Raftery and thinking more carefully about the model, I don't have specific things to suggest in terms of writing your model, but hopefully the above code will help get you started.
I'm happy to iterate here if you have follow-up questions or can say more about which parameters you're expecting to be sampled based on conjugacy. (I'm actually not seeing what you're expecting
to be conjugate.)

3) We may do some work in coming weeks to make a Polya-gamma sampler available (initially in our development branch). I don't want to promise anything, but it's something we've wanted
to set up for a while and have some other projects going on that could make use of it. I can try to remember to let you know if/when we make progress on that. 

4) As far as the logProb_y infinity situation, I'm not immediately seeing what might cause that. One common issue in situations like this is setting incompatible data/initial values for `Y[i,j]` and `Z[i,j]` pairs such that the log density is `-Inf`, but you indicated the message is about getting `Inf`. Despite that, it may relate to model initialization - we have some tips on that here in the manual.

-chris


Jennifer Kampe

Oct 24, 2023, 11:14:05 AM
to nimble-users
Hi Chris, 

Thanks so much. To simplify things, I'm starting with a more basic network regression. I attempted the probit data augmentation approach using your suggested Y, Z construction with a simple network probit regression, and NIMBLE now detects conjugacy (although not for z, which should have a truncated normal full conditional), giving a marginal improvement over the RW sampler.

However, when I add random effects, I can fit the model with the RW sampler, but with the conjugate sampler I get the following warning: "logProb of data node y[26, 14]: logProb is -Inf."

Do you have any suggestions what could be causing the issue? My code is below and I'm happy to share the data via email if there's no obvious error. 


## define the model
glmmCode2 <- nimbleCode({
 
  #beta0 ~ dnorm(0, sd = 1) # Drop intercept
  beta1 ~ dnorm(0, sd = 1)
  sigma_RE ~ dunif(0, 10)
 
  # Model
  for(i in 1:N){ # Random effects for each vertex
    beta2[i] ~ dnorm(0, sd = sigma_RE)
  }
  for (i in 2:N) { # Likelihood
    for (j in 1:(i-1)){
      y[i,j] ~ dinterval(z[i,j], 0)
      z[i,j] ~ dnorm(beta1*x[i,j] + beta2[i] + beta2[j], 1)
    }
  }
})


## constants, data, and initial values
glmmConsts2 <- list(N = nC)

glmmData2 <- list(
  y = A,
  x = outer(reg,reg, FUN = "==")*1 # x_ij = 1(region_i == region_j)
)

glmmInits2 <- list(beta1 = 0, beta2 = rep(0,nC))

glmmModel2 <- nimbleModel(code = glmmCode2, constants = glmmConsts2, data = glmmData2,
                         inits = glmmInits2)

configureMCMC(glmmModel2, print = TRUE)


niter <- 2000
nchains <- 2
mcmc.glmm.out2 <- nimbleMCMC(code = glmmCode2, constants = glmmConsts2,
                       data = glmmData2, inits = glmmInits2,
                       nchains = nchains, niter = niter,
                       summary = TRUE, WAIC = TRUE,
                       monitors = c('beta1', 'beta2'))

Best, 

Jennifer

Chris Paciorek

Oct 24, 2023, 8:33:27 PM
to Jennifer Kampe, nimble-users
Hi Jennifer,

Regarding the truncated normal, we don't have that conjugacy set up. However, if you wanted to put in a bit of time, you could code up a user-defined sampler to do the job, presumably using the inverse CDF method.
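Such a sampler would draw from the full conditional of `z[i,j]`: a normal truncated to (0, Inf) when y = 1 and to (-Inf, 0] when y = 0. A plain-R sketch of the inverse-CDF draw (the nimbleFunction wrapper and any registration details are omitted; `rtnorm1` is a hypothetical helper name):

```r
# Inverse-CDF draw from N(mean, sd) truncated to (lo, hi):
# map a uniform on (F(lo), F(hi)) back through the normal quantile function.
rtnorm1 <- function(mean, sd, lo, hi) {
  u <- runif(1, pnorm(lo, mean, sd), pnorm(hi, mean, sd))
  qnorm(u, mean, sd)
}
set.seed(1)
z_pos <- rtnorm1(0.5, 1, 0, Inf)    # full conditional when y = 1
z_neg <- rtnorm1(0.5, 1, -Inf, 0)   # full conditional when y = 0
```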
That said, I suspect that in many cases the lack of a conjugate sampler for those parameters would not be the main contributor to mixing issues.

As far as the -Inf, I suspect it is because you are not explicitly initializing the `z` values. You should initialize them so that they are consistent with the `y` values; I suspect the initial value of `z[26,14]` is not consistent with its corresponding `y[26,14]`.

-chris 

Jennifer Kampe

Oct 25, 2023, 7:49:19 PM
to nimble-users
Hi Chris, 

Thanks, it was indeed the initialization for z; I had it in my simple sampler but somehow left it out of the random effects version. The model now runs, but the data augmentation scheme seems to be producing implausible WAIC values (WAIC of approximately 0). Is there a way to work around this?

Thanks, 

Jennifer

Chris Paciorek

Oct 25, 2023, 7:59:35 PM
to Jennifer Kampe, nimble-users
Hi Jennifer, this issue just came up in this user post. That was a different model but the same structure, in that the data simply impose a constraint on some latent variables. Our WAIC calculation uses the likelihood of the data, which in this formulation always equals 1 (log-probability of 0), hence the WAIC value you are seeing. So I don't know of a way to get a meaningful WAIC from the data augmentation formulation of the model.
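To make that concrete (a sketch of the point above, not nimble's internal WAIC code): under `dinterval`, every retained posterior draw of `z` is consistent with `y`, so each pointwise likelihood term is exactly 1.

```r
# Under dinterval(z, 0), p(y | z) = 1 for any (y, z) pair the sampler retains,
# so every pointwise log-likelihood term entering WAIC is 0.
loglik_dinterval <- function(y, z) log(as.numeric((z > 0) == (y == 1)))
ll1 <- loglik_dinterval(1, 0.7)   # consistent pair, y = 1
ll0 <- loglik_dinterval(0, -0.2)  # consistent pair, y = 0
```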

I don't know if anyone has written anything in the literature about WAIC in data augmentation contexts like this. 

You mentioned fitting with nimbleHMC (i.e., without data augmentation) so you might consider using that model formulation either with nimbleHMC or some other sampling scheme. You should be able to get WAIC from that.

-chris

Paul vdb

Oct 27, 2023, 6:39:40 PM
to nimble-users
Hey Jennifer,
I've written a basic Polya-gamma sampler for this type of problem. I wrote it today and you'll be the first to use it, so please let me know if it falls apart. It was on my to-do list, so this thread was a good excuse. The LogisticRegression.R file is the basic example, and samplePolyaGamma.R has all the NIMBLE code to run it. I take no responsibility for it yet, but if you use it please give us some feedback. It currently only works for dnorm fixed and random effects, not multivariate normal yet. You'll need to pass the sampler the design matrix, the size of ~dbin, and the name of the response variable that is ~dbin. If it's all Bernoulli you can just put size = 1.
Thanks,
Paul
LogisticRegression.r
samplePolyaGamma.r
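For background (a sketch of the general technique, not taken from the attached files): Polya-gamma augmentation in the sense of Polson, Scott, and Windle (2013) rests on the moment identity E[omega] = b / (2c) * tanh(c / 2) for omega ~ PG(b, c), which lets a logistic likelihood be rewritten as conditionally Gaussian in the linear predictor. The identity can be checked numerically in the small-c limit, where E[omega] tends to b / 4.

```r
# Mean of a Polya-gamma PG(b, c) random variable:
# E[omega] = b / (2 * c) * tanh(c / 2); as c -> 0 this tends to b / 4.
pg_mean <- function(b, c) b / (2 * c) * tanh(c / 2)
lim <- pg_mean(1, 1e-8)  # should be very close to 1/4
val <- pg_mean(1, 1)     # b = 1, c = 1 case
```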

Jennifer Kampe

Oct 29, 2023, 1:24:42 PM
to nimble-users
Incredible! I will try it out in the next week and will let you know how it goes.