How to choose the shape for a LKJ prior?

1,374 views
Skip to first unread message

David Wang

unread,
Jan 15, 2015, 2:35:03 PM1/15/15
to stan-...@googlegroups.com
Hello Stan users,

I intend to model the correlation structure among M count variables. My strategy is to model each count variable by a Poisson distribution and to draw the parameters (lambdas) of the K Poisson distributions from a multivariate log-normal distribution. I implemented a bivariate (M=2) version of this model in Stan following the example in the latest Stan reference on page 54. The only tunable parameter in the model is the shape of the LKJ prior, \nu, as in Omega ~ lkj_prior(nu). Now the manual indicates that "as \nu increases, the prior increasingly concentrates around the unit correlation matrix". However, I still feel insufficiently guided to pick an appropriate \nu for my particular model. I get the impression that \nu is often set to be equal to the size of the correlation matrix, namely, M. But I didn't find any rationale for such a choice. I ran the model with \nu = 1, 2, and 4 and found the posterior \rho is sensitive to the choice of \nu. I can probably rule out \nu = 1 because I know a priori that there could be a relatively weak negative correlation between the two count variables, so \nu should be larger than 1. But to use \nu = 2, 4, or any other values is a choice I'm not able to make with confidence. Any advice? I copy and paste my Stan code below.

Thanks,
David

data {
  int<lower=0> N; // number of obs, here 43
  int<lower=0> M; // number of response variables, here 2
  int<lower=0> y[N,M]; // counts
  real<lower=1> nu; // the shape of the LKJ prior
}

parameters {
  vector[M] mu;
  vector[M] lambda[N];
  vector<lower=0>[M] tau; // prior scale
  corr_matrix[M] Omega; // prior correlation matrix
}

model {
  matrix[M,M] Sigma; // covariance matrix
  Sigma <- quad_form_diag(Omega, tau);
  tau ~ cauchy(0, 2.5); // see Stan reference 2.5.0, page 54
  Omega ~ lkj_corr(nu);
  mu ~ normal(0, 100);
  for (i in 1:N) 
    lambda[i] ~ multi_normal(mu, Sigma);
  for (i in 1:N)
    y[i] ~ poisson_log(lambda[i]); // likelihood

Ben Goodrich

unread,
Jan 15, 2015, 3:07:48 PM1/15/15
to stan-...@googlegroups.com
On Thursday, January 15, 2015 at 2:35:03 PM UTC-5, David Wang wrote:
Now the manual indicates that "as \nu increases, the prior increasingly concentrates around the unit correlation matrix". However, I still feel insufficiently guided to pick an appropriate \nu for my particular model. I get the impression that \nu is often set to be equal to the size of the correlation matrix, namely, M. But I didn't find any rationale for such a choice. I ran the model with \nu = 1, 2, and 4 and found the posterior \rho is sensitive to the choice of \nu. I can probably rule out \nu = 1 because I know a priori that there could be a relatively weak negative correlation between the two count variables, so \nu should be larger than 1. But to use \nu = 2, 4, or any other values is a choice

Note that the next sentence of the manual is wrong. It should say

At $\nu = 1$, the LKJ correlation distribution reduces to the uniform distribution over correlation matrices of order K.

I don't think anyone sets $\nu = K$ because the density already depends on a given K. People mostly do $\nu = 1$, so that the joint density is constant, which is probably fine in your case. The LKJ prior is symmetric around 0, so weak negative correlations are as likely as weak positive correlations. It is difficult to restrict the sign of a correlation.

The correct description of the LKJ prior is in section 48.1, including the suggestion not to use it. Instead, use the lkj_corr_cholesky() density for the Cholesky factor of a correlation matrix. In that case, you would have

parameters {
 vector
[M] mu;
 vector
<lower=0>[M] tau; // scale
 cholesky_factorr_corr[M] L; // Cholesky factor of correlation matrix
 vector
[M] log_lambda_z[N];
}
transformed parameters
{
  vector
[M] log_lambda[N];
 
{
    matrix
[M,M] new_L;
    new_L
<- diag_pre_multiply(tau, L);
   
for (n in 1:N) log_lambda[n] <- mu + new_L * log_lambda_z[n];
 
}
}

model {
  tau ~ cauchy(0, 2.5); // see Stan reference 2.5.0, page 54
  L ~ lkj_corr_cholesky(nu);
  mu ~ normal(0, 100);
  for (i in 1:N) {
   log_lambda_z[i] ~ normal(0,1);
   y[i] ~ poisson_log(log_lambda[i]); // likelihood
 
}
}

Ben
Reply all
Reply to author
Forward
0 new messages