visualizing Wishart and lkj_corr priors

550 views
Skip to first unread message

Shravan Vasishth

unread,
Jan 1, 2014, 1:47:03 AM1/1/14
to stan-...@googlegroups.com
I am aware of the paper by Andrew and colleagues on visualizing variance covariance matrix priors, but I wanted to make up a simpler example for visualizing what it means to have a prior on a 2x2 variance covariance matrix or a correlaton matrix. Would people agree that this is a reasonable way to do this for a Wishart prior:

## R code:
library(MCMCpack)
draw <- rwish(3, matrix(c(1,.3,.3,1),2,2))

nsim<-1000
store22<-store21<-store12<-store11<-rep(NA,nsim)

for(i in 1:nsim){
draw <- rwish(3, matrix(c(1,0.9,0.9,1),2,2))
store11[i]<-draw[1,1]
store12[i]<-draw[1,2]
store21[i]<-draw[2,1]
store22[i]<-draw[2,2]
}

library(MASS)

variances<-as.matrix(data.frame(store11,store22))
bivn.kde<-kde2d(variances[,1],variances[,2],n=nsim)
persp(bivn.kde,phi=10,theta=0,shade=0.2,border=NA,
      main="Simulated variances' joint distribution")

My Stan related question is: I want to have something simple like this for lkj_corr(2.0), lkj_corr(0.5), and lkj_corr(2.0). Does this make sense:

## Stan and R code:
omegamodel<- '
parameters {
corr_matrix[2] Omega;
}
model {
// vary the parameter: set it to 0.5, 1.0, 2.0
// to see the shape change
Omega ~ lkj_corr(1.0);
}
'

library(rstan)
set_cppo("fast")  

fit <- stan(model_code = omegamodel, 
            iter = 500, chains = 2)

print(fit)

fit2<-as.matrix(fit)

offdiags<-fit2[,c(2,3)]

bivn.kde<-kde2d(offdiags[,1],offdiags[,2],
                n=500)
persp(bivn.kde,phi=10,theta=0,shade=0.2,border=NA,
      main="Simulated lkj_corr joint distribution")


--
Shravan Vasishth
Professor for Psycholinguistics und Neurolinguistics
Department of Linguistics
University of Potsdam, Germany
http://www.ling.uni-potsdam.de/~vasishth

Bob Carpenter

unread,
Jan 1, 2014, 8:03:50 AM1/1/14
to stan-...@googlegroups.com
The (2 x 2) case for correlation only has one free
parameter --- the correlation rho_{1,2}.

The LKJ priors are only on correlation matrices.
We then build up covariance priors by scaling
with independent priors on the scale of each dimension.

No problem drawing from lkj_corr as you do, but you can
skip the MCMC approach and just code an independent-draw MC
approach in Stan by using lkj_corr_rng in the generated quantities
block.

- Bob
> --
> You received this message because you are subscribed to the Google Groups "stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.

Ben Goodrich

unread,
Jan 1, 2014, 5:33:44 PM1/1/14
to stan-...@googlegroups.com, vasishth...@gmail.com
On Wednesday, January 1, 2014 1:47:03 AM UTC-5, Shravan Vasishth wrote:
My Stan related question is: I want to have something simple like this for lkj_corr(2.0), lkj_corr(0.5), and lkj_corr(2.0). Does this make sense:

It is possible but a bit inefficient to use Stan for this. Attached is a .R file that randomly draws from the LKJ distribution (quickly).

When the shape parameter is not equal to 1, then the correlations do not all have the same marginal distribution, unless you do a symmetric permutation of the rows and columns of the correlation matrix. But a marginal correlation is a red herring anyway.

In my opinion, the thing to visualize is "effective independence" --- which is the determinant of the correlation matrix raised to the power of 1/K or equivalently the geometric mean of the eigenvalues --- as K and the shape parameter varies. So, something like

source("lkj.R")
stopifnot
(require(parallel))

K
<- 2:22
eta
<- seq(from = 0, to = 5, length.out = 21)
eg
<- expand.grid(K = K, eta = eta)

ei_lkj
<- function(K, eta, SIMS = 1000) {
  mean
(replicate(SIMS,
                 exp
(2 * mean(log(diag(rcorvine(K, eta, cholesky = TRUE)))))) )
}

theta_lkj
<- mcmapply(ei_lkj, K = eg$K, eta = eg$eta, mc.cores = detectCores())

par
(mar = c(5,4,1,1) + .1)
image
(K, eta, matrix(theta_lkj, length(K), length(eta)), las = 1)
contour
(K, eta, matrix(theta_lkj, length(K), length(eta)), las = 1)

yields



If you do something similar for the Wishart or inverse Wishart distribution with identity scale matrix and use the cov2cor() function in R to obtain the implied covariance matrix, you can see how effective independence varies with K and the degrees of freedom. A lot of people use the inverse Wishart distribution with degrees of freedom equal to K + 1 in order to get marginally uniform correlations and end up with low effective independence values.

Ben

lkj.R
Reply all
Reply to author
Forward
0 new messages