I'm experimenting with modeling correlation/covariance matrices with LKJ priors, and it seems like a cool approach. But I'm running into trouble when I try to scale up to higher dimensionality. In particular, even if I could get the model to run, the K-choose-2 correlations will be increasingly hard to estimate as K gets large (K = 100 already means choose(100, 2) = 4950 free parameters). Here's a quick look at how the normalizing constant behaves as K grows:
# Normalizing constant for the LKJ density as a function of the
# dimension K and shape eta, computed on the log scale to avoid
# overflow in the gamma functions
constant <- function(K, eta = 1) {
  if (K == 1) return(1)
  k <- 1:(K - 1)
  log_constant <- -sum(k) / 2 * log(pi) +
    (K - 1) * lgamma(eta + (K - 1) / 2) -
    sum(lgamma(eta + (K - k - 1) / 2))
  exp(log_constant)
}
plot(sapply(1:43, constant), type = "l", log = "y", las = 1,
     xlab = "K", ylab = "Constant")
One approach I was thinking of is using a low-rank approximation, perhaps a truncated Cholesky decomposition. Has anyone tried this? What might be a good way to implement it in Stan?
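Something along these lines is what I'm imagining — a low-rank-plus-diagonal (factor model) parameterization, Sigma = Lambda * Lambda' + diag(psi^2) with rank R << K, rather than a literal truncated Cholesky — so there are only K*R + K free parameters instead of K-choose-2. This is just a minimal sketch; the names, priors, and zero mean are all placeholders:

data {
  int<lower=1> N;              // observations
  int<lower=2> K;              // dimension
  int<lower=1, upper=K> R;     // rank of the approximation
  vector[K] y[N];
}
parameters {
  matrix[K, R] Lambda;         // factor loadings
  vector<lower=0>[K] psi;      // residual scales on the diagonal
}
model {
  matrix[K, K] L;
  to_vector(Lambda) ~ normal(0, 1);
  psi ~ cauchy(0, 2.5);        // half-Cauchy given the lower bound
  // implied covariance: Lambda * Lambda' + diag(psi^2)
  L = cholesky_decompose(tcrossprod(Lambda) + diag_matrix(square(psi)));
  y ~ multi_normal_cholesky(rep_vector(0, K), L);
}

One caveat: Lambda is only identified up to rotation, so the loadings themselves aren't interpretable without extra constraints, though the implied covariance matrix is still well defined.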
method = sample (Default)
  sample
    num_samples = 1000 (Default)
    num_warmup = 1000 (Default)
    save_warmup = 0 (Default)
    thin = 1 (Default)
    adapt
      engaged = 1 (Default)
      gamma = 0.050000000000000003 (Default)
      delta = 0.80000000000000004 (Default)
      kappa = 0.75 (Default)
      t0 = 10 (Default)
      init_buffer = 75 (Default)
      term_buffer = 50 (Default)
      window = 25 (Default)
    algorithm = hmc (Default)
      hmc
        engine = nuts (Default)
          nuts
            max_depth = 10 (Default)
        metric = diag_e (Default)
        stepsize = 1 (Default)
        stepsize_jitter = 0 (Default)
id = 0 (Default)
data
  file = /tmp/onion.data.R
init = 2 (Default)
random
  seed = 4294967295 (Default)
output
  file = output.csv (Default)
  diagnostic_file = (Default)
  refresh = 100 (Default)
Gradient evaluation took 1.63379 seconds
1000 transitions using 10 leapfrog steps per transition would take 16337.9 seconds.
Adjust your expectations accordingly!
Sorry for the delayed reply: this worked great! I was getting some nice results for kernel learning on top of this, but admittedly I was getting ahead of myself; a real presentation of results will have to wait until I get some other parts ironed out.
The other advantage is that you can scale the Cholesky factor once, rather than scaling the whole correlation matrix, and then use multi_normal_cholesky(), which is more efficient than multi_normal() applied to covariance matrices.
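Concretely, that pattern looks something like the following sketch; the data block, the LKJ shape of 2, and the priors on mu and sigma are illustrative assumptions:

data {
  int<lower=1> N;
  int<lower=2> K;
  vector[K] y[N];
}
parameters {
  vector[K] mu;
  cholesky_factor_corr[K] L_Omega;   // Cholesky factor of the correlation matrix
  vector<lower=0>[K] sigma;          // per-dimension scales
}
model {
  matrix[K, K] L_Sigma;
  L_Omega ~ lkj_corr_cholesky(2);
  sigma ~ cauchy(0, 2.5);
  mu ~ normal(0, 5);
  // scale the Cholesky factor once instead of forming the full covariance
  L_Sigma = diag_pre_multiply(sigma, L_Omega);
  y ~ multi_normal_cholesky(mu, L_Sigma);
}

This never constructs the K-by-K covariance matrix, and multi_normal_cholesky() skips the O(K^3) factorization that multi_normal() has to perform on every log density evaluation.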