Hi all,
I've been looking at past threads regarding Conditional Auto Regressive (CAR) models.
(In order of relevance)
https://groups.google.com/forum/#!searchin/stan-users/spatial/stan-users/S8dhFpn0Lg8/F91hZXBJmnMJhttps://groups.google.com/forum/#!searchin/stan-users/spatial/stan-users/63m5A5ATzs8/LII8RuX2jI8Jhttps://groups.google.com/forum/#!searchin/stan-users/spatial/stan-users/he7BxxHfBfk/V6s1K14nRfgJhttps://groups.google.com/forum/#!searchin/stan-users/spatial/stan-users/iddShgJVwTQ/Ihes0SOy5yQJI've tried to adapt the examples to my fake toy example with little success.
(by adapting, i mean a naive specification without resorting to working with the lp_ directly)
I've finally hit my wall, and would appreciate some help.
Is working with lp_ the only way to practically implement this model?
Problems:- The practical problem I face is the sampling is extremely slow compared to WinBUGS. 
 
I already have results from WinBUGS, where my attempt in Stan doesn't even return 1000 iterations when I allow 8+ hours for the running clock time.
Further, i'd like the restriction of avoiding the use of lp_ directly
- Possibly related, I get multiple stan warnings about non-symmetric covariance matrices, when I believe they are intentionally symmetric.
 
covariance matrix[0,2] is 105097511.634409:0, but 
covariance matrix[2,0] is 1.05098e+008:0
Background and Details:The model is a two stage model, where the data model is Poisson, and the log Poisson rate is linear with fixed effects and random effects
Past stan examples used a "small" number of spatial neighbors, say 50.
My toy example has 480 neighbors.
Past examples used a single random effect, the spatial conditional autoregressive (CAR) effect
My example uses two random effects (as in the Besag-York-Mollie model) one spatial CAR and the other normal 
Past examples estimated the spatial dependence parameter and also giving it a uniform prior
My example fixes (doesn't estimate) the spatial dependence parameter using the reciprocal of the largest eigenvalue of some matrix.
Model:In short, i've tried to follow Cressie and Wikle's (statistics for spatiao-temporal data) more general model specification 
A stylized Model Specification:
O ~ poisson(mu)
log_mu <- log(E) + alpha0 + alpha1 * x1 + ( beta1-mean(beta1) ) + beta2
beta1 ~ MVN(0,Cov_spatial)
beta2 ~ MVN(0,Cov_normal)
Cov_spatial <- tau^2 * inverse(ONES - phi * H ) * DELTA
Cov_normal <- sigma^2 * ONES
Where:
O is the response count
mu is the poisson rate
E is an offset
alpha0 is the uninteresting intercept
alpha1 is a fixed effect
beta1 is the spatial random effect using a MVN with a spatially structured covariance (Cov_spatial)
beta2 is the nonspatial random effect using a MVN with  covariance (Cov_normal)
tau is the standard deviation for the spatial term
sigma is the standard deviation for the non spatial term
ONES is the identity matrix
DELTA is a diagonal matrix, whose entries delta_ii = 1 /  the # of neighbors of row i
H is a binary then row standardized adjacency matrix (eg each row sums to 1) 
where h_ij = ( b_ij / # of neighbors of row i) and b_ij = 1 if i is a 
neighbor of j; 0 otherwise.
phi = 0.3 = 1 / max{ eigenvalue(-DELTA^(-1/2) * H * DELTA ^ (1/2) }
I believe my model specification is correct and my data manipulations (via [R]) should be correct, but I question if my stan implementation is correct.
AttachmentsFor your consideration, I've attached 4 files:
All of the necessary data to feed into stan
fake_spatial_bym_stan.RData   
The two files that manipulated the data
setup_stan_maf_data.R  (and calls stan)
setup_stan_maf_init.R
The stan model file
maf_convolution_model_sfstd.stan
Specific Stan Model Filedata {
    int<lower=0> N; // number of areas
    matrix[N,N] H;  // area adjacency matrix row standardized (ea row sums to 1)
    matrix[N,N] Delta;  // diagonal matrix in BYM 1991, sfstd model B1
    matrix[N,N] Ones; // diag matrix of ones for identity
    real phi; // phi = 1/maxeigval(Delta*H*Delta)
    int<lower=0> O[N];  // observed cases
    vector[N] E;  // expected cases
    vector[N] x1;  // covariate
}
transformed data {
  vector[N] zeros;  // zeros for mean of MVN specification
  for (i in 1:N)
        zeros[i] <- 0.0;
}
parameters {
    real alpha0;  // intercept
    real alpha1;  // coefficient on covariates
    vector[N] beta1;  // random effect (CAR)
    vector[N] beta2;  // random effect normal
    real<lower=1e-5> prec_sre;  // precision of CAR
    real<lower=1e-5> prec_nre;  // precision of normal    
}
transformed parameters {
    // turn precisions to variances
    real sigma_sq;
    real tau_sq;
    sigma_sq <- 1.0 / (prec_nre * prec_nre);
    tau_sq <- 1.0 / (prec_sre * prec_sre);
}
model {
    matrix[N,N] Sigma_sre;  // covariance matrix for CAR
    matrix[N,N] Sigma_nre;  // covariance matrix for normal RE
    real beta1_mean;
    vector[N] beta1_stz;  // centered spatial re
    vector[N] log_mu;
    
    prec_sre ~ gamma(1,1);    // precision of spatial re
    prec_nre ~ gamma(0.5, 0.0005);    // precision of normal re
    // spatial re
    Sigma_sre <- (tau_sq * inverse(Ones - phi * H)) * Delta;
    beta1 ~ multi_normal(zeros, Sigma_sre);
    beta1_mean <- mean(beta1);
    beta1_stz <- beta1 - beta1_mean;
    // nonspatial re
    Sigma_nre <- sigma_sq*Ones;
    beta2 ~ multi_normal(zeros, Sigma_nre);
    // fixed effects
    // no explicit prior on alpha0 (implicit improper flat prior)
    alpha1 ~ normal(0.0, sqrt(1e5));
    // process model
    log_mu <- log(E) + alpha0 + alpha1 * x1 + beta1_stz + beta2;
    // data model
    O ~ poisson_log(log_mu);
}