lmer vs Stan comparison


Shravan Vasishth

Dec 17, 2013, 3:14:02 PM
to stan-...@googlegroups.com, Reinhold Kliegl
Here is a comparison of lmer vs Stan output on a mildly complicated dataset from a psychology experiment (Kliegl et al. 2011). The data are attached.

Data and paper, along with lots of challenging datasets for Stan, are available from: 


I should say that datasets from psychology and psycholinguistics can be much more complicated than this example. So this was only a modest test of Stan.

The basic result is that I was able to recover in Stan the parameter estimates (fixed effects) that were primarily of interest, compared to the lmer output. The sds of the variance components all come out pretty much the same in Stan vs lmer. The correlations estimated in Stan are much smaller than in lmer, but this seems to be normal: Bayesian models seem to be more conservative when estimating correlations between random effects. (I know Andrew does not like the term "random effects", nor "fixed effects", but they are convenient.)

Traceplots are attached. They look generally fine to me.

One very important fact about lmer vs Stan is that lmer took 23 seconds to return an answer, but Stan took 18,814 seconds (about 5 hours), running 500 iterations and 2 chains. 

One caveat is that I still have to figure out how to speed up Stan so that we get the best possible performance out of it. But could the Stan developers provide some fully worked examples on a mildly challenging dataset like this? What's the absolute best that Stan can do with professional intervention, compared to lmer?

If the present analysis can be sped up a lot more, I have another dataset in the Stan-queue with 150 subjects and three contrasts, plus another random factor (items) that I will test Stan vs lmer on. 

Details:

key:
beta[1] = Intercept
beta[2] = c1
beta[3] = c2
beta[4] = c3

1. Fixed effects, Stan vs lmer:
Stan:
Predictors:
          mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
beta[1]  381.1     3.5  7.0 367.5 376.2 381.1 385.9 394.1     4  1.4
beta[2]   31.6     0.8  3.5  23.9  29.7  32.0  34.1  37.9    17  1.2
beta[3]   14.0     0.1  2.2   9.9  12.5  13.8  15.4  18.3   292  1.0
beta[4]    2.9     0.1  2.0  -1.3   1.5   3.0   4.2   7.0   251  1.0

lmer:
          Estimate Std. Error t value
beta[1]   389.73   7.15    54.5
beta[2]   33.78    3.31    10.2
beta[3]   13.98    2.32     6.0
beta[4]   2.75     2.22     1.2

2. Random effects, Stan vs lmer:

## I created this by hand to match the lmer output
## (correlations rounded from the calculations below):
Stan:
# Groups   Name   Variance Std.Dev. Corr
# id       (Int)   3152.9   56.2
#          c1       531.6   23.1     0.51
#          c2        89.3    9.5    -0.02  0.11
#          c3        77.5    8.8    -0.09 -0.47  0.03
# Residual         4886.0   69.9

lmer:
# Groups   Name   Variance Std.Dev. Corr
# id       (Int)   3098.3   55.66
#          c1       550.8   23.47    0.60
#          c2       121.0   11.00   -0.13 -0.01
#          c3        93.2    9.65   -0.25 -0.85  0.38
# Residual         4877.1   69.84


## calculations of correlations from Stan output:
r12: 654.3 / (56.151 * 23.056)   =    0.5054 = r12 
r13:  -8.4 / (56.151 * 9.4499)   =   -0.01583 = r13
r14:  -46.5 / (56.151 * 8.8034)  =   -0.094069 = r14
r23:   24.7 / (23.056 * 9.4499)  =    0.11337 = r23
r24:   -95.7 / (23.056 * 8.8034) =   -0.4715 = r24
r34:   2.4 / (9.4499 * 8.8034)   =    0.028849 = r34
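The hand computation above is just cov2cor() applied to the posterior mean of Sigma. A quick sketch checking it (NumPy here rather than the thread's R, purely for illustration; the numbers are copied from the Stan output in this message; strictly this is the correlation of the posterior-mean covariance, not the posterior mean of the correlations, but it matches the hand calculation):

```python
import numpy as np

# Posterior means of Sigma, copied from the Stan output in this message
Sigma = np.array([
    [3152.9,  654.3,  -8.4, -46.5],
    [ 654.3,  531.6,  24.7, -95.7],
    [  -8.4,   24.7,  89.3,   2.4],
    [ -46.5,  -95.7,   2.4,  77.5],
])

sd = np.sqrt(np.diag(Sigma))      # 56.15, 23.06, 9.45, 8.80
corr = Sigma / np.outer(sd, sd)   # corr[i,j] = Sigma[i,j] / (sd[i] * sd[j])

print(np.round(corr, 2))          # off-diagonals: 0.51, -0.02, -0.09, 0.11, -0.47, 0.03
```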

## original Stan output (same columns as the beta table above: mean, se_mean, sd, 2.5%, 25%, 50%, 75%, 97.5%, n_eff, Rhat):
Sigma[1,1] 3152.9 32.4 589.5 2197.9 2729.1 3090.3 3522.1 4432.8 331 1.0
Sigma[1,2] 654.3 8.5 180.7 367.8 522.5 628.3 767.5 1022.8 455 1.0
Sigma[1,3] -8.4 6.4 91.8 -203.6 -58.8 -4.5 49.7 161.7 207 1.0
Sigma[1,4] -46.5 7.4 91.1 -235.8 -101.9 -41.4 11.0 118.9 151 1.0
Sigma[2,1] 654.3 8.5 180.7 367.8 522.5 628.3 767.5 1022.8 455 1.0
Sigma[2,2] 531.6 6.4 109.0 357.4 455.4 517.5 599.2 780.8 289 1.0
Sigma[2,3] 24.7 2.9 37.3 -50.4 0.6 26.0 49.8 94.7 162 1.0
Sigma[2,4] -95.7 4.2 43.6 -186.6 -121.0 -91.4 -68.0 -16.6 109 1.0
Sigma[3,1] -8.4 6.4 91.8 -203.6 -58.8 -4.5 49.7 161.7 207 1.0
Sigma[3,2] 24.7 2.9 37.3 -50.4 0.6 26.0 49.8 94.7 162 1.0
Sigma[3,3] 89.3 11.6 53.1 11.5 50.0 85.7 120.6 207.4 21 1.1
Sigma[3,4] 2.4 2.8 21.2 -31.6 -11.0 -0.6 13.4 49.1 56 1.0
Sigma[4,1] -46.5 7.4 91.1 -235.8 -101.9 -41.4 11.0 118.9 151 1.0
Sigma[4,2] -95.7 4.2 43.6 -186.6 -121.0 -91.4 -68.0 -16.6 109 1.0
Sigma[4,3] 2.4 2.8 21.2 -31.6 -11.0 -0.6 13.4 49.1 56 1.0
Sigma[4,4] 77.5 9.7 44.5 11.2 45.5 68.6 102.4 175.3 21 1.1

#LMER fit: time taken: 23 seconds
# Groups   Name        Variance Std.Dev. Corr                 
# id       (Intercept) 3098.3   55.66                         
#          c1           550.8   23.47     0.603               
#          c2           121.0   11.00    -0.129 -0.014        
#          c3            93.2    9.65    -0.247 -0.846  0.376 
# Residual             4877.1   69.84                         
#Number of obs: 28710, groups: id, 61
#
#Fixed effects:
#            Estimate Std. Error t value
#(Intercept)   389.73       7.15    54.5
#c1             33.78       3.31    10.2
#c2             13.98       2.32     6.0
#c3              2.75       2.22     1.2

##Stan: time taken: 18,814 seconds (5.2261 hours)
> fit2 <- stan(model_code = klieglvaryingintslopes_codefast, 
+            data = dat2, 
+            iter = 500, chains = 2)

 (CHAIN 1).
Iteration: 500 / 500 [100%]  (Sampling)
Elapsed Time: 8255.26 seconds (Warm-up)
              1650.07 seconds (Sampling)
              9905.33 seconds (Total)

(CHAIN 2).
Iteration: 500 / 500 [100%]  (Sampling)
Elapsed Time: 8062.87 seconds (Warm-up)
              845.716 seconds (Sampling)
              8908.59 seconds (Total)

> print(fit2)
Inference for Stan model: klieglvaryingintslopes_codefast.
2 chains, each with iter=500; warmup=250; thin=1;
post-warmup draws per chain=250, total post-warmup draws=500.

The model code: 

klieglvaryingintslopes_codefast <- 'data {
int<lower=1> N;
real rt[N];                         //outcome
real c1[N];                         //predictor
real c2[N];                         //predictor
real c3[N];                         //predictor
int<lower=1> I;                     //number of subjects
int<lower=1, upper=I> id[N];        //subject id
vector[4] mu_prior;                 //vector of zeros passed in from R
}
parameters {
vector[4] beta;                     // intercept and slope
vector[4] u[I];                     // random intercept and slopes
real<lower=0> sigma_e;              // residual sd 
vector<lower=0>[4] sigma_u;         // subj sd
corr_matrix[4] Omega;               // correlation matrix for random intercepts and slopes
}
transformed parameters {
    matrix[4,4] D;
    D <- diag_matrix(sigma_u);
}
model {
matrix[4,4] L;
matrix[4,4] DL;
real mu[N];                         // mu for likelihood

//priors:
beta ~ normal(0,50);
sigma_e ~ cauchy(0,2);
sigma_u ~ cauchy(0,2);
Omega ~ lkj_corr(4.0);

L <- cholesky_decompose(Omega);
DL <- D * L;
for (i in 1:I)                             // loop for subj random effects
    u[i] ~ multi_normal_cholesky(mu_prior, DL);

for (n in 1:N) {
mu[n] <- beta[1] + beta[2]*c1[n] + beta[3]*c2[n] + beta[4]*c3[n] +  u[id[n], 1] + u[id[n], 2]*c1[n] + u[id[n], 3]*c2[n] + u[id[n], 4]*c3[n];
}
rt ~ normal(mu,sigma_e);           // likelihood
}
generated quantities {
    cov_matrix[4] Sigma;
    Sigma <- D * Omega * D;
}
'
## data loading and lmer call:
load("KWDYZ_test.rda")

summary(m2 <- lmer(rt ~ 1 + c1 + c2 + c3 + (1 + c1 + c2 + c3 | id),data=dat)) 

--
Shravan Vasishth
Professor of Psycholinguistics and Neurolinguistics
Department of Linguistics
Universität Potsdam, D-14476 Potsdam
http://www.ling.uni-potsdam.de/~vasishth
Attachments: traceplotkliegl2011.pdf, KWDYZ_test.rda

Andrew Gelman

Dec 17, 2013, 3:23:03 PM
to stan-...@googlegroups.com, Reinhold Kliegl
Hi, one of our items in the queue (as Bob would say) is an implementation of the marginal mode approach of blmer/bglmer using conditional optimization and Laplace approximations in Stan.  This is waiting on our implementation of 2nd derivatives.  Until then, I expect that for moderate-sized problems with small, well-behaved models, blmer etc. are likely to outperform Stan in speed.  In the current environment, I'd expect Stan to outperform blmer etc. in the following scenarios:
- Models such as ordered logit which simply do not exist in lmer etc
- Models with many nonnested variance components and varying intercepts and slopes (thus, multiple hierarchical cov matrices to estimate); these can stump lmer etc
- Very sparse data where the Laplace approximation of glmer doesn't work well
- Models with extra structure (for example, Yair used Stan for a model with time-series correlations)
- Huge data sets (Bob has said that he thinks that Stan will scale better than lmer etc as data sets become huge).
But in your basic ARM-type hierarchical models, it might well be that blmer/bglmer is the best bet for now.  A year or so ago, Vince and I had some thoughts about improving blmer in various ways, but at this point I think we're best off waiting for the 2nd derivatives to come in Stan and then we can try programming the optimization and approximations directly.
A

--
You received this message because you are subscribed to the Google Groups "stan users mailing list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Bob Carpenter

Dec 17, 2013, 4:20:32 PM
to stan-...@googlegroups.com
We're not (yet at least) trying to compete with (bg)lmer on speed!
bglmer is Vince Dorie's project (with Andrew) to extend lmer with
priors:

http://cran.r-project.org/web/packages/blme/blme.pdf

But before talking about speed, let's be clear that

* your models in Stan and lmer are not doing the same thing; your
hierarchical priors can't be expressed in lmer or in bglmer.

* the fitting performed is different; lmer uses maximum marginal
likelihood, which finds the mode after integrating out the hierarchical
variance parameters; this mode is not the same as the Bayesian estimate
you get from Stan based on the posterior mean (though they're often very
close if there's enough data and the posterior's roughly symmetric).

* lmer approximates posterior covariance using a Laplace approximation
at the mode, whereas Stan uses MCMC for full Bayesian inference.

We are working in Stan on three approaches to optimization-based
inference:

* (stochastic) variational inference

* maximum marginal likelihood (like lmer)

* (MC) expectation propagation

Hopefully these approximate methods will be as fast as lmer, especially
on larger-scale problems (lmer uses matrix solving, which is ideal for
small to medium size problems, but doesn't scale to large problems --- it's
why you see so much stochastic-gradient descent in the "machine learning" world).

Now onto specifics of your models and fits.

When you ran, what were the effective sample sizes and R-hat
values? You don't need many effective samples
to get reasonable posterior mean estimates (error drops with sqrt(N_eff)).
You need even less precision when doing initial model exploration.
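To make the sqrt(N_eff) point concrete (a sketch; the values are taken from the beta table in the first message), the reported se_mean column is just sd / sqrt(n_eff):

```python
import math

# (name, sd, n_eff, reported se_mean) for beta[1] and beta[3] from the table above
rows = [("beta[1]", 7.0, 4, 3.5), ("beta[3]", 2.2, 292, 0.1)]

for name, sd, n_eff, se_mean in rows:
    mcse = sd / math.sqrt(n_eff)          # Monte Carlo standard error of the posterior mean
    print(name, round(mcse, 1), se_mean)  # the two values agree
```

With only 4 effective draws, beta[1]'s Monte Carlo error (3.5) is half its posterior sd, which is consistent with its Rhat of 1.4.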

I'll append some comments on the model below. The bottom line is
that I don't think it can be sped up significantly at this point.
Two improvements we have scheduled will probably give you around
a 10 times speedup for this model:

* a Cholesky factor for correlation matrix data type and
matching LKJ density, and

* vectorizing the multivariate normal.

Even after that, which might give you a factor of 10 speedup, MCMC is
still not going to be competitive with optimization-based approaches.

Every field has the same problem --- the models we want to fit
go beyond the software we want to fit them with. I don't see any indication
this trend will slow down.

Andrew's been using lmer and bglmer for years because he couldn't fit the
models he wanted to in BUGS/JAGS. We tried to build Stan so it would at
least be possible to do the full Bayesian inference for these models and more
general multilevel ones that can't be expressed in lmer's modeling
language (many examples of which can be found in Andrew's research papers and
in his and Jennifer's book).

- Bob


On 12/17/13, 3:14 PM, Shravan Vasishth wrote:
...

> The model code:

I'll grin and bear the lack of indentation, but
I would very very strongly recommend pulling the model
out of the R code and into its own file. I really don't like
this tendency to bundle the model and code together that
our examples in R are leading people to. There's an emacs
mode for Stan, for one thing!

> klieglvaryingintslopes_codefast <- 'data {
> int<lower=1> N;
> real rt[N]; //outcome
> real c1[N]; //predictor
> real c2[N]; //predictor
> real c3[N]; //predictor
> int<lower=1> I; //number of subjects
> int<lower=1, upper=I> id[N]; //subject id
> vector[4] mu_prior; //vector of zeros passed in from R
> }
> parameters {
> vector[4] beta; // intercept and slope
> vector[4] u[I]; // random intercept and slopes
> real<lower=0> sigma_e; // residual sd
> vector<lower=0>[4] sigma_u; // subj sd
> corr_matrix[4] Omega; // correlation matrix for random intercepts and slopes
> }
> transformed parameters {
> matrix[4,4] D;
> D <- diag_matrix(sigma_u);
> }

This is very inefficient. Multiplying out the scaling
directly rather than filling a matrix is much better.
You get all those multiplications by zero when you multiply
a diagonal matrix by a matrix. Get rid of this and replace
the DL <- D * L below with a loop. Fill in all the zero
values with the same temporary 0 for even more speed.

> model {
> matrix[4,4] L;
> matrix[4,4] DL;
> real mu[N]; // mu for likelihood
>
> //priors:
> beta ~ normal(0,50);
> sigma_e ~ cauchy(0,2);
> sigma_u ~ cauchy(0,2);
> Omega ~ lkj_corr(4.0);
>
> L <- cholesky_decompose(Omega);
> DL <- D * L;

Replace this with a loop.

for (m in 1:4)
  for (n in 1:m)
    DL[m,n] <- L[m,n] * sigma_u[m];

and fill in zeroes with a single value, which is
a hack that helps with auto-diff by reducing expression
stack size.

{
  real zero;
  zero <- 0;
  for (m in 1:4)
    for (n in (m+1):4)
      DL[m,n] <- zero;
}

This will get more efficient and cleaner to write
when we introduce a Cholesky factor for correlation matrix data type (mostly
coded -- just needs testing and should show up in 2.2).

Given the small size of your correlation matrix, I doubt this will make
a measurable difference in overall model speed.
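A quick numeric check of the row-scaling claim (a Python sketch with made-up sigma_u values, not Stan code): multiplying by the diagonal matrix D is the same as scaling row m of L by sigma_u[m], which is why the loop can drop the needless multiplications by zero:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma_u = np.array([56.0, 23.0, 9.5, 8.8])   # made-up subject sds

# Build an arbitrary valid 4x4 correlation matrix and its Cholesky factor
A = rng.normal(size=(4, 4))
Omega = A @ A.T
d = np.sqrt(np.diag(Omega))
Omega = Omega / np.outer(d, d)
L = np.linalg.cholesky(Omega)

DL_matrix = np.diag(sigma_u) @ L   # D * L, as in the original model
DL_loop = L * sigma_u[:, None]     # row m scaled by sigma_u[m], as in the loop

print(np.allclose(DL_matrix, DL_loop))  # True
```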


> for (i in 1:I) // loop for subj random effects
> u[i] ~ multi_normal_cholesky(mu_prior, DL);

This should get a lot faster after we vectorize
multi_normal so you can write this as:

u ~ multi_normal_cholesky(mu_prior, DL);

> for (n in 1:N) {
> mu[n] <- beta[1] + beta[2]*c1[n] + beta[3]*c2[n] + beta[4]*c3[n] + u[id[n], 1] + u[id[n], 2]*c1[n] + u[id[n], 3]*c2[n]
> + u[id[n], 4]*c3[n];
> }
> rt ~ normal(mu,sigma_e); // likelihood
> }
> generated quantities {
> cov_matrix[4] Sigma;
> Sigma <- D * Omega * D;

Sigma <- L * L';

There's a built-in function for multiplying a lower-triangular matrix
by its own transpose, but with generated quantities, it'll be about
as fast to write it this way.
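The identity behind this (assuming L here denotes the scaled factor DL from the model block): DL * DL' = D L L' D = D * Omega * D, since Omega = L L'. A numeric sketch with hypothetical values:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma_u = np.array([56.0, 23.0, 9.5, 8.8])   # hypothetical subject sds

A = rng.normal(size=(4, 4))
Omega = A @ A.T
d = np.sqrt(np.diag(Omega))
Omega = Omega / np.outer(d, d)               # a valid correlation matrix

D = np.diag(sigma_u)
DL = D @ np.linalg.cholesky(Omega)           # scaled Cholesky factor

print(np.allclose(DL @ DL.T, D @ Omega @ D))
```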

- Bob

Andrew Gelman

Dec 17, 2013, 4:29:03 PM
to stan-...@googlegroups.com
I've been doing this, but I've been having workflow issues. We can talk about this when we're in the same room and we are mapping out a reconfiguration of rstan. Maybe some sort of sweave thing would work for us, I'm not sure. But I've actually been having problems with putting each Stan model in its own file. The trouble is that when I play around with the model, I end up with lots of different files and I have to give each one a different name, then I'll have lots of Stan objects in R and each of these has to have a name, then I save the parameters and these all have to have different names to distinguish them from each other, etc. I need a way to manage my modeling workflow so I can work with many different models without having to roll my own naming conventions. Maybe we could construct some default system where the stan objects automatically get a name that is derived from the file name, or something like that.

Sean O'Riordain

Dec 17, 2013, 5:05:07 PM
to stan-...@googlegroups.com
I've been mulling this same problem over as well. Initially I was tending to put everything into one file, but that soon became large and unwieldy once a couple of models were there.  I am working on models which are related and evolving as my understanding improves.  The dataset is the same, but needs to be manipulated a little for some models - i.e. rescaled, or groups progressively added together.  I came to the conclusion a week or two ago that by breaking things down into a data preparation stage, a model, and post-processing stages, I got more clarity.  The pre-processing stage can be done in a couple of different ways and wrapped in a few different functions.  The post-processing stage of summary output and graphics etc. is pretty much the same for each - this can also be wrapped in a parameterised function or two for clarity.

So this leaves me with the model code (say I call it H2), which changes slightly, so I put that in a file named like H2.stan, and the driver code, which is quite short, in a file called H2.R - which means that when I print things out, it is quite short and I can see everything that I need to see on two or three pages - easier to show others.  I put the output into a folder named H2, since sometimes I create a lot of files - but that is not confusing, as I have parameterised the pre/post-processing functions to prepend the model name ("H2" in this case) to all the files - e.g. H2_pairs.pdf, H2_absd_plot.pdf, H2_fit.txt etc.

Of course the code I am currently using is still in transition from ugly and is not quite at the above stage of clarity, but that is where I am aiming! :-)

Kind regards,
Seán





Bob Carpenter

Dec 17, 2013, 5:34:36 PM
to stan-...@googlegroups.com
> On Dec 17, 2013, at 10:20 PM, Bob Carpenter wrote:
>> I would very very strongly recommend pulling the model
>> out of the R code and into its own file.

On 12/17/13, 4:29 PM, Andrew Gelman wrote:
> I've been doing this, but I've been having workflow issues. We can talk about this when we're in the same room

Or before.



> and we are mapping out a reconfiguration of rstan. Maybe some sort of sweave thing would work for us, I'm not sure.

OK. I do want to say that Sweave is going in exactly the wrong
direction in my opinion. It ties your R code to a presentation!
I have the same problem with IPython Notebook. If someone puts
their Stan code in an IPython Notebook, will you (Andrew) be able
to use it?

I know a lot of people disagree with me on this one. But I'm
still going to champion favoring modularity (i.e., lack of dependencies)
over almost all other concerns.

> But I've actually been having problems with putting each Stan model in its own file.
> The trouble is that when I play around with the model, I end up with lots of different files and
> I have to give each one a different name, then I'll have lots of Stan objects in R and each of these
> has to have a name, then I save the parameters and these all have to have different names to distinguish
> them from each other, etc. I need a way to manage my modeling workflow so I can work with many different
> models without having to roll my own naming conventions. Maybe we could construct some default system
> where the stan objects automatically get a name that is derived from the file name, or something like that.

How did you deal with this for BUGS?

Maybe some users have some advice?

Daniel and I have come up with some conventions for projects we've been
working on together. If you modify files rather than copying them, you lose
the ability to rerun the older models. (Unless you've checked them into
some kind of version control system that'll let you restore them.) At the
same time, you want to reuse some of the R infrastructure, data readers, etc.

name/                    // some descriptive name for model
  name.stan              // Stan model
  sim.R                  // simulator for fake data
  init-gen.R             // generate initialization, perhaps based on data
  fit.sh                 // shell script to fit data in parallel
  plot-fit.R             // R script with functions to generate full set of plots
  data/                  // directory for all data
    data-gen-<params>/   // file name records parameter settings (not scalable!)
      data.R             // actual data file in Stan dump format
    sampling-<params>/   // file name records seeds, etc.
      samples-1.csv      // sample file for a single chain (one file per chain)
      samples-2.csv

- Bob

sp_r...@yahoo.it

Dec 17, 2013, 5:50:38 PM
to stan-...@googlegroups.com, gel...@stat.columbia.edu
On Tuesday, December 17, 2013 10:29:03 PM UTC+1, Andrew Gelman wrote:
Maybe we could construct some default system where the stan objects automatically get a name that is derived from the file name, or something like that.

My 2 cents.
I'm using the following naming convention:

- "foo": model_name
- "foo.stan": model source
- "foo.data.R": data file
- "foo.sm": compiled model
- "foo.sm.RData": saved compiled model
- "foo.sf": final stanfit object
- "foo.sf.RData": eventual saved stanfit object

and a function:

mystan <- function(model, data=model, iter = 2000, seed = 12345, new=FALSE, ...) {
    stopifnot(require(parallel)) # should never happen!
    model.data <- paste(data, ".data.R", sep='')
    model.file <- paste(model, ".stan", sep='')
    model.sm <- paste(model, ".sm", sep='')
    model.saved <- paste(model.sm, ".RData", sep='')
    model.sf <- paste(model, ".sf", sep='')
    if (!file.exists(model.file))
        stop(paste(model.file, "not found"))
    if (!file.exists(model.data))
        stop(paste(model.data, "not found"))
    if (new) { # you have a previous compiled model, but you've changed the source
        if (exists(model.sm))
            rm(list=model.sm, envir = .GlobalEnv)
        if (file.exists(model.saved))
            unlink(model.saved)
    }
    if (!exists(model.sm)) {
        if (file.exists(model.saved)) {
            # load the saved model and put its name in .GlobalEnv
            load(model.saved, envir = .GlobalEnv, verbose = FALSE)
        } else {
            rt <- stanc(model.file, model_name = model)
            sm <- stan_model(stanc_ret = rt)
            # create the compiled model variable name
            assign(model.sm, sm, envir = .GlobalEnv)
            save(list=model.sm, file = model.saved)
        }
    }
    sm <- get(model.sm) # sm <- the model to which its variable name refers
    sflist <- mclapply(1:4, mc.cores = detectCores(),
                       function(i) sampling(sm, data = read_rdump(model.data),
                                            chains = 1, chain_id = i,
                                            iter = iter, seed = seed, ...))
    sf <- sflist2stanfit(sflist)
    # create the stanfit object variable name
    assign(model.sf, sf, .GlobalEnv)
}

An example session:

> library(rstan)
...
> source("mystan.R")
> N <- 100
> x <- rnorm(N,0,4)
> y <- 2 + 0.5 * x - 1.5 * x^2 + rnorm(N)
> stan_rdump(c("N", "x", "y"), file = "linreg.data.R")
> ls()
[1] "mystan" "N"      "x"      "y"    
> mystan("linreg1", "linreg")
...
> ls()
[1] "linreg1.sf" "linreg1.sm" "mystan"     "N"          "x"        
[6] "y"        
> print(linreg1.sf)
...
> mystan("linreg2", "linreg")
...
> ls()
[1] "linreg1.sf" "linreg1.sm" "linreg2.sf" "linreg2.sm" "mystan"    
[6] "N"          "x"          "y"        
> print(linreg2.sf)
...
>

Just my 2 cents...

Sergio

PS: @Bob: I've disabled hyperthreading ;-)

Andrew Gelman

Dec 18, 2013, 10:18:17 AM
to stan-...@googlegroups.com
[Subject line renamed for clarity, as Bob would say]

What about allowing a single file to include several different Stan models, for example using some sort of # tag?  This would allow me to modify a model via cut and paste instead of needing to keep track of multiple files.

Sean O'Riordain

Dec 18, 2013, 10:22:57 AM
to stan-...@googlegroups.com
Personally, I'd prefer to keep the models in separate files for clarity and simplicity.  Then it comes down to creating a file naming convention which works.

Andrew Gelman

Dec 18, 2013, 10:27:57 AM
to stan-...@googlegroups.com
I think it should certainly be _allowed_ to keep models in separate files, indeed that might well be the default.  But it might also be helpful to have some sort of # option to allow multiple models in one file as well.

Sean O'Riordain

Dec 18, 2013, 10:37:05 AM
to stan-...@googlegroups.com

Currently you can do that from R by saying something like:

schools_code1 <- '
  data {
    int<lower=0> J; // number of schools 
etc...
  }
  parameters {
    real mu; 
etc...
  }
  model {
    eta ~ normal(0, 1);
    y ~ normal(theta, sigma);
  }
'

schools_code2 <- '
  data {
    int<lower=0> J; // number of schools 
etc...
  }
  parameters {
    real mu; 
etc.
  }
  model {
    eta ~ normal(0, 1);
    y ~ normal(theta, sigma);
  }
'

fit1 <- stan(model_code = schools_code1, data = schools_dat, 
            iter = 1000, chains = 4)

fit2 <- stan(model_code = schools_code2, data = schools_dat, iter = 1000, chains = 4)

Or alternatively you can have just the Stan models as giant strings in an R file and just source() the file to read the models in.

Bob Carpenter

Dec 18, 2013, 11:25:43 AM
to stan-...@googlegroups.com
I think Sean's message nicely outlines the problems we're
all having with our models and data munging evolving as we work
on a problem. Daniel and I are feeling this now in some applied
work we're doing together. Andrew --- what did you do with
BUGS/JAGS for this? Do they have any recommendations, given that
they've been playing this game for roughly two decades?

I have a very strong preference for:

* data in its own dump-format file

* models in their own files

These formats are cross-interface (CmdStan, RStan, PyStan),
so anyone on the dev team could use them, not just the R
users. This is the same reason I don't like muddling R and
LaTeX in Sweave or Python and web presentations in IPython
notebooks.

I'd also strongly prefer

* one model per file

It's just fewer moving pieces to code, test, document,
and maintain inside of Stan. I realize this goes against
R's one-function-to-rule-them-all mentality and is more
unix-like in giving you small pieces to cobble together.

I'd then like to see in separate files in the project:

* scripts to fit the models

* scripts to test the fit of models

* scripts to plot

This is just good for reproducibility and sharing with others.
As much as possible, common functions should be factored out of
these pieces for ongoing work.

These are going to be platform-specific, though. I don't want
to write a scripting language for Stan like the one for JAGS (and
maybe BUGS).

I realize tastes vary on this. I don't personally like IDEs
for the same reason --- they tangle your otherwise portable code
with a tool for compilation and browsing. It's also a chef-like
mentality of fewer gadgets and better knife skills. I work
on all sorts of coding environments, and it's easier for me to
just do everything in the shell and in emacs.

- Bob
Andrew Gelman

Dec 18, 2013, 11:26:01 AM
to stan-...@googlegroups.com
Yes, I realize this, but I also agree with Bob that it's a bit ugly to have the Stan code written as strings and embedded in the R code.  So, yes, I like Bob's method of keeping R script and Stan code separate, I just also think it would help to allow multiple Stan models in a single file.

Andrew Gelman

Dec 18, 2013, 11:27:22 AM
to stan-...@googlegroups.com
Bob
I think we might be able to come up with some solutions in January, when we're working side by side.
A