Frequently Asked Questions / Frequently Encountered Problems

Ben Goodrich

unread,
Oct 22, 2012, 5:48:30 PM10/22/12
to
This thread will be updated with further posts that illustrate common problems. Please read through it to see if the problem you are encountering is similar. Note that these posts often reflect the collective experience of the Stan developers and are not necessarily a completely original insight by the author of the post. @devs: When posting a solution to this thread, please edit the subject line to describe the problem and paste a link below.

My model does not parse correctly
My model does not compile correctly
My model compiles but does not run correctly
My model runs but I have a problem with rstan
Can Stan utilize this probability distribution?

Ben Goodrich

unread,
Oct 4, 2012, 5:02:58 PM10/4/12
to stan-...@googlegroups.com
If you have a parser error, attach the text file that fails to parse and quote the error message in your post.

Ben Goodrich

unread,
Oct 4, 2012, 8:12:05 PM10/4/12
to
If you have a compiler error, attach the text file that parses but fails to compile to a post, state the version number of the C++ compiler, and quote the compiler error(s). The version number of the g++ compiler can be found in R by executing system("g++ --version") or by executing g++ --version on the command-line. Substitute clang++ for g++ if you are using the clang compiler. If using rstan, specify verbose = TRUE in the call to stan().

Ben Goodrich

unread,
Oct 17, 2012, 11:57:15 AM10/17/12
to
If you have a runtime error, first recompile your model without optimization, which will often yield a more informative error message. In rstan, you can execute set_cppo(mode = "debug") before calling stan(). Once the problem is resolved, execute set_cppo(mode = "fast") before running your model again. If using Stan from the command line, specify O=0 in the call to make and edit the makefile in the stan/ directory to append -g to the line that starts with CFLAGS. Runtime errors are often simple typos, so doing this will often isolate the problem.

If you still do not understand the runtime error or there is no runtime error but the output seems flawed, try to make a model that is as "simple" as possible but still causes an error, where "simplicity" is associated with fewer parameters and fewer lines of code in the text file. For example, instead of a multilevel model, do a flat model. Or fix some unknown but unproblematic parameters to reasonable constants.

Then attach the simplified text file (and if using rstan the .R file that calls stan() ) to a post and quote the runtime error. Also, if at all possible, attach the data needed to run the model; simulated data is fine. If you cannot post the data to a public list, you can email it to d...@mc-stan.org and it will only be circulated to Stan developers. If you cannot get the data to us somehow, it may be impossible to diagnose a runtime error. Finally, be sure to fix the pseudo-random number generator seed that is passed to Stan, so that we see the same error that you do. If using Stan from the command-line, simply specify the otherwise optional --seed argument. Similarly, if using rstan, the stan() function can be passed a seed argument.
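
For example, a minimal rstan sketch with a fixed seed (the file and data names here are placeholders):

library(rstan)
fit <- stan(file = "my_model.stan", data = my_data, seed = 12345)

From the command line, pass the analogous --seed argument to the compiled model.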

Ben Goodrich

unread,
Oct 4, 2012, 5:07:03 PM10/4/12
to stan-...@googlegroups.com
If your model executes but you have some other problem with rstan, follow the instructions for a runtime error at

https://groups.google.com/d/msg/stan-users/4gv3fNCqSNk/FXHj6fn0SC0J

and be sure to attach the .R file that calls stan(), and quote the error message that R produces.

Ben Goodrich

unread,
Jan 6, 2013, 5:19:27 PM1/6/13
to
Stan supports many probability distributions and more are always being added. If the probability distribution you would like to use is not among the distributions Stan supports, please request it on the stan-users mailing list, so that we know what probability distributions are in demand. Also, adding a new probability distribution to Stan is relatively easy for anyone who can program in C++, but we will not cover this possibility here.

Even if you do not know any C++, you can utilize essentially any probability distribution whose density can be written in closed form by making appropriate additions to your .stan file. The variable that holds the log-posterior is called lp__ and can be modified by the .stan file. Thus, simply take the logarithm of the density and increment lp__ by the result. For more details, see the chapter entitled "Custom Probability Distributions" in the Stan reference manual.

We illustrate this process for the Kumaraswamy distribution, which is not implemented in Stan as of version 1.1.0:

https://en.wikipedia.org/wiki/Kumaraswamy_distribution

where 

f(x; a,b) = a * b * x^(a - 1) * (1 - x^a)^(b - 1) for a, b > 0 and 0 < x < 1

so

log(f(x; a,b)) = log(a) + log(b) + (a - 1) * log(x) + (b - 1) * log(1 - x^a) for a, b > 0 and 0 < x < 1

The attached Kumaraswamy.stan file estimates the parameters of this distribution in a computationally efficient fashion:

data {
  int<lower=1> N;
  real<lower=0,upper=1> x[N];
}
transformed data {
  real sum_log_x; // calculate this constant only once
  sum_log_x <- 0.0;
  for (i in 1:N)
    sum_log_x <- sum_log_x + log(x[i]);
}
parameters {
  real<lower=0> a;
  real<lower=0> b;
}
model {
  real summands[N];
  // put priors on a and b here if you want

  // log-likelihood
  lp__ <- lp__ + N * (log(a) + log(b)) + (a - 1) * sum_log_x;
  for (i in 1:N) {
    summands[i] <- (b - 1) * log1m(pow(x[i],a)); // log1m(y) := log(1 - y)
  }
  lp__ <- lp__ + sum(summands); // faster than doing inside loop
}


To verify that it is working, we use the attached Kumaraswamy.R file to draw from a Kumaraswamy distribution with shape parameters a = 3 and b = 2 and then use the rstan package to estimate the shape parameters.

stopifnot(require(rstan))

N <- 1000
a <- 3
b <- 2

x <- rbeta(N, 1, b)^(1/a) # if Z ~ beta(1, b), then Z^(1/a) ~ Kumaraswamy(a, b)
Kumaraswamy <- stan(file = "Kumaraswamy.stan", data = list(N = N, x = x), verbose = FALSE, refresh = -1)
print(Kumaraswamy)


which yields something like the following (the numbers will deviate randomly):

> print(Kumaraswamy)
Inference for Stan model: Kumaraswamy.
4 chains: each with iter=2000; warmup=1000; thin=1; 2000 iterations saved.

      mean se_mean  sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
a      2.9       0 0.1   2.7   2.9   2.9   3.0   3.1   766    1
b      1.9       0 0.1   1.8   1.9   1.9   2.0   2.1   787    1
lp__ 282.4       0 1.0 279.7 282.1 282.7 283.1 283.4   683    1

Samples were drawn using NUTS2 at Sat Oct  6 16:00:30 2012.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).


If you need help with this process, please start a new thread on stan-users and include a link to the density you are interested in. Also, if you successfully utilize some density this way, please start a new thread on stan-users, include a link to the density, and attach your .stan file. We might have some suggestions as to how you can compute the density more efficiently in Stan, and other Stan users will be able to use your example.
Kumaraswamy.stan
Kumaraswamy.R

Ben Goodrich

unread,
Nov 28, 2013, 1:41:37 PM11/28/13
to
While maximizing speed is often a model-specific question, here are a few general points. This post will be updated in the future with more tricks, so be sure to check back regularly.
  • The relevant metric is the ratio of effective sample size to time, not the ratio of iterations to time.
Samples from the joint posterior distribution of the parameters are not independent. Hence, the effective sample size is always less than the total number of samples from the posterior distribution. One of the goals of Hamiltonian Monte Carlo (HMC), as implemented in Stan, is to return a reasonable number of effective samples from the posterior distribution of interest with fewer total iterations than would be necessary for a Metropolis-Hastings or Gibbs sampler. Doing so often entails that Stan take more time to complete each iteration than a Metropolis-Hastings or Gibbs sampler would because (among other reasons) Stan is using a greedy leapfrog algorithm and is calculating a gradient via auto-differentiation.
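
For example, a minimal rstan sketch of the relevant comparison (assuming a stanfit object named posterior and an rstan version that provides get_elapsed_time()):

ess <- summary(posterior)$summary[, "n_eff"] # effective sample size per parameter
secs <- sum(get_elapsed_time(posterior))     # total warmup plus sampling seconds
min(ess) / secs                              # worst-case effective samples per second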
  • Utilize full compiler optimization
This is an essentially free way to speed up your code, although it comes at the cost of somewhat longer compilation times. If you are using Stan from the command line, simply specify O=3 (that is an "Oh", not a "zero") in the call to make, as in

make O=3 /path/to/my/dotstanfile_without_extension

If you are using rstan, execute set_cppo(mode = "fast") or manually set up a Makevars file (see http://cran.r-project.org/doc/manuals/R-admin.html#Customizing-package-compilation or https://code.google.com/p/stan/wiki/RStanGettingStarted ).
  • There is no need to speculate as to what part of your code is slow (as long as you are not using Windows); measure it.
If using rstan, see

http://cran.r-project.org/doc/manuals/R-exts.html#Profiling-compiled-code

To utilize these tools, it is necessary to know the location of the "shared object". This lives in a temporary directory and can be found by executing dir(tempdir()) in R. Look for the file that ends in .so. If there is more than one such file (because you have estimated more than one model), you can find the temporary file name by executing get_stanmodel(posterior)@dso@dso_filename in R, where posterior is the name of the object produced by stan().
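
As a minimal consolidated sketch (posterior is again the object produced by stan()):

dir(tempdir())                              # look for the file ending in .so
get_stanmodel(posterior)@dso@dso_filename   # exact path when several models were compiled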
  • Do not loop over sampling statements when a vectorized sampling statement is possible

Many (but not all in any given release) density functions accept a vector on the left-hand side and either vector or scalar arguments. One such distribution is the normal. In other words, this BUGS-like code fragment is unnecessarily slow in Stan

real mu[N];
for (i in 1:N) {
  mu[i] <- x[i] * beta;
  y[i] ~ normal(mu[i], sigma);
}

while this is faster

vector[N] mu;
mu <- X * beta; // presuming X is a N x K matrix
y ~ normal(mu, sigma);

The vectorized version is faster for two reasons. The first is that matrix algebra operations --- such as the product of a matrix and a vector --- are potentially faster than the equivalent scalar algebra operations because the former lend themselves to more optimization by the auto-differentiation mechanism. The second reason is the vectorization of the sampling statement. Vectorization is so much faster that it is worth pulling the normal() line out of the loop even if you must use a loop to construct a mu vector, as in a multilevel model such as

real mu[N];
for (i in 1:N) {
  mu[i] <- x[i] * beta + RandomIntercept;
}
y ~ normal(mu, sigma);
  • Consider reparameterizations of your model, such as the "Matt trick"
Many multilevel models involve random coefficients that are functions of hyperpriors. However, these models can lead to posterior distributions that are highly correlated. An alternative is to reparameterize in terms of a standardized variable and create a transformed parameter or local parameter to use in the log-likelihood. For example, if a theoretical model can be operationalized as

parameters {
  vector[K] beta[J];
  vector[K] mu;         // hier prior loc
  real<lower=0> tau[K]; // hier prior scale
}
model {
  for (j in 1:J) beta[j] ~ normal(mu, tau);
  // likelihood as a function of beta
}

it can be reoperationalized as a shift-and-scale of standard normals

parameters {
  vector[K] e_beta[J];  // errors in beta
  vector[K] mu;         // hier prior loc
  real<lower=0> tau[K]; // hier prior scale
}
transformed parameters {
  vector[K] beta[J];    // intercept + slopes
  for (k in 1:K)
    for (j in 1:J)
      beta[j,k] <- mu[k] + e_beta[j,k] * tau[k];
}
model {
  for (j in 1:J) e_beta[j] ~ normal(0, 1); // standard normal prior implies beta ~ normal(mu, tau)
  // proper priors on mu and tau
  // likelihood as a function of beta, NOT e_beta
}

Doing so makes the posterior distribution of e_beta less correlated with mu and tau and thereby should increase the effective sample size. Note that it may be necessary to place at least weakly informative priors on mu and tau, but there are many ways of doing so.
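
A quick way to see why the reparameterization is valid is the location-scale identity

if e ~ normal(0, 1), then mu + tau * e ~ normal(mu, tau)

so sampling e_beta and transforming reproduces exactly the same prior on beta; only the posterior geometry that the sampler sees changes.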

The same can be done with a multivariate normal parameter. Instead of parameterizing it as a multivariate normal with mean vector mu and covariance matrix Sigma like

parameters {
  vector[K] beta[J];
  vector[K] mu;
  cov_matrix[K] Sigma;
}
model {
  for (j in 1:J) {
    beta[j] ~ multi_normal(mu, Sigma);
  }
  Sigma ~ wishart(some_df, some_Scale);
  // prior on mu
  // likelihood as a function of beta
}

reparameterize it in terms of a multivariate normal with mean vector zero and correlation matrix Tau and a transformation thereof like

transformed data {
  vector[K] zero_vector;
  for (k in 1:K) {
    zero_vector[k] <- 0;
  }
}
parameters {
  vector[K] e_beta[J];
  vector[K] mu;
  corr_matrix[K] Tau;
  real<lower=0> tau[K]; // standard deviations
}
transformed parameters {
  vector[K] beta[J];
  for (k in 1:K) {
    for (j in 1:J) {
      beta[j,k] <- mu[k] + e_beta[j,k] * tau[k]; // thanks to Mike Lawrence for catching typo
    }
  }
}
model {
  for (j in 1:J) {
    e_beta[j] ~ multi_normal(zero_vector, Tau);
  }
  // Note: eta = 1.0 implies a uniform prior on Tau, so omit it; uncomment for eta != 1.0
  //   Tau ~ lkj_corr(1.0);
  // proper priors on mu and tau
  // likelihood as a function of beta, not e_beta
}

The Matt trick is an option for any distribution in the location-scale family

https://en.wikipedia.org/wiki/Location-scale_family

not just the normal distribution. There is more discussion of the Matt trick in the chapter of the manual entitled "Optimizing Stan Code".

If these tricks are insufficient, please start a new thread on stan-users and be sure to attach your .stan file.

Bob Carpenter

unread,
Oct 12, 2012, 9:03:17 AM10/12/12
to stan-...@googlegroups.com
I think we need to make that warning message even clearer.
It's not fatal and the chains still satisfy detailed balance
so the sample is still drawn properly from the posterior.

Because we're using a discrete simulation of the Hamiltonian
dynamics that follows the gradient for a fixed time rather
than the actual log probability function's curvature, we can
get simulation errors. When these result in domain errors,
like trying to provide an illegal argument to a probability
function, then we issue a warning and reject using a Metropolis
adjustment (running out of bounds is equivalent to having
zero probability).

If we start in a place that has domain errors (like if
we're given bad inits), a different warning is printed.

- Bob

On 10/12/12 12:33 AM, Mike Lawrence wrote:
> The line:
>
> beta[j,k] <- mu[k] + e_beta[j,k] * sigma[k];
>
> should be
>
> beta[j,k] <- mu[k] + e_beta[j,k] * tau[k];
>
> yes?
>
>
> Also, when I attempted to implement the efficient multivariate example, I received the following warnings occasionally
> during chain computation:
>
> Warning (non-fatal rejection): Error in function stan::prob::multi_normal_log(i): Covariance matrix is not positive
> definite (0)
>
>
> On Friday, October 12, 2012 12:15:52 AM UTC-3, Ben Goodrich wrote:
> [snip --- the full speed-tricks post, quoted above in its corrected form]

Ben Goodrich

unread,
Oct 23, 2012, 8:09:45 PM10/23/12
to
One topic that is seen a lot on stan-users is when and how the support of a parameter should be restricted and what that implies for how the prior on that parameter should be specified. This necessarily long post will try to make this process clearer.

In the beginning (which you can think of as some time before Stan existed), the parameter space is all real numbers. When declaring a parameter in the parameters {} block of a .stan file, you can (but are not required to) restrict the support of a parameter to some subset of the real numbers. We will use the running example of a bivariate normal model. For example,

parameters {
  real mu[2];                 // two unrestricted means
  real<lower=0> sigma[2];     // two nonnegative standard deviations
  real<lower=-1,upper=1> rho; // one correlation between -1 and 1
}

These declarations ensure that Stan will never generate a proposal that is outside the support of a parameter (although you may get numerically unstable behavior at the endpoints). In other words, by restricting the support of a parameter, you have already partially specified a prior on that parameter: namely, that there is no prior mass outside its support. You have not yet said anything explicit about the distribution of prior mass over its support, and the absence of an explicit statement implies a uniform prior over the support.

A common mistake in .stan files is the failure to restrict the support of a parameter in the parameters {} block and to then attempt to restrict the support of a parameter with a prior in the model {} block. For example, writing the following is not good

parameters {
  real mu[2];    // two unrestricted means
  real sigma[2]; // two "standard deviations", not restricted to be nonnegative
  real rho;      // one "correlation", not restricted to be between -1 and 1
}
model {
  sigma[1] ~ SomePositivePrior();
  sigma[2] ~ SomePositivePrior();
  rho ~ uniform(-1.0, 1.0);
  // more stuff
}

Such a model will probably spew a lot of warnings at runtime and may fail entirely. Even if it happens to run to completion, you should not trust the results. Even if you trust the results, it would be much better to write the parameters {} and model {} blocks properly.

The reason is that, when a prior puts zero mass on some subset of the support, Stan may generate parameter proposals where there is no posterior mass. Such proposals must be rejected, so the sampling is at best computationally inefficient. Worse, suppose there is no posterior mass at the initial value for some parameter, which is likely because initial values are by default drawn uniformly from the [-2,2] interval on the unconstrained scale before being mapped to the support of the parameter. In that case, Stan may never generate an admissible proposal for that parameter. Remember that in a Hamiltonian Monte Carlo (HMC) algorithm like Stan's, the gradient of the log posterior is a very important ingredient for how the Markov chain evolves. If there is no posterior mass at a point in the parameter space, the partial derivative of the log posterior with respect to that parameter is either zero or undefined, depending on how you look at it. In other words, proposals where there is no posterior mass carry no information about which direction the parameters should move.

Thus, we have a general rule in Stan: An explicit prior should place positive (but perhaps arbitrarily small) mass over the entire support of a parameter.

The best way to specify the parameters {} and model {} blocks would be

parameters {
  real mu[2];                 // two unrestricted means
  real<lower=0> sigma[2];     // two nonnegative standard deviations
  real<lower=-1,upper=1> rho; // one correlation between -1 and 1
}
model {
  mu[1] ~ SomeUnrestrictedPrior();
  mu[2] ~ SomeUnrestrictedPrior();
  sigma[1] ~ SomePositivePrior();
  sigma[2] ~ SomePositivePrior();
  // implicit: rho ~ uniform(-1.0, 1.0);
  // more stuff
}


In the above, I have specified restrictions on the support of the parameters where appropriate, added explicit priors on mu[1] and mu[2], and commented out the uniform prior on rho. There are no restrictions on the support of mu[1] and mu[2], and explicit priors on them are a modeling choice. Stan permits improper priors --- such as implicit or explicit priors that are unbounded on one or both sides --- but the resulting posterior distribution may be improper and thus unsuitable for inference. It is, in general, difficult to determine analytically when an improper prior will result in an improper posterior, though Stan may throw an error message to that effect. By putting a proper prior on mu[1] and mu[2], even a diffuse one such as a Cauchy, the posterior distribution is guaranteed to be proper.

Why not state the uniform prior on rho explicitly? First, leaving it implicit emphasizes the role that restricting the support of a parameter plays. Second, explicitly calling the uniform(-1.0, 1.0) function wastes (a small amount of) computational time. Some might say that the model is clearer when all priors are stated explicitly, but a commented-out prior is just as clear. While the time wasted evaluating a uniform density is small for a univariate parameter, try doing this (but save all your open files beforehand because your computer might crash):

transformed data {
  int<lower=1> K;
  real<lower=0> eta;
  K <- 10000;
  eta <- 1.0;
}
parameters {
  corr_matrix[K] Sigma; // one correlation matrix of order K
}
model {
  Sigma ~ lkj_corr(eta); // jointly uniform prior over valid correlation matrices of order K
  // more stuff
}

In this case, by declaring Sigma to be a correlation matrix, you restrict its support to the space of admissible --- symmetric, positive semi-definite, unit diagonal --- correlation matrices of order K. Thus, Stan will generate no proposals for Sigma that are asymmetric, indefinite, or have values along the diagonal other than unity. If nothing else were stated, the prior over Sigma would be uniform over the space of admissible correlation matrices of order K, but in the example above I foolishly went ahead and made this prior explicit using the lkj_corr density with eta = 1.0. Doing so will probably exhaust the available RAM on your computer. Although the explicit prior implies that all admissible correlation matrices are equally likely, the lkj_corr function has to check whether Sigma is, in fact, an admissible correlation matrix, even though in this case it is admissible by construction. To check whether a matrix is positive semi-definite, Stan has to do a matrix decomposition, such as an eigenvalue decomposition or, in practice, a three-factor Cholesky decomposition, which are both O(K^3) operations that have absolutely no effect on the log posterior in this case. If eta were not 1.0, then the prior would not be uniform, and it would be necessary to write it explicitly. But in this case, the lkj_corr(1.0) density evaluates to a constant.

Thus, we have a general rule in Stan: Do not waste computational time calculating constants.

Another source of confusion is restrictions on the support of a transformation of a parameter. For example,

parameters {
  real mu[2];                 // two unrestricted means
  real<lower=0> sigma[2];     // two nonnegative standard deviations
  real<lower=-1,upper=1> rho; // one correlation between -1 and 1
}
transformed parameters {
  real<lower=0> variance1;
  variance1 <- pow(sigma[1], 2.0);
}
model {
  real precision2;
  precision2 <- 1.0 / pow(sigma[2], 2.0);
  mu[1] ~ SomeUnrestrictedPrior();
  mu[2] ~ SomeUnrestrictedPrior();
  sigma[1] ~ SomePositivePrior();
  sigma[2] ~ SomePositivePrior();
  // implicit: rho ~ uniform(-1.0, 1.0);
  // more stuff that depends on variance1 and precision2
}


In this example, I explicitly wrote that variance1 is required to be nonnegative but did not explicitly write that precision2 is required to be nonnegative, which seems inconsistent. However, the restrictions on the support of a transformed parameter in the transformed parameters {} block do not affect the sampling because Stan samples from the space of the parameters (actually it doesn't, but don't worry about that detail). Thus, it is not absolutely required to write logical restrictions on the support of a transformation of a parameter, but it is good practice because Stan will check that the transformation respects the support. In other words, when Stan squares sigma[1] to get variance1 and the restrictions on the support of variance1 are explicitly stated, it checks that variance1 is nonnegative, which wastes a small amount of computational time but will yield a more informative error message if you made a mistake in the .stan code that caused variance1 to be negative. However, these checks are not performed on a transformation of a parameter in the model {} block, so it would be a parser error to attempt to restrict the support of precision2 in the example above. To ensure checking on precision2, you would need to move its declaration and definition to the transformed parameters {} block.

[Note: Original post had some errors below that have now been corrected thanks to Jiqiang]

So, among the most common defects in .stan files are failing to restrict the support of a parameter and explicitly writing priors that are uniform over the support of a parameter. A third mistake is less common but is very bad: Writing an explicit prior that places non-constant mass outside the support of a parameter. For example,

parameters {
  real mu[2];                    // two unrestricted means
  real<lower=0> sigma[2];        // two nonnegative standard deviations
  real<lower=-1,upper=1> rho;    // one correlation between -1 and 1
  real<lower=-1,upper=1> mu_rho; // unknown mean for rho
}
model {
  mu[1] ~ SomeUnrestrictedPrior();
  mu[2] ~ SomeUnrestrictedPrior();
  sigma[1] ~ SomePositivePrior();
  sigma[2] ~ SomePositivePrior();
  rho ~ normal(mu_rho, 0.5);
  // implicit: mu_rho ~ uniform(-1.0, 1.0);
  // more stuff
}


In this case, rho is restricted to the (-1,1) interval but has an explicit normal prior. Since the normal distribution is unbounded on both sides, it puts some mass below -1 and above 1, even though Stan cannot make such a proposal for rho. But the exact amount of mass outside the admissible interval depends on the unknown parameter mu_rho. Thus, this example requires a truncated normal(mu_rho,0.5) prior so that there is no mass outside the support:

parameters {
  real mu[2];                    // two unrestricted means
  real<lower=0> sigma[2];        // two nonnegative standard deviations
  real<lower=-1,upper=1> rho;    // one correlation between -1 and 1
  real<lower=-1,upper=1> mu_rho; // unknown mean for rho
}
model {
  mu[1] ~ SomeUnrestrictedPrior();
  mu[2] ~ SomeUnrestrictedPrior();
  sigma[1] ~ SomePositivePrior();
  sigma[2] ~ SomePositivePrior();
  rho ~ normal(mu_rho, 0.5) T[-1.0,1.0];
  // implicit: mu_rho ~ uniform(-1.0, 1.0);
  // more stuff
}


where I have used the T[] notation to enforce the truncation. Not all distributions in Stan currently allow truncation but the normal does.

There is one common but special case that appears to be (but actually isn't) an exception to the general rule that truncation is necessary when the prior places non-zero mass outside the support of the parameter. If the amount of truncated mass is a constant, then using a truncated density is legal but should be avoided because truncation only affects the log-posterior by a constant amount. In the above example, if mu_rho were known, then explicit truncation would not be necessary. A more common example is

parameters {
  real mu[2];                 // two unrestricted means
  real<lower=0> sigma[2];     // two nonnegative standard deviations
  real<lower=-1,upper=1> rho; // one correlation between -1 and 1
}
model {
  mu[1] ~ cauchy(0.0, 5.0);    // a diffuse prior over the whole real line
  mu[2] ~ cauchy(0.0, 5.0);    // ditto
  sigma[1] ~ cauchy(0.0, 5.0); // explicit truncation is not necessary
  sigma[2] ~ cauchy(0.0, 5.0); // ditto
  // implicit: rho ~ uniform(-1.0, 1.0);
  // more stuff
}


Here the support of sigma[1] and sigma[2] is restricted to nonnegative numbers, but I have placed cauchy(0.0, 5.0) priors on them, which put half the prior mass on negative numbers. This appears to be an error, but the saving grace is that truncation in this case would simply double the remaining mass. Stan operates on the log posterior, so explicit truncation would only add log(2.0) and hence is covered by the general rule of not wasting time calculating constants. In other words, if the support of a parameter is the nonnegative numbers, then

cauchy(0.0, 5.0) T[0.0,] ∝ cauchy(0.0, 5.0)

However, if the parameters to the cauchy distribution were unknown or if the truncation occurred at unknown points, then explicit truncation would be necessary.
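
For example, a minimal sketch of a case that does require T[] (here m is a hypothetical unknown location parameter, and this assumes the cauchy density supports truncation in your version of Stan):

parameters {
  real<lower=0> sigma;
  real m; // hypothetical unknown location
}
model {
  m ~ normal(0.0, 1.0);           // some proper prior on m
  sigma ~ cauchy(m, 5.0) T[0.0,]; // the mass below zero depends on m, so truncation is required
}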

To summarize,
  • restrict the support of parameters when appropriate; if no prior is specified, it is implicitly uniform over the parameter's support
  • do not explicitly write uniform priors; comment them out
  • be aware that an implicit or explicit improper prior may result in an improper posterior
  • if you want to impose an informative prior, it should put non-zero (but perhaps arbitrarily small) mass over the entire support of that parameter
  • if an informative prior puts non-zero mass outside the support of a parameter, you have to truncate the prior in theory
  • it is not necessary to explicitly write the truncation when the amount of truncated mass is a constant

If you have additional questions on this topic, please start a new thread on stan-users and paste in the relevant parts of your .stan file.

Hang Zeng

unread,
Jun 11, 2014, 11:00:31 AM6/11/14
to stan-...@googlegroups.com
Hi Stan development team members,
My model compiles and it runs, but it runs very slowly and stays at "Iteration: 1/10000 [  0%]  (warmup)". Then I stop it and check the results. My initial values are reasonable, but they change a lot after several iterations. I think this is the reason it runs so slowly. How can I improve it?
Thank you so much.

Bob Carpenter

unread,
Jun 11, 2014, 11:07:33 AM6/11/14
to stan-...@googlegroups.com
Not much we can say if you don't share the model itself.
There's a chapter in the manual on efficiency. Another usual
problem is missing constraints---the support in the model
needs to match the declared constraints.

Usually you won't need 10,000 iterations --- Stan converges
quickly and then mixes well for many problems. We sometimes
run as few as 100 iterations.

- Bob
> --
> You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Ben Goodrich

unread,
Jun 11, 2014, 11:50:39 AM6/11/14
to stan-...@googlegroups.com
Please start a new thread on stan-users and include your model.

Ben

Hang Zeng

unread,
Jun 11, 2014, 12:53:30 PM6/11/14
to stan-...@googlegroups.com
Thank you so much. And I am sorry, I didn't know how to post a thread before.
Now I know, and I will post my model in a new thread. Thank you…

Ben Lambert

unread,
Apr 23, 2015, 4:36:44 PM4/23/15
to stan-...@googlegroups.com
Hi,

I am a new user to Stan (very excited to try it out!), and so forgive me if this is a simple question.

I have a parser error in rstan. I get the following, which is in itself quite self-explanatory, but I would like to keep the vectorised code I have developed so far (rather than find a slower workaround):

"arguments to ^ must be primitive (real or int); cannot exponentiate real by vector"

The problem line in the code is: 

for (i in 1:K) {
    segment(Y,Pos[i],S[i]) ~ poisson(R[i]*psi[i]*(1-mu[i])^segment(t,Pos[i],S[i]));
  }

So it seems that in this case the operator ^ cannot be vectorised when the base is a real and the exponent a vector. Is that correct?

I attach my code, the data, and an OpenBUGS schematic of the problem. I have tried a few workarounds, like doing a loop over all observations in the transformed parameters block:

real lambda[N];
for (i in 1:K)
{
    for (j in 1:S[i])
    {
        lambda[Pos[i] + j - 1] <- R[i]*psi[i]*(1-mu[i])^t[Pos[i] + j - 1];
    }
}

But this runs into other issues, where I am told that the variable "real" does not exist. Also, I would prefer a vectorised version, rather than the loop right above.

Does anyone have any ideas as to how I can remedy this, in an efficient (ie vectorised) manner?

Best,

Ben
singleRelease.stan
singleReleaseData.R
s_singleReleaseDoodle.odc

Bob Carpenter

unread,
Apr 23, 2015, 10:15:37 PM4/23/15
to stan-...@googlegroups.com

> On Apr 24, 2015, at 6:36 AM, Ben Lambert <ben.c....@googlemail.com> wrote:
>
> Hi,
>
> I am a new user to Stan (very excited to try it out!), and so forgive me if this is a simple question.
>
> I have a parser error in rstan. I get the following, which is in itself quite self-explanatory, but I would like to keep the vectorised code I have developed so far (rather than find a slower workaround):

Vectorization for a pointwise operation like ^ isn't going to speed
things up.

> "arguments to ^ must be primitive (real or int); cannot exponentiate real by vector"
>
> The problem line in the code is:
>
> for (i in 1:K) {
> segment(Y,Pos[i],S[i]) ~ poisson(R[i]*psi[i]*(1-mu[i])^segment(t,Pos[i],S[i]));
> }
>
> So it seems that in this case the operator ^ is not able to be vectorised if the base is a real, and the exponent a vector. Is that correct?

Correct --- operator^ isn't vectorized.

But defining a local variable and writing the values into it wouldn't
be any slower --- just less convenient to write.

>
> I attach my code, the data, and a OpenBUGS schematic of the problem. I have tried a few workarounds, like doing a loop over all observations in the transformed parameters block:
>
> real lambda[N];
> for (i in 1:K)
> {
> for (j in 1:S[i])
> {
> lambda[Pos[i] + j - 1] <- R[i]*psi[i]*(1-mu[i])^t[Pos[i] + j - 1];
> }
> }
>
> But this runs into other issues, where I am told that the variable "real" does not exist. Also, I would prefer a vectorised version, rather than the loop right above.

You can only define local variables at the top of a block.

> Does anyone have any ideas as to how I can remedy this, in an efficient (ie vectorised) manner?

Although not vectorized, the following is just as efficient:

{
real lambda[S[i]];
for (j in Pos[i]:(Pos[i] + S[i] - 1))
lambda[j - Pos[i] + 1] <- R[i] * psi[i] * (1 - mu[i])^t[j];
segment(Y, Pos[i], S[i]) ~ poisson(lambda);
}

You'll need those braces, which allow you to define lambda as a local
variable. I'd worry about getting things working before worrying about
efficiency, so you could always start with this:

for (j in Pos[i]:(Pos[i] + S[i] - 1))
Y[j] ~ poisson(R[i] * psi[i] * (1 - mu[i])^t[j]);

It shouldn't be too much slower.

You'll want to double-check my bounds calculations, of course.

And if there's an outer loop, then caching R[i] * psi[i] in a local
variable like R_times_psi would help

for (i in ...) {
real R_times_psi;
R_times_psi <- R[i] * psi[i];
for (j in ...)
...
}

Anything that cuts down on repeated operations will save time.

- Bob

Ben Lambert

unread,
Apr 25, 2015, 9:04:54 AM4/25/15
to stan-...@googlegroups.com
Hi Bob,

That's great - that did the trick. Many thanks for your help here!

Best,

Ben


--
You received this message because you are subscribed to a topic in the Google Groups "Stan users mailing list" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/stan-users/4gv3fNCqSNk/unsubscribe.
To unsubscribe from this group and all its topics, send an email to stan-users+...@googlegroups.com.

Ben Lambert

unread,
Apr 30, 2015, 8:29:53 PM4/30/15
to stan-...@googlegroups.com, ben.l...@some.ox.ac.uk
Hello again,

Sorry to ask a question, but I am having an issue with a program that yields the following error:

"Initialization between (-2, 2) failed after 100 attempts. 
Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model."

The variable I am having an issue with is the following, for which I do not assign an explicit prior (relying on Stan's implicit uniform prior):

real<lower=1,upper=10> beta[K];

When I reduce the range of constraints on this variable to, for example:

real<lower=1,upper=3> beta[K];

I no longer have the issue; however, this is prohibitive for the model I am building (the fitted MLEs often range up to about 10).

A further issue I have is that, I want a hyperprior to sit above all the beta parameters, so something of the form:

beta ~ gamma(a,b);
a ~ cauchy(0,2.5);
b ~ cauchy(0,2.5);

However, when I try to implement this, running stan using rstan, I run into the same issue as above. I tried to attach my code, but it isn't working, so have put it underneath (sorry).

I have tried the following: transforming the parameter beta, to re-specify the model in terms of: beta[i] <- log(zeta[i]); Then specifying a relatively weak normal prior on zeta. I have tried this formulation with, and without placing explicit limits on beta.

Best,

Ben

data {

  ## Single release parameters
  int<lower=0> N; ## Total number of observations
  int<lower=0> K; ## Total number of groups
  int Y[N]; ## Vector of observations for the release
  int S[K]; ## Array of ints for the group sizes
  vector[N] t; ## Vector of time observations (days) 
  int Pos[K]; ## Array of the starting position of each dataset
  
  ## Multiple release parameters
  int RelFreq[K]; ## Array of the frequency of releases in each study
  int PosRel[K]; ## Array of the starting position of releases in each study in RelNumber vector
  int<lower=0> NReleases; ## Total number of releases across all studies
  vector[NReleases] RelNumber; ## Vector of the numbers released across each study
  vector[NReleases] RelTime; ## Release time of each release in each study. For the majority of these studies, this will simply be '0', corresponding to a single release
}

parameters {
  real mPsi;
  real<lower=0> sigmaPsi;
  real<lower=0,upper=1> psi[K];
  real mMu;
  real<lower=0> sigmaMu;
  real<lower=0,upper=1> mu[K]; 
  
  ## This is the variable with which I am having an issue
  real<lower=0.3,upper=2> beta[K];
}

transformed parameters {
  real phi[K];
  real eta[K];
  for (i in 1:K)
  {
phi[i] <- logit(psi[i]);
eta[i] <- logit(mu[i]);
  }
}
 
model {
  for (i in 1:K) {
    real lambda[S[i]];
    if (RelFreq[i] < 2) { ## Single release
      real R_times_psi;
      R_times_psi <- RelNumber[PosRel[i]] * psi[i];
      for (j in 1:S[i]) {
        lambda[j] <- R_times_psi * (1 - mu[i])^((t[Pos[i] + j - 1])^beta[i]);
        Y[Pos[i] + j - 1] ~ poisson(lambda[j]);
      }
    }
    else { ## Multiple release
      for (j in 1:S[i]) {
        real lambdaTemp;
        lambdaTemp <- 0;
        for (kk in 1:RelFreq[i]) {
          if (t[Pos[i] + j - 1] > RelTime[PosRel[i] + kk - 1]) {
            lambdaTemp <- lambdaTemp + psi[i] * RelNumber[PosRel[i] + kk - 1] * (1 - mu[i])^((t[Pos[i] + j - 1] - RelTime[PosRel[i] + kk - 1])^beta[i]);
          }
        }
        lambda[j] <- lambdaTemp;
        Y[Pos[i] + j - 1] ~ poisson(lambda[j]);
      }
    }
  }
  eta ~ normal(mMu, sigmaMu);
  phi ~ normal(mPsi, sigmaPsi);
  mMu ~ normal(0, 1);
  mPsi ~ normal(0, 1);
  sigmaMu ~ inv_gamma(1.5, 1);
  sigmaPsi ~ inv_gamma(1.5, 1);
}

Bob Carpenter

unread,
May 3, 2015, 3:06:41 AM5/3/15
to stan-...@googlegroups.com

> On Apr 30, 2015, at 8:29 PM, Ben Lambert <ben.c....@googlemail.com> wrote:
>
> Hello again,
>
> Sorry to ask a question, but I am having an issue with a program that yields the following error:
>
> "Initialization between (-2, 2) failed after 100 attempts.
> Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model."
>
> The variable I am having an issue with is the following, for which I do not assign a prior (implicitly using STAN's implicit improper uniform prior)
>
> real<lower=1,upper=10> beta[K];

This will get a proper uniform prior on (1,10). The default
prior is uniform over values satisfying the constraint.

The uniform(-2,2) inits on the unconstrained scale are inverse logit
transformed to (0,1) and then scaled and shifted to (1,10). The inits
will be centered around the midpoint --- 5.5.
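
Worked out explicitly (u is the unconstrained value; numbers rounded):

beta = 1 + (10 - 1) * inv_logit(u),  with u drawn from uniform(-2, 2)
inv_logit(0) = 0.5,  so the midpoint init is beta = 1 + 9 * 0.5 = 5.5
inv_logit(-2) ≈ 0.12 and inv_logit(2) ≈ 0.88,  so inits fall roughly in (2.07, 8.93)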

> When I reduce the range of constraints on this variable to, for example:
>
> real<lower=1,upper=3> beta[K];
>
> I no longer have the issue,

This says that if you generate random values in (1,10), you go
outside support somewhere that (1,3) doesn't.

Was there any more information from the output?

It does look like this:

> lambda[j]<- R_times_psi*(1 - mu[i])^((t[Pos[i] + j - 1])^beta[i]);

may be prone to overflow or underflow with more extreme values of beta.
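
One way to reduce that risk (a sketch using the variables from the model above, not from the original reply) is to compute the power on the log scale with log1m(), since (1 - mu)^x = exp(x * log1m(mu)):

lambda[j] <- R_times_psi * exp(pow(t[Pos[i] + j - 1], beta[i]) * log1m(mu[i]));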

> however this is prohibitive to the model I am building (the fitted MLEs are often up until about 10).
>
> A further issue I have is that, I want a hyperprior to sit above all the beta parameters, so something of the form:
>
> beta ~ gamma(a,b);
> a ~ cauchy(0,2.5);
> b ~ cauchy(0,2.5);
>
> However, when I try to implement this, running stan using rstan, I run into the same issue as above. I tried to attach my code, but it isn't working, so have put it underneath (sorry).

Were a and b declared to be positive?

- Bob

P.S. Thanks for the careful writeup!
> You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.

Ben.c.lambert

unread,
May 3, 2015, 11:30:57 AM5/3/15
to stan-...@googlegroups.com
Hi Bob,

Many thanks for your answers. I am not near my computer at the moment, but I suspect that the most logical thing to do would be to generate initial random values within the support of [1,3] in untransformed space for this particular variable (I know I also need to generate random initial values for the other variables as well, if I do this). I will try this, and see if this does the trick.

Also, I saw on the STAN site that you are looking for developers to help. I would very much like to pitch in, if I can? Is there a protocol for getting involved? Excited to read that Riemannian manifold HMC is on the horizon!

Best,

Ben

David Manheim

unread,
Jul 29, 2015, 5:38:36 PM7/29/15
to Stan users mailing list, goodri...@gmail.com
It seemed like Rtools wasn't working, even though it was installed.

(I figured out how to fix this, and wanted to post it to the list anyway to ensure others searching for the problem can find it.)

I have a model which validates and generates C++ code using stanc (stanc(model_code = stanmodelcode, model_name = "modelname")), but I could not compile it on my machine.

After drafting a message and then figuring it out: the issue is that the path to Rtools was not actually added by the Rtools installer, despite my asking it to do so. This may have been a permissions issue when it was installed.

To test this, I can try to run the example from the R package:
library(rstan)
scode <- "
parameters {
  real y[2];
}
model {
  y[1] ~ normal(0, 1);
  y[2] ~ double_exponential(0, 2);
}
"
fit1 <- stan(model_code = scode, iter = 10, verbose = FALSE)

I then get the error:

Error in compileCode(f, code, language = language, verbose = verbose) : 
  Compilation ERROR, function(s)/method(s) not created! Warning message:
running command 'make -f "C:/PROGRA~1/R/R-32~1.1/etc/x64/Makeconf" -f "C:/PROGRA~1/R/R-32~1.1/share/make/winshlib.mk" SHLIB_LDFLAGS='$(SHLIB_CXXLDFLAGS)' SHLIB_LD='$(SHLIB_CXXLD)' SHLIB="file27747df31792.dll" WIN=64 TCLBIN=64 OBJECTS="file27747df31792.o"' had status 127 
In addition: Warning messages:
1: running command '"C:/PROGRA~1/R/R-32~1.1/bin/x64/R" CMD config CXX' had status 1 
2: running command 'C:/PROGRA~1/R/R-32~1.1/bin/x64/R CMD SHLIB file27747df31792.cpp 2> file27747df31792.cpp.err.txt' had status 1 


Compiler version:
> system("g++ --version")
g++.exe (GCC) 4.7.0 20111220 (experimental)
Copyright (C) 2011 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

FIX:
On Windows, adding directories to the path can be done as described here: http://www.computerhope.com/issues/ch000549.htm
The directories that need to be added are both the Rtools\gcc-4.6.3\bin directory and the Rtools\bin directory.

Cheers,
David Manheim

Bob Carpenter

unread,
Jul 29, 2015, 7:00:52 PM7/29/15
to stan-...@googlegroups.com, goodri...@gmail.com
Thanks for sharing. (And for those of you reading this in the future, the
path to gcc from Rtools may change with later Rtools releases, though
it looks like they're still struggling with later gcc versions and
C++11).

- Bob

Avraham Adler

unread,
Jul 30, 2015, 2:05:34 AM7/30/15
to Stan users mailing list, ca...@alias-i.com, goodri...@gmail.com, ca...@alias-i.com
That's interesting; every time I've installed Rtools using the .exe hosted at <https://cran.r-project.org/bin/windows/Rtools/> it has always asked me if I wanted to edit the PATH. The fact that your version was 4.7 was a giveaway, since Rtools is stuck at 4.6.3 prerelease. How did you install Rtools?

As for updates, last I spoke with some of the developers, there is still some slow work on building a 4.9.3 toolchain, but I wouldn't expect anything soon.

Avi

Vijay Desai

unread,
Aug 10, 2015, 4:15:16 PM8/10/15
to Stan users mailing list

Hi all,
I am using pystan package. I am getting parse error for the attached code. The error message is in error.txt. Can somebody please tell me what I am doing wrong?

Thanks,
Vijay
code.py
error.txt

Daniel Lee

unread,
Aug 10, 2015, 4:58:41 PM8/10/15
to stan-...@googlegroups.com
Assignment in the Stan language is "<-", not "=". It's borrowed from BUGS / R notation.
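
For example, in the Stan syntax of the versions discussed in this thread:

real m;
m <- 0.5 * (a + b); // correct: assignment uses <-
// m = 0.5 * (a + b); // parse error: = is not the assignment operator in these versions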



--

David Manheim

unread,
Nov 30, 2015, 3:44:25 PM11/30/15
to Stan users mailing list
It has occurred to me (after the Nth time it happened to me) that a typical failure mode for slow-running models is model mis-specification, which might be good to note in the documentation / a revision to this post.

Andrew Gelman

unread,
Nov 30, 2015, 4:30:51 PM11/30/15
to stan-...@googlegroups.com
That's the Folk Theorem of Statistical Computing.

David Manheim

unread,
Nov 30, 2015, 4:35:50 PM11/30/15
to Stan users mailing list, gel...@stat.columbia.edu
It might be good to include this (prominently) in the STAN manual, as, say, a preface to the section on optimizing your code.

Bob Carpenter

unread,
Nov 30, 2015, 6:14:19 PM11/30/15
to stan-...@googlegroups.com
If I had a nickel for every time someone's asked me to include
something prominently in the Stan (not an acronym) manual, I'd
be a rich man. :-)

I can put something at the beginning of the optimization section:

https://github.com/stan-dev/stan/issues/1617#issuecomment-160792547

If it doesn't get into the next manual, it'll go in the one after.
We're coming down to the crunch for 2.9.0.

- Bob
> --
> You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
> To post to this group, send email to stan-...@googlegroups.com.

Krzysztof Sakrejda

unread,
Mar 3, 2016, 9:06:55 AM3/3/16
to Stan users mailing list
Could you add "My model runs but warmup is very slow/stuck."

Could use the answers from me/Michael if you don't feel like writing something:

Arun Kaushik

unread,
May 14, 2016, 1:34:37 AM5/14/16
to Stan users mailing list
Hello sir, 
When I run your script in R, I get the following error.
Error in compileCode(f, code, language = language, verbose = verbose) : 
  Compilation ERROR, function(s)/method(s) not created! Warning message:
running command 'make -f "C:/PROGRA~1/R/R-32~1.5/etc/x64/Makeconf" -f "C:/PROGRA~1/R/R-32~1.5/share/make/winshlib.mk" SHLIB_LDFLAGS='$(SHLIB_CXXLDFLAGS)' SHLIB_LD='$(SHLIB_CXXLD)' SHLIB="filede904a2d2e43.dll" WIN=64 TCLBIN=64 OBJECTS="filede904a2d2e43.o"' had status 127 
In addition: Warning message:
running command 'C:/PROGRA~1/R/R-32~1.5/bin/x64/R CMD SHLIB filede904a2d2e43.cpp 2> filede904a2d2e43.cpp.err.txt' had status 1 

Ben Goodrich

unread,
May 14, 2016, 10:59:27 AM5/14/16
to Stan users mailing list
On Saturday, May 14, 2016 at 1:34:37 AM UTC-4, Arun Kaushik wrote:
> had status 127
 
Rtools wasn't found, possibly because it is not installed, or because the box for putting it on the PATH wasn't checked during installation.

Ben

P.S. Please start a new thread for problems like this rather than posting on the FAQ thread
