Using (r)Stan with knitr—suppressing messages

934 views
Skip to first unread message

Avraham Adler

unread,
Feb 19, 2014, 1:31:06 PM2/19/14
to stan-...@googlegroups.com
Firstly, as a medium-term lurker but first-time poster, I would like to thank the entire Stan development team for the creation and maintenance of Stan.

I am using Stan as part of a paper I am trying to write, and I am struggling with trying to suppress diagnostic messages. For example, when I have used JAGS, R2jags, and knitr, once I set echo, messages, and warnings to FALSE, I can have the model run every time I knit the paper, and only have the results I want displayed through \Sexpr{}. Using Stan, I am having much more of a problem. I have searched the group and have found the following threads:
  1. Suppressing Warning Messages
  2. silent stan?
  3. RStan print() - Trivial request

I have verbose set to FALSE and refresh set to -1. I have used suppressMessages/Warnings as suggested in the thread (1), and that does not seem to do anything more than setting messages and warnings settings in the knitr chunk. I have wrapped my stan() call in sink() calls, as discussed in thread (2) as such:

<<'Bayes_Stan', eval=TRUE, echo=FALSE, message=FALSE, warning=FALSE, cache=0>>=
sink(file = "a", type = c("output", "message"))
library(rstan)
LN_S <- suppressMessages(suppressWarnings(stan(file = 'Bayes.stan', data = BayesData, pars = c('mu', 'sigma', 'LR_post', 'ASL_post', 'dev'), iter = 300000, warmup = 50000, thin = 25, chains = 5, seed = 12, refresh = -1, verbose = FALSE)))

[snip extractions and assignments]

LNPlus_S <- suppressMessages(suppressWarnings(stan(file = 'BayesPlus.stan', data = BayesDataPlus, pars = c('Pars', 'LR_post', 'ASL_post', 'dev'), iter = 300000, warmup = 50000, thin = 25, chains = 5, seed = 12, refresh = -1, verbose = FALSE)))

[snip extractions and assignments]
sink()
@

Regardless, my knitted output looks like:

…Therefore, the model was rebuilt in Stan (Stan Development Team, 2014a)—a Bayesian modeling language whose samplers are non-Gibbs.
##
## TRANSLATING MODEL 'Bayes' FROM Stan CODE TO C++ CODE NOW.
## COMPILING THE C++ CODE FOR MODEL 'Bayes' NOW.
## cygwin warning:
## MS-DOS style path detected: C:/R/RCurrent/R-30~1.2/etc/x64/Makeconf
## Preferred POSIX equivalent is: /cygdrive/c/R/RCurrent/R-30~1.2/etc/x64/Makeconf
## CYGWIN environment variable option "nodosfilewarning" turns off this warning.
## Consult the user's guide for more details about POSIX paths:
## http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
## C:/R/RCurrent/R-3.0.2/library/rstan/include//stansrc/stan/agrad/rev/var_stack.hpp:49:17: warning: 'void stan::agrad::free_memory()' defined but not used [-Wunused-function]
## C:/R/RCurrent/R-3.0.2/library/rstan/include//stansrc/stan/agrad/rev/chainable.hpp:87:17: warning: 'void stan::agrad::set_zero_all_adjoints()' defined but not used [-Wunused-function]
##
## TRANSLATING MODEL 'BayesPlus' FROM Stan CODE TO C++ CODE NOW.
## COMPILING THE C++ CODE FOR MODEL 'BayesPlus' NOW.
## cygwin warning:
## MS-DOS style path detected: C:/R/RCurrent/R-30~1.2/etc/x64/Makeconf
## Preferred POSIX equivalent is: /cygdrive/c/R/RCurrent/R-30~1.2/etc/x64/Makeconf
## CYGWIN environment variable option "nodosfilewarning" turns off this warning.
## Consult the user's guide for more details about POSIX paths:
## http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
## C:/R/RCurrent/R-3.0.2/library/rstan/include//stansrc/stan/agrad/rev/var_stack.hpp:49:17: warning: 'void stan::agrad::free_memory()' defined but not used [-Wunused-function]
## C:/R/RCurrent/R-3.0.2/library/rstan/include//stansrc/stan/agrad/rev/chainable.hpp:87:17: warning: 'void stan::agrad::set_zero_all_adjoints()' defined but not used [-Wunused-function]
## Error in function stan::prob::lognormal_log(N4stan5agrad3varE): Scale parameter is -0.34910650273000776:0, but must be > 0!Rejecting proposed initial value with zero density.
## Error in function stan::prob::lognormal_log(N4stan5agrad3varE): Scale parameter is -1.6172265406146098:0, but must be > 0!Rejecting proposed initial value with zero density.
## Error in function stan::prob::lognormal_log(N4stan5agrad3varE): Scale parameter is -1.2449912107872052:0, but must be > 0!Rejecting proposed initial value with zero density.
## Error in function stan::prob::lognormal_log(N4stan5agrad3varE): Scale parameter is -1.0263658781850085:0, but must be > 0!Rejecting proposed initial value with zero density.
## Error in function stan::prob::lognormal_log(N4stan5agrad3varE): Scale parameter is -1.7036664255500364:0, but must be > 0!Rejecting proposed initial value with zero density.
## Error in function stan::prob::lognormal_log(N4stan5agrad3varE): Scale parameter is -1.7912999475616009:0, but must be > 0!Rejecting proposed initial value with zero density.
## Error in function stan::prob::lognormal_log(N4stan5agrad3varE): Scale parameter is -0.67853001987262718:0, but must be > 0!Rejecting proposed initial value with zero density.
## Error in function stan::prob::lognormal_log(N4stan5agrad3varE): Scale parameter is -0.56723437122169695:0, but must be > 0!Rejecting proposed initial value with zero density.
## Error in function stan::prob::lognormal_log(N4stan5agrad3varE): Scale parameter is -1.9622578028376079:0, but must be > 0!Rejecting proposed initial value with zero density.
## Error in function stan::prob::lognormal_log(N4stan5agrad3varE): Scale parameter is -1.9058489407892381:0, but must be > 0!Rejecting proposed initial value with zero density.
The No-U-Turn-Sampler behind Stan is much more robust to highly autocorrelated parameters

I am not savvy enough in either R or Stan to write wrapper codes like the one in thread (3) which may address this issue.

While I can always knit the Rnw, edit the resulting tex file to remove these lines, and re-texify that file, I would appreciate knowing if there is any way to prevent the need for such post-processing.

While I cannot get an exact sessionInfo without knitting it into the paper, loading all the packages and calls used results in the following:

R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] rstan_2.2.0          inline_0.3.13        rjags_3-12           coda_0.16-1          lattice_0.20-24      runjags_1.2.0-7     
 [7] scatterplot3d_0.3-35 mvtnorm_0.9-9997     nloptr_1.0.0         Rcpp_0.11.0          reshape2_1.2.2       scales_0.2.3        
[13] ggplot2_0.9.3.1     

loaded via a namespace (and not attached):
 [1] colorspace_1.2-4   dichromat_2.0-0    digest_0.6.4       grid_3.0.2         gtable_0.1.2       labeling_0.2       MASS_7.3-29       
 [8] munsell_0.4.2      parallel_3.0.2     plyr_1.8           proto_0.3-10       RColorBrewer_1.0-5 stats4_3.0.2       stringr_0.6.2     
[15] tools_3.0.2 

If there is no way at current to completely suppress messaged, would it be possible to request a future feature, such as when refresh is set to a [possibly large] negative value (instead of 0) that ALL output is suppressed and only the model and results are created?

Once again, thank you for all your collective work on such a great and powerful new tool for Bayesian analysis.

Bob Carpenter

unread,
Feb 19, 2014, 1:41:43 PM2/19/14
to stan-...@googlegroups.com
I’m pretty sure there’s no way to do that in RStan yet.

It’s not possible in CmdStan, either, but there’s a feature request to make
it possible.

We are kicking off a major re-org of the code to unify the back-end and front-end
for CmdStan, RStan, and PyStan, which have drifted apart. We’ll include this
functionality in the re-org everywhere, as people seem to want it and it shouldn’t
be too difficult.

In the meantime, this might be easy for Jiqiang or Ben to solve, so I opened
and RStan-specific issue:

https://github.com/stan-dev/rstan/issues/49

- Bob
> --
> You received this message because you are subscribed to the Google Groups "stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.

Ben Goodrich

unread,
Feb 19, 2014, 3:43:19 PM2/19/14
to stan-...@googlegroups.com
On Wednesday, February 19, 2014 1:31:06 PM UTC-5, Avraham Adler wrote:
Regardless, my knitted output looks like:

…Therefore, the model was rebuilt in Stan (Stan Development Team, 2014a)—a Bayesian modeling language whose samplers are non-Gibbs.
##
## TRANSLATING MODEL 'Bayes' FROM Stan CODE TO C++ CODE NOW.
## COMPILING THE C++ CODE FOR MODEL 'Bayes' NOW.

I think you can get rid of the above by redefining the cat() function in your setup chunk

cat <- function(...) return(invisible(NULL))
 

## cygwin warning:
## MS-DOS style path detected: C:/R/RCurrent/R-30~1.2/etc/x64/Makeconf
## Preferred POSIX equivalent is: /cygdrive/c/R/RCurrent/R-30~1.2/etc/x64/Makeconf
## CYGWIN environment variable option "nodosfilewarning" turns off this warning.
## Consult the user's guide for more details about POSIX paths:
## http://cygwin.com/cygwin-ug-net/using.html#using-pathnames

You might be able to get rid of the above by defining that environmental variable in your setup chunk

Sys.setenv(nodosfilewarning = "1")

 

## C:/R/RCurrent/R-3.0.2/library/rstan/include//stansrc/stan/agrad/rev/var_stack.hpp:49:17: warning: 'void stan::agrad::free_memory()' defined but not used [-Wunused-function]
## C:/R/RCurrent/R-3.0.2/library/rstan/include//stansrc/stan/agrad/rev/chainable.hpp:87:17: warning: 'void stan::agrad::set_zero_all_adjoints()' defined but not used [-Wunused-function]

Short of recompiling rstan and / or doing something to suppress that warning, you could silently sed your sinked file

system("sed -i '/-Wunused-function/d' a")
 

##
## TRANSLATING MODEL 'BayesPlus' FROM Stan CODE TO C++ CODE NOW.
## COMPILING THE C++ CODE FOR MODEL 'BayesPlus' NOW.
## cygwin warning:
## MS-DOS style path detected: C:/R/RCurrent/R-30~1.2/etc/x64/Makeconf
## Preferred POSIX equivalent is: /cygdrive/c/R/RCurrent/R-30~1.2/etc/x64/Makeconf
## CYGWIN environment variable option "nodosfilewarning" turns off this warning.
## Consult the user's guide for more details about POSIX paths:
## http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
## C:/R/RCurrent/R-3.0.2/library/rstan/include//stansrc/stan/agrad/rev/var_stack.hpp:49:17: warning: 'void stan::agrad::free_memory()' defined but not used [-Wunused-function]
## C:/R/RCurrent/R-3.0.2/library/rstan/include//stansrc/stan/agrad/rev/chainable.hpp:87:17: warning: 'void stan::agrad::set_zero_all_adjoints()' defined but not used [-Wunused-function]

See above
 

## Error in function stan::prob::lognormal_log(N4stan5agrad3varE): Scale parameter is -0.34910650273000776:0, but must be > 0!Rejecting proposed initial value with zero density.
## Error in function stan::prob::lognormal_log(N4stan5agrad3varE): Scale parameter is -1.6172265406146098:0, but must be > 0!Rejecting proposed initial value with zero density.
## Error in function stan::prob::lognormal_log(N4stan5agrad3varE): Scale parameter is -1.2449912107872052:0, but must be > 0!Rejecting proposed initial value with zero density.
## Error in function stan::prob::lognormal_log(N4stan5agrad3varE): Scale parameter is -1.0263658781850085:0, but must be > 0!Rejecting proposed initial value with zero density.
## Error in function stan::prob::lognormal_log(N4stan5agrad3varE): Scale parameter is -1.7036664255500364:0, but must be > 0!Rejecting proposed initial value with zero density.
## Error in function stan::prob::lognormal_log(N4stan5agrad3varE): Scale parameter is -1.7912999475616009:0, but must be > 0!Rejecting proposed initial value with zero density.
## Error in function stan::prob::lognormal_log(N4stan5agrad3varE): Scale parameter is -0.67853001987262718:0, but must be > 0!Rejecting proposed initial value with zero density.
## Error in function stan::prob::lognormal_log(N4stan5agrad3varE): Scale parameter is -0.56723437122169695:0, but must be > 0!Rejecting proposed initial value with zero density.
## Error in function stan::prob::lognormal_log(N4stan5agrad3varE): Scale parameter is -1.9622578028376079:0, but must be > 0!Rejecting proposed initial value with zero density.
## Error in function stan::prob::lognormal_log(N4stan5agrad3varE): Scale parameter is -1.9058489407892381:0, but must be > 0!Rejecting proposed initial value with zero density.

The bigger problem here is that you seem to have not constrained the scale parameter to be positive. You need something like this in the parameters block

real<lower=0> sigma;

but if there are any stray messages, you could get rid of them with another sed sweep

system("sed -i '/Error in function/d' a")

Ben

Avraham Adler

unread,
Feb 19, 2014, 3:50:15 PM2/19/14
to stan-...@googlegroups.com


On Wednesday, February 19, 2014 3:43:19 PM UTC-5, Ben Goodrich wrote:
The bigger problem here is that you seem to have not constrained the scale parameter to be positive. You need something like this in the parameters block

real<lower=0> sigma;

but if there are any stray messages, you could get rid of them with another sed sweep

system("sed -i '/Error in function/d' a")

Ben


Thank you very much for your advice, Ben. I'll play with some of your suggestions. Regarding your last, the both the hyperprior (V) and the Omega prior are defined as type "cov_matrix" which should account for that. I am using a pretty uninformative prior, and the mean values are all clustered around 1 (for various reasons) so there are times when it generates those complaints. With only a dozen or so such errors when generating 1.5 million draws (5 chains of 300,000 each) I believe that's acceptable. Even the warning message says "If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine"

Thanks again!

Avraham

Ben Goodrich

unread,
Feb 19, 2014, 4:32:31 PM2/19/14
to stan-...@googlegroups.com
 
Yes, the "Informational messages" about the covariance matrices not being numerically positive definite are usually not a big deal and can be zapped from the sinked file by sed, but I was referring to these, which all come from a lognormal(mu,sigma) statement where sigma has not been constrained to be positive.
Ben

Avraham Adler

unread,
Feb 19, 2014, 5:03:25 PM2/19/14
to


On Wednesday, February 19, 2014 4:32:31 PM UTC-5, Ben Goodrich wrote:
 
Yes, the "Informational messages" about the covariance matrices not being numerically positive definite are usually not a big deal and can be zapped from the sinked file by sed, but I was referring to these, which all come from a lognormal(mu,sigma) statement where sigma has not been constrained to be positive.


I may have built the model incorrectly and would be very grateful for any advice.

The data is:
N <- 10

LR <- exp(c(-.5376, -.4388, -.3953, -.6415, -.5376, -.4388, -.244, -.3488, -.4786, -.4388))

V is a hyperprior precision matrix set up so that 3 * solve(V) is a covariance matrix where the variances matches the variance of the hyperprior on the means (for mu[1] it's normal 0, 100 so var = 1e4 and for mu[2] its uniform[0, 100] so var = 100^2 / 12), and there is no correlation in the hyperprior so V becomes matrix(c(3e-4, 0, 0, 0.0036), ncol = 2

The Stan model is:

data {
 
int<lower = 0> N;
  real
<lower = 0> LR[N];
  cov_matrix
[2] V;
}
parameters
{
  vector
[2] mu;
  vector
[2] Pars;
  cov_matrix
[2] Omega;
}
model
{
  mu
[1] ~ normal(0.0, 100.0);
  mu
[2] ~ uniform(0.0, 100.0);
 
Omega ~ inv_wishart(3, V);
 
Pars ~ multi_normal_prec(mu, Omega);
 
for (year in 1:N) {
    LR
[year] ~ lognormal(Pars[1], Pars[2]);
     
}
}
generated quantities
{
  real LR_post
;
  real ASL_post
;
  real dev
;
  LR_post
<- lognormal_rng(Pars[1], Pars[2]);
  ASL_post
<- fmin(0.025, fmax(LR_post - 0.725, 0.0));
  dev
<- 0;
 
for (i in 1:N) {
    dev
<- dev - (2 * lognormal_log (LR[i], Pars[1], Pars[2]));
 
}
}

Yes, I know that Andrew et al are not big fans of DIC (and I'm using pV instead of pD as I'm not sure how to easily calculate the latter) but I need some way to compare the models :)

I can't just throw a lower=0 on mu, as the first parameter can (and should) be less than 0.

Thank you very much!

Avraham

Ben Goodrich

unread,
Feb 19, 2014, 5:20:37 PM2/19/14
to stan-...@googlegroups.com
On Wednesday, February 19, 2014 5:02:08 PM UTC-5, Avraham Adler wrote:
The Stan model is:

data {
 
int<lower = 0> N;
  real
<lower = 0> LR[N];
  cov_matrix
[2] V;
}
parameters
{
  vector
[2] mu;
  vector
[2] Pars;
  cov_matrix
[2] Omega;
}
model
{
  mu
[1] ~ normal(0.0, 100.0);
  mu
[2] ~ uniform(0.0, 100.0);
 
Omega ~ inv_wishart(3, V);
 
Pars ~ multi_normal_prec(mu, Omega);

This is what I was saying. The sample space for Pars is the entire real plane, so there is positive probability that Pars[2] is negative. One solution would be to work with exp(Pars[2]) in the model block, as in

LR ~ lognormal(Pars[1], exp(Pars[2]));

and later in
 
generated quantities {
  real LR_post
;
  real ASL_post
;
  real dev
;

  LR_post
<- lognormal_rng(Pars[1], exp(Pars[2]));

  ASL_post
<- fmin(0.025, fmax(LR_post - 0.725, 0.0));
  dev
<- 0;
 
for (i in 1:N) {

    dev
<- dev - (2 * lognormal_log (LR[i], Pars[1], exp(Pars[2])));
 
}
}

Yes, I know that Andrew et al are not big fans of DIC (and I'm using pV instead of pD as I'm not sure how to easily calculate the latter) but I need some way to compare the models :)

I can't just throw a lower=0 on mu, as the first parameter can (and should) be less than 0.

You can separate them as

real mu_1;
real
<lower=0,upper=1> mu_2;

Ben

Sergio Polini

unread,
Feb 19, 2014, 6:03:44 PM2/19/14
to stan-...@googlegroups.com
My 2 cents (i.e. what I do, and why).
1. Stan models are translated to C++ and compiled.
2. A knitr (.Rnw) file is going to be modified several times.
3. When using a Stan model several times, compiling it only once is enough.
4. So I compile the model in a separate R session, then save it:
mod.sm <- stan_model(...)
save(mod.sm, file = "mod.sm.RData")
and write in my .Rnw file:
<<eval=FALSE>>=
mod.sm <- stan_model(...)
@
<<echo=FALSE>>=
if (!exists(mod.sm))
load("mod.sm.RData")
@
5. If sampling is fast, I add:
<<message = FALSE>>=
mod.sf <- sampling(mod.sm, ...)
print(mod.sf)
@
else I can:
a) save mod.sf as before
b) add "cache = TRUE" to <<>>=

Well, I actually use a wrapper code for steps 4 and 5, but what I've
described is just what I started from.

HTH

Sergio


Il 19/02/2014 19:31, Avraham Adler ha scritto:
> Firstly, as a medium-term lurker but first-time poster, I would like to
> thank the entire Stan development team for the creation and maintenance
> of Stan.
>
> I am using Stan as part of a paper I am trying to write, and I am
> struggling with trying to suppress diagnostic messages. For example,
> when I have used JAGS, R2jags, and knitr, once I set echo, messages, and
> warnings to FALSE, I can have the model run every time I knit the paper,
> and only have the results I want displayed through \Sexpr{}. Using Stan,
> I am having much more of a problem. I have searched the group and have
> found the following threads:
>
> 1. Suppressing Warning Messages
> <https://groups.google.com/forum/?fromgroups#!searchin/stan-users/suppress$20messages/stan-users/pjqZwsoMVAM/BGXOU1LIgWMJ>
> 2. silent stan?
> <https://groups.google.com/forum/?fromgroups#!searchin/stan-users/suppress$20messages/stan-users/mfdMEgS1Yt0/AGongVLQ1XQJ>
> 3. RStan print() - Trivial request
> <https://groups.google.com/forum/?fromgroups#!searchin/stan-users/knitr/stan-users/GSDaSkhsE7A/fZRih8LpzNQJ>

Bob Carpenter

unread,
Feb 20, 2014, 10:36:46 AM2/20/14
to stan-...@googlegroups.com

On Feb 19, 2014, at 11:02 PM, Avraham Adler <avraha...@gmail.com> wrote:

> …

> Omega ~ inv_wishart(3, V);
> Pars ~ multi_normal_prec(mu, Omega);

Isn’t it more usual to use a Wishart for a precision parameter
and the inverse Wishart with a covariance parameter? (I could be
reversing my conjugacy relations.)

Stan doesn’t care about conjugacy, so it’s not an efficiency
issue or expressibility issue as in BUGS.

There’s a new section in the manual chapter on regression
on multivariate covariance priors where we recommend a scaled correlation
matrix approach (due to Ben).

- Bob

P.S. For efficiency, we could always implement a marginalized-out compound
Inverse-Wishart-Multinormal distribution following the formula for p(X|Psi,nu)
here:

http://en.wikipedia.org/wiki/Inverse-Wishart_distribution#Conjugate_distribution

But we probably won’t because we’re not really recommending the Wishart priors.


Bob Carpenter

unread,
Feb 21, 2014, 3:31:15 PM2/21/14
to stan-...@googlegroups.com

Avraham Adler

unread,
Feb 21, 2014, 4:20:29 PM2/21/14
to stan-...@googlegroups.com
Sorry, I forgot to respond. As you can tell, I'm somewhat new to this. I've corrected the call, and as you expected, the posterior didn't change much as Stan can quickly move away from the poor prior.

Avraham
Reply all
Reply to author
Forward
0 new messages