Trying to work on rstan


Daniel Lee

Aug 11, 2016, 10:36:21 PM
to stan-dev mailing list
Hi Ben (or Jiqiang or anyone else that knows),

I'm trying to get rstan working with the services branch and I want to do some more rapid checking. Is there any way to build a model against a new stan_fit.hpp, or against the source, without installing everything?

Here's what I've done so far:
Setup:
1. update the submodule:
    cd StanHeaders/inst/include/upstream
    git checkout feature/issue-1751-service-methods
    cd ../../../../
2. build new StanHeaders
    R CMD build StanHeaders
3. install new StanHeaders
    R CMD INSTALL StanHeaders_2.11.0.tar.gz

What I'm doing to test rstan:
1. change directories:
    cd rstan
2. build and install:
    make build
    make install ## this takes a while
3. start R and try building:
    R
    library(rstan)
    options(mc.cores = parallel::detectCores()) 
    fit <- stan(model_code = "parameters { real<lower = 0, upper = 1> theta; } model { }")


When that last line fails, I change the code, then go back to step 2 and run make build again. Is there a way to speed up this development cycle?


Daniel
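
One small way to shorten the loop described above is to keep the test in a file and re-run it non-interactively after each install. The following is a minimal sketch, not part of the original message: the function name `write_smoke_test`, the default file name `smoke_test.R`, and the added `print(fit)` line are assumptions chosen for illustration.

```shell
# Write the three-line rstan smoke test from the message above to a file,
# so it can be re-run with Rscript instead of being typed into an R session.
# $1 (optional): output path; defaults to smoke_test.R (a hypothetical name).
write_smoke_test() {
  out="${1:-smoke_test.R}"
  # The quoted 'EOF' keeps the shell from expanding anything in the R code.
  cat > "$out" <<'EOF'
library(rstan)
options(mc.cores = parallel::detectCores())
fit <- stan(model_code = "parameters { real<lower = 0, upper = 1> theta; } model { }")
print(fit)
EOF
  echo "$out"
}

# Usage, from the rstan directory after editing the sources:
#   make build && make install && Rscript "$(write_smoke_test)"
```

This does not make the build itself faster; it only removes the manual R session from each iteration.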


Krzysztof Sakrejda

Aug 11, 2016, 10:44:58 PM
to stan development mailing list
Have you tried passing --no-build-vignettes and --no-manual to R CMD build (maybe in the make install step), and then passing --no-docs --no-html --no-help --no-test-load to R CMD INSTALL?

That might help, since I remember the docs and vignettes taking up a fair bit of the time. Other than that, I don't know of any way to speed it up. I've always hated testing package builds in R; the process takes forever. K
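
The flags mentioned above can be collected in two small helpers. This is a hedged sketch rather than anything taken from the rstan makefile: the function names are made up for illustration, and the helpers only print the commands so they can be inspected before being run.

```shell
# Print the faster build command for a package source directory.
# $1: package source directory (placeholder; adjust to your checkout)
fast_build_cmd() {
  printf 'R CMD build --no-build-vignettes --no-manual %s\n' "$1"
}

# Print the faster install command for a built tarball.
# $1: tarball produced by R CMD build
fast_install_cmd() {
  printf 'R CMD INSTALL --no-docs --no-html --no-help --no-test-load %s\n' "$1"
}

# Usage: inspect the command first, then pipe it to sh to run it, e.g.
#   fast_build_cmd path/to/pkg
#   fast_build_cmd path/to/pkg | sh
```

Printing instead of executing is a deliberate choice here, so the same helpers can be used from a makefile or pasted by hand.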

Krzysztof Sakrejda

Aug 11, 2016, 10:47:46 PM
to stan development mailing list
On Thursday, August 11, 2016 at 10:36:21 PM UTC-4, Daniel Lee wrote:
Just looked at the rstan makefile. It might also help to remove --as-cran from the R CMD INSTALL. To avoid making these changes by hand each time while developing, it might be nice to have separate make targets like "make install-fast" for pre-testing during development.

Jiqiang Guo

Aug 11, 2016, 11:31:06 PM
to stan...@googlegroups.com
I would add ccache to a Makevars file. 

$ R CMD config CXX
ccache clang++
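
A minimal sketch of how such a setting could be added, assuming the user-level Makevars file that R reads from `~/.R/Makevars`. The function name is hypothetical, and depending on the R version the C++11/C++14 compiler variables (for example `CXX11` or `CXX14`) may need the same `ccache` prefix.

```shell
# Append a ccache-wrapped C++ compiler setting to a Makevars file.
# $1: path to the Makevars file (for user-level settings, ~/.R/Makevars)
add_ccache_to_makevars() {
  makevars="$1"
  mkdir -p "$(dirname "$makevars")"
  # Skip if a ccache CXX line is already present, so re-running is harmless.
  if ! grep -q '^CXX *= *ccache' "$makevars" 2>/dev/null; then
    echo 'CXX = ccache clang++' >> "$makevars"
  fi
}

# Usage:
#   add_ccache_to_makevars "$HOME/.R/Makevars"
#   R CMD config CXX    # should now report the ccache-wrapped compiler
```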

I also just change files within the installed rstan package if there is no change of StanHeaders and copy the correct files to git repo later. 

For example, 

$ vim `R RHOME`/library/rstan/include/rstan/stan_fit.hpp

Jiqiang 
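
The "edit in place, copy back later" step can be sketched as a small helper. This is an illustration only: the function name and the checkout path in the usage line are placeholders, not paths confirmed by the thread.

```shell
# Copy an edited header from the installed package back into a source checkout.
# $1: include directory of the installed package
# $2: include directory inside the source checkout (placeholder path)
# $3: header path relative to both directories
copy_header_back() {
  src="$1/$3"
  dst="$2/$3"
  if cmp -s "$src" "$dst"; then
    # Identical files: nothing to do.
    echo "unchanged: $3"
  else
    mkdir -p "$(dirname "$dst")"
    cp "$src" "$dst"
    echo "copied: $3"
  fi
}

# Usage (the second argument is a placeholder for your own checkout):
#   copy_header_back "$(R RHOME)/library/rstan/include" \
#     path/to/checkout/include rstan/stan_fit.hpp
```

Reinstalling the package overwrites the installed copy, so it is worth copying edits back before the next install.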


--
You received this message because you are subscribed to the Google Groups "stan development mailing list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to stan-dev+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Jiqiang Guo

Aug 11, 2016, 11:46:50 PM
to stan...@googlegroups.com

Sebastian Weber

Aug 12, 2016, 2:41:23 AM
to stan development mailing list
Hi!

Last time I looked at the rstan code, it used a lot of those Rcpp List constructs, which are convenient to use but, from what I found, not very fast at handling massive numeric arrays.

Since rstan is being refactored anyway, it would make sense to switch the chains to containers specifically tailored for numeric-only data. Would it make sense to do this in the course of this refactoring?

As far as I recall, the list handling caused severe slowdowns whenever larger arrays had to cross the Stan -> R boundary. Doing that with matrices would speed up this communication substantially.

Best,
Sebastian

On Friday, August 12, 2016 at 5:46:50 AM UTC+2, Jiqiang Guo wrote:
> forgot to mention what ccache is.
>
> https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Build_Instructions/ccache

Bob Carpenter

Aug 12, 2016, 6:09:08 AM
to stan...@googlegroups.com

> On Aug 12, 2016, at 8:41 AM, Sebastian Weber <sdw....@gmail.com> wrote:
>
> Hi!
>
> Last time I looked at the rstan code, it used a lot of those Rcpp List constructs, which are convenient to use but, from what I found, not very fast at handling massive numeric arrays.
>
> Since rstan is being refactored anyway, it would make sense to switch the chains to containers specifically tailored for numeric-only data.

That sounds like a great idea.

> Would it make sense to do this in the course of this refactoring?

We like to keep pull requests modular, so I'd suggest creating
an issue and a separate pull request if it's possible.

> As far as I recall, the list handling caused severe slowdowns whenever larger arrays had to cross the Stan -> R boundary. Doing that with matrices would speed up this communication substantially.

I don't know what you mean by Stan -> R. Doesn't Rcpp just
wrap the R data structures and expose them to Stan (and vice-versa)?

I found all the doc and descriptions of Rcpp very cryptic when
I first looked at them, but part of that is my ignorance of
R internals and part just that the Rcpp doc is minimal (so is
stan-dev/math's for that matter). Documenting these things is hard,
but I'd think given the number of users for Rcpp it'd be worth it
for someone to write more comprehensive doc.

- Bob

Sebastian Weber

Aug 12, 2016, 9:35:16 AM
to stan development mailing list
Hi!

Yeah, I have had that thought for a while, but I never went for it because it means a huge rewrite of much of rstan.

So my thought was that if everything is being turned upside down anyway, then we could do this. Going in small steps hurts here, but I am not saying we should change that strategy.

By Stan -> R I mean how the data from Stan is brought into R. This works by initializing Rcpp data structures, which can then be made accessible in the R workspace. Last time I looked at the code, this was done using Rcpp List objects, which translate to data.frames in R. However, data.frames are horribly inefficient for handling chunks of matrices, which is what the output of a chain is. Unless there is a reason I haven't understood yet, it would speed up all manipulations by a lot if we used matrices and only used apply methods in rstan.

However, this is a major rewrite of rstan.

Sebastian

Krzysztof Sakrejda

Aug 12, 2016, 9:57:00 AM
to stan development mailing list

There is more doc in Dirk's book that's sitting on my shelf :P

Krzysztof Sakrejda

Aug 12, 2016, 10:01:47 AM
to stan development mailing list
On Friday, August 12, 2016 at 9:35:16 AM UTC-4, Sebastian Weber wrote:
> Hi!
>
> Yeah, I have had that thought for a while, but I never went for it because it means a huge rewrite of much of rstan.
>
> So my thought was that if everything is being turned upside down anyway, then we could do this. Going in small steps hurts here, but I am not saying we should change that strategy.

I'd wait for the current services refactor to make its way through rstan and then we should be able to work on this by changing the sample writer callback. Should be easier to do modular pull requests for it then. K

Bob Carpenter

Aug 12, 2016, 1:00:07 PM
to stan...@googlegroups.com

> On Aug 12, 2016, at 3:56 PM, Krzysztof Sakrejda <krzysztof...@gmail.com> wrote:

...

>> I found all the doc and descriptions of Rcpp very cryptic when
>> I first looked at them, ...
>
> There is more doc in Dirk's book that's sitting on my shelf :P

:-(

I don't like closed-source doc for open-source code.
And I'd much rather have a PDF than anything else, though I
might be able to get this one through Columbia's library.

That's why we negotiated with the publisher to allow the
Stan book to be distributed free as a PDF. Now all we need
to do is write it.

- Bob