More generally, any assessment of how suitable Stan would be for this
task would be great.
OK, so I'm thinking of trying Stan, and thinking of packaging it for
Debian. Any tips?
In particular, has the external code in libs got any local
modifications?
Are there any notable version dependencies? Debian
packages generally rely on existing Debian packages for external libraries.
Also, what's up with the recursive rstan directories
https://github.com/stan-dev/rstan/tree/develop/rstan/rstan?
I think the binary packages that would result include one for the stan
library, one for the stan executable, and one for the r package.
But before going down the whole installation route, I'd suggest just trying
to use it as is to see if it'll work for you. Then if you don't use it,
all the integration with Debian won't be wasted effort.
forgot to CC the list
On Sat, Aug 3, 2013 at 3:56 PM, Ross Boylan wrote:
Maybe I'm overthinking it. Maybe you just meant that "stan" is not the name
of the executable. My intent was "the executable for the stan system".
Right, I just meant that the binary is called stanc
goodrich@CYBERPOWERPC:/opt/stan$ bin/stanc --help
stanc version 1.3.0
USAGE: stanc [options] <model_file>
and there is no binary called stan. Although, as Bob points out, there is a print binary (that would need to be renamed in the Debian package) that prints a summary of the posterior distribution from the .csv files. Those are the only two executables.
Anyway, packaging Stan wouldn't be very hard. Making r-noncran-rstan is the harder part, but probably more useful. It is true, as Bob mentioned on another post, that the R package doesn't use all of Boost, but neither does Stan. It is just that Stan embeds all of Boost because no one has gotten around to pruning the parts we don't need with bcp. The easiest thing to do would be to depend on libboost1.54-all, but Debian frowns on that sort of thing.
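As a rough first step before pruning with bcp, one could list which Boost headers the Stan sources include directly. The sketch below assumes the includes use the angle-bracket form <boost/...>; boost_includes is a hypothetical helper name, and it only sees direct includes, not the headers those pull in, so the result is a lower bound on what has to ship.

```shell
# Hypothetical helper: print the distinct <boost/...> headers that files
# under the given directory include directly (transitive includes are missed).
boost_includes() {
    grep -rhoE '#[[:space:]]*include[[:space:]]*<boost/[^>]+>' "$1" |
        sed -E 's|.*<(boost/[^>]+)>|\1|' |
        sort -u
}

# Example: boost_includes src
```

A tool such as bcp would still be needed to follow the transitive dependencies; this only gives a quick picture of how much of Boost is touched at the top level.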
And then reconfiguring the R package to work with a system-wide Stan is a bit more work. Nothing too hard, it is just that Stan has followed a very Windows-centric philosophy of distributing itself and that is often at odds with the Debian / UNIX way.
Ben
Does that mean there is a single official version of Boost that Debian
requires? Do other Linux distributions require the same one?
On 8/3/13 4:20 PM, Ben Goodrich wrote:
> If Ross / Dan / whomever are up for it, packaging for Debian would be really beneficial. It would mean Stan gets a lot
> of testing on non-Intel platforms, facilitate the python integration, etc.
That would be great.
How does it facilitate Python integration? (I'm pretty much
completely ignorant about Python and Linux packaging.)
> It just means doing a lot of things the
> Debian way instead of the wrong way.
Could you be more specific? I know we've talked before about
lib and compiler dependencies.
I figure if I go as far as rearranging things for myself I might as well
try to make Debian package(s) as a public service.
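For example, the makefile's defaults for the bundled libraries,
EIGEN ?= lib/eigen_3.2.0
BOOST ?= lib/boost_1.54.0
GTEST ?= lib/gtest_1.6.0
would instead point at the system-wide headers:
EIGEN ?= /usr/include/eigen3/
BOOST ?= /usr/include
GTEST ?= /usr/include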
I'm not quite sure what to do with the unit tests under src/test but they should go somewhere. The tricky part is that some of the unit tests depend on stanc while the libstan package shouldn't necessarily have a hard dependency on stanc. And that's basically it. The rest is just compliance with Debian policy.
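Because the library paths are assigned with ?=, GNU make only applies them when the variable is not already set, so a packager could also override them on the command line without patching the makefile. The sketch below demonstrates this on a throwaway makefile rather than Stan's real one; show_paths is a hypothetical helper, and the system paths are simply the ones quoted in this thread.

```shell
# Write a throwaway makefile that mimics the `?=` defaults quoted above
# (this is NOT Stan's real makefile) and print the values make ends up with.
show_paths() {
    demo=$(mktemp -d)
    cat > "$demo/makefile" <<'EOF'
EIGEN ?= lib/eigen_3.2.0
BOOST ?= lib/boost_1.54.0
$(info EIGEN=$(EIGEN) BOOST=$(BOOST))
all: ;
EOF
    # Any VAR=value arguments are passed straight through to make.
    make -s -f "$demo/makefile" "$@" | head -n 1
    rm -rf "$demo"
}

show_paths                                                  # bundled defaults
show_paths EIGEN=/usr/include/eigen3/ BOOST=/usr/include    # system headers
```

Whether a Debian package would patch the defaults or pass overrides from debian/rules is a packaging choice; either way the ?= form leaves both options open.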
With the R package I guess there's a tension between wanting to provide
a nearly "one-click" install experience and providing a regular R
library. Definitely for Debian the r-noncran-stan package would be
basically rstan without the included stan subproject.
Apparently stan has a static library, libstan.a. I think Debian policy
favors using dynamic libraries; is there any particular reason for the
static form?
Now for the bugs.
cd rstan/rstan
make
produced the error
[stuff omitted]
* looking to see if a ‘data/datalist’ file should be added
Error in if (any(update)) { : missing value where TRUE/FALSE needed
Execution halted
This seems to be an R bug that has been corrected since the version I'm using (2.15.1).
I added --resave-data="best" as a work-around from the bug indicated in the comments
below; I think moving the data file into the data directory from the R directory might
work too.
I changed to --no-resave-data, which allowed the build. This doesn't seem like a great
solution, however.
I'm working off the latest rstan git, c14f746ec7b85320b2c113468331132bda15b4d8.
I created a local debian branch; I could clone the project on github and push my changes
there if needed.
When the tarball is expanded, over 97% of the space is taken up by the Boost headers.
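A figure like the 97% above can be checked with a few lines of shell. This is a minimal sketch: tree_bytes and dir_share are hypothetical helper names, and the script counts file contents in bytes rather than disk blocks, so the result may differ slightly from what du reports.

```shell
# Total size in bytes of all regular files under a directory.
tree_bytes() {
    find "$1" -type f -exec cat {} + | wc -c | tr -d ' '
}

# Integer percentage of <tree>'s bytes that live under <subdir>.
# Usage: dir_share <tree> <subdir>   (assumes <tree> contains at least one byte)
dir_share() {
    echo $(( 100 * $(tree_bytes "$2") / $(tree_bytes "$1") ))
}

# Example, run from the top of the expanded tarball, with the bundled
# Boost path mentioned earlier in the thread:
#   dir_share . lib/boost_1.54.0
```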
Another issue with sysdata.rda is that it is a binary file. It would
be desirable to have whatever the "source" is (presumably an R script)
for Debian packaging (I'm not sure if policy requires it).
The code generating sysdata.rda is https://github.com/stan-dev/rstan/blob/develop/rstan/example/examplemodel.R
I would like to discontinue having sysdata.rda. We could put more tests to ...
It is tested on Jenkins daily now after a successful build of rstan.
On Monday, August 5, 2013 3:19:48 PM UTC-4, Jiqiang Guo wrote:
> The code generating sysdata.rda is https://github.com/stan-dev/rstan/blob/develop/rstan/example/examplemodel.R
> I would like to discontinue having sysdata.rda. We could put more tests to ...
> It is tested on Jenkins daily now after a successful build of rstan.
I never thought of sysdata.rda as being primarily for unit-testing but more as a way to avoid a bunch of \dontrun in the examples.
Would you rather distribute a bunch of .csv files from a model and have the examples generate a stanfit object from those .csv files and then call whatever function is being illustrated in the example?
On Mon, Aug 5, 2013 at 3:26 PM, Ben Goodrich <goodri...@gmail.com> wrote:
> I never thought of sysdata.rda as being primarily for unit-testing but more as a way to avoid a bunch of \dontrun in the examples.
I thought being part of the tests was one of your goals. I do not have a problem with \dontrun, as it does not mean the code cannot be run, though after the doc is written we do not run it every time we release rstan.
> Would you rather distribute a bunch of .csv files from a model and have the examples generate a stanfit object from those .csv files and then call whatever function is being illustrated in the example?
I do not like that either, as the files would need to be regenerated. But on second thought, how does the code in the example section typically get used? If users really try the code and look at the result, I can see the merits of using .csv files. But if users just take a quick look at the code most of the time, putting the examples in \dontrun is enough.