
Implementations of Markov chain Monte Carlo methods?


Douglas Bates

Oct 10, 2012, 10:58:11 AM
to juli...@googlegroups.com
There are many stand-alone McMC implementations with links to R: Bugs/OpenBugs (http://mathstat.helsinki.fi/openbugs/) with the BRugs interface, JAGS (mcmc-jags.sourceforge.net) with rjags, and now Stan (mc-stan.org) with the rstan interface.

It seems to me that all these implementations are based on a parser that takes a model specification in a Bugs-like language and translates it into another language, or into calls to another language, to create the actual sampler.  The actual sampler is implemented in a compiled language for performance, because McMC is a compute-intensive operation.  This translation is most evident in Stan, where the model specification in the Stan language is translated into C++ code that is compiled and accessed from R through the Rcpp interface.

This all means that the painful two-language-with-interfaces approach to getting performance in dynamic languages like R and Matlab becomes a three-language-with-interfaces approach, where the model specification language is translated to a static compiled language that is linked to a dynamic language.

A Julia implementation of such a system would, I believe, be much less painful.  I still don't know enough about the mechanics of languages like Bugs to concoct samplers (John Myles White undoubtedly knows more), but I do know a bit about evaluating pdf's, pmf's, cdf's, etc. for distributions and would be happy to work on making the Distributions module more useful for these purposes.

What I would like to see is an informal working group, or perhaps an organization on github, devoted to creating Julia tools for McMC implementation.  In other words, I think that it would be better to create a collection of tools that could be easily assembled to create a sampler rather than having a closed system like Bugs, which to me seems like old-style "statistical packages" such as SAS and SPSS. 

I think McMC could be a "killer app" for Julia in the statistical computing world.

Chris DuBois

Oct 10, 2012, 12:36:25 PM
to juli...@googlegroups.com
Hi Douglas,

I agree that Julia is well suited for MCMC.  A few months ago I translated a few samplers from Radford Neal's GRIMS R package.  They aren't heavily optimized or tested (yet), but nonetheless I've found them useful.  My motivation for this project sounds much like what you describe: a collection of tools that could be easily assembled to create a sampler.  It is, however, very distant from BUGS/JAGS/Stan approaches.  

I would be very interested in a working group (and an associated github organization) on this topic, and I know several others who would be as well.

Chris


John Myles White

Oct 10, 2012, 1:41:48 PM
to juli...@googlegroups.com
I'll chime in more on this later today, but wanted to say a few things.

First off, I've used Chris's code before and think it could easily be the basis for the samplers we'd use in an MCMC system.

That said, most of the substance of a system like BUGS/JAGS/Stan lies in building an expert system that has domain knowledge about how to handle specific models. I'll track down links, but I've been told that Stan actually lags behind JAGS in some standard models for which this domain knowledge is missing, even though the NUTS sampler at the core of Stan can be substantially more efficient than the non-gradient samplers used in BUGS/JAGS when they're applied appropriately. It would be good to talk with Martyn Plummer about JAGS before embarking on a project to build something similar in Julia.

I'm happy to participate in a working group on this topic, although I'm badly overcommitted with existing projects and will not be able to do much coding.

Maybe as a start, we should agree on a language we'd like to implement and get some simple examples working. For instance, the simplest JAGS code I start with tends to be inference for the mean of a Gaussian with known variance of 1. In JAGS, that looks like

model
{
    for (i in 1:N)
    {
        x[i] ~ dnorm(mu, 1)
    }

    mu ~ dnorm(0, 0.001)
}

A JAGS system could handle this model specification (given some data for x of length N) in many ways. It could attempt to run Metropolis-Hastings and tune a proposal distribution. It could use a slice sampler. It could exploit knowledge about conjugacy to directly sample from the posterior. Building a system that's fast relies upon knowing which of these strategies to use.
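The conjugate route in that list can be made concrete. For the model above (prior mu ~ Normal(0, precision 0.001), N observations of precision 1), the posterior for mu is itself Normal, so one can sample it directly with no Markov chain at all. A sketch in Python for illustration; the function name is mine, not from any package discussed here:

```python
import math
import random

def sample_posterior_mu(x, prior_prec=0.001, lik_prec=1.0, n_draws=1000, seed=1):
    """Directly sample mu | x for the conjugate Normal-Normal model:
    x[i] ~ Normal(mu, 1/lik_prec), mu ~ Normal(0, 1/prior_prec)."""
    rng = random.Random(seed)
    post_prec = prior_prec + lik_prec * len(x)   # precisions add
    post_mean = lik_prec * sum(x) / post_prec    # precision-weighted mean (prior mean is 0)
    post_sd = math.sqrt(1.0 / post_prec)
    return [rng.gauss(post_mean, post_sd) for _ in range(n_draws)]

draws = sample_posterior_mu([1.8, 2.1, 2.4, 1.9, 2.0])
```

An expert system in the sense above is one that recognises when a model admits this direct draw, and falls back to Metropolis-Hastings or slice sampling only when it doesn't.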

 -- John


John Myles White

Oct 11, 2012, 2:32:43 PM
to juli...@googlegroups.com
More thorough follow-up. Hopefully this isn't too cryptic. Please let me know what I leave unclear.

(1) I really like the idea of pushing on Julia as a tool for MCMC, but would like to encourage us to start small. A full MCMC system like JAGS or Stan requires a team whose job is exclusively to maintain the system. For example, Stan (whose sole purpose is to be faster than JAGS) occasionally discovers that JAGS has a clever trick for special classes of models that lets JAGS beat out Stan. See https://groups.google.com/forum/#!topic/stan-users/QDumPPbRa4Q/discussion for one example. Maintaining a system that's competitive with JAGS/Stan will be a major undertaking on the same scale as building Stan, which had three full time developers who were all postdocs in Gelman's group working on the project for more than one year.

(2) One way we could start small is to focus on translating a JAGS-like syntax into Julia code that uses the samplers in Distributions. (Or, as the StrangeLoop crowd would say, we need a JAGS-to-Julia transpiler.) I think this is promising because it seems like we could very slowly build up a mini-language that can be translated into very specific Julia code. For example, we could implement just the standard conjugate pairs and provide a brief language for working with those that is translated into Julia. I would really like something like that, because it would make the system much more transparent than JAGS, which is very hard to debug. If a modeling language were translated into Julia, we would not have to worry as much about performance, because the end user could customize the produced code upon realizing that the translator isn't exploiting domain knowledge. The problem with JAGS/Stan is that the efficiency they achieve is largely a black box, which makes it very hard for outsiders to contribute to the project. In other words, we could try to do better than JAGS/Stan by following Julia's strategy of doing everything in Julia, rather than dropping down to a C-like language for speed.

(3) Before we could even start doing (2), we'll need two things: a nice parsing library and a nice automatic differentiation library. Perhaps those should be our original focus? I know there's work on automatic differentiation already. I'm less aware of parsing, although it's come up in the context of Formula syntax for GLM's in JuliaData discussions. Maybe thinking more about that is the best place to start, especially because I have so little knowledge of how to write a decent parser.

(4) I think BUGS is much less like SAS/SPSS than it might seem. BUGS is a full programming language (it might actually be Turing complete) that allows one to implement essentially any model imaginable. It's just horribly slow and clunky when you start to write models that it wasn't designed to handle. Mixture models, for example, can be very hard to express in BUGS. Stan is, linguistically, as expressive as BUGS, except that the Hamiltonian MCMC tricks that power it only apply to models with continuous variables, because the system increases statistical efficiency by using the gradient of the log posterior of a model. In short, it's basically the same language, but translates continuous models into better algorithms for sampling the model parameters.

MCMC-style computations are what drew me to Julia originally, so I hope we can find useful places for abstraction here.

-- John

Andrei de Araújo Formiga

Oct 11, 2012, 3:04:12 PM
to juli...@googlegroups.com
I've been thinking along similar lines recently. I actually
thought about the possibility of exposing different levels of
abstraction: a higher level that could be similar to JAGS/BUGS, a
"middle" that could be defined with operators/macros in Julia itself,
and a lower level composed of the samplers. Maybe it could be
interesting to use the same higher-level models with different
"backends" to run them, maybe even calling Stan from Julia, for
example.

In any case, I agree with this, and it should definitely start small.


John Myles White

Oct 11, 2012, 5:06:30 PM
to juli...@googlegroups.com
The other small start is to wrap JAGS or Stan in Julia. From what I saw, that's not so small.

-- John

Andrei de Araújo Formiga

Oct 11, 2012, 5:16:32 PM
to juli...@googlegroups.com
Indeed, both are in C++, and at least in Stan's case it doesn't seem to
have a simple C interface. The R bindings use Rcpp as far as I know.

Ian Fiske

Oct 11, 2012, 10:27:03 PM
to juli...@googlegroups.com
+1 for automatic differentiation. That would also come in handy for loads of other optimization problems. As far as I can tell there has been some existing work for forward mode but not reverse mode. Of course reverse mode is the killer approach for optimization although harder to implement.

sth4nth

Oct 12, 2012, 1:07:02 AM
to juli...@googlegroups.com
I've been thinking about this for a while. Since Julia is a powerful FP-style language, we should leverage the DSL abilities of Julia to define the interface for Bayesian inference. For example, the Church language (https://projects.csail.mit.edu/church/wiki/Computation_in_Church), which is a variant of Lisp, incorporates probabilistic modelling into its syntax. There is also the Infer.NET Fun project, which is based on F# and does similar things.

Besides MC-based methods, there are also optimization-based inference techniques, such as mean field and belief propagation algorithms, that can do the job. We should separate the inference engine and the modelling interface. The modelling interface is to define the problem structure (a.k.a. the graphical model); the engine is to do the inference job. Which engine to use is up to the user.

In my opinion, we should not adopt the R-style modelling interface; since Julia is a far more powerful language, we should leverage macros to define a better DSL for probabilistic inference, like Church and Fun.

Here is some material on probabilistic DSL programming:

Best,
Mo


Adrian Cuthbertson

Oct 12, 2012, 1:57:26 AM
to juli...@googlegroups.com
It seems the threads spread far, wide and deep :). I'm just embarking
on a learning curve with MCMC methods as a focus, so this initiative
will be very interesting to follow and participate in where I can.

Regards, - Adrian.

John Myles White

Oct 13, 2012, 5:52:12 PM
to juli...@googlegroups.com
I have to confess that I don't know what reverse mode means here.

-- John


Stefan Karpinski

Oct 13, 2012, 5:59:50 PM
to juli...@googlegroups.com
The separation is definitely the way to go. Was going to write that myself earlier but, in any case, yes!

Andrei de Araújo Formiga

Oct 13, 2012, 7:19:07 PM
to juli...@googlegroups.com
It's basically the order in which the differentials are evaluated in
the chain rule. Reverse mode tends to be better for objective
functions in optimization.
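That ordering can be made concrete with a toy tape: record the computation forward, then propagate sensitivities from the output back to the inputs. A Python sketch, purely illustrative (a real implementation needs a topological ordering once intermediate nodes are shared; here only the leaf inputs are reused):

```python
class Var:
    """Toy reverse-mode AD: record each operation's parents and local
    derivatives, then push adjoints from the output back to the inputs."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # (parent_var, local_derivative) pairs
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    def backward(self):
        self.grad = 1.0
        stack = [self]
        while stack:
            node = stack.pop()
            for parent, local in node.parents:
                parent.grad += local * node.grad   # chain rule, reversed
                stack.append(parent)

x = Var(3.0)
y = Var(4.0)
z = x * y + x * x    # z = x*y + x^2, so dz/dx = y + 2x = 10, dz/dy = x = 3
z.backward()
```

One backward pass yields the gradient with respect to every input at once, which is why reverse mode wins for objective functions with many parameters.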


Ian Fiske

Oct 13, 2012, 7:32:23 PM
to juli...@googlegroups.com
Yes, an AD system for statistical optimization work needs reverse mode to handle models with many parameters.  See for example ADMB (http://admb-project.org/) and the cited paper (http://tandfonline.com/doi/abs/10.1080/10556788.2011.597854).

Of course ADMB implements one of the many algorithms for reverse mode and I don't know what approach would be best for Julia.  There are a couple of functional approaches that might be suitable (scheme: http://www.bcl.hamilton.ie/~qobi/stalingrad/ and haskell:  http://hackage.haskell.org/cgi-bin/hackage-scripts/package/ad), but I'm not sure if the pure functional approach will have the performance found in the C++ or Fortran versions.  There has also been work on this for Matlab which might be relevant to Julia: https://dspace.lib.cranfield.ac.uk/bitstream/1826/4356/1/ForthICCS2010.pdf.

Stefan Karpinski

Oct 13, 2012, 8:20:55 PM
to juli...@googlegroups.com
These are great resources. Thanks!

Andrei de Araújo Formiga

Oct 13, 2012, 9:19:08 PM
to juli...@googlegroups.com
Is it possible to access the Julia code/AST for a given function? This
would come in handy to implement automatic differentiation for any
given function. Failing that, maybe a macro that wraps a given code
block in a context where the operators are redefined to calculate the
differentiation could work, but this means the writer of the function
to be differentiated must define it using the macro instead of the
usual function definition in Julia.
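Redefining the operators is exactly how forward-mode AD with dual numbers works: each number carries its derivative, and the overloaded operators apply the chain rule. A minimal Python sketch of that idea (illustrative only, not a proposal for the Julia API):

```python
import math

class Dual:
    """Dual number val + der*eps with eps^2 = 0: overloaded operators
    carry the derivative alongside the value."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)  # product rule
    __rmul__ = __mul__

def d_sin(x):
    # Elementary functions need dual-aware versions too.
    return Dual(math.sin(x.val), math.cos(x.val) * x.der)

def derivative(f, x):
    return f(Dual(x, 1.0)).der    # seed the input's derivative with 1

# f(x) = x^2 + sin(x), so f'(x) = 2x + cos(x)
f = lambda x: x * x + d_sin(x)
```

This is the piece a wrapping macro would generate: swap ordinary numbers for duals, leave the user's function body untouched.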


Patrick O'Leary

Oct 14, 2012, 1:08:54 AM
to juli...@googlegroups.com

I've figured it out for anonymous functions (f.code.ast) but haven't figured it out in the case of generic functions.

Kevin Squire

Oct 14, 2012, 1:03:10 PM
to juli...@googlegroups.com
Perhaps reverse AD could be implemented by wrapping a block of code in a manner similar to how the profiler works, e.g., something like

@ad begin

include("math_fns.jl")

function some_func(theta)
    sin(theta)^2 + cos(theta)^2
end

end   # @ad begin

That way, even functions which are not originally defined using the AD macro could be differentiated.  Some cleverness would probably be needed to deal with functions which shouldn't be differentiated or those which call C code.

Kevin


Simon Byrne

Oct 15, 2012, 3:59:12 PM
to juli...@googlegroups.com
I think this is a great idea: using one language for everything is bound to make things easier.

I've had a bash at implementing a BUGS-type interpreter using a macro here:

It's pretty rudimentary and doesn't do anything clever (it just does slice sampling, using Chris DuBois's code, on each variable in turn), but it works, at least as a proof of concept.
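For reference, the per-variable update being described can be sketched generically. This is a stepping-out slice sampler in the style of Neal (2003), written in Python for illustration; it is not the code linked above:

```python
import math
import random

def slice_sample(logpdf, x0, w=1.0, n=1000, seed=1):
    """Univariate slice sampler with stepping-out (Neal 2003).
    logpdf: unnormalised log density; w: initial bracket width."""
    rng = random.Random(seed)
    x, out = x0, []
    for _ in range(n):
        logy = logpdf(x) + math.log(rng.random())   # slice height under the density
        left = x - w * rng.random()                  # bracket randomly placed around x
        right = left + w
        while logpdf(left) > logy:                   # step out until outside the slice
            left -= w
        while logpdf(right) > logy:
            right += w
        while True:                                  # shrink until a point is accepted
            x1 = left + (right - left) * rng.random()
            if logpdf(x1) > logy:
                x = x1
                break
            if x1 < x:
                left = x1
            else:
                right = x1
        out.append(x)
    return out

# Standard normal target: draws should have mean ~0 and variance ~1.
draws = slice_sample(lambda t: -0.5 * t * t, 0.0)
```

The appeal for a BUGS-type interpreter is that this needs only the log density of each variable's full conditional, no gradients and no tuning beyond the bracket width.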

Simon

John Myles White

Oct 16, 2012, 10:57:23 AM
to juli...@googlegroups.com
Very cool!

I'd like to push for us to keep inference and model definition separate. The virtue of BUGS-systems is that you don't need to know what slice-sampling is to be able to use it. I'd like to see something like:

model = quote

    for i in 1:N
        x[i] ~ Normal(mu, sigma)
    end

    mu ~ Normal(0, 1000.0)
    sigma = 1.0

end

data = {:x => x}

compile(model, data)

-- John

Simon Byrne

Oct 16, 2012, 11:35:37 AM
to juli...@googlegroups.com

On Tuesday, 16 October 2012 15:57:26 UTC+1, John Myles White wrote:
I'd like to push for us to keep inference and model definition separate. The virtue of BUGS-systems is that you don't need to know what slice-sampling is to be able to use it.

I agree entirely. One thing I would like would be an additional interface somewhere in the middle, that allows the user to specify the samplers to use on each variable (or set of variables for block updating), but uses the model definition to figure out all the exact details (so they don't need to derive all the equations).

Simon

Andrei de Araújo Formiga

Oct 16, 2012, 11:59:29 AM
to juli...@googlegroups.com
That's what I've been thinking about too, as I said somewhere back in
this thread. I thought about the three levels as:

- Higher level is the model specification in a language similar to JAGS/BUGS
- Middle level is an embedded DSL (domain-specific language) in Julia,
possibly using macros, where the user can specify the computation in
greater detail (samplers, etc)
- Lower level would be the samplers/inference engines themselves,
basically the raw MCMC machinery (plus maybe other kinds of inference
engine, I don't know if something using variational inference would be
feasible)

This way advances in the inference engines could be decoupled from
high-level use, but users willing to have more control would get it in
the middle layer. Julia is powerful enough to define the embedded DSL
in the middle so that it works well and is nice to use.


John Myles White

Oct 16, 2012, 12:41:11 PM
to juli...@googlegroups.com
I'm in total agreement here. Let's start to refine those levels more.

To help us, I've set some infrastructure up.

(1) I've just set up a Google Group for us to continue the discussion. The address is julia...@googlegroups.com

(2) I've set up a GitHub repo for us to start to refine a plan and some code examples. The URL is https://github.com/johnmyleswhite/bugs.jl

(3) I've used the GitHub repo's wiki to write some very simple examples that are almost already Julia expressions. If we get enough examples in place, I think it will be much easier to actually plan out the rest of our work: https://github.com/johnmyleswhite/bugs.jl/wiki/Proposed-DSL

-- John

John Myles White

Oct 16, 2012, 1:16:11 PM
to juli...@googlegroups.com
In case it's not clear where the page for the mailing list is, this is the URL:


 -- John


Viral Shah

Apr 10, 2013, 11:44:18 PM
to juli...@googlegroups.com, julia...@googlegroups.com
I just stumbled upon some analysis of Stan that Stefan had done a while back. Given that it is C++ templated code, it is not convenient to call from Julia. However, Stefan's analysis may help convince someone to either translate or re-implement these methods in Julia:

Looking through the Stan code, the project includes:
  • IO libraries
  • memory manager
  • math libraries
  • matrix implementation
  • optimization algorithms
  • C++ template metaprogramming utilities
  • definitions of standard probability distributions
  • a parser for the custom model language
  • MCMC code
The actual MCMC code seems to be approximately 4000 lines out of a total of about 37000 lines of code.

-viral

Rahul Dave

Apr 11, 2013, 10:27:30 AM
to julia...@googlegroups.com, juli...@googlegroups.com
Being about neck-deep in HMC right now, I suspect that the hardest parts to port from Stan are the optimization and tuning parts... In other words, the 4000 lines are complex ones, supporting both standard HMC and NUTS.

But you are entirely right... it would be a killer app. And the constructs of Julia seem very suitable. If someone hasn't done it already (or isn't doing it), I would be quite happy to work on parts of this after the academic year ends (1st week of May!).

Chris has an HMC sampler based on the R code from Radford Neal's paper; perhaps he could comment on a direction forward?
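For a sense of scale: the core HMC update itself is short, and the bulk of those 4000 lines is the tuning and NUTS machinery around it. A bare-bones sketch in Python for illustration, with fixed step size and path length (precisely the knobs that NUTS exists to remove):

```python
import math
import random

def hmc(logp, grad_logp, x0, eps=0.1, n_leap=20, n=1000, seed=1):
    """Minimal one-dimensional HMC: leapfrog integration plus a
    Metropolis accept step. Fixed step size and trajectory length."""
    rng = random.Random(seed)
    x, out = x0, []
    for _ in range(n):
        p = rng.gauss(0.0, 1.0)                    # resample momentum
        x_new, p_new = x, p
        p_new += 0.5 * eps * grad_logp(x_new)      # half step for momentum
        for i in range(n_leap):
            x_new += eps * p_new                   # full step for position
            if i < n_leap - 1:
                p_new += eps * grad_logp(x_new)    # full step for momentum
        p_new += 0.5 * eps * grad_logp(x_new)      # final half step
        # Accept with probability exp(H_old - H_new), H = -logp + p^2/2.
        log_accept = (logp(x_new) - 0.5 * p_new ** 2) - (logp(x) - 0.5 * p ** 2)
        if math.log(rng.random()) < log_accept:
            x = x_new
        out.append(x)
    return out

# Standard normal target: logp = -x^2/2, gradient = -x.
draws = hmc(lambda t: -0.5 * t * t, lambda t: -t, 0.0)
```

Everything hard lives outside this loop: choosing eps and n_leap, the mass matrix, and (in NUTS) the trajectory-termination criterion, plus the AD needed to supply grad_logp for real models.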
Rahul
-- 
Rahul Dave
Sent with Sparrow

--
You received this message because you are subscribed to the Google Groups "julia-stats" group.
To unsubscribe from this group and stop receiving emails from it, send an email to julia-stats...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Viral Shah

Apr 11, 2013, 10:56:47 AM
to juli...@googlegroups.com, julia...@googlegroups.com
Undoubtedly; I did not mean to say that it would be simple. I just meant to point out that the effort may be less than it appears at first glance. One of the things that I personally hope Julia will enable is that such projects in the future can focus on the problem at hand, rather than on all the tooling and plumbing. I know we have a lot to do.

In my opinion, the MathProg implementation by Miles and Iain is an encouraging example of this in linear programming.

-viral

Chris DuBois

Apr 11, 2013, 11:30:40 AM
to juli...@googlegroups.com, julia...@googlegroups.com
Glancing at Dahua's slides, it looks like this is an effort to allow for general inference routines on graphical models (e.g., belief propagation or any other message-passing-based approach). (Think Koller's Probabilistic Graphical Models book.) 

I would check out SimpleMCMC.jl as a starting point for writing your own NUTS, or look at Matt Hoffman's Matlab code.

Looking back at this thread, I see a lot of talk about automatic differentiation. Does anybody know what the current state is? 

Theodore Papamarkou

Apr 11, 2013, 11:49:56 AM
to juli...@googlegroups.com, julia...@googlegroups.com
Hi Chris, hi all,

This is the first time I have come across this post re MCMC, and it is very pleasing that several "Julia people" have expressed interest in MCMC coding. At the risk of leaving out a package (if I do so, please correct me), here is the list of simulation- or MCMC-related Julia packages at the moment:

* MCMC,
* simpleMCMC,
* SimJulia,
* GeometricMCMC.

As for automatic differentiation, I am currently doing some work to extend the algorithms of the GeometricMCMC package. A very useful introduction to automatic differentiation is the following master's thesis by Lium Torbjorn, which was mentioned to me by a colleague of mine:


A more comprehensive (and rather authoritative) reference to automatic differentiation is this book:


If you are interested in following up with more recent advances in AD (and finding more references), this is perhaps the most popular website:


One more thing: MCMC research advances at such a fast pace that there are already many new methodologies that have not yet been coded in one of the above packages. For example, there have been publications on HMC on Hilbert spaces, geodesic Monte Carlo, and quantum Monte Carlo, such as variational and diffusion Monte Carlo methods.

Best,
Theo

Chris DuBois

Apr 11, 2013, 11:54:12 AM
to julia...@googlegroups.com, juli...@googlegroups.com
Thanks Theo! These are great resources! 

Could you expand a little on the role of automatic differentiation for your GeometricMCMC package?

Theodore Papamarkou

Apr 11, 2013, 12:14:13 PM
to juli...@googlegroups.com, julia...@googlegroups.com
Yes, of course Chris, happy to share thoughts. So basically my motivation is simply the fact that the gradient of the target (as well as the metric tensor in the case of geometric MCMC) is not known in closed form in many cases, multi-dimensional problems for example. Another example is that of systems of differential equations, where the involved sensitivities are not soluble analytically. Even when it is possible to derive exact formulae for the needed derivatives, Jacobians and Hessians, this is a rather cumbersome task, so even then it is preferable from the user's end to rely on AD. In a few words, AD is the preferable way to go within MCMC, unless we are talking about simple models (such as the logit and probit models I added to GeometricMCMC).

That's the motivation. As for the implementation, I am currently cheating (in the well-meaning way :)). I downloaded John's Calculus package and the simpleMCMC Julia code, and read the basics from the above master's thesis, in order to implement forward AD in the case of the logit and probit models (to make sure that I get identical results to the ones I get via the closed forms). I am also pondering whether I should buy the book I mentioned, to check what it says about Jacobians and Hessians, although the same author published a more recent book (2012) on AD, which may be more geared towards optimization rather than simple simulation using AD...

I think we should all try to extend our little MCMC packages to use AD, if you agree?

Chris DuBois

Apr 11, 2013, 12:26:03 PM
to juli...@googlegroups.com, julia...@googlegroups.com
I am on board with the motivation! I would like to incorporate it into MCMC.jl soon. 

It would be nice if all the MCMC packages could take advantage of a single AD framework. Is this a realistic scenario?

Theodore Papamarkou

Apr 11, 2013, 12:32:07 PM
to juli...@googlegroups.com, julia...@googlegroups.com
My view is that a Symbolic package and an AD (automatic differentiation) package may be of use in many scenarios, so it would be practical to have a central implementation. I don't want to be territorial, so I think that Jeff, Stefan, Viral et al. are better suited to answering this question (and more experienced than me too); I merely expressed my thoughts :) So if it is decided to have one package for AD, I am more than happy to work together. Let's see who else is interested. If we don't hear from many other people, we could look into AD together.

P.S. 1: AD does not involve truncation errors; that's one more point in its favour compared with numerical differentiation.
P.S. 2: I forgot to mention tempering/population MCMC among the Monte Carlo methods that need to be implemented.
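The truncation-error point above is easy to demonstrate: a central finite difference of f(x) = x^3 has an O(h^2) truncation error, while propagating value/derivative pairs exactly (the forward-mode product rule, unrolled by hand here) has none. A small Python check, illustrative only:

```python
def deriv_fd(f, x, h=1e-4):
    """Central finite difference: O(h^2) truncation error."""
    return (f(x + h) - f(x - h)) / (2 * h)

def deriv_dual_cube(x):
    """f(x) = x^3 via the forward-mode product rule, unrolled by hand:
    (val, der) * (x, 1) -> (val * x, der * x + val). No truncation."""
    val, der = x, 1.0
    for _ in range(2):          # multiply by x twice: x -> x^2 -> x^3
        val, der = val * x, der * x + val
    return der

x = 1.0
exact = 3 * x * x
fd_err = abs(deriv_fd(lambda t: t ** 3, x) - exact)   # ~h^2 = 1e-8
dual_err = abs(deriv_dual_cube(x) - exact)            # exactly 0.0
```

Shrinking h cannot fix the finite-difference version either: past a point, floating-point cancellation in f(x+h) - f(x-h) starts growing the error again.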

Kevin Squire

Apr 11, 2013, 1:13:27 PM
to julia...@googlegroups.com, juli...@googlegroups.com

Over a year ago, Jonas Rauch created a nice forward AD implementation using dual numbers:


I've played around with it a little, and updated it a few months ago to work with newer versions of Julia.  I was contemplating contacting Jonas and turning it into a package, but as it goes with these things, I got busy with other things.  I'll see if I can dig it out and get it working again.

AD was also discussed previously on julia-dev (https://groups.google.com/d/topic/julia-dev/GId3CTNoJnw/discussion), and Miles Lubin, at least, expressed some interest.

For reverse AD, the Lisp implementation by Jeffrey Siskind in particular (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.147.3715) might be a good target to translate into Julia, although the paper is a bit dense to me and my Lisp isn't stellar.  But I also really like the references Theodore mentioned, especially the Griewank book.  For those at universities with SIAM subscriptions, the full version is available as PDF chapters: http://epubs.siam.org/doi/book/10.1137/1.9780898717761.

Anyway, I'm interested and inclined to help, although I'm not sure how much time I can commit.  Please keep me in the loop.

Cheers!
   Kevin

Theodore Papamarkou

Apr 11, 2013, 1:34:24 PM
to juli...@googlegroups.com, julia...@googlegroups.com
Thanks Kevin! It would be very helpful and time-saving if you manage to find your update of the ad.jl file! The one you sent already seems easy to follow, which is good.

Oh, I should have checked the archive; I missed the previous AD thread.

I guess if several of us are interested in AD, we could split the workload and no one would have to spend an immense amount of time on a potential package. Let's see if John, for instance, has done some coding too, and if Miles is also still interested... then we could start from the available code and each of us could undertake small tasks, without any pressure...

Chris, you and I, it's already three of us interested in doing a bit of work on AD.

Theo

John Myles White

Apr 11, 2013, 2:06:22 PM
to juli...@googlegroups.com
I haven't done any work on AD so far. I'd be happy to contribute over the summer, but won't have time to work on anything till then. I'm already beyond my capacity with the packages I'm currently managing.

 -- John

Theodore Papamarkou

Apr 11, 2013, 2:42:04 PM
to juli...@googlegroups.com
Thanks for letting us know John. We could get this going for now by extending a bit some of the available code so that we have some elementary AD for the needs of MCMC, if Chris and Kevin agree. We could then expand the AD package over the summer, when you will be available.

Rahul Dave

Apr 11, 2013, 3:40:26 PM
to juli...@googlegroups.com
Just catching up on the thread... wow, a lot of stuff on it between my last post and now. I too would be available mid-May onwards to contribute, once the (ironically, stochastic optimization) course I am TF'ing is done. I hadn't realized the existence of SimpleMCMC (damn, we need a "This Week in Julia"), and as someone in its intro thread put it, it's got some complex guts in simple code right off the bat.

It could be used as a starting point, though I like the layered approach someone proposed earlier: one ought not need to write a model in a DSL just to sample from a pdf of one's choice. At most, a macro to propagate the AD through ought to be required. Then the model parser etc. could be layered on top.

Rahul

Tom Short

Apr 11, 2013, 3:52:24 PM
to julia-dev
Count me in as interested in helping with the automatic differentiation; I've got this need as well. An approach that Miles Lubin suggested, using placeholder variables in a Julia expression, might be useful here. This came up in a discussion of symbolic differentiation here:

https://github.com/johnmyleswhite/Calculus.jl/pull/7

See the sexpr macro in that pull request (Miles wrote it). Some of that code might be better in a central Symbolic.jl package.


 

Theodore Papamarkou

Apr 12, 2013, 4:14:20 AM
to juli...@googlegroups.com
Great, it's already 5 of us (Chris, Kevin, Rahul, Tom and I), plus John perhaps over the summer, so let's wait a couple more days and we could then split the coding work.

Andrei de Araújo Formiga

Apr 12, 2013, 10:54:29 AM
to juli...@googlegroups.com
I'm also interested in this, but can't do much right now. Mid-May would be a good time for me too.



Tom Short

Apr 12, 2013, 7:32:29 PM
to julia-dev
Theodore, as a start, maybe you could open a placeholder project called AutomaticDifferentiation.jl (or AD.jl or AutoDiff.jl or whatever). Then others have a place to start filling in with pull requests, or you can grant commit access.

John, maybe you could just rename Calculus.jl to Symbolic.jl, splitting stuff out later if it makes sense.


Theodore Papamarkou

Apr 13, 2013, 6:44:35 AM
to juli...@googlegroups.com
I opened a placeholder project Tom as you suggested. It is called AutoDiff.jl. I wrote a README.md where I present a bit of the scope and possible features for this potential Julia package. I also added the initial code of Jonas Rauch that Kevin forwarded to us. Here is the link to the project:

https://github.com/scidom/AutoDiff.jl

Feel free to contribute Andrei whenever it suits you better. Same holds for anyone who wants to contribute.

P.S. I will be on holidays the coming week, so I may be somewhat slower replying.

Tom Short

unread,
Apr 13, 2013, 7:36:33 AM
to julia-dev
Great, Theodore. That project needs a LICENSE file (hopefully MIT). Also, before we really use Jonas' code, I think we should get permission. I tried to find his email to ask permission, but I couldn't find it. I pinged him using the @JonasRauch method, but a direct email might be better.

Theodore Papamarkou

unread,
Apr 13, 2013, 7:48:11 AM
to juli...@googlegroups.com
I completely agree with you Tom, we need Jonas' permission. It is my omission and thank you very much for reminding me. I will try to find Jonas' email so as to contact him directly. If anyone else knows his email, please let us know.

Philipp Neubauer

unread,
Apr 12, 2013, 11:01:16 PM
to juli...@googlegroups.com
Hi all - 

I've been following this discussion, and couldn't agree more that generic MCMC functionality with autodiff would be fantastic, especially if Julia continues to advance at the current rate in other areas such as plotting and other statistical methods. Just thought I'd express my interest in contributing (despite my limited programming chops - I'm learning loads from this list...), and give my two cents on the little I know about AD and MCMC.

Lots of people in my field of fisheries modelling use AD model builder (http://admb-project.org/) which uses AUTODIF, a library in C++. Perhaps it can be of use (along with Stan) as an example of how MCMC with AD is implemented, and clues as to what a package in Julia needs to do better in order to find wider acceptance (than ADMB) as a convenient and powerful modelling tool. 

ADMB was mostly written to do ML inference in (the esoteric world of) complex fish stock assessment models (often 100s to 1000s of parameters), but MCMC capability was added in the form of MH and Hamiltonian MC (very recently, I believe). It has a lot of good features, yet I don't think ADMB ever took off very much outside of fisheries research. Perhaps this is mostly due to its own way of specifying models that is not close to R syntax, which I think is the big plus of BUGS and its offshoots (I personally choose Jags over ADMB just for convenience). It can be called from R but obviously suffers from the same 'three languages' drawbacks as the BUGS parsers in R, with the added language barrier for the ADMB-specific model description. Plus ADMB is somewhat hard to debug - cryptic error messages a la WinBugs are common. 
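For context, the MH step mentioned above (random-walk Metropolis, the workhorse of BUGS-family tools) fits in a dozen lines. The sketch below is in Python with made-up names, purely to illustrate the algorithm, not ADMB's or any package's actual code.

```python
import math
import random

def metropolis_hastings(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis: propose x' = x + Normal(0, step) and accept
    with probability min(1, target(x') / target(x)), working in log space."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        # Accept if log(u) < log target ratio; otherwise keep the current state.
        if math.log(rng.random()) < log_target(proposal) - log_target(x):
            x = proposal
        samples.append(x)
    return samples
```

For instance, `metropolis_hastings(lambda x: -0.5 * x * x, 0.0, 20000)` draws from a standard normal (the log-density up to a constant). Hamiltonian MC improves on this by using gradients of `log_target` to propose distant, high-acceptance moves, which is exactly why AD matters here.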

I believe that Julia will be a great end-to-end MCMC tool iff the post processing - posterior convergence diagnostics and stats a la coda and great plotting a la ggplot - are functional as well, such that the analysis can really happen end to end in Julia. Else the convenience of just calling jags (or stan) from R may keep most folks from making the effort to learn Julia. 
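For concreteness, one of the standard coda-style diagnostics is the Gelman-Rubin potential scale reduction factor. A minimal sketch follows (Python, for illustration; this is a simplified version of the statistic, not coda's implementation):

```python
def gelman_rubin(chains):
    """Potential scale reduction factor R-hat for m chains of equal length n,
    given as lists of scalar draws. Values near 1 suggest the chains have
    mixed; values well above 1 indicate non-convergence."""
    m, n = len(chains), len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    # Between-chain variance B and mean within-chain variance W.
    B = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)
    W = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m
    # Pooled variance estimate, then R-hat.
    var_plus = (n - 1) / n * W + B / n
    return (var_plus / W) ** 0.5
```

Chains sampled from the same distribution give R-hat close to 1, while chains stuck in different regions push it well above 1. This kind of post-processing is exactly the part that would need a home in Julia for an end-to-end workflow.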

My two cents...please keep me in the loop about developments, I'll aim to contribute where I can (the coda part, for instance could be up my alley in terms of programming chops required)....

Phil
--
Philipp Neubauer 

Postdoctoral Research Fellow  
Rutgers University
Institute of Marine & Coastal Sciences
71 Dudley Road
New Brunswick, New Jersey 08901

e-mail: neubaue...@gmail.com

Kevin Squire

unread,
Apr 13, 2013, 11:16:56 AM
to juli...@googlegroups.com
Thanks for setting up the repo and taking the lead on this, Theo.  I just sent a message to Jonas using "reply to author" on the original google groups post, pointing him to AutoDiff.jl and this email thread.  

I probably should have asked if you had already contacted him first, but either way, hopefully he'll get the message and respond here.

I'll look for my ad.jl updates (although it would be nice to have Jonas's permission before going much further).  Can you add his name to the code on AutoDiff.jl (or make a contributors file)?

Cheers,

   Kevin

Jonas Rauch

unread,
Apr 13, 2013, 11:46:05 AM
to juli...@googlegroups.com
Hey everyone,

As I just posted on GitHub, you are welcome to use my code and publish it under the MIT license. I would love to contribute myself, but I don't have any time at the moment.

I started this experiment as a way to learn julia (my first lines of julia code...), because the whole concept of Julia seems to be so suited to AD. I still think it should be possible to build an AD package that works together with almost all other julia code without changes, while still maintaining performance near the theoretical optimum. I'll come back in a few months and admire the progress you guys will have made by then. It's nice to see that so many people are interested in AD :-)

Cheers,
Jonas

Theodore Papamarkou

unread,
Apr 13, 2013, 12:34:21 PM
to juli...@googlegroups.com
Hi all,

Jonas, thank you very much for being willing to share your code under the MIT license. I have added you as a contributor by mentioning your name in the first lines of LICENSE.md and in README.md.

Kevin, thank you for contacting Jonas, this was very helpful and sped things up :) I made the changes you suggested, i.e. added Jonas' name in the LICENSE.md and temporarily renamed ad.jl to ad_jonas_rauch.jl (until we organize the code and file names once more files come in).

Tom, thanks for your suggestion, the license has been added to the package's repo.

Phil, thanks for the helpful feedback, please keep following up whenever you have the time.

Best,
Theo

ro...@biostat.ucsf.edu

unread,
Aug 21, 2014, 8:49:16 AM
to julia...@googlegroups.com, juli...@googlegroups.com


On Wednesday, April 10, 2013 8:44:18 PM UTC-7, Viral B. Shah wrote:
I just stumbled upon some analysis of Stan that Stefan had done a while back. Given that it is C++ templated code, it is not convenient to call from julia. However, Stefan's analysis may help convince someone to either translate or re-implement these methods in Julia:

For the record (I just found this thread after noticing that MCMC was a focus area for Julia stats), stan is callable from R and python, and so it would surely be possible to call it from Julia.  stan has been somewhat refactored over the last year or so, which should make it easier to slice and dice.  Note that "easier" != "easy."

I've also noticed the stan developers (who are statisticians) keep adding new algorithms and tweaking old ones, which seems to me an argument to use their work rather than reimplement it.

I've used stan extensively from R and looked into packaging it for Debian, and so have some sense of its internal structure. Stan is pretty heavyweight; it uses a bunch of other libraries, and even stan users need to have a compiler, since the typical workflow is that stan translates a user model to C++ code, which is then compiled (I believe, though am not sure, that the "symbolic" differentiation happens at compile time) and the resulting executable does the actual model estimation.

Ross Boylan

Amit Alon

unread,
Aug 24, 2014, 7:06:11 PM
to juli...@googlegroups.com, julia...@googlegroups.com
Hey Everyone, 

I'm hoping it's still possible to join this effort/project. Please let me know if you have room for one more in MCMC :-). 
I could be (and probably am) way off base here, but has anyone considered translating the C++ code in quantlib for use in MCMC? 

Amit 

John Myles White

unread,
Aug 25, 2014, 6:17:16 PM
to juli...@googlegroups.com
Hi Amit,

There's always more work to be done. The question is what you'd like to work on and how much supervision you'll need.

One project I've thought about lately is the idea of providing a unified interface that would let you use JAGS or Stan (or alternatives) as a backend, in the same way that JuMP lets you use many different solvers.

This is a very complex project because of the different natures of those tools, but it might get you thinking carefully about how these systems work.
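To make the idea concrete, such a unified interface might look something like the sketch below. This is purely hypothetical; every name here is invented for illustration (shown in Python rather than Julia for brevity), and a real design would have to handle data marshalling, compilation caching, and backend-specific options.

```python
from abc import ABC, abstractmethod

class SamplerBackend(ABC):
    """Hypothetical backend-agnostic interface, in the spirit of JuMP's
    solver abstraction. None of these names come from an actual package."""

    @abstractmethod
    def compile(self, model_source: str) -> "SamplerBackend":
        """Prepare a model written in some backend-understood language."""

    @abstractmethod
    def sample(self, data: dict, n_samples: int) -> dict:
        """Run the sampler and return draws keyed by parameter name."""

class EchoBackend(SamplerBackend):
    """Trivial stub standing in for a real JAGS/Stan wrapper: it just
    repeats each input value, so the plumbing can be tested end to end."""

    def compile(self, model_source):
        self.model = model_source
        return self

    def sample(self, data, n_samples):
        return {name: [value] * n_samples for name, value in data.items()}
```

The user-facing code would then depend only on `SamplerBackend`, so swapping JAGS for Stan (or a pure-Julia sampler) becomes a one-line change, which is exactly the appeal of the JuMP model.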

Porting quantlib sounds interesting, although I'd argue that a port should just be a standalone thing rather than a part of MCMC.jl.

 -- John