Nice to see others are actually interested in it. I don't think I'll be able to attend that SciPy meeting. I'd like to, but Texas is a long way from Germany, and I don't think I can convince my employer that it's all in their best interest ;)
Just in case you are interested, I have a lot of ideas about how message passing could be integrated with PyMC (and, actually, the other way around) - so many that it's getting hard to prioritize. So I'll toss in a few things here. If you consider anything especially interesting, please tell me.
The most important next steps on my roadmap are:
* Exact discrete inference submodels in PyMC3.
I have a finished (but probably not yet bug-free) clique tree implementation using Theano which should be easy to integrate with PyMC3. Being able to calculate exact derivatives through the model might allow for some pretty interesting applications. These clique trees would appear to PyMC as a discrete multivariate distribution (see the sketch below).
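To make the "exact derivatives" point concrete, here is a minimal Theano sketch of the idea. A brute-force sum over a single hidden node A stands in for the real clique tree, and all names and numbers are made up for the example:

import numpy as np
import theano
import theano.tensor as T

theta = T.dvector('theta')                 # unconstrained parameters for P(A)
p_a = T.exp(theta) / T.sum(T.exp(theta))   # softmax -> P(A)
cpt_b = T.dmatrix('cpt_b')                 # P(B | A), rows indexed by A
b_obs = T.iscalar('b_obs')                 # observed state of B

# exact marginal: P(B = b) = sum_a P(A = a) * P(B = b | A = a)
log_marginal = T.log(T.dot(p_a, cpt_b)[b_obs])

# exact gradient through the discrete marginalization
grad = T.grad(log_marginal, theta)
f = theano.function([theta, cpt_b, b_obs], [log_marginal, grad])

cpt = np.array([[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]])
print(f(np.zeros(3), cpt, 1))

The clique tree does the same thing, just with a much smarter summation order, so PyMC3's gradient-based samplers could treat the whole discrete submodel as one differentiable factor.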
* Fast and scalable approximate inference submodels using Loopy BP for PyMC 2
While loopy BP is only approximate (its estimates are biased), it's an extremely fast and scalable general-purpose inference algorithm. It is harder to integrate with PyMC3: because of the iterative nature of the algorithm (run it until convergence), calculating derivatives through the model is difficult. So I would primarily target this one at PyMC 2, to let it scale up to very high-dimensional problems.
This loopy BP implementation is about 80% done; a toy version of the core loop is sketched below.
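To illustrate the "run until convergence" structure (and why differentiating through it is awkward), here's a toy numpy version of the message update loop. The factor values and graph are made up, and the real implementation is organized quite differently:

import numpy as np

def loopy_bp(unary, pairwise, edges, n_states, max_iter=100, tol=1e-6):
    # one message per directed edge, initialized uniform
    msgs = {(i, j): np.ones(n_states) / n_states
            for (a, b) in edges for (i, j) in [(a, b), (b, a)]}
    for _ in range(max_iter):
        diff = 0.0
        for (i, j) in list(msgs):
            # unary potential times all incoming messages except the one from j
            belief = unary[i].copy()
            for (k, l) in msgs:
                if l == i and k != j:
                    belief *= msgs[(k, l)]
            psi = pairwise[(i, j)] if (i, j) in pairwise else pairwise[(j, i)].T
            new = belief.dot(psi)       # sum over the states of node i
            new /= new.sum()
            diff = max(diff, np.abs(new - msgs[(i, j)]).max())
            msgs[(i, j)] = new
        if diff < tol:                  # converged
            break
    # node beliefs: unary potential times all incoming messages
    beliefs = {}
    for i in unary:
        b = unary[i].copy()
        for (k, l) in msgs:
            if l == i:
                b *= msgs[(k, l)]
        beliefs[i] = b / b.sum()
    return beliefs

edges = [(0, 1), (1, 2), (2, 0)]        # a single loop
unary = {i: np.array([1.0, 2.0]) for i in range(3)}
pairwise = {e: np.array([[2.0, 1.0], [1.0, 2.0]]) for e in edges}
print(loopy_bp(unary, pairwise, edges, n_states=2))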
* Loader for at least one common Bayesian Network format
I have XDSL in mind, which has the advantage of being supported by GeNie, a free BN GUI tool, and by the free (but closed-source) SMILE library, for which I already wrote a Python wrapper (see http://genie.sis.pitt.edu/ ). I will probably use that wrapper at first, and later add my own code to parse the format.
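In case the wrapper turns out to be a hassle, parsing the XML directly shouldn't be too bad either. A rough sketch with the standard library - the element and attribute names here follow my reading of example .xdsl files and need to be double-checked against the format documentation:

import numpy as np
import xml.etree.ElementTree as ET

def load_xdsl(path):
    root = ET.parse(path).getroot()
    nodes = {}
    for cpt in root.iter('cpt'):
        states = [s.get('id') for s in cpt.findall('state')]
        parents_el = cpt.find('parents')
        parents = parents_el.text.split() if parents_el is not None else []
        # flat probability table, to be reshaped by (parent cards x own card)
        probs = np.fromstring(cpt.find('probabilities').text, sep=' ')
        nodes[cpt.get('id')] = {'states': states, 'parents': parents,
                                'cpt': probs}
    return nodes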
* Support for (possibly incomplete) evidence for a discrete Bayes-Net in the form of a Pandas DataFrame.
The BN would calculate the probability of the dataset given the hyperparameters. The hyperparameters themselves would be sampled by PyMC (for example from Dirichlet distributions).
This would obviously be very convenient. I'm not yet sure how to combine discrete and continuous evidence; maybe some kind of Kabuki integration would help.
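For the simplest possible "network" - a single discrete node - the PyMC 2 version of this could look roughly like the following sketch. NaN entries stand in for missing evidence (here they simply drop out; in a real BN with parents, the engine would marginalize them):

import numpy as np
import pandas as pd
import pymc

data = pd.DataFrame({'A': [0, 1, 1, np.nan, 2, 0]})
k = 3  # number of states of A

theta = pymc.Dirichlet('theta', theta=np.ones(k))   # hyperparameter prior
p = pymc.CompletedDirichlet('p', theta)             # full probability vector

@pymc.potential
def evidence(p=p):
    # log-probability of the observed (non-missing) rows given p
    obs = data['A'].dropna().astype(int).values
    return np.log(p[0, obs]).sum()

m = pymc.MCMC([theta, p, evidence])
m.sample(5000, burn=1000)

With a real BN engine, the potential would be replaced by the engine's likelihood of the whole DataFrame given all CPTs.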
* Examples, examples!
I would like to show off some of the possibilities all of the above offers, probably in a few IPython Notebooks, tackling some old problems with new tools.
All of the above would be pretty neat, and I think it will keep me occupied for a while. But then I have a few more ideas (which could probably keep me busy for years - and no, I don't believe I'll actually finish them all, but hey ...)
* More than just Dirichlet Priors ...
Once we have the above machinery in place, maybe we can explicitly support some interesting priors over discrete distributions, such as a truncated Dirichlet Process prior, or a smoothed or rank-ordered Dirichlet prior, or things like that. This is probably easiest in PyMC3 with the exact inference engine.
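As one example, a truncated DP prior over a K-state discrete distribution is just stick-breaking. A plain numpy sketch (in PyMC the Beta draws would of course become random variables):

import numpy as np

def truncated_dp_weights(alpha, K, rng=np.random):
    v = rng.beta(1.0, alpha, size=K - 1)            # stick-breaking fractions
    v = np.append(v, 1.0)                           # last stick takes the rest
    pieces = np.cumprod(np.append(1.0, 1.0 - v[:-1]))
    return v * pieces                               # weights sum to 1

print(truncated_dp_weights(alpha=2.0, K=10))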
* Particle Message Passing
Message passing can be extended to arbitrary (non-discrete) distributions by using particle-list based messages (see
http://www.dauwels.com/files/Particle.pdf for a nice overview). What's interesting here is that these particles could easily be created using MCMC submodels and combined with exact discrete messages. This would probably allow inference in pretty much arbitrary mixed-type models. Efficiency gains over pure MCMC are only to be expected if a substantial part of the model is discrete and can use exact messages, though.
I have my doubts about whether this can be implemented easily enough in Theano, so it's probably something I would attempt in PyMC 2 first. I think it's going to be really hard, though.
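The core trick itself is simple, though. A toy sketch of how a particle-list message from a continuous variable turns into an ordinary exact message at a discrete node (the Gaussian message and logistic factor are arbitrary examples):

import numpy as np

rng = np.random.RandomState(0)

# incoming continuous message m(x), represented as a weighted particle list
particles = rng.normal(loc=1.0, scale=2.0, size=1000)
weights = np.ones(1000) / 1000.0

# discrete factor P(D = 1 | x), here a logistic link
p_d1 = 1.0 / (1.0 + np.exp(-particles))

# outgoing discrete message m(D) = sum_s w_s * P(D | x_s)
m_d = np.array([(weights * (1 - p_d1)).sum(), (weights * p_d1).sum()])
print(m_d)   # a plain length-2 vector the exact engine can consume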
The following aren't necessarily related to message passing, but I'll mention them anyway:
* Expectation Maximization / Inverse Mixture Model Learning
It might very well be that for some variables we are not interested in unbiased samples, but would rather find a discrete set of local optima. One application for this might be Expectation Maximization (EM) learning over latent variables. For continuous variables, it's probably easy to find the local optimum around any MCMC sample in PyMC3 via gradient ascent. Since these optima effectively span a discrete sampling space, this might be a way to map a continuous distribution to a discrete one and continue with discrete inference from there (see the sketch below). I don't know if there's a word for this. Maybe inverse mixture model learning? :)
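A quick sketch of what I mean, on a made-up bimodal target (in PyMC3 the gradient would come from Theano; here scipy's default finite differences stand in):

import numpy as np
from scipy.optimize import minimize

def neg_logp(x):   # toy target with two well-separated modes at -3 and +3
    return -np.logaddexp(-0.5 * (x[0] + 3) ** 2, -0.5 * (x[0] - 3) ** 2)

samples = np.random.randn(50, 1) * 4      # stand-ins for MCMC draws
optima = []
for s in samples:
    x_star = minimize(neg_logp, s).x      # local optimization from each sample
    if not any(np.allclose(x_star, o, atol=1e-3) for o in optima):
        optima.append(x_star)             # keep only distinct optima

print(optima)   # the discrete "sampling space" of local modes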
* Importance Sampling and Simulated Annealing
PyMC usually draws unweighted samples. I imagine it should be easy to add the ability to draw weighted samples from a modified (more or less peaked) distribution. In applications where risk is being modeled, it might be important to explore low-probability-density regions thoroughly; in estimation or learning, we might be more interested in the peaks. By allowing PyMC to have a variable "temperature" (as in simulated annealing) and to record sample weights, the sampling efficiency for different purposes could be increased a lot, and PyMC could also be used for optimization more or less out of the box (see the sketch below).
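A toy sketch of the weighted-sample idea: run Metropolis on a tempered target p(x)**beta and attach importance weights so expectations under the original p can still be recovered (the 1-D normal target is just for illustration):

import numpy as np

def logp(x):                  # toy target: standard normal
    return -0.5 * x ** 2

beta = 0.5                    # "temperature" < 1 flattens the target
rng = np.random.RandomState(1)
x, chain = 0.0, []
for _ in range(20000):
    prop = x + rng.normal()
    if np.log(rng.rand()) < beta * (logp(prop) - logp(x)):
        x = prop              # Metropolis accept on the tempered target
    chain.append(x)

chain = np.array(chain[2000:])
logw = (1.0 - beta) * logp(chain)          # importance weights p / p**beta
w = np.exp(logw - logw.max())
w /= w.sum()
print((w * chain ** 2).sum())  # weighted E[x^2], should come out near 1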
* Network Structure Learning
If we have a BN engine, it would probably make sense to include support for PEBL (
https://code.google.com/p/pebl-project/ ), which does Bayesian network structure learning, and maybe more explicit support for structure sampling like the one mentioned in this thread:
https://groups.google.com/forum/#!topic/pymc/acTuyT4cp1Q - What might also be interesting is implementing causal structure learning algorithms like PC, IC and IC* (see Judea Pearl's book "Causality"). Or we could leave this to existing tools like GeNie / SMILE.
If you have any thoughts on these ideas, please comment...
best,
Kai Londenberg