
[Caml-list] ANN: Probabilistic programming in OCaml


ol...@okmij.org

Jun 1, 2009, 11:08:20 PM
to caml...@inria.fr

Chung-chieh Shan and I would like to announce the OCaml library
HANSEI, to express probabilistic models and perform probabilistic
inference. OCaml thus becomes a probabilistic programming
language.

The canonical example of a Bayesian net, the grass model, looks as
follows:

    open ProbM;;
    let grass_model () = (* unit -> bool *)
      let rain = flip 0.3 and sprinkler = flip 0.5 in
      let grass_is_wet =
        (flip 0.9 && rain) || (flip 0.8 && sprinkler) || flip 0.1 in
      if not grass_is_wet then fail ();
      rain;;

The model first defines the prior distributions of two events: rain
and the sprinkler being on. We then specify the Bayesian network:
grass may be wet because it rained or because the sprinkler was on, or
-- with probability 10% -- for some other reason. We also allow a 10%
chance that rain did not wet the grass and a 20% chance that the
sprinkler did not. We observe that the grass is wet. What are the
chances it rained? To find out, we execute
    exact_reify grass_model;;
    - : bool pV = [(0.322, V false); (0.2838, V true)]
which after normalization tells the posterior probability of raining:
0.2838 / (0.322 + 0.2838), about 0.47 (roughly 7/15).
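
The two weights above can be checked without HANSEI. The following is
a minimal sketch in plain OCaml that enumerates every combination of
the five coin flips by brute force and sums the weights of the worlds
in which the grass is wet. The names flip, grass_weights and
posterior_rain are local to this sketch; in particular, this flip
returns a list of weighted outcomes and is not the ProbM primitive.

```ocaml
(* Illustrative stand-in for a coin flip: both outcomes with their
   probabilities. This is NOT the flip from HANSEI's ProbM. *)
let flip p = [ (p, true); (1.0 -. p, false) ]

(* Enumerate all combinations of the five flips in the grass model and
   accumulate the unnormalized weights of the worlds where the grass
   is wet, split by whether it rained. *)
let grass_weights () =
  let w_true = ref 0.0 and w_false = ref 0.0 in
  List.iter (fun (p_rain, rain) ->
    List.iter (fun (p_spr, sprinkler) ->
      List.iter (fun (p_a, rain_wets) ->
        List.iter (fun (p_b, sprinkler_wets) ->
          List.iter (fun (p_c, other) ->
            let grass_is_wet =
              (rain_wets && rain) || (sprinkler_wets && sprinkler) || other in
            (* Worlds with dry grass contradict the observation: drop them. *)
            if grass_is_wet then begin
              let w = p_rain *. p_spr *. p_a *. p_b *. p_c in
              if rain then w_true := !w_true +. w
              else w_false := !w_false +. w
            end)
            (flip 0.1))
          (flip 0.8))
        (flip 0.9))
      (flip 0.5))
    (flip 0.3);
  (!w_false, !w_true)

(* Normalize: P(rain | grass is wet). *)
let posterior_rain () =
  let (w_false, w_true) = grass_weights () in
  w_true /. (w_false +. w_true)

let () =
  let (w_false, w_true) = grass_weights () in
  Printf.printf "unnormalized: false %.4f, true %.4f\n" w_false w_true;
  Printf.printf "P(rain | wet) = %.4f\n" (posterior_rain ())
```

Run as a script, this prints the weights 0.3220 and 0.2838 and the
posterior 0.4685. Brute-force enumeration is exponential in the number
of flips, so it is only a sanity check for small models.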

The probabilistic model is a regular OCaml function; the independent
random variables rain and sprinkler and the dependent random variable
grass_is_wet are regular OCaml boolean variables. We can pass the
values of these random variables (which are just booleans) to regular
OCaml functions such as 'not' and use the result in a regular if
statement.

HANSEI can handle models that are far more complex than the grass
model, supporting variable (or bucket) elimination, on-demand
evaluation of probabilistic expressions, memoization of stochastic
functions, and importance sampling.

Here is an example of on-demand evaluation:
    let lazy_pair () =
      let x = letlazy (fun () -> flip 0.5) in
      (x (), x ());;
    exact_reify lazy_pair;;
    - : (bool * bool) pV =
    [(0.5, V (true, true)); (0.5, V (false, false))]

We do not observe the pair (true, false). Evaluating the expression
x () several times gives the same result -- in the same possible
world. That result may be different in another possible world. For
that reason, we cannot use OCaml's own 'lazy' evaluation: OCaml's lazy
is not thread-safe in this sense, since a value cached by forcing it
in one possible world would also be seen in every other world.
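
To make the sharing concrete, here is a small sketch that contrasts a
choice made once per world and used twice with a choice made afresh at
each use. It represents a distribution as an explicit list of weighted
outcomes. That representation and the names dist, return, bind,
shared_pair and independent_pair are assumptions of this sketch, not
HANSEI's implementation, and the sketch models only the sharing of the
result, not the deferral of the choice until first use.

```ocaml
(* A distribution as a list of (probability, value) pairs.
   Illustrative only; not HANSEI's representation. *)
type 'a dist = (float * 'a) list

let return (x : 'a) : 'a dist = [ (1.0, x) ]

(* Run f in every world of d, multiplying the probabilities. *)
let bind (d : 'a dist) (f : 'a -> 'b dist) : 'b dist =
  List.concat
    (List.map
       (fun (p, x) -> List.map (fun (q, y) -> (p *. q, y)) (f x))
       d)

let flip (p : float) : bool dist = [ (p, true); (1.0 -. p, false) ]

(* Shared choice: one flip per world, its result used twice. *)
let shared_pair : (bool * bool) dist =
  bind (flip 0.5) (fun x -> return (x, x))

(* Independent choices: a fresh flip at each use. *)
let independent_pair : (bool * bool) dist =
  bind (flip 0.5) (fun x -> bind (flip 0.5) (fun y -> return (x, y)))
```

Here shared_pair has only the two outcomes (true, true) and
(false, false), each with weight 0.5, as in the lazy_pair result,
whereas independent_pair has all four pairs with weight 0.25 each.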

A particular feature of HANSEI is that it permits calls to inference
procedures (e.g., exact_reify) to appear in models. After all, both are
OCaml expressions. Distributions thus can be parameterized over
distributions and inference procedures can reason about their own
accuracy.
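
The idea of inference inside a model can be sketched with the same
kind of explicit weighted-list representation. Everything here is an
assumption made for illustration: prob_true stands in for an inference
procedure, and both_heads and outer are hypothetical models; none of
these names come from HANSEI.

```ocaml
(* Weighted-list distributions, for illustration only. *)
type 'a dist = (float * 'a) list

let return (x : 'a) : 'a dist = [ (1.0, x) ]

let bind (d : 'a dist) (f : 'a -> 'b dist) : 'b dist =
  List.concat
    (List.map
       (fun (p, x) -> List.map (fun (q, y) -> (p *. q, y)) (f x))
       d)

let flip (p : float) : bool dist = [ (p, true); (1.0 -. p, false) ]

(* Stand-in inference procedure: the probability that a boolean
   model yields true. *)
let prob_true (d : bool dist) : float =
  List.fold_left (fun acc (p, b) -> if b then acc +. p else acc) 0.0 d

(* Inner model: two tosses of a coin with the given bias both come
   up heads. *)
let both_heads (bias : float) : bool dist =
  bind (flip bias) (fun a ->
    bind (flip bias) (fun b -> return (a && b)))

(* Outer model: pick the coin at random (fair or two-headed), then
   call the inference procedure on the inner model. The result is a
   distribution over inferred probabilities. *)
let outer : float dist =
  bind (flip 0.5) (fun fair ->
    let bias = if fair then 0.5 else 1.0 in
    return (prob_true (both_heads bias)))
```

In this sketch outer is [(0.5, 0.25); (0.5, 1.0)]: with probability
0.5 the inner inference reports 0.25, and with probability 0.5 it
reports 1.0.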

The HANSEI code is available at
http://okmij.org/ftp/kakuritu/
The web page also presents HANSEI code for sample probabilistic models
and standard benchmarks (HMM, noisy-or, population estimation, belief
networks).

The current documentation includes two complementary papers
http://okmij.org/ftp/kakuritu/dsl-paper.pdf
to be presented at the IFIP working conference on domain-specific
languages and
http://okmij.org/ftp/kakuritu/embedpp.pdf
to be presented at `Uncertainty in AI'. The papers are written for
different audiences. The first paper explains the implementation of
probabilistic primitives whereas the second describes
the applications of the library.


ol...@okmij.org

Jun 3, 2009, 1:45:15 AM
to caml...@inria.fr

Eliot Handelman wrote:
> This looks really interesting, but can it be compiled with 3.10.1?
> First bash at compiling the delimcc library failed.

Yes, the library can be used with 3.10.1 and even 3.09. I think I know
what problem you have encountered. Ideally, to compile delimcc you
need the configured OCaml sources, because delimcc needs a few header
files that are not normally installed. Since one could install OCaml
from a binary distribution, the sources are not necessarily
available. Therefore, delimcc includes the needed header files as part
of its distribution, for the most common platforms: ia32 and
Linux/BSD. Alas, these files differ slightly among OCaml versions
(e.g., because of renaming). Therefore, the delimcc distribution
includes two sets of files, in the directories ocaml-byterun-3.09 and
ocaml-byterun-3.10. With OCaml 3.10.1, one has to use the
ocaml-byterun-3.09 files rather than ocaml-byterun-3.10. Sorry, this
is indeed confusing.

Thus, you may want to edit the Makefile in delimcc, comment out the
line
    OCAMLINCLUDES=./ocaml-byterun-3.10
and uncomment the line above it that contains 3.09. I think that will
build the delimcc library (please run `make testd0' to make sure).
