beginner question about mixture models

311 views
Skip to first unread message

mike dewar

unread,
Dec 29, 2010, 2:39:38 PM12/29/10
to PyMC
Hi,

FIrst off, apologies if I'm missing something obvious. I'm trying to
build a mixture model, and I'm finding it a bit frustrating. Here's as
far as I've got:

import pymc
import numpy as np

pi = [0.3,0.4,0.2,0.1]

X = pymc.Multinomial(
'X',
n=1,
p=pi
)

mu = np.array([1,2,3,4])
tau = np.array([1,2,1,2])

@pymc.stochastic
def W(value=0, X=X, mu=mu, tau=tau):

def logp(value,X,mu,tau):
g = [pymc.Normal('W_%s'%k, mu, tau) for k,(mu,tau) in
enumerate(zip(mu,tau))]
i = X.nonzero()[0]
return g[i].logp

def random(X,mu,tau):
g = [pymc.Normal('W_%s'%k, mu, tau) for k,(mu,tau) in
enumerate(zip(mu,tau))]
i = X.nonzero()[0]
return g[i].random()

This seems to work, in that I can happily sample from W and it behaves
correctly given the value of X. However, I feel like I'm doing
something wrong, due to the fact that I have to repeat the
construction of the list of Normals over and over again, the
specification of value=0 worries me, and that i = X.nonzero()[0] line
worries me too.

I thought maybe I'd define the list of Normals (g) outside W, but then
all I can ever seem to get into W is the values of the normals in g,
as they're converted on their way in (something to do with this
LazyFunction thing right?). I'm new to pymc (and have never used BUGS
etc) and my experience with MCMC is pretty limited too.

Being pointed in the right direction would be most appreciated!

Cheers,

Mike Dewar

Chris Fonnesbeck

unread,
Dec 29, 2010, 7:00:04 PM12/29/10
to PyMC
On Dec 29, 11:39 am, mike dewar <mikede...@gmail.com> wrote:
> Hi,
>
> FIrst off, apologies if I'm missing something obvious. I'm trying to
> build a mixture model, and I'm finding it a bit frustrating. Here's as
> far as I've got:

Hey Mike,

These are good, non-obvious questions. Your code is not quite right,
due to the fact you are using distribution objects, which are meant to
be used on their own, within decorator-instantiated stochastic
objects. So, instead of Normal, you should just use the normal_like
function. See the user's guide for details on this.

More generally, there is a more efficient way of building mixture
models in PyMC. What I would do is first use a Categorical
distribution to model the index to the appropriate normal model, then
use a deterministic node to index out the appropriate values of mu and
tau. Then, a simple Normal object can be used in place of the more
complicated expression that you have created. A final thing is that I
do not see a data likelihood anywhere in your model, though I assume
the normal model would have some data attached to it. In this case,
the observed flag must be set to avoid the values changing at every
iteration. Here is my version:

pi = [0.3,0.4,0.2,0.1]
X = pymc.Categorical('X', p=pi)

mu = np.array([1,2,3,4])
tau = np.array([1,2,1,2])

mu_i = pymc.Lambda('mu_i', lambda X: mu[X])
tau_i = pymc.Lambda('tau_i', lambda X: tay[X])

W = pymc.Normal('W', mu=mu_i, tau=tau_i, value=mydata, observed=True)

Note that the Categorical model will return either a 0,1,2 or 3, which
can be used directly for indexing the appropriate mean and precision
for the mixture.

Let me know if you have any questions.

cf

Mike Dewar

unread,
Dec 30, 2010, 4:09:47 PM12/30/10
to py...@googlegroups.com
Hi Chris,

Thanks so much for your response! I have only one problem with it - when I try and run this

> pi = [0.3,0.4,0.2,0.1]
> X = pymc.Categorical('X', p=pi)
>
> mu = np.array([1,2,3,4])
> tau = np.array([1,2,1,2])
>
> mu_i = pymc.Lambda('mu_i', lambda X: mu[X])


I get an error: ValueError: mu_i: All arguments to lam_fun must have default values.

Running inspect.argspec on the lambda function confirms the lack of 'defaults':

inspect.getargspec(lam_fun)
Out[32]: ArgSpec(args=['X'], varargs=None, keywords=None, defaults=None)

After a bit of mucking about I can make it work using

mu_i = pymc.Lambda('mu_i', lambda X=0: mu[X])

(must admit to being embarrassed how long it took me to realise you can specify default values in lambda functions, despite it being rather clear in your documentation).

However, I'm concerned that I /have/ to specify default values - when are they used? How should I chose these?

M

> --
> You received this message because you are subscribed to the Google Groups "PyMC" group.
> To post to this group, send email to py...@googlegroups.com.
> To unsubscribe from this group, send email to pymc+uns...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/pymc?hl=en.
>

Chris Fonnesbeck

unread,
Dec 31, 2010, 7:01:58 PM12/31/10
to py...@googlegroups.com
Sorry, my error -- I replied to you from the airport, so I did not have time to test. Just do the following:

mu_i = pymc.Lambda('mu_i', lambda X=X: mu[X])
cf
-- 
Chris Fonnesbeck
Nashville, TN
Reply all
Reply to author
Forward
0 new messages