question on multilevel model -- Gelman's radon dataset

876 views
Skip to first unread message

Whit Armstrong

unread,
Sep 9, 2009, 4:50:39 PM9/9/09
to py...@googlegroups.com
I'm just a little confused about how to go about fitting a multilevel
model in pymc.

The example I'm attempting to replicate is the varying intercept model
taken from the ARM book (http://www.stat.columbia.edu/~gelman/arm/).

The example and data can be found here:
http://www.stat.columbia.edu/~gelman/arm/examples/radon/

I'm attempting to replicate the bug file "radon.1.bug" in which
intercepts for each county take their prior from the mean of the
global dataset (mu.a).

The concept that I'm struggling with in pymc is how to define the
linkage between the a[i]'s and mu.a.

Any help is appreciated.

-Whit


warmstrong@research:~/scripts/gelman.examples/radon$ cat radon.1.bug
# Bugs code for multilevel model for radon
# with bsmt as an individual predictor

# varying-intercept model

model {
for (i in 1:n){
y[i] ~ dnorm (y.hat[i], tau.y)
y.hat[i] <- a[county[i]] + b*x[i]
}
b ~ dnorm (0, .0001)
tau.y <- pow(sigma.y, -2)
sigma.y ~ dunif (0, 100)

for (j in 1:J){
a[j] ~ dnorm (mu.a, tau.a)
}
mu.a ~ dnorm (0, .0001)
tau.a <- pow(sigma.a, -2)
sigma.a ~ dunif (0, 100)

A. Flaxman

unread,
Sep 12, 2009, 3:26:46 PM9/12/09
to py...@googlegroups.com
Hi Whit,

Have you made progress with your project? I don't have time to try it out myself right now, but I think the PyMC model will define variables in the reverse order of bugs code, i.e.:

sigma_a = Uniform('sigma_a', 0, 100)
@deterministic
def tau_a(s=sigma_a):
return s**-2
mu_a = Normal('mu_a', 0, .0001)

a = []
for j in range(J):
a.append(Normal('a_%d'%j, mu_a, tau_a))

When you get things working, please share the code. I think that having side-by-side examples of PyMC and Bugs models will be helpful for people deciding what MCMC system to use.

--Abie

________________________________________
From: py...@googlegroups.com [py...@googlegroups.com] On Behalf Of Whit Armstrong [armstro...@gmail.com]
Sent: Wednesday, September 09, 2009 1:50 PM
To: py...@googlegroups.com
Subject: [pymc] question on multilevel model -- Gelman's radon dataset

M. Aaron MacNeil

unread,
Sep 12, 2009, 4:38:58 PM9/12/09
to py...@googlegroups.com
Further to Abie's email, Whit,
A (potentially big) time-saver in PyMC is the use of indexing on
arrays. You can pass the index into the linear model bit before it is
sent to the data without the need for a loop.

For Gelman's radon.1.bug:

# Bugs code for multilevel model for radon
# with bsmt as an individual predictor

# varying-intercept model

model {
for (i in 1:n){
y[i] ~ dnorm (y.hat[i], tau.y)
y.hat[i] <- a[county[i]] + b*x[i]
}
b ~ dnorm (0, .0001)
tau.y <- pow(sigma.y, -2)
sigma.y ~ dunif (0, 100)

for (j in 1:J){
a[j] ~ dnorm (mu.a, tau.a)
}
mu.a ~ dnorm (0, .0001)
tau.a <- pow(sigma.a, -2)
sigma.a ~ dunif (0, 100)
}

So if array indx_c is an vector of length n with index numbers
corresponding to the J counties, and y and x are data vectors, an
equivalent in PyMC would be:

# Priors
mu_a = Normal('mu_a', mu=0., tau=0.0001)
sigma_a = Uniform('sigma_a', lower=0, upper=100)
tau_a = Lambda('tau_a', lambda s=sigma_a: s**-2)

a = Normal('a', mu=mu_a, tau=tau_a, value=zeros(J))

b = Normal('b', mu=0., tau=0.0001)
sigma_y = Uniform('sigma_y', lower=0, upper=100)
tau_y = Lambda('tau_y', lambda s=sigma_a: s**-2)

# Model
@determinisitc
def y_hat(a=a,b=b):
return a[indx_c] + b*x

# Likelihood
@stochastic(observed=True)
def y_i(value=y, mu=y_hat, tau=tau_y):
return normal_like(value,mu,tau)


There are other slightly tidier ways to code the model and likelihood
bits but I tend to like to have them in the decorator form. At any
rate, it should end up running a bit faster than the loops. I'm
frequently stealing WinBUGS code and translating into PyMC these days,
mostly for the flexibility it gives me to cut and paste into more
complex models and to play with data / posteriors.

Have fun,
A.
+

M. Aaron MacNeil, PhD
Research Scientist
Australian Institute of Marine Science

PMB 3 Townsville MC
Townsville 4810
Queensland Australia

Tel: +61 (0)7 4753 4191
Fax: +61 (0)7 4772 5852

+

Whit Armstrong

unread,
Sep 13, 2009, 1:45:29 PM9/13/09
to py...@googlegroups.com
Abie,

Thanks for the post. I'm happy to post back my example once it's
working. I do have the basic setup now which is running correctly. I
just need to tie out the numbers to the bugs results.

-Whit

Whit Armstrong

unread,
Sep 13, 2009, 1:55:37 PM9/13/09
to py...@googlegroups.com
Aaron,

I'm an R user, so it's very natural to think in vectors. Crossing over
to python, I'm a bit clumsy at the moment.

Basing my code on the "disasters" example, I have a two step process
for the intercept term. One in which I initialize the intercept term
to the proper values, and then fit it:

@pymc.deterministic(plot=False)
def intercept(intercept_ary=intercept_ary, county_map=county_map):
""" Concatenate intercepts """
ans = np.empty(len(county_map[1]))
for i in range(0,len(intercept_ary)):
ans[county_map[i]] = intercept_ary[i]
return ans


which is analogous to the disasters example:
@deterministic(plot=False)
def r(s=s, e=e, l=l):
""" Concatenate Poisson means """
out = np.empty(len(disasters_array))
out[:s] = e
out[s:] = l
return out

But from your post, I see that what I want can actually be done in a
much better way, by indexing directly into the array of intercepts,
rather than duplicating them into a full vector at each step. That
should indeed save some time.

Thanks for the post!

-Whit

Whit Armstrong

unread,
Sep 14, 2009, 12:12:22 PM9/14/09
to py...@googlegroups.com
After some help from Abie and Aaron, I have the radon model up and
running. The numbers for the varying intercept model tie out to the
numbers from the bugs model (at least by visual inspection).

The examples are posted here:
http://github.com/armstrtw/pymc_radon

Aside from the data import, the model is 100% based on Aaron's code
posted in this thread (thanks, Aaron).

I'm pulling the dataset from our in house database, so that will have
to be changed to create a portable example.

Feedback welcome.

-Whit
Reply all
Reply to author
Forward
0 new messages