more gelman radon questions -- scaled inverse wishart

Whit Armstrong

no leída,

14 sept 2009, 15:19:0714/9/09

a py...@googlegroups.com

Moving on to the next example in the arm book, I'm baffled about how
to define a multivariate normal distribution.

In bugs it's: B.raw[j,1:K] ~ dmnorm(mu.raw[], Tau.B.raw[,])

but I can't seem to find an equivalent construct in pymc.

Here is the bugs file:

model {
for (i in 1:n){
y[i] ~ dnorm(y.hat[i], tau.y)
y.hat[i] <- inprod(B[county[i],],X[i,])
e.y[i] <- y[i] - y.hat[i]
}
tau.y <- pow(sigma.y, -2)
sigma.y ~ dunif(0, 100)

for (j in 1:J) {
for(k in 1:K) {
B[j,k] <- xi[k] * B.raw[j,k]
}
B.raw[j,1:K] ~ dmnorm(mu.raw[], Tau.B.raw[,])
}
for(k in 1:K) {
mu[k] <- xi[k] * mu.raw[k]
mu.raw[k] ~ dnorm(0,.0001)
xi[k] ~ dunif(0,100)
}
Tau.B.raw[1:K,1:K] ~ dwish(W[,],df)
df <- K + 1
Sigma.B.raw[1:K,1:K] <- inverse(Tau.B.raw[,])
for(k in 1:K) {
for(k.prime in 1:K) {
rho.B[k,k.prime] <- Sigma.B.raw[k,k.prime] /
sqrt(Sigma.B.raw[k,k] * Sigma.B.raw[k.prime,k.prime])
}
sigma.B[k] <- abs(xi[k]) * sqrt(Sigma.B.raw[k,k])
}
}

and what I have so far:

# Priors
sigma_y = pymc.Uniform('sigma_y', lower=0, upper=100)
tau_y = pymc.Lambda('tau_y', lambda s=sigma_y: s**-2)

xi = pymc.Uniform('xi', lower=0, upper=100, value=np.zeros(k))
mu_raw = pymc.Normal('mu_raw', mu=0., tau=0.0001, value=np.zeros(k))
Tau_B_raw = pymc.Wishart('Tau_B_raw', df, Tau=np.diag(np.ones(k)))

## FIXME:
B_raw = pymc.Normal('B_raw',mu_raw,Tau_B_raw)

@pymc.deterministic(plot=False)
def Sigma_B_raw(Tau_B_raw=Tau_B_raw):
inverse(Tau_B_raw)

@pymc.deterministic(plot=False)
def rho_B(Sigma_B_raw=Sigma_B_raw):
Sigma_B_raw / sqrt(diag(Sigma_B_raw) * Sigma_B_raw)

@pymc.deterministic(plot=False)
def Sigma_B(xi=xi,Sigma_B_raw=Sigma_B_raw):
abs(xi) * sqrt(diag(Sigma_B_raw))

@pymc.deterministic(plot=False)
def B(xi=xi, B_raw=B_raw):
dot(xi, B_raw)

@pymc.deterministic(plot=False)
def mu(xi=xi, mu_raw=mu_raw):
xi * mu_raw

Thanks,
Whit

Whit Armstrong

no leída,

15 sept 2009, 10:40:2415/9/09

a py...@googlegroups.com

I found the MvNormal function, but I still can't make this example work in pymc.

I'm struggling with the definition of B_raw:

B_raw = []
B_raw_m = np.zeros( (J,K) )
for i in range(J):
B_raw.append(pymc.MvNormal('B_raw_%i' % i, mu_raw, Tau_B_raw,
value=B_raw_m[i]))

I need to use the coefficients as a contiguous array. Hence, I
initialize B_raw_m as a numpy array and then pass each row of the
array as the value argument to the MvNormal function.

Does MvNormal make a deep copy of the numpy array? Or can it share
memory with the B_raw_m object? I need the values of B_raw_m to be
updated whenever the B_raw_i's are updated.

Does this issue have anything to do with cache_depth?

Thanks,
Whit

Anand Patil

no leída,

15 sept 2009, 10:59:1315/9/09

a py...@googlegroups.com

Hi Whit,

On Tue, Sep 15, 2009 at 3:40 PM, Whit Armstrong <armstro...@gmail.com> wrote:

Does MvNormal make a deep copy of the numpy array? Or can it share
memory with the B_raw_m object? I need the values of B_raw_m to be
updated whenever the B_raw_i's are updated.

The right tool for this job is a deterministic:

@pm.deterministic

def B_raw_m(B_raw = B_raw):

return np.vstack(B_raw)

There's no facility for sharing memory like you want. PyMC needs to count on variables' values being immutable, meaning they won't be changed behind its back. If it goes to reject a jump, it needs to be sure the previous value is still intact.

Does this issue have anything to do with cache_depth?

Caching does make it even more important for values to be immutable, but if it weren't for the jump-rejecting requirement we might have found a workaround.

HTH

Anand

Whit Armstrong

no leída,

15 sept 2009, 11:09:4315/9/09

a py...@googlegroups.com

Thanks, Anand.

My first version of the model used deterministic to do the rbind at
each iteration. but the model seems to run extremely slowly (i.e.
much slower than then bugs model).

I will try to produce some real numbers to look at.

Thanks,
Whit

Anand Patil

no leída,

15 sept 2009, 11:15:4015/9/09

a py...@googlegroups.com

On Tue, Sep 15, 2009 at 4:09 PM, Whit Armstrong <armstro...@gmail.com> wrote:

Thanks, Anand.

My first version of the model used deterministic to do the rbind at
each iteration. but the model seems to run extremely slowly (i.e.
much slower than then bugs model).

If you haven't already discovered it, IPython's profiler is handy for investigating these things:

In [1]: run -p mcmc.py

will give you a list of the time spent in all your functions, including the internal function of B_raw_m.

Anand

Whit Armstrong

no leída,

15 sept 2009, 11:55:3815/9/09

a py...@googlegroups.com

That's fantastic. I'll try the profiler.

here are a few timings:

with M.sample(iter=1e3, burn=10, thin=2)
warmstrong@research:~/scripts/radon$ time python run_radon_inv_wishart.py
real 0m55.371s
user 0m55.200s
sys 0m0.160s

with M.sample(iter=20e3, burn=5e3, thin=10)
warmstrong@research:~/scripts/radon$ time python run_radon_inv_wishart.py
real 18m18.272s
user 18m15.780s
sys 0m0.400s

This is on an 8 core 64bit amd at 2.6ghz w/ 64GB ram and no other
activity at the moment.

-Whit

Whit Armstrong

no leída,

15 sept 2009, 12:22:2615/9/09

a py...@googlegroups.com

Here are the first 20 or so lines of the profiler output. Nothing really jumps out at me, but I'm not a python guru.

Any feedback appreciated. I'll post this version of the radon model with the other ones on github as well as the data. That way anyone who wants to can tinker w/ this example.

-Whit

77164946 function calls (75081733 primitive calls) in 136.673 CPU seconds

Ordered by: internal time

ncalls tottime percall cumtime percall filename:lineno(function)
14531855 38.346 0.000 43.273 0.000 shape_base.py:289(atleast_2d)
172243 20.349 0.000 20.349 0.000 radon_inv_wishart.py:77(y_hat)
174797 11.574 0.000 11.574 0.000 distributions.py:1880(normal_like)
330675 7.604 0.000 7.604 0.000 {numpy.core.multiarray.concatenate}
17121012 7.337 0.000 7.337 0.000 PyMCObjects.py:576(get_value)
171080 6.607 0.000 49.890 0.000 {map}
170963 6.049 0.000 12.122 0.000 {method 'run' of 'pymc.Container_values.LCValue' objects}
176000 3.533 0.000 6.270 0.000 StepMethods.py:1104(propose)
2073750/1383355 3.506 0.000 119.696 0.000 {method 'get' of 'pymc.LazyFunction.LazyFunction' objects}
508725 3.107 0.000 3.425 0.000 distributions.py:1682(mv_normal_like)
14905132 2.847 0.000 2.847 0.000 {method 'append' of 'list' objects}
1381725 2.556 0.000 122.130 0.000 PyMCObjects.py:625(get_logp)
176000 2.310 0.000 134.757 0.001 StepMethods.py:1125(step)
1206606/690340 2.299 0.000 100.972 0.000 {method 'run' of 'pymc.Container_values.DCValue' objects}
15070015/15069829 2.266 0.000 2.266 0.000 {len}
172243 2.104 0.000 2.104 0.000 radon_inv_wishart.py:68(B)
692025/347218 1.909 0.000 98.705 0.000 PyMCObjects.py:361(get_value)
356000 1.495 0.000 123.624 0.000 Node.py:14(logp_of_set)
170963 1.378 0.000 58.542 0.000 shape_base.py:402(vstack)
1206606/690340 1.055 0.000 101.601 0.000 Container.py:485(get_value)
178087 0.887 0.000 1.411 0.000 PyMCObjects.py:583(set_value)
516729 0.746 0.000 4.362 0.000 distributions.py:2488(wrapper)
159710 0.640 0.000 1.868 0.000 StepMethods.py:1206(internal_tally)
946962 0.433 0.000 0.433 0.000 {isinstance}
178616 0.418 0.000 0.418 0.000 numeric.py:232(asanyarray)
162999 0.416 0.000 0.416 0.000 numeric.py:180(asarray)
356000 0.401 0.000 124.025 0.000 StepMethods.py:270(_get_logp_plus_loglike)
176867 0.372 0.000 0.397 0.000 {numpy.core._dotblas.dot}
176000 0.348 0.000 0.348 0.000 fromnumeric.py:101(reshape)
685002 0.314 0.000 0.314 0.000 fromnumeric.py:1006(shape)
1 0.268 0.268 135.582 135.582 MCMC.py:187(_loop)
159798 0.240 0.000 0.749 0.000 fromnumeric.py:886(ravel)
172796 0.203 0.000 11.758 0.000 radon_inv_wishart.py:82(y_i)
170963 0.199 0.000 58.741 0.000 radon_inv_wishart.py:52(B_raw_m)
170963 0.178 0.000 12.300 0.000 Container.py:408(get_value)
161295 0.172 0.000 0.172 0.000 {method 'ravel' of 'numpy.ndarray' objects}
176177 0.151 0.000 0.298 0.000 function_base.py:189(iterable)
176181 0.147 0.000 0.147 0.000 {iter}
80701 0.130 0.000 0.250 0.000 StepMethods.py:1201(reject)
517546 0.119 0.000 0.119 0.000 {method 'pop' of 'dict' objects}
176778 0.119 0.000 0.119 0.000 {method 'random_sample' of 'mtrand.RandomState' objects}
5760/193 0.105 0.000 0.517 0.003 Container.py:145(file_items)
82694 0.094 0.000 0.123 0.000 PyMCObjects.py:616(revert)
2052/1112 0.075 0.000 0.367 0.000 Container.py:451(__init__)
1 0.061 0.061 0.069 0.069 backend_tkagg.py:3(<module>)
178877 0.059 0.000 0.059 0.000 PyMCObjects.py:692(_get_observed)
1847 0.056 0.000 0.067 0.000 Container.py:229(sort_list)
352 0.050 0.000 0.196 0.001 StepMethods.py:983(update_cov)
178087 0.049 0.000 0.049 0.000 {method 'click' of 'pymc.LazyFunction.Counter' objects}
19800 0.046 0.000 0.198 0.000 ram.py:81(tally)
1906 0.045 0.000 0.101 0.000 PyMCObjects.py:707(_get_moral_neighbors)
4002 0.037 0.000 0.037 0.000 distributions.py:2286(uniform_like)
4723/4474 0.033 0.000 0.034 0.000 defmatrix.py:239(__array_finalize__)
82694 0.029 0.000 0.029 0.000 {method 'unclick' of 'pymc.LazyFunction.Counter' objects}

Whit Armstrong

no leída,

15 sept 2009, 14:43:0815/9/09

a py...@googlegroups.com

updated timings and package:

pymc 10k iterations, 3k burnin, thin=5

real 9m12.073s

user 9m11.730s

sys 0m0.160s

R/jags 3 chains, 10k iterations, 3k burnin, thin=5

user system elapsed

3.32050 0.05250 3.37625

which is about 3x faster.

I'm not sure what the impact is of nchains=3 for R. If it is really running 10k iterations for each of the 3 chains, then in reality R is 9x faster (since pymc is only running one chain).

I would be very happy to find out that I'm wrong about this test, but for now it seems like pymc is drastically slower than R/bugs/jags for this type of model.

test scripts with data available here:

http://github.com/armstrtw/pymc_radon

to replicate this test:

time python run_radon_inv_wishart.py

R < radon.wishart.R

-Whit

---------- Forwarded message ----------
From: Whit Armstrong <armstro...@gmail.com>
Date: Tue, Sep 15, 2009 at 12:22 PM
Subject: Re: [pymc] Re: Fwd: more gelman radon questions -- scaled inverse wishart
To: py...@googlegroups.com

Chris Fonnesbeck

no leída,

16 sept 2009, 4:14:0416/9/09

a PyMC

Hi Whit,

I'm travelling at the moment, so I am not able to inspect your model,
but the performance of pymc can depend strongly on how you have
specified the model. The best structure in R may not be the best
structure in pymc. Just a hunch. If nobody has the answer before I
return, I will take a look at the first opportunity.

cf

> 176000 3.533 0.000 6.270 0.000 StepMethods.py:1104(propose)2073750/1383355 3.506 0.000 119.696 0.000 {method 'get' of

> 'pymc.LazyFunction.LazyFunction' objects}
> 508725 3.107 0.000 3.425 0.000
> distributions.py:1682(mv_normal_like)
> 14905132 2.847 0.000 2.847 0.000 {method 'append' of 'list'
> objects}
> 1381725 2.556 0.000 122.130 0.000 PyMCObjects.py:625(get_logp)

> 176000 2.310 0.000 134.757 0.001 StepMethods.py:1125(step)1206606/690340 2.299 0.000 100.972 0.000 {method 'run' of

> 'pymc.Container_values.DCValue' objects}
> 15070015/15069829 2.266 0.000 2.266 0.000 {len}

> 172243 2.104 0.000 2.104 0.000 radon_inv_wishart.py:68(B)692025/347218 1.909 0.000 98.705 0.000

> PyMCObjects.py:361(get_value)
> 356000 1.495 0.000 123.624 0.000 Node.py:14(logp_of_set)

> 170963 1.378 0.000 58.542 0.000 shape_base.py:402(vstack)1206606/690340 1.055 0.000 101.601 0.000

> Container.py:485(get_value)
> 178087 0.887 0.000 1.411 0.000 PyMCObjects.py:583(set_value)
> 516729 0.746 0.000 4.362 0.000 distributions.py:2488(wrapper)
> 159710 0.640 0.000 1.868 0.000
> StepMethods.py:1206(internal_tally)
> 946962 0.433 0.000 0.433 0.000 {isinstance}
> 178616 0.418 0.000 0.418 0.000 numeric.py:232(asanyarray)
> 162999 0.416 0.000 0.416 0.000 numeric.py:180(asarray)
> 356000 0.401 0.000 124.025 0.000
> StepMethods.py:270(_get_logp_plus_loglike)
> 176867 0.372 0.000 0.397 0.000 {numpy.core._dotblas.dot}
> 176000 0.348 0.000 0.348 0.000 fromnumeric.py:101(reshape)
> 685002 0.314 0.000 0.314 0.000 fromnumeric.py:1006(shape)

> 1 0.268 0.268 135.582 135.582MCMC.py:187(_loop)

> 159798 0.240 0.000 0.749 0.000 fromnumeric.py:886(ravel)
> 172796 0.203 0.000 11.758 0.000 radon_inv_wishart.py:82(y_i)
> 170963 0.199 0.000 58.741 0.000
> radon_inv_wishart.py:52(B_raw_m)
> 170963 0.178 0.000 12.300 0.000 Container.py:408(get_value)
> 161295 0.172 0.000 0.172 0.000 {method 'ravel' of
> 'numpy.ndarray' objects}
> 176177 0.151 0.000 0.298 0.000 function_base.py:189(iterable)
> 176181 0.147 0.000 0.147 0.000 {iter}
> 80701 0.130 0.000 0.250 0.000 StepMethods.py:1201(reject)
> 517546 0.119 0.000 0.119 0.000 {method 'pop' of 'dict'
> objects}
> 176778 0.119 0.000 0.119 0.000 {method 'random_sample' of
> 'mtrand.RandomState' objects}
> 5760/193 0.105 0.000 0.517 0.003 Container.py:145(file_items)

> 82694 0.094 0.000 0.123 0.000 PyMCObjects.py:616(revert)2052/1112 0.075 0.000 0.367 0.000 Container.py:451(__init__)

> 1 0.061 0.061 0.069 0.069 backend_tkagg.py:3(<module>)
> 178877 0.059 0.000 0.059 0.000
> PyMCObjects.py:692(_get_observed)
> 1847 0.056 0.000 0.067 0.000 Container.py:229(sort_list)
> 352 0.050 0.000 0.196 0.001 StepMethods.py:983(update_cov)
> 178087 0.049 0.000 0.049 0.000 {method 'click' of
> 'pymc.LazyFunction.Counter' objects}
> 19800 0.046 0.000 0.198 0.000 ram.py:81(tally)
> 1906 0.045 0.000 0.101 0.000
> PyMCObjects.py:707(_get_moral_neighbors)
> 4002 0.037 0.000 0.037 0.000

> distributions.py:2286(uniform_like)4723/4474 0.033 0.000 0.034 0.000

> defmatrix.py:239(__array_finalize__)
> 82694 0.029 0.000 0.029 0.000 {method 'unclick' of
> 'pymc.LazyFunction.Counter' objects}
>

> On Tue, Sep 15, 2009 at 11:55 AM, Whit Armstrong <armstrong.w...@gmail.com>

> wrote:
> > That's fantastic. I'll try the profiler.
>
> > here are a few timings:
>
> > with M.sample(iter=1e3, burn=10, thin=2)
> > warmstrong@research:~/scripts/radon$ time python run_radon_inv_wishart.py
> > real 0m55.371s
> > user 0m55.200s
> > sys 0m0.160s
>
> > with M.sample(iter=20e3, burn=5e3, thin=10)
> > warmstrong@research:~/scripts/radon$ time python run_radon_inv_wishart.py
> > real 18m18.272s
> > user 18m15.780s
> > sys 0m0.400s
>
> > This is on an 8 core 64bit amd at 2.6ghz w/ 64GB ram and no other
> > activity at the moment.
>
> > -Whit
>
> > On Tue, Sep 15, 2009 at 11:15 AM, Anand Patil

> > <anand.prabhakar.pa...@gmail.com> wrote:
> >> On Tue, Sep 15, 2009 at 4:09 PM, Whit Armstrong <armstrong.w...@gmail.com

Anand Patil

no leída,

16 sept 2009, 5:53:5216/9/09

a py...@googlegroups.com

Hi Whit,

PyMC inherits its performance from Python: it lets you do just about anything, but your code won't be particularly fast at first. But, once you have your model working, you can often optimize so that your code is spending the bulk of its time in compiled extensions. At that point it should be performing fine.

The first four lines of your profiler results look pretty juicy:

14531855 38.346 0.000 43.273 0.000 shape_base.py:289(atleast_2d)
172243 20.349 0.000 20.349 0.000 radon_inv_wishart.py:77(y_hat)
174797 11.574 0.000 11.574 0.000 distributions.py:1880(normal_like)
330675 7.604 0.000 7.604 0.000 {numpy.core.multiarray.concatenate}

The second line is a function you wrote, so you can isolate it and tweak it until it's substantially faster. The first and fourth lines might be related to the rbinding deterministic I suggested, as you hypothesized. If you find that that is the case, you could try a custom stochastic:

@pm.stochastic

def B_raw_m(value=..., mu, tau):

sum = 0

for row in value:

sum += pm.mv_normal_like(row, mu, tau)

return sum

or you might try updating all the parents of the existing B_raw_m jointly using AdaptiveMetropolis.

The third line is a compiled extension, like most of PyMC's log-probability functions, so the function itself is about as fast as it's going to get. However, if your model has multiple scalar-valued normals, you can probably get significant speedups by combining them into a single vector-valued variable and/or updating them jointly using AdaptiveMetropolis.

If all else fails, you can use f2py to write your own custom compiled extension. It's really not hard: you write a subroutine in Fortran, then do

f2py -c fortran_file.f -m fortran_module.

Then from Python you can do

import fortran_module

to see how to call the new function.

So you have lots of options open for optimizing the model & algorithm. Sometimes you can get parity with MCMC package X and sometimes you can't, but you can usually get decent performance. :)

HTH,

Anand

Whit Armstrong

no leída,

16 sept 2009, 10:09:2916/9/09

a py...@googlegroups.com

Anand,

Thanks for the feedback.

y_hat is simply the regression function. being new to python, I'm not sure I defined it in an ideal way:

# Model

@pymc.deterministic(plot=False)

def y_hat(B=B):

return (B[index_c,] *X).sum(axis=1)

since B is a J x K matrix (where J is the number of counties), we have to index it w/ index_c (the county index).

I haven't looked at AdaptiveMetropolis yet. I'll have to do some reading up on it.

A big question is whether "shape_base.py:289(atleast_2d)" is coming from "np.vstack(B_raw)." If it is, then that would indicate that rbinding the coefs at each step is a big slowdown.

I haven't looked at the internals of pymc yet, but I'm curious about the shared memory question. In the case of this model B is always "read only." It is only used in the matrix multiplication. If the MvNormal value parameter allowed the memory to be externally managed (i.e. existing in a numpy array). Then rbinding the coefs would be unnecessary. I'm sure you all have been down this road before, so apologies in advance for bringing up this topic.

Here was my original code (which you already correctly pointed out will not work) which assumed that the "value" parameter could be an external array:

B_raw = []
B_raw_m = np.zeros( (J,K) )
for i in range(J):
B_raw.append(pymc.MvNormal('B_raw_%i' % i, mu_raw, Tau_B_raw, value=B_raw_m[i]))

-Whit

Responder a todos

Responder al autor

Reenviar