more gelman radon questions -- scaled inverse wishart

Visto 134 veces
Saltar al primer mensaje no leído

Whit Armstrong

no leída,
14 sept 2009, 15:19:0714/9/09
a py...@googlegroups.com
Moving on to the next example in the arm book, I'm baffled about how
to define a multivariate normal distribution.

In bugs it's: B.raw[j,1:K] ~ dmnorm(mu.raw[], Tau.B.raw[,])

but I can't seem to find an equivalent construct in pymc.

Here is the bugs file:

model {
for (i in 1:n){
y[i] ~ dnorm(y.hat[i], tau.y)
y.hat[i] <- inprod(B[county[i],],X[i,])
e.y[i] <- y[i] - y.hat[i]
}
tau.y <- pow(sigma.y, -2)
sigma.y ~ dunif(0, 100)

for (j in 1:J) {
for(k in 1:K) {
B[j,k] <- xi[k] * B.raw[j,k]
}
B.raw[j,1:K] ~ dmnorm(mu.raw[], Tau.B.raw[,])
}
for(k in 1:K) {
mu[k] <- xi[k] * mu.raw[k]
mu.raw[k] ~ dnorm(0,.0001)
xi[k] ~ dunif(0,100)
}
Tau.B.raw[1:K,1:K] ~ dwish(W[,],df)
df <- K + 1
Sigma.B.raw[1:K,1:K] <- inverse(Tau.B.raw[,])
for(k in 1:K) {
for(k.prime in 1:K) {
rho.B[k,k.prime] <- Sigma.B.raw[k,k.prime] /
sqrt(Sigma.B.raw[k,k] * Sigma.B.raw[k.prime,k.prime])
}
sigma.B[k] <- abs(xi[k]) * sqrt(Sigma.B.raw[k,k])
}
}

and what I have so far:

# Priors
sigma_y = pymc.Uniform('sigma_y', lower=0, upper=100)
tau_y = pymc.Lambda('tau_y', lambda s=sigma_y: s**-2)

xi = pymc.Uniform('xi', lower=0, upper=100, value=np.zeros(k))
mu_raw = pymc.Normal('mu_raw', mu=0., tau=0.0001, value=np.zeros(k))
Tau_B_raw = pymc.Wishart('Tau_B_raw', df, Tau=np.diag(np.ones(k)))

## FIXME:
B_raw = pymc.Normal('B_raw',mu_raw,Tau_B_raw)

@pymc.deterministic(plot=False)
def Sigma_B_raw(Tau_B_raw=Tau_B_raw):
inverse(Tau_B_raw)

@pymc.deterministic(plot=False)
def rho_B(Sigma_B_raw=Sigma_B_raw):
Sigma_B_raw / sqrt(diag(Sigma_B_raw) * Sigma_B_raw)

@pymc.deterministic(plot=False)
def Sigma_B(xi=xi,Sigma_B_raw=Sigma_B_raw):
abs(xi) * sqrt(diag(Sigma_B_raw))

@pymc.deterministic(plot=False)
def B(xi=xi, B_raw=B_raw):
dot(xi, B_raw)

@pymc.deterministic(plot=False)
def mu(xi=xi, mu_raw=mu_raw):
xi * mu_raw


Thanks,
Whit

Whit Armstrong

no leída,
15 sept 2009, 10:40:2415/9/09
a py...@googlegroups.com
I found the MvNormal function, but I still can't make this example work in pymc.

I'm struggling with the definition of B_raw:

B_raw = []
B_raw_m = np.zeros( (J,K) )
for i in range(J):
B_raw.append(pymc.MvNormal('B_raw_%i' % i, mu_raw, Tau_B_raw,
value=B_raw_m[i]))

I need to use the coefficients as a contiguous array. Hence, I
initialize B_raw_m as a numpy array and then pass each row of the
array as the value argument to the MvNormal function.

Does MvNormal make a deep copy of the numpy array? Or can it share
memory with the B_raw_m object? I need the values of B_raw_m to be
updated whenever the B_raw_i's are updated.

Does this issue have anything to do with cache_depth?

Thanks,
Whit

Anand Patil

no leída,
15 sept 2009, 10:59:1315/9/09
a py...@googlegroups.com
Hi Whit,

On Tue, Sep 15, 2009 at 3:40 PM, Whit Armstrong <armstro...@gmail.com> wrote: 
Does MvNormal make a deep copy of the numpy array?  Or can it share
memory with the B_raw_m object?  I need the values of B_raw_m to be
updated whenever the B_raw_i's are updated.

The right tool for this job is a deterministic:

@pm.deterministic
def B_raw_m(B_raw = B_raw):
    return np.vstack(B_raw)

There's no facility for sharing memory like you want. PyMC needs to count on variables' values being immutable, meaning they won't be changed behind its back. If it goes to reject a jump, it needs to be sure the previous value is still intact.

Does this issue have anything to do with cache_depth?

Caching does make it even more important for values to be immutable, but if it weren't for the jump-rejecting requirement we might have found a workaround.

HTH 
Anand

Whit Armstrong

no leída,
15 sept 2009, 11:09:4315/9/09
a py...@googlegroups.com
Thanks, Anand.

My first version of the model used deterministic to do the rbind at
each iteration. but the model seems to run extremely slowly (i.e.
much slower than then bugs model).

I will try to produce some real numbers to look at.

Thanks,
Whit

Anand Patil

no leída,
15 sept 2009, 11:15:4015/9/09
a py...@googlegroups.com
On Tue, Sep 15, 2009 at 4:09 PM, Whit Armstrong <armstro...@gmail.com> wrote:

Thanks, Anand.

My first version of the model used deterministic to do the rbind at
each iteration.  but the model seems to run extremely slowly (i.e.
much slower than then bugs model).

If you haven't already discovered it, IPython's profiler is handy for investigating these things:

In  [1]: run -p mcmc.py

will give you a list of the time spent in all your functions, including the internal function of B_raw_m.

Anand

Whit Armstrong

no leída,
15 sept 2009, 11:55:3815/9/09
a py...@googlegroups.com
That's fantastic. I'll try the profiler.

here are a few timings:

with M.sample(iter=1e3, burn=10, thin=2)
warmstrong@research:~/scripts/radon$ time python run_radon_inv_wishart.py
real 0m55.371s
user 0m55.200s
sys 0m0.160s

with M.sample(iter=20e3, burn=5e3, thin=10)
warmstrong@research:~/scripts/radon$ time python run_radon_inv_wishart.py
real 18m18.272s
user 18m15.780s
sys 0m0.400s

This is on an 8 core 64bit amd at 2.6ghz w/ 64GB ram and no other
activity at the moment.

-Whit

Whit Armstrong

no leída,
15 sept 2009, 12:22:2615/9/09
a py...@googlegroups.com
Here are the first 20 or so lines of the profiler output.  Nothing really jumps out at me, but I'm not a python guru.

Any feedback appreciated.  I'll post this version of the radon model with the other ones on github as well as the data.  That way anyone who wants to can tinker w/ this example.

-Whit

        77164946 function calls (75081733 primitive calls) in 136.673 CPU seconds

  Ordered by: internal time

  ncalls  tottime  percall  cumtime  percall filename:lineno(function)
14531855   38.346    0.000   43.273    0.000 shape_base.py:289(atleast_2d)
  172243   20.349    0.000   20.349    0.000 radon_inv_wishart.py:77(y_hat)
  174797   11.574    0.000   11.574    0.000 distributions.py:1880(normal_like)
  330675    7.604    0.000    7.604    0.000 {numpy.core.multiarray.concatenate}
17121012    7.337    0.000    7.337    0.000 PyMCObjects.py:576(get_value)
  171080    6.607    0.000   49.890    0.000 {map}
  170963    6.049    0.000   12.122    0.000 {method 'run' of 'pymc.Container_values.LCValue' objects}
  176000    3.533    0.000    6.270    0.000 StepMethods.py:1104(propose)
2073750/1383355    3.506    0.000  119.696    0.000 {method 'get' of 'pymc.LazyFunction.LazyFunction' objects}
  508725    3.107    0.000    3.425    0.000 distributions.py:1682(mv_normal_like)
14905132    2.847    0.000    2.847    0.000 {method 'append' of 'list' objects}
 1381725    2.556    0.000  122.130    0.000 PyMCObjects.py:625(get_logp)
  176000    2.310    0.000  134.757    0.001 StepMethods.py:1125(step)
1206606/690340    2.299    0.000  100.972    0.000 {method 'run' of 'pymc.Container_values.DCValue' objects}
15070015/15069829    2.266    0.000    2.266    0.000 {len}
  172243    2.104    0.000    2.104    0.000 radon_inv_wishart.py:68(B)
692025/347218    1.909    0.000   98.705    0.000 PyMCObjects.py:361(get_value)
  356000    1.495    0.000  123.624    0.000 Node.py:14(logp_of_set)
  170963    1.378    0.000   58.542    0.000 shape_base.py:402(vstack)
1206606/690340    1.055    0.000  101.601    0.000 Container.py:485(get_value)
  178087    0.887    0.000    1.411    0.000 PyMCObjects.py:583(set_value)
  516729    0.746    0.000    4.362    0.000 distributions.py:2488(wrapper)
  159710    0.640    0.000    1.868    0.000 StepMethods.py:1206(internal_tally)
  946962    0.433    0.000    0.433    0.000 {isinstance}
  178616    0.418    0.000    0.418    0.000 numeric.py:232(asanyarray)
  162999    0.416    0.000    0.416    0.000 numeric.py:180(asarray)
  356000    0.401    0.000  124.025    0.000 StepMethods.py:270(_get_logp_plus_loglike)
  176867    0.372    0.000    0.397    0.000 {numpy.core._dotblas.dot}
  176000    0.348    0.000    0.348    0.000 fromnumeric.py:101(reshape)
  685002    0.314    0.000    0.314    0.000 fromnumeric.py:1006(shape)
       1    0.268    0.268  135.582  135.582 MCMC.py:187(_loop)
  159798    0.240    0.000    0.749    0.000 fromnumeric.py:886(ravel)
  172796    0.203    0.000   11.758    0.000 radon_inv_wishart.py:82(y_i)
  170963    0.199    0.000   58.741    0.000 radon_inv_wishart.py:52(B_raw_m)
  170963    0.178    0.000   12.300    0.000 Container.py:408(get_value)
  161295    0.172    0.000    0.172    0.000 {method 'ravel' of 'numpy.ndarray' objects}
  176177    0.151    0.000    0.298    0.000 function_base.py:189(iterable)
  176181    0.147    0.000    0.147    0.000 {iter}
   80701    0.130    0.000    0.250    0.000 StepMethods.py:1201(reject)
  517546    0.119    0.000    0.119    0.000 {method 'pop' of 'dict' objects}
  176778    0.119    0.000    0.119    0.000 {method 'random_sample' of 'mtrand.RandomState' objects}
5760/193    0.105    0.000    0.517    0.003 Container.py:145(file_items)
   82694    0.094    0.000    0.123    0.000 PyMCObjects.py:616(revert)
2052/1112    0.075    0.000    0.367    0.000 Container.py:451(__init__)
       1    0.061    0.061    0.069    0.069 backend_tkagg.py:3(<module>)
  178877    0.059    0.000    0.059    0.000 PyMCObjects.py:692(_get_observed)
    1847    0.056    0.000    0.067    0.000 Container.py:229(sort_list)
     352    0.050    0.000    0.196    0.001 StepMethods.py:983(update_cov)
  178087    0.049    0.000    0.049    0.000 {method 'click' of 'pymc.LazyFunction.Counter' objects}
   19800    0.046    0.000    0.198    0.000 ram.py:81(tally)
    1906    0.045    0.000    0.101    0.000 PyMCObjects.py:707(_get_moral_neighbors)
    4002    0.037    0.000    0.037    0.000 distributions.py:2286(uniform_like)
4723/4474    0.033    0.000    0.034    0.000 defmatrix.py:239(__array_finalize__)
   82694    0.029    0.000    0.029    0.000 {method 'unclick' of 'pymc.LazyFunction.Counter' objects}

Whit Armstrong

no leída,
15 sept 2009, 14:43:0815/9/09
a py...@googlegroups.com
updated timings and package:

pymc 10k iterations, 3k burnin, thin=5
real    9m12.073s
user    9m11.730s
sys     0m0.160s

R/jags 3 chains, 10k iterations, 3k burnin, thin=5
   user  system elapsed 
3.32050 0.05250 3.37625 

which is about 3x faster.

I'm not sure what the impact is of nchains=3 for R. If it is really running 10k iterations for each of the 3 chains, then in reality R is 9x faster (since pymc is only running one chain).

I would be very happy to find out that I'm wrong about this test, but for now it seems like pymc is drastically slower than R/bugs/jags for this type of model.

test scripts with data available here:

to replicate this test:
time python run_radon_inv_wishart.py
R < radon.wishart.R

-Whit


---------- Forwarded message ----------
From: Whit Armstrong <armstro...@gmail.com>
Date: Tue, Sep 15, 2009 at 12:22 PM
Subject: Re: [pymc] Re: Fwd: more gelman radon questions -- scaled inverse wishart
To: py...@googlegroups.com


Chris Fonnesbeck

no leída,
16 sept 2009, 4:14:0416/9/09
a PyMC
Hi Whit,

I'm travelling at the moment, so I am not able to inspect your model,
but the performance of pymc can depend strongly on how you have
specified the model. The best structure in R may not be the best
structure in pymc. Just a hunch. If nobody has the answer before I
return, I will take a look at the first opportunity.

cf
>   176000    3.533    0.000    6.270    0.000 StepMethods.py:1104(propose)2073750/1383355   3.506    0.000  119.696    0.000 {method 'get' of
> 'pymc.LazyFunction.LazyFunction' objects}
>   508725    3.107    0.000    3.425    0.000
> distributions.py:1682(mv_normal_like)
> 14905132    2.847    0.000    2.847    0.000 {method 'append' of 'list'
> objects}
>  1381725   2.556    0.000  122.130    0.000 PyMCObjects.py:625(get_logp)
>   176000    2.310    0.000  134.757    0.001 StepMethods.py:1125(step)1206606/690340   2.299    0.000  100.972    0.000 {method 'run' of
> 'pymc.Container_values.DCValue' objects}
> 15070015/15069829    2.266    0.000    2.266    0.000 {len}
>   172243    2.104    0.000    2.104    0.000 radon_inv_wishart.py:68(B)692025/347218   1.909    0.000   98.705    0.000
> PyMCObjects.py:361(get_value)
>   356000    1.495    0.000  123.624    0.000 Node.py:14(logp_of_set)
>   170963    1.378    0.000   58.542    0.000 shape_base.py:402(vstack)1206606/690340   1.055    0.000  101.601    0.000
> Container.py:485(get_value)
>   178087    0.887    0.000    1.411    0.000 PyMCObjects.py:583(set_value)
>   516729    0.746    0.000    4.362    0.000 distributions.py:2488(wrapper)
>   159710    0.640    0.000    1.868    0.000
> StepMethods.py:1206(internal_tally)
>   946962    0.433    0.000    0.433    0.000 {isinstance}
>   178616    0.418    0.000    0.418    0.000 numeric.py:232(asanyarray)
>   162999    0.416    0.000    0.416    0.000 numeric.py:180(asarray)
>   356000    0.401    0.000  124.025    0.000
> StepMethods.py:270(_get_logp_plus_loglike)
>   176867    0.372    0.000    0.397    0.000 {numpy.core._dotblas.dot}
>   176000    0.348    0.000    0.348    0.000 fromnumeric.py:101(reshape)
>   685002    0.314    0.000    0.314    0.000 fromnumeric.py:1006(shape)
>        1    0.268    0.268  135.582  135.582MCMC.py:187(_loop)
>   159798    0.240    0.000    0.749    0.000 fromnumeric.py:886(ravel)
>   172796    0.203    0.000   11.758    0.000 radon_inv_wishart.py:82(y_i)
>   170963    0.199    0.000   58.741    0.000
> radon_inv_wishart.py:52(B_raw_m)
>   170963    0.178    0.000   12.300    0.000 Container.py:408(get_value)
>   161295    0.172    0.000    0.172    0.000 {method 'ravel' of
> 'numpy.ndarray' objects}
>   176177    0.151    0.000    0.298    0.000 function_base.py:189(iterable)
>   176181    0.147    0.000    0.147    0.000 {iter}
>    80701    0.130    0.000    0.250    0.000 StepMethods.py:1201(reject)
>   517546    0.119    0.000    0.119    0.000 {method 'pop' of 'dict'
> objects}
>   176778    0.119    0.000    0.119    0.000 {method 'random_sample' of
> 'mtrand.RandomState' objects}
> 5760/193    0.105    0.000    0.517    0.003 Container.py:145(file_items)
>    82694    0.094    0.000    0.123    0.000 PyMCObjects.py:616(revert)2052/1112   0.075    0.000    0.367    0.000 Container.py:451(__init__)
>        1    0.061    0.061    0.069    0.069 backend_tkagg.py:3(<module>)
>   178877    0.059    0.000    0.059    0.000
> PyMCObjects.py:692(_get_observed)
>     1847    0.056    0.000    0.067    0.000 Container.py:229(sort_list)
>      352    0.050    0.000    0.196    0.001 StepMethods.py:983(update_cov)
>   178087    0.049    0.000    0.049    0.000 {method 'click' of
> 'pymc.LazyFunction.Counter' objects}
>    19800    0.046    0.000    0.198    0.000 ram.py:81(tally)
>     1906    0.045    0.000    0.101    0.000
> PyMCObjects.py:707(_get_moral_neighbors)
>     4002    0.037    0.000    0.037    0.000
> distributions.py:2286(uniform_like)4723/4474   0.033    0.000    0.034    0.000
> defmatrix.py:239(__array_finalize__)
>    82694    0.029    0.000    0.029    0.000 {method 'unclick' of
> 'pymc.LazyFunction.Counter' objects}
>
> On Tue, Sep 15, 2009 at 11:55 AM, Whit Armstrong <armstrong.w...@gmail.com>
> wrote:
> > That's fantastic.  I'll try the profiler.
>
> > here are a few timings:
>
> > with M.sample(iter=1e3, burn=10, thin=2)
> > warmstrong@research:~/scripts/radon$ time python run_radon_inv_wishart.py
> > real    0m55.371s
> > user    0m55.200s
> > sys     0m0.160s
>
> > with M.sample(iter=20e3, burn=5e3, thin=10)
> > warmstrong@research:~/scripts/radon$ time python run_radon_inv_wishart.py
> > real    18m18.272s
> > user    18m15.780s
> > sys     0m0.400s
>
> > This is on an 8 core 64bit amd at 2.6ghz w/ 64GB ram and no other
> > activity at the moment.
>
> > -Whit
>
> > On Tue, Sep 15, 2009 at 11:15 AM, Anand Patil
> > <anand.prabhakar.pa...@gmail.com> wrote:
> >> On Tue, Sep 15, 2009 at 4:09 PM, Whit Armstrong <armstrong.w...@gmail.com

Anand Patil

no leída,
16 sept 2009, 5:53:5216/9/09
a py...@googlegroups.com
Hi Whit,


PyMC inherits its performance from Python: it lets you do just about anything, but your code won't be particularly fast at first. But, once you have your model working, you can often optimize so that your code is spending the bulk of its time in compiled extensions. At that point it should be performing fine.


The first four lines of your profiler results look pretty juicy:

14531855   38.346    0.000   43.273    0.000 shape_base.py:289(atleast_2d)
  172243   20.349    0.000   20.349    0.000 radon_inv_wishart.py:77(y_hat)
  174797   11.574    0.000   11.574    0.000 distributions.py:1880(normal_like)
  330675    7.604    0.000    7.604    0.000 {numpy.core.multiarray.concatenate}

The second line is a function you wrote, so you can isolate it and tweak it until it's substantially faster. The first and fourth lines might be related to the rbinding deterministic I suggested, as you hypothesized. If you find that that is the case, you could try a custom stochastic:

@pm.stochastic
def B_raw_m(value=..., mu, tau):
    sum = 0
    for row in value:
        sum += pm.mv_normal_like(row, mu, tau)
    return sum

or you might try updating all the parents of the existing B_raw_m jointly using AdaptiveMetropolis.

The third line is a compiled extension, like most of PyMC's log-probability functions, so the function itself is about as fast as it's going to get. However, if your model has multiple scalar-valued normals, you can probably get significant speedups by combining them into a single vector-valued variable and/or updating them jointly using AdaptiveMetropolis.


If all else fails, you can use f2py to write your own custom compiled extension. It's really not hard: you write a subroutine in Fortran, then do 

f2py -c fortran_file.f -m fortran_module.

Then from Python you can do 

import fortran_module 

to see how to call the new function.


So you have lots of options open for optimizing the model & algorithm. Sometimes you can get parity with MCMC package X and sometimes you can't, but you can usually get decent performance. :)


HTH,
Anand

Whit Armstrong

no leída,
16 sept 2009, 10:09:2916/9/09
a py...@googlegroups.com
Anand,

Thanks for the feedback.

y_hat is simply the regression function.  being new to python, I'm not sure I defined it in an ideal way:

# Model
@pymc.deterministic(plot=False)
def y_hat(B=B):
    return (B[index_c,] *X).sum(axis=1)

since B is a J x K matrix (where J is the number of counties), we have to index it w/ index_c (the county index).

I haven't looked at AdaptiveMetropolis yet. I'll have to do some reading up on it.

A  big question is whether "shape_base.py:289(atleast_2d)" is coming from "np.vstack(B_raw)."  If it is, then that would indicate that rbinding the coefs at each step is a big slowdown.

I haven't looked at the internals of pymc yet, but I'm curious about the shared memory question.  In the case of this model B  is always "read only."  It is only used in the matrix multiplication.  If the MvNormal value parameter allowed the memory to be externally managed (i.e. existing in a numpy array). Then rbinding the coefs would be unnecessary.  I'm sure you all have been down this road before, so apologies in advance for bringing up this topic.

Here was my original code (which you already correctly pointed out will not work) which assumed that the "value" parameter could be an external array:

B_raw = []
B_raw_m = np.zeros( (J,K) )
for i in range(J):
   B_raw.append(pymc.MvNormal('B_raw_%i' % i, mu_raw, Tau_B_raw, value=B_raw_m[i]))

-Whit
Responder a todos
Responder al autor
Reenviar
0 mensajes nuevos