Straight-line fitting


Todd

Jul 3, 2009, 6:34:36 PM
to PyMC
Hi,

I've been working on implementing the straight-line fitting model
described by Brandon Kelly (http://xxx.lanl.gov/abs/0705.2774) in
PyMC. The model is designed to fit a straight line in the case where
there are measurement errors in both the independent and dependent
variables and intrinsic scatter in the underlying relationship. The
underlying relationship is eta_i = alpha + beta*xi_i + epsilon_i, where
i indexes the data points and epsilon_i has a Normal distribution with
mean 0 and precision tau. The observed data are xdata_i = xi_i +
epsilon_x,i and ydata_i = eta_i + epsilon_y,i, where epsilon_x,i and
epsilon_y,i have Normal distributions with mean 0 and *known*
precisions tau_x and tau_y. (Kelly's model also includes correlation
between the measurement errors, but I'm going to ignore that for the
sake of simplicity.) The priors for alpha, beta, and tau are Normal,
Normal, and Gamma, respectively. The prior for the distribution of xi
is also Normal. (Interestingly, the naive flat prior on the
distribution of xi leads to results identical to ordinary least
squares, i.e., an underestimate of the slope and an overestimate of
the intercept.)

My problem is that often (but not absolutely always) the MCMC sampling
favors very high values of tau (i.e., no intrinsic scatter). I've run
the same model with JAGS, and I never see the same issue. Part of the
difference may be that JAGS uses a different sampler for tau
(ConjugateGamma for JAGS vs Metropolis for PyMC), but otherwise I'm
not sure what's going on or how to fix it. The only other clue I've
discovered is that the acceptance ratio for the posterior samples of
tau becomes quite large (~0.8). So, I was wondering if any of you
folks have some suggestions. I've appended a simple Python script to
help others reproduce my results. The chain often needs to be run for
several hundred thousand iterations to see tau wander off to very high
values.
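
For what it's worth, here is a minimal sketch of how I check that
acceptance ratio after a run of the script below (it assumes PyMC 2's
Metropolis step methods keep accepted/rejected counters and are
reachable through the sampler's step_method_dict):

# Look up the step method assigned to tau and compute its empirical
# acceptance fraction from the accepted/rejected counters.
sm = M.step_method_dict[tau][0]
print 'tau acceptance fraction:', float(sm.accepted)/(sm.accepted + sm.rejected)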

Thanks,

Todd

++++++++

import numpy as np
import pymc

# "True" values of the parameters
n_data = 100
intercept = 15.
slope = 1.5
sigma = 8.
sigma_x = 8.
sigma_y = 8.

# Generate simulated data
xi_true = 25. + np.random.randn(n_data)*15.
eta_true = intercept + slope*xi_true + np.random.randn(n_data)*sigma

x_data = xi_true + np.random.randn(n_data)*sigma_x
y_data = eta_true + np.random.randn(n_data)*sigma_y

# Initial values
beta0, alpha0 = np.polyfit(x_data, y_data, 1)
var0 = ((alpha0 + beta0*x_data - y_data)**2.).sum()/n_data
tau0 = 1./var0

tau_x = 1./sigma_x**2.
tau_y = 1./sigma_y**2.

# Intercept, slope, and intrinsic scatter
alpha = pymc.Normal('alpha', 0., 1.e-6, value=alpha0)
beta = pymc.Normal('beta', 0., 1.e-6, value=beta0)
tau = pymc.Gamma('tau', 0.001, 0.001, value=tau0)

# Model for xi
mu_xi = pymc.Normal('mu_xi', 0., 1.e-6, value=x_data.mean())
tau_xi = pymc.Gamma('tau_xi', 0.001, 0.001, value=1./x_data.var())
xi = pymc.Normal('xi', mu_xi, tau_xi, value=x_data)

@pymc.deterministic
def mu(xi=xi, alpha=alpha, beta=beta):
    return alpha + beta*xi

eta = pymc.Normal('eta', mu, tau)
x = pymc.Normal('x', xi, tau_x, value=x_data, observed=True)
y = pymc.Normal('y', eta, tau_y, value=y_data, observed=True)

M = pymc.MCMC([alpha, beta, tau, mu, mu_xi, tau_xi, xi, eta, x, y])
M.sample(500000, 10000, 10)
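
# Not part of the model itself: a quick sketch of how I look at the tau
# trace afterwards, using PyMC's trace access and the Matplot module.
tau_trace = M.trace('tau')[:]
print tau_trace.mean(), tau_trace.min(), tau_trace.max()
pymc.Matplot.plot(M.tau)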

Chris Fonnesbeck

Jul 3, 2009, 8:31:35 PM
to PyMC
On Jul 4, 10:34 am, Todd <todd_sm...@mac.com> wrote:

Hi Todd,

I've just run your model on the most recent build of PyMC from the SVN/
Git development tree, and do not see the behaviour that you describe.
Tau settles in around 0.02 after around 100K iterations (discarding
the first 90K as burn-in):

In [6]: M.tau.stats()
Out[6]:
{'95% HPD interval': array([ 0.00664827,  0.04215601]),
 'mc error': 0.00097687919725893656,
 'mean': 0.021006442661547119,
 'n': 10000,
 'quantiles': {2.5: 0.0082983481628378288,
               25: 0.012933593533960295,
               50: 0.018103168505776854,
               75: 0.026848016463369595,
               97.5: 0.047884155210411714},
 'standard deviation': 0.010621063097699369}


What version of PyMC are you using -- the 2.0 release? And on what
platform?

cf

Todd

Jul 3, 2009, 8:43:41 PM
to PyMC


On Jul 3, 5:31 pm, Chris Fonnesbeck <fonnesb...@maths.otago.ac.nz>
wrote:
Hi Chris,

I'm using PyMC 2.0rc2 on a Macintosh.

Have you tried several runs with > 500,000 iterations? As I mentioned
in my first message, I usually but not always see the behavior, and
when I do, it happens after 100,000 or more iterations. I'll give the SVN
version a try.

I forgot to say in my first message how impressed I am with the PyMC
package. You and your co-developers have done a great job! I also
appreciate the comprehensive documentation.

Todd

Chris Fonnesbeck

Jul 4, 2009, 7:58:42 PM
to PyMC
On Jul 4, 12:43 pm, Todd <todd_sm...@mac.com> wrote:

> Have you tried several runs with > 500,000 iterations?  As I mentioned
> in my first message, I usually but not always see the behavior, and
> when I do, it after 100,000 or more iterations.  I'll give the SVN
> version a try.

Right, I got the same thing when I ran it out past 400K iterations. My
guess is that some of the stochastics are correlated. So, one solution
is to throw the entire set of stochastics into an adaptive metropolis
sampler, which takes into account the covariance among them. For
example:

theta = [alpha, beta, tau, mu, mu_xi, tau_xi, xi, eta, x, y]
M = pymc.MCMC(theta)
# Force the use of AdaptiveMetropolis step method
M.use_step_method(pymc.AdaptiveMetropolis, theta)
M.isample(500000, 400000, verbose=2, thin=10)

This appears to have done the trick for tau:

http://qkpic.com/a0e07

It may be worth diagnosing exactly which parameters are correlated, of
course, rather than forcing everything into the same step method.
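
For example, a quick way to eyeball those correlations from the stored
traces (plain NumPy on the traces; node names as in your script):

import numpy as np

# Stack the traces of the scalar parameters and inspect their posterior
# correlation matrix; large off-diagonal entries suggest which parameters
# should share an AdaptiveMetropolis block.
samples = np.vstack([M.trace(name)[:] for name in ('alpha', 'beta', 'tau')])
print np.corrcoef(samples)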

cf

Todd

Jul 6, 2009, 2:45:04 PM
to PyMC


On Jul 4, 4:58 pm, Chris Fonnesbeck <fonnesb...@maths.otago.ac.nz>
wrote:
Unfortunately, while using the AdaptiveMetropolis for all parameters
keeps tau from becoming unreasonably large, it also yields incorrect
values for the alpha and beta parameters. In fact, the inferred values
of alpha and beta are very similar to the incorrect ordinary
least squares values. If I use AdaptiveMetropolis for just alpha,
beta, and tau, then the alpha and beta values are correct, but tau
will often wander off to very high values. Here's a plot summarizing
the results of running 50 simulations (250,000 iterations with 100,000
iteration burn-in and thinning by 10), 25 with AdaptiveMetropolis for
all variables and 25 with AdaptiveMetropolis for only alpha, beta, and
tau:

http://files.getdropbox.com/u/647515/PyMC/Adaptive.png

The gray and black circles show the OLS estimates of alpha and beta,
while the pink and red circles show the medians of the PyMC posterior
distributions for alpha and beta using AdaptiveMetropolis for all
variables and AdaptiveMetropolis for just alpha, beta, and tau.
The blue lines mark the true values of the parameters in the
simulations.
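
For reference, the "AdaptiveMetropolis for only alpha, beta, and tau"
runs were set up roughly like this (the remaining variables keep the
step methods PyMC assigns by default):

theta = [alpha, beta, tau, mu, mu_xi, tau_xi, xi, eta, x, y]
M = pymc.MCMC(theta)
# Adapt jointly over just the intercept, slope, and intrinsic-scatter
# precision; everything else keeps its default step method.
M.use_step_method(pymc.AdaptiveMetropolis, [alpha, beta, tau])
M.sample(250000, 100000, 10)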

I've also made a plot showing the estimates of tau, along with 95%
credible intervals, for the two AdaptiveMetropolis schemes:

http://files.getdropbox.com/u/647515/PyMC/AdaptiveTau.png

The values of tau when using AdaptiveMetropolis for all variables
(shown in light blue) are all below the true value, which is shown in
red. AdaptiveMetropolis for only alpha, beta, and tau yields
reasonable values of tau some of the time (dark blue), but more often
gives very large values.

Finally, I've run 200 simulations with the standard Metropolis sampler
(again 250,000 iterations with 100,000 iteration burn-in and thinning
by 10). The values of tau have a bimodal distribution:

http://files.getdropbox.com/u/647515/PyMC/TauDistribution.png

One of the modes is in the right spot (shown by the red vertical
line), but the majority of the simulations give values of tau that are
too large.

I think that my next step will be to verify that the correct value of
tau emerges if there are *no* measurement errors in the independent
variable.

Todd

Todd

Jul 6, 2009, 2:46:51 PM
to PyMC
Oops. I forgot to mention that all the results I described in my last
message were obtained using the SVN version of PyMC.

Todd

David Huard

Jul 6, 2009, 4:20:58 PM
to py...@googlegroups.com
Todd,

I suggest you start by assuming you know the parameters of the error distributions and estimate the slope, intercept and true xi. You'll have a better idea of what's going on.

Also, don't forget to add the prior for the true xi. This can make all the difference.

I'll try to look into it further tonight.

Cheers,

David

Todd

Jul 8, 2009, 2:41:20 PM
to PyMC


On Jul 6, 1:20 pm, David Huard <david.hu...@gmail.com> wrote:

I followed David's suggestion to explore the model with x, y, and
intrinsic errors known. As I mentioned in my first post, I use a
Gaussian prior for xi. I ran 25 simulations, and the results for the
slope ("beta") and intercept ("alpha") were just fine.

I've also looked into a more-or-less standard regression model with no
errors in x, known measurement errors in y, and unknown intrinsic
scatter (sigma = 8). I ran 100 simulations (500,000 iterations,
100,000 burn-in, thin=10), and I noticed that 11 of the 100
simulations yielded estimates of the intrinsic scatter that were too
small:

http://files.getdropbox.com/u/647515/PyMC/sigma1.png

I then ran 50 simulations using the x and y data from one of the
previous 100 simulations that gave an unreasonably low value for
sigma. 43 of the 50 simulations got the value of sigma correct, but 7
did not:

http://files.getdropbox.com/u/647515/PyMC/sigma2.png

Given that I'm running such long chains (500,000 iterations) with long
burn-ins (100,000 iterations), I'm surprised to see a few simulations
with sigma so far off. In case someone would like to try to reproduce
my results, here's the x and y data I used:

http://files.getdropbox.com/u/647515/PyMC/XY_data.txt

The model I used is appended below.

Thanks,

Todd

import numpy as np
import pymc

# x_data and y_data come from the XY_data.txt file linked above
# (assumed here to be two whitespace-separated columns)
x_data, y_data = np.loadtxt('XY_data.txt', unpack=True)
n_data = len(x_data)

sigma_y = 8.

beta0, alpha0 = np.polyfit(x_data, y_data, 1)
var0 = ((alpha0 + beta0*x_data - y_data)**2.).sum()/n_data

alpha = pymc.Normal('alpha', alpha0, 1.e-4, value=alpha0)     # Intercept
beta = pymc.Normal('beta', beta0, 1.e-4, value=beta0)         # Slope
sigma = pymc.Uniform('sigma', 0., 100., value=np.sqrt(var0))  # Scatter

@pymc.deterministic
def mu(x_data=x_data, alpha=alpha, beta=beta):
    return alpha + beta*x_data

@pymc.deterministic
def tau(sigma=sigma):
    return 1./sigma**2.

eta = pymc.Normal('eta', mu, tau)
y = pymc.Normal('y', eta, 1./sigma_y**2., value=y_data, observed=True)

M = pymc.MCMC([alpha, beta, sigma, eta, y])
M.sample(500000, 100000, 10)


Anand Patil

Jul 8, 2009, 8:11:21 PM
to py...@googlegroups.com
Todd,


It _looks_ like there's an absorbing state. If eta happens to line up
perfectly with alpha + beta * xi, it's preferable for tau to be large.
And if tau is large, it's preferable for eta to line up with alpha +
beta * xi.


You can treat the symptoms by integrating eta out of the model. Remove
eta, and replace y with

y = pymc.Normal('y', mu, 1/(1/tau + 1/tau_y), value=y_data, observed=True)

and tau doesn't lose the plot anymore. Mixing is not great, but you
should be able to fix that by experimenting a bit with
AdaptiveMetropolis or with an informative prior on tau.
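
Written out with an explicit deterministic node, the same change looks
like this (tau_total is just a name I'm introducing; marginalizing eta
out of y ~ N(eta, tau_y) with eta ~ N(mu, tau) gives
y ~ N(mu, 1/(1/tau + 1/tau_y)), with all Normals parameterized by
precision):

# Combined precision of the intrinsic scatter plus the y measurement
# error, obtained by integrating the latent eta out of the model.
@pymc.deterministic
def tau_total(tau=tau):
    return 1./(1./tau + 1./tau_y)

y = pymc.Normal('y', mu, tau_total, value=y_data, observed=True)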


This doesn't address the real problem, which is why the absorbing
state seems to be actually absorbing rather than just a slow-mixing
zone from which the chain eventually escapes. It would be nice to
understand that.


Anand

Todd

Jul 12, 2009, 4:23:14 PM
to PyMC


On Jul 8, 5:11 pm, Anand Patil <anand.prabhakar.pa...@gmail.com>
wrote:

Hi Anand,

Thanks for your analysis. Your suggestion to treat the symptoms by
integrating eta out of the model works well.

Todd

Anand Patil

Jul 13, 2009, 4:55:08 AM
to py...@googlegroups.com
On Sun, Jul 12, 2009 at 9:23 PM, Todd<todd_...@mac.com> wrote:


Hi Todd,

Glad you're sorted out. We still need to understand why this is
happening, however. I put a slightly slimmed-down test case on the
issues page here: http://code.google.com/p/pymc/issues/detail?id=269 .
Any insight is welcome.

Anand
