On Mon, Jul 13, 2009 at 6:30 AM, Todd<todd_...@mac.com> wrote:
>
> This simple model works very well for estimating the observed fluxes,
> including the uncertainty in the background. I'd like to feed the
> posterior distributions for f_obj for the two bands for all the
> quasars back into PyMC to infer the parameters of the lognormal
> distribution, but I'm not sure how best to do it. I know that I can
> construct the posterior distribution for the flux ratio for each
> quasar by either directly using the posterior samples of the two
> fluxes or by marginalizing the joint distribution of the fluxes:
>
> p(r|...) = \int f2 p(f2*r, f2)df2
The reason this is confusing is the requirement that the ratio have a
lognormal posterior is inconsistent with your model. The 'right' thing
to do would be to make a histogram of the posterior of the ratio using
the MCMC samples, and present that in place of the lognormal
distribution.
You can still present a lognormal approximation to the posterior, but
it's just going to be an approximation, so you may as well keep it
simple. You could estimate the parameters by simply taking the mean
and variance of samples of log(r), and then compute the
Kullback-Liebler divergence or Kolmogorov-Smirnov test statistic or
something to see how good of an approximation it is.
If you really want to force the posterior to be lognormal 'inside' the
model, MCMC is probably not the right tool for the job. Consider a
variational approximation.
Finally, I don't know how these things work, but it's worth having a
careful think about whether the f's for the two bands should really be
modeled independently. For example, should the two f_sky's be
correlated? If so, it could have a noticeable effect on the posterior
of the ratio.
Cheers
Anand
Wait- do you want to summarize the posterior of a single ratio as
lognormal, or model the distribution of many ratios as lognormal? If
the latter, scratch everything I said. :)
The easiest thing to do is make ratio a lower-level variable than f1
or f2, and put both bands of all your datapoints in the same model
mu_ratio = Normal('mu_ratio',...)
tau_ratio = Gamma('tau_ratio',...)
ratios = Lognormal('ratio',mu_ratio,tau_ratio)
f1 = Uniform('f1',0,100,value = ones(n))
f2 = f1 * ratios
...
and then you get posteriors for mu_ratio and tau_ratio, as you (may
have) wanted.
Anand