Hi,
First, congratulations to the developers on excellent software
packages, both pymc and pymc.gp. I have been particularly impressed
by the layout (returning random GP realizations as callable Python
objects -- brilliant!) and computational efficiencies used by pymc.gp
(in particular, the low-rank Cholesky calculations). Not to mention,
the GP Users' guide is outstanding.
My question has to do with whether or not the low-rank Cholesky
simplifications can be taken advantage of when evaluating the
likelihood associated with GP parent parameters. I'm interested in a
very simple case: following Chapter 2 of the GP Users' guide, say that
all of the GP's children, K, are observations of the GP on the mesh
x_star (i.e. the equation in Section 2.3.3) and that there is no
observation variance, so V_i is 0. If I am following the notation
correctly, I believe the likelihood term of interest here would be
p(f(x_star)|P), which is a multivariate normal distribution based on
the GP mean and covariance, which are functions of P.
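To make concrete which computation I mean, here is a toy numpy/scipy sketch of that multivariate normal log-likelihood, done the naive way with a full Cholesky factor (this is just my own illustration, not pymc.gp's internals; `mean` and `cov` stand in for the GP mean and covariance evaluated on x_star as functions of P):

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

def mvn_loglik(y, mean, cov):
    """log p(y | mean, cov) for y ~ N(mean, cov), via a Cholesky factor.

    Toy illustration only. This is exactly the step that breaks when
    cov is numerically singular: cholesky() raises LinAlgError.
    """
    n = len(y)
    L = cholesky(cov, lower=True)
    alpha = solve_triangular(L, y - mean, lower=True)
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + alpha @ alpha)
```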
My question is whether or not we can take advantage of incomplete
Cholesky factorization when computing the likelihood ratio for P and
P_p. The problem I run into often is that for reasonable values of
the covariance "scale" parameter, the covariance matrix is numerically
singular (i.e. there are some redundant data in K). I'm interested in
the idea of using incomplete Cholesky factorization for these cases,
but my concern is whether it is valid to compare likelihood values for
P and P_p when the ranks of their respective incomplete Cholesky
factors might differ (e.g. say there are 1000 observations in K and
for one set of GP parameters P the incomplete Cholesky factor retains
80 observations and for P_p the Cholesky factor retains 300).
Intuitively, it doesn't seem "fair" to me to compare likelihood values
computed from two systems of different dimension.
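Here is a rough pivoted (incomplete) Cholesky sketch in plain numpy that shows the effect I mean -- I'm not claiming this matches pymc.gp's actual routine, and the mesh, kernel, and tolerance are my own made-up example. The retained rank is just whatever survives a relative tolerance on the Schur-complement diagonal, and it changes with the scale parameter:

```python
import numpy as np

def pivoted_cholesky(A, rel_tol=1e-10):
    """Incomplete Cholesky with diagonal pivoting: A ~= L @ L.T.

    Stops once the largest remaining Schur-complement diagonal drops
    below rel_tol times the largest initial pivot; returns (L, rank).
    Rough sketch for illustration, not pymc.gp's implementation.
    """
    n = A.shape[0]
    d = np.diag(A).astype(float).copy()   # diagonal of the Schur complement
    piv = np.arange(n)
    L = np.zeros((n, n))
    for k in range(n):
        m = k + int(np.argmax(d[piv[k:]]))
        piv[k], piv[m] = piv[m], piv[k]
        p = piv[k]
        if k > 0 and d[p] <= rel_tol * d[piv[0]]:
            return L[:, :k], k             # remaining pivots are negligible
        L[p, k] = np.sqrt(d[p])
        rest = piv[k + 1:]
        L[rest, k] = (A[rest, p] - L[rest, :k] @ L[p, :k]) / L[p, k]
        d[rest] -= L[rest, k] ** 2
    return L, n
```

Running this on a squared-exponential covariance over the same mesh, a longer scale gives a much smaller retained rank than a shorter one, which is exactly the rank mismatch between P and P_p that worries me.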
If my intuition here is correct, are there any tricks for dealing with
this situation (aside from introducing a nugget/observation variance),
or do we just have to live with the MCMC sampler avoiding regions for
P where the full covariance matrix is numerically singular?
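For reference, this is the numerical situation I keep hitting (a made-up mesh and squared-exponential covariance, scale = 1): the smallest eigenvalues sit at or below machine precision, and a tiny nugget is enough to restore positive definiteness -- which is why I'd like to know if there is an alternative:

```python
import numpy as np

x = np.linspace(0, 1, 100)
cov = np.exp(-(x[:, None] - x[None, :])**2 / 2.0)   # scale = 1: very smooth

# Smallest eigenvalues are at machine-precision level (possibly negative),
# so np.linalg.cholesky(cov) fails on this matrix.
eigs = np.linalg.eigvalsh(cov)

# A small nugget (observation variance) shifts every eigenvalue up by 1e-6,
# and the Cholesky factorization then succeeds:
L = np.linalg.cholesky(cov + 1e-6 * np.eye(len(x)))
```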
Thanks,
John
--
You received this message because you are subscribed to the Google Groups "PyMC" group.
By the way, is the pymc MAP solver supported for GP submodels? I made
a brief attempt to use it but got an error that looked like it came
out of the NormalApproximation class.
I claim that for this example, theoretically, Lhood(scale=1) >
Lhood(scale=0.15), but because of finite precision on the computer we
can't compute Lhood(scale=1). I believe that if the computer had infinite
floating-point precision, we could evaluate Lhood(scale=1) and the MCMC
sampler would correctly identify the posterior distribution for the
scale parameter as having a mode close to 1.
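One way I've tried to sanity-check this numerically (my own toy setup, not anything from the guide): add a small jitter so that both covariances factor, and then both log-likelihoods become computable on the same n-dimensional space, which also sidesteps the dimension-mismatch worry above:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def loglik(y, cov):
    """Gaussian log-likelihood of y ~ N(0, cov) via scipy's cho_factor."""
    c, low = cho_factor(cov, lower=True)
    logdet = 2.0 * np.sum(np.log(np.diag(c)))
    quad = y @ cho_solve((c, low), y)
    return -0.5 * (len(y) * np.log(2.0 * np.pi) + logdet + quad)

x = np.linspace(0, 1, 100)
y = np.sin(2 * np.pi * x)                 # a smooth made-up "dataset"

def se_cov(scale, jitter=1e-8):
    dx = x[:, None] - x[None, :]
    # jitter makes the matrix positive definite for any scale
    return np.exp(-dx**2 / (2 * scale**2)) + jitter * np.eye(len(x))

ll_long = loglik(y, se_cov(1.0))          # computable only thanks to the jitter
ll_short = loglik(y, se_cov(0.15))
```

Of course, this is just the nugget workaround in disguise, which is exactly what I was hoping to avoid.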
John