I'm trying to develop a notebook on Bayesian linear regression solved via VI, from scratch. I realise it is fairly easy to do with `tf.keras` by following the TFP tutorial on Probabilistic regression.
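(For context, the `tf.keras` route I mean looks roughly like the following. This is a sketch from memory of that tutorial, so the layer sizes and constants are illustrative, and `x_train`/`y_train`/`x_test` are placeholder names for my data.)

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

# Linear model whose output is a Normal distribution (aleatoric uncertainty).
model = tf.keras.Sequential([
    tf.keras.layers.Dense(1 + 1),
    tfp.layers.DistributionLambda(
        lambda t: tfd.Normal(loc=t[..., :1],
                             scale=1e-3 + tf.math.softplus(0.05 * t[..., 1:]))),
])

negloglik = lambda y, rv_y: -rv_y.log_prob(y)
model.compile(optimizer=tf.optimizers.Adam(learning_rate=0.01), loss=negloglik)
model.fit(x_train, y_train, epochs=500, verbose=False)

yhat = model(x_test)  # a tfd.Normal; yhat.mean() / yhat.stddev() give predictions
```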
I've made good progress in my post here,
where I show how to compute and optimise the KL divergence between different distribution families using sampling and the reparameterisation trick. I'm stuck on how to take this forward to a full Bayesian linear regression.
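(By the sampling/reparameterisation estimate I mean something along these lines; a minimal sketch, with the two distributions chosen purely for illustration.)

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions
tfb = tfp.bijectors

# Trainable q and fixed target p; KL(q || p) estimated by Monte Carlo.
q = tfd.Normal(loc=tf.Variable(0.5),
               scale=tfp.util.TransformedVariable(1.0, tfb.Softplus()))
p = tfd.Normal(loc=0.0, scale=1.0)

z = q.sample(1000)  # Normal.sample is reparameterised, so gradients flow through z
kl_estimate = tf.reduce_mean(q.log_prob(z) - p.log_prob(z))
```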
I then looked at the PCA tutorial on the TFP website and adapted it for linear regression here. This implementation uses `JointDistributionCoroutineAutoBatched`, akin to the PCA tutorial.
The main pieces of the code are:
```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

def lr(x, stddv_datapoints):
    num_datapoints, data_dim = x.shape
    # Priors on the bias and weights (b's parameters and w's loc were cut off
    # above; zero-mean values are assumed here).
    b = yield tfd.Normal(loc=0.0, scale=2.0, name="b")
    w = yield tfd.Normal(
        loc=tf.zeros([data_dim]), scale=2.0 * tf.ones([data_dim]), name="w")
    # Likelihood: y ~ Normal(Xw + b, stddv_datapoints).
    y = yield tfd.Normal(
        loc=tf.linalg.matvec(x, w) + b, scale=stddv_datapoints, name="y")
```
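I then pin the generator to the data exactly as the PCA tutorial does (a sketch; `x_train`/`y_train` are my training arrays and the noise scale 0.5 is just the tutorial's value):

```python
import functools

model = tfd.JointDistributionCoroutineAutoBatched(
    functools.partial(lr, x=x_train, stddv_datapoints=0.5))

# Target for VI: log p(b, w, y_train) as a function of the latents (b, w).
target_log_prob_fn = lambda b, w: model.log_prob((b, w, y_train))
```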
Using VI (again with code heavily inspired by the PCA tutorial), I am able to recover `qw_mean`, `qw_stddv`, `qb_mean`, and `qb_stddv`.
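Concretely, my surrogate and the fit look roughly like this (a sketch assuming a fully factorised Normal family and `tfp.vi.fit_surrogate_posterior`, as in the PCA tutorial):

```python
data_dim = x_train.shape[-1]

qb_mean = tf.Variable(0.0, name="qb_mean")
qw_mean = tf.Variable(tf.zeros([data_dim]), name="qw_mean")
qb_stddv = tfp.util.TransformedVariable(
    1.0, tfp.bijectors.Softplus(), name="qb_stddv")
qw_stddv = tfp.util.TransformedVariable(
    tf.ones([data_dim]), tfp.bijectors.Softplus(), name="qw_stddv")

def factored_normal():
    # Mean-field q(b, w); yield order must match the model's (b, then w).
    yield tfd.Normal(loc=qb_mean, scale=qb_stddv, name="b")
    yield tfd.Normal(loc=qw_mean, scale=qw_stddv, name="w")

surrogate_posterior = tfd.JointDistributionCoroutineAutoBatched(factored_normal)

losses = tfp.vi.fit_surrogate_posterior(
    target_log_prob_fn,
    surrogate_posterior=surrogate_posterior,
    optimizer=tf.optimizers.Adam(learning_rate=0.05),
    num_steps=500)
```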
I wanted to ask the following:
- What is the easiest way to make predictions on unseen data in the code above? That is, akin to `tf.keras`'s `model.fit()` / `model.predict()`, can we do something here? (I sketch what I have in mind in the first block after this list.)
- What would be the best way to do this linear regression from scratch, not using `tf.keras.*`, i.e. (see the ELBO sketch after this list):
  - specify a prior (p) on W -- easy
  - specify the likelihood (l) of the data under y ~ N(XW + b)
  - use a surrogate q and optimise the ELBO so that q gets as close as possible to the (unnormalised) posterior p·l
- Some examples use `tfp.math.minimize` and some use `tf.GradientTape`. When is one recommended over the other? (A small comparison sketch follows below.)
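For the prediction question, what I have in mind is forming the posterior predictive by pushing draws from the fitted surrogate through the likelihood. A sketch, where `x_new` is a hypothetical matrix of unseen rows and 0.5 is the same `stddv_datapoints` as above:

```python
num_draws = 200
b_draws, w_draws = surrogate_posterior.sample(num_draws)

# One regression line per posterior draw: shape [num_draws, num_new_points].
loc = tf.linalg.matvec(x_new[tf.newaxis, ...], w_draws) + b_draws[:, tf.newaxis]

y_pred_mean = tf.reduce_mean(loc, axis=0)          # point predictions
y_pred = tfd.Normal(loc=loc, scale=0.5).sample()   # full predictive draws
```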
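For the from-scratch bullets, the negative ELBO I have in mind uses a single reparameterised draw from q, reusing `model` and `surrogate_posterior` from the sketches above:

```python
def neg_elbo():
    # One reparameterised draw from q; gradients flow through the sample.
    b, w = surrogate_posterior.sample()
    # Single-sample estimate of -(E_q[log p(b, w, y_train) - log q(b, w)]).
    return -(model.log_prob((b, w, y_train))
             - surrogate_posterior.log_prob((b, w)))
```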
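And on `tfp.math.minimize` vs `tf.GradientTape`: as far as I can tell the former is a convenience wrapper around the latter, so the following two should be equivalent (sketch):

```python
# (a) convenience wrapper; returns the per-step loss trace.
losses = tfp.math.minimize(
    neg_elbo, num_steps=500, optimizer=tf.optimizers.Adam(learning_rate=0.05))

# (b) explicit loop; same optimisation, full control over each step.
optimizer = tf.optimizers.Adam(learning_rate=0.05)
for _ in range(500):
    with tf.GradientTape() as tape:
        loss = neg_elbo()
    grads = tape.gradient(loss, surrogate_posterior.trainable_variables)
    optimizer.apply_gradients(zip(grads, surrogate_posterior.trainable_variables))
```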