You mean in this section:
http://en.wikipedia.org/wiki/Weighted_least_squares#Weighted_least_squares
I couldn't quite understand the least-squares formulation.
Andrew might jump in with a more Bayesian model-based solution,
because the application looks to my untrained eye a lot like his pet
example, 8 schools.
The way to work out what happens in Stan is to just look
at the posterior. If you have a standard linear regression with
no priors, a coefficient vector beta, and a noise scale sigma,
then the log posterior is proportional to
log p(beta, sigma | y, x) propto SUM_n log normal(y[n] | x[n] * beta, sigma)
= SUM_n -log(sigma) - 0.5 * [(y[n] - x[n] * beta) / sigma]^2
where I'm using propto loosely on the log scale to mean equality up to
an additive constant that doesn't depend on beta or sigma.
So if you do
for (n in 1:N)
  increment_log_prob(normal_log(y[n], x[n] * beta, sigma));
this is the posterior you get.
If you multiply the term added to the log prob by w[n], as in
for (n in 1:N)
  increment_log_prob(w[n] * normal_log(y[n], x[n] * beta, sigma));
then the posterior will be
log p(beta, sigma | y, x, w) propto
SUM_n w[n] * { -log(sigma) - 0.5 * [(y[n] - x[n] * beta) / sigma]^2 }
If that's what you want, then you're good to go.
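If it helps, here's a quick numerical check of that algebra (my sketch in
NumPy/SciPy, not part of the Stan model): the weighted sum of normal log
densities, as computed by the loop above, matches the displayed formula up
to an additive constant that doesn't involve beta or sigma.

```python
import numpy as np
from scipy.stats import norm

# Made-up data just for the check.
rng = np.random.default_rng(0)
N = 5
x = rng.normal(size=N)
y = rng.normal(size=N)
w = rng.uniform(0.5, 2.0, size=N)
beta, sigma = 1.3, 0.7

# Weighted sum of normal log densities, as in the Stan loop.
lp_stan = np.sum(w * norm.logpdf(y, loc=x * beta, scale=sigma))

# The displayed formula, which drops the -0.5 * log(2 * pi) term.
lp_formula = np.sum(w * (-np.log(sigma) - 0.5 * ((y - x * beta) / sigma) ** 2))

# They differ only by a constant in (beta, sigma): the dropped normalizing
# term, weighted by sum(w).
const = -0.5 * np.log(2 * np.pi) * np.sum(w)
print(np.isclose(lp_stan, lp_formula + const))  # True
```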
Even if you think it's right analytically, I would run some tests
on problems with known answers to make sure you're getting the
right answers. (Of course, that's tricky because the MLE isn't
necessarily close to the posterior mean until you get a lot of data.)
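As one such test with a known answer (again my sketch, not from the
thread): with flat priors, the mode of the weighted log posterior in beta
should match the closed-form weighted least squares estimate from that
Wikipedia page, which for a single predictor with no intercept reduces to
sum(w*x*y) / sum(w*x^2).

```python
import numpy as np

# Simulate a single-predictor regression with known beta.
rng = np.random.default_rng(1)
N = 200
x = rng.normal(size=N)
y = 2.0 * x + rng.normal(scale=0.5, size=N)
w = rng.uniform(0.5, 2.0, size=N)

# Closed-form weighted least squares estimate.
beta_wls = np.sum(w * x * y) / np.sum(w * x * x)

# Mode of the weighted log posterior in beta: for any fixed sigma, the
# argmax of SUM_n w[n] * (-0.5 * ((y[n] - x[n]*beta)/sigma)^2) is the
# same, so a crude grid search over beta suffices here.
grid = np.linspace(beta_wls - 1.0, beta_wls + 1.0, 20001)
lp = np.array([np.sum(w * -0.5 * (y - x * b) ** 2) for b in grid])
beta_mode = grid[np.argmax(lp)]

print(abs(beta_wls - beta_mode) < 1e-3)  # the two estimates agree
```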
- Bob