Linear regression with uncertainty in x and y

Michael Kuhn

Aug 24, 2015, 12:22:03 PM

to Stan users mailing list

Hello,

I'm working through small examples to understand how to use Bayesian inference to take measurement uncertainty into account. I made this toy example for normally distributed measurements, where x and y both include a noise term: http://rpubs.com/biocs/uncertainty

For each (x, y) point with associated sigma_x and sigma_y, I derive a virtual point (xx, yy) on the regression line that maximises the likelihood (see the code in the linked example). This is possible because the PDF of the multivariate normal distribution can be maximised analytically. So when the sampling algorithm changes the slope or intercept, each measurement's uncertainty is taken into account through the maximum likelihood that can be associated with that measurement.
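The analytic step described above can be sketched in a few lines of Python. This is only an illustration, not the code from the linked example: it assumes the errors in x and y are independent (a diagonal covariance), and the names virtual_point, intercept and slope are made up for this sketch.

```python
def virtual_point(x, y, sigma_x, sigma_y, intercept, slope):
    """Return the point (xx, yy) on the line yy = intercept + slope * xx
    that maximises the product of two independent normal densities
    centred on the measurement (x, y).

    Assumption: the x and y errors are independent.
    """
    # Maximising the likelihood is equivalent to minimising
    #   (x - xx)^2 / sigma_x^2 + (y - intercept - slope * xx)^2 / sigma_y^2.
    # Setting the derivative with respect to xx to zero and solving gives:
    numerator = x * sigma_y**2 + slope * (y - intercept) * sigma_x**2
    denominator = sigma_y**2 + slope**2 * sigma_x**2
    xx = numerator / denominator
    yy = intercept + slope * xx
    return xx, yy


# With equal uncertainties this reduces to an orthogonal projection:
# the point (0, 2) is projected onto the line y = x.
print(virtual_point(0.0, 2.0, 1.0, 1.0, 0.0, 1.0))
```

When sigma_x is zero the virtual point keeps the measured x, which is the ordinary regression case with noise in y only.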

To account for outliers, I'd like to use a Student's t distribution as suggested in "Doing Bayesian Data Analysis". However, it looks like its PDF cannot be maximised analytically that easily. I was therefore wondering if there's another approach to this problem with Stan.

For example, could I call an external function to compute the virtual points with maximum likelihood numerically?

Thanks, Michael

Bob Carpenter

Aug 24, 2015, 12:27:31 PM

to stan-...@googlegroups.com

Section 10.1 of the manual has a normal-error regression predictor
error model. You don't need to do any analytic calculations if you
just model the underlying x leading to the observed measurement of x
as latent parameters. And yes, you can use Student-t error if you want.

- Bob
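To make the latent-parameter idea concrete, here is a rough Python sketch of the log density such a model would evaluate. It is not the manual's code: the names log_joint and x_true are invented for illustration, normal errors are assumed on both axes, and priors on the intercept, slope and latent x values are left out (i.e. taken as flat).

```python
import math


def normal_logpdf(value, mean, sd):
    """Log density of a normal distribution."""
    z = (value - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def log_joint(x_true, intercept, slope, x_obs, y_obs, sigma_x, sigma_y):
    """Log-likelihood in which the true x values are unknown parameters.

    x_true is sampled alongside intercept and slope, so no analytic
    maximisation over points on the line is needed.
    """
    total = 0.0
    for xt, xo, yo, sx, sy in zip(x_true, x_obs, y_obs, sigma_x, sigma_y):
        # Measurement model: x_obs ~ normal(x_true, sigma_x)
        total += normal_logpdf(xo, xt, sx)
        # Regression on the latent value:
        # y_obs ~ normal(intercept + slope * x_true, sigma_y)
        total += normal_logpdf(yo, intercept + slope * xt, sy)
    return total


x_obs, y_obs = [1.1, 2.0, 2.9], [2.0, 4.1, 6.2]
sigma_x, sigma_y = [0.1, 0.1, 0.1], [0.2, 0.2, 0.2]
print(log_joint([1.0, 2.0, 3.0], 0.0, 2.0, x_obs, y_obs, sigma_x, sigma_y))
```

The sampler then explores the latent x values jointly with the slope and intercept, so each measurement's uncertainty is integrated over rather than replaced by a single maximum-likelihood point.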

Maxwell Joseph

Aug 24, 2015, 12:28:34 PM

to Stan users mailing list

This might be relevant: http://mbjoseph.github.io/2013/11/27/measure.html

I assumed the measurement error for the covariate was normally distributed, but you ought to be able to tweak the likelihood to use the t-distribution.
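As a rough illustration of what that tweak changes, the Python sketch below compares the normal log density with a location-scale Student-t log density. The function names are hypothetical, the argument order (nu, mu, sigma) follows Stan's student_t, and nu = 4 is an arbitrary choice for the demonstration.

```python
import math


def normal_logpdf(value, mu, sigma):
    """Log density of a normal distribution."""
    z = (value - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def student_t_logpdf(value, nu, mu, sigma):
    """Log density of a Student-t with nu degrees of freedom,
    location mu and scale sigma."""
    z = (value - mu) / sigma
    return (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
            - 0.5 * math.log(nu * math.pi) - math.log(sigma)
            - (nu + 1.0) / 2.0 * math.log1p(z * z / nu))


# A residual six scale units from the mean: the normal density penalises
# it far more heavily than the heavy-tailed Student-t, which is why
# outliers pull the fitted line less under t errors.
print(normal_logpdf(6.0, 0.0, 1.0))
print(student_t_logpdf(6.0, 4.0, 0.0, 1.0))
```

In the model itself, the change amounts to replacing the normal density in the error term with the Student-t density, optionally treating nu as a parameter with its own prior.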