Bayesian model averaging in Stan?

Allen Riddell

Jun 30, 2013, 9:54:39 AM
to stan-...@googlegroups.com
Hi stan-users,

After getting the horseshoe prior working without any problems, I found myself
wondering about BMA. Is there a model resembling BMA that works in Stan? (I'm
thinking about linear regression with p > 50.)

Thanks,

Allen

Bob Carpenter

Jul 1, 2013, 1:42:49 PM
to stan-...@googlegroups.com
We don't support BMA as such.

Instead, you can implement a mixture over models directly within Stan
and estimate the mixing rate at the same time. This won't do hard
variable selection for you, though.

There's a chapter on mixtures in the manual with examples. It's
a bit complicated because you have to marginalize out the discrete
parameters in an arithmetically stable way. The marginalization also
helps sampling efficiency by computing with an expectation instead of
a sample.
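
Roughly, the marginalized mixture pattern looks like the following minimal
sketch, a two-component normal mixture in current Stan syntax (the data and
priors here are illustrative, not the manual's exact example):

data {
  int<lower=1> N;
  vector[N] y;
}
parameters {
  real<lower=0, upper=1> theta;  // mixing proportion
  ordered[2] mu;                 // component locations, ordered for identifiability
  vector<lower=0>[2] sigma;      // component scales
}
model {
  mu ~ normal(0, 10);
  sigma ~ cauchy(0, 5);
  for (n in 1:N)
    target += log_sum_exp(log(theta) + normal_lpdf(y[n] | mu[1], sigma[1]),
                          log1m(theta) + normal_lpdf(y[n] | mu[2], sigma[2]));
}

The log_sum_exp call is what performs the arithmetically stable
marginalization over the discrete component indicator for each observation.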

By horseshoe, do you mean this?

ftp://ftp.stat.duke.edu/pub/WorkingPapers/08-31.pdf

If so, did you use one of the latent parameter formulations
(lambda or kappa) or find a way to express the probability function
more directly?

- Bob


Allen Riddell

Jul 2, 2013, 2:20:27 PM
to stan-...@googlegroups.com


> Instead, you can implement a mixture over models directly within Stan
> and estimate the mixing rate at the same time. This won't do hard
> variable selection for you, though.

Is it worth trying in this case? If p = 50, that's 2^50 possible models/mixture
components, right?

> By horseshoe, do you mean this?
>
> ftp://ftp.stat.duke.edu/pub/WorkingPapers/08-31.pdf
>
> If so, did you use one of the latent parameter formulations
> (lambda or kappa) or find a way to express the probability function
> more directly?

I used the latent parameter formulation. It worked well; it beat the lasso handily
on the dataset I was working with.
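
For reference, a minimal sketch of the lambda (latent local scale) horseshoe
formulation for linear regression in Stan; the data names and prior choices
here are illustrative, not necessarily the model Allen ran:

data {
  int<lower=1> N;
  int<lower=1> P;
  matrix[N, P] X;
  vector[N] y;
}
parameters {
  real alpha;                 // intercept
  vector[P] beta;             // regression coefficients
  vector<lower=0>[P] lambda;  // local shrinkage scales (half-Cauchy via the lower bound)
  real<lower=0> tau;          // global shrinkage scale
  real<lower=0> sigma;        // residual scale
}
model {
  lambda ~ cauchy(0, 1);
  tau ~ cauchy(0, 1);
  sigma ~ cauchy(0, 5);
  beta ~ normal(0, tau * lambda);
  y ~ normal(alpha + X * beta, sigma);
}

A non-centered parameterization (declaring a standard-normal vector and
scaling it by tau and lambda in a transformed parameters block) often samples
more reliably when the data are only weakly informative.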

Thanks,

Allen

Andrew Gelman

Jul 2, 2013, 5:03:20 PM
to stan-...@googlegroups.com, Allen Riddell
Allen,
When you write this up, please send it to us. It's good to have this sort of story!
Andrew

Bob Carpenter

Jul 2, 2013, 7:36:19 PM
to stan-...@googlegroups.com


On 7/2/13 2:20 PM, Allen Riddell wrote:
>
>
>> Instead, you can implement a mixture over models directly within Stan
>> and estimate the mixing rate at the same time. This won't do hard
>> variable selection for you, though.
>
> Is it worth trying in this case? If p = 50, that's 2^50 possible models/mixture
> components, right?

No :-) With p = 50, if you want models to be identified
by subsets of predictors, there's no way to marginalize explicitly.

Without discrete parameters, there's no good way to have a prior
that puts finite probability mass at 0 --- and I'd think it would be
challenging even with discrete parameters.

It looks like you're already exploring various forms of
shrinkage, which should work just as well predictively.

- Bob
