Re: [stan-users] something wrong with my cubic B-spline simulation/model

554 views
Skip to first unread message

Andrew Gelman

unread,
Jun 17, 2013, 5:22:02 PM6/17/13
to stan-...@googlegroups.com
Just to clarify:
- The graph from the BDA excerpt shows 21 different curves, I believe.
- The parameterization is important because of the next stage, which is the prior distribution. If you set up the splines in the way that makes the mathematical formulas the simplest, the 21 coefs won't have clean interpretations.
A

On Jun 17, 2013, at 5:09 PM, Bob Carpenter wrote:

> >> On 6/13/13 12:31 PM, Andrew Gelman wrote:
> >>> Do we have B-splines?
>
> On 6/16/13 4:08 PM, Marcus Brubaker wrote:
>> It should be possible to implement splines with fixed numbers of knots right now. We could implement some specialized
>> functions which would make it more convenient and potentially faster but the basic functionality would remain the same.
>
> We've been discussing cubic b-splines on the dev list,
> but I thought I'd post my (failed?) attempt at fitting
> the models here, because as Marcus noted, it shouldn't
> require anything special.
>
> Any help would be appreciated because I've already wasted
> considerable time trying to get it to work (or maybe I
> did and don't realize it --- see below).
>
> I'm attaching
>
> * David Dunson's DRAFT section on cubic b-splines for the next
> edition of Bayesian data analysis (aka "BDA3").
>
> * my attempt to simulate data in R
>
> * my attempt to write a Stan model to fit it
>
> I tried to keep the notation in R and Stan as close to that
> of BDA3 as possible.
>
> The two problems I'm having are:
>
> * the simulations don't look anything like Dunson's:
>
> - my simulations look like simple cubic functions,
> whereas Dunson's have many local minima and maxima
>
> - my simulations take all sorts of values outside of [0,1]
> for inputs in [0,1], whereas Dunson drew many simulations
> all of which took values in [0,1] for inputs in [0,1]
>
> - I believe by "standard normal" he means Normal(0,1)
> for the coefficients
>
> * Stan isn't recovering the coefficients, despite reporting
> good R-hat and n_eff
>
> - or maybe this is the best we can do? with only 2 knots
> and 10000 data points, it's getting the noise scale right
> and the params are within very very fat 95% intervals
>
> Here are some comments on the BDA draft and how I got through
> them (there may be errors in my reasoning here, of course).
>
> The first problem I had is that the meaning of H
> changes from the number of regression coefficients in
> (20.1) to the number of knots later in the same paragraph.
> So I couldn't tell what was going on in later figures when
> they said H = 21.
>
> I didn't understand the notation for the "positive part" of
> a function, or even that that's what it was supposed to be:
>
> z_+ = 1_{z > 0}z
>
> So I went searching. I found the explanations in section 5.2 of Hastie
> et al.'s book much crisper mathematically in that I could follow its
> definitions:
>
> http://www-stat.stanford.edu/~tibs/ElemStatLearn/
>
> In particular, it cleared up the + notation as:
>
> f(x)_+ = max(0,f(x))
>
> Thus I inferred that 1_{z > 0} was an indicator function in notation
> I'd never seen before. The post-multiplication by z is unfortunate
> typographically.
>
> BDA then continues with "...closely related alternative choice of
> splines, which we find preferable, is the family of cubic B-splines'",
> but it never precisely defines cubic B-splines. It does provide an illustration
> in figure 1 that I can't make heads or tails of (I think it's a bunch of overplotted
> functions, but unlike the rest of BDA, the caption doesn't give any details).
> The figure says it was generated "with the bs() function R", but again
> no details as to how. Turns out bs() is a function in Hastie et al.'s
> "splines" package, which is on CRAN, but which doesn't run in R 3.0.0.
> (I'd add a citation to the BDA section).
>
> Am I just simulating from the wrong model? Is what I have not
> cubic B-splines? It looks like they are from Hastie et al., but then
> Hastie et al. go on to define "natural cubic splines" in section 5.2.1.
>
> Anyway, I think I'll leave this for someone else to do, as I need
> to get back to actually coding Stan itself.
>
> The next thing I would've done would've been to expand out the basis
> explicitly and try to fit that directly using linear regression (that is,
> not try to compute the splines in Stan). But even if that worked, it
> wouldn't solve the problem that my simulations look nothing at all like
> Dunson's.
>
> - Bob
>
> --
> You received this message because you are subscribed to the Google Groups "stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>
> <cubic-b-spline.sim.R.txt><cubic-b-spline.stan.txt><bspline.pdf>

Bob Carpenter

unread,
Jun 17, 2013, 5:09:17 PM6/17/13
to stan-...@googlegroups.com
cubic-b-spline.sim.R.txt
cubic-b-spline.stan.txt
bspline.pdf
Reply all
Reply to author
Forward
0 new messages