Re: [stan-users] seemingly meaningless reparameterization doubles runtime

Andrew Gelman

Apr 25, 2017, 6:06:33 PM
to stan-...@googlegroups.com
D
Interesting--thanks for the explanation!  I know about vectorization, but I'd never thought of this particular issue before.
A

On Apr 25, 2017, at 12:46 AM, Daniel Lee <bea...@alum.mit.edu> wrote:

Hi Andrew,

It's purely computational. Although the math is the same, simple1 is much less efficient than simple2.

It really comes down to two lines in each of simple1 and simple2. Sorry, it's hard to describe, but I'll try.

simple1:
  vector[N_student_year] a;   
  a = sigma_a*eta_a;
...
  vector[N] erv_hat; 
  erv_hat = mu_a + a[student_year];

In the definition of a, we multiply a scalar by a vector of length N_student_year (100). In the definition of erv_hat, we add a scalar to a vector of length N (2000). So we're adding mu_a to each element of the length-N vector, which adds a lot of nodes to the autodiff expression graph.

simple2:
  vector[N_student_year] a;   
  a = mu_a + sigma_a*eta_a;
...
  vector[N] erv_hat; 
  erv_hat = a[student_year];

By contrast, here we're adding mu_a to a vector of length N_student_year (100). In the definition of erv_hat, we're now just assigning values, so there is a construction cost, but nothing extra is added to the expression graph. (I think we're smart enough that the indexing just copies the original elements over, which is still a little bit of work, but much cheaper than adding things to the expression graph.)

The takeaway here is that you want to do algebra on the smallest vectors you can and then broadcast, rather than broadcast to a larger vector and then do algebra. The size of the expression graph has a real computational cost.
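
To make the counting concrete, here is a toy sketch in Python (not Stan, and not how Stan's autodiff is actually implemented). The Var class and count_nodes helper are hypothetical names made up for this illustration. It assumes every scalar add or multiply creates one graph node and that indexing only copies references, as guessed above. The sizes 100 and 2000 are the ones from the thread.

```python
class Var:
    """Toy scalar that counts each arithmetic operation as one graph node."""
    node_count = 0  # shared counter, reset before each run

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        Var.node_count += 1
        return Var(self.value + other.value)

    def __mul__(self, other):
        Var.node_count += 1
        return Var(self.value * other.value)


def simple1(mu_a, sigma_a, eta_a, student_year):
    # a = sigma_a * eta_a;  erv_hat = mu_a + a[student_year]
    a = [sigma_a * e for e in eta_a]            # one multiply per group
    return [mu_a + a[i] for i in student_year]  # one add per observation


def simple2(mu_a, sigma_a, eta_a, student_year):
    # a = mu_a + sigma_a * eta_a;  erv_hat = a[student_year]
    a = [mu_a + sigma_a * e for e in eta_a]     # one multiply and one add per group
    return [a[i] for i in student_year]         # indexing copies, no new nodes


def count_nodes(model, n_groups=100, n_obs=2000):
    """Run a model on made-up inputs and return how many nodes it created."""
    Var.node_count = 0
    mu_a, sigma_a = Var(0.5), Var(2.0)
    eta_a = [Var(0.01 * g) for g in range(n_groups)]
    student_year = [i % n_groups for i in range(n_obs)]  # made-up index
    model(mu_a, sigma_a, eta_a, student_year)
    return Var.node_count


print("simple1 nodes:", count_nodes(simple1))
print("simple2 nodes:", count_nodes(simple2))
```

Under these assumptions simple1 creates 100 multiplies plus 2000 adds, while simple2 creates 100 multiplies plus 100 adds. The real models contain other terms as well, so the node count for this one expression won't map one-to-one onto total runtime.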



Daniel




On Mon, Apr 24, 2017 at 10:40 PM, Andrew Gelman <gel...@stat.columbia.edu> wrote:
Hi, this one's really bugging me.  I know the answer must be obvious but for some reason I'm stuck.  We have 2 models that are essentially identical except for the definition of one of the transformed parameters.  Attached are:

aging_indep.R:  R code generating fake data and fitting the 2 models
simple1.stan
simple2.stan

When you run it, you'll see that simple1.stan takes approx twice as long.  Can someone please explain why?  As I said, I'm sure it's obv but I can't figure it out.
Thanks.
Andrew

--
You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+unsubscribe@googlegroups.com.
To post to this group, send email to stan-...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

