Logit or Probit recommendation?

Stephen Martin

Nov 14, 2016, 12:08:23 PM

to stan-...@googlegroups.com

Hey all,

Yet another post by me. I thought I posted this last night but I don't think it actually sent.

I have a model that makes pretty extensive use of logistic (and multinomial/softmax logistic) regression, or of parameterizations in terms of log-odds that are transformed to probabilities. I've tried my best to use the more efficient Stan functions (e.g., log_inv_logit(pi_logit[n]) for cluster membership probabilities), but I'm curious whether the Stan developers would recommend probit over logit. Is one more efficient than the other? I know Phi_approx() was created as an efficient probit function for this sort of thing. Reparameterizing in terms of probits would be a hefty task, so before trying it, I wanted to ask what the Stan devs (or other users) think about the efficiency of Phi() vs. inv_logit() as link functions.

Thanks,
--Stephen

Bob Carpenter

Nov 14, 2016, 4:24:23 PM

to stan-...@googlegroups.com

There's a discussion in the Gelman and Hill regression book and
the conclusion is that it's primarily a matter of convenience.
The primary difference is in the scale.

With Stan, the inv_logit() function is much more efficient than
Phi(), and the logit link is also more robust to outliers. The
Phi_approx() function is close to Phi() and more efficient.

The logistic distribution is easy to understand as a log-odds (logit)
transform of a uniform(0, 1) variable.
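
The logit-of-a-uniform description can be checked numerically. The following is a minimal Python sketch, not Stan code; the helper names `logit`, `inv_logit`, and `empirical_cdf` are written here for illustration. It draws uniform(0, 1) values, applies the logit, and compares the empirical CDF of the result with inv_logit(x), which is the CDF of the standard logistic distribution.

```python
import math
import random

def logit(u):
    # Log-odds of a probability u in (0, 1).
    return math.log(u / (1.0 - u))

def inv_logit(x):
    # Standard logistic CDF.
    return 1.0 / (1.0 + math.exp(-x))

random.seed(0)
n = 100_000
# random.random() can return exactly 0.0, so drop boundary values before taking logs.
uniforms = (random.random() for _ in range(n))
draws = [logit(u) for u in uniforms if 0.0 < u < 1.0]

def empirical_cdf(samples, x):
    # Fraction of samples at or below x.
    return sum(1 for s in samples if s <= x) / len(samples)

for x in (-2.0, 0.0, 1.5):
    print(x, empirical_cdf(draws, x), inv_logit(x))
```

With this many draws, the two columns should agree to about two decimal places.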

In some Gibbs applications, it was easier to do probit because
the conjugacy of the latent normal formulation could be
exploited to simplify some computations (at least I'm pretty sure
that's why probit is so popular).

- Bob


Andrew Gelman

Nov 14, 2016, 4:25:17 PM

to stan-...@googlegroups.com

I just recommended Phi_approx last week in a project. I have no idea if Phi_approx is faster than inv_logit but one could always experiment.
A

Ben Goodrich

Nov 14, 2016, 4:34:12 PM

to Stan users mailing list, gel...@stat.columbia.edu

On Monday, November 14, 2016 at 4:25:17 PM UTC-5, Andrew Gelman wrote:

> I just recommended Phi_approx last week in a project. I have no idea if Phi_approx is faster than inv_logit but one could always experiment.

There is no need to experiment when you can read the documentation. Phi_approx(x) cannot be faster than inv_logit(x), because it simply calls inv_logit(0.07056 * x^3 + 1.5976 * x); it is only slightly slower, since the only extra work is evaluating that polynomial.

Ben
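
As a rough illustration of the formula quoted above, here is a minimal Python sketch (not Stan's implementation) that evaluates the same polynomial inside a logistic function and compares it with the exact normal CDF over a grid. The Python functions `Phi` and `Phi_approx` are stand-ins named after the Stan functions; the grid range of -5 to 5 is an arbitrary choice.

```python
import math

def inv_logit(x):
    # Standard logistic function.
    return 1.0 / (1.0 + math.exp(-x))

def Phi(x):
    # Exact standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def Phi_approx(x):
    # Logistic approximation using the polynomial quoted in the thread.
    return inv_logit(0.07056 * x ** 3 + 1.5976 * x)

# Largest absolute difference on a grid from -5 to 5 in steps of 0.01.
grid = [i / 100.0 for i in range(-500, 501)]
max_err = max(abs(Phi(x) - Phi_approx(x)) for x in grid)
print(max_err)
```

On this grid the maximum absolute difference stays well under 1e-3, which is consistent with the description of Phi_approx() as close to Phi().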

Andrew Gelman

Nov 14, 2016, 4:43:42 PM

to stan-...@googlegroups.com

thx! Is Phi_approx slower than bernoulli_logit, though, because bernoulli_logit has some steps built in?

Stephen Martin

Nov 14, 2016, 4:51:22 PM

to Stan users mailing list

Extremely helpful responses; thanks to all. I somehow missed the Phi_approx() section of the manual (I think I was just recalling the function from memory and didn't bother to look it up), which states that it's really just rescaling the inv_logit, à la the classic 1.6 IRT transform.

@Bob, thanks for the elaboration on the use of probit; it makes sense that Gibbs samplers would prefer that. I'll keep using logit because it's much more intuitive to me, should apparently be faster, and seems more flexible for my use case.

Bob Carpenter

Nov 15, 2016, 1:50:58 PM

to stan-...@googlegroups.com

In decreasing order of speed:

1. y ~ bernoulli_logit(alpha);

2. y ~ bernoulli(inv_logit(alpha));

3. y ~ bernoulli(Phi_approx(alpha));

The speed comes because

1 >> 2: builds a simpler expression graph and thus quicker chain rule
propagation for autodiff

2 >> 3: inv_logit() is faster than Phi_approx()

We could build a

4. y ~ bernoulli_probit(alpha);

which would be faster than option 3 but slower than option 1.

- Bob
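
The ordering above is about autodiff cost, which a plain Python sketch cannot measure. What a sketch can show is the difference in form between options 1 and 2: computing the Bernoulli log probability directly from the log-odds in one function, versus first mapping to a probability and then taking a log. The function names below are hypothetical and this is not how Stan implements either option. It also shows a separate side benefit of the direct form, namely numerical stability at large log-odds.

```python
import math

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def log1p_exp(x):
    # Numerically stable log(1 + exp(x)).
    if x > 0:
        return x + math.log1p(math.exp(-x))
    return math.log1p(math.exp(x))

def bernoulli_logit_lpmf(y, alpha):
    # Analogue of option 1: log p(y | alpha) straight from the log-odds.
    return -log1p_exp(-alpha) if y == 1 else -log1p_exp(alpha)

def bernoulli_lpmf_via_prob(y, alpha):
    # Analogue of option 2: convert to a probability, then take the log.
    p = inv_logit(alpha)
    q = p if y == 1 else 1.0 - p
    # math.log(0.0) raises in Python, so report -inf explicitly.
    return math.log(q) if q > 0.0 else float("-inf")

# Moderate log-odds: the two forms agree.
print(bernoulli_logit_lpmf(1, 0.3), bernoulli_lpmf_via_prob(1, 0.3))

# Large log-odds with y = 0: the two-step form underflows, the direct form does not.
print(bernoulli_logit_lpmf(0, 40.0), bernoulli_lpmf_via_prob(0, 40.0))
```

In the second comparison, inv_logit(40) rounds to exactly 1.0 in double precision, so the two-step form loses the log probability entirely, while the direct form returns a value close to -40.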

Andrew Gelman

Nov 15, 2016, 4:40:18 PM

to stan-...@googlegroups.com

Hmm, this suggests another fudge which would be to do bernoulli_logit(alpha/1.6) (or maybe it's alpha*1.6, I can never remember which way this goes), which is cruder than Phi_approx but might do the job in our example.
A
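
The direction of the 1.6 scaling can be settled with a quick numerical check. This is a minimal Python sketch, assuming alpha is on the probit scale and the goal is to approximate Phi(alpha) with a rescaled logistic; `Phi` and `inv_logit` are Python stand-ins for the Stan functions, and the grid from -4 to 4 is an arbitrary choice.

```python
import math

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def Phi(x):
    # Exact standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

grid = [i / 10.0 for i in range(-40, 41)]

# Candidate 1: multiply the probit-scale value by 1.6 before the logistic.
err_times = max(abs(Phi(x) - inv_logit(1.6 * x)) for x in grid)

# Candidate 2: divide by 1.6 instead.
err_div = max(abs(Phi(x) - inv_logit(x / 1.6)) for x in grid)

print(err_times, err_div)
```

Multiplying gives a maximum absolute error of a few hundredths at most, while dividing is off by more than 0.1, so under these assumptions the fudge would be bernoulli_logit(1.6 * alpha). This matches the linear term of Phi_approx(), whose coefficient is 1.5976.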
