Area Under a Curve in Stan


sepehr akhavan

Mar 7, 2016, 1:27:46 PM
to Stan users mailing list
Hi,

1) I have a distribution whose parameters are sampled. I'd like to calculate the area under the distribution curve (at each iteration, given the sampled parameters) up to a point. Is there an easy way to do this in Stan? Any suggestions?

2) In R, there are several methods available to compute the area under a curve. Is there any way I could call one of those methods inside my RStan code?

3) Between approach (1) and approach (2), which one is recommended?

Thanks very much for your help,
-Sepehr

P.S. Please let me know if you need my actual Stan code to answer this question.

 

Bob Carpenter

Mar 7, 2016, 3:35:06 PM
to stan-...@googlegroups.com
If I understand you correctly, you want to compute
the CDF of the marginal posterior for a specified parameter.

There's no way to do that from within a Stan program.
You can empirically estimate it using the posterior draws after
the program finishes.
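A minimal sketch of that empirical estimate, shown here in Python with simulated draws standing in for real Stan output (the distribution, size, and cutoff are purely illustrative):

```python
import numpy as np

# Simulated "posterior draws" for one parameter; in practice these would
# come from the fitted model (e.g. rstan::extract in R).
rng = np.random.default_rng(1)
draws = rng.normal(loc=2.0, scale=0.5, size=4000)

# Monte Carlo estimate of P(theta <= cutoff | data):
# the fraction of posterior draws at or below the cutoff.
cutoff = 2.0
cdf_estimate = float(np.mean(draws <= cutoff))
```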

- Bob

sepehr akhavan

Mar 7, 2016, 4:25:25 PM
to Stan users mailing list
Hi Bob,
Thanks very much for your answer. Yes, you are right: for any probability distribution, that would be the CDF. My aim, however, is to compute the area under any curve, not necessarily a probability distribution. For instance, consider the area under a GP curve.

Given that it's not possible to do this within Stan, is there any way to use some of the R functions inside Stan? Of course, one option is to generate points from a GP in Stan, export them, and use them inside an AUC function in R. Is there a better way?

Thanks very much,
-Sepehr

Bob Carpenter

Mar 7, 2016, 4:33:10 PM
to stan-...@googlegroups.com
I'm afraid not.

For marginal posteriors, you can use simple MCMC methods
to compute the CDF of a density, with the estimate
being the fraction of points below the cutoff.

- Bob

James Camac

Mar 21, 2016, 6:50:02 AM
to Stan users mailing list
Hi Bob,
Maybe I've missed something in the original question,
but I can't see why Stan cannot calculate AUC, especially since you can in JAGS using the example below:

library(R2jags)

set.seed(1)
J <- 2
n <- 5000
X <- cbind(1, matrix(runif(n*J, -1, 1), ncol=2))
b <- 1:3
p <- plogis(X %*% b)
y <- rbinom(n, 1, p)

M <- function() {
  for (i in 1:n) {
    y[i] ~ dbern(p[i])
    logit(p[i]) <- inprod(X[i, ], b)
  }
  for (j in 1:(J+1)) {
    b[j] ~ dnorm(0, 0.0001)
  }
  for (t in 1:length(thr)) {
    sens[t] <- sum((p > thr[t]) && (y==1))/n1
    spec[t] <- sum((p < thr[t]) && (y==0))/n0
    fpr[t] <- 1 - spec[t]
    fnr[t] <- 1 - sens[t]
  }
  auc <- sum((sens[2:length(sens)]+sens[1:(length(sens)-1)])/2 *
    -(fpr[2:length(fpr)] - fpr[1:(length(fpr)-1)]))
}

j <- jags(list(X=X, y=y, n=n, J=J, n1=sum(y), n0=sum(!y), thr=seq(0, 1, 0.05)),
         NULL, c('b', 'sens', 'spec', 'fpr', 'fnr', 'auc'), M, 3, 1000)

Having said this, though, it isn't immediately clear how this would be efficiently coded in Stan.
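For comparison, the same trapezoidal AUC can be computed outside the sampler. A sketch in Python with simulated data (the variable names mirror the JAGS code, but everything here is illustrative, not part of the original example):

```python
import numpy as np

# Simulated probabilities and labels standing in for the JAGS p and y.
rng = np.random.default_rng(1)
p = rng.uniform(size=5000)
y = (rng.uniform(size=5000) < p).astype(int)

# Sensitivity and specificity at each threshold, as in the JAGS loop.
thr = np.arange(0.0, 1.0 + 1e-9, 0.05)
n1, n0 = y.sum(), (1 - y).sum()
sens = np.array([np.sum((p > t) & (y == 1)) / n1 for t in thr])
spec = np.array([np.sum((p < t) & (y == 0)) / n0 for t in thr])
fpr = 1 - spec

# Trapezoidal rule over the ROC curve, matching the JAGS auc line.
auc = float(np.sum((sens[1:] + sens[:-1]) / 2 * -(fpr[1:] - fpr[:-1])))
```

With perfectly informative probabilities like these, the estimate lands near the analytic value of 5/6, less a little discretization error from the coarse threshold grid.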

James Camac

Mar 21, 2016, 6:59:53 AM
to stan-...@googlegroups.com
Ah, I see there are issues with summing like this...

n <- 5000
X <- runif(n, -1, 1)
a <- 0.5
b <- 1.2
p <- plogis(a + b * X)
y <- rbinom(n, 1, p)
thr <- seq(0, 1, 0.05)
n_threshold <- length(seq(0, 1, 0.05))

data <-
  list(
    n = n,
    variable = X,
    y = y,
    thr = thr,
    n_threshold = n_threshold
  )
model <-" 
  data{
    int<lower=1> n;
    real variable[n];
    int n_threshold;
    real thr[n_threshold];
    int<lower=0, upper=1> y[n];
  }
  parameters{
  real a;
  real b;
  }
  model{
    real logit_p[n];
    real p[n];
    real sens[n_threshold];
    real spec[n_threshold];
    real fpr[n_threshold];
    real auc;

  for(i in 1:n) {
    logit_p[i] <- a + b * variable[i];
    p[i] <- inv_logit(logit_p[i]);
  }
  y ~ bernoulli(p);
  a ~ normal(0, 10);  
  b ~ normal(0, 10);
  

  for (t in 1:n_threshold) {
    sens[t] <- sum((p > thr[t]) && (y==1))/cumulative_sum(y);
    spec[t] <- sum((p < thr[t]) && (y==0))/cumulative_sum(!y);
    fpr[t] <- 1 - spec[t]
  }
  auc <- sum((sens[2:length(sens)]+sens[1:(length(sens)-1)])/2 * 
    -(fpr[2:length(fpr)] - fpr[1:(length(fpr)-1)]))
}"
j <- stan(model_code=model, data = data, iter=2000, chains = 3)


binary infix operator >= with functional interpretation logical_gt requires arguments or primitive type (int or real), found left type=real[], right arg type=real; No matches for: 

  logical_gt(real[], real)


hmmm.. so not vectorised...

Michael Betancourt

Mar 21, 2016, 7:26:31 AM
to stan-...@googlegroups.com
Let’s be clear — what you’re proposing is an _estimate_ of the AUC, and an
estimate with potentially large error.  The problem with these kinds of naive
estimates is that they completely undermine the theoretical guarantees of
MCMC, as the error biases the Markov chains away from the true target distribution.

If you can guarantee that the error in the AUC estimate is around machine
precision then you’re okay, but that is highly unlikely with such a simple
quadrature procedure. At least not unless you pay a huge computational
burden with really tiny intervals.

James Camac

Mar 21, 2016, 8:43:17 AM
to Stan users mailing list
Of course. Fewer sampled thresholds mean greater error.

I don't see why you needed to point out that this is only an estimate of the AUC; as far as I'm aware, you can only ever "estimate" it.

Despite this, like WAIC and other measures, these approximations are useful for examining model performance.

James Camac

Mar 21, 2016, 8:46:50 AM
to Stan users mailing list
Also, I might be confused, but how does calculating the AUC in the generated quantities block bias the MCMC sampling?

Michael Betancourt

Mar 21, 2016, 8:56:35 AM
to stan-...@googlegroups.com
The code you included was in the model block, not the generated quantities block.

On Mar 21, 2016, at 12:46 PM, James Camac <james...@gmail.com> wrote:

> Also I might be confused but how does calculating AUC in the generated quantities bias the MCMC sampling?
>

Bob Carpenter

Mar 21, 2016, 11:09:57 AM
to stan-...@googlegroups.com
I've never found AUC very useful for anything other than
bakeoffs based on AUC and its relatives (like TREC for
information retrieval). In real applications, there's usually
a decision problem involving operating points to maximize
expected value, with costs for false positives and negatives,
and benefits for true positives and true negatives. That is,
rarely is a classifier intended to be used at all specificity
or sensitivity levels.

But if you just want to compute the standard empirical
version in the generated quantities, that won't affect sampling.
And if you do it in the generated quantities, you'll automatically
get posterior variance of the AUC estimate (which won't include
error from the quadrature, but I understand that's traditional
in machine learning settings).
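Concretely, computing the empirical AUC once per draw and summarizing across draws gives that posterior uncertainty for free. A sketch in Python (the per-draw probabilities are simulated here; in a real workflow each row would come out of the Stan fit):

```python
import numpy as np

def empirical_auc(p, y, thr):
    """Trapezoidal AUC over an ROC curve traced at the given thresholds."""
    n1, n0 = y.sum(), (1 - y).sum()
    sens = np.array([np.sum((p > t) & (y == 1)) / n1 for t in thr])
    fpr = np.array([np.sum((p > t) & (y == 0)) / n0 for t in thr])
    return float(np.sum((sens[1:] + sens[:-1]) / 2 * -(fpr[1:] - fpr[:-1])))

rng = np.random.default_rng(2)
y = rng.integers(0, 2, size=200)

# One row of fitted probabilities per posterior draw, as a
# generated-quantities computation would produce each iteration.
p_draws = np.clip(
    0.5 + 0.4 * (y - 0.5) + rng.normal(0.0, 0.15, size=(100, 200)),
    1e-6, 1 - 1e-6,
)

thr = np.arange(0.0, 1.0 + 1e-9, 0.05)
aucs = np.array([empirical_auc(p, y, thr) for p in p_draws])
auc_mean, auc_sd = float(aucs.mean()), float(aucs.std())
```

The spread of `aucs` is the posterior uncertainty Bob mentions; it reflects variation across draws, not quadrature error from the threshold grid.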

- Bob

James Camac

Mar 21, 2016, 3:57:22 PM
to Stan users mailing list
Sorry Michael, I meant to put it in the generated quantities section. My bad.

Hi Bob, we use it, along with cross-validation, to assess model performance for binary classification problems.

I'm curious to know what measures you would recommend for such binary problems, though.

Mick Cooney

Mar 21, 2016, 4:01:21 PM
to stan-...@googlegroups.com
How does placing the calculation in the model block bias the outputs?


--
Mick Cooney
mickc...@gmail.com

Bob Carpenter

Mar 21, 2016, 4:08:53 PM
to stan-...@googlegroups.com
You can't actually do a return-value calculation in the model block,
because you can only set local variables.

In general, inaccuracy in density calculations will negatively affect
the error bounds in the Hamiltonian simulation (which is, roughly,
how new proposals are generated).
- Bob

Bob Carpenter

Mar 21, 2016, 4:10:53 PM
to stan-...@googlegroups.com
The model is being fit on what the machine learning people
call "log loss", so I'd use that as a direct measure of
probabilistic accuracy. It's nice to see the receiver
operating curve (either precision-recall or specificity-sensitivity),
but I don't know of any situation other than evaluating classifiers
for their own sake where it's relevant. When you want to actually
use a classifier for an application, you usually care about
the sensitivity-specificity tradeoff at a particular operating
point, or about whether your model is calibrated, in that it makes
the right predictions at the rate its probabilities say it does.

- Bob

James Camac

Mar 21, 2016, 4:28:11 PM
to Stan users mailing list
Thanks, this is good to know.
I'll have a look at how to code up the log loss; I haven't used it before. AUC seems to be pervasive in my field for these types of problems and is primarily used as a model selection tool, along with the log likelihood or some other derived quantity such as AIC, DIC, or the like.

Just so you know, the model I put up last night doesn't work. It was getting late.
I'll code up a correct version later today, once I've worked out how to get around some of the conditional and vectorisation issues.

Jonah Gabry

Mar 21, 2016, 5:18:08 PM
to Stan users mailing list
On Monday, March 21, 2016 at 4:28:11 PM UTC-4, James Camac wrote:
> Thanks, this is good to know.
> I'll have a look at how to code up the log loss. I haven't used it before.

Luckily it's pretty straightforward. If y (observed) and yhat (predicted) are N-vectors, then you could do something like this in R:

-1 / length(y) * sum(y * log(yhat) + (1 - y) * log(1 - yhat))

You could also do it in the generated quantities block of a Stan program, but you might have y declared as an integer array and yhat as a real array or vector, in which case you'd probably have to do it in a loop.
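For instance, a direct translation of that one-liner into Python, with made-up labels and predictions purely for illustration:

```python
import numpy as np

# Made-up observed labels and predicted probabilities.
y = np.array([1, 0, 1, 1, 0])
yhat = np.array([0.9, 0.2, 0.7, 0.6, 0.1])

# Average negative log likelihood (log loss), mirroring the R one-liner.
log_loss = float(-np.mean(y * np.log(yhat) + (1 - y) * np.log(1 - yhat)))
```

Lower is better; a model predicting each class's base rate gives the baseline to beat.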

Bob Carpenter

Mar 21, 2016, 5:19:54 PM
to stan-...@googlegroups.com
Log loss is just negative log likelihood, usually on held-out
or cross-validated data. Aki, Jonah, and Andrew have
a paper on the information criteria and their relation to cross-validation.

I really like the Raftery et al. paper on calibration and
sharpness:

https://www.stat.washington.edu/raftery/Research/PDF/Gneiting2007jrssb.pdf

- Bob

sepehr akhavan

Jun 17, 2016, 8:50:57 PM
to Stan users mailing list
Dear Bob,
I asked a question about computing the area under the curve of a pdf a few months back. I was wondering whether I could use my own rectangular-integration code inside the model block. I need to compute a CDF that is not available in closed form and that will be used as part of my likelihood. Earlier in this thread, you mentioned that the AUC calculation should go inside the generated quantities block so as not to affect sampling. In my case, however, that is not possible, as my likelihood needs the AUC calculation.
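A midpoint-rule version of that rectangular integration can be prototyped outside Stan first. A sketch in Python; the normal density is only a stand-in for a pdf without a closed-form CDF, and the grid settings are illustrative:

```python
import numpy as np
from math import erf, sqrt

def pdf(x, mu=0.0, sigma=1.0):
    """Stand-in density; in the real problem this has no closed-form CDF."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def cdf_rectangular(upper, lower=-8.0, n=1000):
    """Approximate the integral of pdf from lower to upper with n midpoint rectangles."""
    width = (upper - lower) / n
    midpoints = lower + (np.arange(n) + 0.5) * width
    return float(np.sum(pdf(midpoints)) * width)

approx = cdf_rectangular(1.0)
exact = 0.5 * (1.0 + erf(1.0 / sqrt(2.0)))  # standard normal CDF at 1
```

In a Stan program the same loop would be written as a user-defined function called from the model block, so the approximation (and its autodiffed derivatives) becomes part of the log density.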

Thanks very much for your help,
-Sepehr

Bob Carpenter

Jun 18, 2016, 5:06:33 PM
to stan-...@googlegroups.com
Hard to say without more details. Stan requires
the log density function that you define to be differentiable
with respect to the parameters. So if you have an
iterative algorithm, Stan will just autodiff that iterative
algorithm. It can work in some cases, but isn't ideal
in terms of efficiency or arithmetic accuracy and stability
when compared to working out a second iterative algorithm
for the derivatives. That is, the accuracy guarantees for
iterative approximations of functions don't automatically carry
over to their derivatives.

I'd suggest you try it and see what happens. Do you have
a way to simulate data according to the model and see if you
can recover the parameters?

- Bob