Examples of Extreme Value Theory models in Stan

892 views
Skip to first unread message

t.wol...@ibe.edu.pl

unread,
Jan 28, 2015, 3:56:23 PM1/28/15
to stan-...@googlegroups.com
Hi,

Do you know any examples of Extreme Value Theory models implemented in Stan?

Thanks in advance,
Tim

Bob Carpenter

unread,
Jan 28, 2015, 4:56:31 PM1/28/15
to stan-...@googlegroups.com
I only know what I just read on the Wikipedia, but if you
have a specific model in mind and can write the density down
in math, we should be able to help code it in Stan.

- Bob
> --
> You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Cameron Bracken

unread,
Jan 29, 2015, 2:01:29 PM1/29/15
to stan-...@googlegroups.com
I have implemented all sorts of extremes models in stan. Everything from stationary univariate GEV to spatial extremes models. I am happy to discuss if you have anything specific in mind.

Cameron 

Tymoteusz Wołodźko

unread,
Jan 30, 2015, 2:48:17 AM1/30/15
to stan-...@googlegroups.com
Actually I must say that I just want to learn more on Bayesian implementations of this kind of models, so I don't have any certain model in mind at this moment, but rather was looking for some working examples. I would be very grateful for some, even the simple "handbook" ones, to get better idea on this.

but unfortunately I don't find them helpful. I have also seen other authors mentioning Bayesian approach to EVT but rarely going into details.

Maybe it would be good to paste a simple example on Stan github page with examples..?

Thanks,
Tim



--
You received this message because you are subscribed to a topic in the Google Groups "Stan users mailing list" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/stan-users/B1x42n0ecQw/unsubscribe.
To unsubscribe from this group and all its topics, send an email to stan-users+...@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.



--
Tymoteusz Wołodźko
Zespół Analiz Osiągnięć Uczniów Instytut Badań Edukacyjnych | ul. Górczewska 8 | 01-180 Warszawa
tel.  +48
695 370 391 | www.ibe.edu.pl

Bob Carpenter

unread,
Jan 30, 2015, 1:30:42 PM1/30/15
to stan-...@googlegroups.com
We're always happy to host contributed models with contributors
maintaining copyright as long as the code, data, and doc is open-source
licensed in such a way that we have the rights to redistribute.

There is a stan-dev/model-examples repo which is where they would
go.

Data can be simulated data or real data, but we need permissions for
real data. We prefer the BSD license for code and CC BY for doc.

Also, if there's coding or modeling inisight to be gained from
the models, I can drop a section in the manual (with attribution,
of course!).

- Bob
> You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.

Cameron Bracken

unread,
Jan 30, 2015, 1:33:45 PM1/30/15
to stan-...@googlegroups.com

Here is the most simple example of an extremes model, fitting a univariate stationary GEV distribution.

functions {

  real gev_log (real y, real mu, real sigma, real xi){
    real z;
    z <- 1 + (y - mu) * xi / sigma;
    return -log(sigma) - (1 + 1/xi)*log(z) - pow(z,-1/xi);  
  }

}

data {
  int<lower=0> N;
  real y[N];
}

transformed data {
  real min_y;
  real max_y;
  min_y <- min(y);
  max_y <- max(y);
}
parameters {
  real xi;
  real<lower=0> sigma;
  // location has upper/lower bounds depending on the value of xi
  real<lower=if_else( xi < 0, min_y + sigma / xi, negative_infinity() ),
       upper=if_else( xi > 0, positive_infinity(), max_y + sigma / xi )> mu;
}
model {
  # priors on component parameters
  sigma ~ gamma(.0001,.0001);
  #mu ~ uniform
  xi ~ uniform(-.5,.5);

  for(i in 1:N) {
    y[i] ~ gev(mu, sigma, xi);  
  }
}

Bob Carpenter

unread,
Jan 30, 2015, 2:00:50 PM1/30/15
to stan-...@googlegroups.com
Thanks for sharing.

There's a serious problem with the model as written.
The parameter declaration:

> parameters {
> real xi;


is not consistent with the model:

> xi ~ uniform(-.5,.5);

The general rule is that any parameter value matching the constraints
(which here includes xi = 10, for example), should have support in
the model.

To fix, you need to declare:

real<lower=-0.5,upper=0.5> xi;

If you don't, what will happen is that any path leading outside
of (-0.5,0.5) will cause an immediate rejection and sampling can
devolve. You'll also get failed random initializations.

We also don't recommend those very diffuse priors that were
in the early BUGS models (or the interval constraints they switched
to later). There are a few sections in the regression chapter of
the manual that we've added recently on what we do recommend. It can
also be a huge help with computation to have less diffuse priors, even
if they're not informative enough to have a noticeable impact on the posterior.

Maybe we should add the GEV density as a built-in.

It's possible to speed up the implementation below with
some algebra by treating y as a vector in the function.
We'd eke out even more speed with a built-in version and also
potentially avoid some calculations you don't need.

The implementation below would be faster if you converted
everything to vectors, but it's not so straightorward to write
because we don't have a vector version of pow(), so I had
to define some intermediates to avoid redundant calculations.
I haven't tried to validate this does the right thing, but
it'd look something like this:

real gev_log (vector y, real mu, real sigma, real xi) {
vector[size(y)] z;
vector[size(y) + 1] lp;
real inv_xi;
real inv_xi_p1;
real neg_inv_xi;
z <- ((1 + y) - mu) * (xi / sigma);
inv_xi <- inv(xi);
inv_xi_p1 <- 1 + inv_xi;
neg_inv_xi <- -inv_xi;
for (n in 1:size(y))
lp[n] <- inv_xi_p1 * log(z[n]) + pow(z[n],neg_inv_xi);
lp[size(y) + 1] <- size(y) * log(sigma);
return -sum(lp);
}

and then it's just

y ~ gev(mu,sigma,xi);

in the model. This should be quite a bit faster than the
simpler implementation. But then it may not be worth the ugliness
if it's already fast enough.

I may pull this out as an optimization case study for the manual.

- Bob

> real z;
> z <- 1 + (y - mu) * xi / sigma;
>
> }


data {p
vector[N] y;

- Bob
> You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.

Tymoteusz Wołodźko

unread,
Jan 30, 2015, 4:55:50 PM1/30/15
to stan-...@googlegroups.com
Thank you both for interesting information and comments.

Cameron Bracken, thanks for the example. It is not that straightforward, so an example is always helpful as a place to start!

Greetings,
Tim


> You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Cameron Bracken

unread,
Jan 30, 2015, 6:58:27 PM1/30/15
to

Bob,

Thanks a ton for the feedback. I am actually using my GEV function in far more complex models so your vector implementation should help a lot. Feel free to use as you please in the manual. A built in GEV and GPD function would be fantastic.

I made some changes to the model and get an error that I am not sure how to fix:

Error in stanc(file = file, model_code = model_code, model_name = model_name,  : 
  failed to parse Stan model 'gev' with error message:
SYNTAX ERROR, MESSAGE(S) FROM PARSER:
no matches for function name="size"
    arg 0 type=vector
available function signatures for size:
0.  size(int[1]) : int
1.  size(real[1]) : int
2.  size(vector[1]) : int
3.  size(row vector[1]) : int
4.  size(matrix[1]) : int
5.  size(int[2]) : int
6.  size(real[2]) : int

...
functions {

  real gev_log (vector y, real mu, real sigma, real xi) { 
    vector[size(y)] z; 
    vector[size(y) + 1] lp; 
    real inv_xi; 
    real inv_xi_p1; 
    real neg_inv_xi; 
    z <- ((1 + y) - mu) * (xi / sigma); 
    inv_xi <- inv(xi); 
    inv_xi_p1 <- 1 + inv_xi; 
    neg_inv_xi <- -inv_xi; 
    for (n in 1:size(y)) 
      lp[n] <-  inv_xi_p1 * log(z[n]) + pow(z[n],neg_inv_xi); 
    lp[size(y) + 1] <- size(y) * log(sigma); 
    return -sum(lp); 
  } 

}

data {
  int<lower=0> N;
  real y[N];
}

transformed data {
  real min_y;
  real max_y;
  real sd_y;
  min_y <- min(y);
  max_y <- max(y);
  sd_y <- sd(y);
}
parameters {
  real<lower=-0.5,upper=0.5> xi;
  real<lower=0> sigma;
  // location has upper/lower bounds depending on the value of xi
  real<lower=if_else( xi < 0, min_y + sigma / xi, negative_infinity() ),
       upper=if_else( xi > 0, positive_infinity(), max_y + sigma / xi )> mu;
}
model {
  # priors
  sigma ~ normal(sd_y,100);
  #mu ~ uniform
  #xi ~ uniform(-.5,.5);

  y ~ gev(mu, sigma, xi);  
}

Thanks,
Cameron

> Tymoteusz Wołodźko
> Zespół Analiz Osiągnięć Uczniów Instytut Badań Edukacyjnych | ul. Górczewska 8 | 01-180 Warszawa
> tel.  +48 695 370 391 | www.ibe.edu.pl
>
>
> ​
>

Ben Goodrich

unread,
Jan 30, 2015, 10:28:49 PM1/30/15
to stan-...@googlegroups.com
On Friday, January 30, 2015 at 6:58:27 PM UTC-5, Cameron Bracken wrote:
I made some changes to the model and get an error that I am not sure how to fix:
Error in stanc(file = file, model_code = model_code, model_name = model_name,  : 
  failed to parse Stan model 'gev' with error message:
SYNTAX ERROR, MESSAGE(S) FROM PARSER:
no matches for function name="size"
    arg 0 type=vector
available function signatures for size:
0.  size(int[1]) : int
1.  size(real[1]) : int
2.  size(vector[1]) : int
3.  size(row vector[1]) : int
4.  size(matrix[1]) : int
5.  size(int[2]) : int
6.  size(real[2]) : int

The size function returns the number of elements in an array. For a vector, I think you need to instead use

rows(y)

Ben

Bob Carpenter

unread,
Feb 1, 2015, 5:31:44 PM2/1/15
to stan-...@googlegroups.com
Sorry, I didn’t compile to check these kinds of
errors.

We haven’t implemented size() for vectors yet.
You’ll need to use cols() or num_elements().

- Bob
Reply all
Reply to author
Forward
0 new messages