Hello, everyone,
I am a user for Rstan, Right now I encounter a problem regarding the Copula Bayesian estimation using Stan.
I try to estimate the right tail dependence between monthly
losses to buildings (x) and losses to tenancies(y), the dataset is attached as data.csv (N
is the total number of data, which is 132). And I select Gumbel copula to model
the right tail dependence.
The Gumbel formula is as belows:
So I try to estimate lamda L in the above table.
For the Gumbel and Clayton copula density, I searched on the internet,
Clayton copula density:
Gumbel copula density:
here both u and v are uniform distributions at [0,1], and delta is the parameter to be estimated (in my model I denote as "a").
Then I saw Ben's post at https://groups.google.com/forum/#!msg/stan-dev/E953tjdtg44/SRCUSHPB92YJ talking about the Clayton copula, and this gave me some intuition for writing the stan model of Gumbel.
In my own model, I think the only
difference from Ben's post model is the gumbel density function (I believe is
correct). The problem is, the final result for lamda L is extremely large, like
0.98, which is not reasonable.
Then I try to compare my Gumbel copula model and Clayton copula model, for the estimated
Kendall’s tau. Because this is the common parameter that these two models share
(no right tails dependence for Clayton and no left tail dependence for
Gumbel).
As a reference, the kendalls’s tau directly calculated from dataset is 0.286.
My gumbel model (a is the parameter in copula,
and tau is the kendall’s tau. I assume lognormal marginal
distribution and non-informative prior for tau):
modelcode="
data {
int<lower=0> N;
vector[N] x;
vector[N] y;
}
parameters{
real mu1;
real mu2;
real<lower=0> sigma1;
real<lower=0> sigma2;
real<lower=-1,upper=1> tau;
}
transformed parameters{
real<lower=1> a;
a=1 / (1 - tau);
}
model{
//priors
x~lognormal(mu1,sigma1);
y~lognormal(mu2,sigma2);
//likelihood
if(tau!=1){
real p[2];
p[1]= lognormal_cdf(x,mu1,sigma1);
p[2]= lognormal_cdf(y,mu2,sigma2);
target+=-((-log(p[1]))^a+(-log(p[2]))^a)^(1/a)-log(p[1])-log(p[2])+(-2+2/a)*log((-log(p[1]))^a+(-log(p[2]))^a)+(a-1)*log(log(p[1])*log(p[2]))+log1p((a-1)*((-log(p[1]))^a+(-log(p[2]))^a)^(-1/a));
}
}
"
fit=stan(model_code=modelcode,data=dat,iter=12000,warmup=2000,thin=10,chains=3)
print(fit)
result after 3000 post-warmup draws:
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
mu1 3.55 0.00 0.04 3.48 3.53 3.55 3.58 3.63 2642 1
mu2 3.21 0.00 0.06 3.10 3.17 3.21 3.25 3.33 2487 1
sigma1 0.40 0.00 0.02 0.35 0.38 0.39 0.41 0.45 2724 1
sigma2 0.62 0.00 0.04 0.55 0.59 0.61 0.64 0.69 2758 1
tau 0.99 0.00 0.01 0.97 0.98 0.99 0.99 0.99 2537 1
a 80.01 0.53 26.36 31.63 58.95 80.43 101.40 124.81 2483 1
lp__ 173.33 0.03 1.67 169.19 172.40 173.67 174.57 175.60 2642 1
As you can see, the kendall's tau is 0.99, and the parameter a (or "delta" in the above formula) is too high, not reasonable!
But if using Clayton model(I also attached the model below, almost same with the Gumbel model above, except the log density) is 0.39, which seems reasonable.
The clayton copula model:
modelcode="
data {
int<lower=0> N;
vector[N] x;
vector[N] y;
}
parameters{
real mu1;
real mu2;
real<lower=0> sigma1;
real<lower=0> sigma2;
real<lower=-1,upper=1> tau;
}
transformed parameters{
real<lower=1> a;
a=2 * tau / (1 - tau);
}
model{
// assume non-informative priors for tau
x~lognormal(mu1,sigma1);
y~lognormal(mu2,sigma2);
//likelihood
if(tau!=0){
real p[2];
p[1]= lognormal_cdf(x,mu1,sigma1);
p[2]= lognormal_cdf(y,mu2,sigma2);
target += log1p(a) - (a + 1) * log(p[1] * p[2]) - (2 * a + 1) / a * log(p[1] ^ -a + p[2] ^ -a - 1);
}
}
"
So it seems the tau getting from my Gumbel model is obviously not
correct, I don’t know the reason and have been confused by days. Someone
may say the Gumbel copula density is not correct? But I have checked it
multiple times and couldn't see anything wrong.
Could
you guys help me have a look?
Thanks a lot,
Jason Zhang
I am a user for Rstan, Right now I encounter a problem regarding the Copula Bayesian estimation using Stan.
excellent. I translated also the normal-copula, the frank and have the student-t copula on my list.We should put all this really helpful things to a github.. If I do so and keep everythingunder BSD or whatever license, do you allow to store your code examples with reference ofthe owener(s) as well?
/* Gumbel copula log density
*
* @param u Real number on (0,1], not checked but function will return NaN
* @param v Real number on (0,1], not checked but function will return NaN
* @param theta Real number >= 1, will throw otherwise
* @param log density
*/
real gumbel_copula(real u, real v, real theta) {
real neg_log_u = -log(u); real log_neg_log_u = log(neg_log_u);
real neg_log_v = -log(v); real log_neg_log_v = log(neg_log_v);
real log_temp = log_sum_exp(theta * log_neg_log_u,
theta * log_neg_log_v);
real theta_m1 = theta - 1;
if (theta < 1) reject("theta must be >= 1");
if (is_inf(theta)) {
if (u == v) return 0;
else return negative_infinity();
}
return theta_m1 * log_neg_log_u + theta_m1 * log_neg_log_v
+ neg_log_u + neg_log_v - exp(log_temp / theta)
+ log_sum_exp( 2 * theta_m1 / -theta * log_temp, log(theta_m1) +
(1 - 2 * theta) / theta * log_temp);
}
real normal_copula(real u, real v, real rho) { return (0.5*rho*(-2. *u*v + 1. *square(u)*rho + 1. *square(v)*rho)) / (-1. + square(rho)) - 0.5*log(1 - square(rho)); }
real clayton_copula(real u, real v, real theta) { if(theta == 0.0) reject("clayton_copula: theta must != 0"); return log1p(theta) - (theta + 1) * (log(u) + log(v)) - (1 + 2 * theta) / theta * log(pow(u, -theta) + pow(v, -theta) - 1); } real frank_copula(real u, real v, real theta) { return log(theta) + log1m(exp(-theta))-theta*(u+v) - 2* log(1 - exp(-theta) - (1-exp(-theta*u))*(1-exp(-theta*v))); }
library(copula)
myCop <- archmCopula(family = "frank", dim = 2, param =.2)
u<-c(0.7088403,0.05479929)
dCopula(u,myCop)
[1] 0.9622948