Copula Marshall Olkin, Multivariate Student-t

已查看 410 次
跳至第一个未读帖子

andre.p...@googlemail.com

未读,
2016年12月7日 21:11:352016/12/7
收件人 Stan users mailing list
Hi,

I've had convergence problems with the multivariate student-t distribution, so I came along and found this
beauti


which lead me to think about Copulas in STAN. 

1.
Let's consider the gumbel copula, since some Frank, Joe,...  require discrete distributions.

     case 'gumbel'
        % uses Marshall-Olkin's method
        if theta < 1
            error('archrnd:BadGumbelParameter', ...
                  'THETA must be greater than or equal to 1 for the Gumbel copula.');
        end
        if theta == 1
            y = rand(m,n); %independence
        else
            %uses positive, scaled by (cos(pi/2/theta))^theta, stable distribution
            V = rand(m,1)*pi-pi/2;
            W = -log(rand(m,1));
            T = V + pi/2;
            gamma = sin(T/theta).^(1/theta).*cos(V).^(-1).*((cos(V-T/theta))./W).^(1-1/theta);
            y = exp( - (-log(rand(m,n))).^(1/theta)./(gamma*ones(1,n))  );
        end

Considering the 'else' branch:

This is easy to implement in STAN. Note m is the sample size. Thus we would only need
to define two uniform variables for V, W and n parameters, normal(0,1) distributed 
and take its CDF, since rand(m,n) is in [0,1]. 

2. 
Multivariate Student-t

        corrtest(theta); %test if R is a correlation matrix.
        s = chi2rnd(nu,m,1);
        y = tcdf(randn(m,n)*chol(theta).*( sqrt(nu./s)*ones(1,n) ),nu);

This is a 'MATT'-trick decomposition the multivariate student-t distribution to normal(0, 1)
ones. 

3. Numerical solutions 'fixed-Talbot'

Would a implementation of numerical algorithm in STAN like the fixed talbot
to overcome the restrictions of not only discrete parameterizations, but also
allow other than Archimedean copulas?

----

Does somebody have experiences with this already? Comments are
welcome.

Thanks so much,
Andre 

https://www.case.hu-berlin.de/events/events/Archive/Workshops/Copulae/talks/Hofert.pdf


Ben Goodrich

未读,
2016年12月7日 21:39:452016/12/7
收件人 Stan users mailing list
On Wednesday, December 7, 2016 at 9:11:35 PM UTC-5, andre.p...@googlemail.com wrote:
I've had convergence problems with the multivariate student-t distribution

Sampling from a multivariate student t with low degrees of freedom is known to be problematic with Stan. For all elliptical distributions, it is better to utilize the stochastic representation:

https://en.wikipedia.org/wiki/Multivariate_t-distribution#Definition

and make the random variable of interest be a transformed parameter. Like this:

data {
  int<lower=1> p;
  vector[p] mu;
  cholesky_factor_cov[p,p] L;
  real<lower=0> nu;
}
parameters {
  vector[p] z;
  real<lower=0> u;
}
transformed parameters {
  vector[p] x;
  x = mu + sqrt(nu / u) * (L * z); // distributed multi_student_t
}
model {
  target += normal_lpdf(z | 0, 1);
  target += chi_square_lpdf(u | nu);
}

So, you don't really need to utilize a copula for this but copulas would be interesting for other reasons. We haven't received (or sought) funding to implement them in Stan.

Ben

andre.p...@googlemail.com

未读,
2016年12月8日 01:08:562016/12/8
收件人 Stan users mailing list
Ben, 

where I can vote for your code to put in the manual?

Since my last past, I used your suggestion and defined 
for nu:

real<lower=2> nu;
...
(nu - 2) ~ exponential(0.2);

a suggestion from Andrew Gelman in some other post. It works
well. n_eff ~ 50% for nu as it is a big model with 10^4 parameters.

I'll try the copulas, not because there are not need, but I hope
to tame the variance more, since my parameters are mostly latent.

Andre

Bob Carpenter

未读,
2016年12月8日 01:37:122016/12/8
收件人 stan-...@googlegroups.com
The best place to put any request that is concrete
enough to be implemented is in the issue tracker for
the appropriate repo. For the manual, there's always
an open issue named with the previous release++ and
that's the best place to put this kind of request:

https://github.com/stan-dev/stan/issues/2122

There's also a longer term manual issue with longer
term or bigger project requests.

- Bob
> --
> You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
> To post to this group, send email to stan-...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

andre.p...@googlemail.com

未读,
2016年12月8日 10:22:412016/12/8
收件人 Stan users mailing list
I coded the nested gumbel copula and provide as an example. 
If I use  

 target += normal_lcdf(y | 0, 1);

I assumed I'll get y in [0,1], but y goes just to very high values.
So we have to use: y ~ normal(0, 1); and maybe Phi_approx(y).

The most things are straight forward from:


The operator^ is not vectorized yet, thus I didn't put effort to
make most speed out of it. 

Comments welcome,
Andre
copula.stan
copula.R
n_lpdf.R
n_lpdf.stan

Ben Goodrich

未读,
2016年12月8日 12:22:472016/12/8
收件人 Stan users mailing list
On Thursday, December 8, 2016 at 10:22:41 AM UTC-5, andre.p...@googlemail.com wrote:
If I use  

 target += normal_lcdf(y | 0, 1);

I assumed I'll get y in [0,1], but y goes just to very high values.

You need to declare y as

parameters {
  real<lower=0,upper=1> y;
}

Ben

回复全部
回复作者
转发
0 个新帖子