multi_student_t with different dfs per variate?

100 views
Skip to first unread message

Mike Lawrence

unread,
Nov 8, 2016, 7:20:27 PM11/8/16
to stan-...@googlegroups.com
multi_student_t() has been suggested for a more robust model of multivariate data, and I can imagine doing explicit inference on the degrees of freedom parameter, but if doing so, I imagine it might be the case that different dimensions of the multivariate might necessitate different values for the df. Obviously this is beyond the design of multi_student_t(), but could it be hacked together from the relation between normal(), chi-square() and student_t()? 

As I understand it, if X is normal(0,1) and Z is chi_square(2), then X/sqrt(Z/2) is student_t(2,0,1). So I imagine a data-generating model as:

  data{
    int K ;
    int N ;
    vector[K] mu ;
    vector<lower=0>[K] sigma ;
    vector<lower=0>[K] log_df ; #note log-df instead of df 
    corr_matrix[K] rho ;
  }
  transformed data{
    matrix[K,K] covmat ;
    covmat = quad_form_diag(rho,sigma) ;
  }
  parameters{
    matrix[N,K] nrml ;
    matrix[N,K] chisq ;
  }
  model{
    for(i in 1:N){
      nrml[i] ~ multi_normal(mu, covmat) ;
      for(j in 1:K){
        chisq[i,j] ~ chi_sq( exp( log_df[j] ) ) ;
      }
    }
  }
  generated quantities{
    matrix[N,K] Y ;
    for(i in 1:N){
      for(j in 1:K){
        Y[i,j] = nrml[i,j] / sqrt( chisq[i,j] / exp( log_df[j] ) )
      }
    }
  }

But I'm unsure how to convert this into a model that achieves inference rather than data generation. That is, starting with something like:

  data{
    int K ;
    int N ;
    matrix[N,K] Y ;
  }
  parameters{
    vector[K] mu ;
    vector<lower=0>[K] sigma ;
    vector<lower=0>[K] log_df ;
    corr_matrix[K] rho ;
   #possibly other parameters? nrml? chisq?
  }
  transformed parameters{
    matrix[K,K] covmat ;
    covmat = quad_form_diag(rho,sigma) ;
 ...

Thoughts?

Mike

Bob Carpenter

unread,
Nov 9, 2016, 10:59:44 AM11/9/16
to stan-...@googlegroups.com
You only ever need to code the density you care about in Stan
and declare the known variables as data and unknowns as parameters.
Just make sure you add appropriate Jacobians for the transforms.

- Bob
> --
> You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
> To post to this group, send email to stan-...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Mike Lawrence

unread,
Nov 9, 2016, 11:14:22 AM11/9/16
to stan-...@googlegroups.com
Yes, but I believe the density I describe (multi_student_t with different dfs per dimension) isn't built into Stan, and I'm unsure how to code it. I feel like it can be done by combining multi_normal with chi-square, but am not expert enough to see precisely how.



> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+unsubscribe@googlegroups.com.

> To post to this group, send email to stan-...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+unsubscribe@googlegroups.com.

Mike Lawrence

unread,
Nov 9, 2016, 11:37:24 AM11/9/16
to stan-...@googlegroups.com
To make things clearer, if there were a function multi_student_t_2() that accepted a vector for the df rather than a single value, then I would simply do:

  data{
    int K ;
    int N ;
    matrix[N,K] Y ;
  }
  parameters{
    vector[K] mu ;
    vector<lower=0>[K] sigma ;
    vector<lower=0>[K] log_df ;
    corr_matrix[K] rho ;
    matrix[N,K] chisq ;
  }
  transformed parameters{
    matrix[K,K] covmat ;
    covmat = quad_form_diag(rho,sigma) ;
  }
  model{
    #priors (scaled-data assumed)
    mu ~ normal(0,1) ;
    sigma ~ weibull(2,1) ;
    log_df ~ cauchy(0,1) ;
    rho ~ lkj_corr(1) ;
    #pdf for multivariate student-t with df per dim
    for(i in 1:N){
      Y[i] ~ multi_student_t_2(exp(log_df),mu, covmat) ;
    }
  }


Without the multi_student_t_2() function, this is what I came up with just now, but I'm not sure if I need to do any Jacobian adjustments (or how to go about computing that in the first place):

  data{
    int K ;
    int N ;
    matrix[N,K] Y ;
  }
  parameters{
    vector[K] mu ;
    vector<lower=0>[K] sigma ;
    vector<lower=0>[K] log_df ;
    corr_matrix[K] rho ;
    matrix[N,K] chisq ;
  }
  transformed parameters{
    matrix[K,K] covmat ;
    matrix[N,K] nrml ;
    covmat = quad_form_diag(rho,sigma) ;
    #assert that nrml is a function of Y, chisq & log_df
    for(i in 1:N){
      for(j in 1:K){
        nrml[i,j] = Y[i,j] * sqrt( chisq[i,j] / exp( log_df[j] ) ) ;
      }
    }
  }
  model{
    #priors (scaled-data assumed)
    mu ~ normal(0,1) ;
    sigma ~ weibull(2,1) ;
    log_df ~ cauchy(0,1) ;
    rho ~ lkj_corr(1) ;
    #pdf for nrml & chisq
    for(i in 1:N){
      nrml[i] ~ multi_normal(mu, covmat) ;
      for(j in 1:K){
        chisq[i,j] ~ chi_square( exp( log_df[j] ) ) ;
      }
    }
  }



Thoughts?

Bob Carpenter

unread,
Nov 9, 2016, 4:08:40 PM11/9/16
to stan-...@googlegroups.com
Just write that function then write your program the way you
did. If you write it with the _lpdf suffix, you'll even
be able to use the sampling notation with it.

- Bob
> --
> You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.

Mike Lawrence

unread,
Dec 7, 2016, 10:47:24 PM12/7/16
to Stan users mailing list
Bob just posted some code in another thread that I think helps me here. Is it correct that the way to have a multivariate student-t with different degrees of freedom for each axis, I would do (the lines I've changed are commented out with my elaboration immediately below):

data {
  int<lower=1> p;
  vector[p] mu;
  cholesky_factor_cov[p,p] L;
  //real<lower=0> nu;
  vector<lower=0>[p] nu;
}
parameters {
  vector[p] z;
  //real<lower=0> u;
  vector<lower=0>[p] u;
}
transformed parameters {
  vector[p] x;
  x = mu + sqrt(nu / u) * (L * z); // distributed multi_student_t
}
model {
  target += normal_lpdf(z | 0, 1);
  target += chi_square_lpdf(u | nu);
}

Mike Lawrence

unread,
Feb 22, 2017, 4:53:57 PM2/22/17
to stan-...@googlegroups.com
Re-posting this question as it just came up for me again:

Bob Carpenter

unread,
Feb 22, 2017, 5:06:27 PM2/22/17
to stan-...@googlegroups.com
I think you need elementwise operations, but otherwise, yes.

In the simple case where you want a bunch of normals with different
means mu and scales sigma, you'd do:

vector[N] mu;
vector<lower=0>[N] sigma;

y_raw ~ normal(0, 1);
y = mu + sigma .* y_raw;

Note the ".*" for the elementwise operation.

- Bob
Reply all
Reply to author
Forward
0 new messages