Fit Parameters of Bivariate Normal Probability


Jonathan Knight

Jun 23, 2014, 10:52:05 AM
to stan-...@googlegroups.com
Dear Stan List

I am trying to fit a bivariate normal distribution and get back the covariance matrix from two input values (x) and a density value (y). I must be missing something obvious, as I cannot get this to work. I want to be able to extend this to more dimensions once I get the bivariate case working. Any help would be appreciated.

My code is as follows:

library(mvtnorm)
library(rstan)

x <- array(rnorm(n=200),dim=c(100,2)) # Though could pick any distribution for x's
y <- dmvnorm(x=x,mean=c(0,0),sigma=matrix(c(1,0,0,1),ncol=2)) # Density of each x under a standard bivariate normal

test_data <- list(N=100,x=x ,y=y)
test_code <- '
data {
  int<lower=1> N;
  vector[2] x[N];
  vector[N] y;
}
transformed data{
  vector[2] sigma;
  vector[2] mu;
  for (i in 1:2){ 
    sigma[i] <- sd(x[i]);
    mu[i] <- mean(x[i]);
  } 
}
parameters{
  real<lower = -0.99,upper=0.99> tau;
}
transformed parameters{
  cov_matrix[2] Sigma;
  Sigma[1,1] <- sigma[1]*sigma[1];
  Sigma[2,2] <- sigma[2]*sigma[2];
  Sigma[1,2] <- sigma[1]*sigma[2]*tau;
  Sigma[2,1] <- sigma[1]*sigma[2]*tau;
}
model {
    y ~ multi_normal(mu,Sigma);
}
'
fit <- stan(model_code = test_code, data = test_data,iter = 1000, chains = 4)
print(fit)


I get the following errors:
Error in function stan::prob::multi_normal_log(l): Size of random variable (100) and size of location parameter (2) must match in size. Rejecting proposed initial value with zero density.

which I think is because multi_normal is expecting vectors the size of the sample. So how can I get the multi_normal function to recognise that I want the mean and covariances of all the data?


Luc Coffeng

Jun 23, 2014, 12:04:33 PM
to
What exactly are you trying to do? My first intuition would be that in the model block you should evaluate x (the random variable) rather than y (a density). Keep in mind that "sampling" statements in the model block (e.g. x ~ multi_normal(mu,Sigma)) simply add the log probability/density of (x | mu,Sigma) to the log posterior (leaving out any constants).
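To illustrate the point about sampling statements, the two forms below add the same term to the log posterior (a sketch in Stan 2.x syntax of the time, not compiled; it assumes x is declared as `vector[2] x[N]`):

```stan
model {
  // The sampling-statement form drops constant terms:
  for (n in 1:N)
    x[n] ~ multi_normal(mu, Sigma);
  // The explicit form keeps them:
  // for (n in 1:N)
  //   increment_log_prob(multi_normal_log(x[n], mu, Sigma));
}
```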

Why do you need the density y?

Also, if you want to extend the dimensions of this model, I recommend you take a look at some of the native parameter types that could make your life a lot easier (correlation matrices and Cholesky factors, starting with the first). The simple constraint on rho that you have now will only work in the two-dimensional case; for higher dimensions, this approach will not guarantee that the covariance matrix is positive definite and will lead to a lot of Metropolis rejections and inefficient sampling. Cholesky factors have the added benefit of reducing the computational effort (no inversion of the full covariance matrix is required).
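As a rough sketch (not compiled), a Cholesky-factor parameterization along these lines would look like this for D dimensions, assuming data declared as `vector[D] x[N]`:

```stan
parameters {
  vector[D] mu;
  vector<lower=0>[D] sigma;         // scales for each dimension
  cholesky_factor_corr[D] L_Omega;  // Cholesky factor of a correlation matrix
}
model {
  matrix[D,D] L_Sigma;
  L_Sigma <- diag_pre_multiply(sigma, L_Omega);  // Cholesky factor of covariance
  L_Omega ~ lkj_corr_cholesky(2);   // concentrates prior mass near the identity
  for (n in 1:N)
    x[n] ~ multi_normal_cholesky(mu, L_Sigma);
}
```

This keeps the covariance matrix positive definite by construction for any D, and never forms or inverts the full covariance matrix.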


Luc

Bob Carpenter

Jun 23, 2014, 1:10:59 PM
to stan-...@googlegroups.com
I was also confused by the language, but I
think you want something like this (I didn't compile it
to make sure it works, but the logic's right):

data {
  int<lower=0> N;           // number of observations
  vector[2] y[N];           // N bivariate observations
}
parameters {
  vector[2] mu;                   // mean to fit
  vector<lower=0>[2] sigma;       // scales to fit
  real<lower=-1,upper=1> rho;     // correlation to fit
}
transformed parameters {
  cov_matrix[2] Sigma;
  Sigma[1,1] <- square(sigma[1]);
  Sigma[2,2] <- square(sigma[2]);
  Sigma[1,2] <- rho * sigma[1] * sigma[2];
  Sigma[2,1] <- Sigma[1,2];
}
model {
  for (n in 1:N)
    y[n] ~ multi_normal(mu,Sigma);
}

This model has improper priors for mu and sigma.

Alternatively, you could remove sigma and rho and move
the declaration of Sigma to parameters and just fit it directly.
But as Luc pointed out in his reply, that's not the most efficient
thing to do.
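That alternative would look something like this (again just a sketch, not compiled):

```stan
parameters {
  vector[2] mu;
  cov_matrix[2] Sigma;    // fit the covariance matrix directly
}
model {
  for (n in 1:N)
    y[n] ~ multi_normal(mu, Sigma);
}
```

The cov_matrix type constrains Sigma to be positive definite, but the transform behind it is more expensive than sampling scales and a correlation directly.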

As of release 2.3 (out any minute now), we have vectorized
multi-normal so you should be able to get away with a model
that's just

model {
  y ~ multi_normal(mu,Sigma);
}

- Bob



Jonathan Knight

Jun 24, 2014, 6:18:50 AM
to stan-...@googlegroups.com
Thanks for your comments; as I read them I realised that I wasn't thinking about the problem correctly. I was also reading the Stan 2.3 manual rather than the 2.2 manual, which didn't help! Here is the model (for 2.2) I have ended up with. I wasn't able to work out the Cholesky factorisation, so I fixed the priors on the mean and covariance. Any further comments would be appreciated.

library(mvtnorm)
library(rstan)

x <- rmvnorm(n=100,mean=c(2,4),sigma=matrix(c(1,0,0,2),ncol=2))
y <- dmvnorm(x=x,mean=c(0,0),sigma=matrix(c(1,0,0,1),ncol=2))

test_data <- list(N=100,D=2,x=x ,y=y)

test_code <- '
data {
  int<lower=1> N;
  int<lower=1> D;
  vector[D] x[N];
  vector[N] y;
}
transformed data {
  cov_matrix[D] S;
  for (i in 1:D)
    for (j in 1:D)
      S[i,j] <- if_else(i==j,1.5,0);
}
parameters{
  cov_matrix[D] Sigma;
  vector[D] mu;
}
model {
  for (i in 1:D)
    mu[i] ~ normal(2,10);
  Sigma ~ inv_wishart(D,S);
  for (i in 1:N)
    x[i] ~ multi_normal(mu,Sigma);
}
'

Bob Carpenter

Jun 24, 2014, 7:18:44 AM
to stan-...@googlegroups.com
Replace this:

> for (i in 1:N)
> x[i] ~ multi_normal(mu,Sigma);

with:

{
  matrix[D,D] L;
  L <- cholesky_decompose(Sigma);
  for (n in 1:N)
    x[n] ~ multi_normal_cholesky(mu,L);
}

Introducing new braces gives you a scope for a local
variable; alternatively, you can declare L at the top of
the model block.

But you might also want to look at our recommendations for priors
in the manual (2.2 or 2.3). Unlike BUGS and JAGS, we're not
restricted to conjugate priors for multivariates and we think the
scaled LKJ prior is more natural (lets you concentrate prior mass
near a diagonal correlation matrix and independently put priors on
the scales of each dimension).

- Bob
