confused about cov matrix


Andrew Gelman

Aug 4, 2016, 12:43:01 PM
to Stan users mailing list
Hi all. I fit the following model:

data {
  int N;
  matrix[N,2] y;
}
parameters {
  vector[2] mu;
  cov_matrix[2] Sigma;
  matrix<lower=-0.5, upper=0.5>[N,2] round_err;
}
transformed parameters {
  matrix[N,2] z;
  z = y - round_err;
}
model {
  for (n in 1:N)
    z[n,] ~ multi_normal(mu, Sigma);
}

I fit it to a dataset with N=185 data points (i.e., y was a 185 x 2 matrix).

Each chain gave a message kinda like this:

The following numerical problems occured the indicated number of times after warmup on chain 4
count
Exception thrown at line 16: multi_normal_log: Covariance matrix is not symmetric. Covariance matrix 14
Exception thrown at line 16: multi_normal_log: LDLT_Factor of covariance parameter is not positive d 2
When a numerical problem occurs, the Hamiltonian proposal gets rejected.
If the number in the 'count' column is small, do not ask about this message on stan-users.

--

I'm not sure if 14 is a small number, so I decided it was OK for me to ask about this message here! The thing that's puzzling me is that Sigma is declared as a cov_matrix, so shouldn't it automatically be symmetric and positive definite? Or should I be using a different parameterization? This seems like a problem.
A

P.S. The model seemed to fit just fine.

Bob Carpenter

Aug 4, 2016, 12:55:10 PM
to stan-...@googlegroups.com
Yes, that's a problem. I'd thought we'd forced symmetry
on the inverse transform.

And yes, you want an alternative parameterization in
terms of Cholesky factors, and to use multi_normal_cholesky.
You can use a scaled correlation-matrix Cholesky factor
or just use the covariance-matrix Cholesky factor,
given that you don't have a prior (we don't have a
Cholesky-parameterized Wishart).

It's much much more efficient to vectorize the multi_normal
so that the matrix only has to be factored once (which is already
going to be more efficient using the Cholesky base type). So that'd
look like this:

vector<lower=-0.5, upper=0.5>[2] round_err[N];

vector[2] z[N];
for (n in 1:N)
  z[n] = y[n] - round_err[n];

z ~ multi_normal_cholesky(mu, L_Sigma);

When we vectorize the binary operators like -, you won't need
to loop over n in the transformed parameters.
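
Putting those pieces together, a full model might look like the sketch below. This assumes y is redeclared as an array of vectors (so it can be indexed as y[n]) and that the Cholesky factor is declared with the cholesky_factor_cov type; it is an illustration of Bob's suggestion, not a tested program from the thread.

```stan
data {
  int<lower=0> N;
  vector[2] y[N];    // assumed: array of vectors instead of matrix[N,2]
}
parameters {
  vector[2] mu;
  cholesky_factor_cov[2] L_Sigma;    // Cholesky factor of covariance
  vector<lower=-0.5, upper=0.5>[2] round_err[N];
}
transformed parameters {
  vector[2] z[N];
  for (n in 1:N)
    z[n] = y[n] - round_err[n];
}
model {
  // vectorized: Sigma's factor is reused across all N observations
  z ~ multi_normal_cholesky(mu, L_Sigma);
}
```

Because the sampler works directly on the Cholesky factor, positive definiteness holds by construction and the factorization is never redone inside the likelihood.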

- Bob

Michael Betancourt

Aug 4, 2016, 7:40:07 PM
to stan-...@googlegroups.com
We can force symmetry, but you can still lose positive-definiteness
with extreme correlations. Given that there is no prior on Sigma, this
isn't necessarily surprising.

Andrew Gelman

Aug 4, 2016, 8:06:16 PM
to stan-...@googlegroups.com
Hmm, if this is the recommended way, maybe we should make it the default?
A

Andrew Gelman

Aug 4, 2016, 9:15:25 PM
to stan-...@googlegroups.com
Michael: There's no prior on Sigma here but we have tons of data. So 14 warnings is kind of a problem.

Also, Bob: Is there an intuition for why I'm supposed to use vector[2] z[N]; rather than the (to me) more natural matrix[N,2] z;?


Bob Carpenter

Aug 4, 2016, 10:04:18 PM
to stan-...@googlegroups.com
Only because we don't have matrix ~ multi_normal...
implemented. We just have containers (arrays) of
vectors.

- Bob

Michael Betancourt

Aug 5, 2016, 4:22:17 AM
to stan-...@googlegroups.com
Look at the traces — covariance matrices are tricky
and you can get weird non-identifiabilities that admit
near singular matrices (which end up singular
numerically) even with tons of data. Asymptotics are
slooooooooow and even a little bit of prior information
to regularize those extreme matrices can make huge
improvements.
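
One common way to add that bit of regularization (a sketch of the general technique, not Michael's specific recommendation; the hyperparameters here are illustrative) is to split the covariance into per-dimension scales and a correlation Cholesky factor with an LKJ prior:

```stan
data {
  int<lower=0> N;
  vector[2] y[N];
}
parameters {
  vector[2] mu;
  cholesky_factor_corr[2] L_Omega;   // Cholesky factor of a correlation matrix
  vector<lower=0>[2] tau;            // per-dimension scales
  vector<lower=-0.5, upper=0.5>[2] round_err[N];
}
transformed parameters {
  vector[2] z[N];
  for (n in 1:N)
    z[n] = y[n] - round_err[n];
}
model {
  L_Omega ~ lkj_corr_cholesky(2);    // shape > 1 shrinks away from |rho| = 1
  tau ~ cauchy(0, 2.5);              // illustrative weakly informative scale prior
  z ~ multi_normal_cholesky(mu, diag_pre_multiply(tau, L_Omega));
}
```

The LKJ prior with shape greater than 1 pulls the correlation away from the extreme values that produce numerically singular matrices, which is exactly the regularization being suggested.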

Andrew Gelman

Aug 5, 2016, 6:08:03 PM
to stan-...@googlegroups.com
If we have a covariance matrix type, I'd think Stan should be able to ensure that it's symmetric--that's a pretty trivial condition, no?

Ben Goodrich

Aug 5, 2016, 6:34:42 PM
to Stan users mailing list, gel...@stat.columbia.edu
On Friday, August 5, 2016 at 6:08:03 PM UTC-4, Andrew Gelman wrote:
If we have a covariance matrix type, I'd think Stan should be able to ensure that it's symmetric--that's a pretty trivial condition, no?

It was probably triggered by nan != nan.


Bob Carpenter

Aug 6, 2016, 2:50:40 PM
to stan-...@googlegroups.com
Yes, as Michael says, it's easy to make symmetric.
I'll fix the constraining transform to ensure the result
is symmetric.

But we can't make it positive definite easily without
some kind of regularization. Which is why I'm assuming
Michael's suggesting a prior.

- Bob

Ben Goodrich

Aug 6, 2016, 3:13:03 PM
to Stan users mailing list
On Saturday, August 6, 2016 at 2:50:40 PM UTC-4, Bob Carpenter wrote:
Yes, as Michael says, it's easy to make symmetric.
I'll fix the constraining transform to ensure the result
is symmetric.

It was always supposed to be symmetric (except for nan), but for some reason one of the versions of cov_matrix_constrain() only calls multiply_lower_tri_self_transpose() in a comment.

Ben
