non positive-definite covariance matrices from sdreport()

Aaron Osgood-Zimmerman

unread,

Mar 23, 2017, 5:10:26 PM3/23/17

to TMB Users

Hi all -

We're working on fitting SPDE space (and time) models in TMB. We've done a bunch of comparisons against INLA and we're generally seeing pretty comparable modeling results. TMB also seems to have much faster fit times for large datasets.

I recently changed our code to allow for datapoints that aren't at mesh vertices. The results look good when the model fits correctly but we've been running into an issue where the covariance matrix from sdreport is not positive-definite (even after rounding). When the datpoints were constrained to be at mesh vertices we never ran into this issue.

To be explicit, these lines:

obj <- MakeADFun(...)

SD0 <- sdreport(obj, getReportCovariance=TRUE)

sigma <- SD0$cov

return a non-pos-def sigma. For larger simulated datasets this happens less often and the matrix isn't that far from pos-def (I can add a small number to the diagonal to make it pos-def). For smaller datasets this happens more often and the matrix may be quite far from pos-def (I may have to add diag(>5, nrow(sigma)) to sigma to make it PD).

Any suggestions/ideas/tips/advice on what might be going on and how to ensure that our VCOV matrix is PD?

We generate sigma to take joint draws from the fixed and random effects in order to simulate predictive maps and make inferences based on the draws.

Much thanks for all your awesome work in making this package available and for thinking about our question.

Cheers,
Aaron

Hans Skaug

unread,

Mar 24, 2017, 3:44:46 AM3/24/17

to TMB Users

Hi,

We're working on fitting SPDE space (and time) models in TMB. We've done a bunch of comparisons against INLA and we're generally seeing pretty comparable modeling results. TMB also seems to have much faster fit times for large datasets.

TMB-SPDE should give very similar results to INLA, modulo numerical imprecision in the optimization and the uninformative priors used by INLA (which I think can be turned off by some option).

I recently changed our code to allow for datapoints that aren't at mesh vertices. The results look good when the model fits correctly but we've been running into an issue where the covariance matrix from sdreport is not positive-definite (even after rounding). When the datpoints were constrained to be at mesh vertices we never ran into this issue.

To be explicit, these lines:

obj <- MakeADFun(...)
SD0 <- sdreport(obj, getReportCovariance=TRUE)
sigma <- SD0$cov

return a non-pos-def sigma. For larger simulated datasets this happens less often and the matrix isn't that far from pos-def (I can add a small number to the diagonal to make it pos-def). For smaller datasets this happens more often and the matrix may be quite far from pos-def (I may have to add diag(>5, nrow(sigma)) to sigma to make it PD).

Any suggestions/ideas/tips/advice on what might be going on and how to ensure that our VCOV matrix is PD?

Usually you would include a mesh point for each observation and points where you want to make a prediction.

I am guessing that the reason you are not doing this is that you have a very large number of observations.

Further I assume that you interpolate between three mesh points for an observation lying in between.

I have never tried this, but I have always assumed it would be possible, but you seem to have found that there

are problems. Could you provide a small example code where the problem occurs?

Hans

Cole Monnahan

unread,

Mar 24, 2017, 10:59:01 AM3/24/17

to Hans Skaug, TMB Users

For what it's worth, I fit a SPDE model in TMB that had about 100,000 data points and 2000 mesh knots. I fit it to several versions of the model (structures, likelihoods, etc.) and it always had a PD vcov. I didn't do any explicit extrapolation between mesh points, but rather associated each point with a cell (cluster?). I think the INLA tools do this automatically if I remember correctly. Anyways, just to say this has been done and can work. I can dig up details if it helps.... -Cole

--
To post to this group, send email to us...@tmb-project.org. Before posting, please check the wiki and issuetracker at https://github.com/kaskr/adcomp/. Please try to create a simple repeatable example to go with your question (e.g issues 154, 134, 51). Use the issuetracker to report bugs.
---
You received this message because you are subscribed to the Google Groups "TMB Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tmb-users+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tmb-users/df6fb7a1-bb9a-409d-805f-39d8ed124ee6%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Aaron Osgood-Zimmerman

unread,

Mar 24, 2017, 4:23:37 PM3/24/17

to TMB Users

Hi Hans -

Thanks for the prompt reply.

We will have a lot of data in our actual modelling, but for now we're using small simulated datasets.

I do project/interpolate from the mesh vertices to the data locations using an 'A' matrix constructed from INLA functions (see the code for details).

Due to the size of the data object, I can't attach the files directly, but you can find them here: https://cloud.ihme.washington.edu/index.php/s/4qaZOyJn0sxJwJt

There is a data object, the R code, and the .cpp model specification.

Please let me know if you have any ideas. One thing I haven't tried yet is returning the precision matrix and checking if the inverse of the precision is PD. I've assumed that won't make a difference but maybe that's more stable?

Cheers,

Aaron

Aaron Osgood-Zimmerman

unread,

Mar 24, 2017, 4:28:40 PM3/24/17

to TMB Users, hsk...@gmail.com

Hi Cole -

As I mentioned in my email, the PD issue seems to happen much less often (or not at all) for larger datasets. I used the 'A' projector matrices from INLA to project the GP approximation from mesh points to data locations.

I would be interested in seeing your code and testing your method on my small data to see if it fixes the PD issues I'm having.

Much thanks,
Aaron

To unsubscribe from this group and stop receiving emails from it, send an email to tmb-users+...@googlegroups.com.

Hans Skaug

unread,

Mar 28, 2017, 7:37:19 AM3/28/17

to TMB Users

Thanks. I had a look at this, and there are things I do not understand by let me

have a go at it:

- The model seems fairly well behaved.

- You had changed the control option to nlminb(). Early termination of nlminb could cause a non-positive Hessian.

When remove "control" everyting seems OK. cov2cor(SD0$cov) shows reasonable correlations.

They way I understand it Epsilon_xt contains the space-time random field from which you want to sample.

I do not understand "ADREPORT( alpha )" as alpha is already a parameter for which standard deviation will be calculated.

Aaron Osgood-Zimmerman

unread,

Mar 28, 2017, 5:33:21 PM3/28/17

to TMB Users

Hi Hans -

Much thanks! You are completely correct and once I set the nlminb controls back to the defaults it all works as expected. Sorry to all for taking your time and I hope someone finds the code example useful if nothing else.

As for the ADREPORT(alpha) line, this clearly betrays my lack of understanding regarding the TMB package and framework. I'm still coming up to speed on it after using INLA and other space-time modelling tools. Will it slow things down much to include it that line in the cpp template? I suppose a better question is: when exactly do we I need to use adreport and when can I get away without it?

Much thanks (again),

Aaron

Hans Skaug

unread,

Mar 29, 2017, 12:32:04 AM3/29/17

to TMB Users

Hi Aaron,

Your code contain elements that are useful to others, like "inla.spde.make.A()" which I imagine projects new locations onto an existing mesh.

The INLA R docs are not particularly informative about what inla.spde.make.A does. Can you say, or point to more detailed info?

I am not sure if including ADREPORT(alpha) actually does any harm, but my own steering rules are: do not use ADREPORT() on anything

that is imported from R as a PARAMETER_ type, as you will automatically be able to use sdreport() on it.

The current TMB docs on this are brief, and are being worked on:

http://kaskr.github.io/adcomp/group__macros.html#ga43a13c435127491364e4a167478f4992

On Tuesday, March 28, 2017 at 11:33:21 PM UTC+2, Aaron Osgood-Zimmerman wrote:

Hi Hans -

Aaron Osgood-Zimmerman

unread,

Mar 29, 2017, 1:49:27 AM3/29/17

to TMB Users

Hi Hans -

You are correct that the inla.spde.make.A() function projects the approximated GP surface from mesh vertices to other locations. They don't have to be new locations per se if, as we were trying to do, none of your data are at mesh vertices to begin with. The rows of A correspond to datapoints and the columns to mesh vertices. In the event that each observation is at a mesh vertex then the A matrix returned from that function should only have one non-zero entry (of value 1) in each row which maps datapoints to mesh vertices. This may also be useful if you don't want to worry about datapoint and mesh vertex ordering/matching. For datapoints located away from mesh vertices, each corresponding data row in the A matrix will have three non-zero entries which will sum to one and which linearly (since the spde object approximation in INLA is a linear/planar approximation to the continuous surface) project from the approximated surface at the mesh points to the value of the approx. surface at some location inside one of the mesh triangles.

I found the SPDE tutorial to be pretty useful though it takes some dedication to get through it: http://www.math.ntnu.no/inla/r-inla.org/tutorials/spde/spde-tutorial.pdf . It discusses these mesh "projector matrices" on page 14 (section 1.2.2).

Thanks for the tips on adreport! I missed the dichotomy you describe between parameter_types and other derived objects but that's quite useful to know. I understand it's a work in progress but I'm quite excited by the flexibility TMB offers and I am OK being an earliy(ish) adopter.

Cheers,
Aaron

Hans Skaug

unread,

Mar 29, 2017, 2:39:23 AM3/29/17

to TMB Users

Sounds good. It seems important to exploit that A is a sparse matrix when the model is large.