lavaan WARNING: covariance matrix of latent variables is not positive definite

João Marôco

Jul 23, 2018, 1:00:01 PM
to lavaan
Dear All,
I am fitting this model with lavaan 0.6-2:

> fit<-cfa(EMI_2, data=dataBD,ordered=ord)
 
And I am getting this Warning message:

Warning message:
In lav_object_post_check(object) :
  lavaan WARNING: covariance matrix of latent variables
                is not positive definite;
                use lavInspect(fit, "cov.lv") to investigate.

I ran
lavInspect(fit, "cov.lv")

and got this cov matrix:
 
    GE    REV   PRA   DES   RS    AFI   COM   PRS   PRV   SAP   CON   APA   FOR   AGI  
GE  0.348                                                                              
REV 0.395 0.652                                                                        
PRA 0.376 0.692 0.748                                                                  
DES 0.293 0.476 0.551 0.617                                                            
RS  0.165 0.232 0.325 0.505 0.708                                                      
AFI 0.219 0.343 0.396 0.397 0.358 0.512                                                
COM 0.158 0.274 0.402 0.537 0.573 0.358 0.696                                          
PRS 0.101 0.125 0.078 0.135 0.139 0.182 0.132 0.273                                    
PRV 0.261 0.415 0.301 0.237 0.082 0.178 0.067 0.275 0.619                              
SAP 0.356 0.584 0.533 0.374 0.101 0.217 0.140 0.196 0.612 0.759                        
CON 0.105 0.182 0.139 0.173 0.163 0.029 0.073 0.096 0.301 0.302 0.697                  
APA 0.194 0.354 0.320 0.335 0.311 0.096 0.177 0.107 0.381 0.501 0.610 0.861            
FOR 0.292 0.492 0.518 0.482 0.319 0.262 0.304 0.134 0.409 0.579 0.276 0.538 0.730      
AGI 0.296 0.461 0.470 0.558 0.343 0.295 0.359 0.170 0.370 0.469 0.222 0.366 0.600 0.657 

So, no negative or zero variances.

My model does converge, all loadings look great, and the goodness-of-fit (GOF) indices are also great:

 > fit.meas<-c("chisq", "df", "cfi", "tli", "nnfi", "rmsea", "rmsea.ci.lower", "rmsea.ci.upper")
> fitMeasures(fit, fit.meas)
         chisq             df            cfi            tli           nnfi          rmsea rmsea.ci.lower 
      5934.289        988.000          0.991          0.989          0.989          0.063          0.061 
rmsea.ci.upper 
         0.064 

My questions:
1 - What else can I do to find the source of the non-positive-definite covariance matrix?
2 - How seriously will my model estimates be compromised (given that I have no negative or zero variances)?

Any thoughts on this will be welcome.

Best,
João
 

Chao Xu

Jul 23, 2018, 2:55:43 PM
to lavaan
1 - What else can I do to find the source of the non-positive-definite covariance matrix?

Positive definiteness means that all eigenvalues of the matrix are positive, so inspecting the individual variances and covariances by eye tells you almost nothing. You can use the eigen() function to compute the eigenvalues and see which ones are zero or negative.
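
A minimal sketch of that eigenvalue check. It uses the Holzinger and Swineford example data shipped with lavaan as a stand-in, because the EMI data from this thread are not available; the object names fit_hs, cov_lv and ev are only illustrative.

```r
library(lavaan)

# Stand-in model and data shipped with lavaan (NOT the EMI model from this thread)
HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '
fit_hs <- cfa(HS.model, data = HolzingerSwineford1939)

# Model-implied covariance matrix of the latent variables;
# unclass() drops lavaan's print class and leaves a plain numeric matrix
cov_lv <- unclass(lavInspect(fit_hs, "cov.lv"))

# Positive definite means every eigenvalue is strictly greater than zero
ev <- eigen(cov_lv, symmetric = TRUE, only.values = TRUE)$values
print(ev)

# The same matrix on the correlation scale: values close to or beyond
# +/-1 show which pair of factors is involved
print(round(lavInspect(fit_hs, "cor.lv"), 3))
```

With your own model you would replace fit_hs by your fitted object; a zero or negative entry in ev is what triggers the warning.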

2 - How seriously will my model estimates be compromised (given that I have no negative or zero variances)?

It is usually caused by poor fit. Given the look of the model fit indices, I would recommend the following article.


Chao

Jeremy Miles

Jul 24, 2018, 3:51:28 AM
to lavaan
This is telling you that the program has found a model that fits, but the parameter estimates for that model imply a covariance matrix of the latent variables that cannot actually exist. lavaan reports this by saying that the matrix is not positive definite, which means that at least one of its eigenvalues is zero or negative (and the determinant is then usually non-positive as well).

This can arise in a few ways, and it's easier to understand what happened by looking at a correlation matrix. Sometimes it arises because of something serious (like a correlation greater than 1). Sometimes it's less serious: the correlations differ only slightly from what is possible, so it's worth looking at how far off they are (e.g. this correlation matrix can't exist in real data):

1
0.9 1
0.9 0.0 1
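
To see why that small matrix is impossible, here is a short base-R check (R_bad, ev_bad and det_bad are just illustrative names). If the first variable correlates 0.9 with both of the others, those two cannot be uncorrelated with each other, and the eigenvalues show it:

```r
# The 3 x 3 "correlation matrix" from the example above
R_bad <- matrix(c(1.0, 0.9, 0.9,
                  0.9, 1.0, 0.0,
                  0.9, 0.0, 1.0), nrow = 3, byrow = TRUE)

# Eigenvalues are 1 + 0.9*sqrt(2), 1 and 1 - 0.9*sqrt(2);
# the last one is negative (about -0.273), so the matrix is not positive definite
ev_bad <- eigen(R_bad, symmetric = TRUE, only.values = TRUE)$values
print(ev_bad)

# The determinant is the product of the eigenvalues: 1 - 0.81 - 0.81 = -0.62
det_bad <- det(R_bad)
print(det_bad)
```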

Depending on your model there's a trick you can use called a Cholesky decomposition, which will make sure that this doesn't happen. 
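
The idea behind the trick can be sketched in base R, independent of lavaan: a matrix built as L times its transpose, with L lower triangular, can never have a negative eigenvalue, whatever values L takes. The numbers in L below are arbitrary and only for illustration.

```r
# Arbitrary lower-triangular matrix of "free parameters"
L <- matrix(c( 0.8, 0.0, 0.0,
               0.5, 0.3, 0.0,
              -0.2, 0.9, 0.1), nrow = 3, byrow = TRUE)

# Covariance matrix built from the Cholesky factor
Sigma <- L %*% t(L)
ev_sigma <- eigen(Sigma, symmetric = TRUE, only.values = TRUE)$values
print(ev_sigma)  # none of these can be negative by construction

# Conversely, chol() refuses to factor a matrix that is not positive definite
R_bad <- matrix(c(1.0, 0.9, 0.9,
                  0.9, 1.0, 0.0,
                  0.9, 0.0, 1.0), nrow = 3, byrow = TRUE)
chol_ok <- tryCatch({ chol(R_bad); TRUE }, error = function(e) FALSE)
print(chol_ok)   # FALSE: no Cholesky factor exists for R_bad
```

Estimating the elements of L instead of the covariances themselves is what keeps the estimated matrix inside the admissible region.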

You don't always need to worry about it - try to see how large the discrepancy is.

Jeremy 



João Marôco

Jul 24, 2018, 4:00:31 AM
to lav...@googlegroups.com
Hi Chao,
Thanks. All GoF indices look great.
Best 

João Marôco
[Sent from my not that smart smartphone with more errors than usual]


João Marôco

Jul 24, 2018, 4:03:56 AM
to lav...@googlegroups.com
Thanks Jeremy 
I do have one .99 correlation. But even if I remove one of the two factors, I still get an NPD matrix.
Is it possible to do Cholesky decomposition using lavaan? 

Best, 
João Marôco
[Sent from my not that smart smartphone with more errors than usual]

Jeremy Miles

Jul 24, 2018, 4:07:44 AM
to lavaan

Correlation of 0.99 sounds like the problem.  That typically happens when you have omitted a correlated error somewhere. 

You can fit a Cholesky if you have a CFA model. What is your model?


jpma...@gmail.com

Jul 24, 2018, 4:39:47 AM
to lav...@googlegroups.com

Thanks Jeremy,

 

This is my model:

 

EMI_2 <- '
GE =~ EMI6 + EMI20 + EMI34 + EMI46
REV =~ EMI3 + EMI17 + EMI31
PRA =~ EMI9 + EMI23 + EMI37 + EMI48
DES =~ EMI14 + EMI28 + EMI42 + EMI51
RS =~ EMI5 + EMI19 + EMI33 + EMI45
AFI =~ EMI10 + EMI24 + EMI38 + EMI49
COM =~ EMI12 + EMI26 + EMI40 + EMI50
PRS =~ EMI11 + EMI25 + EMI39
PRV =~ EMI2 + EMI16 + EMI30
SAP =~ EMI7 + EMI21 + EMI35
CON =~ EMI1 + EMI15 + EMI29 + EMI43
APA =~ EMI4 + EMI18 + EMI32 + EMI44
FOR =~ EMI8 + EMI22 + EMI36 + EMI47
AGI =~ EMI13 + EMI27 + EMI41
'

> fit<-cfa(EMI_2, data=dataBD, ordered=ord)

 

I do have very large MIs for some items, but since the overall fit is good I did not add any correlated residuals between items.

 

Best

Jeremy Miles

Jul 24, 2018, 4:57:03 AM
to lavaan

Ah, the large MIs might explain it.

The overall fit looks at every part of the model, but very good parts can cancel out bad parts - it's a bit like the statistician who has one hand in a fire and one hand in a bucket of ice and says "On average, I'm fine."

You have global fit, but local misfit, and you should probably try to address that.
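
One way to look for local misfit in lavaan, sketched here on the Holzinger and Swineford example data as a stand-in for the EMI data (fit_hs, res_cor and mi_cov are illustrative names): inspect the residual correlations and the largest modification indices for residual covariances.

```r
library(lavaan)

# Stand-in model and data shipped with lavaan (NOT the EMI model from this thread)
HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '
fit_hs <- cfa(HS.model, data = HolzingerSwineford1939)

# Residual correlations: observed minus model-implied, per pair of items.
# Large absolute values flag item pairs the model reproduces poorly.
res_cor <- residuals(fit_hs, type = "cor")$cov
print(round(res_cor, 2))

# Modification indices, largest first, restricted to (residual) covariances
mi <- modindices(fit_hs, sort. = TRUE)
mi_cov <- mi[mi$op == "~~", ]
print(head(mi_cov, 5))
```

A model can have excellent CFI and RMSEA overall and still show a handful of large residuals or MIs concentrated in one or two factors.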

Jeremy

jpma...@gmail.com

Jul 24, 2018, 5:10:33 AM
to lav...@googlegroups.com

Yes, I have correlated items. I removed them, but the problem remains. All item loadings are above .5, so local fit problems don’t seem to exist. All items have absolute skewness and kurtosis (|sk| and |ku|) smaller than 2…

I am running out of ideas (other than increasing the sample size) on how to tackle the problem. I will look into the Cholesky implementation in lavaan.

Best,

Jeremy Miles

Jul 24, 2018, 5:14:30 AM
to lavaan
Do you still have high MIs?

I'll write something about the Cholesky later, but here's an example to get you going:

## Regular example:
## The famous Holzinger and Swineford (1939) example
HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

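## Same data, but the covariances of visual, textual and speed
## are modelled using a Cholesky decomposition.

HS.model2 <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9
              # Cholesky factors: lower-triangular pattern of loadings
              c1 =~ visual + textual + speed
              c2 =~ textual + speed
              c3 =~ speed
              # the Cholesky factors are uncorrelated with each other
              c1 ~~ 0*c2 + 0*c3
              c2 ~~ 0*c3
              # no residual variance left in the original factors
              visual ~~ 0 * visual
              textual ~~ 0 * textual
              speed ~~ 0 * speed'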
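
A hedged sketch of how the two specifications could be fitted and compared (fit1, fit2, fm and cov_sub are illustrative names). The two models have the same number of free parameters, so they should have the same degrees of freedom and, if both converge, essentially the same chi-square. Note that with the first loading of each c factor fixed to 1 (the cfa() default), the variances of c1, c2 and c3 are free parameters and still have to be estimated as positive for the implied matrix to be positive definite.

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

HS.model2 <- ' visual  =~ x1 + x2 + x3
               textual =~ x4 + x5 + x6
               speed   =~ x7 + x8 + x9
               c1 =~ visual + textual + speed
               c2 =~ textual + speed
               c3 =~ speed
               c1 ~~ 0*c2 + 0*c3
               c2 ~~ 0*c3
               visual ~~ 0 * visual
               textual ~~ 0 * textual
               speed ~~ 0 * speed '

fit1 <- cfa(HS.model,  data = HolzingerSwineford1939)
fit2 <- cfa(HS.model2, data = HolzingerSwineford1939)

# Same df; chi-square should agree closely if both runs converged
fm <- rbind(standard = fitMeasures(fit1, c("chisq", "df")),
            cholesky = fitMeasures(fit2, c("chisq", "df")))
print(fm)

# In the Cholesky version "cov.lv" also contains c1-c3, so pull out the
# block for the three original factors before checking its eigenvalues
lv <- c("visual", "textual", "speed")
cov_sub <- unclass(lavInspect(fit2, "cov.lv"))[lv, lv]
print(eigen(cov_sub, symmetric = TRUE, only.values = TRUE)$values)
```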

I'm a tiny bit concerned that this will just cover up your problems rather than solve them, though.

J

jpma...@gmail.com

Jul 24, 2018, 8:48:30 AM
to lav...@googlegroups.com

Yes, my MIs range from about 90 to more than 400, for example:

> mi[mi$op == "~~",]
       lhs op   rhs      mi    epc sepc.lv sepc.all sepc.nox
2516 EMI27 ~~ EMI41 403.957  0.323   0.323    2.000    2.000
2267 EMI25 ~~ EMI30 209.811  0.311   0.311    0.891    0.891
1911  EMI5 ~~  EMI4 193.139  0.343   0.343    0.664    0.664
2515 EMI13 ~~ EMI41 169.946 -0.255  -0.255   -1.437   -1.437
2514 EMI13 ~~ EMI27 156.603 -0.247  -0.247   -1.195   -1.195
2481 EMI32 ~~ EMI44 155.238  0.179   0.179    0.764    0.764
2426 EMI15 ~~ EMI29 142.210  0.174   0.174    0.887    0.887
2497  EMI8 ~~ EMI36 132.468  0.150   0.150    0.606    0.606
2329 EMI16 ~~ EMI21  92.656  0.140   0.140    0.797    0.797

 

It’s a large instrument (51 items and 14 factors), so it will take me a while to adapt your example. Thanks for your input.
