lavaan negative variance again


adamqu...@gmail.com

Feb 11, 2015, 4:48:49 PM
to lav...@googlegroups.com
library(lavaan)
library(semPlot)
CFA_mod  =  "M1 =~ NA*x10
+ x8
+ x3

M1 ~~ NA*M1
d ~ M1 "

> CFA_fit = cfa(CFA_mod, data=dd[,-1], estimator="ULS")
Warning messages:
1: In lav_model_vcov(lavmodel = lavmodel, lavsamplestats = lavsamplestats,  :
  lavaan WARNING: could not compute standard errors!
  lavaan NOTE: this may be a symptom that the model is not identified.

2: In lavaan::lavaan(model = CFA_mod, data = dd[, -1], estimator = "ULS",  :
  lavaan WARNING: some estimated variances are negative
3: In lavaan::lavaan(model = CFA_mod, data = dd[, -1], estimator = "ULS",  :
  lavaan WARNING: covariance matrix of latent variables is not positive definite; use inspect(fit,"cov.lv") to investigate.
4: In lavaan::lavaan(model = CFA_mod, data = dd[, -1], estimator = "ULS",  :
  lavaan WARNING: observed variable error term matrix (theta) is not positive definite; use inspect(fit,"theta") to investigate.
> summary(CFA_fit)
lavaan (0.5-17) converged normally after  55 iterations

  Number of observations                            68

  Estimator                                        ULS
  Minimum Function Test Statistic               18.698
  Degrees of freedom                                 1
  P-value (Unknown)                                 NA

Parameter estimates:

  Information                                 Expected
  Standard Errors                             Standard

                   Estimate  Std.err  Z-value  P(>|z|)
Latent variables:
  M1 =~
    x10              -0.534
    x8                0.650
    x3                1.227

Regressions:
  d ~
    M1               -4.700

Variances:
    M1                0.127
    x10               1.061
    x8                0.930
    x3                0.850
    d                -1.768
> vifValues = list()
> for (i in 1:length(folds))
+ {
+   vifValues[[i]]  <- HH::vif( dd[-folds[[i]],
+                                           c("x10", "x8", "x3" ,"d")])
+ }
> do.call(rbind, vifValues)
           x10       x8       x3        d
 [1,] 2.145099 1.688815 2.408851 3.606164
 [2,] 2.001310 1.618807 2.573754 3.537053
 [3,] 2.216816 1.701084 2.386481 3.674937
 [4,] 2.080792 1.629400 2.601167 3.728013
 [5,] 2.013422 1.600859 2.486274 3.247940
 [6,] 2.001551 1.779798 2.329134 3.509830
 [7,] 2.324024 1.783590 2.242215 3.341166
 [8,] 2.188027 1.704483 2.269582 3.098449
 [9,] 2.216507 1.685163 2.578708 3.786062
[10,] 1.950682 1.572766 2.369971 3.421800

I am getting negative variances again. The situation is worse with the ML estimator. I checked for multicollinearity among the variables, but there is no strong sign of it. There is some dependence on the dependent variable, but that should not be a problem, right?

Can you help me figure out what is going on?

Thanks


data.csv

Terrence Jorgensen

Feb 12, 2015, 3:09:14 PM
to lav...@googlegroups.com

CFA_mod  =  "M1 =~ NA*x10
+ x8
+ x3

M1 ~~ NA*M1
d ~ M1 "

> CFA_fit = cfa(CFA_mod, data=dd[,-1], estimator="ULS")


This model is not identified.  You are freely estimating all the factor loadings AND the factor variance, which is more information than you have in your observed data.  If you set your first factor loading to 1, or your factor variance to 1, then your first warning will go away, but you will still have a negative error variance for "d".  
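
A minimal sketch of the two ways to set the factor's scale mentioned above. It assumes the correlation matrix posted in this thread can stand in for the raw data (passed as `sample.cov`, with N = 68 taken from the summary output) and uses lavaan's default ML estimator rather than ULS, so the estimates will not match the original run. As noted above, a negative error variance for "d" should still be expected.

```r
library(lavaan)

# Correlation matrix from this thread, used in place of the raw data (assumption).
vars <- c("x10", "x8", "x3", "d")
R <- matrix(c(
  1.00000000,  0.34518487,  0.05592003,  0.38554934,
  0.34518487,  1.00000000,  0.38996040, -0.35346840,
  0.05592003,  0.38996040,  1.00000000, -0.68055265,
  0.38554934, -0.35346840, -0.68055265,  1.00000000
), nrow = 4, byrow = TRUE, dimnames = list(vars, vars))

# No NA* modifiers: the scale of M1 is set by the cfa() arguments instead.
model <- "
  M1 =~ x10 + x8 + x3
  d  ~ M1
"

# Option 1: std.lv = TRUE fixes the factor variance to 1 and frees all loadings.
fit <- cfa(model, sample.cov = R, sample.nobs = 68, std.lv = TRUE)

# Option 2 (not run): the cfa() default fixes the first loading (x10) to 1.
# fit <- cfa(model, sample.cov = R, sample.nobs = 68)

# Free parameters: 3 loadings, 1 regression slope, 4 residual variances.
print(coef(fit))
```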
 
Can you help me figure out what is going on?

The observed correlation matrix indicates that "d" is negatively related to some indicators (x8 and x3) but positively related to x10.

> cor(dd[ , c("x10","x8","x3","d")])
           x10         x8          x3          d
x10 1.00000000  0.3451849  0.05592003  0.3855493
x8  0.34518487  1.0000000  0.38996040 -0.3534684
x3  0.05592003  0.3899604  1.00000000 -0.6805526
d   0.38554934 -0.3534684 -0.68055265  1.0000000

The model is trying to reproduce this matrix as well as possible by making the error variance of "d" negative and the factor loading of x10 negative.  What's more, your indicators are hardly related to each other.  Basically, this model is a poor description of your data.
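
One way to see the sign conflict numerically is the classical triad check for a one-factor model. This is only a sketch and rests on an assumption: in standardized terms, "d" is treated like a fourth variable loading on M1, so every correlation is a product of two loadings, r_ij = lambda_i * lambda_j. Under that assumption, r_d,a * r_d,b / r_a,b should equal the squared standardized loading of "d" for any pair of indicators a and b, and so can never be negative. The function name `triad` is illustrative.

```r
# Correlation matrix from this thread.
vars <- c("x10", "x8", "x3", "d")
R <- matrix(c(
  1.00000000,  0.34518487,  0.05592003,  0.38554934,
  0.34518487,  1.00000000,  0.38996040, -0.35346840,
  0.05592003,  0.38996040,  1.00000000, -0.68055265,
  0.38554934, -0.35346840, -0.68055265,  1.00000000
), nrow = 4, byrow = TRUE, dimnames = list(vars, vars))

# Squared standardized loading of `target` implied by one pair of indicators.
triad <- function(R, target, a, b) R[target, a] * R[target, b] / R[a, b]

# Every pair of the three indicators gives its own estimate for "d".
indicator_pairs <- combn(c("x10", "x8", "x3"), 2)
implied <- apply(indicator_pairs, 2, function(p) triad(R, "d", p[1], p[2]))
names(implied) <- apply(indicator_pairs, 2, paste, collapse = "+")

# A negative value is impossible for a squared loading, so any pair that
# produces one is inconsistent with a single common factor.
print(round(implied, 3))
```

Both pairs that include x10 give a negative value, which is consistent with the estimator resorting to a negative variance for "d".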

Terry

adamqu...@gmail.com

Feb 12, 2015, 4:53:19 PM
to lav...@googlegroups.com
Thanks for your reply.



On Thursday, February 12, 2015 at 8:09:14 PM UTC, Terrence Jorgensen wrote:

CFA_mod  =  "M1 =~ NA*x10
+ x8
+ x3

M1 ~~ NA*M1
d ~ M1 "

> CFA_fit = cfa(CFA_mod, data=dd[,-1], estimator="ULS")


This model is not identified.  You are freely estimating all the factor loadings AND the factor variance, which is more information than you have in your observed data.  If you set your first factor loading to 1, or your factor variance to 1, then your first warning will go away, but you will still have a negative error variance for "d".  

I believe you are referring to the counting rule that the number of estimated parameters must not exceed #observables(#observables+3)/2 in a model where means are estimated. Yes, I fixed the latent variance, but there was still a negative variance.
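
As a worked count for the model in this thread (the parameter tally below is one reading of the model syntax, not lavaan output), the counting rule is satisfied even for the original model. It yields the 1 degree of freedom shown in the summary, so the identification problem comes from the factor having no scale rather than from too many parameters.

```r
p <- 4                           # observed variables: x10, x8, x3, d
moments <- p * (p + 1) / 2       # unique variances and covariances: 10

# Original model: all loadings AND the factor variance are free.
free_original <- c(loadings = 3, factor_variance = 1,
                   indicator_residuals = 3, regression = 1, d_residual = 1)
df_original <- moments - sum(free_original)

# Same model after fixing the factor variance to set the scale.
free_scaled <- free_original
free_scaled["factor_variance"] <- 0
df_scaled <- moments - sum(free_scaled)

# With means, p(p + 3)/2 moments and p extra intercepts leave df unchanged.
df_with_means <- p * (p + 3) / 2 - (sum(free_original) + p)

print(c(original = df_original, scaled = df_scaled, with_means = df_with_means))
```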
 
Can you help me figure out what is going on?

The observed correlation matrix indicates that "d" is negatively related to some indicators (x8 and x3) but positively related to x10.

> cor(dd[ , c("x10","x8","x3","d")])
           x10         x8          x3          d
x10 1.00000000  0.3451849  0.05592003  0.3855493
x8  0.34518487  1.0000000  0.38996040 -0.3534684
x3  0.05592003  0.3899604  1.00000000 -0.6805526
d   0.38554934 -0.3534684 -0.68055265  1.0000000


I understand that the model will make the x10 loading negative, but are you suggesting that my indicators are not correlated highly enough with each other, so a single latent variable cannot account for them? If that is the case, how high should the correlations between the variables be? In the past I also had problems with the fit when two variables were highly correlated.

Terrence Jorgensen

Feb 15, 2015, 11:58:10 PM
to lav...@googlegroups.com
are you suggesting that my indicators are not correlated highly enough with each other, so a single latent variable cannot account for them? If that is the case, how high should the correlations between the variables be?

In EFA, many researchers use a rule of thumb that standardized factor loadings should be more than around lambda = .32, which would imply that lambda^2 = about 10% of the variance in the indicator is explained by the common factor.  In theory, a perfectly valid indicator could have 99% error variance and 1% common-factor variance, but that would make it an extremely poor indicator of the latent construct.  In your case, x8 seems minimally related (r^2 ~ 12-15%) to both x10 and x3, but x10 and x3 seem essentially independent (r^2 ~ 0.3%).  But what's worse (and a separate issue, that also seems to indicate the 3 "x" variables do not measure the same thing) is that 2 of your indicators are related to "d" in the opposite direction of the other indicator's relationship to "d".
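
The shared variance between each pair of indicators can be read straight off the correlation matrix posted earlier in the thread by squaring it. A short sketch, with the matrix retyped by hand:

```r
# Correlation matrix from this thread.
vars <- c("x10", "x8", "x3", "d")
R <- matrix(c(
  1.00000000,  0.34518487,  0.05592003,  0.38554934,
  0.34518487,  1.00000000,  0.38996040, -0.35346840,
  0.05592003,  0.38996040,  1.00000000, -0.68055265,
  0.38554934, -0.35346840, -0.68055265,  1.00000000
), nrow = 4, byrow = TRUE, dimnames = list(vars, vars))

# Squared correlations: proportion of variance each pair of variables shares.
r2 <- R^2

# Shared variance among the three indicators only, in percent.
indicators <- c("x10", "x8", "x3")
print(round(100 * r2[indicators, indicators], 1))
```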

 
In the past I had problems with the fit when two variables were highly correlated too.

High within-construct correlations don't imply good fit of the model as a whole.  Maybe it fit poorly for the same reason that your current model does.

Terry
