lavaan negative variance again

adamqu...@gmail.com

Feb 11, 2015, 4:48:49 PM

to lav...@googlegroups.com

library(lavaan)
library(semPlot)

CFA_mod = "M1 =~ NA*x10
+ x8
+ x3

M1 ~~ NA*M1
d ~ M1 "

> CFA_fit = cfa(CFA_mod, data=dd[,-1], estimator="ULS")

Warning messages:
1: In lav_model_vcov(lavmodel = lavmodel, lavsamplestats = lavsamplestats, :
  lavaan WARNING: could not compute standard errors!
  lavaan NOTE: this may be a symptom that the model is not identified.
2: In lavaan::lavaan(model = CFA_mod, data = dd[, -1], estimator = "ULS", :
  lavaan WARNING: some estimated variances are negative
3: In lavaan::lavaan(model = CFA_mod, data = dd[, -1], estimator = "ULS", :
  lavaan WARNING: covariance matrix of latent variables is not positive definite; use inspect(fit,"cov.lv") to investigate.
4: In lavaan::lavaan(model = CFA_mod, data = dd[, -1], estimator = "ULS", :
  lavaan WARNING: observed variable error term matrix (theta) is not positive definite; use inspect(fit,"theta") to investigate.

> summary(CFA_fit)
lavaan (0.5-17) converged normally after 55 iterations

  Number of observations                            68

  Estimator                                        ULS
  Minimum Function Test Statistic               18.698
  Degrees of freedom                                 1
  P-value (Unknown)                                 NA

Parameter estimates:

  Information                                 Expected
  Standard Errors                             Standard

                   Estimate  Std.err  Z-value  P(>|z|)
Latent variables:
  M1 =~
    x10              -0.534
    x8                0.650
    x3                1.227

Regressions:
  d ~
    M1               -4.700

Variances:
    M1                0.127
    x10               1.061
    x8                0.930
    x3                0.850
    d                -1.768

> vifValues = list()
> for (i in 1:length(folds))
+ {
+ vifValues[[i]] <- HH::vif( dd[-folds[[i]],
+ c("x10", "x8", "x3" ,"d")])
+ }
> do.call(rbind, vifValues)
           x10       x8       x3        d
 [1,] 2.145099 1.688815 2.408851 3.606164
 [2,] 2.001310 1.618807 2.573754 3.537053
 [3,] 2.216816 1.701084 2.386481 3.674937
 [4,] 2.080792 1.629400 2.601167 3.728013
 [5,] 2.013422 1.600859 2.486274 3.247940
 [6,] 2.001551 1.779798 2.329134 3.509830
 [7,] 2.324024 1.783590 2.242215 3.341166
 [8,] 2.188027 1.704483 2.269582 3.098449
 [9,] 2.216507 1.685163 2.578708 3.786062
[10,] 1.950682 1.572766 2.369971 3.421800

I am getting negative variances again, and the situation is worse with the ML estimator. I checked for multicollinearity among the variables, but there is no strong sign of it. The indicators are related to the dependent variable, but that should not be a problem, right?

Can you help me figure out what is going on?

Thanks

data.csv
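
The fold-wise multicollinearity check in the post above can also be sketched on the full sample without the raw data. This is only an illustration under stated assumptions: it uses the correlation matrix reported later in this thread (not `dd` itself), and it relies on the standard identity that the VIF of each variable is the corresponding diagonal element of the inverse correlation matrix. The names `R` and `vif_full` are made up for this sketch.

```r
# Full-sample correlations, copied from the cor() output later in this thread.
vars <- c("x10", "x8", "x3", "d")
R <- matrix(c(
  1.00000000,  0.3451849,  0.05592003,  0.3855493,
  0.34518487,  1.0000000,  0.38996040, -0.3534684,
  0.05592003,  0.3899604,  1.00000000, -0.6805526,
  0.38554934, -0.3534684, -0.68055265,  1.0000000
), nrow = 4, byrow = TRUE, dimnames = list(vars, vars))

# The printed values are rounded differently above and below the diagonal;
# average the two triangles so the matrix is exactly symmetric.
R <- (R + t(R)) / 2

# VIF_j = 1 / (1 - R2_j), where R2_j comes from regressing variable j on the
# others. This equals the j-th diagonal element of the inverse of R.
vif_full <- diag(solve(R))
print(round(vif_full, 3))
```

The values should land in the same range as the fold-wise output above, which supports the reading that multicollinearity is not severe here.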

Terrence Jorgensen

Feb 12, 2015, 3:09:14 PM

to lav...@googlegroups.com

CFA_mod = "M1 =~ NA*x10
+ x8
+ x3

M1 ~~ NA*M1
d ~ M1 "

> CFA_fit = cfa(CFA_mod, data=dd[,-1], estimator="ULS")

This model is not identified. You are freely estimating all the factor loadings AND the factor variance, so the scale of the latent variable is undefined and the model asks for more than your observed data can determine. If you fix your first factor loading to 1, or your factor variance to 1, then your first warning will go away, but you will still have a negative error variance for "d".
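
To make the identification point concrete, here is a minimal sketch of why freeing every loading and the factor variance leaves the latent scale undetermined. The parameter values are hypothetical and chosen only for illustration, and the demo covers just the measurement part of the model. The two model strings at the end use standard lavaan syntax for the two fixes mentioned above; the names `mod_marker` and `mod_stdlv` are made up here.

```r
# Hypothetical parameter values, for illustration only (not estimates).
lambda <- c(x10 = 0.5, x8 = 0.7, x3 = 0.9)  # factor loadings
psi    <- 2.0                               # factor variance
theta  <- diag(c(0.8, 0.6, 0.4))            # residual variances

# Model-implied covariance matrix of a one-factor model:
# Sigma = psi * lambda lambda' + Theta
implied_cov <- function(lambda, psi, theta) {
  psi * tcrossprod(lambda) + theta
}

# Rescale the factor: multiply the loadings by k, divide the variance by k^2.
k <- 3
sigma_a <- implied_cov(lambda, psi, theta)
sigma_b <- implied_cov(k * lambda, psi / k^2, theta)

# Both parameter sets imply the same covariance matrix, so the data cannot
# choose between them unless the scale is fixed.
print(all.equal(sigma_a, sigma_b))

# Option 1: marker variable, first loading fixed to 1.
mod_marker <- "M1 =~ 1*x10 + x8 + x3
               d ~ M1"

# Option 2: all loadings free, factor variance fixed to 1
# (cfa(..., std.lv = TRUE) requests the same kind of scaling).
mod_stdlv <- "M1 =~ NA*x10 + x8 + x3
              M1 ~~ 1*M1
              d ~ M1"
```

Either string can be passed to `cfa()` in place of `CFA_mod`. As noted above, this removes the identification warning but does not by itself cure the negative variance for "d".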

Can you help me figure out what is going on?

The observed correlation matrix indicates that "d" is negatively related to some indicators (x8 and x3) but positively related to x10.

> cor(dd[ , c("x10","x8","x3","d")])
x10 x8 x3 d
x10 1.00000000 0.3451849 0.05592003 0.3855493
x8 0.34518487 1.0000000 0.38996040 -0.3534684
x3 0.05592003 0.3899604 1.00000000 -0.6805526
d 0.38554934 -0.3534684 -0.68055265 1.0000000

The model is trying to reproduce this matrix as well as possible by making the error variance of "d" negative and the factor loading of x10 negative. What's more, your indicators are hardly related to each other. Basically, this model is a poor description of your data.
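
The sign conflict can be checked directly from the correlation matrix. The sketch below assumes a one-factor model with positive factor variance in which "d" is regressed on the factor. Under that assumption, cor(xi, xj) carries the sign of loading_i * loading_j, and cor(xi, d) * cor(xj, d) must carry the same sign. The object names (`R`, `pairs`, `check`) are made up for this illustration.

```r
# Correlations copied from the cor() output above.
vars <- c("x10", "x8", "x3", "d")
R <- matrix(c(
  1.00000000,  0.3451849,  0.05592003,  0.3855493,
  0.34518487,  1.0000000,  0.38996040, -0.3534684,
  0.05592003,  0.3899604,  1.00000000, -0.6805526,
  0.38554934, -0.3534684, -0.68055265,  1.0000000
), nrow = 4, byrow = TRUE, dimnames = list(vars, vars))

ind   <- c("x10", "x8", "x3")
pairs <- t(combn(ind, 2))  # every pair of indicators, one pair per row

check <- data.frame(
  a = pairs[, 1],
  b = pairs[, 2],
  r_ab = R[pairs],  # correlation between the two indicators
  r_ad_times_r_bd = unname(R[pairs[, 1], "d"] * R[pairs[, 2], "d"])
)

# A single common factor requires these two signs to agree for every pair.
check$consistent <- sign(check$r_ab) == sign(check$r_ad_times_r_bd)
print(check)
```

Both pairs that involve x10 fail the check, and only x8 with x3 passes. No set of real-valued loadings can reproduce all of these signs at once, which is one way to see why the estimates end up out of bounds.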

Terry

adamqu...@gmail.com

Feb 12, 2015, 4:53:19 PM

to lav...@googlegroups.com

Thanks for your reply.

On Thursday, February 12, 2015 at 8:09:14 PM UTC, Terrence Jorgensen wrote:

CFA_mod = "M1 =~ NA*x10
+ x8
+ x3

M1 ~~ NA*M1
d ~ M1 "

> CFA_fit = cfa(CFA_mod, data=dd[,-1], estimator="ULS")

This model is not identified. You are freely estimating all the factor loadings AND the factor variance, so the scale of the latent variable is undefined and the model asks for more than your observed data can determine. If you fix your first factor loading to 1, or your factor variance to 1, then your first warning will go away, but you will still have a negative error variance for "d".

I believe you are referring to the counting rule that the number of estimated parameters must not exceed p(p+3)/2, where p is the number of observed variables, in a model where means are estimated. Yes, I fixed the latent variance, but there was still a negative variance.
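
As a rough check of the counting rule for this model, the sketch below tallies sample moments against free parameters. It assumes no mean structure, so the moment count is p(p+1)/2. With means, both sides grow by p and the difference is unchanged. The parameter tally is read off the posted model syntax, and the variable names are made up for this illustration.

```r
p <- 4  # observed variables: x10, x8, x3, d

# Unique variances and covariances available from the data.
n_moments <- p * (p + 1) / 2

# Free parameters in the model as posted.
free_posted <- c(loadings = 3, factor_variance = 1,
                 indicator_residuals = 3, regression = 1, d_residual = 1)
df_posted <- n_moments - sum(free_posted)

# Same model with the factor variance fixed to 1.
free_fixed <- free_posted
free_fixed["factor_variance"] <- 0
df_fixed <- n_moments - sum(free_fixed)

print(c(moments = n_moments, df_posted = df_posted, df_fixed = df_fixed))
```

The posted model passes the count with 1 degree of freedom, matching the lavaan output, yet it is still not identified because the latent scale is free. The counting rule is necessary but not sufficient.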

Can you help me figure out what is going on?

The observed correlation matrix indicates that "d" is negatively related to some indicators (x8 and x3) but positively related to x10.

> cor(dd[ , c("x10","x8","x3","d")])
x10 x8 x3 d
x10 1.00000000 0.3451849 0.05592003 0.3855493
x8 0.34518487 1.0000000 0.38996040 -0.3534684
x3 0.05592003 0.3899604 1.00000000 -0.6805526
d 0.38554934 -0.3534684 -0.68055265 1.0000000

I understand that the model will make the x10 loading negative, but are you suggesting that my indicators are not correlated highly enough with each other, so a single latent variable cannot account for them? If that is the case, how high should the correlations between the variables be? In the past I had problems with the fit when two variables were highly correlated too.

Terrence Jorgensen

Feb 15, 2015, 11:58:10 PM

to lav...@googlegroups.com

are you suggesting that my indicators are not correlated highly enough with each other, so a single latent variable cannot account for them? If that is the case, how high should the correlations between the variables be?

In EFA, many researchers use a rule of thumb that standardized factor loadings should exceed about lambda = .34, which implies that lambda^2, a little over 10% of the variance in the indicator, is explained by the common factor. In theory, a perfectly valid indicator could have 99% error variance and 1% common-factor variance, but that would make it an extremely poor indicator of the latent construct. In your case, x8 is only modestly related (r^2 ~ 12-15%) to both x10 and x3, and x10 and x3 are nearly independent (r^2 ~ 0.3%). What's worse (and a separate issue, which also suggests the 3 "x" variables do not measure the same thing) is that 2 of your indicators are related to "d" in the opposite direction from the other indicator's relationship to "d".
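
For reference, the shared variance between each pair of indicators can be computed from the correlation matrix posted earlier in the thread. This is a small sketch, not part of the original analysis, and the object names (`R`, `shared`, `lambda_min`) are made up for it.

```r
# Correlations copied from the cor() output earlier in this thread.
vars <- c("x10", "x8", "x3", "d")
R <- matrix(c(
  1.00000000,  0.3451849,  0.05592003,  0.3855493,
  0.34518487,  1.0000000,  0.38996040, -0.3534684,
  0.05592003,  0.3899604,  1.00000000, -0.6805526,
  0.38554934, -0.3534684, -0.68055265,  1.0000000
), nrow = 4, byrow = TRUE, dimnames = list(vars, vars))

ind <- c("x10", "x8", "x3")

# Squared correlations among the indicators, as percentages of shared variance.
shared <- round(100 * R[ind, ind]^2, 1)
print(shared)

# Variance explained by the factor at the rule-of-thumb loading.
lambda_min <- 0.34
print(round(100 * lambda_min^2, 1))
```

x8 shares roughly 12% of its variance with x10 and 15% with x3, while x10 and x3 share about 0.3%.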

In the past I had problems with the fit when two variables were highly correlated too.

High within-construct correlations don't imply good fit of the model as a whole. Maybe it fit poorly for the same reason that your current model does.

Terry
