lavaan ERROR: initial model-implied matrix (Sigma) is not positive definite;


Siu Lam

Dec 2, 2017, 4:49:24 PM
to lavaan
Hi, I'm working on a univariate latent change score model with two time points, using the lavaan package. When I tried to fix the variance of the change scores to 0, I got this error:
Error in lav_model_estimate(lavmodel = lavmodel, lavsamplestats = lavsamplestats,  : 
  lavaan ERROR: initial model-implied matrix (Sigma) is not positive definite;
  check your model and/or starting parameters.

I then tried using a value of 0.001 instead of 0, and the error went away. Why is that? The code follows.

library(lavaan)

Model <- '
T2 ~ 1*T1      # Fixes the regression of T2 on T1 to 1
dT =~ 1*T2     # Defines the latent change score dT with a loading of 1 on T2
T2 ~ 0*1       # Constrains the intercept of T2 to 0
T2 ~~ 0*T2     # Fixes the residual variance of T2 to 0

dT ~ 1         # Estimates the intercept of the change scores
T1 ~ 1         # Estimates the intercept of T1
dT ~~ dT       # Estimates the variance of the change scores (it doesn't work if I fix it at 0, but it does work if I fix it at 0.001)
T1 ~~ T1       # Estimates the variance of T1
dT ~ T1        # Estimates the self-feedback parameter
'
fit <- lavaan(Model, data = Data1, estimator = 'mlr', fixed.x = FALSE, missing = 'fiml')
summary(fit, fit.measures = TRUE, standardized = TRUE, rsquare = TRUE)

Thank you for reading. 

Terrence Jorgensen

Dec 2, 2017, 6:44:28 PM
to lavaan
> When I tried to fix the variance of the change scores at 0, it gave me an error

Assuming there is nothing else wrong with your model specification, the lavaan() function does not always give smart starting values if it cannot guess what they should be from your model specification, leading to the error you see.  You can try using the sem() function, or changing some of lavaan's default arguments:

?lavOptions


Terrence D. Jorgensen
Postdoctoral Researcher, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam
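
A hand calculation may help explain why 0 fails while 0.001 works. Assuming the model as posted, T2 = T1 + dT and dT = a + b*T1 + zeta, so the model-implied covariance matrix of (T1, T2) has determinant phi * psi, where phi is the variance of T1 and psi is the (residual) variance of dT. With psi fixed to 0, T2 becomes an exact linear function of T1 and Sigma is singular whatever the starting values are. The sketch below uses base R only; the function name implied_sigma and the illustrative values phi = 1 and b = 0 are assumptions, not lavaan's actual starting values.

```r
# Model-implied covariance matrix of (T1, T2) for the two-wave change score model:
#   dT = a + b*T1 + zeta,  T2 = T1 + dT  (residual variance of T2 fixed to 0)
# phi = var(T1), psi = var(zeta); both names are illustrative.
implied_sigma <- function(phi, b, psi) {
  matrix(c(phi,           (1 + b) * phi,
           (1 + b) * phi, (1 + b)^2 * phi + psi),
         nrow = 2, byrow = TRUE,
         dimnames = list(c("T1", "T2"), c("T1", "T2")))
}

sigma_zero  <- implied_sigma(phi = 1, b = 0, psi = 0)      # variance fixed to 0
sigma_small <- implied_sigma(phi = 1, b = 0, psi = 0.001)  # variance fixed to 0.001

det(sigma_zero)    # phi * psi = 0 (up to rounding): singular, not positive definite
det(sigma_small)   # phi * psi, approximately 0.001: positive definite

# Positive definiteness requires all eigenvalues to be strictly positive
eigen(sigma_zero,  symmetric = TRUE)$values
eigen(sigma_small, symmetric = TRUE)$values
```

Under these assumptions the error is a property of the model rather than of the starting values: a change score with zero variance around its regression on T1 implies a perfect correlation between T1 and T2.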

Siu Lam

Dec 3, 2017, 2:48:39 AM
to lavaan
Thank you for the suggestion! 

John Mathews

Oct 31, 2020, 2:32:44 PM
to lavaan
I'm trying to simulate a 4-point Likert (categorical) dataset, and I receive this error when I add the threshold statements:
population.model="
! regressions 
   SE=~0.7*x1+0.7*x2+0.7*x3+0.7*x4+0.7*x5+0.7*x6+0.7*x7+0.7*x8
   ET=~0.7*x9+0.7*x10+0.7*x11+0.7*x12+0.7*x13
  
   RI=~0.7*y1+0.7*y2+0.7*y3+0.7*y4+0.7*y5+0.7*y6+0.7*y7
   PG=~0.7*y8+0.7*y9+0.7*y10+0.7*y11+0.7*y12+0.7*y13+0.7*y14+0.7*y15+0.7*y16+0.7*y17
   
   SE=~RI
   ET=~RI
   RI=~PG
   SE=~PG
   ET=~PG
! residuals, variances and covariances
   SE ~~ 0.51*SE
   ET ~~ 0.51*ET
   SE ~~ 0.25*ET
   RI ~~ 0.51*RI
   x1 ~~ 0.51*x1
   x2 ~~ 0.51*x2
   x3 ~~ 0.51*x3
   x4 ~~ 0.51*x4
   x5 ~~ 0.51*x5
   x6 ~~ 0.51*x6
   x7 ~~ 0.51*x7
   x8 ~~ 0.51*x8
   x9 ~~ 0.51*x9
   x10 ~~ 0.51*x10
   x11 ~~ 0.51*x11
   x12 ~~ 0.51*x12
   x13 ~~ 0.51*x13
   
   y1 ~~ 0.51*y1
   y2 ~~ 0.51*y2
   y3 ~~ 0.51*y3
   y4 ~~ 0.51*y4
   y5 ~~ 0.51*y5
   y6 ~~ 0.51*y6
   y7 ~~ 0.51*y7
   PG ~~ 0.51*PG
   y8 ~~ 0.51*y8
   y9 ~~ 0.51*y9
   y10 ~~ 0.51*y10
   y11 ~~ 0.51*y11
   y12 ~~ 0.51*y12
   y13 ~~ 0.51*y13
   y14 ~~ 0.51*y14
   y15 ~~ 0.51*y15
   y16 ~~ 0.51*y16
   y17 ~~ 0.51*y17
! means
  ET+RI+PG+ SE~1*1;
  x1+x2+x3+x4+x5+x6+x7+x8+x9+x10+x11+x12+x13~1*1;
  y1+y2+y3+y4+y5+y6+y7+y8+y9+y10+y11+y12+y13+y14+y15+y16+y17~1*1; 
  x1 ~*~ x1
x2 ~*~ x2
x3 ~*~ x3
x4 ~*~ x4
x5 ~*~ x5
x7 ~*~ x7
x8 ~*~ x8
x9 ~*~ x9
x10 ~*~ x10
x11 ~*~ x11
x12 ~*~ x12
x13 ~*~ x13
x1 | -.1*t1 + 0.8*t2+1.5*t3
x2 | -.1*t1 + 0.8*t2+1.5*t3
x3 | -.1*t1 + 0.8*t2+1.5*t3
x4 | -.1*t1 + 0.8*t2+1.5*t3
x5 | -.1*t1 + 0.8*t2+1.5*t3
x6 | -.1*t1 + 0.8*t2+1.5*t3
x7 | -.1*t1 + 0.8*t2+1.5*t3
x8 | -.1*t1 + 0.8*t2+1.5*t3
x9 | -.1*t1 + 0.8*t2+1.5*t3
x10 | -.1*t1 + 0.8*t2+1.5*t3
x11 | -.1*t1 + 0.8*t2+1.5*t3
x12 | -.1*t1 + 0.8*t2+1.5*t3
x13 | -.1*t1 + 0.8*t2+1.5*t3
y1 ~*~ y1
y2 ~*~ y2
y3 ~*~ y3
y4 ~*~ y4
y5 ~*~ y5
y7 ~*~ y7
y8 ~*~ y8
y9 ~*~ y9
y10 ~*~ y10
y11 ~*~ y11
y12 ~*~ y12
y13 ~*~ y13
y14 ~*~ y14
y15 ~*~ y15
y16 ~*~ y16
y17 ~*~ y17
y1 | -.1*t1 + 0.8*t2+1.5*t3
y2 | -.1*t1 + 0.8*t2+1.5*t3
y3 | -.1*t1 + 0.8*t2+1.5*t3
y4 | -.1*t1 + 0.8*t2+1.5*t3
y5 | -.1*t1 + 0.8*t2+1.5*t3
y6 | -.1*t1 + 0.8*t2+1.5*t3
y7 | -.1*t1 + 0.8*t2+1.5*t3
y8 | -.1*t1 + 0.8*t2+1.5*t3
y9 | -.1*t1 + 0.8*t2+1.5*t3
y10 | -.1*t1 + 0.8*t2+1.5*t3
y11 | -.1*t1 + 0.8*t2+1.5*t3
y12 | -.1*t1 + 0.8*t2+1.5*t3
y13 | -.1*t1 + 0.8*t2+1.5*t3
y14 | -.1*t1 + 0.8*t2+1.5*t3
y15 | -.1*t1 + 0.8*t2+1.5*t3
y16 | -.1*t1 + 0.8*t2+1.5*t3
y17 | -.1*t1 + 0.8*t2+1.5*t3 
";
dat.sim <- simulateData(population.model, sample.nobs = 50, skewness = 1, kurtosis = 3.5, seed = 2020)
Can someone help me?

Edward Rigdon

Oct 31, 2020, 4:37:48 PM
to lav...@googlegroups.com
Did you mean here:

  SE=~RI
   ET=~RI
   RI=~PG
   SE=~PG
   ET=~PG
that each of these factors is somehow a higher order factor of the other factors? That is what the =~ operator is telling lavaan. Did you just mean a regression relation, where the operator is just ~ ?


John Mathews

Oct 31, 2020, 6:57:38 PM
to lavaan
higher order factor of the other factors  

Terrence Jorgensen

Nov 1, 2020, 5:34:32 AM
to lavaan
> I'm trying to simulate a 4-point Likert (categorical) dataset, and I receive this error when I add the threshold statements

Not sure if this is the reason for the error, but why are you specifying both residual variances AND scaling factors (with the ~*~ operator)?  One implies the other.  I would omit the scaling factors (which don't have population values anyway) and just leave your residual variances in there (which do have population values).  And perhaps you would have to set parameterization = "theta" too?

>    SE=~RI
>    ET=~RI
>    RI=~PG
>    SE=~PG
>    ET=~PG

Is there a reason you are thinking of RI & PG as indicators of SE and ET (and RI)?  All of those factors have observed indicators.  Second-order loadings are just in the Beta matrix, and it is arbitrary to just define the regression paths instead.  Try doing this, in case that is throwing lavaan's expectations off:

RI ~ SE + ET
PG ~ RI + SE + ET

But remember to specify population values for those slopes, don't just leave the parameters "blank".
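
As a rough sketch of that suggestion, the structural part could be written with regression paths that carry fixed population values. The slopes of 0.3 below are purely hypothetical placeholders, not values from this thread. lavaanify() is used only to inspect how lavaan parses the syntax; in the full population model SE, ET, RI and PG would still be defined by their =~ lines.

```r
library(lavaan)

# Structural part only; 0.3 is a HYPOTHETICAL population slope chosen for illustration.
structural <- '
  RI ~ 0.3*SE + 0.3*ET
  PG ~ 0.3*RI + 0.3*SE + 0.3*ET
'

# Parse the syntax into a parameter table and keep the regression rows
pt  <- lavaanify(structural)
reg <- pt[pt$op == "~", c("lhs", "op", "rhs", "free", "ustart")]
reg  # free == 0 means the parameter is fixed at the value shown in ustart
```

A row with free equal to 0 and a numeric ustart indicates that the slope is treated as a fixed population value rather than left blank.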

> ! means
>   ET+RI+PG+ SE~1*1;
>   x1+x2+x3+x4+x5+x6+x7+x8+x9+x10+x11+x12+x13~1*1;
>   y1+y2+y3+y4+y5+y6+y7+y8+y9+y10+y11+y12+y13+y14+y15+y16+y17~1*1;

Do you realize you are setting your latent means to 1 instead of 0?  There's nothing actually wrong with that, but you would not be able to recover the other population parameters because the intercepts of latent item-responses are fixed to zero for identification when estimating the model.

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

José Antonio Moreira de Rezende

Nov 28, 2024, 8:25:32 AM
to lavaan
I received this error when I was trying to fit the following model (V2 and V14 were excluded because their standard deviation was 0):

model <- '
  # measurement model
  F1 =~ V8 + V9
  F2 =~ V10 + V11 + V12 + V16 + V17 + V18 + V19 + V20 + V21 + V22
  F3 =~ V13

  # regressions
  V3 ~ V4
  V7 ~ V1 + F1
  V8 ~ F1
  V9 ~ F1
  V10 ~ F2
  V11 ~ F2
  V12 ~ F2
  V16 ~ F2
  V17 ~ F2
  V18 ~ F2
  V19 ~ F2
  V20 ~ F2
  V21 ~ F2
  V22 ~ F2
  V13 ~ F3
  V15 ~ V6 + V7 + F1
  F1 ~ V1 + V3 + V4
  F2 ~ V5
  F3 ~ V5

  # covariances
  F2 ~~ F3
  V8 ~~ V9
  V15 ~~ F3
  V15 ~~ F2
'
I normalized and scaled the input values (see the attachment). Fitting the model gave me these messages:

1: lavaan->lav_model_estimate():  

   initial model-implied matrix (Sigma) is not positive definite; check your model and/or starting parameters .
2: lavaan->lav_model_estimate():  

   initial model-implied matrix (Sigma) is not positive definite; check your model and/or starting parameters .
3: lavaan->lav_model_estimate():  

   initial model-implied matrix (Sigma) is not positive definite; check your model and/or starting parameters .
4: lavaan->lav_model_estimate():  

   initial model-implied matrix (Sigma) is not positive definite; check your model and/or starting parameters .
5: lavaan->lav_lavaan_step11_estoptim():  
   Model estimation FAILED! Returning starting values. 

José Antonio Moreira de Rezende

Nov 28, 2024, 8:28:37 AM
to lavaan
Sorry, I think I forgot to attach the file.
dados_transformados.csv

José Antonio Moreira de Rezende

Nov 30, 2024, 7:49:53 AM
to lavaan

Hello, everyone.

Since no one has responded to my question so far, I believe I may not have been specific enough to provide a clear understanding of the problem. I am available to provide further clarification if needed.

The data used in my analysis comes from the 2010 demographic census conducted by IBGE, the Brazilian agency responsible for that year’s census survey. The data of interest is associated with a specific municipality in the state of Minas Gerais (Brazil), and my research aims to explore causal relationships within families whose per capita household income is up to half the minimum wage.

To create a dataset suitable for use with the lavaan package, the data was sourced from two files. The first file contains census information about households, and the second includes data about individuals surveyed. The CSV file I shared is the result of a probabilistic match, where each individual was randomly assigned to a household within the same income category. The resulting data was then adjusted to a normal distribution and scaled to have a mean of zero and a variance of one.

Thus, the CSV file contains continuous, binary, and categorical data. The categorical variables were adjusted following the approach detailed by Bollen (2014) in Chapter 9 (reference at the end of this message). As mentioned in my message from November 28, variables V2 and V14 were excluded from the model because they had zero variance after the data filtering process. Rows with missing data were also discarded.
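
The filtering steps described above (discarding incomplete rows, dropping zero-variance variables, then standardizing) can be sketched in base R as follows. This is only an illustration on a small made-up data frame; the object names and values are hypothetical stand-ins, not the contents of dados_transformados.csv.

```r
# HYPOTHETICAL stand-in for the matched census extract (not the real data)
raw <- data.frame(
  V1 = c(34, 51, 27, 45, NA),  # age, with one missing value
  V2 = c(1, 1, 1, 1, 1),       # constant column: zero variance
  V5 = c(1, 2, 1, 2, 2)        # urban/rural indicator
)

# 1. Discard rows with any missing value
complete <- raw[complete.cases(raw), , drop = FALSE]

# 2. Identify and drop variables whose standard deviation is 0
sds      <- vapply(complete, sd, numeric(1))
zero_var <- names(sds)[sds == 0]
kept     <- complete[, setdiff(names(complete), zero_var), drop = FALSE]

# 3. Standardize the remaining columns to mean 0 and variance 1
scaled <- as.data.frame(scale(kept))

zero_var      # names of the excluded variables
dim(scaled)   # rows and columns left for the analysis
```

Running the zero-variance check after listwise deletion matters here, because a variable can become constant only once the incomplete rows are gone.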

Below is the description of each observable and latent variable:

  • V1: Age (integer)
  • V3: Highest level of academic education (categorical: 1 to 14)
  • V4: Ethnicity (categorical: 1, 2, 3, 4, 5, 9)
  • V5: Household in urban or rural area (binary: 1, 2)
  • V6: Lives with a spouse or partner (categorical: 1, 2, 3)
  • V7: Total number of living children as of July 31, 2010 (integer)
  • V8: Employment status (categorical: 1 to 7)
  • V9: Was contributing to an official social security institution for any job held during the week of July 25–31, 2010 (categorical: 1, 2, 3)
  • V10: Presence of a refrigerator in the household (binary: 1, 2)
  • V11: Presence of a television in the household (binary: 1, 2)
  • V12: Number of bathrooms in the household (integer: 0 to 9)
  • V13: Source of water supply in the household (categorical: 1 to 10)
  • V15: Monthly per capita income in terms of minimum wages (real number)
  • V16: Presence of an electricity meter in the household (categorical: 1, 2, 3)
  • V17: Presence of a radio (binary: 1, 2)
  • V18: Presence of a washing machine (binary: 1, 2)
  • V19: Presence of a mobile phone (binary: 1, 2)
  • V20: Presence of a landline phone (binary: 1, 2)
  • V21: Presence of a computer (binary: 1, 2)
  • V22: Presence of a computer with internet access (binary: 1, 2)

The reference of Bollen (2014) is: Bollen, K. A. (2014). Structural Equations with Latent Variables. Wiley.

I hope this additional information helps clarify my query, and I am available to discuss any aspect of the dataset or model further. Thank you in advance for your assistance and have a nice weekend.

Best regards!