Multivariate Latent Growth Curve Model with Linear and Non-Linear Curve


Edo Sebastian Jaya

Jan 6, 2016, 8:28:17 PM
to lavaan
Dear all,

I am trying to fit a parallel growth curve model of ostracism experience and a symptom variable over four time points. A large proportion of the data is missing, but I am using only the completers (i.e., those who participated at every time point). To illustrate, the initial sample was 2359 participants, of whom 139 were completers.

As a first step, I successfully fit a linear curve to the symptom variable using the growth() function and the MLM estimator (adding a quadratic curve to the model did not improve fit). I then fit a linear curve to the ostracism variable, but the fit indices were not optimal, and adding a quadratic curve did not improve fit. The slope was also not significant.

Because the fit was not optimal, I tried a non-linear curve with freely estimated loadings for the last two time points (syntax: i =~ 1*OES + 1*T2OES + 1*T3OES + 1*T4OES; s =~ 0*OES + 1*T2OES + NA*T3OES + NA*T4OES) with the MLM estimator. I got the following warning messages:
Warning messages:
1: In lav_object_post_check(lavobject) :
  lavaan WARNING: some estimated variances are negative
2: In lav_object_post_check(lavobject) :
  lavaan WARNING: covariance matrix of latent variables is not positive definite; use inspect(fit,"cov.lv") to investigate.

However, using the WLSM estimator made the warning messages go away, and this model showed a statistically significant improvement in fit over the linear model.

So now I have two growth curve models for my two variables: a linear curve for symptoms (estimated with MLM) and a non-linear curve for ostracism (estimated with WLSM). A more complex problem arises when I join the two in one multivariate latent growth curve model to see whether the slopes are correlated with one another, which is central to my research question.

First, I used the optimal curve model of both variables. This is the syntax for linear curve symptoms and non-linear curve ostracism:
GPOSOES3.model <- '
i1 =~ 1*F.CAPE.P + 1*T2F.CAPE.P + 1*T3F.CAPE.P + 1*T4F.CAPE.P
s1 =~ 0*F.CAPE.P + 1*T2F.CAPE.P + 2*T3F.CAPE.P + 3*T4F.CAPE.P
i2 =~ 1*OES + 1*T2OES + 1*T3OES + 1*T4OES
s2 =~ 0*OES + 1*T2OES + NA*T3OES + NA*T4OES
i1 ~~ i2
s2 ~~ s1'
GPOSOES3.fit <- growth(GPOSOES3.model, data = d, estimator = "DWLS")
However, this produced the following warning messages:
Warning messages:
1: In lav_object_post_check(lavobject) :
  lavaan WARNING: some estimated variances are negative
2: In lav_object_post_check(lavobject) :
  lavaan WARNING: covariance matrix of latent variables is not positive definite; use inspect(fit,"cov.lv") to investigate.
3: In lav_object_post_check(lavobject) :
  lavaan WARNING: observed variable error term matrix (theta) is not positive definite; use inspect(fit,"theta") to investigate.


Changing the estimator to MLM did not solve the problem (although the theta warning went away).

This led me back to using a linear curve for both symptoms and ostracism:
GPOSOES2.model <- '
i1 =~ 1*F.CAPE.P + 1*T2F.CAPE.P + 1*T3F.CAPE.P + 1*T4F.CAPE.P
s1 =~ 0*F.CAPE.P + 1*T2F.CAPE.P + 2*T3F.CAPE.P + 3*T4F.CAPE.P
i2 =~ 1*OES + 1*T2OES + 1*T3OES + 1*T4OES
s2 =~ 0*OES + 1*T2OES + 2*T3OES + 3*T4OES
s1 ~~ i2
s2 ~~ i1'
GPOSOES2.fit <- growth(GPOSOES2.model, data = d, estimator = "DWLS")

This also produced similar warning messages:
Warning messages:
1: In lav_object_post_check(lavobject) :
  lavaan WARNING: some estimated variances are negative
2: In lav_object_post_check(lavobject) :
  lavaan WARNING: observed variable error term matrix (theta) is not positive definite; use inspect(fit,"theta") to investigate.

However, changing the estimator to MLM solved the problem this time (no warnings).

Output:
lavaan (0.5-20) converged normally after  82 iterations

  Number of observations                           139

  Estimator                                         ML      Robust
  Minimum Function Test Statistic               65.083      35.548
  Degrees of freedom                                22          22
  P-value (Chi-square)                           0.000       0.034
  Scaling correction factor                                  1.831
    for the Satorra-Bentler correction

Model test baseline model:

  Minimum Function Test Statistic              927.985     367.869
  Degrees of freedom                                28          28
  P-value                                        0.000       0.000

User model versus baseline model:

  Comparative Fit Index (CFI)                    0.952       0.960
  Tucker-Lewis Index (TLI)                       0.939       0.949

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)               -681.530    -681.530
  Loglikelihood unrestricted model (H1)       -648.989    -648.989

  Number of free parameters                         22          22
  Akaike (AIC)                                1407.060    1407.060
  Bayesian (BIC)                              1471.619    1471.619
  Sample-size adjusted Bayesian (BIC)         1402.016    1402.016

Root Mean Square Error of Approximation:

  RMSEA                                          0.119       0.067
  90 Percent Confidence Interval          0.086  0.153       0.034  0.095
  P-value RMSEA <= 0.05                          0.001       0.174

Standardized Root Mean Square Residual:

  SRMR                                           0.057       0.057

Parameter Estimates:

  Information                                 Expected
  Standard Errors                           Robust.sem

Latent Variables:
                   Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
  i1 =~                                                                 
    F.CAPE.P          1.000                               0.348    0.854
    T2F.CAPE.P        1.000                               0.348    0.981
    T3F.CAPE.P        1.000                               0.348    0.842
    T4F.CAPE.P        1.000                               0.348    1.001
  s1 =~                                                                 
    F.CAPE.P          0.000                               0.000    0.000
    T2F.CAPE.P        1.000                               0.049    0.139
    T3F.CAPE.P        2.000                               0.098    0.238
    T4F.CAPE.P        3.000                               0.147    0.424
  i2 =~                                                                 
    OES               1.000                               0.885    0.768
    T2OES             1.000                               0.885    0.819
    T3OES             1.000                               0.885    0.788
    T4OES             1.000                               0.885    0.769
  s2 =~                                                                 
    OES               0.000                               0.000    0.000
    T2OES             1.000                               0.190    0.176
    T3OES             2.000                               0.380    0.338
    T4OES             3.000                               0.570    0.495

Covariances:
                   Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
  s1 ~~                                                                 
    i2               -0.006    0.007   -0.838    0.402   -0.137   -0.137
  i1 ~~                                                                 
    s2                0.011    0.012    0.911    0.363    0.170    0.170
    s1               -0.005    0.004   -1.187    0.235   -0.287   -0.287
    i2                0.149    0.044    3.397    0.001    0.482    0.482
  s1 ~~                                                                 
    s2                0.006    0.003    2.070    0.038    0.619    0.619
  i2 ~~                                                                 
    s2               -0.010    0.055   -0.184    0.854   -0.060   -0.060

Intercepts:
                   Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
    F.CAPE.P          0.000                               0.000    0.000
    T2F.CAPE.P        0.000                               0.000    0.000
    T3F.CAPE.P        0.000                               0.000    0.000
    T4F.CAPE.P        0.000                               0.000    0.000
    OES               0.000                               0.000    0.000
    T2OES             0.000                               0.000    0.000
    T3OES             0.000                               0.000    0.000
    T4OES             0.000                               0.000    0.000
    i1                1.468    0.032   46.354    0.000    4.219    4.219
    s1               -0.033    0.006   -5.107    0.000   -0.668   -0.668
    i2                1.827    0.090   20.360    0.000    2.063    2.063
    s2               -0.020    0.029   -0.694    0.488   -0.105   -0.105

Variances:
                   Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
    F.CAPE.P          0.045    0.015    3.065    0.002    0.045    0.271
    T2F.CAPE.P        0.012    0.004    2.727    0.006    0.012    0.096
    T3F.CAPE.P        0.060    0.022    2.764    0.006    0.060    0.349
    T4F.CAPE.P        0.008    0.006    1.280    0.200    0.008    0.063
    OES               0.545    0.170    3.211    0.001    0.545    0.410
    T2OES             0.368    0.095    3.874    0.000    0.368    0.315
    T3OES             0.374    0.131    2.857    0.004    0.374    0.296
    T4OES             0.278    0.146    1.905    0.057    0.278    0.210
    i1                0.121    0.024    4.982    0.000    1.000    1.000
    s1                0.002    0.002    1.426    0.154    1.000    1.000
    i2                0.784    0.177    4.434    0.000    1.000    1.000
    s2                0.036    0.028    1.301    0.193    1.000    1.000

R-Square:
                   Estimate
    F.CAPE.P          0.729
    T2F.CAPE.P        0.904
    T3F.CAPE.P        0.651
    T4F.CAPE.P        0.937
    OES               0.590
    T2OES             0.685
    T3OES             0.704
    T4OES             0.790


In short, the output confirms my hypothesis that the slopes of the two variables are significantly correlated. However, I noticed a couple of strange things. First, the slope of ostracism was not significant, while the slope of symptoms was. How can they be correlated if one is stable and the other is changing? Second, some of my models can only be estimated with certain estimators (e.g., the non-linear curve for ostracism can only be estimated with WLSM and not MLM, while the linear symptom model cannot be estimated with WLSM). What does this mean? Can the results of my linear symptom model (estimated with MLM) be compared with those of my non-linear ostracism model (estimated with WLSM)?
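
On the first question, one possible reading is that the slope test in the Intercepts block concerns the mean slope, whereas the s1 ~~ s2 covariance concerns how individual slopes deviate from their means. A minimal base-R sketch of that distinction follows. All numbers are hypothetical, only loosely echoing the scale of the output above, and the simulation is an illustration, not a re-analysis of the data:

```r
# Two sets of individual slopes: one with a negative mean, one with a mean of
# zero, but with correlated person-to-person deviations around those means.
set.seed(1)
n <- 5000                                  # hypothetical, large to reduce noise
z <- rnorm(n)                              # shared component inducing correlation

# 0.8 * z + 0.6 * noise has unit variance (0.64 + 0.36 = 1)
slope_symptom   <- -0.03 + 0.045 * (0.8 * z + 0.6 * rnorm(n))
slope_ostracism <-  0.00 + 0.190 * (0.8 * z + 0.6 * rnorm(n))

mean(slope_ostracism)                  # near zero: no change on average
sd(slope_ostracism)                    # clearly non-zero: individuals differ
cor(slope_symptom, slope_ostracism)    # population value is 0.8 * 0.8 = 0.64
```

So an average slope of about zero does not by itself rule out a slope correlation, as long as individuals differ in their slopes. Whether the slope variances in the real data are large enough to support that interpretation is a separate question.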

Thanks in advance for any comments or ideas.

Best,
Edo




Terrence Jorgensen

Jan 8, 2016, 7:20:30 AM
to lavaan
It sounds like you had no problem running lavaan, so I'm not sure why you posted your question here instead of the more general SEM forum SEMNET (http://www2.gsu.edu/~mkteer/semnet.html).  You are not getting error messages, just warning messages that your estimates fall outside of theoretical boundaries.  With your small sample size, you should expect to get slightly different results with different estimators (estimators are only equivalent asymptotically).  

Did you do what the warnings said?  (use inspect(fit,"cov.lv") to investigate)  If you investigated which variances were negative, I'm guessing they will be among the same variances that were almost zero in the model without warnings:

 
Variances:
                   Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
    F.CAPE.P          0.045    0.015    3.065    0.002    0.045    0.271
    T2F.CAPE.P        0.012    0.004    2.727    0.006    0.012    0.096
    T3F.CAPE.P        0.060    0.022    2.764    0.006    0.060    0.349
    T4F.CAPE.P        0.008    0.006    1.280    0.200    0.008    0.063
    OES               0.545    0.170    3.211    0.001    0.545    0.410
    T2OES             0.368    0.095    3.874    0.000    0.368    0.315
    T3OES             0.374    0.131    2.857    0.004    0.374    0.296
    T4OES             0.278    0.146    1.905    0.057    0.278    0.210
    i1                0.121    0.024    4.982    0.000    1.000    1.000
    s1                0.002    0.002    1.426    0.154    1.000    1.000
    i2                0.784    0.177    4.434    0.000    1.000    1.000
    s2                0.036    0.028    1.301    0.193    1.000    1.000

The ML estimator assumes all parameters have normally distributed sampling distributions, even if they have natural theoretical boundaries like (co)variances.  So when you explain most of the variance in an outcome variable (i.e., population residual variance is close to zero), sometimes you will get negative estimates of residual variance just due to sampling variability, particularly in small samples (when sampling variance is higher).  See http://dx.doi.org/10.1177/0049124112442138 for a great discussion of this, along with methods for testing whether a negative error variance estimate is due to sampling variability or misspecification.  Judging from the fact that one estimator gave you a negative estimate and another did not (assuming they are among the ones I highlighted), I would suppose the former, but that is an empirical question you can test easily with bootstrapped CIs, as described in the citation above. 
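
For readers who want to try the two checks mentioned above, here is a minimal sketch. It assumes the Demo.growth example data that ships with lavaan (variables t1 to t4) as a stand-in for the original data, a random subsample of 139 rows to mimic the small-sample setting, and only 200 bootstrap draws to keep it quick. The object names are made up for illustration:

```r
library(lavaan)

# Linear growth model on the stand-in data (not the poster's variables)
model <- '
  i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
  s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4
'

# Subsample to mimic a small sample (n = 139, as in the thread)
set.seed(2016)
d_small <- Demo.growth[sample(nrow(Demo.growth), 139), ]

# Bootstrap standard errors; a real analysis would use many more draws
fit <- growth(model, data = d_small, estimator = "ML",
              se = "bootstrap", bootstrap = 200)

# Follow the warning's advice: inspect the latent covariance matrix.
# All eigenvalues positive means the matrix is positive definite.
cov_lv <- lavInspect(fit, "cov.lv")
eigen(unclass(cov_lv))$values

# Percentile bootstrap CIs for every (residual) variance in the model
pe <- parameterEstimates(fit, boot.ci.type = "perc", level = 0.95)
variances <- pe[pe$op == "~~" & pe$lhs == pe$rhs,
                c("lhs", "est", "ci.lower", "ci.upper")]
variances
```

Roughly speaking, a negative variance estimate whose interval comfortably covers positive values is compatible with sampling variability, while an interval lying entirely below zero points toward misspecification. The cited article describes the formal tests.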

As far as your sample size goes, you have thrown away a lot of data to only analyze completers.  If you haven't already, you should read some literature on modern methods for handling missing data.  FIML is available in lavaan (missing = "fiml"), including with robust correction (estimator = "MLR").  The semTools package has functions to help automate the use of auxiliary variables needed to justify the MAR assumption (see ?auxiliary), or to use multiple imputation (see ?runMI).  Here is some literature to get you started:


Terry

Edo Sebastian Jaya

Jan 9, 2016, 9:53:59 AM
to lavaan
Thank you, Terry. Your answer is very helpful. I asked here because I thought that other SEM programs stop when they encounter negative variances (or covariances), and this made me wonder what to make of the results.
I have done what you suggested, and I did find those highlighted variables to have negative SE variance from the bootstrap test. I have also rerun the same analysis with FIML and MLR, and the results are much nicer.
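
For anyone following along, a call of roughly this shape is what the FIML and MLR suggestion amounts to. This is only a sketch: it uses lavaan's built-in Demo.growth data with artificially deleted later waves in place of the real data, and the dropout is completely random, which real attrition often is not:

```r
library(lavaan)

model <- '
  i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
  s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4
'

# Stand-in data: remove waves 3 and 4 for a random 40% of cases
set.seed(1)
d_miss <- Demo.growth[, c("t1", "t2", "t3", "t4")]
drop <- sample(nrow(d_miss), size = 0.4 * nrow(d_miss))
d_miss[drop, c("t3", "t4")] <- NA

# Completers only: lavaan deletes incomplete rows listwise by default
fit_cc <- growth(model, data = d_miss, estimator = "MLM")

# Full-information ML with robust corrections: uses every available row
fit_fiml <- growth(model, data = d_miss, missing = "fiml", estimator = "MLR")

lavInspect(fit_cc, "nobs")     # complete cases only
lavInspect(fit_fiml, "nobs")   # all rows
summary(fit_fiml, fit.measures = TRUE)
```

The number of observations reported at the top of the summary is a quick way to confirm that the incomplete cases were actually retained.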

Edo

Terrence Jorgensen

Jan 10, 2016, 6:56:08 AM
to lavaan
I thought that other SEM programs stop when they have negative variances (or covariances)

There is nothing impossible about negative covariances. EQS is the only software I am aware of that constrains variances to be positive by default. This paper provides insight into why that is not a good default:

 
I have done what you suggested, and I did found those highlighted variables to have negative SE variance from the bootstrap test. I have also tried to run the same analysis with FIML and MLR, and the results are much nicer.

Good news.  I imagine you were using WLS because you had Likert-type indicators.   FYI, robust ML provides approximately equivalent results to the more appropriate robust DWLS as long as there are at least 5 categories, but not so much with fewer categories.  This article provides good guidance about which estimator to prefer in different situations:


Terry

Mikko Rönkkö

Jan 11, 2016, 1:38:43 AM
to lav...@googlegroups.com
Hi,

On 10 Jan 2016, at 13:56 , Terrence Jorgensen <tjorge...@gmail.com> wrote:

I thought that other SEM programs stop when they have negative variances (or covariances)

There is nothing impossible about negative covariances.  EQS is the only software I am aware of that constrains variances to be positive by default.  This paper provides insight into why that is not a good default-idea:


Stata's sem command does that as well. In that case it is not just a default, but a constraint that cannot be removed.

Mikko


Edo Sebastian Jaya

Jan 11, 2016, 1:07:16 PM
to lavaan
Again, thank you so much for the references! I did not know that robust ML could be similarly appropriate to robust DWLS. This is very helpful!