latent growth curve model predicted values

Shimon Sarraf

Jan 14, 2019, 5:48:26 PM

to lavaan

Is it possible for lavaan to calculate predicted values (for observed or latent variables) for a growth curve model when one is using time varying and/or time invariant predictors? Based on my reading of some google group exchanges from a year ago, I don't believe so but wanted to confirm. If something has been developed, I'd greatly appreciate any suggested reference material.

Thank you in advance for your thoughts.

Shimon

Terrence Jorgensen

Jan 15, 2019, 8:05:22 AM

to lavaan

> Is it possible for lavaan to calculate predicted values (for observed or latent variables)

Yes for latent, but not observed.

> for a growth curve model when one is using time varying and/or time invariant predictors?

You can get factor scores even for endogenous factors (i.e., growth factors predicted by time-invariant predictors). But time-varying predictors wouldn't predict the growth factors, only the observed variables.
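A minimal sketch of extracting such factor scores, assuming lavaan's bundled Demo.growth data (t1-t4 are the repeated measures, x1 and x2 the time-invariant predictors, c1-c4 the time-varying covariates) rather than the data discussed in this thread:

```r
library(lavaan)

# Linear growth model on the Demo.growth example data (an assumption;
# these are not the RR variables from this thread).
model <- '
  i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
  s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4

  # time-invariant predictors of the growth factors
  i ~ x1 + x2
  s ~ x1 + x2

  # time-varying covariates predict the observed outcomes only
  t1 ~ c1
  t2 ~ c2
  t3 ~ c3
  t4 ~ c4
'

fit <- growth(model, data = Demo.growth)

# Factor scores: one row per case, one column per growth factor (i, s)
fscores <- lavPredict(fit, type = "lv")
head(fscores)
```

The model uses continuous outcomes and the default ML estimator, so it says nothing about how lavPredict() behaves with categorical indicators.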

Terrence D. Jorgensen

Assistant Professor, Methods and Statistics

Research Institute for Child Development and Education, the University of Amsterdam

http://www.uva.nl/profile/t.d.jorgensen

Shimon Sarraf

Jan 16, 2019, 1:57:32 PM

to lavaan

Thank you, Terrence! Your help is much appreciated. Unfortunately, when I include time-varying or time-invariant predictors, I get the following error message from lavPredict():

Error in Sigma.hat.inv[[g]] : subscript out of bounds

Any recommendations on how to address this?

Here's the code I'm using (with lavaan 0.6-3 and semTools 0.5-1):

RR.model.2 <- '
  # growth factors: intercept, linear slope, quadratic slope
  i  =~ 1*RR10 + 1*RR11 + 1*RR12 + 1*RR13 + 1*RR14 + 1*RR15 + 1*RR16 + 1*RR17 + 1*RR18
  s  =~ 0*RR10 + 1*RR11 + 2*RR12 + 3*RR13 + 4*RR14 + 5*RR15 + 6*RR16 + 7*RR17 + 8*RR18
  s2 =~ 0*RR10 + 1*RR11 + 4*RR12 + 9*RR13 + 16*RR14 + 25*RR15 + 36*RR16 + 49*RR17 + 64*RR18

  # time-invariant predictor
  i  ~ ZSR
  s  ~ ZSR
  s2 ~ ZSR

  # residual variances (constrained equal via the shared label r)
  RR10 ~~ r*RR10
  RR11 ~~ r*RR11
  RR12 ~~ r*RR12
  RR13 ~~ r*RR13
  RR14 ~~ r*RR14
  RR15 ~~ r*RR15
  RR16 ~~ r*RR16
  RR17 ~~ r*RR17
  RR18 ~~ r*RR18
'

#using 1 of 5 imputed data sets

RR.fit.2.a <- growth(RR.model.2, data = stackeddata[stackeddata$imp==1,], estimator = "WLSMV")

fscores.2.a <- lavPredict(RR.fit.2.a)

Terrence Jorgensen

Jan 16, 2019, 6:34:38 PM

to lavaan

The ?growth help-page example has both time-varying and time-invariant predictors, and it returns factor scores without error.

example(growth)
lavPredict(fit)

But that example has continuous indicators and uses ML estimation. You use WLSMV (i.e., DWLS estimation), so I presume you have categorical indicators, which could be the problem. Because categorical data do not have means, the intercepts are fixed to zero by default, so there cannot be any growth unless you constrain the thresholds to allow the intercepts to be identified (although still fixed to zero, so that they can be implied by the means of the growth factors). See this paper for details:

Mehta, P. D., Neale, M. C., & Flay, B. R. (2004). Squeezing interval change from ordinal panel data: Latent growth curves with ordinal outcomes. Psychological Methods, 9(3), 301-333. http://dx.doi.org/10.1037/1082-989X.9.3.301

Shimon Sarraf

Jan 22, 2019, 4:33:38 PM

to lavaan

Thank you once again, Terrence. I've attached some descriptive statistics of the variables I'm using to clarify the issues I'm dealing with in my final model. My outcome and some of the time-invariant covariates are continuous variables but my time-varying covariates are binary. Some of the time-invariant covariates are not normally distributed as you'll see in the attachment (skewed and kurtotic in some cases). The final model is actually more complex than the one I originally sent to you and the google group:

RR.model.3 <- '
  # growth factors: intercept, linear slope, quadratic slope
  i  =~ 1*RR10 + 1*RR11 + 1*RR12 + 1*RR13 + 1*RR14 + 1*RR15 + 1*RR16 + 1*RR17 + 1*RR18
  s  =~ 0*RR10 + 1*RR11 + 2*RR12 + 3*RR13 + 4*RR14 + 5*RR15 + 6*RR16 + 7*RR17 + 8*RR18
  s2 =~ 0*RR10 + 1*RR11 + 4*RR12 + 9*RR13 + 16*RR14 + 25*RR15 + 36*RR16 + 49*RR17 + 64*RR18

  # time-invariant predictors
  i  ~ Public + ZSize + ZFT + ZFemale + ZAA + ZLAT + ZSR
  s  ~ Public + ZSize + ZFT + ZFemale + ZAA + ZLAT + ZSR
  s2 ~ Public + ZSize + ZFT + ZFemale + ZAA + ZLAT + ZSR

  # time-varying covariates
  RR10 ~ Incent10
  RR11 ~ Incent11
  RR12 ~ Incent12
  RR13 ~ Incent13
  RR14 ~ Incent14
  RR15 ~ Incent15
  RR16 ~ Incent16
  RR17 ~ Incent17
  RR18 ~ Incent18
  RR15 ~ LMS15
  RR16 ~ LMS16
  RR17 ~ LMS17
  RR18 ~ LMS18

  # residual variances (constrained equal via the shared label r)
  RR10 ~~ r*RR10
  RR11 ~~ r*RR11
  RR12 ~~ r*RR12
  RR13 ~~ r*RR13
  RR14 ~~ r*RR14
  RR15 ~~ r*RR15
  RR16 ~~ r*RR16
  RR17 ~~ r*RR17
  RR18 ~~ r*RR18
'

RR.fit.3 <- growth.mi(RR.model.3, data = RR.data.amelia2.imps, estimator = "WLSMV", ordered=c(18:30))

I ended up using WLSMV because I got an error message when I used MLR with ordered variables:

Error in lav_options_set(opt) :

lavaan ERROR: estimator ML for ordered data is not supported yet. Use WLSMV instead.

That said, I think I may have made a mistake by identifying my binary time-varying covariates (Incent and LMS variables) as "ordered" in my code, and could in fact drop the "ordered=c(18:30)" syntax and use MLR. Does this approach make the most sense to you, or would you advise doing otherwise?

Take care,

Shimon

Attachment: Descriptive statistics for lavaan google group.docx

Terrence Jorgensen

Jan 25, 2019, 9:54:34 AM

to lavaan

> I think I may have made a mistake by identifying my binary time-varying covariates (Incent and LMS variables) as "ordered" in my code, and could in fact drop the "ordered=c(18:30)" syntax and use MLR. Does this approach make the most sense to you, or would you advise doing otherwise?

If the only binary/ordered variables in your model are exogenous, then yes. Set fixed.x=TRUE (should be the default) and you can use MLR (with FIML if necessary for incomplete data).
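As an illustration only, here is a sketch of that setup on lavaan's bundled Demo.growth data. The 0/1 recoding of c1-c4 is a made-up stand-in for binary time-varying covariates such as Incent and LMS; the point is that they enter as plain numeric predictors and are not listed in `ordered`:

```r
library(lavaan)

# Hypothetical stand-in data: dichotomize the time-varying covariates
# of Demo.growth so they behave like binary 0/1 predictors.
dat <- Demo.growth
tvc <- c("c1", "c2", "c3", "c4")
dat[tvc] <- lapply(dat[tvc], function(x) as.numeric(x > 0))

model <- '
  i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
  s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4

  # time-invariant predictors
  i ~ x1 + x2
  s ~ x1 + x2

  # binary time-varying covariates, treated as exogenous numeric variables
  t1 ~ c1
  t2 ~ c2
  t3 ~ c3
  t4 ~ c4
'

# No `ordered` argument: the outcomes are continuous and the binary
# variables are exogenous, so robust ML applies.
fit.mlr <- growth(model, data = dat,
                  estimator = "MLR",   # ML with robust SEs and scaled test statistic
                  missing   = "fiml",  # only matters when outcomes are incomplete
                  fixed.x   = TRUE)    # condition on the observed covariates

head(lavPredict(fit.mlr))
```

With fixed.x = TRUE, FIML handles missing values in the outcomes only, so cases with missing covariate values need separate treatment (for example, imputation).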

Shimon Sarraf

Feb 11, 2019, 11:45:04 PM

to lavaan

Prof. Jorgensen:

Thanks for your suggestion. I tried MLR, but the model fit results are sub-par (CFI/TLI < .9; RMSEA > .08), in contrast to the original WLSMV model, which has very good TLI and RMSEA results (though its CFI is still < .9 and less than optimal). Since I'm very interested in predicting latent variables (intercept, slope, and slope-squared), MLR is ideal, but the fit results are so poor that I'm concerned that any predicted values would be way off. The parameter estimates from the two estimators, for both fixed and random effects, are almost exactly alike, though. Additionally, unlike the WLSMV results, the MLR results come with warnings:

1: In computeOmega(Sigma.hat = Sigma.hat, Mu.hat = Mu.hat, lavsamplestats = lavsamplestats,  :
  lav_model_gradient: Sigma.hat is not positive definite

This message disappears, however, when I remove the time-invariant covariates as predictors of the latent slope variable. Here are the MLR model fit statistics and the explained variance for the observed variables and the latent intercept:

cfi.scaled  tli.scaled  rmsea.scaled   srmr
     0.833       0.825         0.098  0.113
> inspect(RR.fit.3.a, 'r2')
 RR10  RR11  RR12  RR13  RR14  RR15  RR16  RR17  RR18 intercept
0.829 0.832 0.842 0.852 0.866 0.864 0.869 0.874 0.881     0.488
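For anyone wanting to reproduce this kind of output, a minimal sketch follows. It assumes lavaan's bundled Demo.growth data and a simpler linear model, not the RR model above, so the numbers will differ:

```r
library(lavaan)

# Simple linear growth model on example data (an assumption; not RR.model.3)
model <- '
  i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
  s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4
  i ~ x1 + x2
  s ~ x1 + x2
'

fit <- growth(model, data = Demo.growth, estimator = "MLR")

# Scaled fit indices (available because MLR yields a scaled test statistic)
fit.idx <- fitMeasures(fit, c("cfi.scaled", "tli.scaled", "rmsea.scaled", "srmr"))
print(round(fit.idx, 3))

# R-squared for every endogenous variable, including the growth factors
r2 <- lavInspect(fit, "r2")
print(round(r2, 3))
```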

I'm uncertain about the best path forward so any other suggestions would be appreciated.

On a somewhat related issue, is there any lavaan function that provides an indication of fit for individual model-implied trajectories?

Take care, and thanks again!

Shimon

Shimon Sarraf

Feb 23, 2019, 8:53:31 PM

to lavaan

Prof. Jorgensen:

Any thoughts on the situation described in my previous message?

Shimon

Terrence Jorgensen

Feb 28, 2019, 10:09:03 PM

to lavaan

> the fit results are so poor that I'm concerned that any predicted values would be way off

You are wise to be concerned. The strength of inferences cannot exceed the validity of the model upon which they are based.

> is there any lavaan function that provides an indication of fit for individual model-implied trajectories?

No.
