latent growth curve model predicted values


Shimon Sarraf

Jan 14, 2019, 5:48:26 PM
to lavaan
Is it possible for lavaan to calculate predicted values (for observed or latent variables) for a growth curve model when one is using time varying and/or time invariant predictors? Based on my reading of some google group exchanges from a year ago, I don't believe so but wanted to confirm. If something has been developed, I'd greatly appreciate any suggested reference material.
Thank you in advance for your thoughts.
Shimon 

Terrence Jorgensen

Jan 15, 2019, 8:05:22 AM
to lavaan
> Is it possible for lavaan to calculate predicted values (for observed or latent variables)

Yes for latent, but not observed.

> for a growth curve model when one is using time varying and/or time invariant predictors?

You can get factor scores even for endogenous factors (i.e., predicted by time invariant predictors).  But time varying predictors wouldn't predict the growth factors, only the observed variables.
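
As a minimal sketch of the point above (not the original poster's model), the following uses lavaan's built-in Demo.growth data, where t1-t4 are the repeated measures and x1 and x2 are time-invariant covariates. The object names model, fit, and fscores are only illustrative.

```r
library(lavaan)

# Linear growth model with time-invariant predictors of the growth factors
model <- '
  i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
  s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4
  i ~ x1 + x2
  s ~ x1 + x2
'
fit <- growth(model, data = Demo.growth)

# Factor scores for the endogenous growth factors, one row per case
fscores <- lavPredict(fit, type = "lv")
head(fscores)
```

Each row of fscores holds one case's estimated intercept and slope, which already reflect that case's values on x1 and x2.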

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Shimon Sarraf

Jan 16, 2019, 1:57:32 PM
to lavaan
Thank you, Terrence! Your help is much appreciated. Unfortunately, when I include time-varying or time-invariant predictors, lavPredict() returns the following error message:
Error in Sigma.hat.inv[[g]] : subscript out of bounds
Any recommendations on how to address this?
Here's the code I'm using (with lavaan 0.6-3 and semTools 0.5-1):

RR.model.2 <- '
i =~ 1*RR10 + 1*RR11 + 1*RR12 + 1*RR13 + 1*RR14 + 1*RR15 + 1*RR16 + 1*RR17 + 1*RR18
s =~ 0*RR10 + 1*RR11 + 2*RR12 + 3*RR13 + 4*RR14 + 5*RR15 + 6*RR16 + 7*RR17 + 8*RR18
s2 =~ 0*RR10 + 1*RR11 + 4*RR12 + 9*RR13 + 16*RR14 + 25*RR15 + 36*RR16 + 49*RR17 + 64*RR18
i ~ ZSR 
s ~ ZSR 
s2 ~ ZSR
# residual variances
RR10~~r*RR10
RR11~~r*RR11
RR12~~r*RR12
RR13~~r*RR13
RR14~~r*RR14
RR15~~r*RR15
RR16~~r*RR16
RR17~~r*RR17
RR18~~r*RR18'

#using 1 of 5 imputed data sets
RR.fit.2.a <- growth(RR.model.2, data = stackeddata[stackeddata$imp==1,], estimator = "WLSMV")

fscores.2.a <- lavPredict(RR.fit.2.a)

Terrence Jorgensen

Jan 16, 2019, 6:34:38 PM
to lavaan
The ?growth help-page example has both time-varying and time-invariant predictors, and it returns factor scores without error.

example(growth)
lavPredict(fit)

But that example has continuous indicators and uses ML estimation.  You use DWLS, so I presume you have categorical indicators, which could be the problem.  Since categorical data do not have means, the intercepts are fixed to zero by default, so there cannot be any growth unless you constrain thresholds to allow intercepts to be identified (although fixed to zero so that they can be implied by means of growth factors).  See this paper for details:

Mehta, P. D., Neale, M. C., & Flay, B. R. (2004). Squeezing interval change from ordinal panel data: Latent growth curves with ordinal outcomes. Psychological Methods, 9(3), 301-333. http://dx.doi.org/10.1037/1082-989X.9.3.301

Shimon Sarraf

Jan 22, 2019, 4:33:38 PM
to lavaan
Thank you once again, Terrence. I've attached some descriptive statistics of the variables I'm using to clarify the issues I'm dealing with in my final model. My outcome and some of the time-invariant covariates are continuous variables but my time-varying covariates are binary. Some of the time-invariant covariates are not normally distributed as you'll see in the attachment (skewed and kurtotic in some cases). The final model is actually more complex than the one I originally sent to you and the google group: 

RR.model.3 <- '
# growth factors: intercept, linear slope, quadratic slope
i =~ 1*RR10 + 1*RR11 + 1*RR12 + 1*RR13 + 1*RR14 + 1*RR15 + 1*RR16 + 1*RR17 + 1*RR18
s =~ 0*RR10 + 1*RR11 + 2*RR12 + 3*RR13 + 4*RR14 + 5*RR15 + 6*RR16 + 7*RR17 + 8*RR18
s2 =~ 0*RR10 + 1*RR11 + 4*RR12 + 9*RR13 + 16*RR14 + 25*RR15 + 36*RR16 + 49*RR17 + 64*RR18
#time invariant predictors
i ~ Public + ZSize + ZFT + ZFemale + ZAA + ZLAT + ZSR
s ~ Public + ZSize + ZFT + ZFemale + ZAA + ZLAT + ZSR
s2 ~ Public + ZSize + ZFT + ZFemale + ZAA + ZLAT + ZSR
#time varying covariates
RR10~Incent10
RR11~Incent11
RR12~Incent12
RR13~Incent13
RR14~Incent14
RR15~Incent15
RR16~Incent16
RR17~Incent17
RR18~Incent18
RR15~LMS15
RR16~LMS16
RR17~LMS17
RR18~LMS18
# residual variances
RR10~~r*RR10
RR11~~r*RR11
RR12~~r*RR12
RR13~~r*RR13
RR14~~r*RR14
RR15~~r*RR15
RR16~~r*RR16
RR17~~r*RR17
RR18~~r*RR18'

RR.fit.3 <- growth.mi(RR.model.3, data = RR.data.amelia2.imps, estimator = "WLSMV", ordered=c(18:30))

I ended up using WLSMV because I got an error message when I used MLR with ordered variables:
Error in lav_options_set(opt) : 
  lavaan ERROR: estimator ML for ordered data is not supported yet. Use WLSMV instead.

This said, I think I may have made a mistake by identifying my binary time-varying covariates (Incent and LMS variables) as "ordered" in my code, and could in fact drop the "ordered=c(18:30)" syntax and use MLR. Does this approach make most sense to you or would you advise doing otherwise?

Take care,
Shimon 
Descriptive statistics for lavaan google group.docx

Terrence Jorgensen

Jan 25, 2019, 9:54:34 AM
to lavaan
> I think I may have made a mistake by identifying my binary time-varying covariates (Incent and LMS variables) as "ordered" in my code, and could in fact drop the "ordered=c(18:30)" syntax and use MLR. Does this approach make most sense to you or would you advise doing otherwise?

If the only binary/ordered variables in your model are exogenous, then yes.  Set fixed.x=TRUE (should be the default) and you can use MLR (with FIML if necessary for incomplete data).
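
A rough sketch of that setup, again on the built-in Demo.growth data rather than the poster's data: the binary covariates b1-b4 are hypothetical, created here by dichotomizing c1-c4 purely for illustration, and are left as numeric 0/1 columns instead of being declared ordered.

```r
library(lavaan)

dat <- Demo.growth
# Hypothetical binary time-varying covariates (illustration only)
for (k in 1:4) dat[[paste0("b", k)]] <- as.numeric(dat[[paste0("c", k)]] > 0)

model_tvc <- '
  i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
  s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4
  # time-invariant predictors
  i ~ x1 + x2
  s ~ x1 + x2
  # time-varying (binary, exogenous) covariates
  t1 ~ b1
  t2 ~ b2
  t3 ~ b3
  t4 ~ b4
'
# No ordered= argument: the binary variables are exogenous only
fit_mlr <- growth(model_tvc, data = dat, estimator = "MLR",
                  missing = "fiml", fixed.x = TRUE)

fitMeasures(fit_mlr, c("cfi.scaled", "tli.scaled", "rmsea.scaled", "srmr"))
fscores_mlr <- lavPredict(fit_mlr)
```

Note that with fixed.x = TRUE, FIML handles missing values on the outcomes only; cases with missing values on the exogenous covariates are still dropped.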

Shimon Sarraf

Feb 11, 2019, 11:45:04 PM
to lavaan
Prof. Jorgensen:
Thanks for your suggestion. I tried MLR, but the model fit results are sub-par (CFI/TLI < .9; RMSEA > .08), in contrast to the original WLSMV model fit results, which have very good TLI and RMSEA values (though CFI is still < .9 and less than optimal). Since I'm very interested in predicting latent variables (intercept, slope, and slope-squared), MLR is ideal, but the fit results are so poor that I'm concerned that any predicted values would be way off. The parameter estimates from the two estimators, for both fixed and random effects, are almost exactly alike though. Additionally, the MLR results come with warnings, unlike the WLSMV results:
1: In computeOmega(Sigma.hat = Sigma.hat, Mu.hat = Mu.hat, lavsamplestats = lavsamplestats,  :
  lav_model_gradient: Sigma.hat is not positive definite
This message disappears when I remove time-invariant covariates as predictors for the latent slope variable though. Here are MLR model fit statistics and explained variance results for the observed variables and intercept latent variable:

  cfi.scaled   tli.scaled rmsea.scaled         srmr
       0.833        0.825        0.098        0.113
> inspect(RR.fit.3.a, 'r2')
 RR10  RR11  RR12  RR13  RR14  RR15  RR16  RR17  RR18 intercept
0.829 0.832 0.842 0.852 0.866 0.864 0.869 0.874 0.881     0.488

I'm uncertain about the best path forward so any other suggestions would be appreciated. 

On a somewhat related issue, is there any lavaan function that provides an indication of fit for individual model-implied trajectories?

Take care, and thanks again!
Shimon

Shimon Sarraf

Feb 23, 2019, 8:53:31 PM
to lavaan
Prof. Jorgensen:
Any thoughts on the situation described below?
Shimon

Terrence Jorgensen

Feb 28, 2019, 10:09:03 PM
to lavaan
> the fit results are so poor that I'm concerned that any predicted values would be way off

You are wise to be concerned.  The strength of inferences cannot exceed the validity of the model upon which they are based.

> is there any lavaan function that provides an indication of fit for individual model-implied trajectories?

No.
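
Although no dedicated function exists, a rough person-level check can be assembled by hand from factor scores. The sketch below is only an illustration on the built-in Demo.growth data with an unconditional linear model; person_rmse is an ad hoc descriptive quantity, not an established fit index, and the calculation assumes no time-varying covariates and indicator intercepts fixed to zero (the growth() default).

```r
library(lavaan)

model_lin <- '
  i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
  s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4
'
fit_lin <- growth(model_lin, data = Demo.growth)

vars   <- c("t1", "t2", "t3", "t4")
fs     <- lavPredict(fit_lin)[, c("i", "s")]                  # factor scores
lambda <- lavInspect(fit_lin, "est")$lambda[vars, c("i", "s")] # fixed loadings

# Implied trajectory for each case: i + s * time
yhat <- fs %*% t(lambda)

# Root mean squared deviation between observed and implied scores, per case
obs <- as.matrix(Demo.growth[, vars])
person_rmse <- sqrt(rowMeans((obs - yhat)^2))
head(sort(person_rmse, decreasing = TRUE))
```

Cases with large values are those whose observed scores deviate most from their own implied trajectory; because the factor scores are themselves estimated from the same data, these values are best treated as a screening aid.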