Predict dependent/endogenous variables from new dataset using lavaan path model

L. Aguilar

Jul 28, 2020, 6:37:35 PM
to lavaan
Hi everyone,

I have constructed a path model with lavaan (regressions only, no latent variables) using a complete dataset (data1) with values for all variables (x's and y's). The R code I used is as follows:
# Specification
model1 <- 'y1 ~ x1 + x2 + x3 + x4 + y2
y2 ~ x1 + x2 + x4 + x5 + x6'
# Fit
fit1 <- sem(model = model1, data = data1, estimator = "MLR")

I now want to predict/estimate values for y1 and y2 using a new dataset (data2) which includes values for all x variables, but not for the y variables. In other words, I want to predict unknown values for the y variables using the known x variable values and my fitted path model estimates, similar to prediction in regular regression.

However, from my searches in this Google Group and elsewhere, the lavPredict() function is NOT built to do this (the CRAN description for lavPredict() explicitly states: "the goal of this function is NOT to predict future values of dependent variables as in the regression framework!"; also, here). My understanding is that the regular predict() function will also call lavPredict() for lavaan objects. Though it seems like there were plans to update lavPredict() to be able to conduct regression-style predictions (see here), I cannot find any documentation of these suggested updates being implemented.

Is there any way to predict (in a regression sense) values for my dependent variables given values of my independent variables from a new dataset using my lavaan path model (as described above)?

Apologies in advance if I somehow missed the answer to this question elsewhere. Thank you so much!

Terrence Jorgensen

Jul 30, 2020, 9:58:27 AM
to lavaan
> Is there any way to predict (in a regression sense) values for my dependent variables given values of my independent variables from a new dataset using my lavaan path model (as described above)?

Have you also found the examples on the forum of how to do this? https://groups.google.com/d/msg/lavaan/ftvD1Nxb4Iw/OBcY44gbBwAJ

It is not a trivial matter to provide predicted values of observed variables for SEMs in general, but some folks have been working on getting this functionality working for path models like yours.  You can see the progress here (and you could download the files to copy the syntax and use it now):


Eventually, this should be available in semTools (for path models), depending on when the contributor can find the time.

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam
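
For a recursive path model with observed variables only, like the one in the question, regression-style predictions can also be computed by hand from the fitted coefficients. The following is a minimal sketch, not the semTools implementation mentioned above. It assumes the model is fitted with meanstructure = TRUE so that intercepts are estimated; the simulated data and the helper name predict_eq are hypothetical and exist only to make the example runnable. Replace the simulated data1 and data2 with the real datasets.

```r
library(lavaan)

# Hypothetical simulated data standing in for data1 (complete) and data2 (x's only)
set.seed(1)
n <- 500
sim <- as.data.frame(matrix(rnorm(n * 6), n, 6,
                            dimnames = list(NULL, paste0("x", 1:6))))
sim$y2 <- with(sim, 0.5 + 0.4 * x1 - 0.3 * x2 + 0.2 * x4 + 0.1 * x5 + 0.3 * x6 + rnorm(n))
sim$y1 <- with(sim, 1.0 + 0.3 * x1 + 0.2 * x2 - 0.4 * x3 + 0.1 * x4 + 0.5 * y2 + rnorm(n))
data1 <- sim[1:400, ]
data2 <- sim[401:500, paste0("x", 1:6)]

# Same specification as in the question; meanstructure = TRUE requests intercepts
model1 <- 'y1 ~ x1 + x2 + x3 + x4 + y2
y2 ~ x1 + x2 + x4 + x5 + x6'
fit1 <- sem(model = model1, data = data1, estimator = "MLR", meanstructure = TRUE)

pe <- parameterEstimates(fit1)

# Hypothetical helper: intercept + sum(slope * predictor) for one equation
predict_eq <- function(pe, lhs, newdata) {
  slopes    <- pe[pe$lhs == lhs & pe$op == "~", ]
  intercept <- pe$est[pe$lhs == lhs & pe$op == "~1"]
  X <- as.matrix(newdata[, slopes$rhs, drop = FALSE])
  intercept + as.vector(X %*% slopes$est)
}

# Predict in causal order: y2 first, because y1 depends on y2
pred <- data2
pred$y2 <- predict_eq(pe, "y2", pred)
pred$y1 <- predict_eq(pe, "y1", pred)

head(pred[, c("y1", "y2")])
```

Because the predicted y2 is plugged into the y1 equation, pred$y1 is the model-implied expected value of y1 given the x's alone. This gives point predictions only; it says nothing about prediction intervals, and it does not carry over to models with latent variables or feedback loops.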
