On 07/28/2015 04:46 PM, Peter McHugh wrote:
> Everything works fine, however, the predicted values are inconsistent
> with what I would expect given estimated coefficients, new data values,
> etc.
Could you provide a small/simple artificial example (with as few
variables as possible) to explain why the predicted values seem
inconsistent?
I have the impression that in your model, there are no latent variables.
In that case, all that happens is applying the regression formula. This
is an example with three variables (x, m, y), where the model is a
simple mediation setup:
library(lavaan)
set.seed(1234)
x <- rnorm(1000)
m <- 5 + 1.2*x + rnorm(1000, 0, sd = 0.4)
y <- 2 + 0.8*m + rnorm(1000, 0, sd = 0.6)
Data <- data.frame(y,m,x)
model <- ' y ~ m + x
m ~ x '
fit <- sem(model, data = Data, meanstructure = TRUE)
summary(fit)
head(lavPredict(fit, type = "ov"))
# y m x
# [1,] 2.201717 3.531017 -1.2070657
# [2,] 2.316077 5.345494 0.2774292
# [3,] 2.378246 6.331893 1.0844412
# [4,] 2.114000 2.139284 -2.3456977
# [5,] 2.327763 5.530909 0.4291247
# [6,] 2.333690 5.624941 0.5060559
The 'formulas' in this case (i.e., without any latent variables) boil down
to this:
N <- nobs(fit); p <- 3 # p = number of observed variables (y, m, x)
X <- matrix(0, N, p); X[,3] <- fit@Data@X[[1]][,3]
BETA <- lavTech(fit, "est")$beta # regression coefficients
int <- lavTech(fit, "est")$alpha # intercepts
INT <- matrix(int, N, p, byrow = TRUE)
Yhat <- INT + X %*% t(BETA)
head(Yhat)
# [,1] [,2] [,3]
# [1,] 2.201717 3.531017 -0.0265972
# [2,] 2.316077 5.345494 -0.0265972
# [3,] 2.378246 6.331893 -0.0265972
# [4,] 2.114000 2.139284 -0.0265972
# [5,] 2.327763 5.530909 -0.0265972
# [6,] 2.333690 5.624941 -0.0265972
where the last column (the exogenous 'x') will be replaced by its
observed values.
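For readers without lavaan at hand, the same matrix formula can be sketched in base R. This is only an illustration under stated assumptions: the coefficients come from two separate lm() fits used as stand-ins (they are not lavaan output), the names fit_y, fit_m and vars are made up for this sketch, and the intercept of the exogenous x is taken to be its sample mean. Because the m column of X is left at zero, as in the snippet above, the y column only picks up the direct x term.

```r
set.seed(1234)
N <- 1000
x <- rnorm(N)
m <- 5 + 1.2 * x + rnorm(N, 0, sd = 0.4)
y <- 2 + 0.8 * m + rnorm(N, 0, sd = 0.6)

# Stand-in estimates from two lm() fits (an assumption; not lavaan output)
fit_y <- lm(y ~ m + x)
fit_m <- lm(m ~ x)

vars <- c("y", "m", "x")
p <- length(vars)

# BETA[i, j] holds the coefficient of variable j in the regression for variable i
BETA <- matrix(0, p, p, dimnames = list(vars, vars))
BETA["y", c("m", "x")] <- coef(fit_y)[c("m", "x")]
BETA["m", "x"] <- coef(fit_m)["x"]

# Intercepts; for the exogenous x the sample mean is used (an assumption)
int <- c(y = unname(coef(fit_y)["(Intercept)"]),
         m = unname(coef(fit_m)["(Intercept)"]),
         x = mean(x))

# Only the exogenous column of X is filled in
X <- matrix(0, N, p)
X[, 3] <- x

INT <- matrix(int, N, p, byrow = TRUE)
Yhat <- INT + X %*% t(BETA)
colnames(Yhat) <- vars

# The m column reproduces the fitted regression line for m
all.equal(unname(Yhat[, "m"]), unname(fitted(fit_m)))
head(Yhat)
```

The all.equal() call compares the m column with the fitted values from lm(m ~ x), which is a quick way to confirm that the matrix expression is nothing more than the regression formula applied row by row.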
Yves.