# lavaan Regression predictors p values different than the SPSS output


### Lazaros Gonidis

Oct 4, 2019, 5:18:08 AM
to lavaan
Hi everyone,

I am running a regression based on a covariance matrix with two predictors, PS and PC (script to follow).
Everything works as intended, but the p-values I get for PS (p = .082) and PC (p = .013) differ slightly from the SPSS values.
Any ideas on why this happens?

script:

# Y: DV
# PS, PC: predictors (k = 2 predictors, plus 1 intercept)

# N: number of observations (100)

N <- 100
k <-2

#define correlation matrix
cor_matrix =
matrix(
c(
1, 0.1, -0.2,
0.1, 1,  0.3,
-0.2, 0.3, 1
), nrow = 3, ncol = 3,
dimnames = list(
c("Y", "PS", "PC"),
c("Y", "PS", "PC")
))

# convert correlation matrix to covariance matrix for lavaan
library(lavaan)
sd_vector = c(1,1,1)
mean_vector = c(0,0,0)
cov_matrix = lavaan::cor2cov(cor_matrix, sd_vector)

# fit the model
fit <- sem( "Y ~ PS + PC",
sample.cov = cov_matrix,
sample.nobs = N,
meanstructure = TRUE,
sample.mean = mean_vector)

# get Rsquare
inspect(fit, "r2")
R2 <-inspect(fit, "r2")

# get F-statistic and p-value
fvalue <-((R2)*(N-k-1))/((1-R2)*(k))
cat("F = ", fvalue)
pvalue <- pf(fvalue, k, N-k-1, lower.tail = FALSE)
cat("p = ", pvalue)

# look at the regression
summary(fit, standardized = TRUE, rsquare = TRUE)

### Terrence Jorgensen

Oct 4, 2019, 6:19:07 AM
to lavaan
> I get slightly different p-values for PS (p = .082) and PC (p = .013) than the SPSS values.
> Any ideas on why this happens?

The point estimates are the same because OLS estimates are ML estimates when the OLS assumptions are met. The SEs differ, though: the ML estimate of the residual variance divides by N rather than N - k - 1, so the ML SEs are slightly smaller, which affects the t or Wald z ratio and the p values. Also, even if the SEs were identical, the p values would differ because ML inference is asymptotic, whereas OLS inference is not. Under OLS, the df of the t distribution can be derived, so the test does not rely on a Wald z statistic, which is normal only asymptotically, being treated as perfectly normal in finite samples.
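Both sources of the discrepancy can be checked by hand from the summary statistics in the question. The following is a minimal base-R sketch, not lavaan's internal code. It assumes unit SDs (as in the posted script) and that the ML standard errors are based on a residual variance that divides by N, with a normal reference distribution. The names `se_ml`, `se_ols`, `p_ml` and `p_ols` are made up for illustration.

```r
# Summary statistics from the question (all SDs are 1, so correlations = covariances)
N <- 100
k <- 2
R <- matrix(c( 1.0, 0.1, -0.2,
               0.1, 1.0,  0.3,
              -0.2, 0.3,  1.0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("Y", "PS", "PC"), c("Y", "PS", "PC")))

Rxx <- R[c("PS", "PC"), c("PS", "PC")]  # predictor correlations
rxy <- R[c("PS", "PC"), "Y"]            # predictor-outcome correlations

Rxx_inv <- solve(Rxx)
b  <- drop(Rxx_inv %*% rxy)  # slopes (identical under OLS and ML)
R2 <- sum(b * rxy)           # R-squared

# With unit SDs, the sampling variance of the slopes is
# (1 - R2) / denominator * diag(Rxx_inv):
# the ML residual variance uses N, the OLS one uses N - k - 1.
se_ml  <- sqrt((1 - R2) / N           * diag(Rxx_inv))
se_ols <- sqrt((1 - R2) / (N - k - 1) * diag(Rxx_inv))

# ML: Wald z test against the normal; OLS: t test with N - k - 1 df
p_ml  <- 2 * pnorm(abs(b / se_ml), lower.tail = FALSE)
p_ols <- 2 * pt(abs(b / se_ols), df = N - k - 1, lower.tail = FALSE)

round(cbind(b, se_ml, p_ml, se_ols, p_ols), 4)
```

If the assumptions hold, the `p_ml` column should come out close to the .082 and .013 reported in the question, and the `p_ols` column should be slightly larger, which is the direction of the difference one would expect in the SPSS output.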

FYI, you can simulate data that conform perfectly to your summary stats if you just want to use lm():

`simData <- MASS::mvrnorm(N, mu = mean_vector, Sigma = cov_matrix, empirical = TRUE)`

`summary(lm(Y ~ PS + PC, data = data.frame(simData))) # same slopes and R^2`
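To see how the lm() results relate to the ML-based p values, the OLS standard errors can be rescaled and tested against the normal distribution. This is only a sketch under the same assumptions as the thread's example (unit SDs, zero means). The objects `Sigma`, `ols`, `se_z` and `p_z` are illustrative names, and the rescaling factor sqrt((N - k - 1) / N) follows from the two residual-variance denominators.

```r
N <- 100
k <- 2
Sigma <- matrix(c( 1.0, 0.1, -0.2,
                   0.1, 1.0,  0.3,
                  -0.2, 0.3,  1.0),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("Y", "PS", "PC"), c("Y", "PS", "PC")))

# empirical = TRUE makes the sample means and covariances match exactly,
# so the seed only affects the individual data points, not the estimates
set.seed(1)
simData <- as.data.frame(
  MASS::mvrnorm(N, mu = c(0, 0, 0), Sigma = Sigma, empirical = TRUE)
)

# OLS coefficient table: estimates, SEs, t values, t-based p values
ols <- coef(summary(lm(Y ~ PS + PC, data = simData)))

# Rescale the OLS SEs to the ML denominator and use a normal reference
slopes <- c("PS", "PC")
se_z <- ols[slopes, "Std. Error"] * sqrt((N - k - 1) / N)
p_z  <- 2 * pnorm(abs(ols[slopes, "Estimate"] / se_z), lower.tail = FALSE)

round(cbind(p_t = ols[slopes, "Pr(>|t|)"], p_z = p_z), 4)
```

The `p_t` column holds the t-based p values that lm() reports, and `p_z` should be close to the Wald z p values from the question.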

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam
http://www.uva.nl/profile/t.d.jorgensen