lavaan Regression predictors p values different than the SPSS output

Lazaros Gonidis

Oct 4, 2019, 5:18:08 AM
to lavaan
Hi everyone,

I am running a regression based on a covariance matrix with two predictors, PS and PC (script to follow).
All is working as intended but I get slightly different p-values for PS (p = .082) and PC (p = .013) than the SPSS values.
Any ideas on why this happens?

Many thanks in advance.


# Y : Dv
# PS, PC predictors (k:2 predictors, plus 1 intercept)

# N contains 100 observations

N <- 100
k <- 2

# define correlation matrix
cor_matrix = matrix(c(
   1, 0.1, -0.2,
   0.1, 1,  0.3,
  -0.2, 0.3, 1
), nrow = 3, ncol = 3,
dimnames = list(
  c("Y", "PS", "PC"),
  c("Y", "PS", "PC")
))

# convert correlation matrix to covariance matrix for lavaan
sd_vector = c(1,1,1)
mean_vector = c(0,0,0)
cov_matrix = lavaan::cor2cov(cor_matrix, sd_vector)

# fit the model
library(lavaan)
fit <- sem( "Y ~ PS + PC", 
           sample.cov = cov_matrix,
           sample.nobs = N, 
           meanstructure = TRUE, 
           sample.mean = mean_vector)

# get Rsquare
R2 <- inspect(fit, "r2")
R2

# get F-statistic and p-value
fvalue <- (R2 * (N - k - 1)) / ((1 - R2) * k)
cat("F = ", fvalue)
pvalue <- pf(fvalue, k, N - k - 1, lower.tail = FALSE)
cat("p = ", pvalue)

# look at the regression
summary(fit, standardized = TRUE, rsquare = TRUE)

Terrence Jorgensen

Oct 4, 2019, 6:19:07 AM
to lavaan
I get slightly different p-values for PS (p = .082) and PC (p = .013) than the SPSS values.
Any ideas on why this happens?

The point estimates are the same because OLS estimates are ML estimates when OLS assumptions are met.  The SEs are not the same, though: the ML SEs are slightly smaller (the ML residual variance divides by N rather than by N - k - 1), which affects the t or Wald z ratio and the p values.  Also, even if the SEs were identical, the p values would differ because ML inference is asymptotic, whereas OLS inference is not.  In OLS, the df of the t distribution can be derived (N - k - 1 here), so the ratio is compared against a t distribution; ML instead relies on the Wald z statistic being normal, which holds only asymptotically and not exactly in finite samples.
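
To make those two sources of difference concrete, here is a minimal base-R sketch that works directly from the correlation matrix in the original post (unit SDs, as in the script). The object names (R, se_ols, se_ml, and so on) are illustrative, not lavaan's, and the ML standard errors are computed by hand on the assumption that lavaan's default normal-theory SEs for this model reduce to the usual closed form:

```r
# Summary statistics from the original post
N <- 100
k <- 2
vars <- c("Y", "PS", "PC")
R <- matrix(c(   1, 0.1, -0.2,
               0.1,   1,  0.3,
              -0.2, 0.3,    1),
            nrow = 3, byrow = TRUE, dimnames = list(vars, vars))

Rxx <- R[c("PS", "PC"), c("PS", "PC")]  # predictor correlations
rxy <- R[c("PS", "PC"), "Y"]            # predictor-outcome correlations

# Slopes (identical under OLS and ML) and R^2
beta <- solve(Rxx, rxy)
R2   <- sum(beta * rxy)

# Same formula for both SEs; only the residual-variance denominator differs
se_ols <- sqrt((1 - R2) / (N - k - 1) * diag(solve(Rxx)))  # OLS: N - k - 1
se_ml  <- sqrt((1 - R2) / N           * diag(solve(Rxx)))  # ML (assumed): N

# OLS refers the ratio to a t distribution, ML to a standard normal
p_ols <- 2 * pt(abs(beta / se_ols), df = N - k - 1, lower.tail = FALSE)
p_ml  <- 2 * pnorm(abs(beta / se_ml), lower.tail = FALSE)

round(cbind(beta, se_ols, p_ols, se_ml, p_ml), 3)
```

Under these assumptions the p_ml column should reproduce the p values reported in the question (.082 and .013), while p_ols comes out slightly larger, both because se_ols is larger and because the t reference distribution has heavier tails.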

FYI, you can simulate data that conform perfectly to your summary stats if you just want to use lm()

simData <- MASS::mvrnorm(N, mu = mean_vector, Sigma = cov_matrix,
                         empirical = TRUE) # sample moments match exactly
summary(lm(Y ~ PS + PC, data = data.frame(simData))) # same slopes and R^2
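
For readers who want to see what "conform perfectly" involves, the same idea can be sketched in base R without MASS. This is only an illustration under my own assumptions (the names Z, X and fit_ols are made up for the example): it draws random normal data, removes the sample means and sample covariance, and then imposes the target covariance matrix, so the sample moments match the summary statistics exactly rather than just in expectation.

```r
set.seed(1)  # the raw draws vary with the seed; the sample moments do not

N <- 100
vars <- c("Y", "PS", "PC")
cov_matrix <- matrix(c(   1, 0.1, -0.2,
                        0.1,   1,  0.3,
                       -0.2, 0.3,    1),
                     nrow = 3, byrow = TRUE, dimnames = list(vars, vars))

# Step 1: random draws, centered so every sample mean is exactly 0
Z <- scale(matrix(rnorm(N * 3), nrow = N), center = TRUE, scale = FALSE)

# Step 2: whiten, so the sample covariance is exactly the identity
Z <- Z %*% solve(chol(cov(Z)))

# Step 3: recolor, so the sample covariance is exactly cov_matrix
X <- Z %*% chol(cov_matrix)
colnames(X) <- vars

# OLS on the constructed data: same slopes and R^2 as the covariance-based
# fit, but with t-based standard errors and p values
fit_ols <- lm(Y ~ PS + PC, data = as.data.frame(X))
summary(fit_ols)
```

Because the data reproduce the covariance matrix exactly, the slopes and R^2 from lm() do not depend on the random seed; only the individual data points do.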

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Lazaros Gonidis

Oct 4, 2019, 7:08:01 AM
to lavaan
Many thanks Terrence,

especially for the simulating data part. I will look into that straight away.

All the best,
