observed sample matrix - implied matrix does not match the residual matrix


Michael Filsecker

May 4, 2025, 4:38:47 AM
to lav...@googlegroups.com
Dear colleagues,

I used the "Political Democracy" data to compare different results. Here is the model:
model <- '
  # measurement model
    f1 =~ x1 + x2 + x3
    f2 =~ y1 + y2 + y3 + y4
    
  # regressions
    f1 ~ f2
 '
Then I requested the model-implied variance/covariance matrix with "fitted(fit)" and the residual matrix with "resid(fit)".
However, when I compare the implied matrix from "fitted(fit)" with the sample covariance matrix calculated with "cov(data)", the differences I compute by hand are not the same as the ones in the residual matrix. Is there a possible reason for that?
I calculated the observed matrix as follows:
head(PoliticalDemocracy)
data <- PoliticalDemocracy[, c("x1", "x2", "x3", "y1", "y2", "y3", "y4")]
cov(data)

Thank you!!
Kind regards,
Michael.

Jeremy Miles

May 4, 2025, 6:29:16 PM
to lav...@googlegroups.com
You've been hit by the old "Do I divide by N, or N - 1, when calculating a covariance?" issue.

lavaan divides by N. The cov() function in R divides by N - 1. You need to rescale the covariance matrix; the code is shown below, and now they match.

library(dplyr)
library(lavaan)
data("PoliticalDemocracy")

d <- PoliticalDemocracy %>%
  dplyr::select(x1, x2, x3, y1, y2, y3, y4)


model <- '
  # measurement model
    f1 =~ x1 + x2 + x3
    f2 =~ y1 + y2 + y3 + y4
   
  # regressions
    f1 ~ f2
 '
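fitted <- lavaan::sem(model, d)
cov_fitted <- lavaan::fitted(fitted)$cov      # model-implied covariance matrix
resids_fitted <- lavaan::resid(fitted)$cov    # residuals: observed minus implied
cov(d)                                        # divides by N - 1, so it does not match
round(cov(d) * ((nrow(d) - 1) / nrow(d)), 3)  # rescale the covariance matrix to an N divisor
cov_fitted + resids_fitted                    # implied + residual = observed (N divisor)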
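
The two divisors can also be checked without lavaan. What follows is a minimal base-R sketch on made-up toy numbers (the matrix `x` and all variable names are illustrative assumptions, not part of the Political Democracy data): it builds the covariance matrix by hand with both denominators and confirms that multiplying cov() by (N - 1) / N gives the N version.

```r
# Toy data: 5 observations on 2 variables (made-up numbers, for illustration only)
x <- cbind(a = c(2, 4, 4, 6, 9), b = c(1, 3, 2, 5, 4))
n <- nrow(x)

# Centre each column, then form the sums of squares and cross-products
xc <- scale(x, center = TRUE, scale = FALSE)
ss <- crossprod(xc)

cov_unbiased <- ss / (n - 1)   # divisor N - 1, the convention of cov()
cov_ml       <- ss / n         # divisor N, the convention described for lavaan

# cov() should agree with the N - 1 version
same_as_cov <- isTRUE(all.equal(cov_unbiased, cov(x), check.attributes = FALSE))

# Rescaling cov() by (N - 1) / N should give the N version
rescaled_ok <- isTRUE(all.equal(cov(x) * (n - 1) / n, cov_ml,
                                check.attributes = FALSE))

print(c(same_as_cov = same_as_cov, rescaled_ok = rescaled_ok))
```

Using all.equal() rather than == avoids false mismatches from floating-point rounding when comparing the matrices.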

Jeremy


Michael Filsecker

May 6, 2025, 3:27:31 PM
to lav...@googlegroups.com
Thank you, Jeremy! Can this explain why, when using the HolzingerSwineford1939 dataset, I obtain VERY different results from these two commands?

Data9 <- HolzingerSwineford1939[,c("x1","x2","x3","x4","x5","x6","x7","x8","x9")]
S <- cov(Data9)
round(S, 3)

AND

inspect(fit, "sampstat")

Thank you!
Michael.



Jeremy Miles

May 6, 2025, 9:45:04 PM
to lav...@googlegroups.com

Same issue. cov() divides by (n - 1); lavaan divides by n.

> library(lavaan)
> 
> Data9 <- HolzingerSwineford1939[,c("x1","x2","x3","x4","x5","x6","x7","x8","x9")]
> S <- cov(Data9)
> round(S, 3)
      x1     x2    x3    x4    x5    x6     x7    x8    x9
x1 1.363  0.409 0.582 0.507 0.442 0.456  0.085 0.265 0.460
x2 0.409  1.386 0.453 0.210 0.212 0.248 -0.097 0.110 0.245
x3 0.582  0.453 1.279 0.209 0.113 0.245  0.089 0.213 0.375
[snip]
> 
> model <- '
+   # measurement model
+     f1 =~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9
+     '
> 
> fit <- lavaan::sem(model, data = Data9)
>     
> 
> inspect(fit, "sampstat")$cov * (nrow(Data9) / (nrow(Data9) - 1))
       x1     x2     x3     x4     x5     x6     x7     x8     x9
x1  1.363                                                        
x2  0.409  1.386                                                 
x3  0.582  0.453  1.279                                          
[snip]

Jeremy


Michael Filsecker

May 10, 2025, 4:11:07 AM
to lav...@googlegroups.com
Dear Alexander, dear Jeremy,

Thank you for your input. I am going to redo the analysis, because I believe the difference between the two matrices is so large that I am not sure it can be explained by the N versus N - 1 division. I will get back to you in any case! :)
Michael.

On Thu, May 8, 2025 at 23:09, Alexander Miles <al...@milesfamily.name> wrote:
Jeremy is correct! While base R (the cov function) gives you the sample covariance matrix (divides by n-1), lavaan gives you the maximum likelihood estimator of the covariance matrix (divides by n). 

To go from the sample (unbiased) covariance to the ML covariance, you can multiply by (n - 1) / n, which replaces the denominator n - 1 with n.

> n <- nrow(Data9)  # 301
> S_ml <- S * (n - 1) / n  # rescale
> round(S_ml, 3) - (inspect(fit, "sampstat")$cov)
   x1 x2 x3 x4 x5 x6 x7 x8 x9
x1  0
x2  0  0
x3  0  0  0
x4  0  0  0  0
x5  0  0  0  0  0
x6  0  0  0  0  0  0
x7  0  0  0  0  0  0  0
x8  0  0  0  0  0  0  0  0
x9  0  0  0  0  0  0  0  0  0
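
To judge how large a gap the N versus N - 1 divisor can produce, here is a rough base-R sketch of the rescaling factor alone. The sample size of 301 comes from the comment above; the sample size of 75 for PoliticalDemocracy is an assumption that should be checked with nrow(). The variable names are illustrative.

```r
# Sample sizes: 301 is quoted in this thread; 75 is an assumption for
# PoliticalDemocracy (verify with nrow(PoliticalDemocracy))
n <- c(PoliticalDemocracy = 75, HolzingerSwineford1939 = 301)

# Factor that turns the N - 1 covariance into the N covariance
shrink <- (n - 1) / n

# Every entry of the matrix changes by the same relative amount, 1 / n
rel_diff <- 1 - shrink
print(round(100 * rel_diff, 2))   # relative difference in percent

# Example: the x1 variance of 1.363 reported by cov() for HolzingerSwineford1939
x1_var_ml <- 1.363 * shrink[["HolzingerSwineford1939"]]
print(round(x1_var_ml, 3))
```

Because the divisor changes every variance and covariance by the same proportion, 1 / n, the two matrices should differ only slightly at these sample sizes.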
Alex