Hi All,
I encountered an interesting behavior of lavaan and would like to hear comments from others, including new users of lavaan.
Suppose, in a multiple group model, a user (a) imposes between-group equality constraints on regression paths, and (b) defines a parameter from these paths, e.g., the difference in effect:
``` r
# Adapted from the help page of cfa().
library(lavaan)
#> This is lavaan 0.6-15
#> lavaan is FREE software! Please report any bugs.
HS.model <- ' visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9
speed ~ c(a, a) * visual + c(b, b) * textual
amb := a - b'
```
This model syntax looks OK (I will discuss a problem with this model later).
The user-defined parameter is amb, a minus b.
These are the results (I added group.label on purpose, explained later):
```r
fit1 <- cfa(HS.model, data = HolzingerSwineford1939,
group = "school",
group.label = c("Pasteur", "Grant-White"))
parameterEstimates(fit1, standardized = TRUE, se = FALSE)[c(10, 11, 46, 47, 73), ]
#> lhs op rhs block group label est std.lv std.all std.nox
#> 10 speed ~ visual 1 1 a 0.236 0.359 0.359 0.359
#> 11 speed ~ textual 1 1 b 0.110 0.165 0.165 0.165
#> 46 speed ~ visual 2 2 a 0.236 0.289 0.289 0.289
#> 47 speed ~ textual 2 2 b 0.110 0.163 0.163 0.163
#> 73 amb := a-b 0 0 amb 0.127 0.194 0.194 0.194
```
The difference is .127, and the standardized difference is .194.
However, if we change the order of the groups, these are the results:
```r
fit2 <- cfa(HS.model, data = HolzingerSwineford1939,
group = "school",
group.label = c("Grant-White", "Pasteur"))
parameterEstimates(fit2, standardized = TRUE, se = FALSE)[c(10, 11, 46, 47, 73), ]
#> lhs op rhs block group label est std.lv std.all std.nox
#> 10 speed ~ visual 1 1 a 0.236 0.289 0.289 0.289
#> 11 speed ~ textual 1 1 b 0.110 0.163 0.163 0.163
#> 46 speed ~ visual 2 2 a 0.236 0.359 0.359 0.359
#> 47 speed ~ textual 2 2 b 0.110 0.165 0.165 0.165
#> 73 amb := a-b 0 0 amb 0.127 0.126 0.126 0.126
```
The unstandardized difference is still .127, as expected. However, the difference in the standardized solution is now .126. That is, this difference depends on the order of the groups.
The reason is that the value of amb in the standardized solution is computed from the standardized estimates of a single group, namely the first group in each fit:
```r
# In fit1, the std.all of 'amb' is computed using results from "Pasteur"
.359 - .165
#> [1] 0.194
# In fit2, the std.all of 'amb' is computed using results from "Grant-White"
.289 - .163
#> [1] 0.126
```
This phenomenon is not new and a similar one has been discussed here before.
However, the behavior of lavaan also makes sense: when computing amb (:= a - b), lavaan assumes that there is only one value for all parameters labelled a and one value for all parameters labelled b. This is true in the unstandardized solution but not in the standardized solution.
Users who are aware of this phenomenon (equal Bs do not imply equal betas because the SDs can differ) may already have noticed that the model defined above, though not wrong, can lead to misleading results in the standardized solution.
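To make the point concrete, here is a minimal base-R sketch. It assumes the usual relation beta = B * SD(x) / SD(y). The helper `std_beta` is hypothetical, and the SDs are made up for illustration; they are not estimates from the Holzinger-Swineford data.

```r
# Hypothetical helper: standardized coefficient from an unstandardized one,
# assuming beta = B * SD(x) / SD(y).
std_beta <- function(B, sd_x, sd_y) B * sd_x / sd_y

# The same unstandardized path in both groups (value taken from the example).
B <- 0.236

# Made-up SDs for two groups, for illustration only.
sd_x <- c(g1 = 0.90, g2 = 0.75)
sd_y <- c(g1 = 0.60, g2 = 0.62)

# One B, two different betas, because the SD ratio differs by group.
beta <- std_beta(B, sd_x, sd_y)
round(beta, 3)
```

Because the SD ratio differs between the two groups, a single B maps to two different betas, which is why one label cannot stand for one value in the standardized solution.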
To (a) impose between-group equality constraints on regression paths, (b) define a parameter from these paths, e.g., the difference in effects, *and* (c) estimate the differences in both the unstandardized solution and the standardized solution, the correct way to define the model is to use explicit equality constraints, rather than labels:
``` r
# Adapted from the help page of cfa().
library(lavaan)
#> This is lavaan 0.6-15
#> lavaan is FREE software! Please report any bugs.
HS.model <- ' visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9
speed ~ c(a1, a2) * visual + c(b1, b2) * textual
a1 == a2
b1 == b2
amb1 := a1 - b1
amb2 := a2 - b2'
fit1 <- cfa(HS.model, data = HolzingerSwineford1939,
group = "school",
group.label = c("Pasteur", "Grant-White"))
fit2 <- cfa(HS.model, data = HolzingerSwineford1939,
group = "school",
group.label = c("Grant-White", "Pasteur"))
parameterEstimates(fit1, standardized = TRUE, se = FALSE)[c(10, 11, 46, 47, 73, 74), ]
#> lhs op rhs block group label est std.lv std.all std.nox
#> 10 speed ~ visual 1 1 a1 0.236 0.359 0.359 0.359
#> 11 speed ~ textual 1 1 b1 0.110 0.165 0.165 0.165
#> 46 speed ~ visual 2 2 a2 0.236 0.289 0.289 0.289
#> 47 speed ~ textual 2 2 b2 0.110 0.163 0.163 0.163
#> 75 amb1 := a1-b1 0 0 amb1 0.127 0.194 0.194 0.194
#> 76 amb2 := a2-b2 0 0 amb2 0.127 0.126 0.126 0.126
parameterEstimates(fit2, standardized = TRUE, se = FALSE)[c(10, 11, 46, 47, 73, 74), ]
#> lhs op rhs block group label est std.lv std.all std.nox
#> 10 speed ~ visual 1 1 a1 0.236 0.289 0.289 0.289
#> 11 speed ~ textual 1 1 b1 0.110 0.163 0.163 0.163
#> 46 speed ~ visual 2 2 a2 0.236 0.359 0.359 0.359
#> 47 speed ~ textual 2 2 b2 0.110 0.165 0.165 0.165
#> 75 amb1 := a1-b1 0 0 amb1 0.127 0.126 0.126 0.126
#> 76 amb2 := a2-b2 0 0 amb2 0.127 0.194 0.194 0.194
```
Defined this way, we correctly obtain one estimate of the difference in effects in the unstandardized solution, which is the same across groups, and *two* estimates of the difference in effects in the standardized solution, which can differ across groups because the SDs of the variables involved are not constrained to be equal.
The difference between the unstandardized and the standardized solutions is not an issue here. This is well-known (equal Bs do not imply equal betas).
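As a quick cross-check, the two standardized differences can be reproduced by hand from the per-group rows. The sketch below copies the four regression rows from the fit1 output above into a data frame, so that it runs without refitting the model. The helper `diff_by_group` is hypothetical and not part of lavaan; in practice, `pe` could be the result of `parameterEstimates(fit1, standardized = TRUE)`.

```r
# Rows copied (rounded) from the parameterEstimates() output of fit1 above.
pe <- data.frame(
  lhs     = "speed",
  op      = "~",
  rhs     = c("visual", "textual", "visual", "textual"),
  group   = c(1, 1, 2, 2),
  est     = c(0.236, 0.110, 0.236, 0.110),
  std.all = c(0.359, 0.165, 0.289, 0.163)
)

# Hypothetical helper: (speed ~ visual) minus (speed ~ textual), per group.
diff_by_group <- function(pe, col) {
  sapply(split(pe, pe$group), function(d) {
    d[d$rhs == "visual", col] - d[d$rhs == "textual", col]
  })
}

diff_by_group(pe, "est")      # identical across groups
diff_by_group(pe, "std.all")  # one value per group
```

The std.all differences match amb1 and amb2 in fit1 (.194 and .126). The unstandardized difference comes out as .126 rather than .127 only because the inputs here are the rounded printed values.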
The issue I am interested in is a software design one.
Should lavaan compute the user-defined parameter (amb in the example) in the standardized solution (std.all), in cases like the one above?
That is, should an SEM program check whether the values it uses to compute a user-defined parameter (a and b, in this case), though constrained to be equal in the unstandardized solution, can take different values in the standardized solution, and hence refuse to compute the parameter in the standardized solution?
Or should users themselves have the responsibility to be careful when using labels if they are also going to read the results from the standardized solution?
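To illustrate what such a check might involve, here is a rough sketch of my own, not something lavaan does. It works on a data frame shaped like the parameterEstimates() output of the first model (rows copied from above) and flags labels that map to more than one value in a given column. The function name `ambiguous_labels` and the tolerance are assumptions.

```r
# Rows copied (rounded) from the fit1 output of the label-based model above.
pe <- data.frame(
  rhs     = c("visual", "textual", "visual", "textual"),
  group   = c(1, 1, 2, 2),
  label   = c("a", "b", "a", "b"),
  est     = c(0.236, 0.110, 0.236, 0.110),
  std.all = c(0.359, 0.165, 0.289, 0.163),
  stringsAsFactors = FALSE
)

# Hypothetical check: which labels take more than one value in column `col`?
ambiguous_labels <- function(pe, col, tol = 1e-6) {
  # Spread (max - min) of the values sharing each label.
  spread <- tapply(pe[[col]], pe$label, function(v) diff(range(v)))
  names(spread)[spread > tol]
}

ambiguous_labels(pe, "est")      # no label is ambiguous
ambiguous_labels(pe, "std.all")  # labels "a" and "b" are ambiguous
```

A program could use a check of this kind to warn about, or skip, a defined parameter in the standardized solution whenever one of the labels it uses is flagged.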
-- Shu Fai