fixed error variance for single indicator in CFA model with multiple and single indicators for constructs


Павел Валединский

Feb 4, 2020, 9:37:59 AM
to lavaan
Hi everyone,
I have a model with 9 variables and 3 constructs. The 1st and 2nd factors have 4 indicators each, but the 3rd factor (Fac3) has only 1 indicator (v9). I found in Kline's book that, for a model that mixes multiple-indicator and single-indicator constructs to be identified, the single indicator should have a fixed error variance. How can I fix the error variance of this single indicator to 20% in the syntax?
Model <- '
# measurement model
Fac1 =~ v1 + v2 + v3 + v4
Fac2 =~ v5 + v6 + v7 + v8
Fac3 =~ v9
# regressions
v10 ~ Fac1 + Fac2 + Fac3
'

Thank you 

Terrence Jorgensen

Feb 4, 2020, 3:14:41 PM
to lavaan
How can I fix the error variance of this single indicator to 20% in the syntax?

Are you saying the reliability of v9 is 0.8?  Then the ratio of the factor variance (psi3) to error variance (theta9) should be 0.8 / 0.2, and you can exploit this to specify a model constraint in your syntax:

Fac3 =~ 1*v9
Fac3 ~~ psi3*Fac3
v9 ~~ theta9*v9
## model constraint
theta9 == psi3 * 0.2 / 0.8 # implies theta9 / psi3 == 0.2 / 0.8
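
Putting these lines together with the model from the original post, a minimal sketch might look like the following. It assumes a reliability of 0.8 for v9 and a data frame with columns v1 to v10; the name `myData` is hypothetical, and the labels psi3 and theta9 are arbitrary. The fitting step is guarded, so the snippet only defines the syntax when lavaan or the data are not available.

```r
# Full model syntax: the single-indicator factor Fac3 gets a fixed loading,
# a labeled factor variance (psi3), and a labeled residual variance (theta9)
model <- '
  # measurement model
  Fac1 =~ v1 + v2 + v3 + v4
  Fac2 =~ v5 + v6 + v7 + v8
  Fac3 =~ 1*v9
  # labeled variances for the single-indicator construct
  Fac3 ~~ psi3*Fac3
  v9 ~~ theta9*v9
  # regressions
  v10 ~ Fac1 + Fac2 + Fac3
  # model constraint (assumes reliability of v9 is 0.8)
  theta9 == psi3 * 0.2 / 0.8
'

# Fit only if lavaan is installed and a data frame called myData
# (hypothetical name) exists in the workspace
if (requireNamespace("lavaan", quietly = TRUE) && exists("myData")) {
  fit <- lavaan::sem(model, data = myData)
  print(lavaan::parameterEstimates(fit))
}
```

In the parameter estimates, the ratio of the v9 residual variance to the Fac3 variance should then equal 0.2 / 0.8 = 0.25.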


Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Павел Валединский

Feb 5, 2020, 12:56:49 PM
to lavaan
Thank you very much! The model now behaves differently and gives appropriate estimates, different from the previous ones.

On Tuesday, February 4, 2020 at 11:14:41 PM UTC+3, Terrence Jorgensen wrote:

Valeria Ivaniushina

Feb 10, 2020, 10:12:30 AM
to lavaan
Hi Terrence,

This formula 

Fac3 =~ 1*v9
Fac3 ~~ psi3*Fac3 
v9 ~~ theta9*v9
## model constraint
theta9 == psi3 * 0.2 / 0.8 # implies theta9 / psi3 == 0.2 / 0.8

is different from the script that was discussed previously in many other threads on this forum:
F =~ x
x ~~ a*x,  where a = (1-Reliability)*variance of x

I am confused that I don't see the variance of the indicator in your formula.
Maybe I am missing something obvious?

Regards,
Valeria Ivaniushina

Terrence Jorgensen

Feb 11, 2020, 7:24:31 AM
to lavaan
I am confused that I don't see the variance of the indicator in your formula.
Maybe I am missing something obvious?

Not necessarily obvious, but they are equivalent.  Specifying the observed variance of the indicator allows you to explicitly set the constraint on the residual variance as a numeric value.  The formula above implicitly makes the same constraint, by constraining the ratio of psi / theta to be the ratio of reliability / unreliability.  A little algebra shows that this is equivalent to saying the total variance is equal to the sum of those 2 variance components.  

The limitation of the implicit approach in this thread is that it relies on the model-implied rather than the observed indicator variance.  But with a single-indicator construct, I don't see a lot of room for such misspecification that would make the observed and model-implied indicator variance differ.
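
To make the algebra concrete, here is a small numeric check in base R. The values are hypothetical and chosen only for illustration: a reliability of 0.8 and a total indicator variance of 2.5. If the total variance is psi + theta and theta is constrained to psi * (1 - reliability) / reliability, then psi must equal reliability * variance, and theta comes out the same as in the explicit formula.

```r
# Hypothetical values for illustration only
rel   <- 0.8   # assumed reliability of the indicator
var_x <- 2.5   # hypothetical total variance of the indicator

# Explicit approach: fix the residual variance to a number
theta_explicit <- (1 - rel) * var_x

# Implicit approach: var_x = psi + theta and theta = psi * (1 - rel) / rel.
# Solving these two equations gives psi = rel * var_x.
psi            <- rel * var_x
theta_implicit <- psi * (1 - rel) / rel

all.equal(theta_explicit, theta_implicit)  # TRUE
all.equal(psi + theta_implicit, var_x)     # TRUE
```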

Valeria Ivaniushina

Feb 11, 2020, 10:08:33 AM
to lavaan
Thank you Terrence, this point is clear now. Two follow-up questions:

1) Is it possible to express your formula in the Mplus language?

2) On the Statmodel forum Linda said: "With categorical outcomes, residual variances are not parameters in the model. I would not recommend trying to correct these variables using reliability".  
But including a single indicator as a manifest variable (or as a latent variable with error fixed to 0, as lavaan does by default) would mean that we assume this indicator to be perfectly measured, right? Is that a valid assumption?  Kenneth Bollen, Les Hayduk, and some others are clearly against this practice.

Valeria

Terrence Jorgensen

Feb 12, 2020, 4:48:55 AM
to lavaan
1) Is it possible to express your formula in the Mplus language?

Yes, you can use the NEW command to name the new variable, then define it below (I think in a MODEL CONSTRAINT command).  You can search for examples in the user guide or forum, especially in the mediation models, for defining new parameters.
 
2) On the Statmodel forum Linda said: "With categorical outcomes, residual variances are not parameters in the model.

Well, they are if you use parameterization = "theta", although they are nonetheless fixed (to 1) by default for identification.  But the issue with identification is that with a single continuous indicator, you only have 1 piece of observed information (the variance), so you can only estimate 1 piece of information (either the factor variance or loading, fixing the other to 1).  With a categorical indicator, you don't have an observed variance.  You are fitting the model to a polychoric correlation matrix, in which the (total) variances are fixed to 1.  So you cannot estimate anything in a single-indicator construct.

Using the default parameterization = "delta", the residual variances are fixed to 1 minus the common-factor variance for identification, such that the model-implied total variances == 1.  So the trick is simply to fix the factor loading to the square-root of the reliability, in which case the residual variance will be set to 1 minus the reliability.  For example, suppose the reliability of "u4" is 0.64 (however that is supposed to have been estimated...), the square-root of which is 0.8:

library(lavaan)

myData <- read.table("http://www.statmodel.com/usersguide/chap5/ex5.16.dat")
names(myData) <- c("u1","u2","u3","u4","u5","u6","x1","x2","x3","g")
model <- '
f1 =~ u1 + u2 + u3
f2 =~ 0.8*u4 # reliability == 0.64
'

summary(cfa(model, data = myData, ordered = paste0("u", 1:4), std.lv = TRUE))

Notice that the residual variance in the summary is fixed to 1 - 0.64 = 0.36.

But including a single indicator as a manifest variable (or as a latent variable with error fixed to 0, as lavaan does by default) would mean that we assume this indicator to be perfectly measured, right? Is that a valid assumption?  Kenneth Bollen, Les Hayduk, and some others are clearly against this practice.

It's up to you.  I agree it is highly dubious to assume it is error-free, but it is also suspicious to use a reliability estimate for a single categorical item.  When we calculate scale reliability, it is the reliability of the composite (e.g., a scale sum or scale mean), not the reliability of each individual scale item.  If you are using a single categorical indicator, I doubt it is a composite of many items.  There is nothing wrong with choosing a few different values (including perfect reliability) and comparing results as a sensitivity analysis to different reliability assumptions.
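
A minimal sketch of such a sensitivity analysis, assuming the default delta parameterization and std.lv = TRUE: for each assumed reliability, the loading of u4 is fixed to its square root. The grid of reliabilities is arbitrary, and the fitting step is guarded so that it only runs when lavaan is installed and a data frame named `myData` (with columns u1 to u4) exists.

```r
# Arbitrary grid of assumed reliabilities for u4 (1 = perfect reliability)
rels <- c(0.5, 0.64, 0.8, 1)

# Under the delta parameterization with std.lv = TRUE:
# loading = sqrt(reliability), implied residual variance = 1 - reliability
grid <- data.frame(reliability = rels,
                   loading     = sqrt(rels),
                   residual    = 1 - rels)
print(grid)

# One model syntax string per assumed reliability
models <- sprintf("f1 =~ u1 + u2 + u3\nf2 =~ %.6f*u4", grid$loading)

# Fit each model only if lavaan and the data are available
if (requireNamespace("lavaan", quietly = TRUE) && exists("myData")) {
  fits <- lapply(models, lavaan::cfa, data = myData,
                 ordered = paste0("u", 1:4), std.lv = TRUE)
  # Compare the f1-f2 covariance row across the reliability assumptions
  print(lapply(fits, function(f) {
    pe <- lavaan::parameterEstimates(f)
    pe[pe$op == "~~" & pe$lhs == "f1" & pe$rhs == "f2", ]
  }))
}
```

If the substantive conclusions are stable across the grid, the exact reliability assumed for the single indicator matters less.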

Valeria Ivaniushina

Feb 12, 2020, 10:34:23 AM
to lav...@googlegroups.com
Thank you very much, Terrence!
It's very helpful.
Valeria


Ling

Dec 22, 2024, 3:11:41 AM
to lavaan
Dear Jorgensen,

May I ask how you obtained the reliability of 0.64 here? I face the same issue, with f2 =~ u4. I tried to estimate the reliability for f2, but it seems that reliability is defined for variables with more than one item. How can I estimate the reliability of a latent variable f2 measured by only one item, u4? Thank you so much!

Looking forward to your response. 

Best regards,

Ling

f2 =~ 0.8*u4 # reliability == 0.64