Correcting for Measurement Error in Path Models

Tim

Aug 3, 2017, 5:16:23 PM
to lavaan
Hi,
I am somewhat new to lavaan, and trying to determine the appropriate way to apply Bollen's (1989) (1-alpha)*(variance) correction for measurement error within a path model using lavaan. I am using lavaan to run an Actor-Partner Interdependence Model path analysis (Kenny, Kashy, & Cook, 2006). My predictors are single items, and my outcomes are composites based on multi-item scales; the outcomes are the ones I am applying the error corrections to. I am using path analysis due to my relatively small sample size (75 dyads). I initially ran the following model to apply the measurement error corrections in my analysis:

#
# Approach 1 - applying error correction directly to the observed composite
#
library(lavaan)
modelData <- read.csv(file=..., header = TRUE)
model <- "
DVmale ~ X1male + X2male + X1female + X2female
DVfemale ~ X1female + X2female + X1male + X2male
DVmale ~~ 0.1528028*DVmale
DVfemale ~~ 0.2171088*DVfemale
DVmale ~~ DVfemale
"
result <- sem(model, data = modelData, estimator = "ML", fixed.x = FALSE, missing = "FIML")
summary(result, fit.measures = TRUE, standardized = TRUE)


When I run the model using Approach 1, most of the regression paths are significant and in the theoretically predicted direction. However, the model fit is very poor - in fact, unrealistically so (RMSEA > 2, CFI = 0, Chi-square/df ratio > 400, etc.). Although there is evidence that model fit statistics can be biased toward model rejection when degrees of freedom and sample size are low (Kenny, Kaniskan, & McCoach, 2015), as they were in this study (n = 75, df = 2), these fit measures seem so off-base to me that I am concerned about the results as a whole. If I were not applying error corrections at all, I would have a fully saturated model, and therefore (trivially) perfect fit. But the addition of the error corrections adds 2 df to the model, putting me in a small-df situation for estimating model fit. Following Kenny (http://davidakenny.net/kkc/c7/c7.htm), I am essentially using the APIM as a "regression hack" to account for the fact that scores for male and female members of couples are non-independent - model fit per se is not my primary concern, but rather the significance and direction of the regression coefficients. That said, I began to wonder whether I was applying the error correction incorrectly. So I tried the approach below, which explicitly models the outcomes as latent variables with single indicators.

#
# Approach 2 - Applying error correction to single indicators of latent outcome variables
#
library(lavaan)
modelData <- read.csv(file=..., header = TRUE)
model <- "
DVmale_Latent =~ 1.0*DVmale_Observed
DVfemale_Latent =~ 1.0*DVfemale_Observed
DVmale_Latent ~ X1male + X2male + X1female + X2female
DVfemale_Latent ~ X1female + X2female + X1male + X2male
DVmale_Observed ~~ .13106046*DVmale_Observed
DVfemale_Observed ~~ .1571883984*DVfemale_Observed
DVmale_Latent ~~ DVfemale_Latent
"
result <- sem(model, data = modelData, estimator = "ML", fixed.x = FALSE, missing = "FIML")
summary(result, fit.measures = TRUE, standardized = TRUE)

With Approach 2 above, I get a saturated model and the fit issue resolves (at least, in the sense of a saturated model fitting "perfectly"). But now none of the regression paths are significant. Although I have read a good bit about error corrections in SEM, I am a bit confused whether I am applying them appropriately in lavaan, as well as the more general question of which of the two approaches above is correct (or maybe neither of them)? I'd appreciate any insights.
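
For readers following along, the fixed values in both approaches come from the (1-alpha)*(variance) rule mentioned at the top of the post. A minimal base-R sketch of that arithmetic is below. The item data are simulated and the helper name `cronbach_alpha` is hypothetical; nothing here uses the actual study data.

```r
# Sketch: compute (1 - alpha) * variance for a composite score.
# All data are simulated; `cronbach_alpha` is a hypothetical helper.

cronbach_alpha <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)                      # number of items in the scale
  item_vars <- apply(items, 2, var)     # variance of each item
  total_var <- var(rowSums(items))      # variance of the summed score
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

set.seed(42)
n <- 75                                 # same number of dyads as in the post
true_score <- rnorm(n)
# Four simulated items, each equal to the true score plus independent noise
items <- sapply(1:4, function(i) true_score + rnorm(n, sd = 0.7))

composite <- rowMeans(items)            # the composite that enters the path model
alpha <- cronbach_alpha(items)
error_var <- (1 - alpha) * var(composite)

# `error_var` is the number that would be fixed in a line such as
#   DVmale_Observed ~~ <error_var>*DVmale_Observed
print(c(alpha = alpha, error_var = error_var))
```

Note that the variance in the formula has to be the variance of the same composite that is analyzed (mean score or sum score), since the two differ by a scale factor.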


Edward Rigdon

Aug 3, 2017, 8:14:50 PM
to lav...@googlegroups.com
     OK, I'm embarrassed--this took me wayyyy too long to figure out.
     In model 1, you are conflating "measurement error" (as it is simplistically understood) with "structural error," the variance in the dependents not accounted for by the predictors. In model 1, you are constraining R2 for the two dependents, and that is your problem. You are forcing your predictors to explain a specific fraction of the dependents' total variance, and the predictors are not doing that. Hence the lack of fit.
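
The R2 constraint can be written out as a short derivation. This is a sketch only: the symbols (Y for one outcome, beta for its slopes, Sigma_X for the predictor covariance matrix, psi for the regression residual variance) are introduced here for illustration, and the last line assumes the model-implied variance of Y matches its observed variance.

```latex
\begin{align*}
\operatorname{Var}(Y) &= \beta^{\top}\Sigma_{X}\,\beta + \psi
  && \text{(explained variance plus regression residual variance)} \\
\psi &= (1-\alpha)\,\operatorname{Var}(Y)
  && \text{(the value fixed in model 1)} \\
R^{2} &= 1 - \frac{\psi}{\operatorname{Var}(Y)} = 1 - (1-\alpha) = \alpha
\end{align*}
```

Under that assumption the predictors would have to explain a share of variance equal to the reliability itself, which is far more than they typically do.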
     In model 2, you are separating what is called "measurement error" from "structural error." In model 2, the factors have "structural errors," and those are free to vary, depending on the predictive success of the predictors. The two dependent observed variables have their "measurement errors" constrained, which is what (I think) you intended. You can't achieve this result (through this kind of modeling) without creating the factors. The constraints on the "measurement errors" do not affect fit because those parameters would not be identified anyway--they have to be fixed to some values, and it would be hard to find values such that the chi-square and other fit indices will complain about them. (Try it--double or triple your values and see what happens.)
     You *could* achieve the same thing by running the model without constraints but then adjusting all results for the reliability estimates, but since you have a method that does the job, who needs another one?
     Your model 1 gains 2 DF and poor fit because you are constraining the structural errors for the dependents. Your model 2 has 0 DF because the two structural errors are free to conform to the data.
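
The DF bookkeeping can be checked by hand: 6 observed variables give 21 variances and covariances. Model 1 frees 19 parameters (8 slopes, 10 predictor variances and covariances, 1 residual covariance), leaving 2 DF; model 2 also frees the 2 structural error variances, leaving 0. The sketch below reproduces this on simulated data. The data-generating values and the fixed error variance of 0.15 are assumptions made up for the illustration, not values from the thread.

```r
library(lavaan)

# Simulated dyad data (hypothetical values); variable names mirror the thread.
set.seed(1)
n <- 500
X1male <- rnorm(n); X2male <- rnorm(n)
X1female <- rnorm(n); X2female <- rnorm(n)
shared <- rnorm(n, sd = 0.3)            # induces correlated structural errors
true_m <- 0.4 * X1male + 0.2 * X2male + 0.1 * X1female + 0.1 * X2female +
  shared + rnorm(n, sd = 0.4)
true_f <- 0.4 * X1female + 0.2 * X2female + 0.1 * X1male + 0.1 * X2male +
  shared + rnorm(n, sd = 0.4)
dat <- data.frame(X1male, X2male, X1female, X2female,
                  DVmale   = true_m + rnorm(n, sd = sqrt(0.15)),
                  DVfemale = true_f + rnorm(n, sd = sqrt(0.15)))

# Model 1: the fixed value lands on the regression (structural) residuals.
m1 <- '
  DVmale   ~ X1male + X2male + X1female + X2female
  DVfemale ~ X1female + X2female + X1male + X2male
  DVmale   ~~ 0.15*DVmale
  DVfemale ~~ 0.15*DVfemale
  DVmale   ~~ DVfemale
'

# Model 2: the fixed value lands on the indicator (measurement) errors.
m2 <- '
  LDVmale   =~ 1*DVmale
  LDVfemale =~ 1*DVfemale
  LDVmale   ~ X1male + X2male + X1female + X2female
  LDVfemale ~ X1female + X2female + X1male + X2male
  DVmale    ~~ 0.15*DVmale
  DVfemale  ~~ 0.15*DVfemale
  LDVmale   ~~ LDVfemale
'

# Same as model 2 but with the fixed measurement error variance doubled.
m2_doubled <- gsub("0.15", "0.30", m2, fixed = TRUE)

fit1 <- sem(m1, data = dat, fixed.x = FALSE)
fit2 <- sem(m2, data = dat, fixed.x = FALSE)
fit2_doubled <- sem(m2_doubled, data = dat, fixed.x = FALSE)

# Compare degrees of freedom and chi-square across the three fits.
print(sapply(list(model1 = fit1, model2 = fit2, model2_doubled = fit2_doubled),
             fitMeasures, fit.measures = c("df", "chisq")))
```

The doubled variant is the "try it" experiment suggested above: the fit should be unaffected, while the structural error variances and standardized estimates shift to absorb the change.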

Tim

Aug 3, 2017, 9:56:38 PM
to lavaan
Ah.... Thanks so much, Ed. That makes a lot of sense. I am even more embarrassed that it took me so long to see that. Looks like I don't have a paper with this dataset - but, I'd much rather pull the plug on a spurious finding than put it out there into the literature. Thanks again.

- Tim

Mikko Rönkkö

Aug 4, 2017, 1:19:25 AM
to lav...@googlegroups.com
Hi,

On 4 Aug 2017, at 24:16 , Tim <worl...@gmail.com> wrote:

#
# Approach 1 - applying error correction directly to the observed composite
#
library(lavaan);
modelData <-read.csv(file=..., header = TRUE);
model<-"
DVmale ~ X1male + X2male + X1female + X2female
DVfemale ~ X1female+ X2female + X1male + X2male
DVmale ~~ 0.1528028*DVmale
DVfemale ~~ 0.2171088*DVfemale
DVmale ~~ DVfemale
";
result<-sem(model, data=modelData, estimator = "ML", fixed.x=FALSE, missing = "FIML");
summary(result, fit.measures=TRUE, standardized=TRUE)

Your model constrains the error terms of the regressions, not the measurement errors. Doing the measurement error correction does not change the DF of the model. The proper way to correct for measurement error is to add a latent variable and fix the error variance of its single indicator to a known value:

LDVmale ~ X1male + X2male + X1female + X2female
LDVfemale ~ X1female + X2female + X1male + X2male

LDVmale =~ DVmale
DVmale ~~ 0.1528028*DVmale

LDVfemale =~ DVfemale
DVfemale ~~ 0.2171088*DVfemale

LDVmale ~~ LDVfemale

But it is not clear why you would want to use the kind of model that you are using. If you use alpha as a reliability measure, you must assume that a factor model holds for the indicators. Because you must make this assumption, it would be more straightforward (and rigorous) to just specify LDVfemale and LDVmale as factors in the model without creating composites of their indicators. Errors-in-variables models do not do much to help with sample size issues.
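
A minimal sketch of what that full latent-variable specification could look like is below. The item names (m1-m3, f1-f3), the three-items-per-scale layout, and all population values are hypothetical and serve only to simulate example data; the real scales may have a different number of items.

```r
library(lavaan)

# Population model used only to simulate illustrative data (hypothetical values).
pop_model <- '
  LDVmale   =~ 0.8*m1 + 0.7*m2 + 0.6*m3
  LDVfemale =~ 0.8*f1 + 0.7*f2 + 0.6*f3
  LDVmale   ~ 0.4*X1male + 0.2*X2male + 0.1*X1female + 0.1*X2female
  LDVfemale ~ 0.4*X1female + 0.2*X2female + 0.1*X1male + 0.1*X2male
  LDVmale   ~~ 0.3*LDVfemale
'
set.seed(123)
dat <- simulateData(pop_model, sample.nobs = 300)

# Analysis model: the outcomes are factors measured by their own items,
# so no composite and no externally fixed error variance is needed.
apim_latent <- '
  LDVmale   =~ m1 + m2 + m3
  LDVfemale =~ f1 + f2 + f3
  LDVmale   ~ X1male + X2male + X1female + X2female
  LDVfemale ~ X1female + X2female + X1male + X2male
  LDVmale   ~~ LDVfemale
'
fit <- sem(apim_latent, data = dat, fixed.x = FALSE)
summary(fit, fit.measures = TRUE, standardized = TRUE)
```

Unlike the single-indicator version, this model has positive degrees of freedom, so its fit statistics carry information about whether the factor model assumed by alpha actually holds.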

Mikko



Tim

Aug 4, 2017, 9:51:14 AM
to lavaan
Thanks, Mikko.