What is the correct way to define the covariance when measurement error is defined


Shajar

Nov 7, 2019, 8:52:46 AM
to lavaan
Hi,
I have a model without latent variables, but I do have an estimate of the measurement error. So for each measure I define a single-indicator latent variable (the _L suffix in the code below), and the model itself (under #regression) is specified in terms of these latent variables. At the end I fix each measurement error variance to the value I calculated.
My question is about how to define a covariance between two variables. What is the correct way: to define the covariance between the latent variables or between the original measured variables?
 
mod.2o <-'
# Latent variable  
Chl_L =~ Chl
Sh_Chl_L =~ Sh_Chl
PP_Outlet_L =~ PP_Outlet
PP_td_in_L =~ PP_td_in
NO3_Outlet_L =~ NO3_Outlet
NO3_td_in_L =~ NO3_td_in
TNdif_L =~ TNdif
Temp_L =~ Temp

#Regression
Chl_L ~  Sh_Chl_L + Temp_L   
PP_Outlet_L ~ PP_td_in_L  + Chl_L
TNdif_L ~ NO3_Outlet_L + NO3_td_in_L + Chl_L
NO3_Outlet_L ~ Temp_L + NO3_td_in_L

#Covariances - this is what I think is correct to define:
PP_Outlet_L ~~  TNdif_L
# or should I define: PP_Outlet ~~  TNdif ??

#measurement error variances
Chl ~~ 0.24*Chl
Sh_Chl ~~ 0.2*Sh_Chl
PP_Outlet ~~ 0.33*PP_Outlet
PP_td_in ~~ 0.64*PP_td_in
NO3_Outlet ~~ 0.055*NO3_Outlet
NO3_td_in ~~ 0.052*NO3_td_in
TNdif ~~ 0.01*TNdif
Temp ~~ 0.02*Temp
'
# Fit model
mod.2o.fit <- sem(mod.2o, dat3, estimator = "MLR", missing = "ML.X")


Nickname

Nov 8, 2019, 10:18:38 AM
to lavaan
Shajar,
  The correct way to specify the covariance is the one that is consistent with what you are modeling.  As such, it might vary from study to study.   The key question to ask yourself is about local independence:  If you could hold the latent variables constant, would you still expect the observed variables (focusing on the pair in question) to covary?

If not, then you just have random measurement error separating the observed variables from the latent variables, and the covariance should be modeled at the latent level because it involves the latent attributes themselves.  In your case, it looks like it reflects omitted common causes. 

Conversely, if so, then the covariance is not related to the attributes but is just an artifact of the two observed variables (e.g., common wording in self-report items).  In that case, the covariance should be modeled at the observed level.

Of course, it is possible for both phenomena to be present in a given set of data.  In that case, you would need a research design sufficient to allow you to disentangle the covariances at the two levels.
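
As an illustration only, the two specifications can be sketched on a reduced, simulated example. Everything here is hypothetical: the variable names X, Y1 and Y2, the simulated data, and the fixed error variances are not taken from the original model. As far as I know, sem() frees residual covariances among terminal endogenous latent variables by default (auto.cov.y = TRUE), so the observed-level version fixes the latent-level covariance to zero explicitly.

```r
library(lavaan)

# Hypothetical simulated data: one predictor and two outcomes that
# share an omitted common cause (u), each measured with random error.
set.seed(1)
n       <- 500
x_true  <- rnorm(n)
u       <- rnorm(n)
y1_true <- 0.5 * x_true + 0.6 * u + rnorm(n, sd = 0.6)
y2_true <- 0.4 * x_true + 0.6 * u + rnorm(n, sd = 0.6)
dat <- data.frame(
  X  = x_true  + rnorm(n, sd = 0.3),
  Y1 = y1_true + rnorm(n, sd = 0.4),
  Y2 = y2_true + rnorm(n, sd = 0.4)
)

# Shared part: single-indicator latent variables with fixed
# (assumed known) measurement error variances.
base <- '
  X_L  =~ X
  Y1_L =~ Y1
  Y2_L =~ Y2
  Y1_L ~ X_L
  Y2_L ~ X_L
  X  ~~ 0.09*X
  Y1 ~~ 0.16*Y1
  Y2 ~~ 0.16*Y2
'

# Version 1: covariance at the latent level (disturbances covary).
mod_latent <- paste(base, 'Y1_L ~~ Y2_L', sep = "\n")

# Version 2: covariance at the observed level (measurement errors covary),
# with the latent-level covariance explicitly fixed to zero.
mod_observed <- paste(base, 'Y1 ~~ Y2', 'Y1_L ~~ 0*Y2_L', sep = "\n")

fit_latent   <- sem(mod_latent,   data = dat)
fit_observed <- sem(mod_observed, data = dat)

# Helper to pull one covariance estimate out of a fitted model.
get_cov <- function(fit, a, b) {
  pe <- parameterEstimates(fit)
  pe$est[pe$op == "~~" & pe$lhs == a & pe$rhs == b]
}
cov_latent   <- get_cov(fit_latent,   "Y1_L", "Y2_L")
cov_observed <- get_cov(fit_observed, "Y1",   "Y2")
```

In this toy single-indicator setup both versions are just-identified and imply the same covariance matrix, so the data alone cannot tell them apart; the choice has to rest on the substantive reasoning above.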

Keith
------------------------
Keith A. Markus
John Jay College of Criminal Justice, CUNY
http://jjcweb.jjay.cuny.edu/kmarkus
Frontiers of Test Validity Theory: Measurement, Causation and Meaning.
http://www.routledge.com/books/details/9781841692203/

Shajar

Nov 8, 2019, 11:49:13 AM
to lavaan
Thanks for the reply, Keith
The main reason for this covariance is unknown common causes. However, for this pair of measurements it is also possible that there is a common measurement error (both samples are taken together, so a faulty sampling method would affect both measurements). Since I have an estimate of the measurement error, I reckon that it includes the error covariance, so it wouldn't be right to also model the error covariance (or should I model it?).
Thanks, Shajar

BTW, my research field is ecology, so the term "self report items" does not mean a lot to me....


Terrence Jorgensen

Nov 9, 2019, 9:17:24 AM
to lavaan
>Since I have an estimate of the measurement error, I reckon that it includes the error covariance

Doubtful.  If your estimates of measurement error come from articles reporting scale reliability, I'm guessing you only have univariate estimates (i.e., reliability per variable).  An estimate of residual covariance/correlation would have to be drawn from the same source, probably from a model that estimates it.

>should I model it?

Probably, but as Keith said, you would need multiple indicators to identify separate estimates of (1) covarying errors and (2) covariance among the error-free components of your (latent-ish) focal variables.
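
To make "multiple indicators" concrete, here is a minimal sketch under stated assumptions, not a recommendation for the actual data. It assumes each of two quantities was measured twice (occasions a and b), that measurements taken on the same occasion share some sampling error, and that both measurements of a quantity load equally on it (loadings fixed to 1). The variable names and the simulated data are hypothetical.

```r
library(lavaan)

# Hypothetical simulated data: two true scores (t1, t2) with covariance 0.5,
# each measured on two occasions; s_a and s_b are sampling errors shared by
# the two measurements taken on the same occasion.
set.seed(42)
n   <- 1000
t1  <- rnorm(n)
t2  <- 0.5 * t1 + sqrt(0.75) * rnorm(n)
s_a <- rnorm(n)
s_b <- rnorm(n)
dat <- data.frame(
  Y1a = t1 + 0.3 * s_a + rnorm(n, sd = 0.3),
  Y2a = t2 + 0.3 * s_a + rnorm(n, sd = 0.3),
  Y1b = t1 + 0.3 * s_b + rnorm(n, sd = 0.3),
  Y2b = t2 + 0.3 * s_b + rnorm(n, sd = 0.3)
)

model <- '
  # two indicators per construct, loadings fixed to 1
  Y1_L =~ 1*Y1a + 1*Y1b
  Y2_L =~ 1*Y2a + 1*Y2b

  # (2) covariance among the error-free components
  Y1_L ~~ Y2_L

  # (1) covarying errors for measurements taken together
  Y1a ~~ Y2a
  Y1b ~~ Y2b
'
fit <- cfa(model, data = dat)

pe <- parameterEstimates(fit)
latent_cov <- pe$est[pe$op == "~~" & pe$lhs == "Y1_L" & pe$rhs == "Y2_L"]

# Show all estimated covariances (latent and error level).
subset(pe, op == "~~" & lhs != rhs)
```

With two measurements per construct, the cross-occasion covariances (e.g., Y1a with Y2b) carry information only about the latent covariance, which is what lets the same-occasion error covariances be estimated separately.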

>BTW, my research field is ecology, so the term "self report items" does not mean a lot to me....

Feel free to provide the context of your variables, in order for anyone to provide an explanation that might make sense to you.

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Shajar Regev

Nov 9, 2019, 11:12:00 AM
to lav...@googlegroups.com
Thanks Terrence!
"If your estimates of measurement error come from articles reporting scale reliability, I'm guessing you only have univariate estimates (i.e., reliability per variable)."
My estimates of measurement error come from actual double measurements made at the same time and place, and I calculated reliability with Cronbach's α. Yes, they are univariate estimates.
 
 "you would need multiple indicators to identify separate estimates of (1) covarying errors and (2) covariance among the error-free components of your (latent-ish) focal variables."
I'm not sure I understand what you mean by "multiple indicators". Since I rely on historical data, I cannot set up a research design, as Keith suggested, to disentangle the covariances at the two levels.

Context of my variables:
Chl = Chlorophyll a; Sh_Chl = Chlorophyll 15 days ago; PP_Outlet = Particulate phosphorus at lake outlet; PP_td_in = Particulate phosphorus at lake inlets; NO3_Outlet = Nitrate at outlet; NO3_td_in = Nitrate at inlet; TNdif = difference of total nitrogen between inlet and outlet; Temp = Temperature

Thanks again, Shajar

Nickname

Nov 9, 2019, 2:52:59 PM
to lavaan
Shajar,
  In terms of moving forward with the present analysis, perhaps one option would be to proceed as follows:
1. Freely estimate the covariance between disturbances at the latent level.
2. Fix the covariance at the observed level starting at zero.
3. Refit the model repeatedly, gradually increasing the fixed uniqueness covariance.
4. Plot the parameter estimates as a function of the uniqueness covariance as a form of sensitivity analysis.
5. Briefly describe the procedure in your write-up and report the minimum and maximum values of the parameters across fits.
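
The steps above can be sketched roughly as follows. This is a toy illustration, not the original model: the variables X, Y1 and Y2, the simulated data, the fixed error variances, and the grid of covariance values are all hypothetical, and the upper end of the grid is simply chosen to stay below the fixed error variances.

```r
library(lavaan)

# Hypothetical simulated data standing in for the real data set.
set.seed(2019)
n      <- 500
x_true <- rnorm(n)
u      <- rnorm(n)  # omitted common cause
dat <- data.frame(
  X  = x_true + rnorm(n, sd = 0.3),
  Y1 = 0.5 * x_true + 0.6 * u + rnorm(n, sd = 0.7),
  Y2 = 0.4 * x_true + 0.6 * u + rnorm(n, sd = 0.7)
)

# Fit the model once with the observed-level (uniqueness) covariance
# fixed to `theta`, and return the estimates of interest.
fit_one <- function(theta) {
  model <- sprintf('
    X_L  =~ X
    Y1_L =~ Y1
    Y2_L =~ Y2
    Y1_L ~ X_L
    Y2_L ~ X_L
    X  ~~ 0.09*X
    Y1 ~~ 0.16*Y1
    Y2 ~~ 0.16*Y2
    # step 1: latent-level disturbance covariance, freely estimated
    Y1_L ~~ Y2_L
    # steps 2-3: observed-level covariance, fixed to the current value
    Y1 ~~ %f*Y2
  ', theta)
  pe <- parameterEstimates(sem(model, data = dat))
  data.frame(
    theta = theta,
    psi   = pe$est[pe$op == "~~" & pe$lhs == "Y1_L" & pe$rhs == "Y2_L"],
    b1    = pe$est[pe$op == "~"  & pe$lhs == "Y1_L"],
    b2    = pe$est[pe$op == "~"  & pe$lhs == "Y2_L"]
  )
}

# Step 3: refit over a grid of fixed uniqueness covariances.
thetas  <- seq(0, 0.12, by = 0.02)
results <- do.call(rbind, lapply(thetas, fit_one))

# Step 4: plot estimates against the fixed covariance (interactive use only).
if (interactive()) {
  matplot(results$theta, results[, c("psi", "b1", "b2")], type = "b",
          xlab = "fixed uniqueness covariance", ylab = "estimate")
}

# Step 5: minimum and maximum of each parameter across fits.
sapply(results[-1], range)
```

In this just-identified toy model the latent covariance simply drops one-for-one as the fixed value rises while the regression slopes stay put; in a larger, over-identified model other estimates may shift as well, which is what the sensitivity analysis is meant to reveal.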

  In addition to evaluating the potential for bias in the present study, this may also provide guidance regarding the design of future studies.  The optimal solution would be to refine the measurement procedure in a way that eliminates the covariance between uniquenesses.  Short of that, you can incorporate multiple measurements into future data collection as Terrence described.


Keith

Shajar Regev

Nov 9, 2019, 3:02:31 PM
to lav...@googlegroups.com
Thanks Keith! This is very helpful. Can you explain what is the "uniqueness covariance"?

Nickname

Nov 9, 2019, 10:35:00 PM
to lavaan
Shajar,


>Can you explain what is the "uniqueness covariance"?


PP_Outlet ~~  TNdif

In the context of common factor models, you can think of common factor variables as representing shared variance and the latent variables specific to each observed variable as representing unique variance.  Hence, those specific latent variables are called uniquenesses, and covariances between them are called correlated uniquenesses.  Variance in observed variables due to the random error of Classical Measurement Theory only offers a lower bound to unique variance because observed variables can also have reliable variance that is unique to the one observed variable.  Although the terminology reflects common factor models, it is routinely generalized to other types of structural equation models.