Extracting the observed variance-covariance matrix of a model with 2 approaches

Isa

Jan 31, 2019, 1:22:58 AM
to lavaan

Dear Lavaan Google Group

 

I just tried to extract the observed variance-covariance matrix from my data and my model using the following two commands:

Extract it from the data:

lavCor(data_imputed[,c("t4_belief_qualities_trait","t4_belief_useless_trait", "t4_belief_useful_trait", "t4_belief_positive_trait")], output = "cov")

                          t4_blf_q_ t4_blf_sl_ t4_blf_sf_ t4_blf_p_ t4_sc_ 
t4_belief_qualities_trait   1.000                                          
t4_belief_useless_trait     0.359     1.000                                
t4_belief_useful_trait      0.687     0.513      1.000                     
t4_belief_positive_trait    0.585     0.562      0.664      1.000          
t4_scim_location            1.028     3.468      1.923      1.240   127.007

 

Extract it from the model (I am using the default setting conditional.x = T):

sem_fit_med_se <- sem(sem_med_se, estimator = "WLSMV", data = data_imputed, mimic = 'Mplus')

inspect(sem_fit_med_se, "sampstat")$res.cov

                          t4_blf_q_ t4_blf_sl_ t4_blf_sf_ t4_blf_p_ t4_sc_
t4_belief_qualities_trait  1.000                                          
t4_belief_useless_trait    0.346     1.000                                
t4_belief_useful_trait     0.679     0.505      1.000                     
t4_belief_positive_trait   0.590     0.569      0.677      1.000          
t4_scim_location           0.282     1.915      0.791      1.024    71.956

 

To my understanding, these two outputs should be identical regardless of whether I extract the observed variance-covariance matrix directly from the data or from the fitted lavaan model. Does anybody know what the problem is here?

 

Thanks a lot for any help!

Best, Isabel

Terrence Jorgensen

Feb 2, 2019, 5:36:26 AM
to lavaan
Even though I cannot see your sem_med_se script, I can tell that you have at least one exogenous covariate.  With categorical data, the default setting conditional.x=TRUE means the model is not fitted to the polychoric correlation matrix you get from the saturated model fitted by lavCor().  Instead, it is fitted to the residual polychoric correlations, after exogenous covariates have been partialed out.  This is indicated by the name $res.cov rather than $cov in the "sampstat" output.
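
To make "partialed out" concrete, here is a minimal base-R sketch of the linear analogue. It assumes continuous variables for simplicity; with ordinal indicators lavaan works with residual polychoric correlations instead, so this only illustrates the idea and is not what lavaan computes internally. The names x, y1 and y2 are made up for the illustration.

```r
# Linear analogue of "partialing out" an exogenous covariate.
# Assumption: continuous data; x, y1, y2 are hypothetical variables.
set.seed(123)
n  <- 1000
x  <- rnorm(n)                       # exogenous covariate
y1 <- 0.6 * x + rnorm(n)             # endogenous variable 1
y2 <- 0.4 * x + 0.5 * y1 + rnorm(n)  # endogenous variable 2
Y  <- cbind(y1, y2)

# Unconditional covariance matrix of (y1, y2, x)
S    <- cov(cbind(Y, x))
S_yy <- S[1:2, 1:2]
S_yx <- S[1:2, 3, drop = FALSE]
S_xx <- S[3, 3, drop = FALSE]

# Residual covariance of y after removing what x explains:
# S_yy - S_yx %*% solve(S_xx) %*% S_xy
res_cov <- S_yy - S_yx %*% solve(S_xx) %*% t(S_yx)

# Same quantity obtained from the residuals of y regressed on x
res_cov_lm <- cov(resid(lm(Y ~ x)))

print(round(S_yy, 3))     # unconditional
print(round(res_cov, 3))  # conditional on x: variances are smaller
```

The residual variances are smaller than the unconditional ones, which is the same pattern as the drop in the variance of t4_scim_location between the two matrices in the first post.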

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Isa

Feb 5, 2019, 10:23:16 AM
to lavaan

Dear Terrence,

 

Thank you very much for your response. I think my problem is that I don't fully understand what happens when the exogenous covariates are partialed out. But from your answer I understand that the two matrices I showed in my first post do not necessarily need to be the same if categorical exogenous covariates are involved.

 

In terms of reproducibility of the analysis in such a case: is it enough to report the observed matrix generated by the lavCor() function or should the residual polychoric correlation matrix $res.cov be given, or both?

 

Thanks again very much! Best, isabel

Terrence Jorgensen

Feb 6, 2019, 9:41:21 AM
to lavaan

> In terms of reproducibility of the analysis in such a case: is it enough to report the observed matrix generated by the lavCor() function or should the residual polychoric correlation matrix $res.cov be given, or both?


Nothing wrong with reporting both, but for your analysis to be reproducible, neither would be enough.  You would also need to provide the weight matrix used in the weighted least-squares estimation routine.  It would be more straightforward to provide the (de-identified) data on a repository like the Open Science Framework with your R scripts.

Isa

Feb 14, 2019, 3:11:27 AM
to lavaan

Dear Terrence,

 

Thanks a lot again for your response and the link to the Open Science Framework! I would like to ask two follow-up questions on this:

 

- In case I will not be able to put my data on a repository like you mentioned, how would I extract the weight matrix? And according to your answer: reporting all 3 matrices (observed, residual polychoric, and weight) would make an analysis reproducible?

 

- Regarding the setting conditional.x = T / partialing out the exogenous covariates: Does this setting mean that I assume the covariances among the exogenous covariates to be zero? (When I compare the model summary output between conditional.x = T and conditional.x = F, I see that the latter also shows me in addition the covariances among the exogenous covariates).

 

Best wishes, isabel

Terrence Jorgensen

Feb 19, 2019, 3:19:49 PM
to lavaan

> - In case I will not be able to put my data on a repository like you mentioned, how would I extract the weight matrix? And according to your answer: reporting all 3 matrices (observed, residual polychoric, and weight) would make an analysis reproducible?


You would also need "gamma", the sampling covariance matrix of the polychoric estimates.  See this thread:
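
For reference, a minimal sketch of pulling these pieces out of a fitted object with lavInspect(). The data are simulated and every name in it (x, y1 to y4, f, dat, fit) is hypothetical; it only shows the extraction calls and is not a reconstruction of the model discussed here.

```r
library(lavaan)

# Simulated ordinal indicators plus one continuous exogenous covariate.
# All variable names are hypothetical.
set.seed(42)
n     <- 500
x     <- rnorm(n)
eta   <- 0.5 * x + rnorm(n)
ystar <- sapply(c(0.8, 0.7, 0.6, 0.7), function(l) l * eta + rnorm(n))
dat   <- as.data.frame(apply(ystar, 2, function(v) {
  cut(v, breaks = c(-Inf, -0.5, 0.5, Inf), labels = FALSE)  # 3 categories
}))
names(dat) <- paste0("y", 1:4)
dat$x <- x

model <- '
  f =~ y1 + y2 + y3 + y4
  f ~ x
'
fit <- sem(model, data = dat, ordered = paste0("y", 1:4),
           estimator = "WLSMV")

samp  <- lavInspect(fit, "sampstat")  # sample statistics the model is fitted to
W     <- lavInspect(fit, "wls.v")     # weight matrix used by the estimator
Gamma <- lavInspect(fit, "gamma")     # sampling covariance matrix of the sample statistics

print(names(samp))
print(dim(Gamma))
```

Because the covariate x is exogenous and the indicators are ordinal, conditional.x = TRUE is the default here, so the list in samp contains $res.cov rather than $cov.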


 

> - Regarding the setting conditional.x = T / partialing out the exogenous covariates: Does this setting mean that I assume the covariances among the exogenous covariates to be zero? (When I compare the model summary output between conditional.x = T and conditional.x = F, I see that the latter also shows me in addition the covariances among the exogenous covariates).


No, partialing out the covariates means they are not part of the matrix that the model tries to reproduce, which is why their covariances are not included in the output.  They can still covary.  When you set conditional.x = FALSE (with fixed.x = TRUE still in effect), their variances and covariances are still not estimated; instead, their observed sample statistics are plugged in.
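
A rough way to see the difference is to fit the same model under both settings and compare what "sampstat" returns. The sketch below uses simulated data with two correlated covariates; all names (x1, x2, y1 to y4, f) are hypothetical, and the exact set of elements in the list may differ between lavaan versions.

```r
library(lavaan)

# Simulated data with two correlated exogenous covariates.
# All variable names are hypothetical.
set.seed(7)
n     <- 500
x1    <- rnorm(n)
x2    <- 0.5 * x1 + rnorm(n)   # x1 and x2 covary
eta   <- 0.4 * x1 + 0.3 * x2 + rnorm(n)
ystar <- sapply(c(0.8, 0.7, 0.6, 0.7), function(l) l * eta + rnorm(n))
dat   <- as.data.frame(apply(ystar, 2, function(v) {
  cut(v, breaks = c(-Inf, -0.5, 0.5, Inf), labels = FALSE)
}))
names(dat) <- paste0("y", 1:4)
dat$x1 <- x1
dat$x2 <- x2

model <- '
  f =~ y1 + y2 + y3 + y4
  f ~ x1 + x2
'
ord <- paste0("y", 1:4)

# Covariates partialed out: the model is fitted to residual statistics
fit_cond <- sem(model, data = dat, ordered = ord, estimator = "WLSMV",
                conditional.x = TRUE)
# Covariates kept in the joint matrix, with their sample statistics plugged in
fit_joint <- sem(model, data = dat, ordered = ord, estimator = "WLSMV",
                 conditional.x = FALSE, fixed.x = TRUE)

samp_cond  <- lavInspect(fit_cond,  "sampstat")
samp_joint <- lavInspect(fit_joint, "sampstat")

print(names(samp_cond))   # residual statistics such as res.cov
print(names(samp_joint))  # joint statistics such as cov
print(cov(dat$x1, dat$x2))  # the covariates covary in the data either way
```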