Difference between lavCor() and psych::polychoric()

Tanya Murphy

Sep 12, 2016, 3:51:33 PM
to lavaan
Hi, 

I want to estimate the correlation matrix for a set of 24 ordinal items. They are all 3-choice Likert type responses and are all believed to be negatively coded (higher is worse).
I compared the polychoric correlation matrices from lavCor(data.as.factors, estimator = "DWLS", output = "cov") and psych::polychoric(data.numeric, [defaults]), and I do not get the same correlations, even in the first decimal place for some pairings.

I converted the variables to ordered factors for lavCor, but left them as numeric for psych::polychoric(), which does not accept factors. I don't think that explains the difference, however, since both functions treat the variables as ordinal categories, if I understand correctly.
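For readers who want to reproduce this kind of comparison, here is a minimal sketch on simulated data. The data, the item names (`q1` to `q4`), and the thresholds are illustrative assumptions, not the poster's data. The package calls are guarded so the data preparation runs even when lavaan or psych is not installed.

```r
# Simulate four 3-category ordinal items that share one latent factor
# (all names and values here are made up for illustration).
set.seed(123)
n <- 500
f <- rnorm(n)
latent <- sapply(1:4, function(j) 0.7 * f + sqrt(1 - 0.7^2) * rnorm(n))

# Numeric version (codes 1, 2, 3), as passed to psych::polychoric()
items_num <- as.data.frame(apply(latent, 2, function(z) {
  as.integer(cut(z, breaks = c(-Inf, -0.5, 0.8, Inf)))
}))
names(items_num) <- paste0("q", 1:4)

# Ordered-factor version, as passed to lavaan::lavCor()
items_fac <- items_num
items_fac[] <- lapply(items_num, ordered, levels = 1:3)

if (requireNamespace("lavaan", quietly = TRUE) &&
    requireNamespace("psych", quietly = TRUE)) {
  r_lav <- lavaan::lavCor(items_fac, output = "cor")
  r_psy <- psych::polychoric(items_num)$rho
  # Element-wise difference between the two polychoric matrices
  print(round(unclass(r_lav) - r_psy, 3))
}
```

On well-behaved data like this, with no empty cells, the two matrices would be expected to agree closely, so a large discrepancy on real data points to something specific about that data set.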

For psych::polychoric() I also get the following warning, which I think I understand on its own, but I don't know why lavCor() does not give a similar warning:

> 276 cells were adjusted for 0 values using the correction for continuity. Examine your data carefully.
> Warning message:
> In cor.smooth(mat) : Matrix was not positive definite, smoothing was done

Any explanation for the difference or reassurance that I can confidently use the lavaan correlation matrix as input to EFA functions would be greatly appreciated.

Best regards,
Tanya

Terrence Jorgensen

Sep 13, 2016, 3:42:52 AM
to lavaan
> Any explanation for the difference or reassurance that I can confidently use the lavaan correlation matrix as input to EFA functions would be greatly appreciated.

I don't know whether the "quick two step procedure" implemented in psych is the same procedure that lavaan uses.  But you shouldn't separate the steps of estimating polychorics, then treating them as if they are known values when you estimate an EFA.  Your SEs and fit statistic will both yield inflated Type I error rates.  Luckily, you can use lavaan to do EFA, using a function called efaUnrotate() in the semTools package -- just provide your data and the number of factors, along with any other lavaan options (e.g., names of ordered variables if they are not already of class "ordered" in your data.frame), and it will use the appropriate estimator (DWLS with robust SEs and test statistics).  The output is a lavaan object with your initial solution, which you can pass to the oblqRotate() function to obtain an oblique rotation.  
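As a rough sketch of fitting the EFA directly to the raw ordinal data in one step: the example below does not use efaUnrotate() or oblqRotate(). It swaps in the efa() function that more recent lavaan versions export, and it assumes such a version is installed (the call is skipped otherwise). The simulated data and item names are hypothetical.

```r
# Simulate six 3-category items loading on two correlated factors
# (illustrative values only).
set.seed(42)
n <- 600
f1 <- rnorm(n)
f2 <- 0.3 * f1 + sqrt(1 - 0.3^2) * rnorm(n)

make_item <- function(f) {
  z <- 0.7 * f + sqrt(1 - 0.7^2) * rnorm(length(f))
  ordered(cut(z, breaks = c(-Inf, -0.5, 0.8, Inf), labels = FALSE))
}

items <- data.frame(
  q1 = make_item(f1), q2 = make_item(f1), q3 = make_item(f1),
  q4 = make_item(f2), q5 = make_item(f2), q6 = make_item(f2)
)

# Assumption: a lavaan version that exports efa(). Because the columns are
# ordered factors, lavaan treats them as ordinal and estimates thresholds,
# polychorics, and loadings in a single analysis.
if (requireNamespace("lavaan", quietly = TRUE) &&
    "efa" %in% getNamespaceExports("lavaan")) {
  fit <- lavaan::efa(data = items, nfactors = 2)
  print(fit)
}
```

The point of fitting from the raw data rather than from a precomputed correlation matrix is the one made above: the uncertainty in the polychoric estimates is carried through to the standard errors and the test statistic.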

Terrence D. Jorgensen
Postdoctoral Researcher, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Tanya Murphy

Sep 13, 2016, 8:54:53 AM
to lavaan
Thank you! I'll try it. I agree it is so important to propagate the uncertainty in more of the multi-step estimation procedures we use.

Stas Kolenikov

Sep 13, 2016, 9:45:43 AM
to lav...@googlegroups.com
In my experience working with polychoric correlations, the two-step
procedure can be off by about 10% in its standard errors, so the test
will get inflated by 20-25%. See
http://econpapers.repec.org/paper/bocscon16/15.htm.

Back to Tanya's question about the point estimates -- different
packages may have different attitudes towards zero cells, and it looks
like psych::polychoric() is more proactive on them (or at least admits
it fiddles with them). See if you can find a set of options in both
psych::polychoric and lavCor that estimate the same thing -- I suspect
it may be correct=0 in the former.
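One way to see how much the zero-cell handling could matter is to count the empty cells in each pairwise contingency table before estimating anything. This is a base-R sketch; `count_zero_cells()` and the toy data are hypothetical helpers for illustration, not part of either package.

```r
# Count empty cells in the cross-tabulation of two ordinal items.
# Fixing the levels keeps categories that never occur in the table.
count_zero_cells <- function(x, y, levels = 1:3) {
  tab <- table(factor(x, levels = levels), factor(y, levels = levels))
  sum(tab == 0)
}

toy <- data.frame(
  a = c(1, 1, 2, 2, 3, 3, 1, 2),
  b = c(1, 1, 2, 2, 3, 3, 2, 3),
  c = c(1, 2, 3, 1, 2, 3, 1, 2)
)

# Apply the count to every pair of columns
pairs <- combn(names(toy), 2)
zero_counts <- apply(pairs, 2, function(p) count_zero_cells(toy[[p[1]]], toy[[p[2]]]))
names(zero_counts) <- paste(pairs[1, ], pairs[2, ], sep = "-")
print(zero_counts)
```

Pairs with several empty cells are the ones where a continuity correction (or its absence) is most likely to move the estimate.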




-- Stas Kolenikov, PhD, PStat (ASA, SSC)
-- Principal Survey Scientist, Abt SRBI @abtsrbi
-- Education Officer, Survey Research Methods Section of the American
Statistical Association
-- Opinions stated in this email are mine only, and do not reflect the
position of my employer
-- http://stas.kolenikov.name

Tanya Murphy

Sep 13, 2016, 10:14:37 AM
to lavaan

I compared correct = 0.5 (default) and correct = 0, but it made no difference (at least in the first two decimal places).

With 24 variables, the matrix is huge, but here are the last few rows to give you a sense of the magnitude of the difference between psych::polychoric and lavCor:

polychoric(cha, correct = 0.5) 
       r3q14 r3q15 r3q16 r3q17 r3q18 r3q20 r3q21 r3q22 r3q23 r3q24 r3q26
r3qa23  0.46  0.23  0.44  0.46  0.28  0.05  0.40  0.40  1.00            
r3qa24  0.38  0.28  0.45  0.42  0.27  0.14  0.38  0.43  0.53  1.00      
r3qa26  0.32  0.29  0.33  0.33  0.28  0.19  0.41  0.45  0.45  0.44  1.00

polychoric(cha, correct = 0) 
       r3q14 r3q15 r3q16 r3q17 r3q18 r3q20 r3q21 r3q22 r3q23 r3q24 r3q26           
r3qa23  0.46  0.23  0.44  0.46  0.28  0.05  0.40  0.40  1.00            
r3qa24  0.38  0.28  0.45  0.42  0.27  0.14  0.38  0.43  0.53  1.00      
r3qa26  0.32  0.29  0.33  0.33  0.28  0.19  0.41  0.45  0.45  0.44  1.00 

lavCor(chaf, estimator = "DWLS", output = "cov")
       r3q14 r3q15 r3q16 r3q17 r3q18 r3q20 r3q21 r3q22 r3q23 r3q24 r3a26
r3qa23  0.44  0.10  0.40  0.46  0.21 -0.03  0.22  0.29  1.00             
r3qa24  0.31  0.14  0.42  0.37  0.18  0.07  0.12  0.30  0.37  1.00       
r3qa26  0.24  0.22  0.25  0.27  0.24  0.18  0.27  0.43  0.33  0.29  1.00

I am trying the semTools functions that Terrence recommended. The EFA is sort of a replication step (of past studies) before moving to the main CFA analysis. I will see with my co-authors how much I will need to justify my choice of EFA methods. Thanks for the reference!

Stas Kolenikov

Sep 13, 2016, 10:51:24 AM
to lav...@googlegroups.com
OK, try different estimators within lavaan::lavCor.

Yves Rosseel

Oct 7, 2016, 2:56:46 AM
to lav...@googlegroups.com
On 09/12/2016 09:51 PM, Tanya Murphy wrote:
>> Warning message:
>> In cor.smooth(mat) : Matrix was not positive definite, smoothing was done

This is it: lavaan does not 'smooth' if the correlation matrix is not
positive definite.
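A quick way to check whether a given correlation matrix is positive definite is to look at its eigenvalues. This is a base-R sketch; `is_pos_def()` and the example matrix are illustrative assumptions. The optional psych::cor.smooth() call is guarded and only prints the smallest eigenvalue after smoothing.

```r
# A symmetric matrix is positive definite when all eigenvalues are > 0.
is_pos_def <- function(r, tol = 1e-8) {
  ev <- eigen(r, symmetric = TRUE, only.values = TRUE)$values
  all(ev > tol)
}

# A deliberately inconsistent "correlation" matrix (made up for illustration)
r_bad <- matrix(c( 1.0, 0.9, -0.9,
                   0.9, 1.0,  0.9,
                  -0.9, 0.9,  1.0), nrow = 3, byrow = TRUE)

print(is_pos_def(r_bad))    # FALSE
print(is_pos_def(diag(3)))  # TRUE

if (requireNamespace("psych", quietly = TRUE)) {
  # cor.smooth() adjusts the eigenvalues and warns that smoothing was done
  r_smooth <- suppressWarnings(psych::cor.smooth(r_bad))
  print(min(eigen(r_smooth, symmetric = TRUE, only.values = TRUE)$values))
}
```

Running the same check on the unsmoothed lavCor() output shows whether smoothing alone can account for the differences between the two matrices.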

Yves.

joh4nd

Jun 12, 2023, 11:11:43 AM
to lavaan
The psych package may not check whether the provided data are scaled as it presupposes, or return a warning if they are not strictly ordinal in the sense of taking real values. That is, do not rescale the variables (e.g., to 0-1) before providing them as input to psych with that specification.