Selecting analyses with ordered variables

zrc2...@gmail.com

May 29, 2016, 2:28:41 PM
to lavaan
Dear Lavaan group,

I apologize in advance that this is not strictly a lavaan question, but I would appreciate any suggestions from the group. Using lavaan, I have completed a CFA of data involving ordinal variables (all based on the same 5-point Likert scales). In a few posts that referenced Rhemtulla 2012 (https://groups.google.com/d/msg/lavaan/kdbSKNRDiYg/O7nfRz0UBAAJ), I came across the point that 5-level responses may be treated as continuous, so I ran the CFA treating the variables both as continuous and as "ordered." The results differed to some extent, and I am now thinking that it may be worth backtracking and treating the data as ordered in the PCA and EFA stages as well, to compare the results. Also, many of the variables (particularly in the initial item pool) show significant skew, with only 5 response options available.

I completed both PCA and EFA outside of R and would appreciate any packages or methods you may recommend for these steps. I have come across a few options in my own search (e.g. MCA using FactoMineR, or CATPCA in SPSS) but would appreciate input from those who may be more familiar with them. Alternatively, if I were to complete the CFA with the data treated as ordered, is it necessary that the preceding steps also treat the data as ordered?

Thank you.

Terrence Jorgensen

May 30, 2016, 12:01:14 PM
to lavaan
I came across the point that 5-level responses may be treated as continuous in a few posts that referenced Rhemtulla 2012 (https://groups.google.com/d/msg/lavaan/kdbSKNRDiYg/O7nfRz0UBAAJ) and ran the CFA treating the variables as both continuous and "ordered."

If your research interest is the measurement parameters, then your factor loadings will be attenuated if you treat items with only 5 categories as continuous. Read the recommendations in the paper carefully -- only the structural parameters are unbiased with 5 categories. 7 categories appear necessary for good estimates of factor loadings.
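
As a rough illustration of that comparison, here is a minimal sketch in R that fits the same one-factor CFA twice in lavaan, once treating the items as continuous and once declaring them ordered. The data are simulated, and the item names (y1 to y6), the population loading of 0.7, and the cut points are all assumptions made up for the example, not values from the original poster's data.

```r
library(lavaan)

# Hypothetical population model: one factor, six indicators, loadings of 0.7.
pop_model <- 'f =~ 0.7*y1 + 0.7*y2 + 0.7*y3 + 0.7*y4 + 0.7*y5 + 0.7*y6'
cont <- simulateData(pop_model, sample.nobs = 1000,
                     standardized = TRUE, seed = 123)

# Cut each continuous indicator into 5 ordered categories.
# The asymmetric cut points are an assumption chosen to mimic skewed items.
cuts  <- c(-Inf, -0.5, 0.3, 1.0, 1.6, Inf)
items <- as.data.frame(lapply(cont, function(x) as.integer(cut(x, breaks = cuts))))

cfa_model <- 'f =~ y1 + y2 + y3 + y4 + y5 + y6'

# Same model, two treatments of the same 5-category responses.
fit_cont <- cfa(cfa_model, data = items, std.lv = TRUE)
fit_ord  <- cfa(cfa_model, data = items, std.lv = TRUE,
                ordered = names(items))

# Helper (hypothetical name) that pulls the standardized loadings from a fit.
std_loadings <- function(fit) {
  est <- standardizedSolution(fit)
  est$est.std[est$op == "=~"]
}

comparison <- data.frame(
  item       = names(items),
  continuous = round(std_loadings(fit_cont), 3),
  ordered    = round(std_loadings(fit_ord), 3)
)
print(comparison)
```

If the attenuation described above shows up, the "continuous" column should generally sit somewhat below the "ordered" one, but how large the gap is will depend on the thresholds and sample size.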

I completed both PCA and EFA outside of R and would appreciate any packages or methods you may recommend for these steps.

I wouldn't recommend an exploratory model if you have any theory-driven hypotheses about what your items measure. I certainly wouldn't use PCA if you hypothesize common factors (which PCA does not; PCA is an eigen decomposition, not a measurement model). But if you don't know anything about what your items measure, you can perform EFA using lavaan. In the semTools package, there is a function called efaUnrotate() that fits an EFA to your data using lavaan, so you can pass any lavaan() arguments to it (e.g., ordered, in which case it would choose DWLS estimation by default). Then you can rotate the solution using oblqRotate().
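
A minimal sketch of that two-step workflow might look like the following. The simulated two-factor data, the item names (y1 to y8), the cut points, and the choice of two factors with a quartimin rotation are all assumptions for illustration only. Only efaUnrotate(), oblqRotate(), and the ordered argument come from the description above.

```r
library(lavaan)
library(semTools)  # oblqRotate() relies on the GPArotation package being installed

# Hypothetical population: two correlated factors with four indicators each.
pop_model <- '
  f1 =~ 0.7*y1 + 0.7*y2 + 0.7*y3 + 0.7*y4
  f2 =~ 0.7*y5 + 0.7*y6 + 0.7*y7 + 0.7*y8
  f1 ~~ 0.3*f2
'
cont <- simulateData(pop_model, sample.nobs = 1000,
                     standardized = TRUE, seed = 321)

# Discretize into 5 ordered categories (cut points are an assumption).
cuts  <- c(-Inf, -1, -0.3, 0.4, 1.1, Inf)
items <- as.data.frame(lapply(cont, function(x) as.integer(cut(x, breaks = cuts))))

# Step 1: unrotated EFA fitted through lavaan; 'ordered' is passed on to lavaan.
unrotated <- efaUnrotate(items, nf = 2, ordered = names(items))

# Step 2: oblique rotation of the unrotated solution.
rotated <- oblqRotate(unrotated, method = "quartimin")
summary(rotated)
```

Depending on the installed semTools version, these functions may emit deprecation warnings; newer lavaan releases ship their own efa() function, which may be worth checking as an alternative.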

Terrence D. Jorgensen
Postdoctoral Researcher, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Mark Taper

May 30, 2016, 12:47:12 PM
to lavaan
Likert items are ordinal categorical; Likert scales are continuous. It sounds as if you should use item response theory to convert the set of responses to your items into estimates of values on the underlying Likert scale(s). IRT will give you estimates and their standard errors. You can then analyse these values as continuous. Be sure to include the measurement errors in your analysis, or you will introduce bias. See Lu, I. R. R. 2005. Embedding IRT in structural equation models: A comparison with regression based on IRT scores. Structural Equation Modeling: A Multidisciplinary Journal 12:263-277. There are a number of R packages that will analyse Likert-style data. The one I am most familiar with is mirt. The mirt package is well documented and has an active users' group.
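
To make the IRT suggestion concrete, here is a minimal sketch using mirt with a unidimensional graded response model. The simulated data, the slope and intercept values, and the choice of EAP scoring are assumptions for the example. With real data you would pass your own item responses in place of `dat`.

```r
library(mirt)

set.seed(1)
# Hypothetical generating parameters: 6 items, 5 response categories each.
# For graded items, each row of d holds strictly decreasing intercepts.
a <- matrix(rep(1.5, 6), ncol = 1)
d <- matrix(c(2, 0.5, -0.5, -2), nrow = 6, ncol = 4, byrow = TRUE)
dat <- simdata(a, d, N = 1000, itemtype = "graded")

# Fit a one-factor graded response model to the ordinal items.
mod <- mirt(dat, model = 1, itemtype = "graded", verbose = FALSE)

# Estimated scores on the latent scale, with a standard error per respondent.
scores <- fscores(mod, method = "EAP", full.scores.SE = TRUE)
head(scores)
```

The standard-error column is the measurement-error information that would need to be carried into any later analysis of the scores.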

One more point to raise. You mention skewed results. This is only a problem if you have individuals all of whose responses are either the minimum or the maximum, in which case you have censored data even when converted to the Likert scale. Censored data will also generate biases, which may be large (see Holst, K. K., T. H. Scheike, and J. B. Hjelmborg. 2016. The liability threshold model for censored twin data. Computational Statistics & Data Analysis 93:324-335). The lavaan package doesn't yet handle censored data, but the lava package does. The lava package is very elegant and powerful, but it is not quite as well documented as lavaan, nor does it have a users' group. However, if you have censored latent data, it is the only R package I know of that handles it (at least easily).

Zinrc3 K

May 30, 2016, 10:27:06 PM
to lav...@googlegroups.com
Thank you very much for all of the helpful explanations and suggestions. It seems that I will have to review the procedures required for questionnaire validation, given the type and nature of the data I am analyzing. I will look into both mirt and lava.

Regarding the point about steps prior to CFA: PCA was first employed to reduce the number of items to a more manageable size, and EFA along with parallel analysis was then used to determine the underlying factor structure and to further reduce the items as appropriate. Perhaps there is a distinction between Likert items and Likert scales that I am missing here. If the results that I obtained from PCA and EFA by treating the data as continuous yielded a factor structure that seems to be well supported by CFA (both continuous and ordered), is there a need to validate the structure through other methods that are intended for ordinal/categorical data?

Mark L. Taper

May 30, 2016, 10:56:47 PM
to lav...@googlegroups.com
The Likert items are your questions; the Likert scale is an estimate of an underlying continuous axis. It corresponds to the factors in EFA and CFA. Like factor analysis, IRT analysis can be either exploratory or confirmatory.

Do you need to redo the analysis? That depends on what you plan on using it for, and what kind of time pressure you are under. If you just need to take a quick look at the data before making some sort of decision about the design of future work, then the FA may be good enough. If, on the other hand, the fine details of the results matter, then you should do it over.

The big picture will probably be similar. The FA is an approximation to the IRT factor analysis, based on assumptions about the data that we know to be false. The IRT factors will be similar to the FA factors, but not identical. It is hard to say whether those differences will be important until you look at them. There is also no question that the IRT analysis will be more defensible for publication purposes.

Best MLT

-- 
Mark L. Taper:              Environmental & Ecological Analysis

Department of Ecology                 Department of Biology
310 Lewis Hall                               220 Bartram Hall
Montana State University               University of Florida
Bozeman, Montana                        Gainesville, Florida  

Errors like straws upon the surface flow:
Who would search for pearls must dive below.
     -John Dryden (1631-1700)

Science is the belief in the ignorance of experts.
     -Richard Feynman 1996




Zinrc3 K

May 30, 2016, 11:14:41 PM
to lav...@googlegroups.com
Thank you again for your very helpful response. The value of conducting IRT analysis is now very clear.