Extract a latent variable - Reliability test


Omid Ghavibazoo

Mar 16, 2020, 10:57:40 AM
to lavaan
Hi everyone,

I am wondering if it is possible to extract a latent variable constructed by lavaan and use it separately in OLS regressions via glm() in R. I already know that lavPredict() in lavaan provides such estimates. But since my latent variable is constructed from 3 manifest variables, I don't know how to run tests such as the test of unidimensionality, item reliability, and composite reliability. I tried the reliability() function from the semTools package, but it gives an error.

Is it possible to run such reliability tests, which are common for continuous manifest variables, in the case of dichotomous (binary) variables?

Thanks everyone. 

Mauricio Garnier-Villarreal

Mar 16, 2020, 5:58:03 PM
to lavaan
Omid

I am not sure how you are planning to use the factor scores in glm() to test reliability. There are multiple reasons why I don't recommend this; for example, factor indeterminacy means that relying on only one set of factor scores would give biased results.

Now, for the issues you are interested in, you can test these with the factor analysis itself:
- Dimensionality: specify your theoretical model and evaluate the factor loadings and the overall and local model fit
- Item reliability: for this you can use the R2 from the CFA, the proportion of variance of each item that can be attributed to the underlying factor
- Composite reliability: you can use the reliability() and maximalRelia() functions from semTools.
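
For example, with a fitted lavaan object (here called fit, just as a placeholder), something along these lines:

library(semTools)
summary(fit, rsquare = TRUE, fit.measures = TRUE)   # item R2 and overall model fit
reliability(fit)                                    # alpha, omega, average variance extracted
maximalRelia(fit)                                   # maximal reliability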

You said that some of this gave you an error. Please be more specific, with your syntax and the error message, so we can try to address the specific issues.

Omid Ghavibazoo

Mar 17, 2020, 3:24:55 AM
to lavaan
Dear Mauricio,

I don't want to test the reliability of the latent construct via glm(). I want to construct a latent variable from three dichotomous variables (say, the latent risk aversion level of individuals). Then I extract this from lavaan (like a usual continuous variable). Then I want to use it as a control in a logit or probit regression in glm() to see how the level of risk aversion influences the purchase of an insurance product (the dependent variable in the glm regression).
I have seen one paper doing so, but I was wondering if this is common. The paper is below:

Hermansson, C. (2018). Can self-assessed financial risk measures explain and predict bank customers' objective financial risk? Journal of Economic Behavior & Organization, 148, 226-240.

In this study they used CFA to construct a latent variable and then used that variable later in an OLS regression for other purposes. Is this common? And should I report any reliability for my latent construct?

Thanks and best,

Mauricio Garnier-Villarreal

Mar 17, 2020, 12:36:19 PM
to lavaan
Omid

You can extract the factor scores. But because of factor indeterminacy, it is not recommended to extract only one set of factor scores. Factor indeterminacy means that multiple sets of factor scores can be presented as possible factor scores.

So, if you wish to use factor scores outside lavaan, the recommended steps are:
1. Define your factor model: here you want to describe the factor loadings, etc., to be able to state that the factor is "good". You can present some measures of reliability, like McDonald's omega and maximal reliability from semTools.
2. Use the plausibleValues() function from semTools to extract multiple sets of factor scores (for example 100). This way you can account for variability due to factor indeterminacy. This approach has been shown to reproduce the relations with the factor best.
3. Include the multiple sets of factor scores in glm(). You do this by treating the multiple sets of factor scores as multiple imputations. You could do this with the Zelig package, or with the with() function from the mitools package.
4. Use the pooled estimates from the multiple plausible factor scores as your model estimates.
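
A minimal sketch of steps 2-4 (untested; the outcome name insurance, the covariate age, the data frame mydata, and the assumption that each set of plausible values lines up row-by-row with the raw data are placeholders of mine, not part of your model):

library(semTools)
library(mitools)

pv <- plausibleValues(fit, nDraws = 100)                                  # step 2: 100 sets of factor scores
pv <- lapply(pv, function(d) cbind(d, mydata[, c("insurance", "age")]))   # attach outcome and covariates
fits <- with(imputationList(pv),
             glm(insurance ~ latent + age, family = binomial("probit")))  # step 3: one glm per set
summary(MIcombine(fits))                                                  # step 4: pooled estimates (Rubin's rules)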


Hope this helps

Omid Ghavibazoo

Mar 18, 2020, 3:51:40 AM
to lavaan
Hi Mauricio,

Thanks a lot for your help. Now the point is that my latent variable is constructed from three dichotomous manifest variables. In this case, none of the reliability functions in semTools work. I receive the error below:

> maximalRelia(model1.fit)
Error: $ operator not defined for this S4 class
> reliability(model1.fit)
Error: $ operator not defined for this S4 class

Some people told me that in the case of dichotomous variables, calculating reliability is difficult or impossible. What do you think?

Thanks and best,

Mauricio Garnier-Villarreal

Mar 18, 2020, 5:57:21 PM
to lavaan

I am sure that the reliability() function, at least, works with categorical indicators. It estimates the indices based on the polychoric correlation matrix. I think maximalRelia() should work too.

Can you share a full example, with the lavaan call, model syntax, etc., so I can see the whole scenario?

Omid Ghavibazoo

Mar 18, 2020, 6:10:18 PM
to lavaan
Hi Mauricio,
My latent variable is constructed from three dichotomous variables and the number of observations is 10,000:

model1 <- '
latent =~ X1 + X2 + X3
'
fit <- cfa(model1, data = data.frame, std.lv = FALSE, ordered = c("X1", "X2", "X3"), meanstructure = FALSE)
summary(fit, rsquare = TRUE, standardized = TRUE, fit.measures = TRUE)
pred.latent <- lavPredict(fit)
reliability(fit)

Thanks and best,

Mauricio Garnier-Villarreal

Mar 19, 2020, 2:22:14 AM
to lavaan
Omid

I can replicate the error with simulated data. I couldn't find the fix in the code. I have added the issue to the semTools GitHub to try to fix it soon.

Mauricio Garnier-Villarreal

Mar 19, 2020, 12:57:45 PM
to lavaan
Omid

This was an old issue that has been fixed in the development version of semTools; you can install it with:

devtools::install_github("simsem/semTools/semTools")

It works well in my example now

Omid Ghavibazoo

Mar 20, 2020, 8:33:40 AM
to lavaan
Hi Mauricio,

I still receive the same error. Maybe I have a problem installing the latest version as you described. I share my code here:

# Updating the latest version
remove.packages("semTools")
detach("package:semTools", unload=TRUE)
devtools::install_github("simsem/semTools/semTools")

> Downloading GitHub repo simsem/semTools@master
> These packages have more recent versions available.
> Which would you like to update?
>
> 1:   numDeriv (2016.8-1 -> 2016.8-1.1) [CRAN]
> Enter one or more numbers separated by spaces, or an empty line to cancel
> 1: 1
> mnormt   (NA       -> 1.5-6     ) [CRAN]
> numDeriv (2016.8-1 -> 2016.8-1.1) [CRAN]
> Installing 2 packages: mnormt, numDeriv
> Content type 'application/zip' length 117637 bytes (114 KB)
> downloaded 114 KB
> Content type 'application/zip' length 115714 bytes (113 KB)
> downloaded 113 KB
>
>package ‘mnormt’ successfully unpacked and MD5 sums checked
>Error: (converted from warning) cannot remove prior installation of package ‘mnormt’

And in the end, no semTools is installed. Even if I install it via Tools, I get the same error as before. 

Since my latent variable is constructed from three dichotomous variables, does it matter how many degrees of freedom I have for it? Three dichotomous variables lead to zero degrees of freedom, and I was wondering how the degrees of freedom are related to reliability. I know that I can fix some loadings or error variances in order to have an over-identified model, but as far as reliability is concerned I am not sure what I should do with the degrees of freedom.

Thanks and best,

Terrence Jorgensen

Mar 21, 2020, 11:44:12 AM
to lavaan
>Error: (converted from warning) cannot remove prior installation of package ‘mnormt’

And in the end, no semTools is installed

Then try first installing mnormt in a fresh R session (without anything loaded), then install semTools.
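
For example, in a fresh session, something like:

install.packages("mnormt")
devtools::install_github("simsem/semTools/semTools")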
 
Since my latent variable is constructed from three dichotomous variables, does it matter how many degrees of freedom I have for it? Three dichotomous variables lead to zero degrees of freedom, and I was wondering how the degrees of freedom are related to reliability.

Reliability has nothing to do with whether your model is just- or over-identified.  You don't need df to estimate reliability from the parameter estimates.

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Omid Ghavibazoo

Mar 23, 2020, 8:52:34 AM
to lavaan
Hi Terrence,

Thanks a lot! Now it works.

Best,
Message has been deleted

Omid Ghavibazoo

Mar 23, 2020, 9:41:49 AM
to lavaan
Dear Mauricio,

I followed your suggestion and I got the results below:

> reliability(model1.fit)
>For constructs with categorical indicators, the alpha and the average variance extracted are calculated from polychoric (polyserial) correlations, not from Pearson correlations.
>
>            latent
>alpha  -0.91511165
>omega   0.08175116
>omega2  0.08175116
>omega3  0.08175116
>avevar  0.56580538
> maximalRelia(model1.fit)
>[1] 0.600543
>attr(,"weight")
>risk.aver.b       stock    mut.fund 
>  -2.466692    2.847804    2.618888

First of all, since semTools already alerts that polychoric correlations are used, should I test the reliability differently via other packages? (I already know that this is due to the use of dichotomous indicators.)
Also, based on the numbers for the omegas and alpha, I don't understand whether my construct is reliable. Which one should I trust, alpha or omega?

Thanks and best,




Mauricio Garnier-Villarreal

Mar 23, 2020, 12:09:08 PM
to lavaan
Omid

I don't think you need to test the reliability with other methods. With categorical indicators, the use of the polychoric correlation adjusts for the type of item, so there are no unnecessary assumptions about the items.

The major difference in direction between the measures (omega = 0.08, MR = 0.6) is, I would think, due to having at least one factor loading that is negative while the others are positive. Alpha and omega need all the items to be in the same direction, while MR does not. If I am right, and you reverse-code the item(s) that are negative, you should get an omega that makes more sense.
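
For instance (a hypothetical sketch, assuming the 0/1 item risk.aver.b sits in a data frame called mydata):

mydata$risk.tolerance.b <- 1 - mydata$risk.aver.b   # reverse-code the 0/1 item
# then refit the CFA with risk.tolerance.b in place of risk.aver.b and rerun reliability()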

Between reliability measures I recommend omega3 and maximal reliability instead of alpha, as alpha has many issues and these two are better methods that account for the SEM model.

Cho, E., & Kim, S. (2015). Cronbach’s Coefficient Alpha: Well Known but Poorly Understood. Organizational Research Methods, 18(2), 207–230. https://doi.org/10.1177/1094428114555994
Gu, F., Little, T. D., & Kingston, N. M. (2013). Misestimation of Reliability Using Coefficient Alpha and Structural Equation Modeling When Assumptions of Tau-Equivalence and Uncorrelated Errors Are Violated. Methodology, 9(1), 30–40. https://doi.org/10.1027/1614-2241/a000052

Omid Ghavibazoo

Mar 24, 2020, 5:12:43 AM
to lavaan
Dear Mauricio,

Thanks a million for your helpful comments! I changed the direction as you suggested and now I have reasonable results, as below:
>         latent
>alpha  0.8005231
>omega  0.5805688
>omega2 0.5805688
>omega3 0.5805688
>avevar 0.5732141
> maximalRelia(model1.fit)
>[1] 0.5871624
>attr(,"weight")
>risk.tolerance.b            stock         mut.fund 
>       0.7862761        1.1024056        1.1113182 

But based on what I found, the cutoff point for omega3 is 0.7. Does that mean I cannot use the current latent construct for publication? Is there any cutoff point for maximal reliability as well?

Best regards,

Mauricio Garnier-Villarreal

Mar 24, 2020, 7:15:20 PM
to lavaan

Omid

I hate these cutoff values, because they are arbitrary and have NO technical criteria to defend them. They come from Nunnally, who said: based on our experience, 0.7 is a good overall guideline. It was never suggested as a strict cutoff. This 0.7 comes from alpha and has been extended to omega, again with no technical criteria.

This has become such a common practice that it has led to people fudging the data until they reach that score, as well as choosing the measures that "look" better. Like in your case, where alpha looks a lot better than omega and MR. But we know that alpha is a biased measure, as it assumes tau equivalence, and it has been shown that the bias can go in either direction (Raykov; I don't have the exact year). Omega and MR do not assume tau equivalence.

It has become really easy to calculate alpha and report it as passing this 0.7 score, which has led to the common practice of reporting "reliability" with little to no thought about validity.

For more details about these issues
Cho, E., & Kim, S. (2015). Cronbach’s Coefficient Alpha: Well Known but Poorly Understood. Organizational Research Methods, 18(2), 207–230. https://doi.org/10.1177/1094428114555994

Now, to your comment about "should you use this factor?". This is a more complex question than just asking "is reliability higher than the arbitrary cutoff?". You should be looking at the factor loadings: does the overall model make sense and show strong relations? Could it be that the factor loadings are not strong enough?

You can describe omega and MR with their continuous interpretation, instead of the good/bad binary based on the cutoff: the proportion of true-score variance, based on the variance explained by the factor.

Also, there are other factors that could be having an effect, for example the number of items; most of these measures will report higher scores with more items, and in your example you have only 3 items.

I agree that the value of 0.58 feels low. But my overall point is that the issue is much more complicated than higher/lower than 0.7.

Gavin T. L. Brown

Mar 24, 2020, 8:23:37 PM
to lavaan
Hi Omid and Mauricio
If I can, I'd like to add a little on what the meaning of scale reliability is.
Cattell (back in 1964; how quickly we forget) made it clear that there is a conceptual distinction between reliability and homogeneity of the items in a set. What he meant is that if you have items that are highly homogeneous (i.e., they say almost the same thing with the same words) then reliability will be high--he called it a bloated specific. I'm sure you can think of some high-reliability scales that achieve that by redundant repetition of item wording.
So if your scale items are not homogeneous but describe the latent construct well, I would expect low reliability estimates but good fit to the data from a confirmatory factor analysis of the latent structure. Perhaps this is what you have ended up with: low reliability because the items are heterogeneous, but acceptable or good fit to the data?
see
Cattell, R. B. (1964). Validity and reliability: A proposed more basic set of concepts. Journal of Educational Psychology, 55(1), 1-22.
Cattell, R. B., & Tsujioka, B. (1964). The importance of factor-trueness and validity, versus homogeneity and orthogonality, in test scales. Educational and Psychological Measurement, 24(1), 3-30.

Omid Ghavibazoo

Mar 25, 2020, 4:34:26 AM
to lav...@googlegroups.com
Hi Gavin and Mauricio

Thanks a lot for bringing this discussion up. I totally understand your points. Regarding cut-off points, I agree that relying only on these numbers is misleading. But at the end of the day, researchers and reviewers at academic journals may rely on certain cut-off points for accepting a paper. The discussion of homogeneity was also interesting, because I come from a finance background and I was not familiar with it. Since this topic came up, I would like to elaborate a bit more on my scale items:
There are different sets of questions in two separate modules of a survey dataset. One of them asks about individuals' self-reported risk aversion:

"Which of the statements on the card comes closest to the amount of financial risk that you are willing to take when you save or make investments?
1. Take substantial financial risks expecting to earn substantial returns
2. Take above average financial risks expecting to earn above average returns
3. Take average financial risks expecting to earn average returns
4. Not willing to take any financial risks.”
I made a binary variable out of the above question (0 if option 4 is chosen, 1 otherwise).

The other two manifest variables come from answers to questions about stock ownership (yes or no) and mutual fund ownership (yes or no).

As you can see, there is no homogeneity in these questions, and I thought it would be better to make a latent construct which reflects both the self-reported risk aversion and the actual risky-asset ownership of individuals. Please correct me if this idea is wrong.
The lavaan summary results are:
>Latent Variables:
>                     Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
>  latent =~
>    risk.tolernc.b      1.000                               0.703    0.703
>    stock               1.089    0.027   40.777    0.000    0.766    0.766
>    mut.fund            1.058    0.026   40.429    0.000    0.743    0.743
>Variances:
>                     Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
>   .risk.tolernc.b      0.506                               0.506    0.506
>   .stock               0.413                               0.413    0.413
>   .mut.fund            0.447                               0.447    0.447
>    latent              0.494    0.016   30.828    0.000    1.000    1.000
>R-Square:
>                     Estimate
>    risk.tolernc.b      0.494
>    stock               0.587
>    mut.fund            0.553
Once again, I really appreciate the time you put into answering such long posts in this group.

All the best,


Nickname

Mar 25, 2020, 10:59:58 AM
to lavaan
Omid,
  Three observations...

1. I completely agree with the spirit of the comments from Mauricio and Gavin.  However, the .7 benchmark (it is not really accurate to call it a cutoff) actually has an interesting basis.  The benchmarks of .7 for a new test and .9 for an established test (people tend to forget that .7 is only for new tests still in development) are based on the standard error of measurement.  A reliability of .7 roughly corresponds to a standard error of measurement of .5, and .9 roughly corresponds to .3.  I do not bring this up to justify reviewer behavior that makes radically different interpretations of .69 and .70.  That is nonsense.  However, I have found that it is a real eye-opener for students when I show them how inaccurate a scale with a reliability of .7 is in terms of the standard error of measurement.

2. The basic concept behind reliability is that each observation contains error, but the errors cancel out over large numbers of observations.  So, the very idea of estimating reliability for a three-item scale is a little dodgy.  Evaluating the reliability of a 3-item scale is a little like estimating a population mean from a sample of 3 observations.  You can go through the motions, but you are pressing the limits of the underlying assumptions.  Expecting long-run properties to kick in at N=3 is a big ask.  (I am a little confused how you got down to only 3 items.  Are these three scale scores computed from more items?  If so, parceling may not be your friend in this context.)

3. I did not have time to pursue this earlier, but it is not clear to me why you want to extract factor score estimates for use in a predictive model.  In particular, it is not clear to me why you could not use lavaan's probit link to model the dichotomous outcome in the same model as the measurement parameters.  I may have missed something in your description.
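
Something like the following is what I have in mind (a rough, untested sketch; insurance and mydata are placeholder names; declaring the binary outcome as ordered gives a probit link under the default WLSMV estimator):

model2 <- '
  latent =~ X1 + X2 + X3
  insurance ~ latent
'
fit2 <- sem(model2, data = mydata, ordered = c("X1", "X2", "X3", "insurance"))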

Good luck,
Keith
------------------------
Keith A. Markus
John Jay College of Criminal Justice, CUNY
http://jjcweb.jjay.cuny.edu/kmarkus
Frontiers of Test Validity Theory: Measurement, Causation and Meaning.
http://www.routledge.com/books/details/9781841692203/




Omid Ghavibazoo

Mar 25, 2020, 12:10:11 PM
to lavaan
Hi Keith,

The main purpose of extracting the factor score estimates from lavaan and using them in glm() is that I can easily include and test many interactions between the extracted factor estimates and the independent variables (such as age, marital status, gender, income, net worth, etc.). I want to run the regression with 21 control (independent) variables, which include country dummies. Lavaan does not support interactions between independent variables and the latent construct. I was thinking maybe I could do that in probit regressions using glm(). Maybe treating the extracted factor estimates as an independent continuous variable in a glm() regression and setting up interactions with it is completely wrong and I don't know it.

Best,  

Mauricio Garnier-Villarreal

Mar 25, 2020, 1:06:14 PM
to lavaan

I appreciate Gavin's and Keith's comments.

On Gavin's comments about homogeneity: this is closer to what alpha measures, since inter-item correlation is closer to the idea of item homogeneity than to reliability. For example, this is why a really high alpha is a problem: it basically shows that the items are redundant.

I agree with Keith that a good reliability measure can be explained as a function of the standard error of measurement. My issue is with the practice of using 0.7 as a strict cutoff instead of a benchmark, and with applying it to alpha, which is a poor approximation of reliability.
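
To illustrate Keith's numbers: in standard-deviation units, the standard error of measurement is sqrt(1 - reliability), so

sqrt(1 - 0.7)   # ~0.55 SD of measurement error at reliability .7
sqrt(1 - 0.9)   # ~0.32 SD of measurement error at reliability .9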

If you plan to extract the factor scores, you should use the plausible values instead of only one set of factor scores.

Nickname

Mar 26, 2020, 9:47:15 AM
to lavaan
Omid,

  >>> Lavaan does not support interactions between independent variables and the latent construct.

  I do not understand what you mean by that.  I think that 'independent' means an exogenous observed variable.  It is true that the LISREL all-y model underlying lavaan does not directly represent effects involving observed variables.  However, path analysis and other models involving such effects are readily incorporated using single-indicator latent variables.  Once you have a single-indicator latent variable, you can proceed as you would with a multiple-indicator latent variable.  You can then use methods for interactions between latent variables.  I could imagine that a very large number of interactions could lead to a very complex model that might swamp your sample size.  You would probably want to build up the model one term at a time or possibly write a program to write the lavaan model syntax if the model is very complicated.  However, unless I am misunderstanding something, I believe that the model is expressible in lavaan model syntax.
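
A hypothetical sketch of the single-indicator part (the loading is fixed to 1 and the residual variance to 0, so the "factor" is just the observed covariate; age is a placeholder name):

model3 <- '
  latent =~ X1 + X2 + X3
  age.f  =~ 1*age
  age ~~ 0*age
'

From there, product-indicator approaches (for example, semTools::indProd) are one way to construct the latent interaction terms.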

  PS:  I realized after I sent my last post that I should have made it explicit that I was referring to standard errors of measurement expressed in standard deviation terms.

Omid Ghavibazoo

Mar 27, 2020, 3:32:31 AM
to lavaan
Hi Mauricio,

Do you know of any paper that has used imputations for factor scores that I can refer to?

Thanks and best,

Mauricio Garnier-Villarreal

Mar 27, 2020, 9:31:55 AM
to lavaan
Omid

It is not imputation; it is using multiple sets of factor scores to account for factor score indeterminacy/variability.

I don't have a reference for that. There is a lot of work on factor scores and how the results can vary depending on which method is used to estimate them. I know Yves is working on other approaches to be implemented in lavaan, which for now are experimental.

The best I know is the Mplus technical paper
Asparouhov, T. & Muthen, B. O. (2010). Plausible values for latent variables using Mplus. Technical Report. Retrieved from www.statmodel.com/download/Plausible.pdf

Gavin T. L. Brown

Mar 27, 2020, 8:52:59 PM
to lavaan
Re: query about factor scores
I found this to be a helpful guide for factor scores. It's in an open-access, no-fee, double-blind refereed journal with an excellent reviewer board.

Distefano, C., Zhu, M. & Mîndrilă, D. (2009). Understanding and using factor scores: Considerations for the applied researcher. Practical Assessment, Research & Evaluation, 14, https://scholarworks.umass.edu/cgi/viewcontent.cgi?article=1226&context=pare.



Omid Ghavibazoo

Mar 30, 2020, 7:41:17 AM
to lavaan
Dear Gavin, Keith and Mauricio

Thank you very much for your comments and responses!

Best regards,

Omid