using factor scores as IVs in growth model

Ajay Somaraju

Jul 25, 2018, 6:59:35 PM

to lavaan

Hi,

Linda Muthen says on her website that factor scores give unbiased estimates when used as IVs.

"The usual factor scores obtained by the "regression method" give unbiased slopes when regressed on (used as IVs), but not when used as DVs. There are other factor score methods that give scores that are unbiased for DV use. See: Skrondal, A. and Laake, P. (2001). Regression among factor scores. Psychometrika 66, 563-575. "

I have always heard that factor scores are not true estimates, so I think I may be reading this wrong.

I would like to regress the latent slope and intercept in my growth model on factor scores I derived from a CFA. Is there a way to correct factor scores using Croon's or some other estimation technique in lavaan? Can someone also clear up my confusion surrounding factor scores?

Thanks,

Ajay

Edward Rigdon

Jul 25, 2018, 9:12:46 PM

to lav...@googlegroups.com

Linda is not necessarily correct here (and I say such a thing about Linda with trepidation). Common factors are typically indeterminate, which means (among other things) that one cannot faithfully represent a common factor strictly as a function of the observed variables in the model. Actual factor scores will include both a determinate part (which is a function of the data and model parameters) and an arbitrary part (which is a function of model parameters and one arbitrary vector per common factor). The "regression method" gives you the determinate part, which thus omits a part of the variance of the common factor.

The arbitrary vectors must be orthogonal to the observed variables within the factor model (and each other), but need not be composed of pure random variance. These vectors can be correlated with variables outside the factor model. Or not. The arbitrary part is arbitrary--there is no one best set of values for the cases in your dataset. So you could, for example, create pure random variance vectors and complete your factor scores, and use them. But the results of any later analysis will only apply to that particular choice for the arbitrary component of your factor scores. The contribution of this arbitrary part to the total factor score depends on the degree of indeterminacy, which decreases as the number of indicators and strength of loadings increase--though in the typical scenario, with a handful of indicators and moderate loadings, indeterminacy can still be quite high.

So if you use "regression method" scores as predictors in the later analysis, your unstandardized parameter estimates would be the same as what you would get if you completed the factor scores with pure random error. However, your standardized parameter estimates would be biased upward, because the variances of your predictors would be too small. And there will be other results that you would have gotten if the arbitrary component had NOT been pure random variance. So using regression method factor scores will not produce findings that you can generalize.
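The variance shortfall and the unchanged unstandardized slope can be checked at the population level. What follows is only a minimal sketch, not anything posted in the thread: it assumes a single factor with four indicators, diagonal unique variances, and an outcome y = beta * factor + error. The loadings, unique variances, `beta`, and all function names are made up for illustration.

```python
from math import isclose

def regression_weights(loadings, uniquenesses, phi=1.0):
    """Regression-method weights w = phi * Sigma^{-1} * lambda for one factor.

    With diagonal unique variances, Sherman-Morrison gives the closed form
    w_i = phi * (lambda_i / theta_i) / (1 + phi * sum_j lambda_j^2 / theta_j).
    """
    s = sum(l * l / t for l, t in zip(loadings, uniquenesses))
    return [phi * (l / t) / (1.0 + phi * s) for l, t in zip(loadings, uniquenesses)]

def implied_cov(loadings, uniquenesses, phi=1.0):
    """Model-implied covariance matrix Sigma = phi * lambda lambda' + Theta."""
    p = len(loadings)
    return [[phi * loadings[i] * loadings[j] + (uniquenesses[i] if i == j else 0.0)
             for j in range(p)] for i in range(p)]

def quadratic_form(w, sigma):
    """w' Sigma w, the population variance of the score w'x."""
    p = len(w)
    return sum(w[i] * sigma[i][j] * w[j] for i in range(p) for j in range(p))

# Hypothetical population values (assumptions for this illustration only).
loadings = [0.7, 0.7, 0.7, 0.7]
uniquenesses = [0.51, 0.51, 0.51, 0.51]
phi = 1.0    # variance of the common factor
beta = 0.5   # true slope of y on the factor

w = regression_weights(loadings, uniquenesses, phi)
var_score = quadratic_form(w, implied_cov(loadings, uniquenesses, phi))

# Cov(score, y) = w' Cov(x, y), and Cov(x, y) = lambda * phi * beta.
cov_score_y = beta * phi * sum(wi * li for wi, li in zip(w, loadings))
slope = cov_score_y / var_score

print(f"factor variance:          {phi:.4f}")
print(f"regression-score variance: {var_score:.4f}")
print(f"slope of y on the score:   {slope:.4f} (true beta = {beta})")
```

Under these assumptions the score variance comes out below the factor variance, while the slope of y on the score equals `beta`. In real data the weights come from estimated parameters, so the equality holds only approximately.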

--
You received this message because you are subscribed to the Google Groups "lavaan" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lavaan+un...@googlegroups.com.
To post to this group, send email to lav...@googlegroups.com.
Visit this group at https://groups.google.com/group/lavaan.
For more options, visit https://groups.google.com/d/optout.

Ajay Somaraju

Jul 25, 2018, 10:23:48 PM

to lav...@googlegroups.com

Thank you for your response! I really appreciate it.

Is there no way to correct for the indeterminacy and subsequent bias through Croon's method, Skrondal & Laake, etc.? Yves Rosseel suggests "a bias correcting method, with the newly developed standard error, is the only suitable alternative for SEM. While it has a higher standard error bias than SEM, it has a comparable bias, efficiency, mean square error, power, and type I error rate."

If factor scores are not generalizable why are they used? Do they serve any utility?

Thanks again,

Ajay

Edward Rigdon

Jul 25, 2018, 11:19:32 PM

to lav...@googlegroups.com

Ajay--

Well, first, you could use more and stronger indicators. That would reduce factor indeterminacy to some small and maybe inconsequential share of the factor's variance.

OR you could make indeterminacy go away entirely by modifying your model. Factor indeterminacy is the result of a rank problem, where you are trying to represent p observed variables in terms of k common factors plus p specific factors or residuals, so you are explaining p things using p + k things, and there is no unique best way to do that. But if, for example, you fix to 0 the variance of one specific factor per common factor, then the rank problem is solved and the solution becomes determinate (de Krijnen et al.).

OR you could try Schonemann and Steiger's "regression component analysis." You eliminate the problem by eliminating the assumption that common factors and specific factors are mutually orthogonal. It is a very simple transformation, yielding a solution where the regression components conform to the same covariance matrix as the factor model yet are determinate functions of the observed variables.

OR you could do the whole analysis within the same factor model. Then you don't need factor scores at all. Factor indeterminacy affects neither parameter estimates nor standard errors nor fit indices within the factor model.

OR you could ignore the problem entirely and just use the regression factor scores. That's a popular option.
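To put rough numbers on the "more and stronger indicators" option, here is a small sketch. It assumes one factor measured by p indicators that all share the same standardized loading, which is a simplification chosen for illustration. In that case the squared correlation between the factor and its regression-method score is p * l^2 / (p * l^2 + 1 - l^2). The last column is Guttman's lower bound, 2 * rho^2 - 1, on the correlation between two equally valid sets of completed factor scores. The function name and the grid of values are made up.

```python
def determinacy_sq(n_indicators, loading):
    """Squared factor-score determinacy for equal standardized loadings.

    Assumes one factor with unit variance and unique variances 1 - loading^2.
    """
    common = n_indicators * loading ** 2
    return common / (common + 1.0 - loading ** 2)

print(" p  loading  rho^2   min corr between alternative score sets")
for p in (3, 6, 12):
    for loading in (0.5, 0.7, 0.9):
        rho2 = determinacy_sq(p, loading)
        # Guttman's bound: two admissible completions of the factor scores
        # can correlate as low as 2 * rho^2 - 1.
        print(f"{p:2d}  {loading:7.1f}  {rho2:5.3f}  {2 * rho2 - 1:6.3f}")
```

With three indicators loading 0.5, half of the factor's variance is arbitrary and two admissible sets of scores can be uncorrelated. The arbitrary share shrinks as either p or the loading grows.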

"Bias correcting" factor scores just deal with the missing variance in regression factor scores, in effect generating arbitrary components that are pure random error. The method does, indeed, produce *a* set of conforming factor scores, just not the only set of conforming factor scores. So it is not quite honest to say that the results represent the factor in question--though the results *are* consistent with the factor in question.

Yves Rosseel

Jul 30, 2018, 5:04:57 PM

to lav...@googlegroups.com

Just for the record: in the 'bias-correcting method', we use factor
scores (in combination with Croon's correction) only to get consistent
estimates of the *structural parameters* (e.g., regressions among the
latent variables). That is our only goal. We do not actually 'use' these
factor scores at the individual level.

Yves.
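For readers who want to see what a Croon-type correction does to the structural slope, here is a population-level sketch. It is an illustration under stated assumptions, not lavaan's implementation: two separately fitted one-factor measurement blocks with no residual covariance across blocks, regression-method scores for both factors, and the structural model eta = beta * xi + zeta. All parameter values and names are made up, and in practice estimates and sample moments would replace the population quantities.

```python
from math import isclose

def regression_weights(loadings, uniquenesses, phi):
    """Regression-method weights for one factor with diagonal unique variances."""
    s = sum(l * l / t for l, t in zip(loadings, uniquenesses))
    return [phi * (l / t) / (1.0 + phi * s) for l, t in zip(loadings, uniquenesses)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical population values (assumptions for this illustration only).
lam_x, th_x = [0.8, 0.7, 0.6], [0.36, 0.51, 0.64]   # indicators of xi
lam_y, th_y = [0.7, 0.7, 0.7], [0.51, 0.51, 0.51]   # indicators of eta
phi_xi, beta, psi = 1.0, 0.5, 0.75
phi_eta = beta ** 2 * phi_xi + psi                   # Var(eta)

a_x = regression_weights(lam_x, th_x, phi_xi)
a_y = regression_weights(lam_y, th_y, phi_eta)

# Population moments of the factor scores f_xi = a_x'x and f_eta = a_y'y.
al_x, al_y = dot(a_x, lam_x), dot(a_y, lam_y)        # a'lambda for each block
err_x = sum(a * a * t for a, t in zip(a_x, th_x))    # a'Theta a for xi
var_fx = al_x ** 2 * phi_xi + err_x
cov_fxfy = al_x * al_y * beta * phi_xi

# Naive factor score regression: slope of f_eta on f_xi.
naive_slope = cov_fxfy / var_fx

# Croon-type correction: rescale the covariance, and remove the
# measurement-error part of the predictor score variance before rescaling.
cov_corrected = cov_fxfy / (al_x * al_y)
var_corrected = (var_fx - err_x) / al_x ** 2
corrected_slope = cov_corrected / var_corrected

print(f"true beta:       {beta:.4f}")
print(f"naive slope:     {naive_slope:.4f}")
print(f"corrected slope: {corrected_slope:.4f}")
```

In this setup the naive slope is attenuated because the outcome is itself a regression-method score, while the corrected slope recovers `beta`. Only the structural slope is recovered, and no individual's score is used along the way.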

Ajay Somaraju

Jul 30, 2018, 5:15:18 PM

to lav...@googlegroups.com

Thank you both so much for your informative responses. I really appreciate it :)
