Factor Scores

Joo Chung

Sep 19, 2012, 4:53:10 PM
to lav...@googlegroups.com
Hi,
I was wondering how lavaan generates factor scores. I know that predict() does this, but I have a hard time determining what exactly the package is doing. For example, factor scores from Mplus and from traditional refined methods for factor analysis (e.g., Thurstone regression, Bartlett scores) are somewhat like z-scores (i.e., they bear no resemblance to the scaling of the raw items, such as 1-5 Likert). However, the factor scores I generated using predict() are actually on the same scale as my raw items (in my case, 1-5 Likert). Is some kind of weighted average being implemented?
Thank you!

yrosseel

Sep 19, 2012, 5:09:24 PM
to lav...@googlegroups.com
On 09/19/2012 10:53 PM, Joo Chung wrote:
> Hi,
> I was wondering how lavaan generates factor scores. I know
> that predict() does this, but I have a hard time determining what
> exactly the package is doing. For example, factor scores from Mplus and
> from traditional refined methods for factor analysis (e.g., Thurstone
> regression, Bartlett scores) are somewhat like z-scores (i.e., they
> bear no resemblance to the scaling of the raw items, such as 1-5
> Likert). However, the factor scores I generated using predict() are
> actually on the same scale as my raw items (in my case, 1-5
> Likert).

lavaan uses the (classical) regression approach for computing factor
scores. At least for the Holzinger & Swineford data, the factor scores
are indeed scaled similarly to z-scores:

> example(cfa)
> head(predict(fit))
          visual     textual       speed
[1,] -0.81767692 -0.13754477  0.06150717
[2,]  0.04951972 -1.01272409  0.62549399
[3,] -0.76139654 -1.87228642 -0.84057276
[4,]  0.41934118  0.01848571 -0.27133731
[5,] -0.41590502 -0.12225014  0.19432948
[6,]  0.02325682 -1.32981725  0.70885385

while the observed variables range from 0.0 to 10.0. These factor scores
are very close to what Mplus reports. Could you send me your script/data
so I can reproduce your findings?
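
As a rough illustration of the regression approach mentioned above, the sketch below recomputes the scores by hand from the fitted model matrices and compares them with lavaan's own output. It assumes a more recent lavaan release in which lavPredict() and lavInspect() are available (neither function is named in this thread), reuses the Holzinger & Swineford model from the cfa() help page, and centers the indicators at their sample means, which is only appropriate when no mean structure is modelled.

```r
library(lavaan)

# Three-factor model from the cfa() help page
HS.model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'
fit <- cfa(HS.model, data = HolzingerSwineford1939)

# Factor scores as returned by lavaan (regression method)
fs_lavaan <- lavPredict(fit, method = "regression")

# Estimated model matrices: loadings, factor covariances, residual covariances
est    <- lavInspect(fit, "est")
Lambda <- unclass(est$lambda)
Psi    <- unclass(est$psi)
Theta  <- unclass(est$theta)

# Model-implied covariance matrix of the indicators
Sigma <- Lambda %*% Psi %*% t(Lambda) + Theta

# Indicators centered at their sample means (assumption: no mean structure)
X  <- as.matrix(HolzingerSwineford1939[, rownames(Lambda)])
Xc <- scale(X, center = TRUE, scale = FALSE)

# Regression (Thurstone) scores: (x - mean) Sigma^-1 Lambda Psi
fs_manual <- Xc %*% solve(Sigma) %*% Lambda %*% Psi

# Compare the hand-computed scores with lavaan's
max(abs(fs_manual - fs_lavaan))
diag(cor(fs_manual, fs_lavaan))
```

If the two sets of scores agree, the scores are on the scale of the latent variables rather than on the scale of the raw items.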

Yves.


Joo Chung

Sep 19, 2012, 6:20:55 PM
to lav...@googlegroups.com
Hi there,
I've attached the dataset I've been working with and the accompanying script.

A quick correction to my initial post: the items in the dataset are on a 0 to 3 Likert scale. I converted the items to ordinal first to take advantage of lavaan's new capabilities with ordinal data.

My initial model was the following:

------------
myModel <- '
f1 =~ V1 + V2 + V3 + V4
f2 =~ V5 + V6 + V7 + V8 + V9 + V10
f3 =~ V11 + V12 + V13 + V14 + V15 + V16
f4 =~ V17 + V18 + V19 + V20'
fit <- cfa(myModel, data = Subset)
------------ 

However, this was unable to produce factor scores (stating that the system was computationally singular) although I was able to get fit statistics.

I then tried adding residual variances for the observed variables, as follows:

------------ 
myModel <- '
#latent variables
f1 =~ V1 + V2 + V3 + V4
f2 =~ V5 + V6 + V7 + V8 + V9 + V10
f3 =~ V11 + V12 + V13 + V14 + V15 + V16
f4 =~ V17 + V18 + V19 + V20
#residual variances observed variables
V1 ~~ V1
V2 ~~ V2
V3 ~~ V3
V4 ~~ V4
V5 ~~ V5
V6 ~~ V6
V7 ~~ V7
V8 ~~ V8
V9 ~~ V9
V10 ~~ V10
V11 ~~ V11
V12 ~~ V12
V13 ~~ V13
V14 ~~ V14
V15 ~~ V15
V16 ~~ V16
V17 ~~ V17
V18 ~~ V18
V19 ~~ V19
V20 ~~ V20'
fit <- cfa(myModel, data = Subset)

------------ 

This model successfully produced factor scores. Interestingly, the resulting factor scores range approximately from 1 to 4 (not 0 to 3, as in the observed variables). I suspect that the additional residual variances had something to do with this, but I am not well versed enough in SEM to get a good sense of the problem.
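
For continuous indicators, explicit residual-variance lines of this kind would not be expected to change the model, because cfa() adds residual variances automatically (its auto.var default is TRUE). The sketch below checks this on the Holzinger & Swineford data, used here as a stand-in since the attached dataset is not reproduced in the thread; the two-factor model and the object names are illustrative only. It says nothing about what happens when the indicators are declared ordered.

```r
library(lavaan)

# Measurement model only; residual variances are added by cfa() by default
model_short <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
'

# Same model with the residual variances written out explicitly
model_long <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  x1 ~~ x1
  x2 ~~ x2
  x3 ~~ x3
  x4 ~~ x4
  x5 ~~ x5
  x6 ~~ x6
'

fit_short <- cfa(model_short, data = HolzingerSwineford1939)
fit_long  <- cfa(model_long,  data = HolzingerSwineford1939)

# Number of free parameters, degrees of freedom, and chi-square for each fit
fm_short <- fitMeasures(fit_short, c("npar", "df", "chisq"))
fm_long  <- fitMeasures(fit_long,  c("npar", "df", "chisq"))
rbind(short = fm_short, long = fm_long)
```

Identical rows would indicate that, with continuous data, the extra lines only spell out what cfa() already does.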

Thank you for your hard work!
Example.RData
Example.r

yrosseel

Sep 20, 2012, 4:00:21 AM
to lav...@googlegroups.com
On 09/20/2012 12:20 AM, Joo Chung wrote:
> Hi there,
> I've attached the dataset I've been working with and the accompanying
> script.
>
> A quick correction to my initial post: the items in the dataset are
> on a 0 to 3 Likert scale. I converted the items to ordinal
> first to take advantage of lavaan's new capabilities with ordinal data.

This explains it all. The predict() function cannot yet compute factor
scores when the observed variables are ordinal. It should throw an error
in that case. Do not use the factor scores (yet) if you have categorical
data. I need to fix this first.
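
For readers on a later lavaan release than the one discussed here, the following is a minimal sketch of how factor scores for ordered indicators might be requested. It assumes that lavPredict() with method = "EBM" (empirical Bayes modal) is available, as described in later lavaan documentation rather than in this thread. The data are a hypothetical stand-in: six Holzinger & Swineford items cut at their quartiles into 0 to 3 categories, not the dataset attached above.

```r
library(lavaan)

# Hypothetical stand-in data: cut six continuous items at their quartiles
# so that each item takes the values 0, 1, 2, 3
items <- paste0("x", 1:6)
dat <- HolzingerSwineford1939[, items]
dat[] <- lapply(dat, function(v) {
  findInterval(v, quantile(v, probs = c(0.25, 0.50, 0.75)))
})

model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
'

# Declare the indicators as ordered so that a categorical estimator is used
fit_ord <- cfa(model, data = dat, ordered = items)

# Factor scores for the categorical case (assumed available in later versions)
fs_ord <- lavPredict(fit_ord, method = "EBM")
summary(fs_ord)
```

The summary gives a quick check of whether the scores sit on a latent, roughly standardized scale or on the 0 to 3 item scale.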

Yves.

Joo Chung

Sep 20, 2012, 12:57:46 PM
to lav...@googlegroups.com
I see :)
I'm looking forward to your next update!