Factor Scores

Joo Chung

Sep 19, 2012, 4:53:10 PM

to lav...@googlegroups.com

Hi,

I was wondering how lavaan generates factor scores. I know that predict() does this, but I have a hard time determining what exactly the package is doing. For example, factor scores from Mplus and traditional refined methods for factor analysis (e.g., Thurstone regression, Bartlett scores) are somewhat like z-scores (i.e., they bear no resemblance to the scaling of the raw items, such as 1-5 Likert). However, the factor scores I generated using predict() are actually scaled the same as my raw items (in my case, 1-5 Likert). Is some kind of weighted average being implemented?

Thank you!

yrosseel

Sep 19, 2012, 5:09:24 PM

to lav...@googlegroups.com

On 09/19/2012 10:53 PM, Joo Chung wrote:
> Hi,
> I was wondering how lavaan generates factor scores. I know
> that predict() does this, but I have a hard time determining what
> exactly the package is doing. For example, factor scores from Mplus and
> traditional refined methods for factor analysis (e.g., Thurstone
> regression, Bartlett scores) are somewhat like z-scores (i.e., they
> bear no resemblance to the scaling of the raw items, such as 1-5
> Likert). However, the factor scores I generated using predict() are
> actually scaled the same as my raw items (in my case, 1-5 Likert).

lavaan uses the (classical) regression approach for computing factor
scores. At least for the Holzinger & Swineford data, the factor scores
are indeed scaled similarly to z-scores:

> example(cfa)
> head(predict(fit))
          visual     textual       speed
[1,] -0.81767692 -0.13754477  0.06150717
[2,]  0.04951972 -1.01272409  0.62549399
[3,] -0.76139654 -1.87228642 -0.84057276
[4,]  0.41934118  0.01848571 -0.27133731
[5,] -0.41590502 -0.12225014  0.19432948
[6,]  0.02325682 -1.32981725  0.70885385

while the observed variables range from 0.0 to 10.0. These factor scores
are very close to what Mplus reports. Could you send me your script/data
so I can reproduce your findings?

Yves.
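
For readers who want to see why regression-method scores end up on a z-like scale, here is a minimal base-R sketch of the textbook (Thurstone) regression formula, scores = (y - mean) Sigma^-1 Lambda Psi with Sigma = Lambda Psi Lambda' + Theta. It is not lavaan's internal code and does not call predict(); the one-factor population, the parameter values, and every variable name below are assumptions made up for illustration.

```r
set.seed(42)

# Hypothetical one-factor population: four items on a roughly 0-10 scale
lambda <- matrix(c(1.0, 0.8, 1.2, 0.9), ncol = 1)  # loadings (Lambda)
psi    <- matrix(1.5)                              # factor variance (Psi)
theta  <- diag(c(1.0, 1.2, 0.8, 1.1))              # residual variances (Theta)
nu     <- c(5, 4, 6, 5)                            # item intercepts

# Simulate n observations: y = nu + Lambda * eta + eps
n   <- 500
eta <- rnorm(n, sd = sqrt(psi[1, 1]))
eps <- matrix(rnorm(n * 4), n, 4) %*% sqrt(theta)  # theta is diagonal
y   <- sweep(matrix(eta, ncol = 1) %*% t(lambda) + eps, 2, nu, "+")

# Model-implied covariance matrix of the items
sigma <- lambda %*% psi %*% t(lambda) + theta

# Regression factor scores: centre the items, then apply the weights
yc     <- sweep(y, 2, colMeans(y), "-")
scores <- yc %*% solve(sigma) %*% lambda %*% psi

# Items sit around their intercepts; scores are centred at zero
round(colMeans(y), 2)
round(mean(scores), 10)
round(sd(scores), 2)
round(cor(scores[, 1], eta), 2)
```

Because the items are centred before the weights are applied, the scores average zero whatever the response scale of the items, and their spread is bounded by the factor variance rather than by the item range.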

Joo Chung

Sep 19, 2012, 6:20:55 PM

to lav...@googlegroups.com

Hi there,

I've attached the dataset I've been working with and the accompanying script.

A quick correction to my initial post: the items in the dataset are on a 0-3 Likert scale. I converted the items to ordinal first to take advantage of lavaan's new capabilities with ordinal data.

My initial model was the following:

------------

myModel <- '
f1 =~ V1 + V2 + V3 + V4
f2 =~ V5 + V6 + V7 + V8 + V9 + V10
f3 =~ V11 + V12 + V13 + V14 + V15 + V16
f4 =~ V17 + V18 + V19 + V20'

fit <- cfa(myModel, data = Subset)

------------

However, this model was unable to produce factor scores (the error stated that the system was computationally singular), although I was able to get fit statistics.

I then tried adding residual variances for the observed variables, as follows:

------------

myModel <- '
# latent variables
f1 =~ V1 + V2 + V3 + V4
f2 =~ V5 + V6 + V7 + V8 + V9 + V10
f3 =~ V11 + V12 + V13 + V14 + V15 + V16
f4 =~ V17 + V18 + V19 + V20

# residual variances of the observed variables
V1 ~~ V1
V2 ~~ V2
V3 ~~ V3
V4 ~~ V4
V5 ~~ V5
V6 ~~ V6
V7 ~~ V7
V8 ~~ V8
V9 ~~ V9
V10 ~~ V10
V11 ~~ V11
V12 ~~ V12
V13 ~~ V13
V14 ~~ V14
V15 ~~ V15
V16 ~~ V16
V17 ~~ V17
V18 ~~ V18
V19 ~~ V19
V20 ~~ V20'

fit <- cfa(myModel, data = Subset)

------------

This model successfully produced factor scores. Interestingly, the resulting factor scores range from approximately 1 to 4 (not 0 to 3, as in the observed variables). I suspect that the additional residual variances had something to do with this, but I am not well versed enough in SEM to get a good sense of the problem.

Thank you for your hard work!

Example.RData

Example.r

yrosseel

Sep 20, 2012, 4:00:21 AM

to lav...@googlegroups.com

On 09/20/2012 12:20 AM, Joo Chung wrote:
> Hi there,
> I've attached the dataset I've been working with and the accompanying
> script.
>
> A quick correction to my initial post: the items in the dataset are
> on a 0-3 Likert scale. I converted the items to ordinal first to
> take advantage of lavaan's new capabilities with ordinal data.

This explains it all. The predict() function cannot yet compute factor
scores if the observed variables are ordinal. It should throw an error
in that case instead of returning scores. Do not use them (yet) if you
have categorical data. I need to fix this first.

Yves.
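
Until that is fixed, it may help to check whether any items are stored as ordered factors before relying on predict(). The base-R sketch below shows one way to do that; the data frame is a made-up stand-in for the poster's `Subset` (the attached data are not reproduced here), and the names `is_ord`, `has_ordinal`, `codes` and `labels` are hypothetical. It also shows a general R pitfall when turning ordered factors back into numbers: as.integer() returns the internal codes, which start at 1, rather than the original 0-3 labels. Whether that has anything to do with the 1-to-4 range reported above is not established in this thread.

```r
# Hypothetical stand-in for the poster's data: four 0-3 Likert items
# stored as ordered factors
set.seed(1)
Subset <- as.data.frame(lapply(1:4, function(i) {
  factor(sample(0:3, 50, replace = TRUE), levels = 0:3, ordered = TRUE)
}))
names(Subset) <- paste0("V", 1:4)

# Flag ordered columns before asking for factor scores
is_ord      <- vapply(Subset, is.ordered, logical(1))
has_ordinal <- any(is_ord)
if (has_ordinal) {
  message("Ordered items found: ",
          paste(names(Subset)[is_ord], collapse = ", "))
}

# Converting back to numeric: internal codes versus original labels
codes  <- vapply(Subset, as.integer, integer(50))               # values in 1..4
labels <- vapply(Subset, function(x) as.integer(as.character(x)),
                 integer(50))                                   # values in 0..3
range(codes)
range(labels)
```

Going through as.character() first keeps the original response labels; as.integer() alone shifts every value up by one for items coded from 0.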

Joo Chung

Sep 20, 2012, 12:57:46 PM

to lav...@googlegroups.com

I see :)

I'm looking forward to your next update!
