"Thurstonian IRT-model for binary forced choice data" - computable in lavaan?


Henning Bumann

Dec 31, 2012, 6:08:43 AM
to lav...@googlegroups.com

Hello everyone,

I'm fairly new to structural equation modeling in general and do not know the lavaan package very well so far. 

My goal is to fit a "Thurstonian IRT model" to binary forced-choice questionnaire data. I would like to know whether this can already be done in lavaan and, if so, I would be grateful for some model-fitting advice.

The model was developed by Anna Brown and published recently. A graphical representation looks like this:

Some characteristics of the model:

  • The manifest variables represent a binary choice process: if the first item is preferred to the other, the value of the variable is 1; otherwise it is 0.
    • lavaan seems to be able to work with this kind of data. Are there any limitations that I should be aware of?

## sample data
n <- 200
i1i2 <- sample(c(0,1), n, replace=TRUE)
i3i4 <- sample(c(0,1), n, replace=TRUE)
i5i6 <- sample(c(0,1), n, replace=TRUE)
i7i8 <- sample(c(0,1), n, replace=TRUE)

data <- as.data.frame(cbind(i1i2, i3i4, i5i6, i7i8))
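
Independent coin flips, as in the sample data above, carry no trait information, so a factor model fitted to them has nothing to recover. A minimal sketch of a more informative simulation follows. It assumes each item has a latent utility (loading times trait score plus a unique error) and that the pair variable is 1 when the first item's utility exceeds the second's. The loading of 0.8, the trait correlation of 0.3, the assignment of odd-numbered items to trait 1 and even-numbered items to trait 2, and the helper name `utility` are illustrative assumptions, not values from the original publication.

```r
set.seed(123)
n <- 200

# Two correlated latent traits (correlation 0.3 is an arbitrary assumption)
eta1 <- rnorm(n)
eta2 <- 0.3 * eta1 + sqrt(1 - 0.3^2) * rnorm(n)

# Hypothetical helper: latent utility of one item = loading * trait + unique error
utility <- function(trait, loading = 0.8) {
  loading * trait + rnorm(length(trait))
}

# Assumption: odd-numbered items measure trait 1, even-numbered items trait 2.
# Each pair variable is 1 if the first item is preferred, otherwise 0.
i1i2 <- as.integer(utility(eta1) > utility(eta2))
i3i4 <- as.integer(utility(eta1) > utility(eta2))
i5i6 <- as.integer(utility(eta1) > utility(eta2))
i7i8 <- as.integer(utility(eta1) > utility(eta2))

sim_data <- data.frame(i1i2, i3i4, i5i6, i7i8)
head(sim_data)
```

Each call to `utility()` draws fresh unique errors, so the four pairs are distinct items that share only the two traits.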
  • the manifest variables measure two different latent traits

## specify and fit model
model <- '
T1 =~ i1i2 + i3i4 + i5i6 + i7i8
'


## attempts to identify a second latent variable with negative loadings

## add:
'
T2 =~ i1i2 + i3i4 + i5i6 + i7i8  # doesnt fit
T2 =~ -1*i1i2 + -1*i3i4 + -1*i5i6 + -1*i7i8  # all Parameters fixed, no estimate
T2 =~ -i1i2 - i3i4 - i5i6 - i7i8  # doesnt fit
'

  • The model is fit using DWLS estimation, and the latent trait scores are estimated via a MAP (Bayes) algorithm.
    • I saw that DWLS estimation is newly implemented in lavaan. Will this estimation work for this kind of data, and how can the latent traits be estimated for each case?

fit <- lavaan(model=model, model.type="sem", data=data, estimator="DWLS")
summary(fit)

## attempts to access estimates for every case
parameterEstimates(fit)
coef(fit, type="user")
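
`parameterEstimates()` and `coef()` return model parameters (loadings, thresholds, variances), not scores for individual cases. In lavaan, case-level estimates of the latent variables come from `lavPredict()`. The sketch below illustrates this on simulated one-factor binary data. The data-generating values and the helper `make_item` are assumptions, the fit is skipped when lavaan is not installed, and the sketch does not establish whether the scoring method matches the MAP estimator used in the Mplus example.

```r
set.seed(1)
n <- 300
eta <- rnorm(n)  # true trait scores, used only to generate the data

# Hypothetical helper: binary item from a latent response with a given loading
make_item <- function(loading) as.integer(loading * eta + rnorm(n) > 0)

dat <- data.frame(
  y1 = make_item(0.8),
  y2 = make_item(0.7),
  y3 = make_item(0.9),
  y4 = make_item(0.6)
)

one_factor <- 'T1 =~ y1 + y2 + y3 + y4'

scores <- NULL
if (requireNamespace("lavaan", quietly = TRUE)) {
  # Declare all indicators as ordered so lavaan treats them as categorical
  fit_cat <- lavaan::cfa(one_factor, data = dat,
                         ordered = names(dat), std.lv = TRUE)
  # One row of factor scores per case
  scores <- lavaan::lavPredict(fit_cat)
  print(head(scores))
}
```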

  • For model identification, the variances of the latent variables need to be fixed to 1, while the "uniqueness" of the manifest variables needs to be fixed to 0.5.
    • Does uniqueness equal the variance of the manifest variables? How do I fix it?

## attempts to fix Variances and Uniqueness
## add:
'
T1 ~~ 1*T1
i1i2 ~~0.5*i1i2
i3i4 ~~0.5*i3i4
i5i6 ~~0.5*i5i6
i7i8 ~~0.5*i7i8
'

I hope I have provided enough information to determine whether lavaan will be able to compute this model. Looking forward to your answers,

Henning

PS: For those of you who are familiar with Mplus, I added an Mplus example from the original publication.

MPLUS-Syntax

TITLE: Example forced-choice questionnaire with 3 Triplets (=9 Items) measuring 3 Traits

DATA: FILE IS /Users/Henning/Data/Consulting;

VARIABLE: NAMES ARE i25i26 i83i84 i141i142 i199i200;
 USEVARIABLES ARE ALL;
 CATEGORICAL ARE ALL;

ANALYSIS:
 ESTIMATOR IS wlsm;
 PARAMETERIZATION IS theta;

MODEL:
! latent traits are indicated by binary outcomes directly
Trait1 BY i1i2*1 i1i3*1 (l1)
 i4i5*1 i4i6*1 (l4)
 i7i8*1 i7i9*1 (l7);
Trait2 BY i1i2*-1 (l2_m)
 i2i3*1 (l2)
 i4i5*-1 (l5_m)
 i5i6*1 (l5)
 i7i8*-1 (l8_m)
 i8i9*1 (l8);
Trait3 BY i1i3*-1 i2i3*-1 (l3_m)
 i4i6*-1 i5i6*-1 (l6_m)
 i7i9*-1 i8i9*-1 (l9_m);

! variances for all traits are set to 1
 Trait1-Trait3@1;

! traits are freely correlated
 Trait1 WITH Trait2* Trait3*;
 Trait2 WITH Trait3*;

! pairwise errors are free; parameters are declared here to impose constraints later
 i1i2*1 (e1e2);
 i1i3*1 (e1e3);
 i2i3*1 (e2e3);
 i4i5*1 (e4e5);
 i4i6*1 (e4e6);
 i5i6*1 (e5e6);
 i7i8*1 (e7e8);
 i7i9*1 (e7e9);
 i8i9*1 (e8e9);

! errors related to the same utility are correlated, some are with minus sign
 i1i2 WITH i1i3*.5 (e1);
 i1i2 WITH i2i3*-.5 (e2_m);
 i1i3 WITH i2i3*.5 (e3);
 i4i5 WITH i4i6*.5 (e4);
 i4i5 WITH i5i6*-.5 (e5_m);
 i4i6 WITH i5i6*.5 (e6);
 i7i8 WITH i7i9*.5 (e7);
 i7i8 WITH i8i9*-.5 (e8_m);
 i7i9 WITH i8i9*.5 (e9);

MODEL CONSTRAINT:
! loadings relating to the same item are equal in absolute value
 l2=-l2_m; l5=-l5_m; l8=-l8_m;

! errors of pairs are equal to sum of 2 utility errors
 e1e2=e1-e2_m;
 e1e3=e1+e3;
 e2e3=-e2_m+e3;
 e4e5=e4-e5_m;
 e4e6=e4+e6;
 e5e6=-e5_m+e6;
 e7e8=e7-e8_m;
 e7e9=e7+e9;
 e8e9=-e8_m+e9;

! fixing unique variances of one utility per block to identify the model
 e3=.5; e6=.5; e9=.5;

SAVEDATA: ! trait scores for individuals are estimated and saved in a file
 FILE IS ExampleTestResults.dat;
 SAVE=FSCORES;
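
The signs and sums in the MODEL CONSTRAINT section follow from treating each pair outcome as a difference of two item utilities. A short derivation for the first triplet is sketched below. The notation (utility t_i, unique variance ψ_i², trait index a(i)) is an assumption introduced for illustration, and the unique errors of different items are assumed to be uncorrelated.

```latex
\begin{align*}
t_i &= \mu_i + \lambda_i \eta_{a(i)} + \varepsilon_i,
  \qquad \operatorname{Var}(\varepsilon_i) = \psi_i^2 \\
y^*_{ik} &= t_i - t_k
  = (\mu_i - \mu_k) + \lambda_i \eta_{a(i)} - \lambda_k \eta_{a(k)}
    + (\varepsilon_i - \varepsilon_k),
  \qquad y_{ik} = 1 \text{ if } y^*_{ik} \ge 0 \\
\operatorname{Var}(\varepsilon_1 - \varepsilon_2) &= \psi_1^2 + \psi_2^2 \\
\operatorname{Cov}(\varepsilon_1 - \varepsilon_2,\ \varepsilon_1 - \varepsilon_3) &= \psi_1^2 \\
\operatorname{Cov}(\varepsilon_1 - \varepsilon_2,\ \varepsilon_2 - \varepsilon_3) &= -\psi_2^2 \\
\operatorname{Cov}(\varepsilon_1 - \varepsilon_3,\ \varepsilon_2 - \varepsilon_3) &= \psi_3^2
\end{align*}
```

In the labels of the syntax, e1 corresponds to ψ_1², e2_m to −ψ_2², and e3 to ψ_3², so that e1e2 = e1 − e2_m is the sum ψ_1² + ψ_2². The "uniqueness" is therefore the unique variance of a single item's utility, while the residual variance of a pair variable is the sum of two such uniquenesses.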


yrosseel

Jan 3, 2013, 5:28:46 AM
to lav...@googlegroups.com
On 12/31/2012 12:08 PM, Henning Bumann wrote:
> Hello everyone,
>
> I'm fairly new to structural equation modeling in general and do not
> know the lavaan package very well so far.
>
> My goal is to fit an "Thurstonian IRT-Model" to binary forced choice
> questionnaire data and I would like to know if this can already be done
> in lavaan

I don't think so. It would seem (and this is also reflected in the Mplus
syntax) that these models rely on the so-called 'theta' parameterization
for categorical variables. lavaan currently (0.5-11) only supports the
'delta' parameterization. In the 'theta' parameterization, the residual
errors of the (binary) indicators can be estimated (if they are
identified; by default, they are fixed to 1), while in the 'delta'
parameterization, they are never estimated (they are a function of other
model parameters).
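
The difference between the two parameterizations can be written out for a single binary indicator. The notation below (latent response y*, loading vector λ, factor covariance matrix Ψ, residual variance θ) follows a common convention and is an assumption here, not something stated in the thread.

```latex
\begin{align*}
y^* &= \nu + \lambda^\top \eta + \varepsilon,
  \qquad \operatorname{Var}(\eta) = \Psi,
  \quad \operatorname{Var}(\varepsilon) = \theta \\
\operatorname{Var}(y^*) &= \lambda^\top \Psi \lambda + \theta \\
\text{delta:}\quad & \operatorname{Var}(y^*) = 1
  \;\Rightarrow\; \theta = 1 - \lambda^\top \Psi \lambda \\
\text{theta:}\quad & \theta \text{ is a model parameter (fixed to 1 by default)}
  \;\Rightarrow\; \operatorname{Var}(y^*) = \lambda^\top \Psi \lambda + \theta
\end{align*}
```

Because the Thurstonian IRT model places constraints directly on the residual variances θ, it needs the theta form.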

But I'm working on the theta parameterization, and it should be
available in 0.5-12. Once we have this in place, I believe lavaan will
be able to fit these Thurstonian IRT models.

But a much nicer approach (IMHO) would be to have a dedicated function
(say, thurstonianIRT()) where there is no need to force a SEM
parameterization upon a non-SEM model, but where we fit the needed
parameters directly.

Yves.

Jared Harpole

Sep 11, 2014, 4:48:05 PM
to lav...@googlegroups.com
Yves,

I see that the theta parameterization is now available. Is it possible to make the factor loadings negative to fit the Thurstonian IRT model as Henning Bumann posted above? Here is the code from above.

## specify and fit model
model <- '
T1 =~ i1i2 + i3i4 + i5i6 + i7i8
'


## attempts to identify a second latent variable with negative loadings

## add:
'
T2 =~ i1i2 + i3i4 + i5i6 + i7i8  # doesnt fit
T2 =~ -1*i1i2 + -1*i3i4 + -1*i5i6 + -1*i7i8  # all Parameters fixed, no estimate
T2 =~ -i1i2 - i3i4 - i5i6 - i7i8  # doesnt fit
'






Yves Rosseel

Sep 18, 2014, 3:02:39 AM
to lav...@googlegroups.com

> '
> T2 =~ i1i2 + i3i4 + i5i6 + i7i8 # doesnt fit
> T2 =~ -1*i1i2 + -1*i3i4 + -1*i5i6 + -1*i7i8 # all Parameters fixed, no
> estimate
> T2 =~ -i1i2 - i3i4 - i5i6 - i7i8 # doesnt fit
> '

The last line is wrong syntax. You can suggest negative starting values,
as in

T2 =~ start(-1)*i1i2 + start(-1)*i3i4 + start(-1)*i5i6 + start(-1)*i7i8
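
The difference between fixing a loading at -1 and merely starting the optimizer at -1 can be inspected in the parameter table. The sketch below uses `lavaanify()` for that purpose; the helper name `loading_status` is hypothetical, and the inspection is skipped when lavaan is not installed.

```r
# Numeric modifier: the loading is fixed at -1 and not estimated
fixed_syntax <- 'T2 =~ -1*i1i2 + -1*i3i4 + -1*i5i6 + -1*i7i8'

# start() modifier: the loading stays free, -1 is only the starting value
start_syntax <- 'T2 =~ start(-1)*i1i2 + start(-1)*i3i4 + start(-1)*i5i6 + start(-1)*i7i8'

# Hypothetical helper: show the loading rows of the parameter table
loading_status <- function(syntax) {
  pt <- lavaan::lavaanify(syntax)
  pt[pt$op == "=~", c("lhs", "op", "rhs", "free", "ustart")]
}

if (requireNamespace("lavaan", quietly = TRUE)) {
  print(loading_status(fixed_syntax))  # free is 0 for every loading
  print(loading_status(start_syntax))  # free is nonzero for every loading
}
```

In the parameter table, `free == 0` marks a fixed parameter and `ustart` holds the user-supplied value, which is a fixed value in the first case and a starting value in the second.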

Yves.

Seongho Bae

Aug 29, 2015, 12:09:05 PM
to lavaan
Hello, Yves,

How about nowadays? Can I fit a "Thurstonian IRT model" with lavaan?

Seongho

On Thursday, January 3, 2013 at 7:28:46 PM UTC+9, yrosseel wrote:

peterlus...@gmail.com

Aug 16, 2016, 11:19:02 AM
to lavaan
Hi,

I'm interested in fitting a Thurstonian IRT Model in lavaan, too. As I understand, it should be possible since the implementation of the theta parametrization. However, I can't figure out how to write these two parts from the Mplus syntax in lavaan:

! pairwise errors are free; parameters are declared here to impose constraints later
 i1i2*1 (e1e2);
 i1i3*1 (e1e3);
 i2i3*1 (e2e3);
 ...

! errors of pairs are equal to sum of 2 utility errors
 e1e2=e1-e2_m;
 e1e3=e1+e3;
 e2e3=-e2_m+e3;
 e4e5=e4-e5_m;
 ...


Also, I'm wondering if lavaan is going to be able to compute MAP factor scores.

Any help would be greatly appreciated! Thanks!

Terrence Jorgensen

Aug 18, 2016, 8:56:47 AM
to lavaan
> I can't figure out how to write these two parts from the Mplus syntax in lavaan:

Use the double equals sign (==), which is the equality operator in R, to write the constraints.  Check out the bottom example on this page:


> Also, I'm wondering if lavaan is going to be able to compute MAP factor scores.

According to the help page (?lavPredict), no.
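
The `==` advice can be sketched for the three-triplet Mplus example posted at the top of the thread. This is an untested translation under stated assumptions, not a verified equivalent of the Mplus run: labels are reused from the Mplus syntax, the wrapper `fit_tirt` is hypothetical, and freeing residual variances of categorical indicators requires a lavaan version that allows it under the theta parameterization.

```r
tirt_model <- '
  # loadings: a repeated label imposes equality (Mplus: one label per line)
  Trait1 =~ l1*i1i2 + l1*i1i3 + l4*i4i5 + l4*i4i6 + l7*i7i8 + l7*i7i9
  Trait2 =~ l2_m*i1i2 + l2*i2i3 + l5_m*i4i5 + l5*i5i6 + l8_m*i7i8 + l8*i8i9
  Trait3 =~ l3_m*i1i3 + l3_m*i2i3 + l6_m*i4i6 + l6_m*i5i6 + l9_m*i7i9 + l9_m*i8i9

  # trait variances fixed to 1, traits freely correlated
  Trait1 ~~ 1*Trait1
  Trait2 ~~ 1*Trait2
  Trait3 ~~ 1*Trait3
  Trait1 ~~ Trait2 + Trait3
  Trait2 ~~ Trait3

  # residual variances of the pairs, labelled for the constraints below
  i1i2 ~~ e1e2*i1i2
  i1i3 ~~ e1e3*i1i3
  i2i3 ~~ e2e3*i2i3
  i4i5 ~~ e4e5*i4i5
  i4i6 ~~ e4e6*i4i6
  i5i6 ~~ e5e6*i5i6
  i7i8 ~~ e7e8*i7i8
  i7i9 ~~ e7e9*i7i9
  i8i9 ~~ e8e9*i8i9

  # residual covariances between pairs that share an item
  i1i2 ~~ e1*i1i3
  i1i2 ~~ e2_m*i2i3
  i1i3 ~~ e3*i2i3
  i4i5 ~~ e4*i4i6
  i4i5 ~~ e5_m*i5i6
  i4i6 ~~ e6*i5i6
  i7i8 ~~ e7*i7i9
  i7i8 ~~ e8_m*i8i9
  i7i9 ~~ e9*i8i9

  # MODEL CONSTRAINT section, written with ==
  l2 == -l2_m
  l5 == -l5_m
  l8 == -l8_m
  e1e2 == e1 - e2_m
  e1e3 == e1 + e3
  e2e3 == -e2_m + e3
  e4e5 == e4 - e5_m
  e4e6 == e4 + e6
  e5e6 == -e5_m + e6
  e7e8 == e7 - e8_m
  e7e9 == e7 + e9
  e8e9 == -e8_m + e9
  e3 == 0.5
  e6 == 0.5
  e9 == 0.5
'

# Hypothetical wrapper: data must contain the nine binary pair variables
fit_tirt <- function(data) {
  if (!requireNamespace("lavaan", quietly = TRUE)) {
    stop("package 'lavaan' is required")
  }
  lavaan::lavaan(tirt_model, data = data, ordered = names(data),
                 auto.th = TRUE, parameterization = "theta",
                 estimator = "WLSM")
}
```

Starting values such as `start(-1)*i1i2` can be added alongside the labels for the negative loadings, as suggested earlier in the thread.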

Terrence D. Jorgensen
Postdoctoral Researcher, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

peterlus...@gmail.com

Aug 23, 2016, 3:57:17 AM
to lavaan
Thank you, I will try to implement it soon!

Paul Buerkner

May 24, 2017, 3:41:13 AM
to lavaan
Has anyone had success yet fitting the Thurstonian IRT model in lavaan?

In particular, how can I label the scales y* in the "theta" parameterization and put constraints on them?

Terrence Jorgensen

May 24, 2017, 6:53:32 PM
to lavaan
> how can I label the scales y* in the "theta" parameterization and put constraints on them?

y1 ~*~ c(scale1, scale1)*y1

Keep in mind, though, that the first group's scales will always be fixed by default, and only other groups' scale parameters will be freed after applying equality constraints to measurement parameters.  Here is an example that frees the scale parameters for 1 item and constrains them to equality, rather than fixing the first to 1 and freeing the second (the default).  

myData <- read.table("http://www.statmodel.com/usersguide/chap5/ex5.16.dat")
names(myData) <- c("u1","u2","u3","u4","u5","u6","x1","x2","x3","g")

model <- '
  f1 =~ u1 + u2 + u3
  f2 =~ u4 + u5 + u6
'

fit1 <- cfa(model, data = myData, ordered = paste0("u", 1:6),
            group = "g", group.equal = c("loadings","thresholds"))
summary(fit1, fit.measures = TRUE, standardized = TRUE)

scales <- ' u1 ~*~ c(NA, NA)*u1 + c(scale1, scale1)*u1 '
fit2 <- cfa(c(model, scales), data = myData, ordered = paste0("u", 1:6),
            group = "g", group.equal = c("loadings","thresholds"))
summary(fit2, fit.measures = TRUE, standardized = TRUE)

Note that these are not statistically equivalent: the second fits much worse, with the same df = 18.  The first model shows that the latent scales are quite different, so the worse fit makes sense, but I can't think of how to test the equivalence assumption.  I'm also not sure whether this is what you need to run a Thurstonian IRT model.

Yves Rosseel

May 25, 2017, 1:54:14 PM
to Paul Buerkner, lav...@googlegroups.com
This will not work in lavaan 0.5-23, as lavaan does not allow residual
variances of categorical observed variables to be set free (in a
single-group analysis), even if they are identified.

I changed this in dev 0.6-1 (commit 1132). The lavaan syntax and output
(as well as the Mplus output) are attached.

Yves.
triplets.out
yr.R
yr.txt

watri...@gmail.com

Oct 13, 2017, 9:23:29 AM
to lavaan
It's great to see how lavaan gets better and better - thank you a lot!

Will there be a possibility to estimate MAP factor scores for the model, too?

Ki Cole

Mar 5, 2018, 5:16:06 PM
to lavaan
Yves,

I really appreciate all the updates you've made to the 'lavaan' package regarding Thurstonian IRT. 

In the example given, you have three traits. In a four-trait, most/least, forced-choice situation, there is missing-not-at-random data. Does 'lavaan' handle this type of missing data? The package information indicates the options of dealing with missing data are only valid for MAR and MCAR types of data.

Thank you.
klc 

Yves Rosseel

Mar 11, 2018, 12:51:37 PM
to lav...@googlegroups.com
> most/least, forced-choice situation, there is missing-not-at-random
> data. Does 'lavaan' handle this type of missing data?

No!

Yves.