Multiple raters and multiple items question

Nils Myszkowski

Dec 4, 2017, 8:50:03 PM

to mirt-package

Hi all,

I have a dataset of Likert-type judgments (from multiple judges) of productions by multiple participants on multiple tasks. Not all participants took the same tasks; it is a planned-missingness design in terms of tasks.

I am trying to estimate each individual's ability reflected in the different productions judged, taking into account both judge characteristics (different uses of the judgment scale, severities) with a GRM or GPCM and (random) task difficulty. That's the last part I'm a bit confused with.

I know how to fit a GRM and GPCM model to account for judge variability (like a "regular" IRT), but I am unsure how to also control for task difficulty when the participants have taken different tasks.

So far what I came up with from the vignette is to use a "regular" GRM or a GPCM with the judgments (treated as items), with the function mixedmirt(), where Theta would be modeled as a function of an intercept (the individual's ability) + a random intercept per task:

model = 1,
itemtype = "graded",
lr.fixed = ~ 1 + id, #the latent trait estimate depends on the participant
lr.random = ~ 1|task #the latent trait depends on the task (randomly picked from a population of tasks)

And my task-controlled ability estimates would be the "id" mean estimates, I guess (which I hope to export in some way). Also I hope to study task difficulty with the fixed task effect.

Does that sound about right?

---

Also, from my understanding, the previous model assumes that for a task, there is a unique difficulty level for all participants. So a follow-up question would be: what if I can't make that assumption (in other terms, if the difficulty of the task can vary randomly per participant), or want to test it? Would I need a random slopes and intercept model, such as...

lr.fixed = ~ 1 + id
lr.random = ~ 1|task + id|task

?

Sorry if all of this is confusing, and thank you in advance for your help...and also: Congrats for such a great R package.

Nils

Phil Chalmers

Dec 12, 2017, 2:45:54 PM

to Nils Myszkowski, mirt-package

On Mon, Dec 4, 2017 at 8:50 PM, Nils Myszkowski <nilsmyszko...@gmail.com> wrote:

Hi all,

I have a dataset with Likert-type judgments (multiple judges) of productions of multiple participants at multiple tasks (but not all participants took the same tasks, it's a bit of planned missingness design in terms of tasks).

I am trying to estimate each individual's ability reflected in the different productions judged, taking into account both judge characteristics (different uses of the judgment scale, severities) with a GRM or GPCM and (random) task difficulty. That's the last part I'm a bit confused with.

I know how to fit a GRM and GPCM model to account for judge variability (like a "regular" IRT), but I am unsure how to also control for task difficulty when the participants have taken different tasks.

So far what I came up with from the vignette is to use a "regular" GRM or a GPCM with the judgments (treated as items), with the function mixedmirt(), where Theta would be modeled as a function of an intercept (the individual's ability) + a random intercept per task:
model = 1,
itemtype = "graded",
lr.fixed = ~ 1 + id, #the latent trait estimate depends on the participant
lr.random = ~ 1|task #the latent trait depends on the task (randomly picked from a population of tasks)

The lr.fixed term is not required here from your description. It should be used for predictors which explain the differences in theta. mixedmirt() treats the random theta terms separately (even from lr.random) because of the non-linear product with the discrimination/slope parameters. So, the ability is included automatically for all models, and has its own associated variance component built in.
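
A minimal sketch of the specification described above, dropping lr.fixed and keeping only the random task intercept on the latent trait. Everything about the data here is an assumption made for illustration and is not from the thread: the layout (one row per production, one column per judge), the object names `ratings` and `covdata`, and the simulated values. Only the mixedmirt() arguments model, itemtype and lr.random come from the discussion.

```r
library(mirt)
set.seed(1)

n_id <- 100; n_task <- 6; n_judge <- 5

# Assumed layout: one row per production. Each participant takes 3 of the
# 6 tasks (planned missingness at the task level).
design <- do.call(rbind, lapply(seq_len(n_id), function(i) {
  data.frame(id = i, task = sample(n_task, 3))
}))

# Simulated truth (illustration only): expressed ability = person + task + noise.
person   <- rnorm(n_id)
task_eff <- rnorm(n_task, sd = 0.7)
theta    <- person[design$id] + task_eff[design$task] +
  rnorm(nrow(design), sd = 0.5)

# Graded-response ratings: each judge has a discrimination and a severity shift.
a   <- runif(n_judge, 1, 2)
d   <- c(2, 0.7, -0.7, -2)        # decreasing category intercepts
sev <- rnorm(n_judge, sd = 0.3)
ratings <- sapply(seq_len(n_judge), function(j) {
  u <- rlogis(length(theta))
  # category = number of thresholds exceeded (0 to 4)
  rowSums(outer(a[j] * theta - u, d + sev[j], "+") > 0)
})
colnames(ratings) <- paste0("judge", seq_len(n_judge))

covdata <- data.frame(id = factor(design$id), task = factor(design$task))

# Judges are the items; the task enters as a random intercept on theta.
# SE = FALSE only keeps this sketch quick to run.
mod <- mixedmirt(data = ratings, covdata = covdata, model = 1,
                 itemtype = "graded", lr.random = ~ 1 | task, SE = FALSE)
summary(mod)
coef(mod)
```

With only six simulated tasks the task variance is estimated from very little information, so the numbers themselves mean little; the sketch is only meant to show the shape of the call. Under this assumed layout the built-in latent trait is attached to each row, that is, to each production. mirt also provides randef() for predicting random effects from mixedmirt fits; whether it reports lr.random terms may depend on the installed version, so that is worth checking.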

And my task-controlled ability estimates would be the "id" mean estimates, I guess (which I hope to export in some way). Also I hope to study task difficulty with the fixed task effect.

Does that sound about right?

---
Also, from my understanding, the previous model assumes that for a task, there is a unique difficulty level for all participants. So a follow-up question would be: what if I can't make that assumption (in other terms, if the difficulty of the task can vary randomly per participant), or want to test it? Would I need a random slopes and intercept model, such as...
lr.fixed = ~ 1 + id
lr.random = ~ 1|task + id|task
?

In the previous formulation the difficulty is assumed to be constant within each item, and so the ability/task difficulty are the only reasons for witnessing different responses. So, I don't think a random slope is what you want here. Cheers.

Phil

Nils Myszkowski

Dec 20, 2017, 2:11:31 PM

to mirt-package

Hi Phil,

Thank you for this response! I will try that as soon as possible.

I guess my idea of a random slope was related to the fact that I want to control for potential interactions between the difficulty of the task and the subject.

The idea is that theta is the ability "as shown to the judge" (the judge is the item here), and that this expressed ability depends on the task presented (a task of random difficulty) and the participant (id), as well as on the possibility that the content of a specific task makes it more or less easy for a specific participant. So, somewhat like an interaction, I would like to reflect the fact that the tasks are only a random sample from a population of tasks.
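
One way to write this decomposition down, as a sketch only. The notation is not from the thread and is assumed for illustration: $\theta_{pt}$ is the ability expressed by participant $p$ on task $t$, $\alpha_p$ is the participant's ability, $\delta_t$ is the random task effect, and $\gamma_{pt}$ is the participant-by-task deviation.

```latex
\theta_{pt} = \alpha_p + \delta_t + \gamma_{pt},
\qquad
\delta_t \sim \mathcal{N}(0, \sigma^2_{\delta}),
\qquad
\gamma_{pt} \sim \mathcal{N}(0, \sigma^2_{\gamma})
```

If each participant produces only one production per task, $\gamma_{pt}$ sits at the same level as any production-level residual in theta, so the two would presumably not be separable without repeated productions per participant-task pair.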

Cheers,

Nils
