Multidimensional model - only one item for some of the dimensions?

ykl7

Jul 25, 2019, 7:10:17 AM

to mirt-p...@googlegroups.com

Dear Phil

First off, thanks for creating the mirt package!

I am fitting a 3-dimensional model to describe nine items, and I would like your input on overfitting/identifiability/scale shrinkage if only one item is related to an entire dimension.

Code:

model3dimensions <- mirt.model('F1 = 1-7
                           F2 = 8
                           F3 = 9
                           COV = F1*F2*F3')

model3dimensions_run_MHRM <- mirt(mIRTdata, model3dimensions, method="MHRM")

The model converges using the MHRM estimation method and I obtain parameter estimates. However, I am concerned that the results cannot be trusted, as I have read that overfitting may occur when trying to fit few items (one or two) per dimension (I read this in a paper describing an IRT analysis of a specific scale, yet I cannot seem to find any theoretical books that discuss the "minimum" number of items per dimension). In this context, how can I assess whether I am overfitting/obtaining biased estimates using the mirt package?

I am fairly new to IRT, so any references or rules of thumb regarding e.g. when it is plausible to go from a unidimensional model to a multidimensional model would be highly appreciated.

Thanks in advance!

Phil Chalmers

Jul 25, 2019, 11:12:16 AM

to ykl7, mirt-package

I think the issue here is worse than overfitting; the model actually isn't identified. If you pass SE=TRUE to the mirt() function, it will compute the ACOV (asymptotic covariance) matrix of the parameter estimates. Either that matrix will not be positive definite (in which case a warning message will appear), or its condition number, reported via print(mod), will be extremely large, indicating that the likelihood surface is numerically very flat (hence, no unique maximum exists). You can estimate this type of model, but you must add some constraints first, such as fixing the unstandardized slopes (loadings) of item 8 on F2 and item 9 on F3 to 1. Using the syntax you provided, this would look like the following:

model3dimensions <- mirt.model('F1 = 1-7
                           F2 = 8
                           F3 = 9
                           START = (8, a2, 1.0), (9, a3, 1.0)
                           FIXED = (8, a2), (9, a3)
                           COV = F1*F2*F3')
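
The flat likelihood surface described above can be illustrated without fitting anything. The following is a minimal base-R sketch, not mirt code, and it rests on assumptions: it works in the normal-ogive (factor-analytic) metric as a stand-in for mirt's logistic parameterization, so items are related only through their implied correlations; the helper names implied_cor and num_jacobian and all parameter values are hypothetical. It checks local identification by computing the rank of the Jacobian of the implied correlations with respect to the parameters, first with all 12 parameters free and then with the loadings of items 8 and 9 held fixed (the analogue of the FIXED line).

```r
# Implied off-diagonal correlations for F1 = items 1-7, F2 = item 8, F3 = item 9.
# par = c(lambda1, ..., lambda9, phi12, phi13, phi23)  (standardized loadings
# and factor correlations; values used below are purely illustrative)
implied_cor <- function(par) {
  Lambda <- matrix(0, 9, 3)
  Lambda[1:7, 1] <- par[1:7]
  Lambda[8, 2] <- par[8]
  Lambda[9, 3] <- par[9]
  Phi <- diag(3)
  Phi[1, 2] <- Phi[2, 1] <- par[10]
  Phi[1, 3] <- Phi[3, 1] <- par[11]
  Phi[2, 3] <- Phi[3, 2] <- par[12]
  R <- Lambda %*% Phi %*% t(Lambda)
  R[lower.tri(R)]  # the 36 unique off-diagonal correlations
}

# Central-difference Jacobian with respect to the parameters listed in `free`
num_jacobian <- function(f, par, free = seq_along(par), h = 1e-6) {
  sapply(free, function(k) {
    up <- par; up[k] <- up[k] + h
    dn <- par; dn[k] <- dn[k] - h
    (f(up) - f(dn)) / (2 * h)
  })
}

par0 <- c(seq(0.5, 0.8, length.out = 7), 0.6, 0.5, 0.4, 0.3, 0.2)

# All 12 parameters free
sv_all <- svd(num_jacobian(implied_cor, par0))$d
rank_all <- sum(sv_all > 1e-6 * sv_all[1])
cond_all <- sv_all[1] / min(sv_all)

# Loadings of items 8 and 9 held fixed; 10 parameters remain free
sv_fixed <- svd(num_jacobian(implied_cor, par0, free = c(1:7, 10:12)))$d
rank_fixed <- sum(sv_fixed > 1e-6 * sv_fixed[1])
cond_fixed <- sv_fixed[1] / min(sv_fixed)

cat("free parameters:", length(par0), " Jacobian rank:", rank_all, "\n")
cat("free parameters: 10  Jacobian rank:", rank_fixed, "\n")
cat("condition numbers (all free vs. fixed):", cond_all, cond_fixed, "\n")
```

Under these assumptions the unconstrained Jacobian has fewer independent columns than there are parameters, so some directions in parameter space leave the implied correlations unchanged. That is the same phenomenon that shows up as a non-positive-definite ACOV matrix or a very large condition number. Once the two single-item loadings are fixed, the Jacobian has full column rank.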

HTH.

Phil

--
You received this message because you are subscribed to the Google Groups "mirt-package" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mirt-package...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/mirt-package/dabf2f6b-d0d8-4ce9-b4af-1272f27580d2%40googlegroups.com.

Yassine Kamal Lyauk

Jul 25, 2019, 5:46:13 PM

to Phil Chalmers, mirt-package

Thank you for your answer. I have two short follow-up questions if you don’t mind:

- Where does the non-identifiability stem from? I.e. when a single item is associated with an entire dimension, is it a given that you cannot simultaneously estimate the slope, difficulty and latent variable estimates (in the case of a graded response model)? Furthermore, if we had considered only two dimensions, F1=1-7 and F2=8, we would run into the same issue, I presume?

- Would it be OK to constrain the difficulty parameter b1 for both F2 and F3 instead of the slopes? Or should all difficulty parameters be constrained (polytomous responses)?

Thanks again.

--

Best regards / Med venlig hilsen

Yassine Kamal Lyauk

Phil Chalmers

Jul 25, 2019, 5:53:32 PM

to Yassine Kamal Lyauk, mirt-package

On Thu, Jul 25, 2019 at 5:46 PM Yassine Kamal Lyauk <yassinek...@gmail.com> wrote:

Thank you for your answer. I have two short follow-up questions if you don’t mind:

- Where does the non-identifiability stem from? I.e. when a single item is associated with an entire dimension, is it a given that you cannot simultaneously estimate the slope, difficulty and latent variable estimates (in the case of a graded response model)?

Correct.
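
To make the answer concrete, here is a small base-R sketch under stated assumptions: it uses the normal-ogive (factor-analytic) metric rather than mirt's logistic parameterization, and the helper implied_R, the scaling constants c8 and c9, and all numeric values are hypothetical. In that metric, item 8 relates to the other items only through products such as its loading times a factor correlation, so two different sets of values can imply exactly the same correlations among the nine items.

```r
# Implied correlation matrix for F1 = items 1-7, F2 = item 8, F3 = item 9,
# using standardized loadings and factor correlations (illustrative values only)
implied_R <- function(l_f1, l8, l9, phi12, phi13, phi23) {
  Lambda <- matrix(0, 9, 3)
  Lambda[1:7, 1] <- l_f1
  Lambda[8, 2] <- l8
  Lambda[9, 3] <- l9
  Phi <- matrix(c(1,     phi12, phi13,
                  phi12, 1,     phi23,
                  phi13, phi23, 1), 3, 3)
  R <- Lambda %*% Phi %*% t(Lambda)
  diag(R) <- 1
  R
}

l_f1 <- seq(0.5, 0.8, length.out = 7)

# Parameter set A
R_a <- implied_R(l_f1, l8 = 0.60, l9 = 0.50,
                 phi12 = 0.40, phi13 = 0.30, phi23 = 0.20)

# Parameter set B: scale the two single-item loadings up and the factor
# correlations down by the matching amounts
c8 <- 1.25
c9 <- 1.20
R_b <- implied_R(l_f1, l8 = 0.60 * c8, l9 = 0.50 * c9,
                 phi12 = 0.40 / c8, phi13 = 0.30 / c9,
                 phi23 = 0.20 / (c8 * c9))

# The parameters differ, but the implied correlations agree
same <- isTRUE(all.equal(R_a, R_b))
print(same)
```

Because the data cannot distinguish set A from set B in this sketch, the loading of a single-item factor and that factor's correlations cannot all be estimated at once. Fixing the loading (equivalently, the slope) removes the ambiguity, since the factor correlations are then determined by the products.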

Furthermore, if we had considered only two dimensions, F1=1-7 and F2=8, we would run into the same issue, I presume?

Also correct. It's possible to use a so-called "doublet" loading, such as F2 = 8-9, but this would require an additional constraint as well (typically the two slope parameters are constrained to be equal, which solves the problem). This is closely related to the problem of identification in structural equation modeling, where at least 3 items per factor are required to identify the parameters without strong constraints.

- Would it be OK to constrain the difficulty parameter b1 for both F2 and F3 instead of the slopes? Or should all difficulty parameters be constrained (polytomous responses)?

Unfortunately no. The problem generally relates to the slope parameters, since these reflect how much the item 'correlates' with the latent traits in an unstandardized metric. This is discussed quite a bit in the SEM literature (e.g., Ken Bollen's 1989 book is a great reference for this topic), so I'd recommend consulting it if you are interested in learning more. HTH.

Phil
