Hi Phil,
I'm running into issues when scoring new data from pre-fit models, specifically when the new data is missing items and/or possible response values. Here is a simplified example walking through the issues:
library(mirt)
# fit model to calibration data with 5 items
lsat <- expand.table(LSAT7)
mod_lsat <- mirt(lsat, 1)
# new data to be scored, missing an item
lsat2 <- lsat[,-1]
fscores(mod_lsat, response.pattern = lsat2)
# Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent
To deal with this issue, I implemented a workaround that creates a model object aligned to the structure of the new data but using the parameters from the calibration model.
library(dplyr)
align_mod <- \(mod, df) {
  # get model parameter values
  mod_vals <- mod2values(mod) |> select(-parnum)
  # get data parameter structure
  data_pars <- mirt(df, pars = "values")
  # replace data parameter values with model values
  data_vals <- data_pars |>
    filter(class != "GroupPars") |>
    select(group, item, class, name, parnum) |>
    left_join(mod_vals, by = c("group", "item", "class", "name")) |>
    bind_rows(data_pars |> filter(class == "GroupPars"))
  # set up mirt model for data using constructed parameter values
  mirt(df, pars = data_vals, TOL = NaN)
}
mod2 <- align_mod(mod_lsat, lsat2)
lsat2_scores <- fscores(mod2, response.pattern = lsat2)
This handles the missing-item case, but it fails whenever the new data contains an item that could not be estimated if a model were fit to that data alone (for example, an item with only one observed response category in the new data).
lsat3 <- lsat2[1:20,]
mod3 <- align_mod(mod_lsat, lsat3)
# Error: The following items have only one response category and cannot be estimated: Item.2 Item.3 Item.4
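For reference, here is a rough sketch of how I check which items in the new data differ from the calibration data. It only diagnoses the mismatch and does not fix the scoring problem. `compare_to_calibration()` is a hypothetical helper I wrote for illustration, not a mirt function.

```r
library(mirt)

# rebuild the objects from the example above
lsat  <- expand.table(LSAT7)
lsat3 <- lsat[, -1][1:20, ]

# compare_to_calibration() is a hypothetical helper, not part of mirt
compare_to_calibration <- function(calib, new) {
  new <- as.data.frame(new)
  # items in the calibration data that are absent from the new data
  missing_items <- setdiff(colnames(calib), colnames(new))
  # number of distinct observed (non-missing) responses per item in the new data
  n_cats <- vapply(new, function(x) length(unique(x[!is.na(x)])), integer(1))
  list(
    missing_items   = missing_items,
    single_category = names(n_cats)[n_cats < 2]
  )
}

diffs <- compare_to_calibration(lsat, lsat3)
diffs$missing_items    # items dropped from the new data
diffs$single_category  # items with only one observed response category
```

With `lsat3`, the second element should list the same items named in the error message above.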
Do you have any suggestions for scoring new data from a pre-fit model when the new data is missing items or item response categories relative to the data the model was fit on?
Thank you!
Mika