Scoring new data with missingness

Mika Braginsky

Jan 14, 2026, 5:19:27 PM
to mirt-package
Hi Phil,

I'm running into issues when scoring new data with pre-fit models, specifically when the new data is missing some of the calibration items and/or some of their possible response categories. Here is a simplified example walking through the issues:

library(mirt)

# fit model to calibration data with 5 items
lsat <- expand.table(LSAT7)
mod_lsat <- mirt(lsat, 1)

# new data to be scored, missing an item
lsat2 <- lsat[,-1]

fscores(mod_lsat, response.pattern = lsat2)
# Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent


To deal with this issue, I implemented a workaround that creates a model object aligned to the structure of the new data but using the parameters from the calibration model.

library(dplyr)
align_mod <- \(mod, df) {
  # get model parameter values
  mod_vals <- mod2values(mod) |> select(-parnum)

  # get data parameter structure
  data_pars <- mirt(df, pars = "values")

  # replace data parameter values with model values
  data_vals <- data_pars |>
    filter(class != "GroupPars") |>
    select(group, item, class, name, parnum) |>
    left_join(mod_vals, by = c("group", "item", "class", "name")) |>
    bind_rows(data_pars |> filter(class == "GroupPars"))

  # set up mirt model for data using constructed parameter values;
  # TOL = NaN returns the model at these values without running estimation
  mirt(df, pars = data_vals, TOL = NaN)
}

mod2 <- align_mod(mod_lsat, lsat2)
lsat2_scores <- fscores(mod2, response.pattern = lsat2)

This handles the missing item, but it fails whenever the new data contains an item that could not be estimated if a model were fit to that data from scratch (for example, an item with only one observed response category in the new data).

lsat3 <- lsat2[1:20,]
mod3 <- align_mod(mod_lsat, lsat3)

# Error: The following items have only one response category and cannot be estimated: Item.2 Item.3 Item.4
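
One possible alternative, sketched here rather than confirmed: instead of rebuilding a model around the new data, pad the new data with all-NA columns so that it matches the item structure of the calibration model, and score it against mod_lsat directly. Since no model is set up on the new data, the one-response-category check should not come into play. The helper pad_to_items is hypothetical (it is not part of mirt), and the sketch assumes that fscores() treats NA entries in response.pattern as missing responses.

```r
# Hypothetical helper (not part of mirt): add an all-NA column for every
# calibration item absent from the new data, then restore the calibration
# column order. Columns not named in item_names are dropped.
pad_to_items <- function(new_data, item_names) {
  padded <- as.data.frame(new_data)
  for (item in setdiff(item_names, colnames(padded))) {
    padded[[item]] <- NA_integer_
  }
  padded[, item_names, drop = FALSE]
}

# Toy check with made-up responses: Item.1 is absent from the new data
toy_new <- data.frame(Item.2 = c(1L, 0L), Item.3 = c(0L, 1L))
toy_padded <- pad_to_items(toy_new, c("Item.1", "Item.2", "Item.3"))

# Same idea applied to the example above (requires mirt).
# Assumption: fscores() treats NA in response.pattern as a missing response.
if (requireNamespace("mirt", quietly = TRUE)) {
  library(mirt)
  lsat <- expand.table(LSAT7)
  mod_lsat <- mirt(lsat, 1, verbose = FALSE)
  # first 20 rows without Item.1, as in lsat3
  lsat3_padded <- pad_to_items(lsat[1:20, -1], colnames(lsat))
  lsat3_scores <- fscores(mod_lsat, response.pattern = lsat3_padded)
}
```

Passing the item names explicitly (here colnames(lsat)) keeps the helper independent of how the calibration model stores them.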

Do you have any suggestions for how to handle this issue and be able to score new data from a pre-fit model, when the new data is missing items or item response categories compared to the model?

Thank you!
Mika