Hello Phil,
I'm working with a very large, sparsely populated set of assessment responses. Functionally, the data look like what one might get from a computerized adaptive test (CAT): many columns/items, most of which are not answered in any given assessment.
My goals are to:
1) identify item parameters, and
2) use high-quality items as anchors to equate new, future assessments that include a mix of these anchor items and new items.
In the original parent dataset, I have more than 100K assessments, which seems too large to run in a single mirt analysis (I could be wrong, of course, but I haven't been able to fit a mirt model successfully on a dataset of that size). My plan has been a bootstrapped approach: run mirt on subsamples of the data, estimate item parameters on each, and then create composite (i.e., weighted mean) values for those item parameters, roughly as in the sketch below.
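Here is a minimal sketch of what I have in mind (all object names here are placeholders of mine, not anything from mirt itself; I fit a unidimensional 2PL and take a plain mean where the weighted mean would eventually go):

library(mirt)

# 'resp' is the full sparse response matrix: rows = assessments,
# columns = items, NA = item not administered. This sketch assumes
# each subsample still covers every item at least somewhat.
set.seed(1)
n_sub  <- 25      # number of subsamples (placeholder)
n_rows <- 5000    # assessments per subsample (placeholder)

par_list <- lapply(seq_len(n_sub), function(i) {
  rows <- sample(nrow(resp), n_rows)
  fit  <- mirt(resp[rows, ], model = 1, itemtype = "2PL", verbose = FALSE)
  coef(fit, simplify = TRUE)$items   # slope-intercept metric: a1, d, g, u
})

# Composite item parameters: a plain element-wise mean here; a weighted
# mean (e.g., weighting by inverse standard errors) would slot in the same way.
composite <- Reduce(`+`, par_list) / n_sub

One thing I'm unsure about is whether the subsample-specific latent scales (each identified only by the standard-normal prior) are comparable enough to average across this way.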
Using these composite values, I could then feed them into a new mirt model to equate a future set of assessments (composed of anchor items and new items). Specifically, I would use the 'pars' argument to define the anchor items by populating their difficulty and discrimination values and setting 'est' to FALSE so they are treated as fixed, along the lines of the second sketch below.
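Concretely, I'm imagining something like this for the anchoring step (again just a sketch; new_resp, anchor_items, and composite carry over from the previous sketch, and I stay in mirt's slope-intercept metric so that fixing a1 and d fixes both discrimination and difficulty):

library(mirt)

# 'new_resp' is the future response matrix; 'anchor_items' is a
# character vector of anchor item names (both placeholders).
sv <- mirt(new_resp, model = 1, itemtype = "2PL", pars = "values")

# Overwrite the anchor items' slopes (a1) and intercepts (d) with the
# composite values, then flag those parameters as not estimated.
idx <- sv$item %in% anchor_items & sv$name %in% c("a1", "d")
sv$value[idx] <- composite[cbind(as.character(sv$item[idx]),
                                 as.character(sv$name[idx]))]
sv$est[idx] <- FALSE

# Refit with anchors fixed; the new items are estimated on the anchored scale.
fit_new <- mirt(new_resp, model = 1, itemtype = "2PL", pars = sv)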
My question: Is this the best approach to identifying anchor item characteristics from such a large dataset and using them to equate future assessments and new items?
Thank you so much for your insights!
Best,
Russell