Using mirt to combine multiple datasets with common items for further analysis

chung....@gmail.com

Apr 7, 2023, 7:10:42 PM
to mirt-package

Hello,

 

I am working on a project where the goal is to examine the factor structure of ratings made on items that come from multiple measures of the same construct. I would like to use the mirt package to combine the datasets into one, so that I can factor analyze this dataset in a future step.


I have several datasets and they are each on the small end (~200 participants). There is no overlap in participants, but there is an overlap in items across the datasets. One complication is that the response format differs in some datasets, although the items are the same (e.g., on a scale of 1-7 vs. 1-5). Here are some simplified examples of the datasets.

 

Dataset 1:

ID  PL1 PL2 PL3 PL4 PL5 CL1 CL2 CL3 CL4 CL5 FL1 FL2 FL3 FL4 FL5
1    X   X   X   X   X                       X   X   X   X   X
2    X   X   X   X   X                       X   X   X   X   X
3    X   X   X   X   X                       X   X   X   X   X

 

Dataset 2:

ID  PL1 PL2 PL3 PL4 PL5 CL1 CL2 CL3 CL4 CL5 FL1 FL2 FL3 FL4 FL5
1                        X   X   X   X   X   X   X   X   X   X
2                        X   X   X   X   X   X   X   X   X   X
3                        X   X   X   X   X   X   X   X   X   X

 

Dataset 3:

ID  PL1 PL2 PL3 PL4 PL5 CL1 CL2 CL3 CL4 CL5 FL1 FL2 FL3 FL4 FL5
1    X   X   X   X   X   X   X   X   X   X
2    X   X   X   X   X   X   X   X   X   X
3    X   X   X   X   X   X   X   X   X   X

 

I was thinking of using a concurrent common-item linking procedure with the hopes of creating a single dataset that I could then factor analyze. Basically, I would like to combine the datasets into one so that it looks like this:


Combined dataset:

ID  PL1 PL2 PL3 PL4 PL5 CL1 CL2 CL3 CL4 CL5 FL1 FL2 FL3 FL4 FL5
1    X   X   X   X   X                       X   X   X   X   X
2    X   X   X   X   X                       X   X   X   X   X
3    X   X   X   X   X                       X   X   X   X   X
4                        X   X   X   X   X   X   X   X   X   X
5                        X   X   X   X   X   X   X   X   X   X
6                        X   X   X   X   X   X   X   X   X   X
7    X   X   X   X   X   X   X   X   X   X
8    X   X   X   X   X   X   X   X   X   X
9    X   X   X   X   X   X   X   X   X   X

 

However, although I have used IRT methods and linking before using mirt, I have never done it in this context (e.g., I've combined different datasets that have all items in common, and in another project, linked two different measures of a construct using external common items to examine theta values longitudinally). Any advice and/or references would be greatly appreciated. 


Thanks!

 

Joanne

chung....@gmail.com

Apr 18, 2023, 4:04:01 PM
to mirt-package
Hi, just boosting this to see if anybody had advice. Thanks in advance!

J

Nie Ping

Apr 28, 2023, 6:09:20 AM
to mirt-package
Hi  Joanne,

I have the same problem as you, and I also have a basic question that I hope you can help with if you have time.

The problem is this: we have two datasets with different participants but a common item. Group 1 participants answered 'item1', 'item2', 'item3', and 'item4'; group 2 participants answered 'item1', 'item5', 'item6', and 'item7', so the common item is 'item1'. According to the invariance property of IRT models, item parameters should not depend on the other items in the set or on the participants. But when I estimate the parameters with a 2PL model, the results for the common item 'item1' are completely different in the two datasets. Do you know why?

As the screenshots show, both datasets contain the common item B3.7.fu, but its parameter estimates are completely different.

Thank you very much!

Best,
P.
Attachments (screenshots):
屏幕截图 2023-04-28 120657.png
屏幕截图 2023-04-28 120608.png

Phil Chalmers

May 31, 2023, 10:44:58 AM
to chung....@gmail.com, mirt-package
Hi Joanne,

Your combined dataset approach is quite reasonable here, as the missing observations should be considered missing at random, and the data could be fitted with mirt() as a single-group IRT model. If for whatever reason the datasets correspond to particular subgroups reflecting different population samples, then the multipleGroup() function could be used as well, though in that case you would have to be careful to specify equality constraints across the blocks of missing data to ensure model identification (e.g., if the group made up of rows 7-9 in your combined dataset attempted to estimate the slopes/intercepts for FL1:FL5, this would fail because there is no response data to estimate those parameters, so you would have to arbitrarily set them equal to those of one of the other groups in order to borrow the information needed for identification). HTH.
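
A minimal sketch of the stacked layout, written in plain Python only to show the structure; it is not mirt code. The item names follow Joanne's example, while the three tiny datasets, the response values, and the helpers `stack_datasets` and `make_rows` are hypothetical. Items a study never administered are filled with `None`, which plays the role of `NA` in the R data that would be passed to mirt().

```python
# Item names from the example in this thread: three 5-item blocks.
ALL_ITEMS = [f"{block}{i}" for block in ("PL", "CL", "FL") for i in range(1, 6)]


def stack_datasets(datasets):
    """Stack row dicts from several studies into one wide table.

    Items that a study did not administer are filled with None
    (missing by design). Each output row is a list whose first entry
    is a new running ID, followed by one value per item in
    ALL_ITEMS order.
    """
    combined = []
    next_id = 1
    for rows in datasets:
        for row in rows:
            combined.append([next_id] + [row.get(item) for item in ALL_ITEMS])
            next_id += 1
    return combined


def make_rows(blocks, n_rows, value):
    """Hypothetical helper: n_rows identical respondents answering `value`
    on every item of the given blocks (placeholder data only)."""
    items = [f"{b}{i}" for b in blocks for i in range(1, 6)]
    return [{item: value for item in items} for _ in range(n_rows)]


# Made-up responses; only the missingness pattern matters here.
dataset1 = make_rows(("PL", "FL"), 3, 4)  # no CL items
dataset2 = make_rows(("CL", "FL"), 3, 2)  # no PL items
dataset3 = make_rows(("PL", "CL"), 3, 5)  # no FL items

combined = stack_datasets([dataset1, dataset2, dataset3])

# Count how many respondents actually answered each item.
n_observed = {
    item: sum(row[col + 1] is not None for row in combined)
    for col, item in enumerate(ALL_ITEMS)
}
```

In this toy layout every item is answered by six of the nine respondents, and each pair of studies shares one block of items; that overlap is what allows a single-group model to tie the studies to one scale.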

Phil


--
You received this message because you are subscribed to the Google Groups "mirt-package" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mirt-package...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/mirt-package/adf8f7c8-1890-46a8-aa05-1103d6355da7n%40googlegroups.com.

Phil Chalmers

May 31, 2023, 10:47:11 AM
to Nie Ping, mirt-package
> We have two datasets. The participants are different, but the two datasets have common items. For example, for group1 participants, they answer 'item1', 'item2', 'item3', 'item4'. For group2 participants, they answer 'item1', 'item5', 'item6', 'item7'. so the common part is item1. According to the property of IRT model, the item parameters should be invariant, independent of the other items in the item set and participants.

That's not what invariance in IRT means. The invariance is in the prediction space, but the parameters themselves will differ as a function of other properties (e.g., the scaling of theta). Unless the groups are believed to have equal latent trait distributions, most justifiably through random assignment, this behaviour is expected, and you would instead have to anchor/equate the parameters to place them on a common scale. HTH.
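
The point about scaling can be checked numerically. The sketch below assumes the slope-intercept form of the 2PL, P(theta) = 1 / (1 + exp(-(a * theta + d))), and a linear change of the latent scale, theta* = A * theta + B; the item values and the constants A and B are made up. Rescaling theta changes the slope and intercept (a* = a / A, d* = d - a * B / A) but leaves every predicted probability unchanged.

```python
import math


def p_2pl(theta, a, d):
    """2PL response probability in slope-intercept form."""
    return 1.0 / (1.0 + math.exp(-(a * theta + d)))


def rescale_item(a, d, A, B):
    """Item parameters after the change of scale theta* = A * theta + B."""
    return a / A, d - a * B / A


# Hypothetical item on the first scale.
a, d = 1.2, -0.5
# Hypothetical rescaling: the second scale has a different unit and origin.
A, B = 2.0, 0.7

a_star, d_star = rescale_item(a, d, A, B)

# The same respondents, expressed on both scales, get identical predictions.
thetas = [-2.0, -1.0, 0.0, 1.0, 2.0]
max_gap = max(
    abs(p_2pl(t, a, d) - p_2pl(A * t + B, a_star, d_star)) for t in thetas
)
```

Here `a_star` and `d_star` differ from `a` and `d` while `max_gap` is zero up to floating-point error. Two separate calibrations that each fix their own theta scale can therefore report different values for the same item until they are linked.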

Phil

 

Phil Chalmers

Oct 24, 2025, 2:56:14 PM
to Gudrun Eisele, mirt-package
Hi Gudrun,

It's possible this issue has been fixed in a recent version of mirt, but if it persists please provide a reproducible example and I'll see what the issue is and whether it can be patched. Sorry for the delayed reply.

Phil


On Tue, Jul 22, 2025 at 12:09 PM Gudrun Eisele <g.v.e...@gmail.com> wrote:
Hi Phil,

I am conducting similar analyses to Joanne: I am fitting a single-group IRT model to data from different studies, each with a different combination of items but with some overlapping items. The item characteristics I obtain look reasonable, but the fit indices are not calculated (Error in if (null.fit$M2 > newret$M2) { : missing value where TRUE/FALSE needed). This seems to be caused by a few items that are observed in only a few of the datasets, and removing the problematic items solves the issue. From what I understand, the model should be able to handle data that are missing by design. Why does this problem appear? Is it simply related to the high proportion of missing data, or could there be a different reason?
Any leads would be highly appreciated!

Thanks!

Gudrun