data simulation with mirt


Kif Cu

Apr 11, 2014, 3:03:45 PM
to mirt-p...@googlegroups.com
I am confused about how to generate multidimensional mixed-format data for use in a linking study.

The intended data must have two dimensions with a prespecified correlation between them.
Also, both dimensions must contain dichotomous and polytomous item formats (I mean that even if there is an item format effect on the dimensionality, the actual dimensionality must be independent of item format).

For example, a 120-item test will consist of 80 dichotomous items and 40 polytomous items. Each dimension will be composed of 40 dichotomous and 20 polytomous items. The correlation between the dimensions (let's say .40) also has to be controlled.

I have been using mirt for a few months, and I am stuck on how to use it for this purpose. Could you give me any guidance or advice on how to proceed?

Specifically, I am wondering how I could control the correlation between the dimensions,
the meaning of NAs in the a and d matrices, etc.
I hope I have stated my concern clearly.

Thank you so much for your consideration.
best regards......

Phil Chalmers

Apr 11, 2014, 4:47:17 PM
to Kif Cu, mirt-package
If you prefer not to set up one large 'd' matrix of item intercepts directly, try constructing them separately given your latent Theta parameters. Try something like this:

```
library(mirt)    #provides simdata(), mirt.model(), and mirt()
library(mvtnorm) #provides rmvnorm()
Theta <- rmvnorm(1000, sigma = matrix(c(1, .5, .5, 1), 2)) #correlation of .5

#simulated dichotomous items first, slopes at 1, with simple structure
d <- matrix(rnorm(80))
a <- matrix(c(rep(1, 40), rep(0,80), rep(1,40)), 80)
dichitems <- simdata(a, d, 1000, Theta=Theta, itemtype = 'dich')

#now graded items, with 5 categories each
a2 <- matrix(c(rep(1, 20), rep(0,40), rep(1,20)), 40)
d2 <- matrix(rnorm(40*4), 40)
d2 <- t(apply(d2, 1, sort, decreasing=TRUE)) #sort since intercepts are ordered
polytomous <- simdata(a2, d2, 1000, Theta=Theta, itemtype = 'graded')

dat <- data.frame(dich=dichitems, poly=polytomous)

#estimate it
model <- mirt.model('F1 = 1-40, 81-100 
                    F2 = 41-80, 101-120
                    COV = F1*F2')
mod <- mirt(dat, model)
summary(mod)
coef(mod)
```
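
To hit a different target correlation (the original question mentioned .40), only the off-diagonal element of `sigma` needs to change. The following is a minimal base-R sketch, assuming a Cholesky-based draw as a stand-in for `rmvnorm()`; the names `N`, `rho`, `Sigma`, and `Z` and the seed are illustrative and not from the thread.

```r
set.seed(123)                      # arbitrary seed, for reproducibility only
N   <- 1000                        # number of simulated examinees
rho <- 0.40                        # target correlation between the two dimensions
Sigma <- matrix(c(1, rho, rho, 1), 2)

# Draw independent standard normals, then impose the correlation:
# if Z has identity covariance, Z %*% chol(Sigma) has covariance Sigma
Z     <- matrix(rnorm(N * 2), N, 2)
Theta <- Z %*% chol(Sigma)

# The sample correlation should be close to rho, up to sampling error
round(cor(Theta), 2)
```

The resulting `Theta` could then be passed to `simdata(..., Theta = Theta)` in the same way as in the code above.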

Phil



Kif Cu

Apr 11, 2014, 5:05:31 PM
to mirt-p...@googlegroups.com
Thank you, Phil. That is a very helpful reply to my questions.

Kif Cu

May 14, 2014, 2:24:55 AM
to mirt-p...@googlegroups.com
Hi Phil,
I have one more question.
I want to add domain and format effects simultaneously for data generation (in my previous example, my aim was to eliminate the domain effect).
I couldn't work out how to write the R code using the sim function. This is an essential part of my project. I would be extremely glad if you could help me or give me some advice.

Thank you so much.
kind regards..

Kif Cu

May 14, 2014, 2:25:45 AM
to mirt-p...@googlegroups.com
Additionally, I want to control the specific effects of the domain and format factors.

Phil Chalmers

May 14, 2014, 11:00:54 AM
to Kif Cu, mirt-package
I'm not exactly sure what you mean by 'domain' and 'format' effects; is this similar to a bifactor model where there is one unidimensional trait measured but several specific factor components? If so it's simply a matter of setting up an appropriate slope (a) matrix and using that when simulating data.
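
One possible reading of "domain" and "format" effects, sketched here purely as an assumption, is two domain factors plus one specific factor per item format. The slope values (1 and 0.5), the column names, and the object names are illustrative; the item ordering follows the earlier example (80 dichotomous items followed by 40 polytomous items).

```r
n_dich <- 80
n_poly <- 40
a <- matrix(0, n_dich + n_poly, 4,
            dimnames = list(NULL, c("domain1", "domain2", "fmt_dich", "fmt_poly")))

# Domain membership: the first half of each format block loads on domain 1,
# the second half on domain 2
a[c(1:40, 81:100), "domain1"]   <- 1
a[c(41:80, 101:120), "domain2"] <- 1

# Format-specific factors; a smaller slope means a weaker format effect
a[1:80, "fmt_dich"]   <- 0.5
a[81:120, "fmt_poly"] <- 0.5

# Each item loads on exactly one domain factor and one format factor
table(rowSums(a != 0))
```

Rows 1 to 80 and 81 to 120 of this matrix could serve as the `a` argument in two separate `simdata()` calls, together with a four-column `Theta` whose covariance matrix sets the domain correlation and keeps the format factors uncorrelated with everything else. The size of each effect is then controlled through the corresponding slopes.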

Phil



Kif Cu

Jun 15, 2014, 3:10:14 PM
to mirt-p...@googlegroups.com, avcu...@gmail.com
Hi Phil, in my study I am thinking of using the mirt function to calibrate mixed-format multidimensional data. I couldn't find the theoretical background of this analysis in the mirt manual. Which models does it use? Could you provide the names of the articles that form the basis of the analysis? I need them for the Methods section of the study.

Phil Chalmers

Jun 15, 2014, 11:31:58 PM
to Kif Cu, mirt-package
By default the 2PL and graded response model are fit to the data (the 2PL being a special case of the graded response model when ncat = 2). See the itemtype argument for a list of others.
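
The special-case relationship can be checked numerically. The sketch below assumes a single latent dimension and the slope-intercept form P(y >= k) = logistic(a * theta + d_k); the helper `grm_probs()` and the parameter values are hypothetical and written only for illustration, not taken from mirt.

```r
# Category probabilities for one graded item at trait values theta,
# using P(y >= k) = 1 / (1 + exp(-(a * theta + d_k)))
grm_probs <- function(theta, a, d) {
  z   <- outer(a * theta, d, "+")            # length(theta) x length(d) matrix
  cum <- cbind(1, 1 / (1 + exp(-z)), 0)      # pad with P(y >= 0) = 1 and P(y >= ncat) = 0
  # adjacent differences give the probability of each category
  cum[, -ncol(cum), drop = FALSE] - cum[, -1, drop = FALSE]
}

theta <- c(-1, 0, 1)

# One intercept gives two categories; the upper category follows the 2PL curve
p2 <- grm_probs(theta, a = 1.2, d = 0.5)
all.equal(p2[, 2], plogis(1.2 * theta + 0.5))

# Four ordered intercepts give five categories; each row sums to 1
p5 <- grm_probs(theta, a = 1.2, d = c(2, 1, -1, -2))
rowSums(p5)
```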

Depending on the estimator you used, citing either Bock and Aitkin (1981) for EM or Cai (2010) for MHRM would suffice. Mixed item types fit readily in both frameworks and can be understood from these articles alone. For more detailed (though mostly redundant) information about EM with polytomous items, you could also check out the Muraki and Carlson (1995) paper. Full reference details are in the References section of help('mirt') in R.

Phil



Kiff Cu

Nov 13, 2014, 9:06:37 AM
to mirt-p...@googlegroups.com
Hi Mr Chalmers:

The following code was provided by you a couple of months ago.

> #simulated dichotomous items first, slopes at 1, with simple structure
> d <- matrix(rnorm(80))
> a <- matrix(c(rep(1, 40), rep(0,80), rep(1,40)), 80)
> dichitems <- simdata(a, d, 1000, Theta=Theta, itemtype = 'dich')
> #now graded items, with 5 categories each
> a2 <- matrix(c(rep(1, 20), rep(0,40), rep(1,20)), 40)
> d2 <- matrix(rnorm(40*4), 40)
> d2 <- t(apply(d2, 1, sort, decreasing=TRUE)) #sort since intercepts are ordered
> polytomous <- simdata(a2, d2, 1000, Theta=Theta, itemtype = 'graded')
> dat <- data.frame(dich=dichitems, poly=polytomous)
> #estimate it
> model <- mirt.model('F1 = 1-40, 81-100 
+                     F2 = 41-80, 101-120
+                     COV = F1*F2')
> mod <- mirt(dat, model)


It was working well at that time, but currently it does not work and gives the following message:

Error message: FUN(newX[, i], ...) : 
  Items contain category scoring spaces greater than 1.
                    Use apply(data, 2, table) to inspect and fix

What could be the probable reason for that?

Additionally, as far as I know, the plink package is not currently available, so the read.mirt function is useless. I downloaded an older version of plink (version 1.3.1). Is there any way to use parameters from mirt in plink other than managing the matrices?

One last question: when running the above code, the following coefficients are given

$dich.Item_1
      a1 a2      d g u
par 0.99  0 -0.843 0 1

$dich.Item_2
       a1 a2     d g u
par 1.011  0 -0.34 0 1
 
(I added a few lines.) Why are the a2 parameters always estimated as zero? Is it related to how the "a" matrix was produced
(a <- matrix(c(rep(1, 40), rep(0,80), rep(1,40)), 80))?

I couldn't check it because of the error mentioned above...

I would be glad for your contribution

Best!

Kif

Phil Chalmers

Nov 13, 2014, 10:43:38 AM
to Kiff Cu, mirt-package
Hi Kiff,

On Thu, Nov 13, 2014 at 9:06 AM, Kiff Cu <avcu...@gmail.com> wrote:
> It was working well at that time, but currently it does not work and gives the following message:

The graded response model simulation of the d2 parameters wasn't the best choice, since some categories might be selected with extremely small probability (hence, you get a dropping warning from the estimation function). In this scheme it's possible for intercepts to be very close, and in the graded response model this causes some category response probabilities to be extremely low and therefore unlikely to be selected. A better approach is to use some spacing constant between the intercepts to ensure that none are too close. E.g.,

```
diffs <- t(apply(matrix(runif(40*4, .3, 1), 40), 1, cumsum))
diffs <- -(diffs - rowMeans(diffs))
d <- diffs + rnorm(40) #use this in place of d2 when simulating the graded items
```

This ensures that adjacent intercepts are at least 0.3 apart, and therefore you are likely to observe responses in every category.
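
A quick check of this spacing scheme, as a sketch in base R. The seed and the names `d2`, `gaps`, and `cum` are illustrative assumptions; the item slope is taken to be 1 as in the simulation above.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
diffs <- t(apply(matrix(runif(40 * 4, .3, 1), 40), 1, cumsum))
diffs <- -(diffs - rowMeans(diffs))
d2 <- diffs + rnorm(40)           # one random location shift per item (row)

# Gaps between adjacent intercepts within each item; all lie between 0.3 and 1
gaps <- d2[, -4] - d2[, -1]
range(gaps)

# Category probabilities for the first item at theta = 0:
# differences of the cumulative curve P(y >= k) = plogis(d_k)
cum <- c(1, plogis(d2[1, ]), 0)
round(-diff(cum), 3)
```

With the minimum gap in place, no two adjacent cumulative curves coincide, so every category keeps a non-negligible probability somewhere along the trait range.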
 

> Error message: FUN(newX[, i], ...) : 
>   Items contain category scoring spaces greater than 1.
>                     Use apply(data, 2, table) to inspect and fix
>
> What could be the probable reason for that?
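
The inspection step suggested by the error message can be sketched on toy data. The data frame, the helper logic, and the recoding below are hypothetical illustrations in base R. In the simulated case the gap presumably comes from a category that was never sampled, so better-spaced intercepts are the more natural remedy than recoding.

```r
# Toy data: item2 never uses category 2, so its observed scores skip from 1 to 3
dat <- data.frame(item1 = c(0, 1, 2, 3, 1, 2),
                  item2 = c(0, 1, 3, 3, 0, 1))

# Tabulate each item, as the error message suggests
lapply(dat, table)

# Flag items whose observed categories are not consecutive integers
has_gap <- sapply(dat, function(x) any(diff(sort(unique(x))) > 1))
has_gap

# One possible fix: collapse to consecutive categories. This changes the scoring,
# so it is only appropriate if the empty category can be ignored.
dat_fixed <- dat
dat_fixed[has_gap] <- lapply(dat[has_gap], function(x) as.integer(factor(x)) - 1L)
lapply(dat_fixed, function(x) sort(unique(x)))
```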

> Additionally, as far as I know, the plink package is not currently available, so the read.mirt function is useless. I downloaded an older version of plink (version 1.3.1). Is there any way to use parameters from mirt in plink other than managing the matrices?

Uncomment and source the function from my GitHub location. There's no other way to do this due to CRAN policy, but at least you still have access to the function in the package if you want it. https://github.com/philchalmers/mirt/blob/master/R/read.mirt.R

 

> One last question: when running the above code, the following coefficients are given
>
> $dich.Item_1
>       a1 a2      d g u
> par 0.99  0 -0.843 0 1
>
> $dich.Item_2
>        a1 a2     d g u
> par 1.011  0 -0.34 0 1
>
> (I added a few lines.) Why are the a2 parameters always estimated as zero? Is it related to how the "a" matrix was produced
> (a <- matrix(c(rep(1, 40), rep(0,80), rep(1,40)), 80))?
>
> I couldn't check it because of the error mentioned above...

I'm not sure what you mean here, but this has to do with some confirmatory model declaration you've included via the mirt.model syntax that is passed to mirt() or other estimation methods. Cheers.
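
To make the point concrete: under a confirmatory specification, a slope is estimated only for the items listed under that factor, and the remaining slopes are held at 0. The base-R sketch below reproduces the pattern implied by the model syntax used earlier in the thread; the objects `spec` and `free` are illustrative and not part of mirt.

```r
# Item ranges copied from the mirt.model() syntax in the thread
spec <- list(F1 = c(1:40, 81:100),
             F2 = c(41:80, 101:120))

# TRUE = slope estimated, FALSE = slope fixed at 0
free <- sapply(spec, function(items) seq_len(120) %in% items)
colnames(free) <- c("a1", "a2")

free[c(1, 2, 41, 81, 101), ]    # items 1, 2, and 81 have a2 fixed at 0
colSums(free)                    # 60 free slopes per factor
```

If every slope should be estimated instead, an exploratory two-factor fit such as `mirt(dat, 2)` is one option.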

Phil
 
