This is generally to be expected: with only a convergence criterion in the EM algorithm to stop on, the estimates are quite sensitive to the parameterization. You could lower the TOL argument until they agree better, but there's no guarantee that they will converge to each other exactly without very large sample sizes (this is why parameterization can be tricky, and why the package generally prefers the slope-intercept versions for their improved stability). Here's a script worth experimenting with to see this:
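library(mirt)
# simulate 20 five-category items: slopes (a) and ordered intercepts (d)
a <- matrix(rlnorm(20,.2,.3))
diffs <- t(apply(matrix(runif(20*4, .3, 1), 20), 1, cumsum))
diffs <- -(diffs - rowMeans(diffs))
d <- diffs + rnorm(20)
dat <- simdata(a, d, 5000, itemtype = 'graded') # change the sample size/TOL criteria
# fit the generalized partial credit model under both parameterizations
mod <- mirt(dat, 1, 'gpcm') # slope-intercept form, default TOL
mod2 <- mirt(dat, 1, 'gpcmIRT', TOL=1e-6) # classical IRT form, tighter TOL
# compare the item estimates on the IRT metric, then the overall fit
coef(mod, IRTpars=TRUE, simplify=TRUE)$items
coef(mod2, simplify=TRUE)$items
M2(mod)
M2(mod2)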
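If you want to see the effect of TOL more directly, one option is to refit both versions over a few tolerances and compare their log-likelihoods, which should move closer together as the criterion tightens. This is only a rough sketch: the seed, the particular TOL values, and the raised NCYCLES limit are my own arbitrary choices, not anything the package recommends.

```r
library(mirt)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# same simulation setup as in the script above
a <- matrix(rlnorm(20, .2, .3))
diffs <- t(apply(matrix(runif(20 * 4, .3, 1), 20), 1, cumsum))
diffs <- -(diffs - rowMeans(diffs))
d <- diffs + rnorm(20)
dat <- simdata(a, d, 5000, itemtype = 'graded')

# illustrative tolerances; pick whatever range is of interest
tols <- c(1e-4, 1e-6, 1e-8)

res <- do.call(rbind, lapply(tols, function(tol) {
  # NCYCLES raised (assumed value) so tight tolerances are not cut off early
  m1 <- mirt(dat, 1, 'gpcm', TOL = tol, verbose = FALSE,
             technical = list(NCYCLES = 2000))
  m2 <- mirt(dat, 1, 'gpcmIRT', TOL = tol, verbose = FALSE,
             technical = list(NCYCLES = 2000))
  data.frame(TOL = tol,
             logLik_gpcm = extract.mirt(m1, 'logLik'),
             logLik_gpcmIRT = extract.mirt(m2, 'logLik'))
}))

# absolute gap between the two parameterizations at each tolerance
res$abs_diff <- abs(res$logLik_gpcm - res$logLik_gpcmIRT)
print(res)
```

Changing the sample size in simdata() alongside the tolerances should show how much of the remaining disagreement comes from the data rather than from the stopping rule.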