Trouble loading SPSS data file: error


dariah...@gmail.com

Dec 14, 2013, 5:56:55 AM
to lav...@googlegroups.com
Hi everybody,

I am new to this group, new to R and lavaan so please bear with me...

I want to read an SPSS data file into R for use with lavaan. I want to conduct a CFA to test the dimensionality of four factors and afterwards apply an IRT model to investigate the test information of a personality questionnaire.

However, I am having trouble loading the SPSS file into R (see below). My question is: is anybody familiar with this error, and could you help?

Best,
Daria


This is my code:

install.packages("lavaan", dependencies = TRUE)
install.packages("quadprog", dependencies = TRUE)

library(lavaan)
library(foreign)

## read in DAPP SPSS file

dappsfa <- read.spss(file.choose(), use.value.labels=F, to.data.frame=T)
dappsfa <- read.spss('DAPP-SF-A.ONLY.sav', use.value.labels=F, to.data.frame=T)

head(dappsfa)
summary(dappsfa)
str(dappsfa)

And this is the error message I receive:

re-encoding from latin-9
Warning message:
In read.spss(file.choose(), use.value.labels = F, to.data.frame = T) :
  /Users/Daria/UvA/ResearchMSc/Thesis 2/DATA/DATA.BackUp/IRT DAPP-SF-A.ONLY.sav: Unrecognized record type 7, subtype 18 encountered in system file


I do get some output from the head(), summary() and str() functions, though. When I use the following syntax to fit the first factor, the summary function gives no output. I assume that this might be associated with the error loading the SPSS file?

dappsfaED <- ' emodys  =~ dappa18 + dappa20 + dappa31 + dappa67 + dappa81 + dappa102 + dappa105 + dappa123 +
dappa4 + dappa19 + dappa47 + dappa80 + dappa118 + dappa142 +
dappa9 + dappa125 + dappa129 + dappa140 + dappa141 + dappa144 +
dappa1 + dappa21 + dappa28 + dappa85 + dappa89 + dappa95 + dappa106 + dappa109 +
dappa35 + dappa38 + dappa46 + dappa52 + dappa91 + dappa107 + dappa115 + dappa130 + dappa132 + dappa138 +
dappa22 + dappa55 + dappa66 + dappa87 + dappa98 + dappa122 +
dappa7 + dappa16 + dappa29 + dappa121 + dappa135 + dappa136 +
dappa5 + dappa8 + dappa10 + dappa15 + dappa61 + dappa71 + dappa83 + dappa139 +
dappa23 + dappa32 + dappa33 + dappa86 + dappa103 + dappa108 +
dappa13 + dappa42 + dappa48 + dappa51 + dappa69 + dappa72 + dappa79 + dappa88 +
dappa11 + dappa27 + dappa37 + dappa44 + dappa84 + dappa128'

fit <- cfa(dappsfaED, data = dappsfa, std.lv=TRUE,
        ordered=c("dappa18","dappa20","dappa31","dappa67","dappa81","dappa102","dappa105","dappa123",
"dappa4","dappa19","dappa47","dappa80","dappa118","dappa142",
"dappa9","dappa125","dappa129","dappa140","dappa141","dappa144",
"dappa1","dappa21","dappa28","dappa85","dappa89","dappa95","dappa106","dappa109",
"dappa35","dappa38","dappa46","dappa52","dappa91","dappa107","dappa115","dappa130","dappa132","dappa138",
"dappa22","dappa55","dappa66","dappa87","dappa98","dappa122",
"dappa7","dappa16","dappa29","dappa121","dappa135","dappa136",
"dappa5","dappa8","dappa10","dappa15","dappa61","dappa71","dappa83","dappa139",
"dappa23","dappa32","dappa33","dappa86","dappa103","dappa108",
"dappa13","dappa42","dappa48","dappa51","dappa69","dappa72","dappa79","dappa88",
"dappa11","dappa27","dappa37","dappa44","dappa84","dappa128"))
summary(fit, fit.measures = TRUE, standardized=TRUE )
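
The item names above are typed twice, once in the model string and once in ordered=, so a typo in either list can go unnoticed. A minimal sketch of building both from one vector, using only the item numbers on the first two lines of the model; the object names `items` and `model` are illustrative:

```r
# Item numbers copied from the first two lines of the model above.
items <- paste0("dappa", c(18, 20, 31, 67, 81, 102, 105, 123,
                           4, 19, 47, 80, 118, 142))

# Build the lavaan model string for the single factor from the same vector.
model <- paste("emodys =~", paste(items, collapse = " + "))
cat(model, "\n")

# With lavaan loaded and the data read in, the same vector can be reused:
# fit <- cfa(model, data = dappsfa, std.lv = TRUE, ordered = items)
```

Extending `items` with the remaining item numbers keeps the model and the ordered= argument in sync.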

yrosseel

Dec 14, 2013, 8:42:56 AM
to lav...@googlegroups.com
On 12/14/2013 11:56 AM, dariah...@gmail.com wrote:
> And this is the error message I receive:
>
> re-encoding from latin-9
> Warning message:
> In read.spss(file.choose(), use.value.labels = F, to.data.frame = T) :
> /Users/Daria/UvA/ResearchMSc/Thesis 2/DATA/DATA.BackUp/IRT
> DAPP-SF-A.ONLY.sav: Unrecognized record type 7, subtype 18 encountered
> in system file

I'm not an SPSS (import) expert, but I had trouble before reading SPSS
files using the read.spss() function. It is probably best to contact the
R-help mailing list for this, since this has nothing to do with lavaan.

But I always export my SPSS data to a 'comma-separated-value' (*.csv)
file. You can do this from within SPSS. Next, you can read this *.csv
file by using either read.csv() or read.csv2(), depending on your locale.
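
A minimal sketch of the CSV route, using a temporary file as a stand-in for a file exported from SPSS. The toy data and column names are hypothetical; only the choice between read.csv() and read.csv2() matters here:

```r
# Stand-in for a CSV file exported from SPSS (toy data, hypothetical columns).
csv_path <- tempfile(fileext = ".csv")
toy <- data.frame(dappa18 = c(1L, 3L, 5L), dappa20 = c(2L, 2L, 4L))

# write.csv2() writes ";" as field separator and "," as decimal mark,
# the convention used in many European locales.
write.csv2(toy, csv_path, row.names = FALSE)

# read.csv2() expects that same convention;
# read.csv() expects "," as separator and "." as decimal mark.
dat <- read.csv2(csv_path)
str(dat)
```

If str() shows a single column containing whole lines of text, the file was most likely read with the wrong one of the two functions.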

Yves.

dariah...@gmail.com

Dec 15, 2013, 6:11:36 AM
to lav...@googlegroups.com
Hi Yves,

Thanks a lot for your answer and suggestions. I did not know about the R-help mailing list. I will try that,
and I will also try your suggestion to import another file format (I did not know about that either...)

I did find further information about the warning message I received, which apparently can be ignored (see link below). Well... I am a bit doubtful about it.

https://github.com/berndweiss/ps2012r_intro/blob/master/slide/ps2012-intro_R.org

Finally: Is it already possible to fit an IRT model for polytomous responses (e.g. the Samejima model) in lavaan?

Best,
Daria

On Saturday, December 14, 2013 at 14:42:56 UTC+1, Yves Rosseel wrote:

yrosseel

Dec 15, 2013, 6:30:46 AM
to lav...@googlegroups.com
On 12/15/2013 12:11 PM, dariah...@gmail.com wrote:
> Finally: Is it already possible to fit an IRT model for polytomous
> responses (e.g. the Samejima model) in lavaan?

Yes, sort of. You just need to declare your observed items as 'ordered'
(either in the data.frame, or using the ordered= argument in lavaan). If
you use a single latent factor, this corresponds to the Samejima model.

Note, however, that lavaan currently (0.5-15) uses the WLSMV estimator,
which will give you (slightly) different results compared to a marginal
ML estimator. And everything is on the probit scale (not the logit scale).
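
As a rough illustration of the probit scale: assuming a standardized latent variable (std.lv=TRUE) and lavaan's default delta parameterization, a standardized loading and its thresholds can be converted to normal-ogive IRT parameters by hand. The numbers below are made up, and the 1.702 constant only approximates the logit scale:

```r
lambda <- 0.70               # hypothetical standardized loading
tau    <- c(-1.0, 0.2, 1.1)  # hypothetical thresholds (4 response categories)

# Normal-ogive (probit) IRT parameters under the assumptions stated above.
a_probit <- lambda / sqrt(1 - lambda^2)  # discrimination
b        <- tau / lambda                 # category boundary locations

# Approximate logit-scale discrimination via the usual scaling constant.
a_logit <- 1.702 * a_probit

round(c(a_probit = a_probit, a_logit = a_logit), 3)
round(b, 3)
```

This is only a sketch for comparing magnitudes with logit-based IRT output; it does not remove the estimator difference between WLSMV and marginal ML.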

But you can get estimated scores for the latent variable, using the
predict() function.
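
A minimal sketch of this workflow on simulated data. The item names, loadings and cut points are all hypothetical, and the lavaan part only runs if the package is installed. lavPredict() is the function behind predict() for lavaan fits:

```r
set.seed(1)
n   <- 500
eta <- rnorm(n)  # simulated latent trait

# Simulate one item with four ordered categories from a latent response.
make_item <- function(lambda) {
  ystar <- lambda * eta + rnorm(n, sd = sqrt(1 - lambda^2))
  cut(ystar, breaks = c(-Inf, -1, 0, 1, Inf), labels = FALSE)
}
toy <- data.frame(x1 = make_item(0.8), x2 = make_item(0.7),
                  x3 = make_item(0.6), x4 = make_item(0.5))

if (requireNamespace("lavaan", quietly = TRUE)) {
  # Declaring all items as ordered gives a single-factor graded-response-type model.
  fit <- lavaan::cfa("f =~ x1 + x2 + x3 + x4", data = toy,
                     ordered = names(toy), std.lv = TRUE)
  scores <- lavaan::lavPredict(fit)  # estimated latent scores, one row per case
  print(head(scores))
}
```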

Yves.

dariah...@gmail.com

Dec 15, 2013, 8:28:45 AM
to lav...@googlegroups.com
Okay!
Would you otherwise recommend another R package to fit IRT models, for example the ltm package?

The theoretically proposed factor structure of the questionnaire is as follows: personality pathology is measured with four higher-order latent scales, to which 18 lower-order scales are assigned. A total of 144 items are first allocated to the 18 lower-order scales. These 18 indicators are then supposed to load on the four different factors...

Best,
Daria


On Sunday, December 15, 2013 at 12:30:46 UTC+1, Yves Rosseel wrote:

yrosseel

Dec 15, 2013, 10:37:36 AM
to lav...@googlegroups.com
On 12/15/2013 02:28 PM, dariah...@gmail.com wrote:
> Okay!
> Would you otherwise recommend another R package to fit IRT models, for
> example the ltm package?

ltm is great, but limited (as most IRT software) to one or two latent
variables. If you need 18 first-order latent variables and 4
second-order latent variables, lavaan + WLSMV should be able to do it
(but you need a fairly large sample size).

Marginal ML (as used by most IRT software packages) is (almost) not
feasible with 18 latent variables.
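
A back-of-the-envelope illustration of why this becomes infeasible, assuming the integral is approximated on a full grid with a fixed number of quadrature points per latent dimension. The value of 15 points is hypothetical, and actual software may use adaptive or Monte Carlo methods instead:

```r
q    <- 15               # hypothetical number of quadrature points per dimension
dims <- c(1, 2, 4, 18)   # number of latent variables

# A full grid needs q^d evaluation points for d latent variables.
grid_points <- q^dims
data.frame(latent_variables = dims, grid_points = grid_points)
```

With 18 latent variables the grid has more than 10^21 points under these assumptions, compared with 225 points for two latent variables.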

Yves.

Garett Howardson

Dec 16, 2013, 10:56:30 AM
to lav...@googlegroups.com
Hi Daria,
 
I had a similar error with lavaan when using read.spss, although it seems to be sporadic: sometimes I get it and sometimes I don't. I've also had similar problems with read.csv, so I usually try a mix of methods until I find one that works. One thing that sometimes works is converting the data frame read from SPSS to a matrix, which lavaan sometimes seems to like better. You can do this pretty easily:
 
dappsfa.matx<-as.matrix(dappsfa)
 
As I mentioned, though, this only seems to work sometimes, so I usually just try a combination of things until something works. Another thing I seem to have trouble with is using read.spss on a mix of string and integer data, so before converting to a matrix I typically remove any string variables. That is annoying, but it seems to work. I hope this helps!
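
A minimal sketch of removing string variables before the matrix conversion, using a toy data frame with hypothetical column names. It also shows why the order matters: as.matrix() on a data frame that still contains a character column yields a character matrix:

```r
# Toy stand-in for the imported data (hypothetical columns).
dappsfa_toy <- data.frame(id      = c("a", "b", "c"),
                          dappa18 = c(1, 3, 5),
                          dappa20 = c(2, 2, 4),
                          stringsAsFactors = FALSE)

# Keep only the numeric columns.
is_num      <- vapply(dappsfa_toy, is.numeric, logical(1))
dappsfa_num <- dappsfa_toy[, is_num, drop = FALSE]

# With only numeric columns left, as.matrix() returns a numeric matrix.
dappsfa_matx <- as.matrix(dappsfa_num)
str(dappsfa_matx)

# With the character column still present, everything becomes character.
str(as.matrix(dappsfa_toy))
```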

Daria Henning

Dec 18, 2013, 5:25:27 PM
to lav...@googlegroups.com
Hi Yves,

Thanks for your answer!

Well, I have two groups: one sample of about 1600 and another sample of about 250 adolescents.
Is that large enough?
The aim is to fit the CFA and IRT models to the whole sample, but also to each group separately.

Best,
Daria


On Sunday, December 15, 2013 at 16:37:36 UTC+1, Yves Rosseel wrote:

Daria Henning

Dec 18, 2013, 5:32:45 PM
to lav...@googlegroups.com
Hi Garett,

Thanks a lot for your post! I will try that and I'll let you know whether it'll work.

Yes, I had the string problem as well, I got lots of warnings about the string labels in certain variables. That was pretty frustrating...yes :). I have removed these variables intuitively and the warnings disappeared, except the one I described above.



On Monday, December 16, 2013 at 16:56:30 UTC+1, Garett Howardson wrote: