Converting SAS/SPSS files

Avner Kantor

Oct 1, 2015, 5:13:38 AM
to israel-r-user-group
Hello,

I would like to convert the txt file to a csv file using an SPSS or SAS script.
There are all sorts of solutions on the web, but none of them works for me; in the end I am forced to do it with the SPSS software itself.
Has anyone here managed to do this? Sample files are attached.

Thanks in advance,

Avner

INT_STQ09_SPSS_DEC11.sps
INT_STQ09_SAS_DEC11.sas
INT_SCQ09_Dec11.zip

amit gal

Oct 1, 2015, 5:23:02 AM
to israel-r-...@googlegroups.com

If you don't tell us what you actually want to do, you won't get useful answers.
As a rule, using SPSS is a bad idea, and if the goal is only to convert files from one format to another, perhaps with some basic data manipulation along the way, it is an even worse idea. My guess is that whatever you want to do can be done easily in R, but, as I said, you haven't told us what you actually want to do.

--
You received this message because you are subscribed to the Google Groups "Israel R User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to israel-r-user-g...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Avner Kantor

Oct 1, 2015, 5:47:57 AM
to israel-r-user-group
I want to read the data in R.

amit gal

Oct 1, 2015, 5:48:34 AM
to israel-r-...@googlegroups.com

So what is the problem?

Avner Kantor

Oct 1, 2015, 5:50:00 AM
to israel-r-user-group
I can't get it to work.

amit gal

Oct 1, 2015, 5:51:29 AM
to israel-r-...@googlegroups.com

What have you tried?

Jonathan Rosenblatt

Oct 1, 2015, 6:16:39 AM
to israel-r-user-group
Maybe:

read.table('INT_SCQ09_Dec11.txt', fill = TRUE, sep=' ')
Jonathan Rosenblatt
www.john-ros.com

Tal Galili

Oct 1, 2015, 6:17:14 AM
to israel-r-...@googlegroups.com
Hi Avner,
Following up on what Amit said, here is what would help us help you:
1) What do you know about the type of TXT file you have? (What format is it?)
2) What have you tried? (And how did it fail?)
3) Do you have a smaller example that demonstrates the problem? (A smaller file, R code that can be run?)



----------------Contact Details:-------------------------------------------------------
Contact me: Tal.G...@gmail.com
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)
----------------------------------------------------------------------------------------------


Ofrit Lesser

Oct 1, 2015, 6:21:04 AM
to israel-r-...@googlegroups.com

Avner,

The file INT_SCQ09_Dec11.txt can be read with read.csv, using suitable values for its parameters (for example sep=" ").

Ofrit





Avner Kantor

Oct 1, 2015, 6:31:57 AM
to israel-r-user-group
The txt file is in SPSS format, and the column names are in the script.
I tried using foreign and spss_to_r.
Jonathan, how do you attach the column names?

Tal Galili

Oct 1, 2015, 6:33:50 AM
to israel-r-...@googlegroups.com
Do you know exactly which format it is? I think SPSS has several formats.
And when you say it didn't work, what exactly happened?

Sent from a smart-phone

Avner Kantor

Oct 1, 2015, 6:38:50 AM
to israel-r-user-group
Tal, I don't know which format. The direction Jonathan and Ofrit suggested looks good. Now the only question is how to apply the script to it.

Yoni Sidi

Oct 1, 2015, 6:47:13 AM
to Israel R User Group, avner...@gmail.com
The best way to read SPSS, Stata and SAS data files is with the haven package.

Yoni Sidi

Oct 1, 2015, 7:59:18 AM
to Israel R User Group, avner...@gmail.com
On a second look at the raw txt file, it is inconsistent. I'm guessing the 9999.0000 is some flag, but it gets pasted onto the previous column, which merges variables at different parts of the file and creates an unequal number of columns per row. Other numbers get merged as well, not only that one. Either that, or there are missing NA values that are supposed to sit between the merged columns. This is why R has such a problem reading the txt file; I am surprised that any program can read it.

Where did you export the data from?
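
The merging described here is what one would expect from a fixed-width file, where each variable occupies fixed character positions and adjacent fields have no separator. A minimal sketch with made-up records (the widths and the names CNT, FLAG and SCORE are assumptions for illustration, not taken from the OECD file) shows why splitting on spaces gives unequal column counts, while base R's read.fwf splits by position:

```r
# Two toy fixed-width records: CNT (3 chars), FLAG (1 char), SCORE (9 chars)
records <- c("ISR1 123.4567", "USA09999.0000")
tmp <- tempfile(fileext = ".txt")
writeLines(records, tmp)

# Splitting on whitespace gives a different number of fields per record
fields_per_record <- lengths(strsplit(records, " +"))

# Splitting by character position recovers all three variables
toy <- read.fwf(tmp, widths = c(3, 1, 9),
                col.names = c("CNT", "FLAG", "SCORE"))
print(fields_per_record)
print(toy)
```

With the real data, the widths would have to come from the SAS or SPSS script that ships with the file.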

Avner Kantor

Oct 1, 2015, 1:51:39 PM
to Yoni Sidi, Israel R User Group
I'll try to explain better what I want to do. There is a txt file in ASCII format. At the moment I use SPSS to run an sps script on it, and then I get readable data. Here is the original explanation:
This syntax reads the ASCII text file data and applies variable and value labels, formats and missing value specifications.
The question is how I can perform this operation with the original script using R.
The goal, in the end, is to read the data.

Yoni, this is how the files are distributed on the OECD website.


Ofrit Lesser

Oct 1, 2015, 3:14:40 PM
to israel-r-...@googlegroups.com

What about read.spss from the foreign package?

See: http://stackoverflow.com/questions/3136293/read-spss-file-into-r


Yoni Sidi

Oct 1, 2015, 3:15:25 PM
to Israel R User Group, yon...@gmail.com, avner...@gmail.com
I found no good way to read the data in directly, since the columns are unevenly spaced. The closest thing was to add the option fill=T to the read.table or read.csv functions, but that doesn't really do the job. This is your best option: I took the info from the SAS file, where the length of each column is defined, and split each line accordingly. I added the progress bar because it takes a few minutes to split the data into variables and there are quite a few rows in the file.

library(stringr)

# Read every record as one string (sep = "\t" is meant to keep each line whole)
a <- read.table("c:/temp/INT_SCQ09_Dec11.txt", sep = "\t", stringsAsFactors = FALSE)

# Column widths from the SAS script; cumsum() turns them into end positions.
# The widths sum to 486 characters per record.
b <- cumsum(c(0, 3, 3, 1, 5, 5, rep(1, 15), rep(7, 4), 1, 1, rep(7, 4), 1, rep(4, 9),
              rep(1, 186), rep(9, 6), 1, 1, rep(9, 11), 5, 13))

# One cell per record and variable
x <- matrix(NA_character_, nrow = nrow(a), ncol = length(b) - 1)
pb <- txtProgressBar(min = 0, max = nrow(a), style = 3)
for (i in 1:nrow(a)) {
  for (j in 1:(length(b) - 1)) {
    x[i, j] <- str_sub(a[i, 1], b[j] + 1, b[j + 1])
  }
  setTxtProgressBar(pb, i)  # update once per record
}
close(pb)

x <- as.data.frame(x, stringsAsFactors = FALSE)

names(x) <- c('CNT','COUNTRY','OECD','SUBNATIO','SCHOOLID',
  'SC01Q01','SC01Q02','SC01Q03','SC01Q04','SC01Q05','SC01Q06','SC01Q07','SC01Q08','SC01Q09','SC01Q10','SC01Q11','SC01Q12','SC01Q13','SC01Q14',
  'SC02Q01','SC03Q01','SC03Q02','SC03Q03','SC03Q04','SC04Q01','SC05Q01','SC06Q01','SC06Q02','SC07Q01','SC07Q02','SC08Q01',
  'SC09Q11','SC09Q12','SC09Q21','SC09Q22','SC09Q31','SC09Q32','SC10Q01','SC10Q02','SC10Q03',
  'SC11Q01','SC11Q02','SC11Q03','SC11Q04','SC11Q05','SC11Q06','SC11Q07','SC11Q08','SC11Q09','SC11Q10','SC11Q11','SC11Q12','SC11Q13','SC12Q01','SC12Q02',
  'SC13Q01','SC13Q02','SC13Q03','SC13Q04','SC13Q05','SC13Q06','SC13Q07','SC13Q08','SC13Q09','SC13Q10','SC13Q11','SC13Q12','SC13Q13','SC13Q14',
  'SC14Q01','SC14Q02','SC14Q03','SC14Q04','SC14Q05','SC15Q01','SC15Q02','SC15Q03','SC15Q04','SC15Q05',
  'SC16Q01','SC16Q02','SC16Q03','SC16Q04','SC16Q05','SC16Q06','SC16Q07','SC16Q08',
  'SC17Q01','SC17Q02','SC17Q03','SC17Q04','SC17Q05','SC17Q06','SC17Q07','SC17Q08','SC17Q09','SC17Q10','SC17Q11','SC17Q12','SC17Q13',
  'SC18Q01','SC19Q01','SC19Q02','SC19Q03','SC19Q04','SC19Q05','SC19Q06','SC19Q07',
  'SC20Q01','SC20Q02','SC20Q03','SC20Q04','SC20Q05','SC20Q06','SC21Q01','SC21Q02','SC21Q03',
  'SC22Q01','SC22Q02','SC22Q03','SC22Q04','SC22Q05','SC23Q01','SC23Q02','SC23Q03','SC23Q04',
  'SC24Qa1','SC24Qa2','SC24Qa3','SC24Qa4','SC24Qa5','SC24Qb1','SC24Qb2','SC24Qb3','SC24Qb4','SC24Qb5',
  'SC24Qc1','SC24Qc2','SC24Qc3','SC24Qc4','SC24Qc5','SC24Qd1','SC24Qd2','SC24Qd3','SC24Qd4','SC24Qd5',
  'SC24Qe1','SC24Qe2','SC24Qe3','SC24Qe4','SC24Qe5','SC24Qf1','SC24Qf2','SC24Qf3','SC24Qf4','SC24Qf5',
  'SC24Qg1','SC24Qg2','SC24Qg3','SC24Qg4','SC24Qg5','SC24Qh1','SC24Qh2','SC24Qh3','SC24Qh4','SC24Qh5',
  'SC24Qi1','SC24Qi2','SC24Qi3','SC24Qi4','SC24Qi5','SC24Qj1','SC24Qj2','SC24Qj3','SC24Qj4','SC24Qj5',
  'SC24Qk1','SC24Qk2','SC24Qk3','SC24Qk4','SC24Qk5','SC24Ql1','SC24Ql2','SC24Ql3','SC24Ql4','SC24Ql5',
  'SC25Qa1','SC25Qa2','SC25Qa3','SC25Qa4','SC25Qb1','SC25Qb2','SC25Qb3','SC25Qb4','SC25Qc1','SC25Qc2','SC25Qc3','SC25Qc4',
  'SC25Qd1','SC25Qd2','SC25Qd3','SC25Qd4','SC25Qe1','SC25Qe2','SC25Qe3','SC25Qe4','SC25Qf1','SC25Qf2','SC25Qf3','SC25Qf4',
  'SC26Q01','SC26Q02','SC26Q03','SC26Q04','SC26Q05','SC26Q06','SC26Q07','SC26Q08','SC26Q09','SC26Q10','SC26Q11','SC26Q12','SC26Q13','SC26Q14','SC27Q01',
  'ABGROUP','COMPWEB','IRATCOMP','PCGIRLS','PROPCERT','PROPQUAL','SCHSIZE','SCHTYPE','SELSCH','STRATIO','EXCURACT',
  'LDRSHP','RESPCURR','RESPRES','SCMATEDU','STUDBEHA','TCHPARTI','TCSHORT','TEACBEHA','W_FSCHWT','STRATUM','VER_SCH')



yoni
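
A possible shortcut, sketched on toy data rather than on the real file: substr() is vectorised over the records, so the double loop can be replaced by one call per variable, and write.csv then produces the csv asked for at the top of the thread. The records, widths and variable names below are hypothetical stand-ins.

```r
# Toy stand-ins for the real records, widths and names (assumptions for illustration)
lines  <- c("ISR1 123.4567", "USA09999.0000")
widths <- c(3, 1, 9)
vars   <- c("CNT", "FLAG", "SCORE")

# First and last character position of every variable
ends   <- cumsum(widths)
starts <- ends - widths + 1

# One substr() call per variable handles all records at once
cols <- lapply(seq_along(widths),
               function(j) trimws(substr(lines, starts[j], ends[j])))
dat  <- as.data.frame(setNames(cols, vars), stringsAsFactors = FALSE)

# Convert columns that look numeric, then write the csv
dat[] <- lapply(dat, type.convert, as.is = TRUE)
out <- tempfile(fileext = ".csv")
write.csv(dat, out, row.names = FALSE)
```

On a file with many rows this should avoid the need for a progress bar, although that has not been timed on the OECD data.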

amit gal

Oct 1, 2015, 6:56:28 PM
to israel-r-...@googlegroups.com
Yoni - I wonder how you managed that. I tried and was baffled by the fact that each line has 486 characters, while the scripts seem to assume 1825 characters. I couldn't find which variables to expect or where I should look.

I guess this is down to my limited SAS/SPSS knowledge, and not really an R question :)

amit



Yoni Sidi

Oct 2, 2015, 5:48:37 AM
to Israel R User Group
Are we allowed to answer non-R questions here? Look in the SAS script file for the keyword 'length'; that is where the placement of each column is defined manually.

yoni

Tal Galili

Oct 2, 2015, 7:07:25 AM
to israel-r-...@googlegroups.com
 Blasphemy!

:P

Sent from a smart-phone


amit gal

Oct 2, 2015, 7:24:43 AM
to israel-r-...@googlegroups.com
Didn't get it. :(

Yoni Sidi

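Oct 2, 2015, 8:04:04 AM
to Israel R User Group
Unfortunately, you are almost the only one who will enjoy this:
library(stringr)
# Read the SAS script, one line per row
a <- read.table("http://pisa2009.acer.edu.au/downloads/INT_SCQ09_SAS_DEC11.sas", sep = "\t", stringsAsFactors = FALSE)
# Keep the lines between the 'length' and 'infile' keywords and split each one on spaces
first <- which(grepl('length', str_trim(a[, 1]))) + 1
last <- which(grepl('infile', str_trim(a[, 1]))) - 1
a1 <- str_split(str_trim(a[first:last, 1]), " ")
# The last token of each line is taken as the column width
b <- as.numeric(gsub("[$.;]", "", unlist(lapply(a1, tail, 1))))
# This assumes exactly one line whose last token is not a plain width;
# for that line, the width comes from the token that contains a digit
a3 <- a1[[which(is.na(b))]]
b[is.na(b)] <- as.numeric(gsub("[A-Z$.;]", "", a3[which(grepl("[1-9]", a3))]))
# The first token of each line is the variable name
varnames <- unlist(lapply(a1, head, 1))
These are the same objects I put in manually in the code before.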
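
For readers without access to the download, the same idea can be tried in base R on a made-up excerpt. The layout of the toy script below is an assumption and may differ from the real OECD file; only the approach (take the lines between 'length' and 'infile', then read the name from the first token and the width from the last) follows the code in this message.

```r
# Hypothetical excerpt of a SAS script; the real file's layout may differ
sas_lines <- c(
  "data school;",
  "length",
  "  CNT $3.",
  "  SCHOOLID $5.",
  "  W_FSCHWT 9.;",
  "infile 'school.txt';"
)

# Lines strictly between the 'length' and 'infile' keywords
first <- grep("length", sas_lines) + 1
last  <- grep("infile", sas_lines) - 1
spec  <- strsplit(trimws(sas_lines[first:last]), " +")

# First token: variable name; last token: width with '$', '.' and ';' removed
varnames <- vapply(spec, function(s) s[1], character(1))
widths   <- as.numeric(gsub("[$.;]", "",
                            vapply(spec, function(s) s[length(s)], character(1))))
print(data.frame(varnames, widths))
```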


