
Deleting empty variables and other questions


Sona

Mar 18, 2011, 12:56:04 AM
Hello,
I came across this group while doing a Google search. I hope I can
post here :)

1. I am writing syntax to analyze all my data in one step. I receive
my data in 2 Excel spreadsheets. I have to restructure (cases to
variables) the data, then merge the two data sets. When I do, all the
cases with the same identifier get combined into a single case, so I
get a lot of empty variables in the data set. I would like to delete
these variables using syntax if I can, as they really clutter the
data set (it's humongous).

2. When I import data from Excel to SPSS (I am using ver 19) the date
variables are getting converted to string variables. I tried several
different ways to import, but each time it converts the mm.dd.yy format
from Excel into a 6 digit number that makes no sense to me. In version
17 I know I could define the variable type (string, numeric, date etc.)
during import, but I am not able to do this in ver 19. Please advise.

3. I have a variable that is in decimal format (to protect it). I
have to convert it to a binary format and then read the bits. For
example, the variable value is 14363. I need to write code to form a
new variable and convert it to 11100000011011. Then I need to write
code to read the 12th bit from the left and form a new variable where
the value is 1 if this bit is 1 and 0 if this bit is 0.
This one is the most complicated and I really need a solution for it,
as I did it manually the first time to meet a deadline. If I have to
do it manually again I may collapse :) I have the code for this in SAS,
but I need to do this in SPSS syntax.

Code in SAS: bit11 = substr(charvar,11,1);

Please help. I really appreciate it.

Sona

Mar 18, 2011, 1:00:45 AM

I made an error in my post.

"Then  I need to write a code to read the 12th bit from the RIGHT and

Bruce Weaver

Mar 18, 2011, 8:40:08 AM

If the binary number is stored as a string, I think this does what you
want.

* Example assuming binary number is stored as string.
data list list / binarynum (a20).
begin data
11100000011011
11000000011011
111000000110110
111100000110110
1110000001101
1010000001101
111000000110
11000000110
end data.

* Read 12th bit from the RIGHT .

compute #Length = length(RTRIM(binarynum)).
compute #pos = #Length - 12 + 1.
if (#pos GT 0) bit12 = number(substr(binarynum,#pos,1),f1).
list.

Re the trouble with dates imported from Excel, scroll about halfway
down the page linked below to find a section on "Importing date
variables from Excel". I think the problem you've encountered is
discussed there.

https://sites.google.com/a/lakeheadu.ca/bweaver/Home/statistics/spss/my-spss-page

HTH.

--
Bruce Weaver
bwe...@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/Home
"When all else fails, RTFM."

Jon Peck

Mar 18, 2011, 9:03:19 AM
Deleting empty variables can be done using programmability. To enable this code, you need to download the Python Essentials from the SPSS Community site (www.ibm.com/developerworks/spssdevcentral) and the spssaux2.py module from that site (in the Utilities collection). Save the module in the extensions directory of your SPSS installation.

Then run this code:
begin program.
import spss, spssaux2
spssaux2.delEmptyVars()
end program.

That will remove all numeric variables where the values are all sysmis and all blank string variables.

BTW, you could calculate a dummy for bit 12 without all that roundabout mechanism described above, but I won't spell that out for now.

Andy W

Mar 18, 2011, 9:46:33 AM

For 1 you can use the match files command with the drop (or keep)
subcommand.

*the asterisk specifies the active dataset.
match files file = *
/drop var1 var2 to var10.


For 2, I suspect it would be simplest to transform the date in Excel
into something more manageable. Dates in Excel are the days since
1/1/1900 (and that day is recorded as 1). If you just converted that
to a numeric value in Excel and then imported the numeric value, you
could use syntax in SPSS to transform it back into any date you want.

Here is an example. Maybe it deserves another question, but as a note
I'm not quite sure why I need to subtract two instead of one from my
excel date (perhaps I have some type of setting to round up?). Anyway
this is the general format of the syntax you need, just check yourself
to make sure the dates are the right dates.

*example taking excel dates as numbers.
compute date_SPSS = DateSum(Date.Dmy(01,01,1900),
(Excel_Date-2),"days").
formats date_SPSS (date11).
execute.
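
On the two-day offset in the syntax above: as far as I know it comes from two conventions in Excel's 1900 date system. Serial 1 is 1/1/1900 rather than 0, and Excel also counts a nonexistent 2/29/1900, so every serial from 3/1/1900 onward is one more day ahead. Below is a minimal Python sketch of the same conversion. The function name excel_serial_to_date is made up for illustration, and the sketch assumes the workbook uses the 1900 date system and that the serial is 61 or greater.

```python
from datetime import date, timedelta

# Assumption: 1900 date system, serial >= 61 (i.e. 3/1/1900 or later).
# 1899-12-30 is 1/1/1900 minus the same two days subtracted in the syntax.
EXCEL_EPOCH = date(1899, 12, 30)

def excel_serial_to_date(serial):
    """Hypothetical helper: Excel serial day number -> calendar date."""
    return EXCEL_EPOCH + timedelta(days=int(serial))

print(excel_serial_to_date(40620))  # 2011-03-18
```

Spot-checking a few known dates this way is a quick test of whether the -2 is right for your file.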


I'm not quite sure how to accomplish 3. I suspect others may be able
to help, but if you search the help files for "formats" it brings up
all the format types. I believe all the string functions in SPSS count
from the left. In the example below I get the 6th character from the
right.

string test (A9).
compute test = "987654321".
compute str_position = LENGTH(test) - 5.
string str_6 (A1).
compute str_6 = char.substr(test,str_position,1).
execute.


I just don't know how to change 14363 to 11100000011011.


Bruce Weaver

Mar 18, 2011, 10:57:34 AM
On Mar 18, 9:03 am, Jon Peck <JKP...@gmail.com> wrote:

--- snip ---

> BTW, you could calculate a dummy for bit 12 without all that roundabout mechanism described above, but I won't spell that out for now.

As you know, Jon, I have on occasion posted *far* more roundabout
solutions than that one. ;-) So go on, what's the more
straightforward method?

Jon Peck

Mar 18, 2011, 1:36:03 PM
First issue is whether the posited binary string is actually always the same length. Logically, if the number magnitude varies, so would the length of the binary representation. So to be safe, I would count from the right hand side. There were 14 bits in that binary representation, so you need to ignore

So if x is the variable whose value is 14363, then this creates the dummy:
compute isBit12On = mod(trunc(x/4), 2).

-Jon

Bruce Weaver

Mar 18, 2011, 3:03:37 PM

While I counted from the left side, I counted over to position #pos,
where:

#pos = length(RTRIM(binarynum)) - 12 + 1

That's equivalent to counting 12 over from the right end, isn't it?
In the event that #pos is equal to 0 or less, there is no 12th bit
counting from the right, which is why I used IF.

if (#pos GT 0) bit12 = number(substr(binarynum,#pos,1),f1).

Cheers,
Bruce

Jon Peck

Mar 18, 2011, 4:36:25 PM
The ambiguity is in the original problem statement. Did it really mean the 12th bit from the left regardless of length? If the magnitude of the number could vary enough to change the number of bits, which wouldn't be hard unless there are always the requisite leading zeros, then we need to know what the requirement really is. But my point was mainly that there would be no need to convert the original number into a string of 0's and 1's in order to create the dummy.

-Jon

Bruce Weaver

Mar 18, 2011, 4:47:46 PM
On Mar 18, 4:36 pm, Jon Peck <JKP...@gmail.com> wrote:
> The ambiguity is in the original problem statement. Did it really mean the 12th bit from the left regardless of length? If the magnitude of the number could vary enough to change the number of bits, which wouldn't be hard unless there are always the requisite leading zeros, then we need to know what the requirement really is. But my point was mainly that there would be no need to convert the original number into a string of 0's and 1's in order to create the dummy.
>
> -Jon


Jon, in Sona's second post, I took her clarification to mean that she
really wanted the 12th bit counting from the RIGHT.

Jon

Mar 19, 2011, 2:36:49 AM

Sona, Jon and Bruce,
A general formula (counting from the right) would be
compute BitXisOn=mod(trunc(x/2**position),2).
where x is the number of interest, and position is 0 based (i.e.
rightmost bit is bit 0, next is bit 1 and so on).
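
To make the arithmetic concrete, here is a small Python sketch of the same idea: dividing by 2 raised to the bit position moves the wanted bit into the units place, and the remainder mod 2 reads it off. The function names are invented for illustration, and the sketch assumes non-negative whole numbers.

```python
def bit_is_on(x, position):
    """Return 1 if the bit at `position` (0 = rightmost) is set in x, else 0."""
    return (int(x) // 2 ** position) % 2

def bits_from_right(x, n_bits):
    """Return [bit 1, bit 2, ...], where bit 1 is the rightmost bit."""
    return [bit_is_on(x, p) for p in range(n_bits)]

# The 12th bit from the right is position 11 when counting from 0.
print(bit_is_on(14363, 11))        # 1
print(bits_from_right(14363, 14))  # [1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1]
```

The list is just 11100000011011 read from right to left, so each element could become its own 0/1 variable without ever building the string.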

If you actually want a string representation of the number that is a
bit more complex:
assume v1 is your number

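* The loop runs 21 times (2**20 down to 2**0), so the string needs 21 characters.
string binary (A21).
compute #r1=v1.
compute binary="".
loop #i=20 to 0 by -1.
compute #twop=2**#i.
compute #res=trunc(#r1/#twop).
do if #res=1.
* rtrim drops the trailing blanks so the new digit is not truncated.
compute binary=concat(rtrim(binary),"1").
compute #r1=#r1-#twop.
else.
compute binary=concat(rtrim(binary),"0").
end if.
end loop.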
* and to get a particular position, counting from the LEFT (since the
string has fixed length it is not particularly difficult to count from
the right).
compute v2=number(CHAR.SUBSTR(binary,12,1),f1.0).
exe.

One could of course put this into a macro.
(But, to Jon's chagrin, I guess, it does not help to pythonize the
code, since Python does not have a decimal to binary string procedure
built in.)

jon

Jon Peck

Mar 19, 2011, 10:24:58 AM
Actually, in Python bin(x) gives you the binary representation of x as a string. So, bin(int(x)) would do it. And if you really wanted to go this way, you could use the SPSSINC TRANS extension command from the SPSS Community to apply this function to a variable in the active SPSS dataset.

begin program.
def binstring(x):
    return bin(int(x))[2:]
end program.

spssinc trans result=binstring type=14
/formula "binstring(x)".

This creates a string variable of length 14 containing the binary representation. The [2:] in the return statement above is there because the Python conversion prefixes the string with "0b".
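
For anyone who wants to check the Python side outside SPSS, here is a minimal standalone sketch. It also shows one way to left-pad with zeros so that a position counted from the left always refers to the same bit. The name binstring_padded is made up, and the width of 14 simply mirrors type=14 above.

```python
def binstring(x):
    # bin() returns e.g. '0b11011'; slice off the '0b' prefix.
    return bin(int(x))[2:]

def binstring_padded(x, width=14):
    # Hypothetical variant: left-pad with zeros to a fixed width.
    return binstring(x).zfill(width)

print(binstring(14363))      # 11100000011011
print(binstring_padded(27))  # 00000000011011
```

Without padding, smaller values give shorter strings, which is the length issue raised earlier in the thread.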

But, as now amply demonstrated, there is no need to go through the explicit binary representation in order to solve the original problem.

Regards,
Jon Peck

Jon

Mar 19, 2011, 9:19:38 PM
Jon,
I guess I asked for that.
Jon

Sona

May 4, 2011, 11:28:25 AM
Thank you very much for your responses. I am sorry for the delay in
reviewing them. I got taken up with some other projects.

I am not very familiar with all the terminology, so I am a bit lost
with these responses :( Let me pose my question again a bit more
clearly.

I have a variable named Intervention that contains a decimal number,
for example 12305. I first need to convert this into a binary number.
Then I need to make each bit a separate variable and name the
variables bit 1, bit 2, bit 3 etc., with bit 1 being the last digit on
the right (reading from right to left).

Can you please break this up a little more for me so I can follow your
directions? I will be writing this code in the SPSS syntax editor.
