Age Categories. SPSS is being too precise with decimals.

bertma...@yahoo.com

Feb 14, 2006, 4:06:26 PM
I have a data set like this:

Name Age Age category
Joe 15 15-17
Nancy 20 20-21
Susie 45 35+

This is the syntax for calculating age based on admission date and date
of birth, recoding and labelling of variables:

COMPUTE Length = admit - dob.
COMPUTE Years = Length/(365.25*24*60*60).
FORMATS Years (F4.2).
EXECUTE.

STRING AgeCat (A8) .
RECODE
Years
(0 thru 0.99='1') (1 thru 2.99='2') (3 thru 4.99='3') (5 thru 6.99='4')
(7 thru 10.99='5') (11 thru 14.99='6') (15 thru 17.99='7') (18 thru 19.99='8')
(20 thru 21.99='9') (22 thru 24.99='10') (25 thru 34.99='11')
(35 thru Highest='12') INTO AgeCat .
VARIABLE LABELS AgeCat 'Age Categories'.
EXECUTE .

VALUE LABELS AgeCat
1 "<1"
2 "1-2"
3 "3-4"
4 "5-6"
5 "7-10"
6 "11-14"
7 "15-17"
8 "18-19"
9 "20-21"
10 "22-24"
11 "25-34"
12 "35+".
EXECUTE.

I have age going out 2 decimals but occasional ages are being
miscategorized, and a few won't be categorized at all. For example, if
I look out 3 decimals, someone who is 24.999 or 22.991 won't be
categorized. How far out does one need to go?

Thanks.
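
To make the gap concrete, here is a minimal sketch with made-up values (Cat is a hypothetical output variable, and only two of the twelve ranges are shown). Any age strictly between 24.99 and 25 matches neither range, so RECODE leaves it unrecoded. FORMATS only changes how Years is displayed, not the stored value that RECODE tests.

```spss
* Sketch only: the data values and the variable Cat are made up for illustration.
DATA LIST FREE / Years.
BEGIN DATA
24.5 24.991 24.999 25
END DATA.
* Same style of bounds as the syntax above: nothing covers 24.99 < Years < 25.
RECODE Years (22 thru 24.99=10) (25 thru 34.99=11) INTO Cat.
LIST.
```

In the LIST output, the 24.991 and 24.999 cases should show a system-missing Cat, while 24.5 and 25 are categorized.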

Bruce Weaver

Feb 14, 2006, 4:22:47 PM


If you only want YEARS to two decimal places, try this.

compute #y = Length/(365.25*24*60*60).
compute years = rnd(#y*100)/100.

--
Bruce Weaver
bwe...@lakeheadu.ca
www.angelfire.com/wv/bwhomedir

vab

Feb 14, 2006, 4:45:24 PM
SPSS stores values in binary format and binary numbers don't always
match decimal numbers. It looks like you're worried that values on the
boundary will get recoded twice. RECODE doesn't work that way. It only
recodes each variable for each case once during the data pass and the
first value that matches is used. So do something like this.

RECODE years (35 thru highest='12') (25 thru 35='11') (22 thru 25='10')
... into AgeCat.

If you're going to use a string to store the result, why bother with
value labels and just specify something like
(35 thru highest='35+').
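
Spelled out for all twelve categories from the original post, the top-down idea might look like the sketch below. It is not vab's exact syntax: AgeCatN is a hypothetical new numeric variable (the original AgeCat was declared as a string), and the labels are copied from the original VALUE LABELS. Because the first matching specification wins, each upper bound can simply repeat the lower bound of the category above it, which leaves no gap for values such as 24.999.

```spss
* Sketch only: AgeCatN is a hypothetical numeric target variable.
* Specifications are scanned left to right and the first match is used.
RECODE Years
  (35 thru Highest=12)
  (25 thru 35=11)
  (22 thru 25=10)
  (20 thru 22=9)
  (18 thru 20=8)
  (15 thru 18=7)
  (11 thru 15=6)
  (7 thru 11=5)
  (5 thru 7=4)
  (3 thru 5=3)
  (1 thru 3=2)
  (0 thru 1=1)
  INTO AgeCatN.
VARIABLE LABELS AgeCatN 'Age Categories'.
VALUE LABELS AgeCatN
  1 '<1'
  2 '1-2'
  3 '3-4'
  4 '5-6'
  5 '7-10'
  6 '11-14'
  7 '15-17'
  8 '18-19'
  9 '20-21'
  10 '22-24'
  11 '25-34'
  12 '35+'.
FREQUENCIES VARIABLES=AgeCatN.
```

With this ordering, a value of exactly 25 is claimed by (25 thru 35) before (22 thru 25) is ever tested, so it lands in category 11 as intended.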

neila...@msn.com

Feb 14, 2006, 10:01:52 PM
NOTICE VAB is recoding the values from TOP to bottom.
That effectively traps your boundary cases!!
Good Job!
Neila

bertma...@yahoo.com

Feb 15, 2006, 1:27:04 PM
What I did before reading the replies was go out one more decimal
(17.999), which seemed to work. I wasn't concerned about duplicates,
just about cases ending up in the wrong category.

The reason I used value labels is that I hoped reports would come out
in the correct chronological order. In fact they didn't, so I'll
probably redo it, skip that part and just create a string variable.

I will try Bruce's suggestion. And, if I'm understanding VAB, I should
recode without the decimals (sorry, I'm not a highly skilled SPSS
user).

I lifted the syntax for the categories from a web page that has oodles
of syntax. It showed, e.g., (1 thru 2.99='2') with the 2.99, which is
why I tried it that way.

Thank you for your help.

neila...@msn.com

Feb 15, 2006, 2:02:20 PM
>What I did before reading the replies was went out one more decimal
>(17.999) which seemed to work. I wasn't concerned about duplicates but
>just being in the wrong category.


***which seemed to work. ***.

FAMOUS LAST WORDS!!!!!
DON'T DO THAT!!! because it ONLY seems to work!!!!!!!

Do what Vab did. You will NEVER EVER be sure,
using floating point numbers, that something's not
slipping through! This reverse assignment recoding
method (for lack of a better term) is a keeper. I've
been doing it that way for years, and there really
is not a better solution aside from truncating the data values.
Neila

rya...@gmail.com

Feb 15, 2006, 3:45:33 PM
Recode into a numeric variable, not a string. Then it will be easy to
sort the values lowest to highest in your summary tables.
The reason it didn't work for you is because you recoded into a string
var, which sorts differently (e.g. 11 comes before 2).

neila...@msn.com

Feb 15, 2006, 5:31:50 PM
>Recode into a numeric variable, not a string. Then it will be easy to
>sort the values lowest to highest in your summary tables.

Good advice to use a numeric variable, but not for the reason stated.
Numerical variables are more efficient to process and don't suffer from
type inconsistencies for merges/adds etc. Also less typing (no need for
all those quotes). Some procs don't accept character variables....

>The reason it didn't work for you is because you recoded into a string
>var, which sorts differently (e.g. 11 comes before 2).

Well, that is one reason to use "02" instead of "2". Of course, " 2" < "11".
This has nothing to do with the original issue, which concerns values
slipping through the RECODE sieve.
HTH, Neila

vab

Feb 15, 2006, 5:46:56 PM
I'm with Dave here. You must have gotten lucky this time. Are you sure
your case counts match for AgeCat and Years? Another quick way to check
is to put an ELSE into the recode to catch everything that fell through
the binary cracks.

neila...@msn.com

Feb 15, 2006, 7:36:39 PM
>"Another quick way to check
>is to put an ELSE into the recode to catch everything that fell through
>the binary cracks. "

Well, If I were in the same room with you right now you would be in
pain ;-)))
PLEASE!!! any sane person would go with your original approach....

Bert, explaining to his boss as the door hits him on the ass on the way
out:
"Well boss, all of those almost 3 yr olds are in the ELSE category!!!!"
Slam -ouch- .

*As the suspected witch with 6 toes lifts the pistol and ....(oh crap
wrong toe)...

Binary Cracks (great name for a rock band ).
Neila

bertma...@yahoo.com

Feb 17, 2006, 11:39:10 AM
I guess I'm a prime example of a little bit of knowledge being
dangerous. But at least I know where to go to be set straight.

I ended up with this syntax and all is well (I think). Age is
calculated without decimals and without rounding. And now the crosstabs
are coming out in chronological order. Thanks for your help.

COMPUTE Age2 = TRUNC((CTIME.DAYS(admit-dob))/365.25) .

RECODE
Age
(0=1) (35 thru Highest=12) (25 thru 34=11) (22 thru 24=10) (20 thru 21=9)
(18 thru 19=8) (15 thru 17=7) (11 thru 14=6) (7 thru 10=5) (5 thru 6=4)
(3 thru 4=3) (1 thru 2=2) INTO Agecat .

bertma...@yahoo.com

Feb 17, 2006, 12:14:09 PM
Well actually the Recode (0=1) is at the end. I added (ELSE=13) for
missing data. Is that ok to do it that way?
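
On the ELSE question: ELSE matches every value that no earlier specification claimed, and that includes system-missing values, so (ELSE=13) would lump genuinely missing dates together with anything unexpected, such as an age of -1 or below caused by a date-entry error. One possible variant is sketched below. It assumes the truncated age is held in Age2, as in the COMPUTE above (the posted RECODE names Age), and it uses a hypothetical output variable AgeCat2 and a made-up code 99 for values that need checking.

```spss
* Sketch only: AgeCat2 and the code 99 are hypothetical names and values.
COMPUTE Age2 = TRUNC(CTIME.DAYS(admit - dob) / 365.25).
* MISSING covers system-missing and user-missing input values.
* It is listed first so that missing values are claimed before any range.
RECODE Age2
  (MISSING=13)
  (35 thru Highest=12) (25 thru 34=11) (22 thru 24=10) (20 thru 21=9)
  (18 thru 19=8) (15 thru 17=7) (11 thru 14=6) (7 thru 10=5)
  (5 thru 6=4) (3 thru 4=3) (1 thru 2=2) (0=1)
  (ELSE=99)
  INTO AgeCat2.
VALUE LABELS AgeCat2 13 'Missing' 99 'Unexpected value'.
FREQUENCIES VARIABLES=AgeCat2.
```

If the frequency table shows any cases coded 99, those records are worth inspecting before the categories are reported.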
