
Stata code to SPSS - calculating person-years at risk by year for open cohort medical data


Maddy

Dec 31, 2015, 7:22:30 AM
Patients aged 45 and over who are registered with a primary care provider enter the study between 1 Jan 2001 and 10 March 2008. Some die; others are censored when they deregister or leave the service provider over the decade. I want to calculate age-band- and gender-specific death or incidence rates for *each year*, to compare against national statistics and assess how representative the sample of patients is.

In Stata's survival analysis, the 'stsplit' command generates multiple records per patient, identifying their (pre-specified) age band for each year they were in the study. From these records it is possible to calculate the person-years at risk in each year and the death rate.
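
To make the idea concrete outside Stata, here is a minimal Python sketch of the splitting step for a single patient. The function name, the example dates and the choice to treat the exit date as exclusive are all assumptions for illustration; it is not a description of what stsplit does internally.

```python
from datetime import date


def split_by_year(start: date, end: date) -> list[tuple[int, float]]:
    """Split follow-up [start, end) into (calendar_year, person_years) pieces."""
    pieces = []
    for year in range(start.year, end.year + 1):
        # Clip the follow-up interval to this calendar year.
        seg_start = max(start, date(year, 1, 1))
        seg_end = min(end, date(year + 1, 1, 1))
        days = (seg_end - seg_start).days
        if days > 0:
            pieces.append((year, days / 365.25))
    return pieces


# Hypothetical patient: registered 1 Jul 2001, deregistered 1 Apr 2003.
pieces = split_by_year(date(2001, 7, 1), date(2003, 4, 1))
for year, pyears in pieces:
    print(year, round(pyears, 3))
```

Each piece is one "record" in the split data; summing the person-years over patients within a year gives the denominator for that year's rate.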

Is there equivalent code in SPSS?

I reproduce the Stata code below (** indicates a comment):
** end = date of exit from study
stset end, failure(evt_ind) origin(d_yob) enter(start) id(patid) scale(365.25)

stsplit ageband, at(20 30 40 50 60 70)
stsplit period, after(time=mdy(1,1,1960)) at(41(1)48)

(** The 1960 date is a Stata thing! With at(41(1)48), the cut points fall 41, 42, ..., 48 years after 1 Jan 1960, so the code splits follow-up into 1-year intervals from 2001 to 2008.)
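
As a quick check of that mapping, here is a small Python sketch (not Stata or SPSS; the variable names are made up for illustration). It assumes the cut points are taken at exact multiples of 365.25 days after 1 Jan 1960, which is an assumption about how scale(365.25) is applied rather than verified Stata behaviour.

```python
from datetime import datetime, timedelta

ORIGIN = datetime(1960, 1, 1)  # the date given in after(time=...)
offsets = range(41, 49)        # the numlist 41(1)48

# Calendar years the offsets correspond to.
cut_years = [ORIGIN.year + k for k in offsets]
print(cut_years)  # [2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008]

# ASSUMPTION: cuts fall at exact multiples of 365.25 days after the origin.
first_cut = ORIGIN + timedelta(days=41 * 365.25)
print(first_cut)  # 2000-12-31 06:00:00
```

Under that assumption the cut points land within a day of 1 January rather than exactly on it, which should be negligible for yearly rates.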

strate ageband period, per(1000)


Thanks in advance
Maddy

David Marso

Dec 31, 2015, 12:53:54 PM
Maddy,
Rather than expect anyone to translate this uncommented Stata code, you would do well to explain, in detailed comments, what each of these Stata commands does to the data file and why it is in the command stream in the first place.
Seriously:

stset end (**date of exit from study), failure (evt_ind) origin (d_yob) enter (start) id (patid) scale (365.25)

???? Unpack this a bit if you expect anything other than blank stares!!!

stsplit period, after(time=mdy(1,1,1960)) at(41(1)48)
WTF??? Don't assume that is obvious. I grok it now but you could help yourself get help by not assuming that it is transparent that 41 maps to 2001, 48 to 2008 and that 1 in the middle represents 1 year! Also what actually happens to the data from doing this? Maybe a SNAPSHOT?

All that aside...
If you are attempting to build a record for each year a patient is involved you can use XSAVE within a loop.
AIR CODE not guaranteed/not tested...YMMV...
Assumptions:
1. You have an ID and two variables, STARTDATE and TERMINATE, on the same record/row, each of which is an SPSS DATE variable (look that up if you aren't completely clear on the concept).
2. The file is SORTED by ID.

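* Keep only the variables needed for the explode step.
MATCH FILES /FILE=* /KEEP=ID StartDate Terminate <whatever other variables you need to work with.....>.
* Write one record per calendar year from StartDate to Terminate.
LOOP YearBetween=XDATE.YEAR(StartDate) TO XDATE.YEAR(Terminate).
XSAVE OUTFILE=<some file spec here> /KEEP=ID YearBetween.
END LOOP.
EXECUTE.
* Attach the original records to the exploded file by ID.
MATCH FILES /TABLE=* /FILE=<some file spec here> /BY ID.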

This will attach the original data records to an exploded file containing a Year variable for each year between StartDate and Terminate for each ID.

Next, do the date arithmetic to determine age in each year (see the DATEDIFF function in the COMPUTE command).
RECODE your ages to whatever strata and then AGGREGATE to get whatever totals you need.
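
To show what the recode-and-aggregate step is meant to produce, here is a rough Python sketch of the logic only (not SPSS syntax). The rows, the 10-year bands and the column layout are all made up for illustration; in practice sex would be added to the grouping key, and the person-years column would come from how much of each year the patient was actually under follow-up.

```python
from collections import defaultdict


def age_band(age: float) -> str:
    """Map an age in years to a 10-year band label (hypothetical bands)."""
    lower = int(age // 10) * 10
    return f"{lower}-{lower + 9}"


# Hypothetical exploded rows: (id, year, age, person_years, died_in_year)
rows = [
    (1, 2001, 47.5, 0.5, 0),
    (1, 2002, 48.5, 1.0, 0),
    (2, 2001, 52.0, 1.0, 0),
    (2, 2002, 53.0, 0.25, 1),
    (3, 2002, 58.0, 1.0, 0),
]

# Sum person-years and deaths within each (year, age band) stratum.
totals = defaultdict(lambda: [0.0, 0])
for _id, year, age, pyears, died in rows:
    key = (year, age_band(age))
    totals[key][0] += pyears
    totals[key][1] += died

# Death rate per 1000 person-years in each stratum.
rates = {key: 1000 * deaths / pyears for key, (pyears, deaths) in totals.items()}
for key in sorted(rates):
    print(key, rates[key])
```

The same shape of result is what AGGREGATE with year and age band as break variables would give: one row per stratum with summed person-years and deaths.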
HTH, David