Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

obtaining cumulative sum in spss

1,398 views
Skip to first unread message

maja.g...@gmail.com

unread,
May 18, 2006, 12:26:36 PM5/18/06
to
Hi there,

I was wondering if anyone knows how to do the following: I have 3
columns, the first one is SSN (patient), the second one is days and the
third one is date. I would like to sum all days whenever the dates are
the same and just copy the number of days if there is only one date for
that row of days. But also, I need to make sure that this only happens
within the same SSN, i.e. if there are two same dates where one SSN
ends and the new SSN begins, this should be ignored, i.e. I don't want
to take the sum of days in this case. Essenitally I would like to get
column "new_days" as shown. And then delete all the rows that are blank
for new_days.

SSN days date new_days
1 30 30-Mar-00 .
1 30 30-Mar-00 .
1 10 30-Mar-00 .
1 30 30-Mar-00 .
1 16 30-Mar-00 116
1 100 3-Apr-00 .
1 10 3-Apr-00 110
1 139 30-Oct-98 139
2 30 30-Oct-98 .
2 30 30-Oct-98 .
2 30 30-Oct-98 .
2 30 30-Oct-98 120
2 100 1-Nov-98 .
2 30 1-Nov-98 .
2 30 1-Nov-98 160
....

Thanks so much


maja

Richard Ulrich

unread,
May 18, 2006, 5:10:57 PM5/18/06
to
On 18 May 2006 09:26:36 -0700, maja.g...@gmail.com wrote:

> Hi there,
>
> I was wondering if anyone knows how to do the following: I have 3
> columns, the first one is SSN (patient), the second one is days and the
> third one is date. I would like to sum all days whenever the dates are
> the same and just copy the number of days if there is only one date for
> that row of days. But also, I need to make sure that this only happens
> within the same SSN, i.e. if there are two same dates where one SSN
> ends and the new SSN begins, this should be ignored, i.e. I don't want
> to take the sum of days in this case. Essenitally I would like to get
> column "new_days" as shown. And then delete all the rows that are blank
> for new_days.
>

This looks like you want to Aggregate on SSN and date,
creating the sum of Days as New_days.

>
>
> SSN days date new_days
> 1 30 30-Mar-00 .
> 1 30 30-Mar-00 .
> 1 10 30-Mar-00 .
> 1 30 30-Mar-00 .
> 1 16 30-Mar-00 116
> 1 100 3-Apr-00 .
> 1 10 3-Apr-00 110
> 1 139 30-Oct-98 139
> 2 30 30-Oct-98 .
> 2 30 30-Oct-98 .
> 2 30 30-Oct-98 .
> 2 30 30-Oct-98 120
> 2 100 1-Nov-98 .
> 2 30 1-Nov-98 .
> 2 30 1-Nov-98 160
> ....

--
Rich Ulrich, wpi...@pitt.edu
http://www.pitt.edu/~wpilib/index.html

Bruce Weaver

unread,
May 19, 2006, 9:33:44 AM5/19/06
to

You said you want a "cumulative sum", but your example doesn't show
that. If all you want is the sum total within each SSN x Date
combination, the AGGREGATE method suggested by Rich is the way to go.
If you have a new enough version of SPSS, you can have AGGREGATE write
those sums into the current working data file. Something like this:

sort cases by SSN date.
aggregate
/break = ssn date
/new_days = sum(days)
.

If what you really want is a *running total* (which is how I understand
"cumulative sum"), then something like this would do the trick, I think:

split file by SSN date.
create new_days = csum (days).
exe.
split file off.


--
Bruce Weaver
bwe...@lakeheadu.ca
www.angelfire.com/wv/bwhomedir

0 new messages