I have a dataset that looks like this:
ID ACT TIME_start TIME_end
1 11 1 34
1 15 35 87
1 21 88 110
1 12 111 122
1 19 123 144
2 11 1 22
2 25 23 111
2 22 112 115
2 15 116 144
...
Each ID has a number of observations (not the same number for every
ID). Each observation records an activity and the starting and ending
points of that activity. For each ID, the first activity starts at
TIME_start=1 and the last activity ends at TIME_end=144. The variables
define when an individual started and stopped doing an activity;
the 24 hours of the day are split up into 144 ten-minute periods.
Now I would like a dataset with one row for each ID and 48 variables
named TIME1 to TIME48. The first of these variables, TIME1, should
hold the activity for that ID at the first ten-minute period, TIME2
should hold the activity for that ID at the fourth ten-minute period,
TIME3 should hold the activity for that ID at the seventh ten-minute
period, and so on.
I.e., for ID=1, TIME1 to TIME11 should have the value 11, TIME12 to
TIME29 should have value 15, and so on.
I was thinking that I should create a dataset that looks something
like this
ID TIME ACT
1 1 11
1 2 11
1 3 11
1 4 11
1 5 11
...
1 32 11
1 33 11
1 34 11
1 35 15
1 36 15
...
1 143 19
1 144 19
2 1 11
...
and so on.
And then take every third observation and delete the rest, and then
transpose the dataset. But I am not sure how to do that, so any
suggestions would be welcome. If you have another way to reach the
same result, it would also be very much appreciated.
All the best,
Richard
Does this do what you want?
* --- start of syntax --- .
new file.
dataset close all.
data list list / ID ACT TIME_start TIME_end (4f5.0).
begin data
1 11 1 34
1 15 35 87
1 21 88 110
1 12 111 122
1 19 123 144
2 11 1 22
2 25 23 111
2 22 112 115
2 15 116 144
end data.
* Create TIME1 to TIME144, and populate them.
numeric time1 to time144 (f2.0).
vector t = time1 to time144.
loop #i = TIME_start to TIME_end.
- compute t(#i) = act.
end loop.
exe.
* Use AGGREGATE to reduce everything to one row per person .
aggregate outfile = * /
break = ID /
time1 to time144 = first(time1 to time144) .
* --- end of syntax --- .
--
Bruce Weaver
bwe...@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/Home
"When all else fails, RTFM."
The syntax works great. A minor issue is that I only wanted the
activity every third time period, so I just wanted 48 variables. But I
can just use every third variable, so it is no problem. But just out
of curiosity, would it be possible to solve that directly by changing
the syntax?
Once again, thank you!
Richard
OK, now I understand. I was a bit confused about why I was seeing both
144 and 48 in the original post. Try this.
* Create TIME1 to TIME48, and populate them.
numeric time1 to time48 (f2.0).
vector t = time1 to time48.
loop #i = TIME_start to TIME_end.
- if (mod(#i,3) EQ 0) t(#i/3) = act. /* Note division of #i by 3 .
end loop.
exe.
* Use AGGREGATE to reduce everything to one row per person .
aggregate outfile = * /
break = ID /
time1 to time48 = first(time1 to time48) .
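Note that the MOD(#i,3) EQ 0 test above picks periods 3, 6, ..., 144,
which reproduces the example in the first post (TIME1 to TIME11 = 11
for ID 1). If the sampled periods should instead be 1, 4, 7, ..., 142,
as the description in the first post says, a minimal variant might look
like the sketch below. This is only an assumption about what is wanted:
the EQ 1 test and the (#i + 2)/3 index are not part of the syntax above,
and with them ID 1 would get TIME1 to TIME12 = 11. Run it in place of
the block above, starting from the original data.

```spss
* Variant (assumption): sample periods 1, 4, 7, ..., 142 .
numeric time1 to time48 (f2.0).
vector t = time1 to time48.
loop #i = TIME_start to TIME_end.
* Period 1 goes to TIME1, period 4 to TIME2, ..., period 142 to TIME48 .
- if (mod(#i,3) EQ 1) t((#i + 2)/3) = act.
end loop.
exe.
* Reduce to one row per person, as before .
aggregate outfile = * /
 break = ID /
 time1 to time48 = first(time1 to time48) .
```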
--
Bruce Weaver
bwe...@lakeheadu.ca