On Tue, 10 Mar 2020 11:05:09 -0700 (PDT), aisha.s...@berkeley.edu wrote:
>Hi, I am having a similar issue I think. I would like to remove cases where there is not a Matched ID for both pre and post. I have respondents who have a unique ID. If a respondent has completed a pre and a post, there would be two lines, both having the same IDCODE and then another variable indicating pre or post. (pre=1 and post=2). How would I remove those who don't have a pre and post and matched ID? Thank you! New to SPSS.
Do you know for sure that no ID has more than one Pre
or more than one Post record, duplicated or otherwise?
For a relatively well-composed file, the simplest
cleaning that I can think of is to use AGGREGATE on ID,
ADDing a new variable to each line that has the
Number of cases for the ID; then SELECT to keep
only those with exactly 2 records. Before Selecting,
you can do a Freq to find out whether it is true that
all IDs have either 1 or 2 records, and never more than that.
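
A minimal sketch of that AGGREGATE approach, in case it helps.
ID and PrePost are assumed to be the names of your ID and
pre/post variables; NRecs, MinPP and MaxPP are names I made up
for illustration. MinPP and MaxPP are an extra beyond the plain
count: they let the SELECT also confirm that the two records
are one Pre and one Post, rather than two of the same kind.
Not run against your data.

AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=ID
  /NRecs=N
  /MinPP=MIN(PrePost)
  /MaxPP=MAX(PrePost).
* Check whether every ID has 1 or 2 records and never more.
FREQUENCIES VARIABLES=NRecs.
* Keep only IDs with exactly two records, one Pre (1) and one Post (2).
SELECT IF (NRecs EQ 2 AND MinPP EQ 1 AND MaxPP EQ 2).
EXECUTE.

With MODE=ADDVARIABLES the counts are attached to every line
of the active file, so nothing is dropped until the SELECT.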
For a messier file, where you can't tell in advance what is wrong with it,
you can use the LEAD( ) function as I will describe. LEAD(PrePost)
will return the value of PrePost in the record that comes next,
in contrast to the LAG( ) function that returns the value of
a variable from the previous record.
COMMENT file is sorted by ID and PrePost.
COMMENT - find a pair that make up two proper lines. Save both.
COMPUTE ToUse= 0.
IF ((ID eq LEAD(ID)) and (PrePost eq 1) and (LEAD(PrePost) eq 2)) ToUse= 1.
IF ((ID eq LAG(ID)) and (PrePost eq 2) and (LAG(PrePost) eq 1)) ToUse= 1.
SELECT IF ToUse=1.
Save Outfile blah blah blah.
If you are using a really old version of SPSS, LEAD( ) might not
be available. In that case, you could use the LAG( ) line as above,
then re-Sort the file so that it is in Descending order on PrePost (within ID),
then use the LEAD( ) line with each LEAD re-written as LAG( ).
Then do the Select.
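
Spelled out, the LAG-only version might look like the sketch
below. It assumes the same variable names as above (ID, PrePost,
ToUse) and relies only on LAG( ), so it does not depend on
whether your release accepts LEAD( ) inside an IF.

SORT CASES BY ID (A) PrePost (A).
COMPUTE ToUse = 0.
* Flag a Post record that directly follows a Pre record for the same ID.
IF ((ID EQ LAG(ID)) AND (PrePost EQ 2) AND (LAG(PrePost) EQ 1)) ToUse = 1.
* Reverse PrePost within ID so that each Pre record now follows its Post.
SORT CASES BY ID (A) PrePost (D).
* Flag a Pre record that directly follows a Post record for the same ID.
IF ((ID EQ LAG(ID)) AND (PrePost EQ 1) AND (LAG(PrePost) EQ 2)) ToUse = 1.
SELECT IF (ToUse EQ 1).
* Put the file back in the original order.
SORT CASES BY ID (A) PrePost (A).
EXECUTE.

On the first case of the file LAG( ) returns missing, so the
IF is not carried out there and ToUse keeps its initial 0.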
Totally untested.
--
Rich Ulrich