Syntax question: operations on individual cases (not the whole row or column)?

aimoux

Nov 28, 2009, 12:59:11 PM

Dear all,

I am new to SPSS. How can I access individual cases and do simple
calculations on them? For instance, how do I calculate the sum of the
1st element in column 1 (variable 1) and the 2nd element in column 2
(variable 2)? Can I refer to individual cases by their index?

Thank you very much!!

Bruce Weaver

Nov 28, 2009, 1:44:57 PM

aimoux wrote:
> Dear all,
>
> I am new to SPSS. I am wondering how I can access individual cases and
> do simple calculations? For instance, how to calculate the sum of the
> 1st element in column 1 (variable 1) and the 2nd element in column 2
> (variable 2)?

Either of the following will do what you want:

compute sum12 = var1 + var2.
compute sum12 = sum(var1,var2).

Look up COMPUTE in the Help for a list of available functions.

> Can I declare individual cases by their index?
>
> Thank you very much!!

Look up $CASENUM.
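
As a minimal, untested sketch of how $CASENUM and LAG might be combined to address single cases: it assumes the variables are named var1 and var2, and prev1 and pairsum are made-up names used only for illustration. It puts the sum of var1 from case 1 and var2 from case 2 into case 2.

```spss
* prev1 holds the value of var1 from the previous case (missing for case 1).
compute prev1 = lag(var1).
* Only case 2 gets a value: var1 from case 1 plus var2 from case 2.
if ($casenum = 2) pairsum = prev1 + var2.
execute.
```

All other cases are left system-missing on pairsum.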

--
Bruce Weaver
bwe...@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."

aimoux

Nov 28, 2009, 3:44:57 PM

On Nov 28, 1:44 pm, Bruce Weaver <bwea...@lakeheadu.ca> wrote:
> aimoux wrote:
> > Dear all,
>
> > I am new to SPSS. I am wondering how I can access individual cases and
> > do simple calculations? For instance, how to calculate the sum of the
> > 1st element in column 1 (variable 1) and the 2nd element in column 2
> > (variable 2)?
>
> Either of the following will do what you want:
>
> compute sum12 = var1 + var2.
> compute sum12 = sum(var1,var2).
>
> Look up COMPUTE in the Help for a list of available functions.
>
> > Can I declare individual cases by their index?
>
> > Thank you very much!!
>
> Look up $CASENUM.
>
> --
> Bruce Weaver
> bwea...@lakeheadu.ca
> http://sites.google.com/a/lakeheadu.ca/bweaver/
> "When all else fails, RTFM."

Thank you very much Bruce.

I guess I am just wondering whether there is a way to do something
like variable[1], variable[2], as you would in other programming
languages?

How do I calculate the sum of the 1st and 2nd elements of a variable? In
other languages, you can simply write variable[1] + variable[2].

Bruce Weaver

Nov 28, 2009, 4:53:49 PM

Ah, OK. Look up DO REPEAT and LOOP. The latter is often used in
connection with VECTOR. There are examples in the online help,
and in the Command Syntax Reference (available via the Help menu).
I suspect there are examples on Raynald's website too
(www.spsstools.net).
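
For illustration, an untested sketch of DO REPEAT, assuming three existing variables var1 to var3 and three hypothetical new names dbl1 to dbl3. The stand-in names x and y are replaced by each listed variable in turn, so this works across variables within a case, not across cases.

```spss
* x stands for var1, var2, var3 in turn and y for dbl1, dbl2, dbl3.
do repeat x = var1 var2 var3 / y = dbl1 dbl2 dbl3.
compute y = x * 2.
end repeat.
execute.
```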

--
Bruce Weaver
bwe...@lakeheadu.ca

Rich Ulrich

Nov 28, 2009, 4:54:32 PM

On Sat, 28 Nov 2009 12:44:57 -0800 (PST), aimoux <xw2...@gmail.com>
wrote:

I guess you mean that you want to use something with indexes,
instead of simply putting in the unique names? Bruce showed how
to do the latter.

SPSS lets you declare that a set of contiguous variables comprise
a Vector, which has an arbitrary name. That name might or might
not be similar to the original name(s). Then you can use that name
with regular parentheses to index a value from 1 to N.

Or you can define a new set of variables as a vector: VarX(20)
specifies that there will be 20 new variables in a set, and it further
allows you to refer to the 15th new variable, say, as either VarX15
or VarX(15).

To avoid confusion, read the Help. And keep in mind that a
vector definition is temporary, lasting through the next procedure.
(I think that is the part that confused me for a while.)
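
A minimal, untested sketch of both forms described above. It assumes contiguous variables q1 to q5 already exist in the file; the names q, first2 and VarX are chosen only for illustration. Every reference is to a variable within the current case (row).

```spss
* Assumes contiguous variables q1 to q5 already exist in the file.
vector q = q1 to q5.
* q(1) and q(2) are just other names for q1 and q2 here.
compute first2 = q(1) + q(2).
* The short form creates three new variables: VarX1, VarX2, VarX3.
vector VarX(3).
loop #i = 1 to 3.
compute VarX(#i) = q(#i) * 10.
end loop.
execute.
```

The vector names q and VarX stop being usable after the EXECUTE, but the variables VarX1 to VarX3 remain in the file.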

--
Rich Ulrich

aimoux

Nov 28, 2009, 5:07:23 PM

Thank you Rich and Bruce!!!

Wow, that sounds like a real pain. But this vector thing seems to be my
best shot.

My task is: I have a variable X with some missing values, say X(5),
and I want to interpolate between X(4) and X(6). So it seems that I
have to create a vector(200), pick out all the missing values, and do
v(5)=(v(6)+v(4))/2, v8=(v9+v7)/2, etc.

Is there a simpler way to achieve this?

A matrix?

Happy holiday!

Rich Ulrich

Nov 29, 2009, 8:07:22 PM

On Sat, 28 Nov 2009 14:07:23 -0800 (PST), aimoux <xw2...@gmail.com>
wrote:

>On Nov 28, 4:54 pm, Rich Ulrich <rich.ulr...@comcast.net> wrote:
>> On Sat, 28 Nov 2009 12:44:57 -0800 (PST), aimoux <xw2...@gmail.com>
>> wrote:

[snip, previous]

>
>Thank you Rich and Bruce!!!
>
>wow. sounds like a real pain. But this vector thing seems to be my
>best shot.
>
>My task is: I have a variable X with some missing values, say X(5).
>And I want to interpolate between X(4) and X(6). So it seems that I
>have to create a vector(200), pick out all the missing values, and do
>v(5)=(v(6)+v(4))/2, v8=(v9+v7)/2, etc.
>
>Is there a simpler way to achieve this?
>
>A matrix?

Loop #i= 2, 199 .
if miss( v(#i) ) v(#i)= mean.2( v(#i-1), v(#i+1) ).
end loop.

The algorithm runs from 2..199 since it is not defined
for 1 or 200.

The "miss( )" function is used in the obvious way.

The "mean.x( )" function returns Missing unless at least "x" of its
arguments are non-missing.

"#" at the start of a variable name makes it a scratch (temporary)
variable, which is not saved in any permanent file.

--
Rich Ulrich

Bruce Weaver

Nov 29, 2009, 8:41:05 PM

Rich Ulrich wrote:
> On Sat, 28 Nov 2009 14:07:23 -0800 (PST), aimoux <xw2...@gmail.com>
> wrote:
>
>> On Nov 28, 4:54 pm, Rich Ulrich <rich.ulr...@comcast.net> wrote:
>>> On Sat, 28 Nov 2009 12:44:57 -0800 (PST), aimoux <xw2...@gmail.com>
>>> wrote:
> [snip, previous]
>> Thank you Rich and Bruce!!!
>>
>> wow. sounds like a real pain. But this vector thing seems to be my
>> best shot.
>>
>> My task is: I have a variable X with some missing values, say X(5).
>> And I want to interpolate between X(4) and X(6). So it seems that I
>> have to create a vector(200), pick out all the missing values, and do
>> v(5)=(v(6)+v(4))/2, v8=(v9+v7)/2, etc.
>>
>> Is there a simpler way to achieve this?
>>
>> A matrix?
>
> Loop #i= 2, 199 .

I think that needs to be Loop #i = 2 to 199.

> if miss( v(#i) ) v(#i)= mean.2( v(#i-1), v(#i+1) ).
> end loop.
>
> The algorithm runs from 2..199 since it is not defined
> for 1 or 200.
>
> The "miss( )" function is used in the obvious way.
>
> The "mean.x( )" will be set to Missing if there are not at least
> "x" values that are not missing.
>
> "#" at the start of a variable name makes it a Temporary variable
> which is not saved in any permanent file.
>
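
Putting the "to" correction together with the VECTOR declaration that the loop relies on, a complete version might look like the following. This is an untested sketch, and it assumes 200 contiguous variables named v1 to v200.

```spss
* Declare the vector so that v(#i) refers to v1 to v200.
vector v = v1 to v200.
loop #i = 2 to 199.
if missing(v(#i)) v(#i) = mean.2(v(#i-1), v(#i+1)).
end loop.
execute.
```

If two or more neighbouring values are missing, mean.2 returns missing for each of them, so runs of missing values are left as they are.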

--
Bruce Weaver
bwe...@lakeheadu.ca

Antti Nevanlinna

Nov 30, 2009, 9:56:43 AM

aimoux wrote:
> On Nov 28, 4:54 pm, Rich Ulrich <rich.ulr...@comcast.net> wrote:
>> On Sat, 28 Nov 2009 12:44:57 -0800 (PST), aimoux <xw2...@gmail.com>
>> wrote:

>> ...

>
> My task is: I have a variable X with some missing values, say X(5).
> And I want to interpolate between X(4) and X(6). So it seems that I
> have to create a vector(200), pick out all the missing values, and do
> v(5)=(v(6)+v(4))/2, v8=(v9+v7)/2, etc.
>
> Is there a simpler way to achieve this?
>

create
/lag_x = lag(x,1)
/lead_x = lead(x,1).
do if (missing(x)).
+ compute v = mean.2(lag_x,lead_x).
else.
+ compute v = x.
end if.
execute.

Antti Nevanlinna

aimoux

Nov 30, 2009, 4:06:23 PM

Thank you for all your help, guys.

My real job is more complicated than what I've described, since I have
different interpolation methods for different missing values. And it
seems the only way to do it is by creating a new matrix.

Thanks again!

Bruce Weaver

Nov 30, 2009, 4:28:54 PM

On Nov 30, 4:06 pm, aimoux <xw2...@gmail.com> wrote:
> Thank you for all your help, guys.
>
> My real job is more complicated than what I've described since I have
> different interpolation methods for different missing values. And it
> seems the only way to do is by creating a new matrix.
>
> Thanks again!

Have you looked at the RMV (Replace Missing Values) command? I don't
have any experience with it, but it could already have routines for
what you want to do.
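
For illustration only, an untested sketch of RMV with linear interpolation. It assumes the variable is named x, and x_lint is a made-up name for the new series; the exact options should be checked in the Command Syntax Reference. For a single missing value between two valid neighbours, linear interpolation gives the same result as the mean of the two.

```spss
* LINT replaces missing values by linear interpolation between valid neighbours.
rmv /x_lint = lint(x).
list variables = x x_lint.
```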

I should add that given the current state of missing data methods,
substituting the mean of surrounding values is not generally
considered a very good approach. So you might want to consider some
better alternatives. If you have v17 or later, you could probably use
multiple imputation, for example.

--
Bruce Weaver
bwe...@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/Home

aimoux

Dec 2, 2009, 4:29:30 PM

Thank you for your suggestion. I will check that.

Now another question, is there a neat way to create new variables
like:

raw data:

case no.  year begin  year end
1         1           5
2         2           3
3         2           4

I want to create a new variable that fills the gap between years, like
this:

case no.  year begin  year end  new variable
1         1           5         1
1         1           5         2
1         1           5         3
1         1           5         4
1         1           5         5
2         2           3         2
2         2           3         3
3         2           4         2
3         2           4         3
3         2           4         4

Thank you in advance!

Bruce Weaver

Dec 2, 2009, 8:41:21 PM

aimoux wrote:
> Thank you for your suggestion. I will check that.
>
> Now another question, is there a neat way to create new variables
> like:
>
> raw data:
>
> case no.  year begin  year end
> 1         1           5
> 2         2           3
> 3         2           4
>
>
> i want to create a new variable that fills the gap between years like
> this:
>
> case no.  year begin  year end  new variable
> 1         1           5         1
> 1         1           5         2
> 1         1           5         3
> 1         1           5         4
> 1         1           5         5
> 2         2           3         2
> 2         2           3         3
> 3         2           4         2
> 3         2           4         3
> 3         2           4         4
>
> Thank you in advance!
>

I'm on a machine without SPSS again, so I can't test. But I think
something like this might work.

dataset declare newdat.
do repeat newvar = yrbegin to yrend.
- xsave outfile = 'newdat' / keep = casenum yrbegin yrend newvar.
end repeat.
exe.

dataset activate newdat window = front.

--
Bruce Weaver
bwe...@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

Bruce Weaver

Dec 2, 2009, 9:43:39 PM

Upon reflection, I think you'll have to use LOOP, not DO REPEAT.
(The DO REPEAT example above will only loop twice, and will treat
NEWVAR as YRBEGIN the first time through, and as YREND the second
time.)

dataset declare newdat.
LOOP newvar = yrbegin to yrend.
- xsave outfile = 'newdat' / keep = casenum yrbegin yrend newvar.
END LOOP.
exe.

Bruce Weaver

Dec 3, 2009, 9:07:30 AM

On Dec 2, 9:43 pm, Bruce Weaver <bwea...@lakeheadu.ca> wrote:
> Bruce Weaver wrote:

> Upon reflection, I think you'll have to use LOOP, not DO REPEAT.
> (The DO REPEAT example above will only loop twice, and will treat
> NEWVAR as YRBEGIN the first time through, and as YREND the second
> time.)
>
> dataset declare newdat.
> LOOP newvar = yrbegin to yrend.
> - xsave outfile = 'newdat' / keep = casenum yrbegin yrend newvar.
> END LOOP.
> exe.
>

I just discovered that XSAVE doesn't seem to like dataset names on the
OUTFILE subcommand. But it works if you give an actual file
specification, like this:

new file.
dataset close all.

data list list / casenum yrbegin yrend (3f2.0) .
begin data
1 1 5
2 2 3
3 2 4
end data.
dataset name file1.

LOOP newvar = yrbegin to yrend.
- xsave outfile = "C:\temp\file2.sav" /
   keep = casenum yrbegin yrend newvar.
END LOOP.
exe.

get file = "C:\temp\file2.sav".
dataset name file2.
dataset activate file2.
list.

Here's the output:

casenum  yrbegin  yrend  newvar

      1        1      5       1
      1        1      5       2
      1        1      5       3
      1        1      5       4
      1        1      5       5
      2        2      3       2
      2        2      3       3
      3        2      4       2
      3        2      4       3
      3        2      4       4

Number of cases read:  10    Number of cases listed:  10

--
Bruce Weaver
bwe...@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/Home