Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Unstacking SPSS Columns

586 views
Skip to first unread message

timal...@gmail.com

unread,
Aug 8, 2007, 5:26:18 AM8/8/07
to
I have a difficulty with SPSS 14 that doubtless has a simple answer,
but I'll be darned if I can find it, even after looking through every
SPSS FAQ I could stand to read. I'm familiar with Minitab's Unstack
capability, which neatly cuts a column with two or more groups of
values into separate pieces and copies them into two or more new
columns, allowing direct comparison. I tried SPSS's Recode command
(entirely in the GUI) and found that while SPSS would indeed copy and
paste into new variables, it would also copy missing value cells,
giving me essentially one new variable that starts at row 1, and
another variable that starts at a far lower row. Doing crosstab
between the variables therefore resulted in an empty table. I've
scoured the help file and other sources, but nothing about Recode
mentions this problem. No other command seems to be available for
this. Google searches have come up empty. How is it solved in SPSS?

Bruce Weaver

unread,
Aug 8, 2007, 11:15:48 AM8/8/07
to

If I understand what you're asking, the CASESTOVARS command is what
you're after. You can generate syntax for it via the Data ->
Restructure dialog.

--
Bruce Weaver
E-mail: bwe...@lakeheadu.ca
Web: www.angelfire.com/wv/bwhomedir

Richard Ulrich

unread,
Aug 8, 2007, 9:08:22 PM8/8/07
to
On Wed, 08 Aug 2007 02:26:18 -0700, timal...@gmail.com wrote:

> I have a difficulty with SPSS 14 that doubtless has a simple answer,
> but I'll be darned if I can find it, even after looking through every
> SPSS FAQ I could stand to read. I'm familiar with Minitab's Unstack
> capability, which neatly cuts a column with two or more groups of
> values into separate pieces and copies them into two or more new
> columns, allowing direct comparison. I tried SPSS's Recode command

[snip, about cut-and-paste, etc.]

Bruce assumes that you are referring to "paired" scores,
or something with multiple values per group.

However, it sounds to me like you misunderstand the
basic organization of data-base files, in SPSS or elsewhere,
and are trying to replicate some sort of arbitrary spreadsheet
behavior that lets you compare data in sets of cells....

If you have distinct two groups that are presently contained
in one column and you want to compare them, well, they
BELONG in one column. What you need is a group-id --
a column that has a 1 for group 1, 2 for group 2, and so on.

Then you use a t-test on groups, specifying the Variable.

I don't use Minitab, so maybe I am guessing wrong about
your problem.

--
Rich Ulrich, wpi...@pitt.edu
http://www.pitt.edu/~wpilib/index.html

Tim

unread,
Aug 9, 2007, 4:30:28 AM8/9/07
to
On Aug 8, 9:08 pm, Richard Ulrich <Rich.Ulr...@comcast.net> wrote:

I know it sounds counterintuitive to want to break up a single column,
and I *do* have a column ID - two of them, in fact. The problem is
that crosstabs has no way I can find to specify an ID so as to compare
subgroups within a column. I'd be perfectly happy to leave them in one
column, if there were a way to do it with crosstabs. Minitab solves it
by essentially providing spreadsheet behavior and unstacking the
column, so that the values can be compared in adjacent columns. There
are times when this is perfectly acceptable, and even very useful.

Tim

unread,
Aug 9, 2007, 4:52:32 AM8/9/07
to

>
> If I understand what you're asking, the CASESTOVARS command is what
> you're after. You can generate syntax for it via the Data ->
> Restructure dialog.
>
> --
> Bruce Weaver
> E-mail: bwea...@lakeheadu.ca
> Web:www.angelfire.com/wv/bwhomedir

I had a look at the CASEVARS command, and it isn't really what I'm
after. CASEVARS rearranges the data such that multiple observations
are assembled more conveniently across columns. I can see where this
could be useful, but it's not my situation.

Let's say that in a marketing research session I set up five blenders.
I invite in 150 people over the course of a day to experience the
blenders, and I have each person answer a number of inquiries. For
blender A, a single inquiry might be an attitude survey statement
"This blender is easy to turn on and off". There is a 5-point Likert-
type scale. All 150 answer. I now have a single column of 150 answers
for P1. The next inquiry might be "I can see this blender fitting into
my decor". Another 5-point Likert-type scale. All 150 answer and
column P2 is assembled. Other inquiries follow.

Then I do a split-half arrangement with a single blender, because I'm
interested especially in how people will react to a new easy-load
feature in one particular blender. For the morning subjects I have the
old feature in place on blender C, and in the afternoon the new
feature on that same blender. Now for the inquiry "This blender is
easy to load" I have two sets of data - one for the new feature, one
for the old, in the same column. I know which is which, because of a
separate index column that has a 0 in each row without the new
feature, and a 1 in each row with the new feature. I'm analyzing with
a crosstab, because the data is ordinal and heavily skewed, and there
is little of it to work with. I click through to crosstabs, attempt to
tell SPSS that the column P19 has both the rows and columns, and
suddenly I find I can't tell SPSS any such thing.

Okay, I regroup. Perhaps SPSS expects, like Minitab, to compare only
columns. Minitab will unstack a column, so I can now have P19New and
P19Old, and away I go. But in SPSS, recoding produces two columns, but
with missing values in the rows that did not conform to the filter
attributes. That puts the columns P19New and P19Old adjacent to one
another, but P19Old has values starting at row 1, while P19New has
values starting at 75. Do a crosstab, and it comes up empty, because
it doesn't ignore null or missing values. Now I'm stymied and looking
for assistance. Extensive Google search. Much coffee consumed. Nada.
Recourse to this group.

peter m sopp

unread,
Aug 9, 2007, 6:06:58 AM8/9/07
to
Tim schrieb:
Still confusing. Why you want to crosstabulate one variable with itself
- it doesn't make any sense. For crosstabulation you use two variables
(or more).
I'll try an answer:
Your data looks like this:
ID Group Item (1-5)
1 1 1
2 2 5
3 1 3
4 1 3
5 2 2
6 2 4
... (and so on)

CROSSTABLES Item BY Group. (or some more BYs) - and then: have a look at
the output window.
(See help for more information and examples)

GROUP
ITEM 1 2 TOTAL
1 1 0 1
2 0 1 1
3 2 0 2
4 0 1 1
5 0 1 1
TOTAL 3 3 6

Other way: Split file Layered and then frequencies.
Other way: Customer Tables (for more sophisticated tabulations)

Hope this is helpful.
Peter

Bruce Weaver

unread,
Aug 9, 2007, 8:10:50 AM8/9/07
to

After reading your clarification, I think Peter is on the right track
here. Here is an example of how to use CROSSTABS:

http://courses.washington.edu/stat217/218tutorial5.html

Substitute your 0/1 (feature present vs absent) variable for Gender
and your 5-point rating for Dxstatus.

--
Bruce Weaver
bwe...@lakeheadu.ca

Richard Ulrich

unread,
Aug 9, 2007, 9:02:31 PM8/9/07
to
On Thu, 09 Aug 2007 01:30:28 -0700, Tim <tima...@sbcglobal.net>
wrote:

[break]

I guess I figured it. There have been a couple of other posts
by now that fit details to your problem. You also describe,
elsewhere, that you do have a column or variable that says
which testing-group the response belongs to.

I described how to compare the means -- the grouping variable
is used for "groups", and the scaled response is the criterion.

If you want to see the 2xk table, you simply use
CROSSTABS TABLES= item by grp ....


> I'd be perfectly happy to leave them in one
> column, if there were a way to do it with crosstabs. Minitab solves it
> by essentially providing spreadsheet behavior and unstacking the
> column, so that the values can be compared in adjacent columns. There
> are times when this is perfectly acceptable, and even very useful.

It is mainly acceptable and useful when there is only a small
amount of data, and when the processes and procedures do not
need to be documented. - SPSS allows you to 'log' the procedures
and options that are invoked. Even better, it allows a professional
style of production, where the syntax is 'pasted' to a file where
from where it will be executed, and it can be preserved (and, if
desired, re-used), in addition to being logged with the output.

Tim

unread,
Aug 12, 2007, 7:59:45 AM8/12/07
to
I think I see where the confusion around my question is coming from.

The dataset I used has dozens of columns, but only a part of it was
originally provided to me. Of that provided, there was an index
column, but it wasn't a binary index. It could be sorted only with
variable <= 5 being one condition and variable >5 being the other. The
possible count was 9, so the n was obviously unbalanced.

Now, when I try to do crosstabs using such an index, I can't find any
way to compare two parts of a single column if the index has to be a
calculated condition. I could sort and and split, but neither solution
was easy or elegant. I could also copy data from split columns and do
the comparison after a paste. Recoding the variable into new variables
didn't solve the problem, because recoding carried over all the
missing values.

The whole situation was finally solved when I was provided with a
dataset that had a column of a truly binary index.


Richard Ulrich

unread,
Aug 12, 2007, 9:32:43 PM8/12/07
to
On Sun, 12 Aug 2007 04:59:45 -0700, Tim <tima...@sbcglobal.net>
wrote:

> I think I see where the confusion around my question is coming from.
>
> The dataset I used has dozens of columns, but only a part of it was
> originally provided to me. Of that provided, there was an index
> column, but it wasn't a binary index. It could be sorted only with
> variable <= 5 being one condition and variable >5 being the other. The
> possible count was 9, so the n was obviously unbalanced.
>
> Now, when I try to do crosstabs using such an index, I can't find any
> way to compare two parts of a single column if the index has to be a

> calculated condition. [break]

For me, creating a new variable is awkward in a spreadsheet,
because I don't know the syntax for defining cells.
I think it is more transparent in SPSS.
One way --
Compute newvar= trunc(oldvar/5.5) .
Or, another way, using recode,
Recode oldvar(1 thru 5=1)(6 thru 9=2)(else= sysmis) into newvar.
Then,
crosstabs tables= itemx by newvar /cell= count /statistics .

> I could sort and and split, but neither solution
> was easy or elegant. I could also copy data from split columns and do
> the comparison after a paste. Recoding the variable into new variables
> didn't solve the problem, because recoding carried over all the
> missing values.

Huh? You do what you want with missing values in recode,
so this is just wrong.

>
> The whole situation was finally solved when I was provided with a
> dataset that had a column of a truly binary index.

Perhaps you want to run through a tutorial or two....

Tim

unread,
Aug 12, 2007, 10:30:56 PM8/12/07
to
On Aug 12, 9:32 pm, Richard Ulrich <Rich.Ulr...@comcast.net> wrote:
> On Sun, 12 Aug 2007 04:59:45 -0700, Tim <timal...@sbcglobal.net>

Perhaps you could enlighten me as to how I might recode my single
column into two mutually exclusive variables without missing values.
Because what I saw when I tried various combinations was something
like this:

Variable 1
4
5
4
5
4
3

Variable 2

.
.
.
.
.
.
4
5
3
4
5

This is not going to produce a crosstab with values in the cells.

I tried recoding with all system/missing values as 99, but to no
avail. It still produced cells with no content, and hence no crosstab.
I specifically told SPSS to keep all values the same except for
missing, but split the variable column according to an index column
which has 0 for one condition and 1 for the other (something I didn't
have in my first analytical go-round). It still insisted on adding
missing cells, not a value I could filter out. ELSE --> 99 didn't
work, and neither did MISSING --> 99.

Believe me, before posting this I googled extensively and read the
help file. No tutorial helped. I could find literally nothing directly
on point, nothing that produced exactly the condition I'm describing.
Every tutorial assumed that I'd want to do something different than
this, and in any case each tutorial showed only ideal conditions, such
a column with neat 1

Tim

unread,
Aug 12, 2007, 10:32:37 PM8/12/07
to

Damn. Mistaken keyboard combination late at night posted before I was
ready.

As I was saying, such as a column with neat 1s and 0s. Nothing told me
how to handle the situation when the projected result didn't happen as
depicted.

Neila

unread,
Aug 12, 2007, 11:43:53 PM8/12/07
to
Perhaps a CLEAR example of what your data look like and the desired
outcome?

Richard Ulrich

unread,
Aug 13, 2007, 9:13:56 PM8/13/07
to
On Sun, 12 Aug 2007 19:32:37 -0700, Tim <tima...@sbcglobal.net>
wrote:

[snip, much previous]


> >
> > Perhaps you could enlighten me as to how I might recode my single
> > column into two mutually exclusive variables without missing values.
> > Because what I saw when I tried various combinations was something
> > like this:
> >
> > Variable 1
> > 4
> > 5
> > 4
> > 5
> > 4
> > 3
> >
> > Variable 2
> >
> > .
> > .
> > .
> > .
> > .
> > .
> > 4
> > 5
> > 3
> > 4
> > 5
> >
> > This is not going to produce a crosstab with values in the cells.

You asserted once, or I thought that you did, that
your data looked more like this, with an outcome and
a GROUP variable. One problem about the GROUP
variable is that its values were 1-9, and you needed 1-5, 6-9.

Outcome Group
4 1
4 2
1 8
5 9
2 2
...

For this, you "recode Group(1 thru 5=1)(6 thru 9=2) into grp2"

and ask for "crosstabs Table= Outcome by Grp2" .

As I mentioned before, you do seem to lack any
basic orientation to using any database. Crosstabs is
one of the first lessons. SPSS has a tutorial; try it.
A couple of hours of practice should prevent days of
mis-conceived experimentation.


--
Rich Ulrich, wpilib@pitt.
http://www.pitt.edu/~wpilib/index.html

0 new messages