In SPSS Statistics, the Frequencies procedure can produce frequency tables, which contain tallies and proportions, as well as two types of graphs appropriate for categorical data: bar charts and pie charts.
Each row can represent one subject, or can represent an observation from a subject. Each column should represent one variable. Variables that will be tabulated using frequency tables should ideally have the following variable properties defined:
Variable Type: The categorical variables in your SPSS dataset can be numeric or string. By default, the rows of the table are arranged in ascending order (for numeric codes) or alphabetically (for string variables).
Value Labels: If you have entered data using numeric codes that represent specific named categories (especially nominal/unordered categories), you should apply value labels to your variables. This can affect the display of the table.
Missing Value Handling: The frequency table will include sections for Valid (non-missing) and Missing responses. Any values recognized as system-missing or user-missing will appear in the Missing section. If you have more than one user-defined missing value code that appears in the data, those codes are tallied separately in the Missing section of the table. (For example, if you have defined the number code -99 as "Refused response" and -88 to represent "Not asked", you will be able to see how many "Refused" and "Not asked" values there were.)
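To make the Valid/Missing split concrete, here is a minimal Python sketch of the tallying logic described above. It is not how SPSS implements the procedure. The response data are made up for illustration, `None` is an assumed stand-in for a system-missing value, and the names `USER_MISSING` and `frequency_table` are hypothetical.

```python
from collections import Counter

# User-missing codes, mirroring the -99 / -88 example in the text.
USER_MISSING = {-99: "Refused response", -88: "Not asked"}

def frequency_table(values):
    """Tally values into separate Valid and Missing sections.

    None stands in for a system-missing value (an assumption of this sketch).
    """
    valid = Counter()
    missing = Counter()
    for v in values:
        if v is None:
            missing["System"] += 1
        elif v in USER_MISSING:
            # Each user-missing code keeps its own row in the Missing section.
            missing[USER_MISSING[v]] += 1
        else:
            valid[v] += 1
    return valid, missing

# Made-up responses, for illustration only.
responses = [1, 2, 2, 3, -99, -88, -88, None, 1, 2]
valid, missing = frequency_table(responses)

total = len(responses)
n_valid = sum(valid.values())

# Valid rows in ascending order of numeric code: count, percent, valid percent.
for code in sorted(valid):
    count = valid[code]
    print(code, count, round(100 * count / total, 1), round(100 * count / n_valid, 1))

# Missing rows: count and percent of all cases.
for label, count in missing.items():
    print(label, count, round(100 * count / total, 1))
```

Keeping the two tallies apart is what lets the percentage of all cases and the percentage of valid cases be computed from different denominators.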
Variable Measurement Levels: The variables' measurement levels should be defined as nominal or ordinal. The Frequencies procedure will still work on variables whose measurement level is set to scale; however, frequency tables should only be used when there are a limited number of response categories.
A Variable(s): The variables to produce Frequencies output for. To include a variable for analysis, double-click on its name to move it to the Variables box. Moving several variables to this box will create several frequency tables at once.
The vast majority of the descriptive statistics available in the Frequencies: Statistics window are never appropriate for nominal variables, and are rarely appropriate for ordinal variables. There are two exceptions to this:
If your categorical variables are coded numerically, it is very easy to misuse measures like the mean and standard deviation. SPSS will compute those statistics if they are requested, whether or not they are meaningful. It is up to the researcher to determine if these measures are appropriate for their data. In general, you should never use any of these statistics for dichotomous variables or nominal variables, and should only use these statistics with caution for ordinal variables.
C Charts: Opens the Frequencies: Charts window, which contains various graphical options. Options include bar charts, pie charts, and histograms. For categorical variables, bar charts and pie charts are appropriate. Histograms should only be used for continuous variables; they should not be used for ordinal variables, and should never be used with nominal variables.
Note that the options in the Chart Values area apply only to bar charts and pie charts. In particular, these options affect whether the labeling for the pie slices or the y-axis of the bar chart uses counts or percentages. This setting will be greyed out if Histograms is selected.
When working with two or more categorical variables, the Multiple Variables option affects only the order of the output. If Compare variables is selected, then the frequency tables for all of the variables will appear first, and all of the graphs for the variables will appear after. If Organize output by variables is selected, then the frequency table and graph for the first variable will appear together; then the frequency table and graph for the second variable will appear together; etc.
E Display frequency tables: When checked, frequency tables will be printed. (This box is checked by default.) If this check box is not checked, no frequency tables will be produced, and the only output will come from supplementary options from Statistics or Charts. For categorical variables, you will usually want to leave this box checked.
Two tables appear in the output: Statistics, which reports the number of missing and nonmissing observations in the dataset, plus any requested statistics; and the frequency table for variable Rank. The table title for the frequency table is determined by the variable's label (or the variable name, if a label is not assigned).
Here, the Statistics table shows that there are 406 valid and 29 missing values. It also shows the Mode statistic: here, the mode value is "1", which is the numeric code for the category Freshman. Notice that the Mode statistic isn't displaying the value labels, even though they have been assigned. (For this reason, we recommend not requesting the mode statistic; instead, determine the mode from the frequency table.)
Notice how the rows are grouped into "Valid" and "Missing" sections. This grouping allows for easy comparison of missing versus nonmissing observations. Note that "System" missing responses are observations that use SPSS's default symbol -- a period (.) -- for indicating missing values. If a user has assigned special codes for missing values in the Variable View window, those codes would appear here.
This issue should not be ignored! This particular issue affects frequency tables created from string variables that use blanks to denote missing values. SPSS does not automatically recognize blank (i.e., empty) strings as missing values, so the blank values appear as one of the "Valid" (i.e., non-missing) categories. This affects the calculation of the Valid Percent columns.
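The effect on Valid Percent can be seen with a small Python sketch. The data and the helper name `valid_percent` are hypothetical; the point is only that counting blanks as a valid category enlarges the denominator.

```python
from collections import Counter

# Made-up string responses; "" is a blank entered where the answer is missing.
answers = ["Bus", "Car", "", "Car", "Walk", "", "Car", "Bus"]

def valid_percent(values, category, blanks_are_missing):
    """Percentage of valid cases that fall in `category`, to one decimal place."""
    if blanks_are_missing:
        valid = [v for v in values if v.strip() != ""]
    else:
        valid = list(values)  # blanks are wrongly treated as a valid category
    return round(100 * Counter(valid)[category] / len(valid), 1)

# Blanks counted as valid: 3 of 8 cases.
print(valid_percent(answers, "Car", blanks_are_missing=False))
# Blanks treated as missing: 3 of 6 cases.
print(valid_percent(answers, "Car", blanks_are_missing=True))
```

The count for "Car" is the same in both calls; only the number of cases treated as valid changes.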
To fix this problem: To get SPSS to recognize blank strings as missing values, you'll need to run the variable through the Automatic Recode procedure. This procedure takes a string variable and converts it to a new, coded numeric variable with value labels attached. During this process, blank string values are recoded to a special missing value code. To see a worked example, see the Automatic Recode tutorial.
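As a rough illustration of what such a recode accomplishes, the Python sketch below converts strings to numeric codes with labels and sends blanks to a separate missing code. It is an analogy, not SPSS's implementation. The function name `auto_recode` is hypothetical, and two details are assumptions of this sketch: that codes are assigned in alphabetical order starting at 1, and that the missing code is one higher than the last category code.

```python
def auto_recode(values):
    """Recode strings to numeric codes; blank strings get a separate missing code."""
    # Distinct non-blank categories, in alphabetical order.
    categories = sorted({v for v in values if v.strip() != ""})
    codes = {cat: i for i, cat in enumerate(categories, start=1)}
    # Assumed convention: the missing code follows the last category code.
    missing_code = len(categories) + 1
    # Value labels map each numeric code back to the original string.
    labels = {code: cat for cat, code in codes.items()}
    recoded = [codes[v] if v.strip() != "" else missing_code for v in values]
    return recoded, labels, missing_code

# Made-up data, for illustration only.
recoded, labels, missing_code = auto_recode(["Bus", "Car", "", "Car", "Walk"])
print(recoded)
print(labels)
print(missing_code)
```

Once the blank has its own code, that code can be declared missing, so the blank no longer counts toward the valid cases.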
The Frequencies procedure is designed to drop unobserved categories from the frequency table: that is, it will not include categories with counts of 0. Although this can be desirable in some cases, it may be actively problematic or misleading in others. For example, if you create a frequency table of a 5-point Likert item or multiple choice question, readers may interpret the omission of categories as the categories not being included in the design of the survey -- which is very different from the categories being present on the survey but not selected by any respondents.
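The difference between the two kinds of table can be sketched in a few lines of Python. The 5-point item, its labels, and the responses below are hypothetical. The first tally lists only the categories that occur in the data, while the second walks the full set of defined categories and reports 0 for any that were never chosen.

```python
from collections import Counter

# Hypothetical 5-point Likert item: every code below was offered on the survey.
VALUE_LABELS = {
    1: "Strongly disagree",
    2: "Disagree",
    3: "Neutral",
    4: "Agree",
    5: "Strongly agree",
}

# Made-up responses in which nobody chose code 1.
responses = [2, 3, 3, 4, 5, 4, 4, 2]

# Tallying only what was observed silently drops the unchosen category.
observed = Counter(responses)
print(sorted(observed))

# Iterating over the defined categories keeps the zero-count row.
full_table = {VALUE_LABELS[code]: observed.get(code, 0) for code in VALUE_LABELS}
for label, count in full_table.items():
    print(label, count)
```

Driving the table from the list of defined categories, rather than from the data, is what makes the zero-count row appear.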
The Custom Tables procedure is included with SPSS Statistics Standard and SPSS Statistics Premium, but is not included in SPSS Statistics Base. If you do not see the Custom Tables procedure in the Analyze menu (Analyze > Tables > Custom Tables), it is possible your license did not include the Custom Tables module.
The default output of Custom Tables includes only the counts. It does not include a total row, nor the number of missing values, nor percentages. However, we see that all 5 categories are present, including the Other category, which is shown with a count of 0:
In this example, Table N% and Table Valid N% are identical, but this will not always be the case. If your variable includes user-missing values and you have enabled the Missing option, then Table N% will differ from Table Valid N%. (In general, "Table Valid N%" in Custom Tables has the same meaning as the "Valid Percent" column of the Frequencies output: it is the proportion based on the number of valid, nonmissing cases.)
The Table Total N% values are based on the total number of cases in the dataset (valid + missing). Recall that the sample dataset has 435 rows, and we know from the Frequencies procedure that variable HowCommute has 247 valid/observed values and 188 missing values (247 valid + 188 missing = 435 total). We can verify that the Table Total N% values are based on the number of rows by performing the divisions ourselves and rounding to one decimal place:
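The same check can be sketched in Python. Only the three totals quoted above (435 rows, 247 valid, 188 missing) come from the text; the per-category counts are not reproduced here, so the hypothetical helper `table_total_pct` is shown on the valid and missing totals. The same division applies to any single category count.

```python
# Totals quoted in the text for variable HowCommute.
TOTAL_ROWS = 435
N_VALID = 247
N_MISSING = 188

def table_total_pct(count, total=TOTAL_ROWS):
    """Percentage of all rows (valid + missing), rounded to one decimal place."""
    return round(100 * count / total, 1)

# The valid and missing counts account for every row in the dataset.
assert N_VALID + N_MISSING == TOTAL_ROWS

print(table_total_pct(N_VALID))    # 56.8
print(table_total_pct(N_MISSING))  # 43.2
```

Because the denominator is the full row count, the valid and missing shares add up to 100%.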
I was showing a client how to use their version of SPSS to do a variety of different things, and when I went to run a logistic regression model, it wasn't there. Apparently, there are several versions of SPSS (I knew this already) and some of the versions do not include logistic regression (that I was surprised to find out). I had to research all the options and offer a recommendation. Here's a quick guide to what I learned by browsing through the SPSS site.
This is the version that includes logistic regression. It also has the generalized linear model, generalized linear mixed model, and generalized estimating equations. If you are doing more than just baby statistics, you will eventually need some of the capabilities in these models.
If you know the specialized procedures you will need, then you can buy just those procedures. The problem is that data analysis takes unexpected turns, so it may make sense to plan for growth into new and more advanced areas of SPSS. Here's my opinion for what it's worth. SPSS Statistics Standard Edition is a nice package for the money. It gives you enough capabilities that you can grow into many of the types of analyses you might need. If you are a serious hard core data analyst, then you go for extra large, the premium edition. You don't get enough from the large version, the professional edition, to justify the increase in cost. I won't go into the SAS versus SPSS debate, but if you are hard core (and have a big enough budget), you should take a look at and compare all your options, and SAS is a serious alternative to SPSS.