Data Analysis Software Packages

Blanche Bunnell

Aug 3, 2024, 5:23:50 PM
to hartfficampin

In 1968 three young men from disparate professional backgrounds developed a software system based on the idea of using statistics to turn raw data into information essential to decision-making. These three innovators were pioneers in their field, visionaries who recognized that data, and how you analyze them, are the driving force behind sound decision-making. This revolutionary statistical software system was called SPSS, which stood for Statistical Package for the Social Sciences. It is now one of the most widely used computer programs for survey analysis. It has gone through many versions; the latest is version 22.0, called IBM SPSS Statistics, or just SPSS for short. This text uses version 19.0. If you have a later or earlier version, the guidelines here should still work. For more information and a free download of a demo version, visit www.spss.com/spss. These guidelines are by no means a full introduction to SPSS. They focus just on the essential procedures you may need for a basic analysis of a dataset. For more detail, refer to Andy Field (2013) Discovering Statistics Using IBM SPSS Statistics, 4th edn. London: Sage.

Before entering any data, it is advisable first to name the variables (if you do not, SPSS will supply exciting names like var00001 and var00002). Names must begin with a letter and must not end with a full stop/period. They must contain no spaces and must not be one of the keywords that SPSS reserves as special computing terms, for example and, not, eq, by, all.
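
The naming rules above can be checked mechanically before any names are typed in. The following is a minimal Python sketch, not part of SPSS: the function name is hypothetical, and the reserved-word set contains only the five examples given in the text, so it should not be read as the complete list.

```python
# Reserved words copied from the examples in the text only;
# this is an assumption, not the complete SPSS list.
RESERVED_EXAMPLES = {"and", "not", "eq", "by", "all"}


def is_valid_variable_name(name: str) -> bool:
    """Check a name against the four rules stated in the text (hypothetical helper)."""
    if not name or not name[0].isalpha():
        return False  # must begin with a letter
    if name.endswith("."):
        return False  # must not end with a full stop/period
    if any(ch.isspace() for ch in name):
        return False  # must contain no spaces
    # must not be one of the reserved keywords (case-insensitive check)
    return name.lower() not in RESERVED_EXAMPLES


for candidate in ["age", "q1.", "job title", "by", "2nd_visit"]:
    print(candidate, is_valid_variable_name(candidate))
```

Only the first candidate passes; each of the others breaks exactly one of the rules.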

To enter variable names, click on the Variable View tab at the bottom left of the Data Editor window. Each variable now occupies a row rather than a column as in Data View. Enter the name of your first variable in the top left box. As soon as you hit Enter or the down arrow or right arrow, the remaining boxes will be filled with default settings, except for Label. It is usually advisable to enter labels, since these will be printed out in your tables and graphs (SPSS will otherwise use the variable names as labels). Labels can be the actual wording of the questions asked or a further explanation of the variable name.

To copy variable information, such as value labels, to another variable, just use Edit/Copy and Paste. SPSS does not have an automatic timed backup facility. You need to save your work regularly as you go along. Use the File/Save sequence as usual for Windows applications. The first time you go to save, you will be given the Save As dialog box. Make sure this indicates the drive you want. File/Exit will get you out of SPSS and back to the Program Manager or Windows desktop. SPSS will ask you if you want to save before exiting if unsaved changes have been made. Always save any changes to your data, but saving output is less important because it can quickly be recreated.

To regroup categories on a categorical variable, you need to use the Recode procedure. From the Menu bar, select Transform/Recode Into Different Variables. From the list of variables, select the variable you wish to regroup and transfer it to the Input Variable -> Output Variable box. Now click on Old and New Values. If, for example, you wish SPSS to add together the frequencies for categories that have been coded as 1 and 2, in the Old Value dialog area on the left, click on the first Range radio button and enter 1 in the first box and 2 in the box after through. In the New Value dialog area on the right, enter the code you wish the new combined category to have and click on Add. This instruction will now be entered into the Old -> New box. Repeat for any other categories you wish to combine. Click on Continue. Give the new Output Variable a name in the Name box, and click on Change then OK. The new variable will appear as the last column. To add value labels for categories of the new variable, change to the Variable View and proceed as above.
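
The logic of the Old -> New list can be sketched outside SPSS. This is a minimal Python illustration, assuming a hypothetical five-category variable called satisfaction; each rule is a (low, high, new code) range, and a value matched by no rule is returned as None, which stands in here for a missing value.

```python
def recode(value, rules, default=None):
    """Return the new code for value, using (low, high, new_code) range rules."""
    for low, high, new_code in rules:
        if low <= value <= high:
            return new_code
    return default  # no rule matched


# Hypothetical data: a five-category variable coded 1 to 5
satisfaction = [1, 2, 3, 5, 4, 2, 1]

# Combine codes 1 and 2 into 1, keep 3 as 2, combine 4 and 5 into 3
rules = [(1, 2, 1), (3, 3, 2), (4, 5, 3)]

# The original list is left untouched, as with Recode Into Different Variables
satisfaction_grouped = [recode(v, rules) for v in satisfaction]
print(satisfaction_grouped)  # [1, 1, 2, 3, 3, 1, 1]
```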

To get SPSS to compute totals from two or more metric variables, select Transform then Compute Variable. You will obtain the Compute Variable dialog box. Notice that there are lots of functions that you could perform on the variables. If all you want to do is get SPSS to add together the numeric values for each variable, highlight the first variable and put it into the Numeric Expression box by clicking on the arrow. Now click on the + button and bring over the next variable, then click on + again, and so on until you have all the variables you wish added together. Enter a variable name in the Target Variable box and click on OK. A new variable will appear in your data matrix, giving the total scores for each case. You can now, if you wish, use Recode to group the responses into, say, high-, medium- and low-score categories.
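
As a rough illustration of what the computed variable contains, the sketch below adds three hypothetical items (q1 to q3) for each case and then groups the totals. The variable names and the cut-offs for the high, medium and low bands are assumptions made for the example, not values from the text.

```python
# Hypothetical data matrix: one dictionary per case
cases = [
    {"q1": 4, "q2": 5, "q3": 3},
    {"q1": 1, "q2": 2, "q3": 1},
    {"q1": 3, "q2": 3, "q3": 2},
]
items = ["q1", "q2", "q3"]


def band(total):
    """Group a total score; the cut-offs are illustrative assumptions."""
    if total >= 11:
        return "high"
    if total >= 7:
        return "medium"
    return "low"


for case in cases:
    # Equivalent of the numeric expression q1 + q2 + q3
    case["total"] = sum(case[item] for item in items)
    # Equivalent of recoding the total into grouped categories
    case["total_band"] = band(case["total"])

for case in cases:
    print(case["total"], case["total_band"])
```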

For a multiple response question, where respondents can select more than one category, each response needs to be treated as a separate variable in which each item is either ticked or not ticked. SPSS then needs to be told to treat these as a single multiple response question. Select Analyze/Multiple Response/Define Sets. Bring the variables across to the Variables in Set box. If a code of 1 was entered for those who had ticked the item, enter 1 in the Counted Value box. Make sure the Dichotomies radio button is clicked under Variables Are Coded As. You will also need to give the new variable a name. Click on the Add button to add the name to the Multiple Response Sets box, then on Close. The new variable, however, does not appear in the data matrix. To access it, click on Analyze/Multiple Response and either Frequencies or Crosstabs depending on whether you want univariate or bivariate analysis.
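
The idea of a counted value across several tick-box variables can be sketched as follows. This is a minimal Python illustration with hypothetical variable names and data; it shows two common bases for percentages (all responses, and cases giving at least one response) and is not a reproduction of SPSS output.

```python
# Hypothetical tick-box data: 1 = ticked, 0 = not ticked
respondents = [
    {"reads_print": 1, "reads_online": 1, "reads_app": 0},
    {"reads_print": 0, "reads_online": 1, "reads_app": 0},
    {"reads_print": 0, "reads_online": 0, "reads_app": 0},
    {"reads_print": 1, "reads_online": 1, "reads_app": 1},
]
set_items = ["reads_print", "reads_online", "reads_app"]
COUNTED_VALUE = 1  # the code that means "ticked"

# Number of respondents who ticked each item
counts = {
    item: sum(1 for r in respondents if r[item] == COUNTED_VALUE)
    for item in set_items
}
total_responses = sum(counts.values())

# Respondents who ticked at least one item in the set
n_cases = sum(
    1 for r in respondents
    if any(r[item] == COUNTED_VALUE for item in set_items)
)

for item in set_items:
    pct_responses = 100 * counts[item] / total_responses
    pct_cases = 100 * counts[item] / n_cases
    print(f"{item}: n={counts[item]}, "
          f"{pct_responses:.1f}% of responses, {pct_cases:.1f}% of cases")
```

Because respondents can tick several items, the percentages of cases can sum to more than 100.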

User-defined missing values are codes that have been entered into the data matrix but that the researcher decides to exclude from the analysis. To create them for any particular variable, from the Variable View select the little blue box in the Missing column against the variable you want and obtain the Missing Values dialog box. This enables you either to pick out particular codes to be treated as missing values by clicking on the Discrete missing values radio button and entering up to three codes, or to select a range of missing values.
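
The effect of declaring discrete missing values can be seen in a small sketch. The codes 98 and 99 and the age data below are hypothetical; the point is only that the codes stay in the data but are left out of the calculation.

```python
# Hypothetical codes declared as missing (for example "refused", "don't know")
MISSING_CODES = {98, 99}

# Hypothetical data: ages as entered, including the missing-value codes
ages = [34, 99, 27, 98, 45, 51]

# The raw data are kept; only the analysis excludes the declared codes
valid_ages = [a for a in ages if a not in MISSING_CODES]

naive_mean = sum(ages) / len(ages)              # distorted by 98 and 99
valid_mean = sum(valid_ages) / len(valid_ages)  # based on valid cases only

print(len(valid_ages), valid_mean)
```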

To obtain univariate frequency tables for categorical variables, you will need the Frequencies procedure. This is in the Analyze/Descriptive Statistics drop-down menu from the menu bar at the top. In the Frequencies dialog box, all variables are listed in the left box. To obtain a frequency count for any variable, simply transfer it to the Variable(s) box by highlighting it, then clicking on the direction button in the middle. To highlight blocks of adjacent variables or all the variables, hold down the shift key when highlighting the first variable, scroll down to the last variable in the block and click again. To change the order of the categories presented in a table, from the Frequencies box select Format and then select the Descending values radio button, then click on Continue and OK.
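
What a univariate frequency table contains can be sketched with Python's collections.Counter. The responses are hypothetical, and the rows here are ordered by descending count, which is an assumption made for the example rather than the ordering option described above.

```python
from collections import Counter

# Hypothetical responses to one categorical question
responses = ["yes", "no", "yes", "unsure", "yes", "no"]

counts = Counter(responses)
n = len(responses)

# One row per category: (category, frequency, per cent), highest count first
table = [
    (category, count, round(100 * count / n, 1))
    for category, count in counts.most_common()
]

for category, count, percent in table:
    print(f"{category:<8}{count:>4}{percent:>8}")
```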

The Frequencies procedure will produce a separate table for each variable entered into the Variable(s) box. To produce a multi-variable table, select Analyze/Tables/Custom Tables. Drag your first variable into the Rows box, then drag the next and subsequent variables to the foot of the lowest table shown. Alternatively, if the variables to be tabled are listed next to one another, highlight them all (by holding down the shift key) before dragging across. If the response categories for a number of variables are all the same and you want a table that sets out the responses as a matrix, create a multi-variable table as above, then under Category Position, select Row Labels in Columns.

Once you have obtained your chart you can edit it by double-clicking in the chart area. This will give you the Chart Editor. You can change the colours and a number of other chart features from the editor. Close it when you have finished. If you single-click on the chart area, you highlight it and it can be copied into other applications.

The Chart Builder can be used to produce both histograms and line graphs for metric variables. You can impose a normal curve on the histogram by clicking in the Display normal curve box in the Element Properties dialog box and clicking on Apply. The normal curve is explained in Chapter 4.

There are two ways of obtaining univariate data summaries for metric variables in SPSS. One is to use the Statistics button in the Frequencies dialog box. Select Analyze/Descriptive Statistics/Frequencies. Put one or more metric variables in the Variable(s) box and then click on the Statistics button. Just put a tick in the box against the statistics you want by clicking with the left mouse button, then Continue and OK. The other procedure is found under Analyze/Descriptive Statistics/Descriptives and gives a quick summary of each variable that includes the minimum and maximum scores and the mean and standard deviation. This is a more useful layout if there are many variables since they are listed by column rather than across the page.
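
The four summaries mentioned for the Descriptives procedure can be reproduced with Python's statistics module. The variables and values below are hypothetical; statistics.stdev gives the sample standard deviation (dividing by n - 1).

```python
import statistics

# Hypothetical metric variables, one list of values per variable
data = {
    "age": [20, 30, 40],
    "visits": [2, 4, 4, 6],
}

summaries = {}
for name, values in data.items():
    summaries[name] = {
        "n": len(values),
        "minimum": min(values),
        "maximum": max(values),
        "mean": statistics.mean(values),
        "std_dev": statistics.stdev(values),  # sample SD, n - 1 in the divisor
    }

# One row per variable, statistics listed in columns
for name, s in summaries.items():
    print(name, s["n"], s["minimum"], s["maximum"],
          round(s["mean"], 2), round(s["std_dev"], 2))
```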

SPSS provides confidence intervals for the estimation of metric variables under Analyze/Descriptive Statistics/Explore. This gives you the Explore dialog box. Put a metric variable in the Dependent List and click on OK. The output provides many different statistical summaries, but the confidence interval for the mean at the 95 per cent level is the default. If you click on the Statistics button you can change the level of confidence, for example to 99 per cent. All these statistics can be split by a number of factors, for example gender of respondent, in which case you get separate tables for each. Just put the variable in the Factor List box.
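
To show what a confidence interval for the mean involves, here is a minimal sketch that uses a normal approximation from Python's standard library. This is an assumption made to keep the example self-contained: an interval based on the t distribution will be wider for small samples, so the figures will not match SPSS output exactly. The function name and data are hypothetical.

```python
import statistics
from math import sqrt
from statistics import NormalDist


def mean_ci_normal(values, confidence=0.95):
    """Normal-approximation confidence interval for the mean (a sketch only)."""
    n = len(values)
    mean = statistics.mean(values)
    std_error = statistics.stdev(values) / sqrt(n)  # standard error of the mean
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    return mean - z * std_error, mean + z * std_error


# Hypothetical scores
scores = [10, 12, 14, 16, 18]

lower95, upper95 = mean_ci_normal(scores, 0.95)
lower99, upper99 = mean_ci_normal(scores, 0.99)
print(round(lower95, 2), round(upper95, 2))
print(round(lower99, 2), round(upper99, 2))
```

Raising the confidence level from 95 to 99 per cent widens the interval around the same mean.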

Unfortunately, one thing that SPSS does not do is calculate the standard error of the proportion, which means that it cannot give you the corresponding confidence intervals for categorical variables. It can, however, test for differences between achieved sample proportions and hypothesized values. Under Analyze/Nonparametric Tests/One Sample, SPSS uses either a one-sample binomial test or a one-sample chi-square test, as appropriate, to assess the p-value for the difference between the sample result and the hypothesized values, which SPSS assumes are equal proportions in each category. To change the hypothesized proportions, select Legacy Dialogs instead of One Sample. Then either select Binomial, if the variable is binary, and change the Test Proportion, or select Chi-square, if the variable is nominal with three or more categories, and change the Expected Values.
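
The standard error of a proportion is simple enough to work out by hand from a frequency table. The sketch below uses the usual large-sample formula, sqrt(p(1 - p)/n), with a normal-approximation interval; the function name and the example figures (120 of 400 respondents) are hypothetical, and the approximation is less reliable for small samples or proportions close to 0 or 1.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Sample proportion, its standard error and a normal-approximation interval."""
    p = successes / n
    std_error = sqrt(p * (1 - p) / n)  # standard error of the proportion
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return p, std_error, (p - z * std_error, p + z * std_error)


# Hypothetical result: 120 of 400 respondents chose a category
p, std_error, (lower, upper) = proportion_ci(120, 400)
print(round(p, 3), round(std_error, 4), round(lower, 3), round(upper, 3))
```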
