CROSSTABS

Statistical Package for the Social Sciences

CROSSTABS is an SPSS procedure that CROSS-TABULATES two variables, thus displaying their relationship in tabular form. While FREQUENCIES is a useful procedure for summarizing information about one variable, CROSSTABS generates information about bivariate relationships.
Because CROSSTABS creates a row for each value in one variable and a column for each value in the other, the procedure is not suitable for continuous variables that assume many values. CROSSTABS is designed for discrete variables--usually those measured on nominal or ordinal scales.

CROSSTABS creates a table that contains a cell for every combination of the categories in the two variables. Inside each cell is the number of cases that fit that particular combination of responses. SPSS can also report the row, column, and total percentages for each cell of the table.
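
The counting step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how SPSS itself is implemented; the `cases` list and its category labels are made up for the example.

```python
from collections import Counter

# Hypothetical cases: each tuple is (row variable value, column variable value).
cases = [
    ("INCREASED", "EAST"), ("DECREASED", "EAST"), ("DECREASED", "WEST"),
    ("INCREASED", "WEST"), ("DECREASED", "EAST"), ("INCREASED", "WEST"),
]

# One cell per combination of categories; each cell holds a count of cases.
cells = Counter(cases)

row_values = sorted({r for r, _ in cases})
col_values = sorted({c for _, c in cases})

# Print the table with the column variable across the top.
print(" " * 12 + "".join(f"{c:>8}" for c in col_values))
for r in row_values:
    print(f"{r:<12}" + "".join(f"{cells[(r, c)]:>8}" for c in col_values))
```

A combination that never occurs in the data simply gets a count of 0, because `Counter` returns 0 for missing keys.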

This is an example of CROSSTABS output from SPSS:

- - - - - -  C R O S S T A B U L A T I O N   O F  - - -
CDBG8089   NET CHANGE IN CDBG FUNDS, 1980-89   BY
REGION     CENSUS REGIONS

            COUNT   I
            ROW PCT I NORTH    NORTH    SOUTH    WEST     PUERTO   ROW
            COL PCT I EAST     CENTRAL                    RICO     TOTAL
            TOT PCT I      1 I      2 I      3 I      4 I      5 I
CDBG8089    --------+--------+--------+--------+--------+--------+
       1            I     38 I     33 I     51 I     61 I      4 I   187
  INCREASED         I   20.3 I   17.6 I   27.3 I   32.6 I    2.1 I  23.0
                    I   19.3 I   16.2 I   23.4 I   33.7 I   33.3 I
                    I    4.7 I    4.1 I    6.3 I    7.5 I     .5 I
                    +--------+--------+--------+--------+--------+
       3            I    159 I    171 I    167 I    120 I      8 I   625
  DECREASED         I   25.4 I   27.4 I   26.7 I   19.2 I    1.3 I  77.0
                    I   80.7 I   83.8 I   76.6 I   66.3 I   66.7 I
                    I   19.6 I   21.1 I   20.6 I   14.8 I    1.0 I
                    +--------+--------+--------+--------+--------+
            COLUMN       197      204      218      181       12     812
             TOTAL      24.3     25.1     26.8     22.3      1.5   100.0
NUMBER OF MISSING OBSERVATIONS = 49
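
To see where the three percentages in a cell come from, the following Python sketch recomputes them for the INCREASED / NORTH EAST cell. The counts are copied from the output above; the variable names are chosen only for this illustration.

```python
# Counts copied from the sample CROSSTABS output.
count = 38          # cases that INCREASED and are in the NORTH EAST
row_total = 187     # all INCREASED cases
col_total = 197     # all NORTH EAST cases
grand_total = 812   # all valid (non-missing) cases

row_pct = round(100 * count / row_total, 1)    # ROW PCT: share of the row
col_pct = round(100 * count / col_total, 1)    # COL PCT: share of the column
tot_pct = round(100 * count / grand_total, 1)  # TOT PCT: share of all cases

print(row_pct, col_pct, tot_pct)  # prints 20.3 19.3 4.7
```

The three results match the second, third, and fourth entries printed in that cell.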

This is the general, abstract form of CROSSTABS output:

                                 Independent Variable

Dependent Variable   category 1   category 2   . .   category k   Totals

Category 1                                                        row 1
Category 2                                                        row 2
    .                                                               .
    .                                                               .
Category j                                                        row j

Totals                   N            N                  N        Grand N
Percents               100.0        100.0              100.0      100.0

Table entries consist of frequencies, or percentages, or both. Intersections of rows and columns are called "cells."

The percentage entries in each column should sum to 100% at the bottom of the table if CELLS=COLUMN (print column percentages) is requested.
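
As a quick check of this rule, here is a small Python sketch that computes column percentages from the counts in the sample SPSS output and confirms that each column adds up to 100. The data layout (a dictionary of rows) is just one convenient choice for the illustration.

```python
# Cell counts from the sample table; columns are the five census regions.
counts = {
    "INCREASED": [38, 33, 51, 61, 4],
    "DECREASED": [159, 171, 167, 120, 8],
}

# Column totals: the N's in the bottom marginal of the table.
col_totals = [sum(col) for col in zip(*counts.values())]

# Column percentages: each count divided by the total of its own column.
col_pcts = {
    label: [100 * n / total for n, total in zip(row, col_totals)]
    for label, row in counts.items()
}

# Adding the percentages down each column gives 100 (up to rounding error).
col_sums = [sum(col) for col in zip(*col_pcts.values())]

for label, row in col_pcts.items():
    print(f"{label:<10}" + "".join(f"{p:8.1f}" for p in row))
print(f"{'TOTAL':<10}" + "".join(f"{s:8.1f}" for s in col_sums))
```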

CROSSTABS produces a table with one variable at the side and another at the top, a "cross-tabulation."

CROSSTABS: [varlist] list of variable names, joined with the SPSS keyword BY

/CELLS= [specifies what values are printed in the cells of the table]

The main ones are COUNT (the number of cases) and percentages by ROW, COLUMN, and TOTAL.

Request column percentages if you lay out the table with the independent variable in the columns (i.e., across the top).

/STATISTICS: prints requested measures of association

Subcommands

CELLS= Should you use COLUMN or ROW?

It depends on which of your variables you regard as "independent."

Always compute percentages using the N's in the marginals of whichever variable you regard as independent.

Conventionally, the column variable (across the top) is the independent variable, which calls for COLUMN.

STATISTICS

CROSSTABS generates many different statistics, some of which are more useful than others.

STATISTICS= CHISQ PHI CC will generate these statistics

chi-square (X²)-- tests for independence between nominal variables
phi (2x2 tables) and Cramer's V (nxn tables)--measure of association based on X²
Contingency Coefficient, C
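
The arithmetic behind these statistics can be sketched in Python using the standard textbook formulas. This is an illustration applied to the counts in the sample table, not a reproduction of SPSS's own routines, and SPSS may print additional detail (such as significance levels) that is not computed here.

```python
import math

# Observed counts from the sample table (rows: INCREASED, DECREASED).
observed = [
    [38, 33, 51, 61, 4],
    [159, 171, 167, 120, 8],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Chi-square: sum over cells of (observed - expected)^2 / expected, where
# expected = row total * column total / N under independence.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi_square += (obs - expected) ** 2 / expected

n_rows, n_cols = len(observed), len(observed[0])
df = (n_rows - 1) * (n_cols - 1)

# Cramer's V rescales chi-square by N and the smaller table dimension;
# with only two rows (or two columns) it equals phi = sqrt(chi-square / N).
cramers_v = math.sqrt(chi_square / (n * (min(n_rows, n_cols) - 1)))

# Contingency coefficient C.
contingency_c = math.sqrt(chi_square / (chi_square + n))

print(f"chi-square = {chi_square:.2f}, df = {df}")
print(f"Cramer's V = {cramers_v:.3f}, C = {contingency_c:.3f}")
```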

Consider these CROSSTABS commands for today's assignment using the vote92 file. You'll have to put in the right variables for vote96.

CROSSTABS V2 BY V8 V125 / V2 BY V8 BY V125

/CELLS=COUNT COLUMN / STATISTICS= [names of statistics here]

Conventions and advice concerning CROSSTABS

By convention, the independent variable is arranged across the top of the table, unless the number of categories or the available space prohibits it.

Percentages are ALWAYS computed within the categories of the independent variable -- as shown in the sample table -- and the number of cases on which the percentages are based is always given.

Try to avoid cluttering a table with unnecessary information -- e.g., percentages by rows AND columns, or even percentages AND frequencies in the same cell.

Follow format in professional journals for constructing tables; it is bad form to submit raw printout from SPSS runs as tables in course papers.

Limitations of CROSSTABS print format

CROSSTABS tables in SPSS are limited to 10 categories for the independent variable (across the top) on standard-size (wide) paper. There is no limitation on the number of categories for the dependent variable -- down the side.

Consequences of the limitation

Tables with more than 10 categories for the independent variable will be "wrapped around" and printed as a "continuation" of the first table.

Consider this example:

Suppose an AGE variable has values ranging from 17 to 99 and an INCOME variable has 20 coding categories. If one specified CROSSTABS INCOME BY AGE, only the first 10 of AGE's values could fit across the top of the page, and a continuation table would be printed on another page. However, the command CROSSTABS AGE BY INCOME would place AGE along the side, allowing it to print out in full on one table (if one really wanted age by exact years).
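
A rough count of how many table segments this example would need can be worked out in Python. The sketch assumes that every age from 17 through 99 actually occurs in the data and that exactly 10 categories fit across a page, as stated above; the constant name is invented for the illustration.

```python
import math

MAX_COLUMNS_PER_PAGE = 10      # limit described in the text
age_values = range(17, 100)    # assumption: every age 17..99 appears in the data

n_age = len(age_values)
tables_needed = math.ceil(n_age / MAX_COLUMNS_PER_PAGE)

print(n_age, tables_needed)  # prints 83 9
```

Under these assumptions, AGE across the top would produce the first table plus eight continuations, which is why placing AGE down the side (or recoding it) is preferable.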

The AGE variable could also be "recoded" into fewer categories by using the RECODE command in SPSS.

RECODE can be used either to change or to combine codes assigned to variables in an SPSS file.

For example, V8 is a 7-category measure of party identification, ranging from 0 to 6.

These scores can be "recoded" to a 3-point scale as follows:

RECODE V8 (0=1) (2,4=3) (6=5)

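When placed before the CROSSTABS command, RECODE will change the variable into a trichotomy: 1=Democrats, 3=Independents, and 5=Republicans. Codes 1, 3, and 5 are not mentioned in the RECODE specification, so they keep their original values and absorb the neighboring codes that are recoded into them.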
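
The effect of this RECODE on each of the seven original codes can be traced with a short Python sketch. It mimics the rule that any value not named in the specification is left unchanged; the function name `recode_v8` is hypothetical and used only for this illustration.

```python
# Mapping taken from: RECODE V8 (0=1) (2,4=3) (6=5)
recode_map = {0: 1, 2: 3, 4: 3, 6: 5}

def recode_v8(value):
    """Return the recoded value; codes not listed in the map stay as they are."""
    return recode_map.get(value, value)

original = list(range(7))                     # the 7 party identification codes, 0..6
recoded = [recode_v8(v) for v in original]

for old, new in zip(original, recoded):
    print(f"{old} -> {new}")

print(sorted(set(recoded)))  # prints [1, 3, 5]
```

Codes 0 and 1 end up as 1, codes 2 through 4 as 3, and codes 5 and 6 as 5, which gives the three-category scale.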