 Statistical Package for the Social Sciences



CROSSTABS is an SPSS procedure that CROSS-TABULATES two variables, thus displaying their relationship in tabular form. While FREQUENCIES is a useful procedure for summarizing information about one variable, CROSSTABS generates information about bivariate relationships.

Because CROSSTABS creates a row for each value in one variable and a column for each value in the other, the procedure is not suitable for continuous variables that assume many values. CROSSTABS is designed for discrete variables--usually those measured on nominal or ordinal scales.

CROSSTABS creates a table that contains a cell for every combination of the categories in the two variables. Inside each cell is the number of cases that fit that particular combination of responses. SPSS can also report the row, column, and total percentages for each cell of the table.

This is an example of CROSSTABS output from SPSS:

     - - - - - -   C R O S S T A B U L A T I O N   O F    - - -

            COUNT   I
            ROW PCT I
            COL PCT I                                                 ROW
            TOT PCT I     1  I     2  I     3  I     4  I     5  I  TOTAL
DBG8089     --------+--------+--------+--------+--------+--------+
          1         I    38  I    33  I    51  I    61  I     4  I    187
INCREASED           I  20.3  I  17.6  I  27.3  I  32.6  I   2.1  I   23.0
                    I  19.3  I  16.2  I  23.4  I  33.7  I  33.3  I
                    I   4.7  I   4.1  I   6.3  I   7.5  I    .5  I
                   -+--------+--------+--------+--------+--------+
          3         I   159  I   171  I   167  I   120  I     8  I    625
DECREASED           I  25.4  I  27.4  I  26.7  I  19.2  I   1.3  I   77.0
                    I  80.7  I  83.8  I  76.6  I  66.3  I  66.7  I
                    I  19.6  I  21.1  I  20.6  I  14.8  I   1.0  I
                   -+--------+--------+--------+--------+--------+
             COLUMN     197      204      218      181       12       812
              TOTAL    24.3     25.1     26.8     22.3      1.5     100.0
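
The four numbers stacked in each cell of the sample output can be checked by hand. The short Python sketch below is not part of SPSS; it simply reworks the arithmetic from the counts printed in the sample table. The names `counts` and `cell_percentages` are made up for this illustration.

```python
# Observed counts copied from the sample table. Rows are DBG8089 codes 1 and 3;
# columns are the five categories of the column variable.
counts = [
    [38, 33, 51, 61, 4],      # 1 = INCREASED
    [159, 171, 167, 120, 8],  # 3 = DECREASED
]

row_totals = [sum(row) for row in counts]        # row marginals
col_totals = [sum(col) for col in zip(*counts)]  # column marginals
grand_total = sum(row_totals)                    # total N


def cell_percentages(r, c):
    """Return (row %, column %, total %) for the cell in row r, column c."""
    n = counts[r][c]
    return (
        round(100 * n / row_totals[r], 1),
        round(100 * n / col_totals[c], 1),
        round(100 * n / grand_total, 1),
    )


# Top-left cell (INCREASED, category 1): count 38
print(cell_percentages(0, 0))  # prints (20.3, 19.3, 4.7)
```

The result matches the second, third, and fourth entries of that cell in the printout: 38 is 20.3% of its row (187), 19.3% of its column (197), and 4.7% of all 812 cases.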

This is the general, abstract form of CROSSTABS output:

                                  Independent Variable
Dependent Variable    category 1    category 2    . . .    category k
Category 1                                                               row 1
Category 2                                                               row 2
. . .
Category j                                                               row j
                                                                         Grand N

Table entries consist of frequencies, or percentages, or both. Intersections of rows and columns are called "cells."

The percentage entries in the cells should sum to 100% at the bottom of the table if CELLS=COLUMN (print column percentages) is requested.


The CROSSTABS command produces a table with one variable at the side and another at the top, a "cross-tabulation."

CROSSTABS [varlist]: a list of variable names, joined with the SPSS keyword BY
/CELLS= specifies what values are printed in the cells of the table
The main ones are the COUNT of cases and percentages by ROW, COLUMN, and TOTAL number of cases.
Ask for column percentages if you lay out the table with the independent variable in the columns (i.e., across the top).
/STATISTICS= prints requested measures of association
CELLS= Should you use COLUMN or ROW?
It depends on which of your variables you regard as "independent."
Always compute percentages using the N's in the marginals of whichever variable you regard as independent.
Conventionally, the column variable (across the top) is the independent variable, which calls for COLUMN.
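
As a rough illustration of the rule above, the Python sketch below (not SPSS code) computes column percentages from the counts in the sample output, treating the column variable as independent. The variable names are invented for this example.

```python
# Counts from the sample output, keyed by the row (dependent) category.
counts = {
    "INCREASED": [38, 33, 51, 61, 4],
    "DECREASED": [159, 171, 167, 120, 8],
}

# Column marginals: the N's of the independent (column) variable.
col_totals = [sum(col) for col in zip(*counts.values())]

# Each cell is divided by the N of its own column.
column_pct = {
    label: [round(100 * n / total, 1) for n, total in zip(row, col_totals)]
    for label, row in counts.items()
}

for label, pcts in column_pct.items():
    print(f"{label:<10}", pcts)
print(f"{'N':<10}", col_totals)
```

Because every column is percentaged on its own N, each column sums to 100% (within rounding), and the percentages in a row can be compared directly across the categories of the independent variable.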
CROSSTABS generates many different statistics, some of which are more useful than others.
STATISTICS= CHISQ PHI CC will generate these statistics:
chi-square (X2) -- tests for independence between nominal variables
phi (2x2 tables) and Cramer's V (larger tables) -- measures of association based on X2
Contingency Coefficient, C -- another measure of association based on X2
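
To show how these statistics relate to one another, here is a minimal Python sketch using the standard textbook formulas, applied to the counts in the sample output. It is a hand calculation for illustration, not SPSS's own code, and the variable names are assumptions of this example.

```python
import math

# Observed counts from the sample output (2 rows by 5 columns).
observed = [
    [38, 33, 51, 61, 4],
    [159, 171, 167, 120, 8],
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Chi-square: sum over cells of (observed - expected)^2 / expected,
# where expected = (row total * column total) / N under independence.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi_square += (obs - expected) ** 2 / expected

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(col_totals) - 1)

# Measures of association derived from chi-square.
k = min(len(observed), len(col_totals))               # smaller table dimension
phi = math.sqrt(chi_square / n)
cramers_v = math.sqrt(chi_square / (n * (k - 1)))
contingency_c = math.sqrt(chi_square / (chi_square + n))

print(f"chi-square = {chi_square:.2f}, df = {df}")
print(f"phi = {phi:.3f}, Cramer's V = {cramers_v:.3f}, C = {contingency_c:.3f}")
```

All three measures rescale chi-square by the number of cases, which is why a large chi-square from a large sample can still correspond to a weak association.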
Consider these CROSSTABS commands for today's assignment, which use the vote92 file. You'll have to put in the right variables for vote96.
CROSSTABS V2 BY V8 V125 / V2 BY V8 BY V125
/CELLS=COUNT COLUMN / STATISTICS= [names of statistics here]

Conventions and advice concerning CROSSTABS
By convention, the independent variable is arranged across the top of the table, unless the number of categories or the available space prohibits it.
Percentages are ALWAYS computed within the categories of the independent variable -- as shown in the sample table -- and the number of cases on which the percentages are based is always given.
Try to avoid cluttering a table with unnecessary information -- e.g., percentages by rows AND columns, or even percentages AND frequencies in the same cell.
Follow the format used in professional journals for constructing tables; it is bad form to submit raw printout from SPSS runs as tables in course papers.
Limitations of CROSSTABS print format
CROSSTABS tables in SPSS are limited to 10 categories for the independent variable (across the top) on standard-size (wide) paper.
There is no limitation on the number of categories for the dependent variable -- down the side.
Consequences of the limitation
Tables with more than 10 categories for the independent variable will be "wrapped around" and printed as a "continuation" of the first table.
Consider this example:
Suppose an AGE variable has values ranging from 17 to 99 and an INCOME variable has 20 coding categories.
If one specified CROSSTABS INCOME BY AGE, only the first 10 of AGE's values could fit across the top of the page, and a continuation table would be printed on another page.
However, the command CROSSTABS AGE BY INCOME would place AGE along the side, allowing it to print out in full on one table (if one really wanted age by exact years).
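
The size of the problem in the AGE example is easy to work out. The Python sketch below assumes that every age from 17 to 99 actually occurs in the data and uses the 10-category limit stated above; the constant name is made up for this illustration.

```python
import math

MAX_COLUMNS_PER_TABLE = 10  # limit stated above for standard-size paper

# Assumption: every age from 17 through 99 appears at least once in the data.
age_values = range(17, 100)

# Number of printed tables (the first plus its continuations).
tables_needed = math.ceil(len(age_values) / MAX_COLUMNS_PER_TABLE)

print(len(age_values), tables_needed)  # prints 83 9
```

With AGE across the top, the output would be one table plus eight continuations, which is why placing AGE down the side, or collapsing it into fewer categories, is preferable.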
The AGE variable could also be "recoded" into fewer categories by using the RECODE command in SPSS.
RECODE can be used either to change or to combine codes assigned to variables in an SPSS file.
For example, V8 is a 7-category measure of party identification, ranging from 0 to 6.
These scores can be "recoded" to a 3-point scale as follows:
RECODE V8 (0=1) (2,4=3) (6=5)
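When placed before the CROSSTABS command, RECODE will change the variable into a trichotomy: 1=Democrats, 3=Independents, and 5=Republicans. Codes 1, 3, and 5 are not listed in the RECODE, so they keep their original values and absorb the neighboring codes: 0 joins 1, 2 and 4 join 3, and 6 joins 5.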
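
To make the effect of this RECODE concrete, the Python sketch below mimics the mapping for all seven original codes. It is only an illustration of the logic, not SPSS syntax, and the names `RECODE_V8`, `LABELS`, and `recode_v8` are invented for this example.

```python
# Mapping that mirrors RECODE V8 (0=1) (2,4=3) (6=5).
RECODE_V8 = {0: 1, 2: 3, 4: 3, 6: 5}
LABELS = {1: "Democrats", 3: "Independents", 5: "Republicans"}


def recode_v8(value):
    """Return the recoded value; codes not listed (1, 3, 5) are left unchanged."""
    return RECODE_V8.get(value, value)


# Show where each of the seven original codes ends up.
for original in range(7):
    new = recode_v8(original)
    print(original, "->", new, LABELS[new])
```

The seven original codes collapse into three groups: 0 and 1 become 1, codes 2 through 4 become 3, and 5 and 6 become 5.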