CROSSTABS in SPSS 10
Crosstabs is an SPSS procedure that
cross-tabulates two variables, thus displaying their
relationship in tabular form. In contrast to
Frequencies, which summarizes information about one
variable, Crosstabs generates information about
bivariate relationships.
Crosstabs creates a
table that contains a cell for every combination of
categories in the two variables.
- Inside each cell is the
number of cases that fit that particular combination of
responses.
- SPSS can also report the
row, column, and total percentages for each cell of the
table.
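The counting that Crosstabs performs can be sketched outside SPSS. The following is a minimal Python illustration, not SPSS syntax; the six cases and the names `cases` and `crosstab` are hypothetical and serve only to show how each cell receives its count.

```python
from collections import Counter

# Hypothetical case-level data: one (row_value, column_value) pair per respondent.
cases = [
    ("Democrat", "liberal"),
    ("Democrat", "liberal"),
    ("Democrat", "moderate"),
    ("Republican", "moderate"),
    ("Republican", "conservative"),
    ("Republican", "conservative"),
]


def crosstab(pairs):
    """Count cases for every combination of row and column categories."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # Every combination gets a cell, even when no case falls in it (count 0).
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}


table = crosstab(cases)
for row_label, cells in table.items():
    print(row_label, cells)
```

The cell counts always add up to the number of cases, which is a quick check that no case was dropped.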
Because Crosstabs
creates a row for each value in one variable and a
column for each value in the other, the procedure is
not suitable for continuous variables that
assume many values. Crosstabs is designed for
discrete variables--usually those measured on nominal
or ordinal scales.
How crosstabs differs
from a scatterplot under the "Graphs"
Menu
- Like crosstabs,
scatterplot portrays the joint distribution of two
variables.
- Unlike crosstabs,
scatterplot is designed for continuous
variables which distribute cases across unique points in
space.
- To underscore the
difference, consider this scatterplot that plots
"party identification" (dependent) by "ideology"
(independent) for several hundred cases in the
vote00 file:
[Scatterplot: party identification by ideology, vote00 file]
A scattergram is not
useful for this analysis because
- neither variable is
continuous, so the cases are not spread along both axes
- both variables are
discrete, so their values occupy specific points along
each axis
- each marker in the scattergram simply indicates the occurrence of at
least one case at that point
- there are hundreds of
cases represented in the plot, so many cases lie behind
each marker, but we can't tell how many.
Here is the Crosstabs output for the same two
variables:
Crosstabulation -- K1x. PARTY ID SUMMARY by R's placement on Liberal-Conservative scale
(Rows: K1x. PARTY ID SUMMARY. Columns: R's placement on Liberal-Conservative scale.)
K1x. PARTY ID SUMMARY | extremely liberal | liberal | slightly liberal | moderate | slightly conservative | conservative | extremely conservative | Total
Strong Democrat       |                 7 |      35 |               24 |       32 |                    10 |           15 |                      1 |   124
Weak Democrat         |                 2 |      13 |               17 |       38 |                    11 |            8 |                        |    89
Ind Democrat          |                 6 |      16 |               21 |       46 |                     7 |           11 |                        |   107
Independent           |                   |       4 |                8 |       30 |                     9 |           10 |                      3 |    64
Ind Republican        |                   |       4 |                7 |       32 |                    30 |           15 |                      6 |    94
Weak Republican       |                 1 |       2 |                6 |       24 |                    29 |           24 |                      6 |    92
Strong Republican     |                   |       1 |                2 |       12 |                    17 |           57 |                     10 |    99
Total                 |                16 |      75 |               85 |      214 |                   113 |          140 |                     26 |   669
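The contrast with the scatterplot can be checked with a few lines of Python. This is a rough sketch, not SPSS output: the counts are transcribed from the table above, and treating the blank cells as 0 is an assumption (one that the printed row and column totals are consistent with).

```python
# Cell counts transcribed from the table above; blank cells are entered as 0
# (an assumption). Rows run from Strong Democrat to Strong Republican,
# columns from extremely liberal to extremely conservative.
counts = [
    [7, 35, 24, 32, 10, 15, 1],
    [2, 13, 17, 38, 11, 8, 0],
    [6, 16, 21, 46, 7, 11, 0],
    [0, 4, 8, 30, 9, 10, 3],
    [0, 4, 7, 32, 30, 15, 6],
    [1, 2, 6, 24, 29, 24, 6],
    [0, 1, 2, 12, 17, 57, 10],
]

row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
grand_total = sum(row_totals)

# A scatterplot of two discrete variables can show at most one marker
# per occupied cell, no matter how many cases share that cell.
occupied_cells = sum(1 for row in counts for n in row if n > 0)

print("row totals:   ", row_totals)
print("column totals:", col_totals)
print("cases:", grand_total, "distinct plot positions:", occupied_cells)
```

Under these assumptions the 669 cases fall on only a few dozen distinct positions, which is why the plot cannot convey how many cases sit at each point while the table can.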
Note two things about this
table:
- The cell entries
indicate the number of cases (respondents) with that
combination of scores on each variable.
- The low values
(0=Democrat) occur at the top of the table in
crosstabs, whereas they occur at the
bottom of the plot in a scattergram.
Below is the general,
abstract form of crosstabs output:
                   | Independent Variable                      |
Dependent Variable | category 1 | category 2 | . . category k | Totals
Category 1         | cell       | cell       | cell           | row 1
Category 2         | cell       | cell       | cell           | row 2
. . Category j     | cell       | cell       | cell           | row j
Totals             | N          | N          | N              | Grand N
Percents           | 100.0      | 100.0      | 100.0          | 100.0
Table entries consist of frequencies, or percentages, or both. Intersections of rows and columns are called "cells."
- Crosstabs are
usually presented with the independent variable
across the top and the dependent along the
side.
- This follows the
presentation in scattergram plots.
- As explained on page 68
of the Users' Guide, SPSS can calculate
percentages for cell entries in three different
ways
- calculated according
to the number of cases in each column
- calculated according
to the number of cases in each row
- calculated according
to the TOTAL number of cases in the table
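The three percentage bases differ only in the denominator. The sketch below is plain Python rather than SPSS, and the function name `cell_percentages` is hypothetical; the numbers are the Strong Democrat / liberal cell and its marginal totals from the party identification table.

```python
def cell_percentages(count, row_total, col_total, grand_total):
    """Return the column, row, and total percentages for one cell."""
    return {
        "column": 100.0 * count / col_total,
        "row": 100.0 * count / row_total,
        "total": 100.0 * count / grand_total,
    }


# Strong Democrat / liberal cell from the party identification table:
# 35 cases, row total 124, column total 75, grand total 669.
pcts = cell_percentages(count=35, row_total=124, col_total=75, grand_total=669)
for base, value in pcts.items():
    print(f"{base:>6}: {value:.1f}%")
```

The same count of 35 yields three quite different percentages, so a table should always make clear which base was used.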
Conventions and advice concerning
crosstabs
- By convention, the
independent variable is arranged across the top of the
table, unless the number of categories or the available
space prohibits it.
- ALWAYS, percentages are
computed within the categories of the independent
variable -- as shown in the sample table.
- Percentages are
computed by rows only if the layout of the data calls
for placing the independent variable in the
rows.
- That may be needed if
there are more categories in the independent variable
than fit easily along the columns
- Only unique
analytical needs invite calculating percentages by
totals--avoid doing this unless you know
why.
- SPSS offers the option
of calculating percentages all three ways, but that
produces a cluttered table.
- Avoid checking all
three options for percentages.
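Percentaging within categories of the independent variable is what makes the columns comparable despite their different sizes. As a hedged illustration in Python (not SPSS), the sketch below takes the Strong Democrat row and the column totals from the party identification table, with ideology as the independent variable; the variable names are hypothetical.

```python
ideology = [
    "extremely liberal", "liberal", "slightly liberal", "moderate",
    "slightly conservative", "conservative", "extremely conservative",
]
# Strong Democrat counts and column totals, transcribed from the table.
strong_democrat = [7, 35, 24, 32, 10, 15, 1]
column_totals = [16, 75, 85, 214, 113, 140, 26]

# Column percentage: share of each ideology group that is Strong Democrat.
column_pct = {
    label: 100.0 * count / total
    for label, count, total in zip(ideology, strong_democrat, column_totals)
}

for label, pct in column_pct.items():
    print(f"{label:<24}{pct:5.1f}%")
```

Raw counts would be misleading here: there are more Strong Democrats among moderates (32) than among the extremely liberal (7), yet the extremely liberal group has a far higher percentage of Strong Democrats because it is much smaller.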
- Limitations of
crosstabs print format
- crosstabs
tables in SPSS can't handle more than seven categories
in a column variable without "wrapping"
over.
- There is no
limitation on the number of categories for the
dependent variable -- down the side.
- Consequences of the
limitation
- Tables with more than
10 categories for the independent variable will be
"wrapped around" and printed as a "continuation" of
the first table.
- Consider this
example:
- Suppose an AGE
variable has values ranging from 17 to 99 and an
INCOME variable has 20 coding
categories.
- If one specified
CROSSTABS INCOME BY AGE, only the first 10 of AGE's
values could fit across the top of the page, and a
continuation table would be printed on another
page.
- However, the
command CROSSTABS AGE BY INCOME would place AGE
along the side, allowing it to print out in full on
one table (if one really wanted age by exact
years).
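The size of the wrapping problem in this example can be estimated with a short Python sketch. It assumes that every age from 17 through 99 actually occurs in the data and that 10 columns fit per table, the figure used in the example; the helper `wrap_columns` is hypothetical and does not reproduce how SPSS lays out its output.

```python
def wrap_columns(values, per_panel):
    """Split column categories into consecutive panels of at most per_panel."""
    return [values[i:i + per_panel] for i in range(0, len(values), per_panel)]


age_values = list(range(17, 100))  # 17 through 99 inclusive
# 10 columns per panel is the figure from the example (an assumption here).
panels = wrap_columns(age_values, per_panel=10)

print("column categories:", len(age_values))
print("tables needed:    ", len(panels))
print("last table holds: ", panels[-1])
```

Changing `per_panel` shows how quickly the number of continuation tables grows when fewer columns fit on a page.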
The AGE variable could also be "recoded" into fewer
categories by using the RECODE command
in SPSS.
- RECODE can be used
either to change or to combine codes assigned to
variables in an SPSS file.
- For example, V8 is
a 7 category measure of party identification,
ranging from 0 to 6.
- These scores can
be "recoded" to a 3-point scale as follows:
- RECODE V8 (0=1)
(2,4=3) (6=5)
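- When placed before
the CROSSTABS command, RECODE will change the
variable into a trichotomy: 1=Democrats,
3=Independents, and 5=Republicans.
- Codes 1, 3, and 5 are not listed in the RECODE, so
they keep their original values: 0 and 1 end up as 1,
2 through 4 end up as 3, and 5 and 6 end up as 5.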
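The effect of the RECODE on each of the seven original codes can be traced with a small Python sketch. This is an illustration of the mapping only, not SPSS syntax; the names `RECODE_MAP`, `LABELS`, and `recode` are hypothetical, and codes the RECODE does not list are passed through unchanged.

```python
# Mapping taken from RECODE V8 (0=1) (2,4=3) (6=5).
RECODE_MAP = {0: 1, 2: 3, 4: 3, 6: 5}
LABELS = {1: "Democrats", 3: "Independents", 5: "Republicans"}


def recode(value):
    """Return the recoded value; unlisted codes keep their original value."""
    return RECODE_MAP.get(value, value)


for original in range(7):  # the 7-category scale runs from 0 to 6
    new = recode(original)
    print(original, "->", new, LABELS[new])
```

Only four of the seven codes need to be listed because the three target values (1, 3, and 5) are already in place.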