Several students have chosen to explain voting behavior
using data from the states file.
- If you choose to do
this, you must pay attention to the level of
analysis inherent in that file.
- The states file
contains data on people in the aggregate: for each
state, it reports what proportion of all votes went to
Bush.
- Sociology uses the
term ecology for the study of
groups.
- So this type of data
is known in social research as ecological
data.
- Because the
states file has observations on states, not
on individual voters, you will be explaining the voting
behavior of states, not voters.
- Accordingly, you can
make statements like the following:
- The correlation
between percent black and percent vote for Reagan
in 1984 across all American states was -.56, which
indicates that 31% of the variance in the states'
votes for Reagan can be explained by the racial
composition of the state.
- From the same
correlation, you cannot make this
statement:
- ... which indicates
that 31% of the variance in voting choice for
Reagan can be explained by the racial
characteristics of American voters.
- If you make such a
statement, you are committing an ecological
fallacy--making an unsupported generalization from
group data to individual behavior.
- If you wish to predict
the voting behavior of individual citizens, you must use
survey data on individual respondents, such as
- vote00
- vote96
- vote92
- vote88
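The 31% figure in the acceptable statement above comes from squaring the correlation coefficient. A minimal sketch in Python (the function name `variance_explained` is illustrative, not part of any course software):

```python
def variance_explained(r):
    """Return the proportion of variance explained by a correlation r."""
    return r ** 2

# Correlation between percent black and percent vote for Reagan, across states.
r_states = -0.56
share = variance_explained(r_states)
print(f"{share:.2f}")  # 0.31
```

Because the units of analysis are states, this 31% describes variance in the states' vote percentages, not variance in the choices of individual voters.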
This example illustrates the ecological fallacy in
operation:
Consider
this set of [ecological] data for
three communities:
| Community | % Republican vote | % over $50,000 |
|-----------|-------------------|----------------|
| A         | 25                | 25             |
| B         | 50                | 50             |
| C         | 75                | 75             |

r = 1.00
Observing that r=1.0, one might be
tempted to draw this erroneous
conclusion:
There is a perfect relationship between
making over $50,000 and voting
Republican.
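The ecological correlation can be checked directly from the three community rows. This sketch uses a hand-written `pearson_r` helper (an illustrative name, not from any course software) and treats each community as one observation:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# One observation per community (A, B, C), not per voter.
pct_over_50k = [25, 50, 75]
pct_republican = [25, 50, 75]

ecological_r = pearson_r(pct_over_50k, pct_republican)
print(round(ecological_r, 2))  # 1.0
```

The result describes how the community percentages move together. It says nothing yet about which individuals within each community voted Republican.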
In truth, the correlation between wealth and
Republican vote for individuals
in the same three communities could have been
r = -.33.
Consider three separate
samples of 100 voters, one taken in each community:
Community A

|               | voted Rep | voted Dem | Total |
|---------------|-----------|-----------|-------|
| Under $50,000 | 25        | 50        | 75%   |
| Over $50,000  | 0         | 25        | 25%   |
| Total         | 25%       | 75%       |       |

Community B

|               | voted Rep | voted Dem | Total |
|---------------|-----------|-----------|-------|
| Under $50,000 | 50        | 0         | 50%   |
| Over $50,000  | 0         | 50        | 50%   |
| Total         | 50%       | 50%       |       |

Community C

|               | voted Rep | voted Dem | Total |
|---------------|-----------|-----------|-------|
| Under $50,000 | 25        | 0         | 25%   |
| Over $50,000  | 50        | 25        | 75%   |
| Total         | 75%       | 25%       |       |
Now let's combine the three
samples into one, looking at only the survey data for
individuals:
|               | voted Rep | voted Dem | Total |
|---------------|-----------|-----------|-------|
| Under $50,000 | 100       | 50        | 150   |
| Over $50,000  | 50        | 100       | 150   |
| Total         | 150       | 150       | 300   |
The correlation for these survey data on individuals
is r = -.33.
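The individual-level figure can be reproduced by pooling the three community samples and computing the correlation for a 2x2 table (the phi coefficient, which equals Pearson's r when both variables are coded 0/1). This is a sketch under stated assumptions: the per-community cell counts below are the ones consistent with each community's row and column totals, and the names `communities` and `phi` are illustrative.

```python
from math import sqrt

# Cell counts per community as (voted Rep, voted Dem) for each income group.
# Assumption: these are the counts consistent with each community's marginals.
communities = {
    "A": {"under": (25, 50), "over": (0, 25)},
    "B": {"under": (50, 0), "over": (0, 50)},
    "C": {"under": (25, 0), "over": (50, 25)},
}

# Pool the three samples into one 2x2 table of 300 individuals.
under_rep = sum(c["under"][0] for c in communities.values())
under_dem = sum(c["under"][1] for c in communities.values())
over_rep = sum(c["over"][0] for c in communities.values())
over_dem = sum(c["over"][1] for c in communities.values())

def phi(over_rep, over_dem, under_rep, under_dem):
    """Pearson r for two 0/1 variables (over $50,000 = 1, voted Rep = 1)."""
    numerator = over_rep * under_dem - over_dem * under_rep
    denominator = sqrt(
        (over_rep + over_dem) * (under_rep + under_dem)
        * (over_rep + under_rep) * (over_dem + under_dem)
    )
    return numerator / denominator

individual_r = phi(over_rep, over_dem, under_rep, under_dem)
print(under_rep, under_dem, over_rep, over_dem)  # 100 50 50 100
print(round(individual_r, 2))                    # -0.33
```

The same 300 voters that produce an ecological correlation of 1.00 across communities produce a negative correlation when each voter is treated as one observation.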
CONCLUSION: Ecological correlations are unreliable
indicators of individual correlations.