Topic IX: Relationships between Discrete Variables


IX. TESTING AND MEASURING RELATIONS BETWEEN DISCRETE VARIABLES

SPSS Users' Guide, Ch. 5: "Using Crosstabs to Obtain Crosstabulations and Measures of Association," only pages 63-71.

Schmidt, Ch. 13: "Nonparametric Tests," pp. 337-349

Unlike economists, who like to think they deal in "real" numbers, political scientists and sociologists often must rely on nominal and ordinal data measured as discrete variables. These discrete data values are commonly analyzed in contingency tables or cross-tabulations (hence Crosstabs), often only with the aid of percentage comparisons.

The world of nominal-level measures of association is quite disorderly. Some popular measures are based on chi-square, which is one of the most common tests of statistical significance in the social science literature. Its popularity lies in its applicability to nominal data and in its intuitive basis: the differences between "expected" and "observed" frequencies.

Assignment: There is no substitute for computing chi-square by hand to understand the statistic. Assume that you encounter this table showing the distribution of political party preferences in three wards with a total of 400 registered voters in Evanston. Wards 1 and 2 have proportionally more Republicans than Ward 3. Overall, are the differences among the three wards statistically significant? Or can the observed differences be attributed to chance fluctuations?

	Democrat	Republican	Independent	TOTAL
Ward 1	50	70	30	150
Ward 2	30	60	20	110
Ward 3	60	65	15	140
TOTAL	140	195	65	400

Using this working table, compute the chi-square value and test it for significance at the .05 level.

See the middle column of the last page in the Schmidt reading for how to calculate "expected" values.
I've done the first one for you.

Cell entries:	Observed O	Expected E	O - E	(O - E)²	(O - E)²/E
Row 1, Col. 1	50	52.5
Row 1, Col. 2	70
Row 1, Col. 3	30
Row 2, Col. 1	*
Row 2, Col. 2
Row 2, Col. 3
Row 3, Col. 1
Row 3, Col. 2
Row 3, Col. 3
X² = chi-square = sum of this column =
*(etc., you take over and fill in the cells; then carry out the computation.)
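
Once you have filled in the working table by hand, a short script can check your arithmetic. What follows is a minimal Python sketch, not something taken from the Schmidt or SPSS readings. The function name chi_square is hypothetical, and the critical value in the final comment is the standard tabled value for 4 degrees of freedom at the .05 level.

```python
# Observed frequencies from the ward table above
# (rows = wards, columns = Democrat, Republican, Independent).
observed = [
    [50, 70, 30],  # Ward 1
    [30, 60, 20],  # Ward 2
    [60, 65, 15],  # Ward 3
]


def chi_square(table):
    """Return (expected frequencies, chi-square, degrees of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected frequency for a cell = row total * column total / N.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Sum (O - E)^2 / E over every cell, as in the last working-table column.
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Degrees of freedom = (rows - 1) * (columns - 1).
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, chi2, df


expected, chi2, df = chi_square(observed)

for label, row in zip(("Ward 1", "Ward 2", "Ward 3"), expected):
    print(label, [round(e, 3) for e in row])
print("chi-square =", round(chi2, 3), "with df =", df)

# Compare chi-square with the tabled critical value for df = 4 at the
# .05 level (9.488) to decide whether to reject the null hypothesis.
```

The expected value for Row 1, Col. 1 printed by the script should match the 52.5 already entered in the working table.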

NOVEMBER 27

X² AND MEASURES OF ASSOCIATION

SPSS Users' Guide, Ch. 5: "Using Crosstabs to Obtain Crosstabulations and Measures of Association," only pages 72-73.

As a soft summer night inspires songwriters to compose romantic ballads, chi-square inspires statisticians to devise measures of association based on chi-square's sensitivity to differences between observed and expected frequencies. Alas, songwriters have produced better results. All the various chi-square based measures of association have important shortcomings, which explains why lambda is usually the preferred measure of association for nominal variables.

Assignment: Example 1 in Ch. 5 of the SPSS Users' Guide illustrates Crosstabs by comparing "religious preferences" by "region" for a national sample of citizens in the gss93.sav file distributed with SPSS 10. Instead of reproducing that analysis, go to the vote00 data and use Crosstabs to crosstabulate "feelings toward the bible" (dependent) by "region" (independent). Click on the "Statistics" button and then check "Chi-square" and each of the "Nominal" measures.
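
To see what lies behind some of the "Nominal" measures, here is a rough Python sketch that computes two chi-square based measures (Cramér's V and the contingency coefficient) and lambda. Because the vote00 data are not reproduced on this page, the sketch reuses the three-ward party table from the chi-square assignment, treating party as the dependent (column) variable and ward as the independent (row) variable. The helper names are hypothetical, and the formulas are the usual textbook ones rather than anything quoted from the SPSS Users' Guide.

```python
from math import sqrt

# Ward-by-party table from the chi-square assignment
# (rows = wards, columns = Democrat, Republican, Independent).
table = [
    [50, 70, 30],
    [30, 60, 20],
    [60, 65, 15],
]


def chi_square(t):
    """Pearson chi-square for a two-way table of counts."""
    row_totals = [sum(row) for row in t]
    col_totals = [sum(col) for col in zip(*t)]
    n = sum(row_totals)
    return sum(
        (t[i][j] - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i in range(len(row_totals))
        for j in range(len(col_totals))
    )


def lambda_columns_dependent(t):
    """Goodman-Kruskal lambda, predicting the column variable from the rows."""
    n = sum(sum(row) for row in t)
    col_totals = [sum(col) for col in zip(*t)]
    # Errors when guessing the overall modal column for every case.
    errors_without = n - max(col_totals)
    # Errors when guessing the modal column within each row.
    errors_with = sum(sum(row) - max(row) for row in t)
    return (errors_without - errors_with) / errors_without


n = sum(sum(row) for row in table)
chi2 = chi_square(table)
smaller_dim = min(len(table), len(table[0])) - 1

cramers_v = sqrt(chi2 / (n * smaller_dim))   # chi-square scaled by N and table size
contingency_c = sqrt(chi2 / (chi2 + n))      # Pearson's contingency coefficient
lam = lambda_columns_dependent(table)        # proportional reduction in error

print("Cramer's V =", round(cramers_v, 3))
print("Contingency coefficient =", round(contingency_c, 3))
print("Lambda (party dependent) =", round(lam, 3))
```

Note how lambda is built from prediction errors rather than from chi-square, which is what gives it a proportional-reduction-in-error reading. When the same category is modal in every row, knowing the row cannot reduce the errors, and lambda comes out as zero.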

NOVEMBER 28

Optional Session on Research Papers

NOVEMBER 29

MEASURES OF ASSOCIATION FOR ORDINAL DATA: TAKE YOUR PICK

SPSS Users' Guide, Ch. 5: "Using Crosstabs to Obtain Crosstabulations and Measures of Association," only pages 84-87.

Sanford Labovitz, "The Assignment of Numbers to Rank Order Categories," American Sociological Review, 35 (1970). (On the website.)

Most statistics texts advise using ordinal measures of association for ordinal data. Crosstabs can generate the most commonly used measures: gamma, tau, and Somers' d. For many years, I faithfully taught these measures, but no longer. They have two shortcomings:

all the measures except gamma lack a PRE interpretation, and
gamma's PRE interpretation is so strained that it is nearly useless.

I have since streamlined the course by dropping these measures, expecting that they will continue to disappear from the literature. What measures should be used instead to analyze ordinal data? I suggest using the product moment correlation for data that are clearly measured on an underlying continuous variable, even if the categories are discrete. The basis for this advice lies in the practical consequences of measurement -- as described in Labovitz's early article. Labovitz argues that it is all right to use Pearson product moment correlations with ordinal data. Why?

Assignment: Using the vote00 data, use Crosstabs to relate "party identification" (dependent variable) with "liberal-conservative self-placement" (independent). Click on the "Statistics" button and then check "Correlations" and each of the "Ordinal" measures. What do you make of these different measures of association for that relationship? Does the value of Pearson's r conform closely enough to the various ordinal measures of association on your Crosstabs output to support Labovitz?
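
For a feel of how an ordinal measure and Pearson's r are computed from the same crosstab, consider the rough Python sketch below. The 3x3 table is made up purely for illustration and is not taken from vote00. The integer scores 0, 1, 2 assigned to the ordered categories are an assumption in the spirit of Labovitz's argument, and the function names are hypothetical.

```python
from math import sqrt

# HYPOTHETICAL crosstab of two ordinal variables with three ordered
# categories each (rows and columns both run from low to high).
table = [
    [40, 20, 10],
    [20, 30, 20],
    [10, 20, 40],
]


def gamma(t):
    """Goodman-Kruskal gamma = (C - D) / (C + D) over pairs of cases."""
    rows, cols = len(t), len(t[0])
    concordant = discordant = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):       # cells in a higher row
                for m in range(cols):
                    if m > j:                  # also higher on the column variable
                        concordant += t[i][j] * t[k][m]
                    elif m < j:                # lower on the column variable
                        discordant += t[i][j] * t[k][m]
    return (concordant - discordant) / (concordant + discordant)


def pearson_r(t):
    """Pearson's r, scoring row and column categories as 0, 1, 2, ..."""
    cells = [(i, j, c) for i, row in enumerate(t) for j, c in enumerate(row)]
    n = sum(c for _, _, c in cells)
    mean_x = sum(i * c for i, _, c in cells) / n
    mean_y = sum(j * c for _, j, c in cells) / n
    cov = sum(c * (i - mean_x) * (j - mean_y) for i, j, c in cells)
    var_x = sum(c * (i - mean_x) ** 2 for i, _, c in cells)
    var_y = sum(c * (j - mean_y) ** 2 for _, j, c in cells)
    return cov / sqrt(var_x * var_y)


gamma_value = gamma(table)
r_value = pearson_r(table)
print("gamma =", round(gamma_value, 3))
print("Pearson's r =", round(r_value, 3))
```

Gamma ignores tied pairs entirely, while r uses every case, so the two need not agree. Compare them on your own Crosstabs output before drawing conclusions from this toy table.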

NOVEMBER 30

WRAP-UP ON MULTIVARIATE ANALYSIS

William G. Jacoby and Saundra K. Schneider, "Variability in State Policy Priorities: An Empirical Analysis," Journal of Politics, 63 (May, 2001), 544-568.

This is a recent article with a clear application of regression analysis. I've abbreviated the 22 pages to just 8 and added marginal notations. Read it, and I'll discuss it in class with reference to writing your research paper.