Multiple Regression in State Voting Patterns

Application of multiple regression to explaining the vote for Reagan by state in 1984:

• The setting of the 1984 election:
• Republican President Ronald Reagan ran against the Democratic candidate, Walter Mondale
• How well would Reagan do in 1984 compared with his election in 1980?
• Walter Mondale had won the nomination handily over Jesse Jackson, who promised to mobilize the black vote in 1984
• What impact would the "Jackson factor" have on turning out the Democratic vote in 1984?
• Mondale's running-mate for Vice President was Geraldine Ferraro
• What impact would the "Ferraro factor" have on turning out women to vote for the ticket in 1984?
• Let's include all three variables in a model to explain Reagan's vote in 1984.
• Matrix of correlations:
• Use of Multiple Regression in SPSS 10 to execute this model:
• Under Analyze in the menu, select Regression, then Linear
• In the "Dependent" box, enter the name of the variable you want to explain: Reagan84
• In the "Independents" box enter the names of the explanatory variables: Reagan80, PctBlack, PctWomen
• For this analysis, select STEPWISE [enters the variables one at a time] in the "Method" box.
• Click on the "Statistics" button and check
• "Estimates" for Regression Coefficients and
• "Model Fit"
• Press "Continue"
• Click on "OK"
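The stepwise procedure that SPSS runs can also be sketched outside SPSS. Below is a minimal forward-selection sketch in Python with NumPy, using synthetic data rather than the actual course dataset; the rule "enter the candidate only if its t-statistic exceeds 2" is an approximation of the .05 significance criterion, not SPSS's exact entry test:

```python
import numpy as np

def forward_stepwise(X, y, names, t_enter=2.0):
    """Greedy forward selection: at each step, fit a model with each remaining
    predictor added, and enter the one with the largest t-statistic, provided
    it exceeds t_enter (t > 2 roughly mimics a .05 significance cutoff)."""
    n = len(y)
    selected, remaining = [], list(range(X.shape[1]))
    while remaining:
        best = None
        for j in remaining:
            A = np.column_stack([np.ones(n)] + [X[:, c] for c in selected + [j]])
            b, *_ = np.linalg.lstsq(A, y, rcond=None)
            resid = y - A @ b
            sigma2 = resid @ resid / (n - A.shape[1])          # residual variance
            se = np.sqrt(sigma2 * np.linalg.inv(A.T @ A)[-1, -1])
            t = abs(b[-1]) / se                                # t for the candidate
            if t > t_enter and (best is None or t > best[1]):
                best = (j, t)
        if best is None:
            break                                              # no candidate qualifies
        selected.append(best[0])
        remaining.remove(best[0])
    return [names[j] for j in selected]

# Synthetic data: y depends strongly on x0 only
rng = np.random.default_rng(0)
n = 50
X = rng.normal(size=(n, 3))
y = 3.0 * X[:, 0] + rng.normal(0, 0.5, n)
chosen = forward_stepwise(X, y, ["x0", "x1", "x2"])
```

As in the SPSS run, the strongly related predictor enters first, and weak candidates are left in the "excluded" set.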

Here is the series of output boxes for the above SPSS run:

• Interpretation:
• Although three independent variables were offered for inclusion using the stepwise procedure, only Reagan80 was included in Model 1
• The box for the "Excluded Variables" shows that % Black and % Women were not significant at the .05 level and therefore were not selected for inclusion.

Another try: Reconsidering the politics of the 1980 and 1984 elections

• In 1980, Reagan ran against President Jimmy Carter, a southerner
• Southern states have the highest proportions of black voters
• Perhaps Carter in 1980 ran relatively better in the south against Reagan than Minnesotan Mondale did in 1984
• So perhaps we need to control for the south before assessing the impact of the Jackson factor
• Let's redo the analysis, entering the variable South, scored 1 if the state is one of the 11 in the deep south and 0 otherwise.
• Here's the new correlation matrix
• Note that the correlation between "south" and "Reagan80" was negative (as expected) but very small.
 CORRELATION:
              REAGAN80   PCTBLACK   PCTWOMEN    SOUTH
  REAGAN84      0.900     -0.565     -0.513     0.123
  REAGAN80                -0.595     -0.537    -0.082
  PCTBLACK                            0.522     0.511
  PCTWOMEN                                      0.211
• Still, let's carry out the new stepwise analysis using four independent variables:
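A correlation matrix like the one above is straightforward to compute directly. Here is an illustrative sketch in Python with NumPy; the data are made up for demonstration and are not the actual state dataset:

```python
import numpy as np

# Illustrative synthetic data for 50 "states" (not the actual course dataset)
rng = np.random.default_rng(1)
n = 50
reagan80 = rng.normal(55, 8, n)                 # hypothetical 1980 vote %
pct_black = rng.normal(10, 6, n)                # hypothetical % black
reagan84 = 0.9 * reagan80 - 0.3 * pct_black + rng.normal(0, 3, n)

# Each row passed to corrcoef is one variable; the result is the full
# symmetric matrix of pairwise r's, with 1's on the diagonal
R = np.corrcoef(np.vstack([reagan84, reagan80, pct_black]))
print(np.round(R, 3))
```

The printed matrix contains every pairwise r; tables like the one above usually show only the upper triangle, since the matrix is symmetric.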

Interpreting this stepwise output

• Adding south as a variable completely alters the analysis.
• In Model 2, South is added to Reagan80 as a second explanatory variable, raising the R2 from .81 to .85.
• In Model 3, once South is controlled for, %Black enters the equation as a significant variable (.05 level), raising the R2 to .88.
• However, %Women is not added into the equation to further increase the explanation:
• As shown in the Excluded Variables table above, its significance level is .612 -- far above the .05 threshold -- so it is not added.
• In truth, there is relatively little variance among states in % women, making that a weak explanatory variable.

Interpreting the regression coefficients in the REGRESSION printout:

•  REGRESSION coefficients
• Regression coefficients are embedded within a regression EQUATION
• "b" coefficients are "unstandardized" coefficients and can be interpreted using the scale of measurement for the raw data.
• "beta" coefficients are "standardized" and must be interpreted in terms of "standard deviation" units.
• Because of the standardization, "beta" coefficients can be compared WITHIN equations; "b" coefficients cannot.
• The regression coefficients in a multiple regression equation are "partial" coefficients because they state the "effect" of a given independent variable on the dependent variable while "controlling" for effects of other variables in the equation.
• Regression coefficients measure CHANGE in Y, not percent of variance explained.
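The b-versus-beta distinction above can be shown numerically. In this minimal Python/NumPy sketch (synthetic data; variable names and scales are invented), the predictor with the larger b does not have the larger beta once differences in measurement scale are removed:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(0, 1, n)                 # small-scale predictor
x2 = rng.normal(0, 10, n)                # large-scale predictor
y = 2.0 * x1 + 0.5 * x2 + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]     # [intercept, b1, b2] in raw units

# Standardized betas: beta_i = b_i * sd(x_i) / sd(y),
# i.e., the same slopes re-expressed in standard-deviation units
sds = np.array([x1.std(ddof=1), x2.std(ddof=1)])
beta = b[1:] * sds / y.std(ddof=1)
```

Here b1 (about 2.0) exceeds b2 (about 0.5), yet beta2 exceeds beta1 because x2 varies over a much wider range -- which is why only betas are comparable within an equation.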
•   CORRELATION coefficients:
• Measure the STRENGTH of a relationship, i.e., the fit of observed Y values around the line representing predicted Y values according to a regression equation.
• Simple correlations pertain to bivariate relationships involving only one independent variable and are expressed by r.
• Multiple correlations pertain to multivariate relationships using multiple independent variables and are expressed by R.
• The regression procedure refers to the "partial" correlations of variables not in the equation to choose which variables (if any) are to be added to the equation next.
• Partial correlations computed by regression pertain to the correlation between a dependent variable, Y, and an independent variable, Xi, out of the equation while "controlling for" the effects of the other independent variables already in the equation.
• In essence, a partial correlation is the correlation between two sets of residuals: the residuals of Y regressed on the variables already in the equation, and the residuals of Xi regressed on those same variables.
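The residual definition of the partial correlation can be made concrete. A minimal Python/NumPy sketch on synthetic data: residualize both Y and the candidate Xi on the variables already in the equation, then take the plain correlation of the two residual series:

```python
import numpy as np

def partial_corr(y, x_new, X_in):
    """Correlation between y and x_new after both are residualized on the
    variables already in the equation (the columns of X_in)."""
    A = np.column_stack([np.ones(len(y)), X_in])
    resid = lambda v: v - A @ np.linalg.lstsq(A, v, rcond=None)[0]
    ry, rx = resid(y), resid(np.asarray(x_new, dtype=float))
    return ry @ rx / np.sqrt((ry @ ry) * (rx @ rx))

# Synthetic check: y depends on x1 (already in the equation) and x2 (candidate)
rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = x1 + x2 + rng.normal(0, 0.5, n)
pr = partial_corr(y, x2, x1.reshape(-1, 1))   # partial r of y and x2, given x1
```

A large partial correlation like this one is exactly what leads a stepwise procedure to admit the candidate variable at the next step.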