Multiple Regression in State Voting Patterns

 Application of multiple regression to explaining the vote for Reagan by state in 1984:

  • The setting of the 1984 election:
    • Republican President Ronald Reagan ran against the Democratic candidate, Walter Mondale
      • How well would Reagan do in 1984 compared with his election in 1980?
    • Walter Mondale had won the nomination handily over Jesse Jackson, who promised to mobilize the black vote in 1984
      • What impact would the "Jackson factor" have on turning out the Democratic vote in 1984?
    • Mondale's running-mate for Vice President was Geraldine Ferraro
      • What impact would the "Ferraro factor" have on turning out women to vote for the ticket in 1984?
    • Let's include all three variables in a model to explain Reagan's vote in 1984.
  • Matrix of correlations:
  • Use of Multiple Regression in SPSS 10 to execute this model:
    • Under Analyze in the Menu select Linear Regression
      • In the "Dependent" box, enter the name of the variable you want to explain: Reagan84
      • In the "Independents" box enter the names of the explanatory variables: Reagan80, PctBlack, PctWomen
    • For this analysis, select STEPWISE [enters the variables one at a time] in the "Method" box.
    • Click on the "Statistics" button and check
      • "Estimates" for Regression Coefficients and
      • "Model Fit"
      • Press "Continue"
    • Click on "OK"
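What SPSS's STEPWISE method does can be sketched in plain Python. The sketch below uses hypothetical toy data (not the actual state figures) and omits the significance screen SPSS applies at each step, so it shows only the core selection logic: enter the best simple correlate first, then judge remaining candidates by their partial correlations.

```python
# A simplified sketch of the STEPWISE idea in plain Python.
# Data are HYPOTHETICAL toy values, not the actual state figures,
# and the F-test / .05 screen SPSS applies at each step is omitted.

def mean(xs):
    return sum(xs) / len(xs)

def corr(x, y):
    """Pearson r between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def residuals(y, x):
    """Residuals of y after a simple one-predictor regression on x."""
    mx, my = mean(x), mean(y)
    b = (sum((a - mx) * (c - my) for a, c in zip(x, y))
         / sum((a - mx) ** 2 for a in x))
    a0 = my - b * mx
    return [c - (a0 + b * a) for a, c in zip(x, y)]

# Hypothetical dependent variable and two candidate predictors.
y  = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]
x1 = [1, 2, 3, 4, 5, 6]          # strongly related to y
x2 = [5, 3, 6, 2, 7, 4]          # essentially noise

# Step 1: enter the candidate with the largest simple |r| with y.
candidates = {"x1": x1, "x2": x2}
first = max(candidates, key=lambda k: abs(corr(y, candidates[k])))

# Step 2: for the remaining candidate, compute its PARTIAL correlation
# with y controlling for the entered variable: correlate the residuals
# of y-on-first with the residuals of candidate-on-first.
entered = candidates.pop(first)
other = next(iter(candidates.values()))
partial = corr(residuals(y, entered), residuals(other, entered))
```

With these toy numbers, x1 enters first; whether a second variable would enter in real SPSS depends on the significance test omitted here.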


Here is the series of output boxes for the above SPSS run:

  • Interpretation:
    • Although three independent variables were offered for inclusion using the stepwise procedure, only Reagan80 was included in Model 1
    • The box for the "Excluded Variables" shows that % Black and % Women were not significant at the .05 level and therefore were not selected for inclusion.
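The .05 screen behind that exclusion rests on a significance test. The sketch below shows the basic idea using the standard t-test for a simple correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2); the correlations are illustrative values, and the 2.01 cutoff is an approximate two-tailed .05 critical t for the 48 degrees of freedom you would have with 50 states (SPSS's stepwise screen actually tests the statistics of the entering variables, so treat this only as the underlying logic).

```python
# Why a variable is "not significant at the .05 level": for a simple
# correlation r computed from n cases, the test statistic is
#     t = r * sqrt(n - 2) / sqrt(1 - r**2),
# compared against the critical t for n - 2 degrees of freedom.
# Illustrative values only; with the 50 states, n - 2 = 48, and the
# two-tailed .05 critical value is roughly 2.01.

import math

def t_for_r(r, n):
    """t statistic for testing H0: the population correlation is 0."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

CRITICAL_T = 2.01              # approx. two-tailed .05 cutoff, 48 df

t_strong = t_for_r(0.90, 50)   # a strong correlate clears the bar
t_weak   = t_for_r(0.12, 50)   # a weak correlate does not

strong_included = abs(t_strong) > CRITICAL_T
weak_included   = abs(t_weak) > CRITICAL_T
```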


Another try: Reconsidering the politics of the 1980 and 1984 elections

  • In 1980, Reagan ran against President Jimmy Carter, a southerner
    • Southern states have the highest proportions of black voters
    • Perhaps Carter in 1980 ran relatively better in the south against Reagan than Minnesotan Mondale did in 1984
    • So perhaps we need to control for the south before assessing the impact of the Jackson factor
  • Let's redo the analysis, entering the variable South, scored 1 if the state is one of the 11 states of the deep South and 0 otherwise.
    • Here's the new correlation matrix
    • Note that the correlation between "south" and "Reagan80" was negative (as expected) but very small.

    CORRELATION:

                 REAGAN80   PCTBLACK   PCTWOMEN    SOUTH
    REAGAN84        0.900     -0.565     -0.513    0.123
    REAGAN80                  -0.595     -0.537   -0.082
    PCTBLACK                              0.522    0.511
    PCTWOMEN                                       0.211

  • Still, let's carry out the new stepwise analysis using four independent variables:


Interpreting this stepwise output

  • Adding south as a variable completely alters the analysis.
  • In Model 2, South is added to Reagan80 as a second explanatory variable, raising the R2 from .81 to .85.
  • In Model 3, once South is controlled for, %Black enters the equation as a significant variable (.05 level), raising the R2 to .88.
  • However, %Women is not added into the equation to further increase the explanation:
    • As shown in the Excluded Variables table above, its significance level is .612, far above the .05 cutoff, so it is not added.
    • In truth, there is relatively little variance among states in % women, making that a weak explanatory variable. 
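The R2 figures quoted above can be checked from the correlation matrix alone: for two standardized predictors, the multiple R2 has a closed-form expression in the simple correlations. Plugging in the values from the matrix reproduces the jump from .81 to roughly .85, and shows why South contributes more than its tiny simple correlation with Reagan84 would suggest (its correlation with Reagan80 is negative):

```python
# Checking the R2 figures from the correlation matrix alone.  For two
# standardized predictors the multiple R2 has a closed form.  The
# correlations below are taken from the matrix above.

r_y1 = 0.900     # Reagan84 with Reagan80
r_y2 = 0.123     # Reagan84 with South
r_12 = -0.082    # Reagan80 with South

# Model 1: Reagan80 alone explains r**2 of the variance.
r2_one = r_y1 ** 2

# Model 2: Reagan80 plus South (two-predictor formula).
r2_two = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

# South's added contribution (about .04) exceeds its simple r_y2**2
# (about .015) because its correlation with Reagan80 is negative.
gain = r2_two - r2_one
```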


Interpreting the regression coefficients in the REGRESSION printout:

  •  REGRESSION coefficients
    • Regression coefficients are embedded within a regression EQUATION
      • "b" coefficients are "unstandardized" coefficients and can be interpreted using the scale of measurement for the raw data.
      • "beta" coefficients are "standardized" and must be interpreted in terms of "standard deviation" units.
      • Because of the standardization, "beta" coefficients can be compared WITHIN equations; "b" coefficients cannot.
    • The regression coefficients in a multiple regression equation are "partial" coefficients because they state the "effect" of a given independent variable on the dependent variable while "controlling" for effects of other variables in the equation.
    • Regression coefficients measure CHANGE in Y, not percent of variance explained.
  •   CORRELATION coefficients:
    • Measure the STRENGTH of a relationship, i.e., the fit of observed Y values around the line representing predicted Y values according to a regression equation.
    • Simple correlations pertain to bivariate relationships involving only one independent variable and are expressed by r.
    • Multiple correlations pertain to multivariate relationships using multiple independent variables and are expressed by R.
    • The stepwise procedure uses the "partial" correlations of the variables not yet in the equation to choose which variable (if any) to add to the equation next.
      • Partial correlations computed by regression pertain to the correlation between a dependent variable, Y, and an independent variable, Xi, out of the equation while "controlling for" the effects of the other independent variables already in the equation.
      • In essence, partial correlations express the correlation between the residuals (i.e., deviations) of Y regressed on Xi and the residuals around the line of the regression equation built from the other variables in the equation.
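Both claims above can be checked numerically on hypothetical toy data: (1) a "beta" is just the "b" rescaled into standard-deviation units (in a simple regression the beta equals r itself), and (2) a partial correlation is an ordinary correlation computed between two sets of residuals, matching the textbook formula built from the simple correlations.

```python
# Toy-data check of both claims above (all values HYPOTHETICAL):
# (1) a "beta" is the "b" rescaled into standard-deviation units, and
# (2) a partial correlation is an ordinary correlation between two
#     sets of residuals.

def mean(xs):
    return sum(xs) / len(xs)

def sd(xs):
    m = mean(xs)
    return (sum((a - m) ** 2 for a in xs) / (len(xs) - 1)) ** 0.5

def corr(x, y):
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x)
           * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def slope(y, x):
    """Unstandardized b from a simple regression of y on x."""
    mx, my = mean(x), mean(y)
    return (sum((a - mx) * (c - my) for a, c in zip(x, y))
            / sum((a - mx) ** 2 for a in x))

def residuals(y, x):
    b = slope(y, x)
    a0 = mean(y) - b * mean(x)
    return [c - (a0 + b * a) for a, c in zip(x, y)]

y  = [3.0, 4.5, 7.2, 8.1, 11.0, 12.4]
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]

# (1) beta = b * sd(x) / sd(y); in a SIMPLE regression this equals r.
b1 = slope(y, x1)
beta1 = b1 * sd(x1) / sd(y)

# (2) Partial correlation of y with x2, controlling for x1, by the
# residual method; it matches the textbook formula built from the
# three simple correlations.
partial_resid = corr(residuals(y, x1), residuals(x2, x1))
r_y1, r_y2, r_12 = corr(y, x1), corr(y, x2), corr(x1, x2)
partial_formula = ((r_y2 - r_y1 * r_12)
                   / ((1 - r_y1 ** 2) * (1 - r_12 ** 2)) ** 0.5)
```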